Final code review of slice 1 flagged one Important issue (the spec's
"zero cost when off" claim for the surface-dump path is technically
violated — _uploadMetadata always writes one dict entry per upload
regardless of env var) plus minor doc/consistency gaps. Applied:
1. Spec §5 "Cost when off": dropped the "Zero" claim; replaced with
"Negligible — one Dictionary write per upload (~30-50 KB at Holtburg)
plus a hash-table write per upload. Expensive work (file I/O,
histogram construction) is still env-gated." This matches reality.
2. Baseline doc §5: rewrote from "Raw logs (scratch, can be deleted)"
referencing files that were never preserved in this worktree, to
"Reproducing the measurements" with the actual PowerShell launch
commands. Honest about the raw logs not being kept; the captured
medians in section 2 are the canonical record.
3. New issue #55 filed in docs/ISSUES.md — static-entity slow path
reports ~1.45M meshMissing/5s at r4 standstill, drops to ~0 when
walking. LOW severity (no visible regression), hypothesis points
at a "permanently-missing entity gets re-classified every frame"
pattern that Tier 1 cache doesn't cover.
4. Roadmap shipped table: renamed "N.6.1" row to "N.6 slice 1" to
match every other artifact's naming. Search-discoverability fix.
None of these change the slice's conclusion or next-phase
recommendation (C.1.5 first, then reduced-scope slice 2).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Code-quality review on commit 13abf96 flagged 3 Important issues in
the baseline document plus 2 minor roadmap consistency gaps. Applied
all of them:
1. The "CPU scales superlinearly with N₁" claim was imprecise because
CPU growth (4.0×) is actually sublinear vs near-LB count (7.7×).
Clarified: CPU grows more than linearly with radius N₁ but
sublinearly with visible-LB count; frustum cull discards most far
LBs early. The outer per-LB walk still scales with N₁, which is
what Tier 2's persistent groups address.
2. The "40-50% memory footprint reduction from atlas packing" estimate
was asserted without derivation and likely too optimistic given all
surfaces are already power-of-two and same-format (RGBA8). Replaced
with a more honest bound: "low-MB to ~10 MB absolute saving" with
explicit per-array metadata overhead reasoning. Conclusion is
unchanged — atlas adoption still isn't justified given GPU
under-utilization.
3. The "spec §6 threshold for atlas is >30%" citation pointed at text
that doesn't exist in the spec. Replaced with "A conventional
rule-of-thumb" so a future reader doesn't chase a phantom citation.
Plus roadmap consistency:
M1: The N.6 slice 1 bullet now uses the canonical "✓ SHIPPED — Title.
Shipped YYYY-MM-DD." prefix that every other shipped phase uses.
M2: Added N.6.1 row to the shipped table at the top of the roadmap
(lines ~55-66) so the at-a-glance shipped list is complete.
None of these change the conclusion or the next-phase recommendation
(C.1.5 first, then reduced N.6 slice 2). The fixes improve doc accuracy
and future-readability.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Capture authoritative CPU+GPU dispatch numbers at Holtburg with the
gpu_us diagnostic now working (commit 25cb147). Three radii (4/8/12)
x two motion modes (standstill/walking) + a surface-format histogram
from ACDREAM_DUMP_SURFACES=1.
Adds env-gated one-shot dump path (TextureCache.TickSurfaceHistogramDumpIfEnabled,
called from GameWindow.OnRender) that fires once after both (a) frame
600 of the session AND (b) the upload-metadata dict reaches 100 entries
-- the cache-size gate prevents the dump from firing during pre-world
GUI ticks where OnRender spins at high rates but no scenery has streamed.
Output writes to %LOCALAPPDATA%\acdream\n6-surfaces.txt with a try/catch
around the I/O so disk-full / permission errors don't crash mid-measurement.
Baseline document at docs/plans/2026-05-11-phase-n6-perf-baseline.md
documents:
- CPU dominates GPU by 30-50x at every radius (strongly CPU-bound)
- GPU wildly under-utilized (max gpu_us p95 ~600us vs 16,600us frame budget)
- CPU scales superlinearly with N1 (Tier 1 cache wins on inner loop but
not outer LB walk)
- Surface atlas opportunity high (59% of textures in top-3 triples) but
win is memory-only since GPU isn't bottlenecked
Recommendation: C.1.5 (PES emitter wiring) next, then a reduced-scope
N.6 slice 2 (drop atlas + persistent-mapped buffers -- not justified by
the GPU under-utilization observed).
Roadmap entry amended to split N.6 into slice 1 (shipped) and slice 2
(planned, reduced scope, deferred until after C.1.5).
Spec: docs/superpowers/specs/2026-05-11-phase-n6-slice1-design.md.
Plan: docs/superpowers/plans/2026-05-11-phase-n6-slice1.md (Task 4).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Step-by-step plan for the two-commit slice: fix WbDrawDispatcher's
gpu_us double-buffering bug (ring-of-3 query slots, read-before-overwrite,
vendor-neutral) then capture the radius=12 baseline at Holtburg with
the now-working diagnostic. Includes exact old_string/new_string Edit
patterns for every code change, PowerShell launch + measurement
procedure for the manual baseline, baseline doc template with explicit
fill-in slots, and a per-criterion acceptance checklist.
Output companion to docs/superpowers/specs/2026-05-11-phase-n6-slice1-design.md
(commit 05d590c).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Brainstormed design for the first slice of Phase N.6 (perf polish).
Slice 1 ships two commits: (1) fix the GPU timing query double-buffering
in WbDrawDispatcher (cross-vendor ring of 3, read-before-overwrite),
(2) add an env-gated surface-format histogram dump + capture the
radius=12 perf baseline at Holtburg. Slice 2 (TextureCache cleanup +
shader migration + optional persistent-mapped buffers) is deferred
until after C.1.5 (PES emitter wiring), with the next-phase decision
to be made on the baseline numbers slice 1 produces.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Post-A.5 polish Priority 3. EntityClassificationCache keyed by
(entityId, landblockHint) tuple. Entity dispatcher cpu_us median ~1.2 ms,
p95 ~1.5 ms — ~66% reduction vs pre-Tier-1 baseline. Closes the
post-A.5 polish phase entirely (#52, #54, #53 all closed).
See docs/ISSUES.md #53 closure + memory/project_tier1_cache.md for the
24-commit chain, 4 bug-fix iterations, and the per-tuple-vs-per-entity
recurring trap pattern documented for future cache work.
EntityClassificationCache keyed by (entityId, landblockHint) tuple lands
per spec docs/superpowers/specs/2026-05-10-issue-53-tier1-cache-design.md
+ plan docs/superpowers/plans/2026-05-10-issue-53-tier1-cache.md.
Perf result (horizon-safe preset + High quality, AMD Radeon RX 9070 XT
@ 1440p): entity dispatcher cpu_us median ~1200 us, p95 ~1500 us. Down
from ~3500m / ~4000p95 pre-Tier-1. ~66% / ~63% reduction. Well under
the A.5 spec budget (median <= 2.0 ms, p95 <= 2.5 ms). No BUDGET_OVER
flag across 30s+ standstill captures.
Visual gate cleared after 4 bug-fix iterations:
- 71d0edc: namespace stab Ids globally (cross-LB id collision)
- 95ebbf3: key cache by (entityId, landblockHint) tuple (defensive)
- c55acdc: skip cache populate when classification is incomplete
- f928e66: incomplete-entity flag must persist across same-entity tuples
User-confirmed visually via +Acdream test character: NPCs animate,
multi-part static buildings render fully (no airborne geometry, no
Z-fighting, no missing parts, no wrong textures), Nullified Statue of
a Drudge on top of the Foundry renders all parts, trees outside
Holtburg render with branches present.
Closes the post-A.5 polish phase. Issues #52, #54, #53 all closed.
Tests: 1711 passing, 8 pre-existing physics/input failures unchanged.
N.5b sentinel: 112/112 throughout.
Memory: ~/.claude/projects/.../memory/project_tier1_cache.md +
feedback_cache_per_tuple_pattern.md capture the audit-gap and per-tuple-
vs-per-entity recurring trap for future cache work.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Captures Phase M (Network Stack Conformance) as a fully-formed phase
ready to be picked up later. Three deliverables:
1. Design spec at docs/superpowers/specs/2026-05-10-phase-m-network-stack-design.md
(~700 lines, 8 sections):
- Bar C completeness target ("wireable on demand"): every wire opcode
a 2013 EoR retail client receives or sends gets a parser/builder +
golden-vector test + typed event in the new layered stack.
- Three-layer architecture: INetTransport / IReliableSession /
IGameProtocol, with WorldSession as a thin behavior consumer.
Concrete C# interface signatures, sub-component decomposition.
- Worktree-branch big-bang migration on claude/phase-m-network-stack;
weekly rebase cadence; single --no-ff merge ships the phase.
- Per-sub-phase entry/exit gates, conformance test plan (golden vectors
+ live capture replay + live ACE smoke), 10-row risk register, scope-
cut order if calendar compresses.
- Cost: 256 hours / ~6.4 weeks single-developer; 4-6 weeks calendar
with subagent parallelization on M.1 + M.6.
2. Opcode coverage matrix at docs/research/2026-05-10-phase-m-opcode-matrix.md
(~284 rows across 5 sections):
- Section 1: 22 transport flags (14 implemented).
- Section 2: 12 optional-header fields (10 partial).
- Section 3: 51 top-level GameMessages (21 implemented).
- Section 4: 103 GameEvent sub-opcodes inside 0xF7B0 (27 parsed,
26 wired).
- Section 5: 96 GameAction sub-opcodes inside 0xF7B1 (24 built,
8 with live callers).
- Roll-up: ~34% complete by raw opcode count. Biggest single
unblocking step is wiring the 16 dead builders in section 5
(Phase B.4 surface — Use / UseWithTarget / Allegiance / Inventory
/ Social / Cast / Appraise).
- Sources cited per row: holtburger (629695a), ACE, named retail
decomp, acdream current state.
- Produced by 4 parallel research agents (one per class). Spot-check
pass owed before M.1 closes.
3. Roadmap update: Phase M section trimmed to summary + status + pointer
to the spec; the previously-tracked M.0 Tier 1 quick-wins are folded
into M.3 / M.4 / M.6 per the spec; M.1 retained as the matrix
construction sub-lane with status note.
Why this shape: the user goal is a complete, layered, testable network
stack that can be wired in as gameplay phases need it — independent of
whether each opcode is yet hooked to game state. The matrix is the
source of truth for "done"; the spec is the architecture the matrix
implements against; the roadmap is the index that points at both.
Decisions captured during the design discussion (in case they need
revisiting):
- Bar C ("wireable on demand") chosen over Bar A (holtburger parity)
or Bar B (named-retail completeness).
- Three layers (INetTransport / IReliableSession / IGameProtocol)
chosen over holtburger's two-layer split.
- Big-bang on a feature branch (worktree) chosen over strangler
pattern; preserves live-ACE testing on main throughout the phase.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Holtburger reference fast-forwarded from 88b19bd to 629695a (+237 commits).
Four parallel research agents produced a parity-first-pass between
holtburger's network stack and acdream's src/AcDream.Core.Net/.
Why captured now: study surfaced six small, high-confidence "Tier 1" fixes
that can ship before the bigger M.1-M.8 layer extraction. Most likely fix
for the longstanding "remote retail observer sees us not perfect" bug
(MoveToState wire-format mismatches). Two transport gaps (no EchoResponse
reply, eager port-switch) match recent holtburger fixes (403bc98, 99974cc).
One latent bug worth a 5-min check (ISAAC search-mode for out-of-order
ENCRYPTED_CHECKSUM packets).
Captured as Phase M.0 in the roadmap so the work survives the session and
can be picked up later. Existing M.1-M.8 lift unchanged; M.1 marked as
partially started since the research note is the parity-map deliverable
in draft form.
Files:
- docs/research/2026-05-10-holtburger-network-stack-study.md (new) — full
study with ranked port candidates, recent commits worth knowing, and
acdream-vs-holtburger file map.
- docs/plans/2026-04-11-roadmap.md — Phase M Plan-of-record updated with
2026-05-10 pointer; M.0 sub-lane added before M.1; M.1 status note.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The audit at docs/research/2026-05-10-tier1-mutation-audit.md enumerates
every entity.MeshRefs write site (5 STATIC at hydration, 1 DYNAMIC at
GameWindow.cs:7580 inside TickAnimations) and verifies that all 7
Position/Rotation write sites only touch entities in _animatedEntities.
Establishes the load-bearing invariant: an entity's renderer state is
stable from spawn to despawn iff entity.Id is NOT in _animatedEntities.
The spec at docs/superpowers/specs/2026-05-10-issue-53-tier1-cache-design.md
locks in the design from brainstorming on 2026-05-10:
- Static-only cache + DEBUG cross-check (option c) — catches future
regressions of the prior bug class without paying perf cost in Release
- Separate EntityClassificationCache class injected into WbDrawDispatcher
- Cache the rest pose, not the full model matrix (Position/Rotation read
live each frame so Release stays correct even if the invariant breaks)
- Pre-flatten Setup multi-parts at populate time (the bulk of the win)
- 15 new tests covering all invalidation paths + DEBUG cross-check +
Setup pre-flatten + lifecycle pin
Closes the audit + design steps of the post-A.5 polish Priority 3 work.
Implementation plan owned by superpowers:writing-plans next.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
After the post-A.5 lifestone (#52) + JobKind plumbing (#54) work shipped,
only Priority 3 (Tier 1 entity-classification cache retry, ISSUE #53)
remains. This handoff captures the audit insights gathered during the
#52 investigation that the original post-A.5 handoff didn't have:
- MeshRef is a `readonly record struct` — its fields can NOT be mutated
in place. The actual per-frame mutation for animated entities is the
entire MeshRefs LIST replacement at GameWindow.cs:7474-7553. This
reframes the cache design.
- _animatedEntities dict at GameWindow.cs:160 is the source of truth
for which entities go through the per-frame rebuild path.
- Static entity = entity.Id NOT in _animatedEntities. Its MeshRefs is
the same instance from spawn until rare events (ObjDesc / palette
swap / part hide / scale apply).
- Recommended cache approach: static-only with explicit invalidation
hooks on the network/spawn-time write sites enumerated in the doc.
Doc covers: where main is, what shipped this session, why the first
Tier 1 attempt failed, the pre-started audit, cache design options,
acceptance criteria, files to read, workflow for the next session, and
things-to-NOT-do.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Move ISSUE #54 to Recently closed referencing commit `bf31e59`. Drop
#54 from CLAUDE.md "Currently in flight" — only #53 (Tier 1 retry)
remains open in the post-A.5 polish phase.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Move ISSUE #52 from Active to Recently closed with full root-cause writeup
referencing commit `e40159f`. Strip lifestone reference from CLAUDE.md
"Currently in flight"; remaining post-A.5 polish scope is #53 (Tier 1
retry) + #54 (JobKind plumbing).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Captures the three deferred items from A.5 ship:
- ISSUE #52: lifestone visual missing (1-3 hours, fast win)
- ISSUE #54: JobKind plumbing through BuildLandblockForStreaming
(~30 min - 1 hour, worker-thread efficiency cleanup)
- ISSUE #53: Tier 1 entity-classification cache retry (~5-7 days,
biggest perf win remaining; needs animation-mutation audit before
designing to avoid the freeze-pose bug from the first attempt)
Doc covers: A.5 final state + 3 high-value gotchas, files to read,
per-priority detail with effort estimates and acceptance criteria,
what NOT to do, the first-30-minute workflow, and the full A.5
commit chain for reference.
Phase is sized ~1 week if all three priorities land. The audit
step on Tier 1 is the highest-leverage investment.
Tier 2 + Tier 3 (static/dynamic split + GPU compute culling) are
explicitly out-of-scope for this phase — separate multi-week phases
per docs/plans/2026-05-10-perf-tiers-2-3-roadmap.md.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Sub-phase under existing F.5 (Core panels) capturing the immediate
follow-up to ISSUES.md #13: now that PlayerDescriptionParser surfaces
the full trailer (Inventory / Equipped / Shortcuts / HotbarSpells /
DesiredComps / Options1+2 / SpellbookFilters) and GameEventWiring
populates ItemRepository at login, F.5a wires that data into minimal
ImGui dev panels under ACDREAM_DEVTOOLS=1 so it's observable in-game.
Establishes the binding pattern (AcDream.UI.Abstractions ViewModels →
ImGui renderer) that the eventual D.2b retail-skinned F.5 panels
reuse. Spec to brainstorm before code.
Captured 2026-05-10 during Phase A.5 polish discussion. User asked why
the 9070 XT @ 1440p doesn't hit Unreal-level FPS for an old game like
AC. Answer: architectural — we rebuild the entire draw plan from
scratch every frame instead of caching pre-baked static-world data.
Tier 1 (entity-classification cache) lands as A.5 polish (separate
commit). Tiers 2 + 3 documented here for future scheduling:
- Tier 2 — Static/dynamic split with persistent groups
~2-week phase. Static entities (~95% of world) get permanent GPU-
resident matrix slots, populated at spawn, dirty-tracked for delta
upload. Per-frame CPU cost for static = LB-cull + dirty-flag check
only. Estimated entity dispatcher: 3.5ms → 0.5-1ms median.
400-600 FPS at standstill, radius=12.
- Tier 3 — GPU-side culling (compute pre-pass)
~1-month phase. Per-instance frustum cull moves to GPU compute
shader. Compute writes draw-indirect buffer; rasterizer reads it.
Estimated CPU dispatcher: ~0.05ms (essentially free).
600-1000+ FPS at standstill, radius=12.
Doc captures effort estimates, sub-decisions, risks, mitigations, and
scheduling triggers for each tier. Also notes the architectural
ceiling (~800-1500 FPS for a C# + GL client; reaching native engine
performance requires becoming a different engine).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Code-quality review followup on Task 2 (becbde6) — addresses I1 (the
forward-looking concern that Tasks 3-9's inner-catch will leave partial
lists visible to callers with no signal) and M1 (silent inner catch).
Changes:
- Parsed gains a trailing `bool TrailerTruncated` field. Both
construction sites pass `false` by default; the trailer try/catch
flips a local `trailerTruncated` to `true` on FormatException and
feeds it into the final return.
- Inner catch logs `pos`/`payload.Length`/exception message under
ACDREAM_DUMP_VITALS=1, mirroring the outer catch's diagnostic
pattern.
- Task 2 test strengthened to assert defaults on Options2 /
SpellbookFilters / HotbarSpells / DesiredComps / GameplayOptions /
Equipped + TrailerTruncated=false (M2 followup — gives Tasks 3-9
a regression guard if they consume into the wrong field).
- New test `TryParse_TrailerAbsent_LessThan8BytesAfterEnchantments_*`
documents the contract that <8 bytes after enchantments means the
trailer is absent (not truncated): TrailerTruncated stays false,
upstream attribute data survives.
- Plan updated in lockstep so Tasks 3-11 implementers see the
`trailerTruncated` local and the new return-arg position.
271/271 AcDream.Core.Net.Tests pass.
Code review nit-fix on top of d3b58c9 — addresses two issues from the
quality review of Task 1:
I1 (Important): the record struct `Shortcut` was a homograph with the
flag member `CharacterOptionDataFlag.Shortcut`. Both names live inside
`PlayerDescriptionParser`'s scope. Rename to `ShortcutEntry` aligns
with `InventoryEntry`/`EquippedEntry` and removes the trap before
Task 3's walker references both names in the same method body.
M2 (Minor): `EquippedEntry` had no holtburger source citation; added
one referencing events.rs:180-190. Also expanded `InventoryEntry`'s
comment with the strict reader's validation reference.
Plan doc updated in lockstep so Task 3+ implementers see the new name.
8/8 PlayerDescriptionParser tests still pass.
Records what N.5b shipped, where the actual FPS bottleneck lives
(WbDrawDispatcher entity cull at ~4.3ms/frame, 86% of frame budget;
terrain dispatcher is now <1% of frame), and what A.5 has to do to
make the world look big without falling off a perf cliff.
Three concrete A.5 deliverables:
1. Two-tier streaming (near = full, far = terrain-only)
2. Per-LB entity bucketing in WbDrawDispatcher
3. Off-thread LandblockMesh.Build to avoid streaming hitches at higher
radius
Eight brainstorm questions for the next session, plus acceptance
criteria, files-to-read list, and explicit "don't do" warnings (don't
raise STREAM_RADIUS without tiering in place; don't put scenery in
far tier without an impostor pipeline; don't break the N.5b conformance
sentinel; etc.).
User's stated goal verbatim: "great smooth HIGH fps visuals. Should
look great. As long as it scales and we get very high FPS." This
reframes priorities away from radius=5 micro-optimization toward
visual scale.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
TerrainModernRenderer replaces TerrainChunkRenderer. Single global
VBO/EBO + slot allocator + glMultiDrawElementsIndirect. Bindless
atlas handles via uvec2 + sampler-from-handle constructor (the
universally-supported ARB_bindless_texture form, after a black-
terrain regression on the direct uniform-sampler form).
Path C: WB renderer pattern + acdream's LandblockMesh.Build for
retail's FSplitNESW formula compliance. Closes issue #51.
Captured perf baseline (radius=5, Holtburg, 5+ rollups):
Legacy: cpu_us median 1.5 / p95 3.0 (1 chunk = 1 glDrawElements)
Modern: cpu_us median 6.4-7.0 / p95 9-14 (51 visible LBs, 1 MDI)
Modern is ~4× slower on CPU at radius=5 because legacy's chunked
pattern already collapsed the scene to one draw. Architectural wins
(zero glBindTexture/frame; constant-cost dispatch as A.5 raises
radius) manifest at higher scene complexity. Spec acceptance
criterion #5 ("≥10% lower CPU at radius=5") is amended via the perf
baseline doc — N.5b ships on visual identity + structural correctness.
Three high-value gotchas captured to memory:
1. `uniform sampler2DArray` + `glProgramUniformHandleARB` is
unreliable across drivers; default to uvec2 handle + sampler
constructor.
2. Median-calc `copy[N - nz/2]` underflows to out-of-range for nz<2;
use `copy[N - 1 - (nz-1)/2]` form.
3. Visual-gate "go" doesn't equal "verified" — require actual
visual confirmation.
Visual verification: confirmed at Holtburg town. 114/114 tests pass
in N.5+N.5b filter. Conformance sentinel max ‖Δ‖ = 0.015 mm across
1000 sample points / 10 representative landblocks.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Document Phase N.5b shipping (terrain on the modern rendering path via
Path C — `TerrainModernRenderer` mirrors WB's `TerrainRenderManager`
pattern but consumes acdream's `LandblockMesh.Build` so retail's
`FSplitNESW` formula stays in lockstep with physics + visual mesh).
Changes:
- `docs/plans/2026-04-11-roadmap.md` — add N.5b row to the Shipped
table; promote N.5b's "Phases ahead" entry to ✓ SHIPPED with the
Path C resolution + perf reality check; refresh N.6 scope to note
Terrain has joined the modern path (legacy `Texture2D` retirement
scope narrows to Sky + Debug); update top-of-doc Status line.
- `docs/ISSUES.md` — close issue #51 (WB terrain-split formula
divergence). Move from OPEN to "Recently closed" with the Path C
resolution: never adopted WB's formula; modern dispatcher uses
retail's via `LandblockMesh.Build`. References `da56063` (the
black-terrain fix that landed within the N.5b ship chain).
- `CLAUDE.md` — add `TerrainModernRenderer.cs` to the WB integration
cribs list with the GL_INVALID_OPERATION caveat (use uvec2 +
`sampler2DArray(handle)` constructor, NOT direct
`uniform sampler2DArray` + `glProgramUniformHandleARB`). Update
the "Currently in flight" preamble: N.6 builds on N.5 + N.5b;
add an N.5b shipped paragraph linking the perf baseline doc.
- `docs/plans/2026-05-09-phase-n5b-perf-baseline.md` — new doc
capturing the radius=5 Holtburg perf measurement (modern 6.4-7.0
µs median vs legacy 1.5 µs — modern is ~4× SLOWER on CPU at
radius=5). Documents the spec acceptance criterion #5 amendment,
the architectural wins that DO hold (zero glBindTexture/frame,
constant-cost dispatch as A.5 raises radius, per-LB frustum cull),
and the three high-value gotchas surfaced during implementation.
User-memory updates (outside repo, not in this commit):
- `memory/project_phase_n5b_state.md` — full N.5b state file with
the three gotchas captured.
- `memory/MEMORY.md` — index entry pointing at the state file.
Build: dotnet build green. No code changes in this commit.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Captures everything a fresh agent needs to pick up Phase N.5b (Terrain
on the Modern Rendering Path) without spelunking through the N.5
session history.
Front-loads the load-bearing constraint: issue #51 (WB's terrain split
formula diverges from retail's FSplitNESW). Lays out three viable
design paths (A: adopt WB's formula everywhere; B: keep retail's
formula and fork-patch WB; C: WB mesh layout but our formula). The
brainstorm needs to pick one, informed by quantified divergence rate
across representative landblocks.
Includes file-by-file inventory of acdream's terrain stack (1383 lines
across TerrainRenderer + TerrainChunkRenderer + TerrainAtlas + shaders)
vs WB's (1937 lines across TerrainRenderManager + TerrainGeometryGenerator
+ LandSurfaceManager). Eight brainstorm questions covering atlas model,
mesh ownership, index format, shader unification, streaming integration,
conformance test, and visual verification gate.
Mirrors the N.5 handoff structure that worked well last session:
TL;DR + where N.5 left things + what N.5b inherits + technical detail
+ files to read + brainstorm questions + acceptance criteria + first
30 minutes + things to NOT do.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Records a new Phase A sub-piece: split the single ACDREAM_STREAM_RADIUS
into separate terrain + entity radii so terrain renders to a much
further horizon (WB-style) while entities/scenery stay at the current
closer radius.
Motivated by perf at ACDREAM_STREAM_RADIUS=5 dropping from ~810 fps
to ~200-300 fps because everything stays full-detail. Both retail and
WorldBuilder render terrain way out and strip entities at distance.
Estimate: 3-5 days for the radius split + fog tuning; +1 week if
terrain LOD via mesh decimation is included. Not yet brainstormed.
N.8 (sky + particles via WB's SkyboxRenderManager + ParticleEmitterRenderer)
was already on the roadmap; user confirmed they want it tracked there.
No edit needed for N.8 — already at the right level of detail.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Reframe the selection-blink follow-up so it doesn't suggest near-term
work. Was listed in N.5 ship record as "Phase B.4 follow-up adds the
field" — now phrased as open backlog with the hook reserved in
mesh_modern.vert's InstanceData comment for whoever eventually picks
it up.
The shader hook itself is unchanged — change is purely doc wording in
the plan SHIP record + CLAUDE.md WB integration cribs.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Final cross-cutting review of N.5 found that Task 15's deletion of
mesh_instanced.vert/.frag left InstancedMeshRenderer orphaned —
ACDREAM_USE_WB_FOUNDATION=0 silently rendered terrain+sky only with
no entities. The SHIP commit's "[x] ACDREAM_USE_WB_FOUNDATION=0 still
works" claim was inaccurate.
Resolution: formal retirement of the legacy renderer path within N.5
instead of deferring to N.6.
Deleted:
- src/AcDream.App/Rendering/InstancedMeshRenderer.cs
- src/AcDream.App/Rendering/StaticMeshRenderer.cs
- src/AcDream.App/Rendering/Wb/WbFoundationFlag.cs
GameWindow simplified — capability detection is unconditional, missing
bindless throws NotSupportedException with a clear message at startup.
WbDrawDispatcher + mesh_modern shader load are mandatory after init.
No escape hatch.
GpuWorldState simplified — WbFoundationFlag.IsEnabled guards on
AddLandblock/RemoveLandblock removed; adapter calls are unconditional
when the adapter is non-null.
PendingSpawnIntegrationTests updated — WbFoundationFlag.ForTestsOnly_ForceEnable
static ctor removed (flag is gone; adapter calls are unconditional).
The ApplyLoadedTerrain physics-data loop was also simplified: the
EnsureUploaded sub-loop that fed InstancedMeshRenderer is gone;
_pendingCellMeshes is now explicitly cleared to prevent unbounded
accumulation (the worker thread still populates it, but WB handles
EnvCell geometry through its own pipeline).
Spec §2 Decision 5 + §10 Out-of-Scope updated. Plan ship-amendment
section added. Roadmap updated (N.5 ships with retirement; N.6 scope
narrowed to perf-only). CLAUDE.md "WB integration cribs" updated.
Perf baseline doc updated. WbDrawDispatcher class summary docstring
corrected to describe the as-shipped SSBO + multi-draw-indirect path.
ISSUES.md #51 updated (terrain not in N.5 scope; deferred to N.7).
Bindless support is now a hard requirement. Modern desktop GPUs
universally expose GL_ARB_bindless_texture + GL_ARB_shader_draw_parameters;
if a user hits the NotSupportedException, that's a real bug report
worth investigating, not a silent fallback.
Build: 0 errors, 0 warnings. Tests: 71/71 (Wb+MatrixComposition+TextureCacheBindless filter).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Moves N.5 from in-flight to Shipped (2026-05-08). N.6 (retire
InstancedMeshRenderer + perf polish) becomes the in-flight phase.
CLAUDE.md in-flight pointer updated to match.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Records the as-shipped state: acceptance gate verdicts, plan amendments
captured during execution, code-review adjustments per task, out-of-scope
N.6 follow-ups, and a complete files-changed summary.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
CPU dispatcher: 1227 µs / frame median (1303 µs p95) at Holtburg
courtyard, 1662 groups in working set. Inferred ~810 fps sustained.
CPU dispatcher acceptance gate (≤70% of N.4): PASS — N.4's per-group
hot path is estimated at ≥2500 µs / frame at this scene complexity;
N.5 is comfortably under half.
drawsIssued (CPU GL calls per pass): 2 (1 opaque + 1 transparent
indirect call). Down from N.4's ~hundreds per pass. PASS.
GPU timing: unmeasured. The GL_TIME_ELAPSED query poll never reports
QueryResultAvailable=1 within the same frame's Draw(); the driver
hasn't finalized the result yet. Fix is double-buffering (queryA
on frame N, read on N+2). Deferred to N.6 perf polish — doesn't block
N.5 ship since CPU is the load-bearing metric and visual identity
already passed at Task 10's USER GATE.
Direct N.4 baseline NOT measured. Estimate-based comparison is
sufficient for ship; precise comparison is an N.6 follow-up.
Baseline doc at docs/plans/2026-05-08-phase-n5-perf-baseline.md.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Code quality review caught:
- sizeofDEIC was a local; promoted to public const DrawCommandStride
so tests can reference it symbolically.
- BatchDataPublic layout invariant (size + field offsets) wasn't
asserted in tests. Added BatchDataPublic_LayoutMatchesPrivateBatchData
+ DrawCommandStride_MatchesStructSize tests to gate Task 10's
MemoryMarshal.Cast<BatchData, BatchDataPublic> safety.
- Plan doc updated: BatchDataPublic spec was Pack=4 (wrong — must
match private BatchData's Pack=8 for the cast to work). Implementation
was already correct; plan now matches.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Original Task 5 draft used hardcoded vec3 ambient/sun uniforms in
mesh_modern.frag. Reading actual mesh_instanced.frag revealed it uses
a SceneLighting UBO at binding=1 with 8 lights, fog params (start/end/
lightning/mode), fog color, camera/time, and per-channel clamp.
Revised: mesh_modern.frag preserves the full SceneLighting UBO +
accumulateLights + applyFog + lightning flash + per-channel clamp.
mesh_modern.vert adds vWorldPos output (consumed by accumulateLights
and applyFog). Visual identity to N.4's lighting model preserved.
Two-pass alpha-test (N.5 Decision 2) sits inside the same shader,
gated by uRenderPass instead of uTranslucencyKind.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Original Task 3 had Bindless* methods calling the legacy Texture2D
GetOrUpload* and converting the GL name to a bindless handle —
producing a sampler2D texture sampled via sampler2DArray (GLSL type
mismatch).
Revised: Task 3 introduces three parallel cache dictionaries
(_bindlessBySurfaceId / _bindlessByOverridden / _bindlessByPalette)
storing both the GL texture name and the resident handle. Bindless*
methods call DecodeFromDats + UploadRgba8AsLayer1Array directly with
their own caching; legacy three-cache structure mirrored exactly.
Task 4 (Dispose) updated to: (1) MakeNonResident on every bindless
handle FIRST, (2) DeleteTexture on every Texture2DArray name, (3)
DeleteTexture on every legacy Texture2D handle. Order matters per
ARB_bindless_texture spec.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Implementer caught that the original Task 2 (replace UploadRgba8 target
with Texture2DArray) would break four legacy consumers whose shaders
sample via sampler2D: WbDrawDispatcher (pre-rewrite path),
StaticMeshRenderer, InstancedMeshRenderer (legacy escape hatch),
ParticleRenderer.
Revised: Task 2 ADDS a parallel UploadRgba8AsLayer1Array. Existing
UploadRgba8 (Texture2D) stays for legacy callers. Task 3's Bindless*
methods will call the new array path with their own cache dictionaries.
Same surface may be uploaded twice during transition; bounded cost.
N.6 cleanup deletes the legacy path.
Task 3 will be amended at dispatch time to reflect parallel caches.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The TextureCacheBindlessTests.cs file is created in Task 3 (where it
gets meaningful test cases), not Task 1. Removed it from Task 1's
Files list and added an explicit note. Caught during Task 1 code review.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Brainstormed 2026-05-08 over 8 design questions. Captures:
- Texture model: sampler2DArray for ALL textures (1-layer wrap for
per-instance composites). Matches WB's modern shader, future-proofs
for atlas adoption in N.6+.
- Translucency: WB's two-pass alpha-test (no native Additive on GfxObj
surfaces; falsifiable at visual verification).
- Data delivery: all-SSBO. Instances[] at binding=0, Batches[] at
binding=1. Indexed by gl_BaseInstanceARB+gl_InstanceID and
gl_DrawIDARB respectively.
- Bindless residency: resident on upload, never release. Bounded
content; instrument under ACDREAM_WB_DIAG=1.
- Escape hatch: two-way flag preserved. N.5 replaces N.4's draw method
in place; legacy InstancedMeshRenderer remains the safety net.
- Perf measurement: CPU stopwatch + GL_TIME_ELAPSED queries, logged
via [WB-DIAG]. Acceptance gates pasted into SHIP commit.
- Persistent-mapped buffers: deferred to N.6.
- Per-instance highlight (selection blink): deferred; field reserved
in InstanceData for Phase B.4 follow-up.
Spec at docs/superpowers/specs/2026-05-08-phase-n5-modern-rendering-design.md
covers architecture, components, per-frame data flow walk-through,
translucent rendering, error handling + fallback, testing + acceptance,
risks, and explicit out-of-scope list. Plan + task breakdown comes next.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Detailed briefing for the next agent picking up Phase N.5 (Modern
Rendering Path: bindless textures + glMultiDrawElementsIndirect on
N.4's foundation). Covers:
- Where N.4 left things (commits, what works, gotchas inherited)
- The two-feature pairing (why bindless + indirect together)
- Files to read first (WB shaders, our dispatcher, CLAUDE.md cribs)
- 8 brainstorm questions to resolve before spec
- Spec + plan structure (matching N.4's pattern)
- Acceptance criteria
- Things to explicitly NOT do
Sized for a fresh session to pick up cold without spelunking through
months of session history.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Phase N.4 (Rendering Pipeline Foundation) ships. WbFoundationFlag
flips to default-on (== "1" → != "0"). WB's ObjectMeshManager is
now acdream's production mesh pipeline; WbDrawDispatcher is the
production draw path. Legacy InstancedMeshRenderer is retained as
ACDREAM_USE_WB_FOUNDATION=0 escape hatch until N.6 retires it.
Visual verification at Holtburg passed:
- Scenery (trees / rocks / fences / buildings) renders correctly
- Characters connected with full close-detail geometry (Issue #47
preserved — GfxObjDegradeResolver path intact)
- FPS substantially improved by grouped instanced draws + per-entity
AABB cull + opaque front-to-back sort + palette-hash memoization
Three high-value WB API gotchas surfaced during Task 26 visual
verification and are now documented in CLAUDE.md "WB integration
cribs" + plan Adjustments 7-9 + memory project_phase_n4_state.md:
1. ObjectMeshManager.IncrementRefCount only bumps a counter — does
NOT trigger mesh loading. Call PrepareMeshDataAsync explicitly.
2. ObjectRenderBatch.SurfaceId is unset — read batch.Key.SurfaceId.
3. Modern rendering (GL 4.3 + bindless = every modern GPU) packs
every mesh into ONE global VAO/VBO/IBO. Use
glDrawElementsInstancedBaseVertex(BaseInstance) with FirstIndex +
BaseVertex from the batch, not naive DrawElementsInstanced.
Plan doc flipped to Final state. Roadmap N.4 → Live ✓; N.5 rebranded
from "Terrain rendering" to "Modern rendering path" (bindless +
multi-draw indirect on top of N.4's foundation; terrain rendering
moves to N.5b). CLAUDE.md "Currently in flight" pointer updated to
N.5. New memory file project_phase_n4_state.md preserves the three
WB gotchas for cross-session continuity.
n4-verify*.log added to .gitignore.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Resolves Adjustment 4 (Option A): WorldEntity now carries the server-
sent AnimPartChange data as PartOverrides and a HiddenPartsMask bitmask.
EntitySpawnAdapter.OnCreate populates AnimatedEntityState from these
fields at spawn time. GameWindow's CreateObject handler converts the
network-layer AnimPartChange records into lightweight PartOverride
structs.
This unblocks Task 22: the WbDrawDispatcher can now resolve per-part
GfxObj overrides and hidden-part suppression from entity state.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Self-contained briefing for whoever picks up Week 4 (Tasks 22-28):
the WbDrawDispatcher full draw loop + sky-pass preservation +
visual verification + flag default-on + legacy-code deletion +
plan finalization.
Highlights two unresolved decisions that need a brainstorm checkpoint
at the start of Week 4 (NOT 'just dispatch'):
- Adjustment 4 plumbing: WorldEntity needs HiddenPartsMask +
AnimPartChanges fields, OR EntitySpawnAdapter.OnCreate takes them
as separate parameters. Decision before Task 22 writes code.
- Surface-metadata side-table population strategy for Task 23.
References the living-document plan + spec + 5 prior adjustments
so a fresh agent has full context cold.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Week 3 ships: AnimatedEntityState (Tasks 16+18+19, commit ce72c57),
EntitySpawnAdapter routing server-spawned content through the existing
TextureCache.GetOrUploadWithPaletteOverride path (Task 17, commit
c02c307). 947 tests pass.
Adjustment 4: WorldEntity lacks HiddenPartsMask + AnimPartChanges
fields. Adapter scaffolding ships; AnimatedEntityState gets default
values (empty mask + empty override map). Plumbing deferred to Task 22
brainstorm — either add fields to WorldEntity or thread through a
separate parameter to EntitySpawnAdapter.OnCreate.
Adjustment 5: Task 20 (per-instance decode conformance) is structural.
Both old and new paths call the same TextureCache function — bytes
identical by construction. EntitySpawnAdapterTests already cover the
routing. No separate conformance test file needed.
Next: Task 22 (Week 4) — WbDrawDispatcher full draw loop. First task
that actually draws through WB and unlocks Adjustment 3's mitigation
(dual-pipeline cost resolves when legacy renderer can short-circuit
its upload for atlas-tier content).
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>