acdream/docs/research/2026-05-10-post-a5-polish-handoff.md
Erik c111312e13 docs(post-A.5): cold-start handoff for the next session
Captures the three deferred items from A.5 ship:
- ISSUE #52: lifestone visual missing (1-3 hours, fast win)
- ISSUE #54: JobKind plumbing through BuildLandblockForStreaming
  (~30 min - 1 hour, worker-thread efficiency cleanup)
- ISSUE #53: Tier 1 entity-classification cache retry (~5-7 days,
  biggest perf win remaining; needs animation-mutation audit before
  designing to avoid the freeze-pose bug from the first attempt)

Doc covers: A.5 final state + 3 high-value gotchas, files to read,
per-priority detail with effort estimates and acceptance criteria,
what NOT to do, the first-30-minute workflow, and the full A.5
commit chain for reference.

Phase is sized ~1 week if all three priorities land. The audit
step on Tier 1 is the highest-leverage investment.

Tier 2 + Tier 3 (static/dynamic split + GPU compute culling) are
explicitly out-of-scope for this phase — separate multi-week phases
per docs/plans/2026-05-10-perf-tiers-2-3-roadmap.md.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-10 10:16:10 +02:00


Phase Post-A.5 Polish — Cold-Start Handoff

Created: 2026-05-10, immediately after A.5 SHIP + merge to main (d3d78fa).
Audience: the next agent picking up post-A.5 polish work.
Purpose: give you everything you need to start the polish phase cold, without spelunking through the A.5 session's 200+ messages.


TL;DR

A.5 just shipped. Two-tier streaming is live (N₁=4 near, N₂=12 far) with a 2.3 km fog horizon, off-thread mesh build, entity dispatcher tightening, mipmaps + 16x AF, MSAA 4x + A2C foliage, depth-write audit, BUDGET_OVER diag, and a full Quality Preset system (Low/Medium/High/Ultra) with env-var overrides + F11 mid-session re-apply.

A.5 was an enormous phase (29 numbered tasks + T22.5 mid-execution scope add + Bug A + Bug B post-T26 fixes). Spec at docs/superpowers/specs/2026-05-09-phase-a5-two-tier-streaming-design.md (~700 lines). Plan at docs/superpowers/plans/2026-05-09-phase-a5-two-tier-streaming.md (~2400 lines).

Three things were intentionally deferred to this phase:

  1. Lifestone visual missing (ISSUE #52). The Holtburg lifestone — a known visual landmark — hasn't been rendering since earlier in A.5 development. User confirmed they noticed it earlier but didn't flag it; deferred to post-ship. Highest user-perception value to fix.

  2. JobKind plumbing through BuildLandblockForStreaming (ISSUE #54). Bug A's fix patches at the worker output by stripping entities from far-tier LoadedLandblocks after the full load runs. The worker still wastes CPU on hydration + scenery generation that gets thrown away. Cleaner fix: make the worker SKIP that work for far-tier loads. ~30 min - 1 hour. Smallest cleanup, biggest worker-thread efficiency win.

  3. Tier 1 entity-classification cache retry (ISSUE #53). First attempt (commit 3639a6f, reverted at 9b49009) cached meshRef.PartTransform which is mutated per frame for animated entities — froze animations. Retry needs a careful read of AnimationSequencer + AnimationHookRouter first to map ALL the per-frame mutations of MeshRef state, then design a cache that bypasses animated entities OR caches only the animation-invariant subset. Biggest perf headroom available — math says it should drop the entity dispatcher from 3.5ms to 1-1.5ms, hitting the spec's 2.0ms budget.

The phase is sized ~1 week if all three land cleanly. Could be longer if Tier 1's animation audit reveals something subtle.


Where A.5 left things

Branch state

  • main is at d3d78fa ("Merge branch 'claude/hopeful-darwin-ae8b87' — Phase A.5 SHIP + Quality Preset system").
  • A.5 SHIP commit at 9245db5 (one commit before the merge bubble).
  • Roadmap entry: A.5 moved from "Phases ahead" → "Phases already shipped" table.
  • CLAUDE.md "Currently in flight" updated to "Post-A.5 polish — Tier 1 retry + lifestone fix + JobKind plumbing".

What works in A.5 (final post-fix state)

  • Two-tier streaming end-to-end: StreamingRegion with RecenterTo returning a 5-list TwoTierDiff (ToLoadFar/ToLoadNear/ToPromote/ToDemote/ToUnload) with hysteresis radius+2 on both tiers; StreamingController.Tick routes by LandblockStreamJobKind; LandblockStreamer worker thread does dat reads + mesh build off the render thread.
  • Bug A fixed: LandblockStreamer.HandleJob strips entities for LoadFar results before posting Loaded. Far-tier ships terrain only as the spec promised.
  • Bug B fixed: WalkEntities uses _walkScratch field reused across frames, no per-frame List allocation.
  • Quality Preset system: Low / Medium / High / Ultra presets with per-preset radii + MSAA + anisotropic + A2C + max-completions. 6 env-var overrides, one per field. F11 → Display tab dropdown for mid-session change. DisplaySettings.Quality persists in settings.json. GameWindow.ReapplyQualityPreset rebuilds the streaming pipeline for radius changes.
  • Visual quality stack: mipmaps + 16x anisotropic on TerrainAtlas. MSAA 4x + alpha-to-coverage on foliage shader. Depth-write audit + lock-in test (5 cases).
  • Fog horizon: FogStart = N₁ × 192m × 0.7 ≈ 538m. FogEnd = N₂ × 192m × 0.95 ≈ 2189m. Tunable via ACDREAM_FOG_START_MULT / ACDREAM_FOG_END_MULT.
  • DIAG: [WB-DIAG] and [TERRAIN-DIAG] flag BUDGET_OVER when median exceeds the per-subsystem spec budget (entity 2.0ms, terrain 1.0ms).
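
The fog-horizon arithmetic in the list above can be written out as a minimal sketch. The FogHorizon class, its method names, and the way the multiplier is parsed are assumptions made for illustration and are not taken from acdream's code. Only the 192 m landblock size, the 0.7 / 0.95 multipliers, and the two environment variable names come from this handoff. With N₁=4 and N₂=12 it yields roughly 537.6 m and 2188.8 m.

```csharp
using System;
using System.Globalization;

// Hypothetical helper: names are illustrative, not acdream's actual API.
public static class FogHorizon
{
    public const float LandblockSizeMeters = 192f;

    // Defaults are the multipliers quoted in this handoff (0.7 and 0.95).
    public static (float Start, float End) Compute(
        int nearRadius, int farRadius,
        float startMult = 0.7f, float endMult = 0.95f)
    {
        float start = nearRadius * LandblockSizeMeters * startMult;
        float end = farRadius * LandblockSizeMeters * endMult;
        return (start, end);
    }

    // Assumed override mechanism: read a float from an environment variable
    // (e.g. ACDREAM_FOG_START_MULT) and fall back when it is unset or malformed.
    public static float ReadMultiplier(string envVar, float fallback)
    {
        var raw = Environment.GetEnvironmentVariable(envVar);
        return float.TryParse(raw, NumberStyles.Float, CultureInfo.InvariantCulture, out float value)
            ? value
            : fallback;
    }
}
```

A caller would combine the two, for example FogHorizon.Compute(4, 12, FogHorizon.ReadMultiplier("ACDREAM_FOG_START_MULT", 0.7f)). Parsing with the invariant culture keeps a value like 0.7 from being misread on machines whose locale uses a decimal comma.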

Final perf state at A.5 SHIP (horizon-safe Quality preset)

User hardware: AMD Radeon RX 9070 XT, 240 Hz @ 2560×1440.

Settings tested: NEAR_RADIUS=4, FAR_RADIUS=12, MSAA=0, A2C=0, ANISOTROPIC=4, MAX_COMPLETIONS=2.

| Subsystem | cpu_us median | cpu_us p95 |
|---|---|---|
| Entity dispatcher | ~3500 µs (3.5 ms) | ~4000 µs |
| Terrain dispatcher | ~21 µs | ~26 µs |

Total frame time math: ~4-5 ms = ~200-240 FPS at standstill. User reported "Better now" — not the 240Hz spec target but a 5× improvement from the broken pre-Bug-A state (~40 FPS).

The 1.5ms gap to the 2.0ms entity dispatcher budget is what Tier 1 closes (per ISSUE #53 + the perf-tier roadmap).

What was NOT validated at SHIP

  • Full High preset (radius=4/12, MSAA 4x, A2C on, anisotropic 16x). Crashed the entire OS at first attempt earlier in A.5 development. Bug A was likely the trigger (CPU dispatcher saturating + GPU command queue overflowing). With Bug A fixed, this likely works — but never re-tested. Re-testing is part of this phase's stretch goal.
  • Visual gate at full quality. Same — only validated at horizon-safe settings.
  • Walking trace at any preset. Brief walking observed but not metric-captured.

Three high-value gotchas captured in A.5 memory

These are at ~/.claude/projects/.../memory/project_phase_a5_state.md:

  1. Worker-side JobKind routing was the load-bearing far-tier optimization. T13/T16 wired the controller side; the worker never branched on Kind. ~5x perf regression that wasn't caught by spec/code reviews.
  2. WalkEntities's "extract a list-producing helper" pattern is a perf antipattern. ~480 KB / frame allocation. Implementer flagged "future N.6 optimization" in self-review; review should have caught that "future" was actually "now."
  3. Caching mutable per-frame state silently breaks animation. Tier 1's first attempt. The "trust MeshRefs as the source of truth" comment in the dispatcher is true but misleading — MeshRefs IS the source of truth, but it's mutated EVERY frame for animated entities.

(Full memory entry has 5 gotchas; these three are the load-bearing ones for post-A.5.)
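
Gotcha 2 is easier to recognize in review when both shapes sit side by side. The sketch below is a stand-in, not the real WbDrawDispatcher: the EntityWalker class, the int entity ids, and the IsVisible placeholder are assumptions for illustration. Only the _walkScratch field name and the WalkEntities / WalkEntitiesInto method names echo this document.

```csharp
using System.Collections.Generic;

// Hypothetical stand-in for the dispatcher; only the allocation pattern is the point.
public sealed class EntityWalker
{
    // Reused across frames. Clear() resets Count but keeps the backing array,
    // so once the list has grown, steady-state frames allocate nothing.
    private readonly List<int> _walkScratch = new List<int>();

    // Antipattern: a fresh list (plus backing-array growth) on every call.
    public List<int> WalkEntities(IReadOnlyList<int> entityIds)
    {
        var result = new List<int>();
        Collect(entityIds, result);
        return result;
    }

    // Scratch-list variant: refills the reused field and hands it back.
    // Callers must not keep the returned list past the next call.
    public List<int> WalkEntitiesInto(IReadOnlyList<int> entityIds)
    {
        _walkScratch.Clear();
        Collect(entityIds, _walkScratch);
        return _walkScratch;
    }

    private static void Collect(IReadOnlyList<int> entityIds, List<int> output)
    {
        for (int i = 0; i < entityIds.Count; i++)
            if (IsVisible(entityIds[i]))
                output.Add(entityIds[i]);
    }

    // Placeholder visibility test (even ids are "visible").
    private static bool IsVisible(int id) => (id & 1) == 0;
}
```

The trade-off is aliasing: every call to WalkEntitiesInto returns the same list instance, so the result is only valid until the next call.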


Files to read before brainstorming

In rough order:

  1. docs/superpowers/specs/2026-05-09-phase-a5-two-tier-streaming-design.md — A.5 spec, full design rationale + Quality Preset system (§4.10) + acceptance criteria reshape (§2). Skim for vocabulary; read §4.10 in full.
  2. docs/plans/2026-05-10-perf-tiers-2-3-roadmap.md — Tier 2 (static/dynamic split) + Tier 3 (GPU compute culling) roadmap. Read for context on where Tier 1 fits in the perf optimization tower.
  3. docs/ISSUES.md issues #52, #53, #54 — the three deferred items in tactical-list form.
  4. memory/project_phase_a5_state.md — the 5 gotchas. Critical for avoiding the same traps in this phase.
  5. src/AcDream.App/Streaming/LandblockStreamer.cs: HandleJob is where Bug A's patch lives and where ISSUE #54's cleaner fix will go.
  6. src/AcDream.App/Rendering/Wb/WbDrawDispatcher.cs: WalkEntities + Draw's inner loop. Where Tier 1's retry will operate.
  7. src/AcDream.Core/Physics/AnimationSequencer.cs — the per-frame animation engine. Read this BEFORE designing Tier 1's retry. Pay specific attention to anywhere it touches meshRef.PartTransform or any other field that the dispatcher reads.
  8. src/AcDream.App/Animation/AnimationHookRouter.cs (or similar) — the hook fan-out from animation events. Same audit reason as #7.

Per-priority detail

Priority 1 — Lifestone missing (ISSUE #52)

Estimated effort: 1-3 hours. Could be a 1-line fix or could surface a deeper issue.

The Holtburg lifestone is a Setup-multi-part entity (the spinning blue crystal pillar). User reports it hasn't been rendering since earlier in A.5 development. They noticed but didn't flag during the session.

Hypotheses:

  • Bug A's strip caught a near-tier entity. The current strip in LandblockStreamer.HandleJob only fires when tier == LandblockStreamTier.Far. Holtburg's lifestone is in a near-tier LB (Holtburg's center, ~LB 0xA9B4). Should NOT have been stripped. But verify — maybe the LB's tier resolution at first-tick is wrong.
  • Earlier visual regression from a different commit. User said it was missing in earlier runs too. Could be from N.5b, an N.5b follow-up, or even older. Pinning it down requires correlating git log -- docs/ISSUES.md with the visible state in earlier builds.
  • Setup-rendering edge case. The lifestone has unusual properties (animated rotation, particle effects on top). Maybe it's a Setup with some sub-mesh that the dispatcher's SetupParts walk filters out.
  • Dat-state mismatch. The lifestone's GfxObj id might be in a part of the dat that's failing decode.

Investigation steps:

  1. Launch the client + walk to Holtburg lifestone position.
  2. Check [WB-DIAG] for meshMissing count — if non-zero, some entity's mesh isn't loading.
  3. Use the cdb attach toolchain (per CLAUDE.md "Retail debugger toolchain") if needed to compare vs retail's lifestone rendering.
  4. Compare to ACViewer / WorldBuilder to see if the lifestone renders there. If yes, our renderer has a regression. If no, the issue is dat-side or in shared decode logic.
  5. Identify the GfxObj/Setup id for the lifestone (likely well-known retail ID; check docs/research/named-retail/ or ACViewer reference).
  6. Trace: does _meshAdapter.TryGetRenderData(lifestoneId) return non-null? Does the resulting renderData.Batches have entries?

Acceptance: lifestone renders correctly (visible spinning blue crystal at the Holtburg town center).

Priority 2 — JobKind plumbing through BuildLandblockForStreaming (ISSUE #54)

Estimated effort: 30 min - 1 hour.

Currently LandblockStreamer.HandleJob strips entities POST-load for far-tier:

case LandblockStreamJob.Load load:
    var lb = _loadLandblock(load.LandblockId);  // full load
    var mesh = _buildMeshOrNull(load.LandblockId, lb);
    var tier = load.Kind == LandblockStreamJobKind.LoadFar ? LandblockStreamTier.Far : LandblockStreamTier.Near;
    if (tier == LandblockStreamTier.Far && lb.Entities.Count > 0)
    {
        // Strip entities — far-tier ships terrain only.
        lb = new LoadedLandblock(...empty entities...);
    }
    _outbox.Writer.TryWrite(new Loaded(... lb, mesh ...));
    break;

The full _loadLandblock does:

  1. Read LandBlock heightmap (cheap).
  2. Read LandBlockInfo (medium).
  3. LandblockLoader.BuildEntitiesFromInfo (extract stabs/buildings).
  4. Hydrate stab/building meshRefs (medium).
  5. Run scenery generation (heavy — ~50-200 procedural entities × meshRef hydration).
  6. Build interior cell entities.

For far-tier, only step 1 is needed. Steps 2-6 are wasted CPU on the worker thread.

Refactor plan:

  1. Change the streamer's _loadLandblock factory to take LandblockStreamJobKind:
    private readonly Func<uint, LandblockStreamJobKind, LoadedLandblock?> _loadLandblock;
    
  2. In GameWindow, the factory closure branches:
    loadLandblock: (id, kind) => kind == LandblockStreamJobKind.LoadFar
        ? BuildLandblockHeightmapOnly(id)
        : BuildLandblockForStreaming(id),
    
  3. New BuildLandblockHeightmapOnly returns a LoadedLandblock with the heightmap dat record + empty entity list. Cheap — no LandBlockInfo read, no scenery generation.
  4. Remove the post-load strip in HandleJob (no longer needed).
  5. Worker-thread CPU drops measurably; horizon fill on first traversal speeds up.
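
The refactor plan above can be sketched end to end with stand-in types. Everything here is an assumption for illustration: the simplified LoadedLandblock record, the LoadNear enum member, the placeholder heightmap size, the string entity list, and the FullLoads counter do not mirror acdream's real definitions. The point is only that the far-tier branch never enters the heavy path, so there is nothing left to strip afterwards.

```csharp
#nullable enable
using System;
using System.Collections.Generic;

// Stand-in types: shapes are assumed, not copied from acdream.
public enum LandblockStreamJobKind { LoadNear, LoadFar }

public sealed record LoadedLandblock(uint Id, float[] Heights, IReadOnlyList<string> Entities);

public static class LandblockFactory
{
    // Counts heavy loads so a test can observe that far-tier skips them.
    public static int FullLoads;

    // Stand-in for the full path (LandBlockInfo, hydration, scenery, interiors).
    public static LoadedLandblock BuildLandblockForStreaming(uint id)
    {
        FullLoads++;
        return new LoadedLandblock(id, new float[81], new[] { "stab", "scenery" });
    }

    // Far-tier path: heightmap only, empty entity list, no heavy work.
    // The array size is a placeholder, not a claim about the dat format.
    public static LoadedLandblock BuildLandblockHeightmapOnly(uint id)
        => new LoadedLandblock(id, new float[81], Array.Empty<string>());

    // Matches the factory signature proposed in step 1 of the plan.
    public static Func<uint, LandblockStreamJobKind, LoadedLandblock?> Create()
        => (id, kind) => kind == LandblockStreamJobKind.LoadFar
            ? BuildLandblockHeightmapOnly(id)
            : BuildLandblockForStreaming(id);
}
```

A counter like FullLoads is one cheap way to turn "the worker skips the work" into a unit-testable claim instead of relying only on feel.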

Acceptance:

  • Build green; existing 999+ tests pass.
  • Streaming worker thread is measurably faster on first-traversal (the user can validate with [WB-DIAG] worker queue depth or just feel the responsiveness when walking into virgin region).
  • Visible behavior unchanged — far tier looks the same as before.

Priority 3 — Tier 1 entity-classification cache retry (ISSUE #53)

Estimated effort: ~5-7 days. Substantial because the audit step is critical.

This is the BIG perf win remaining for A.5's CPU dispatcher. Math says entity dispatcher 3.5ms → 1-1.5ms = ~300-400 FPS at standstill. Drops the dispatcher inside the spec's 2.0ms budget.

The first attempt's failure (commit 3639a6f, reverted at 9b49009):

Cached meshRef.PartTransform baked into per-(entity, batch) classification at first-frame visit. For static entities, this is stable forever. For animated entities, meshRef.PartTransform is updated EVERY FRAME by AnimationSequencer to apply the current skeletal pose. The cache froze the pose.

User-visible symptoms:

  • NPCs / players stop animating.
  • Some buildings (likely those mistakenly in animatedEntityIds) draw at wrong positions.

The retry's audit step (do this BEFORE designing the cache):

Read src/AcDream.Core/Physics/AnimationSequencer.cs and trace EVERY assignment to meshRef.PartTransform (and any other field on MeshRef, WorldEntity, or related state that the dispatcher reads). Likely write sites:

  • AnimationSequencer.TickAnimations per-frame skeletal pose update
  • AnimationHookRouter for hooks like AnimSetPose
  • Live network handlers that mutate entity.Position / entity.Rotation (T18 already migrated these to SetPosition for AABB invalidation; double-check)
  • EntitySpawnAdapter for ObjDescEvent / palette swap

For each write site, decide: is this entity STATIC (write only at spawn) or DYNAMIC (write per-frame or in response to network events)?

Cache design options after the audit:

(a) Static-only cache. Only cache entities where animatedEntityIds.Contains(entity.Id) == false. Animated entities use today's per-frame classification path. Cleanest, but requires animatedEntityIds to be a stable signal (it is — _animatedEntities dict in GameWindow is the source).

(b) Dynamic-aware cache with invalidation hooks. Cache everything but expose InvalidateEntity(uint) / RefreshEntityPalette(uint) for the dispatcher's invalidation. Wire from the network layer (palette swap fires invalidation; ObjDesc event fires invalidation). More complex but might let animated entities also benefit.

(c) Static-only + animated-bypass + diagnostic check. Like (a), but in DEBUG builds, log a warning every frame if a cached entity's meshRef.PartTransform differs from the cached value (catches mis-classified dynamics). Belt-and-suspenders.

Recommendation: start with (a). Ship Tier 1 for static entities only. Animated path stays slow but correct. If perf gate finds the static-only Tier 1 isn't enough, escalate to (c) for safety + (b) later.
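
Option (a) can be sketched as a small generic cache. This is a minimal illustration under stated assumptions, not a design for the real dispatcher: the StaticOnlyCache name, the uint entity key, and the classify delegate are hypothetical, and the isAnimated predicate stands in for whatever signal the audit confirms (for example a lookup against animatedEntityIds).

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sketch of option (a): cache static entities, bypass animated ones.
public sealed class StaticOnlyCache<T>
{
    private readonly Dictionary<uint, T> _entries = new Dictionary<uint, T>();
    private readonly Func<uint, bool> _isAnimated;

    public StaticOnlyCache(Func<uint, bool> isAnimated) => _isAnimated = isAnimated;

    // Animated entities are classified on every call, so per-frame pose
    // changes are never frozen. Static entities are classified once.
    public T GetOrClassify(uint entityId, Func<uint, T> classify)
    {
        if (_isAnimated(entityId))
            return classify(entityId);

        if (!_entries.TryGetValue(entityId, out var value))
        {
            value = classify(entityId);
            _entries[entityId] = value;
        }
        return value;
    }

    // Call when an entity is removed or its static state changes.
    public bool Invalidate(uint entityId) => _entries.Remove(entityId);
}
```

The three behaviours it exposes (hit for a static entity, bypass for an animated one, invalidation on removal) line up with the new tests listed in the acceptance criteria. Correctness still depends entirely on the predicate being right, which is why the audit comes first.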

Acceptance:

  • Build green; existing 999+ tests pass.
  • 1-3 new tests covering: cache hit for static entity, cache bypass for animated entity, cache invalidation on entity remove.
  • Visual gate: launch + walk Holtburg → North Yanshi at horizon-safe preset; confirm:
    • Animation works (NPCs, player character animate normally)
    • Buildings at correct positions
    • Lifestone (still depending on Priority 1 fix) renders correctly
    • No new visual regressions
  • Perf gate (with [WB-DIAG]):
    • Entity dispatcher cpu_us median drops from ~3.5ms to ≤2.0ms (matches spec budget).
    • p95 stays ≤ 2.5ms.
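
The perf gate above can be checked mechanically from a captured run of cpu_us samples. The sketch below is an assumption-laden illustration: BudgetCheck is a hypothetical helper, and the nearest-rank percentile it uses may not be the definition [WB-DIAG] applies internally. The 2000 µs and 2500 µs defaults are the median and p95 limits stated in the gate.

```csharp
using System;

// Hypothetical helper for checking a captured sample set against the gate.
public static class BudgetCheck
{
    // Nearest-rank percentile over a sorted copy of the samples; p in (0, 100].
    public static double Percentile(double[] samplesUs, double p)
    {
        if (samplesUs.Length == 0)
            throw new ArgumentException("no samples", nameof(samplesUs));

        var sorted = (double[])samplesUs.Clone();
        Array.Sort(sorted);
        int rank = (int)Math.Ceiling(p / 100.0 * sorted.Length);
        return sorted[Math.Clamp(rank, 1, sorted.Length) - 1];
    }

    // True when both the median and the p95 are within budget.
    public static bool PassesGate(
        double[] samplesUs, double medianBudgetUs = 2000, double p95BudgetUs = 2500)
        => Percentile(samplesUs, 50) <= medianBudgetUs
           && Percentile(samplesUs, 95) <= p95BudgetUs;
}
```

Checking both numbers matters because a cache can lower the median while leaving spikes on cache-miss frames, and only the p95 would show that.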

What's NOT in this phase

  • Tier 2 (static/dynamic split with persistent groups). Separate ~2-week phase. See docs/plans/2026-05-10-perf-tiers-2-3-roadmap.md.
  • Tier 3 (GPU compute culling). Separate ~1-month phase. Same roadmap.
  • Full High preset crash investigation beyond casual retest. Stretch goal: re-test the High preset with Bug A + B fixed, see if it's stable now. If it crashes, file a new issue and continue. Don't deep-dive in this phase.
  • EnvCell modern path migration, Sky/Particles modern path, Shadow mapping — all later phases.
  • N.6 perf polish (the previously-flagged "next phase"). N.6 was the original CLAUDE.md "Currently in flight" target before A.5. Most of N.6's scope was rolled into A.5 (perf-tier work). What's left of N.6 (persistent-mapped indirect buffer, GPU-side culling) overlaps with Tier 2/3 and should be re-scoped after Tier 1 lands.

Acceptance criteria (whole phase)

  • All three priorities (Lifestone, JobKind, Tier 1) shipped or one is explicitly deferred with documented reasoning.
  • Build green throughout. 999+ tests pass; the 8 pre-existing physics/input failures stay at 8.
  • N.5b conformance sentinel intact (TerrainSlot, TerrainModernConformance, Wb*, MatrixComposition, TextureCacheBindless, SplitFormulaDivergence — all clean).
  • Visual gate: lifestone renders; animation works; horizon visible at ~2.3km; smooth walking trace; no new artifacts.
  • Perf gate (post-Tier-1): entity dispatcher cpu_us median ≤ 2.0ms at horizon-safe preset, ~250-300 FPS at standstill.
  • Memory entry written + roadmap "shipped" row updated for the polish phase.

What you'll be doing in the first 30 minutes

  1. Read this handoff in full.
  2. Verify build green: dotnet build. Verify ~999 tests pass: dotnet test --no-build.
  3. Read docs/superpowers/specs/2026-05-09-phase-a5-two-tier-streaming-design.md §2, §4.10, §11 (deferred section).
  4. Read docs/plans/2026-05-10-perf-tiers-2-3-roadmap.md Tier 1 section.
  5. Read docs/ISSUES.md issues #52, #53, #54 in full.
  6. Read memory/project_phase_a5_state.md (5 gotchas).
  7. Read src/AcDream.App/Streaming/LandblockStreamer.cs HandleJob method.
  8. Read src/AcDream.App/Rendering/Wb/WbDrawDispatcher.cs Draw + WalkEntitiesInto methods.
  9. Skim src/AcDream.Core/Physics/AnimationSequencer.cs for write-sites of meshRef.PartTransform (Tier 1 retry's audit prerequisite).
  10. Decide: which priority to start with? Recommendation order: 1 (lifestone, fast win), 2 (JobKind, easy cleanup), 3 (Tier 1, biggest perf win + most complex).
  11. Brainstorm with the user on the chosen priority before writing code.
  12. Write a small spec or just the implementation if the priority is small (1 + 2 are small enough to skip a formal spec). Tier 1 (priority 3) needs a spec because of the audit + invalidation design.

Don't skip the audit step on Tier 1. The first attempt failed because of an incomplete read of the animation mutation graph; the second attempt should not repeat that.


Things to NOT do

  • Don't rush Tier 1. Audit first. Write down which entities are static vs dynamic. Write tests that specifically verify animated entities still animate after caching is enabled.
  • Don't bundle Tier 2 or Tier 3 into this phase. Those are dedicated multi-week phases with their own brainstorm + spec + plan cycles.
  • Don't break the N.5b conformance sentinel. Run the filter on every commit:
    dotnet test --no-build --filter "FullyQualifiedName~TerrainSlot|FullyQualifiedName~TerrainModernConformance|FullyQualifiedName~Wb|FullyQualifiedName~MatrixComposition|FullyQualifiedName~TextureCacheBindless|FullyQualifiedName~SplitFormulaDivergence"
    
    Expect 89+ passing, 0 failures.
  • Don't skip the visual gate. Lifestone fix specifically requires looking at the lifestone in-game. Tier 1 retry requires confirming animation works on a moving NPC.
  • Don't delete the _walkScratch field added in Bug B fix. It's load-bearing — without it, Tier 1 retry would re-introduce the per-frame allocation bug.
  • Don't re-add the Tier1 cache that was reverted. Start the retry with a fresh design after the animation audit. Cherry-picking the reverted code will re-introduce the bug.

Reference: A.5's commit chain

Final A.5 commit chain on claude/hopeful-darwin-ae8b87 (merged into main at d3d78fa):

SHA Subject
9245db5 phase(A.5): SHIP — two-tier streaming + horizon LOD + Quality Preset system
d93d823 docs(A.5 T27): roadmap + ISSUES + CLAUDE.md updates for A.5 ship
a28a5b7 docs(A.5 T27): spec + plan amendments for T22.5 + ship
9b49009 Revert "feat(perf): Tier 1 entity classification cache"
3639a6f feat(perf): Tier 1 entity classification cache (REVERTED)
462f9d6 docs(perf): roadmap for Tier 2 + Tier 3 entity-dispatcher optimizations
0ad8c99 fix(A.5): WalkEntities scratch-list pattern (Bug B — T17 GC pressure)
9217fd9 fix(A.5): strip far-tier entities in worker (Bug A — far tier optimization)
28d2c60 feat(A.5 T22.5): wire QualityPreset into renderer + streaming (commit 2/2)
afa4200 feat(A.5 T22.5): QualityPreset schema + tests (commit 1/2)
c473fee feat(A.5 T23): BUDGET_OVER flag in [WB-DIAG] / [TERRAIN-DIAG]
3b684db feat(A.5 T22): fog wired from N₁/N₂ + ACDREAM_FOG_*_MULT env vars
1488ec6 test(A.5 T21): lock in depth-write attribution per translucency kind
26b2871 feat(A.5 T20): MSAA 4x + alpha-to-coverage on foliage
4b84e56 feat(A.5 T19): mipmaps + 16x anisotropic on TerrainAtlas
(...60+ commits earlier in the chain through T1-T18) (see full log on the merge bubble)

The merge bubble preserves the full chain. To inspect any A.5 commit:

git log d3d78fa^..d3d78fa
git show <sha>

Good luck. The phase is well-bounded; the audit step on Tier 1 is the single highest-leverage thing to invest in. The lifestone and JobKind cleanup should be quick wins. After this phase ships, the project is in a great position — A.5 + polish + Tier 2/3 roadmap covers the rendering + perf work for the next several months.

Holler at the user if any of the three priorities reveals scope you didn't expect.