acdream/docs/research/2026-05-10-post-a5-polish-handoff.md
Erik c111312e13 docs(post-A.5): cold-start handoff for the next session
Captures the three deferred items from A.5 ship:
- ISSUE #52: lifestone visual missing (1-3 hours, fast win)
- ISSUE #54: JobKind plumbing through BuildLandblockForStreaming
  (~30 min - 1 hour, worker-thread efficiency cleanup)
- ISSUE #53: Tier 1 entity-classification cache retry (~5-7 days,
  biggest perf win remaining; needs animation-mutation audit before
  designing to avoid the freeze-pose bug from the first attempt)

Doc covers: A.5 final state + 3 high-value gotchas, files to read,
per-priority detail with effort estimates and acceptance criteria,
what NOT to do, the first-30-minute workflow, and the full A.5
commit chain for reference.

Phase is sized ~1 week if all three priorities land. The audit
step on Tier 1 is the highest-leverage investment.

Tier 2 + Tier 3 (static/dynamic split + GPU compute culling) are
explicitly out-of-scope for this phase — separate multi-week phases
per docs/plans/2026-05-10-perf-tiers-2-3-roadmap.md.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-10 10:16:10 +02:00

# Phase Post-A.5 Polish — Cold-Start Handoff
**Created:** 2026-05-10, immediately after A.5 SHIP + merge to main (`d3d78fa`).
**Audience:** the next agent picking up post-A.5 polish work.
**Purpose:** give you everything you need to start the polish phase cold, without spelunking through the A.5 session's 200+ messages.
---
## TL;DR
A.5 just shipped. Two-tier streaming is live (N₁=4 near, N₂=12 far) with a 2.3 km fog horizon, off-thread mesh build, entity dispatcher tightening, mipmaps + 16x AF, MSAA 4x + A2C foliage, depth-write audit, BUDGET_OVER diag, and a full Quality Preset system (Low/Medium/High/Ultra) with env-var overrides + F11 mid-session re-apply.
**A.5 was an enormous phase** (29 numbered tasks + T22.5 mid-execution scope add + Bug A + Bug B post-T26 fixes). Spec at `docs/superpowers/specs/2026-05-09-phase-a5-two-tier-streaming-design.md` (~700 lines). Plan at `docs/superpowers/plans/2026-05-09-phase-a5-two-tier-streaming.md` (~2400 lines).
**Three things were intentionally deferred to this phase:**
1. **Lifestone visual missing (ISSUE #52).** The Holtburg lifestone — a known visual landmark — hasn't been rendering since earlier in A.5 development. User confirmed they noticed it earlier but didn't flag it; deferred to post-ship. **Highest user-perception value to fix.**
2. **JobKind plumbing through `BuildLandblockForStreaming` (ISSUE #54).** Bug A's fix patches at the worker output by stripping entities from far-tier `LoadedLandblock`s after the full load runs. The worker still wastes CPU on hydration + scenery generation that gets thrown away. Cleaner fix: make the worker SKIP that work for far-tier loads. ~30 min - 1 hour. **Smallest cleanup, biggest worker-thread efficiency win.**
3. **Tier 1 entity-classification cache retry (ISSUE #53).** First attempt (commit `3639a6f`, reverted at `9b49009`) cached `meshRef.PartTransform` which is mutated per frame for animated entities — froze animations. Retry needs a careful read of `AnimationSequencer` + `AnimationHookRouter` first to map ALL the per-frame mutations of MeshRef state, then design a cache that bypasses animated entities OR caches only the animation-invariant subset. **Biggest perf headroom available** — math says it should drop the entity dispatcher from 3.5ms to 1-1.5ms, hitting the spec's 2.0ms budget.
The phase is sized ~1 week if all three land cleanly. Could be longer if Tier 1's animation audit reveals something subtle.
---
## Where A.5 left things
### Branch state
- `main` is at `d3d78fa` ("Merge branch 'claude/hopeful-darwin-ae8b87' — Phase A.5 SHIP + Quality Preset system").
- A.5 SHIP commit at `9245db5` (one commit before the merge bubble).
- Roadmap entry: A.5 moved from "Phases ahead" → "Phases already shipped" table.
- CLAUDE.md "Currently in flight" updated to "Post-A.5 polish — Tier 1 retry + lifestone fix + JobKind plumbing".
### What works in A.5 (final post-fix state)
- **Two-tier streaming end-to-end:** `StreamingRegion` with `RecenterTo` returning a 5-list `TwoTierDiff` (ToLoadFar/ToLoadNear/ToPromote/ToDemote/ToUnload) with hysteresis radius+2 on both tiers; `StreamingController.Tick` routes by `LandblockStreamJobKind`; `LandblockStreamer` worker thread does dat reads + mesh build off the render thread.
- **Bug A fixed:** `LandblockStreamer.HandleJob` strips entities for `LoadFar` results before posting Loaded. Far-tier ships terrain only as the spec promised.
- **Bug B fixed:** `WalkEntities` uses `_walkScratch` field reused across frames, no per-frame List allocation.
- **Quality Preset system:** Low / Medium / High / Ultra presets with per-preset radii + MSAA + anisotropic + A2C + max-completions. 6 env-var overrides, one per field. F11 → Display tab dropdown for mid-session change. `DisplaySettings.Quality` persists in settings.json. `GameWindow.ReapplyQualityPreset` rebuilds the streaming pipeline for radius changes.
- **Visual quality stack:** mipmaps + 16x anisotropic on TerrainAtlas. MSAA 4x + alpha-to-coverage on foliage shader. Depth-write audit + lock-in test (5 cases).
- **Fog horizon:** FogStart = N₁ × 192m × 0.7 ≈ 538m. FogEnd = N₂ × 192m × 0.95 ≈ 2189m. Tunable via `ACDREAM_FOG_START_MULT` / `ACDREAM_FOG_END_MULT`.
- **DIAG:** `[WB-DIAG]` and `[TERRAIN-DIAG]` flag `BUDGET_OVER` when median exceeds the per-subsystem spec budget (entity 2.0ms, terrain 1.0ms).
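The fog-horizon arithmetic in the list above can be checked with a small sketch. Only the 192 m landblock edge, the 0.7 / 0.95 default multipliers, and the two env-var names come from this handoff; the class name, method names, and the parse-with-fallback behavior are illustrative assumptions, not the actual acdream implementation.

```csharp
using System;
using System.Globalization;

// Hypothetical helper, not the real acdream code.
public static class FogHorizonSketch
{
    // One landblock edge in meters, per the formula in the handoff.
    public const float LandblockMeters = 192f;

    // Assumption: an unset or unparsable env var falls back to the default.
    public static float ReadMultiplier(string envVar, float fallback)
    {
        var raw = Environment.GetEnvironmentVariable(envVar);
        return float.TryParse(raw, NumberStyles.Float, CultureInfo.InvariantCulture, out float value)
            ? value
            : fallback;
    }

    // FogStart = N1 * 192 m * startMult; FogEnd = N2 * 192 m * endMult.
    public static (float Start, float End) Compute(
        int nearRadius, int farRadius, float startMult = 0.7f, float endMult = 0.95f)
        => (nearRadius * LandblockMeters * startMult,
            farRadius * LandblockMeters * endMult);
}
```

With N₁=4 and N₂=12 and the default multipliers, `Compute(4, 12)` yields roughly 537.6 m and 2188.8 m. A caller could feed `ReadMultiplier("ACDREAM_FOG_START_MULT", 0.7f)` into `startMult` to mirror the env-var override.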
### Final perf state at A.5 SHIP (horizon-safe Quality preset)
User hardware: AMD Radeon RX 9070 XT, 240 Hz @ 2560×1440.
Settings tested: `NEAR_RADIUS=4, FAR_RADIUS=12, MSAA=0, A2C=0, ANISOTROPIC=4, MAX_COMPLETIONS=2`.
| Subsystem | cpu_us median | cpu_us p95 |
|---|---|---|
| Entity dispatcher | ~3500 µs (3.5 ms) | ~4000 µs |
| Terrain dispatcher | ~21 µs | ~26 µs |
Total frame time math: ~4-5 ms per frame = ~200-250 FPS at standstill. User reported "Better now": not the 240Hz spec target, but a 5× improvement from the broken pre-Bug-A state (~40 FPS).
The 1.5ms gap to the 2.0ms entity dispatcher budget is what Tier 1 closes (per ISSUE #53 + the perf-tier roadmap).
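For reference, a minimal sketch of how a median / p95 and a `BUDGET_OVER` style check could be computed from a window of `cpu_us` samples. The nearest-rank percentile and all names here are assumptions for illustration; the handoff does not show how `[WB-DIAG]` actually derives its numbers.

```csharp
using System;

// Hypothetical helper, not the real [WB-DIAG] implementation.
public static class BudgetDiagSketch
{
    // Nearest-rank percentile over a copy of the samples (p in (0, 1]).
    public static double Percentile(double[] samplesUs, double p)
    {
        if (samplesUs.Length == 0)
            throw new ArgumentException("need at least one sample", nameof(samplesUs));

        var sorted = (double[])samplesUs.Clone();
        Array.Sort(sorted);
        int rank = (int)Math.Ceiling(p * sorted.Length) - 1;
        return sorted[Math.Clamp(rank, 0, sorted.Length - 1)];
    }

    // Flags the subsystem when its median exceeds the budget (both in microseconds).
    public static bool IsBudgetOver(double[] samplesUs, double budgetUs)
        => Percentile(samplesUs, 0.5) > budgetUs;
}
```

Fed samples around 3500 µs against the 2000 µs entity budget, `IsBudgetOver` returns true; samples around 21 µs against the 1000 µs terrain budget return false, which matches the table above.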
### What was NOT validated at SHIP
- **Full High preset (radius=4/12, MSAA 4x, A2C on, anisotropic 16x).** Crashed the entire OS at first attempt earlier in A.5 development. Bug A was likely the trigger (CPU dispatcher saturating + GPU command queue overflowing). With Bug A fixed, this likely works — but never re-tested. **Re-testing is part of this phase's stretch goal.**
- **Visual gate at full quality.** Same — only validated at horizon-safe settings.
- **Walking trace at any preset.** Brief walking observed but not metric-captured.
### Three high-value gotchas captured in A.5 memory
These are at `~/.claude/projects/.../memory/project_phase_a5_state.md`:
1. **Worker-side JobKind routing was the load-bearing far-tier optimization.** T13/T16 wired the controller side; the worker never branched on Kind. ~5x perf regression that wasn't caught by spec/code reviews.
2. **WalkEntities's "extract a list-producing helper" pattern is a perf antipattern.** ~480 KB / frame allocation. Implementer flagged "future N.6 optimization" in self-review; review should have caught that "future" was actually "now."
3. **Caching mutable per-frame state silently breaks animation.** Tier 1's first attempt. The "trust MeshRefs as the source of truth" comment in the dispatcher is true but misleading — MeshRefs IS the source of truth, but it's mutated EVERY frame for animated entities.
(Full memory entry has 5 gotchas; these three are the load-bearing ones for post-A.5.)
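To make gotcha 2 concrete, here is a hedged sketch contrasting a list-producing helper with the scratch-list pattern that the Bug B fix uses. The class, the method names, and the `isVisible` predicate are hypothetical stand-ins; only the `_walkScratch` field name comes from this handoff.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the dispatcher's entity walk.
public sealed class EntityWalkSketch
{
    // Reused across frames; Clear() keeps the backing array's capacity.
    private readonly List<uint> _walkScratch = new();

    // Antipattern: a fresh List is allocated on every call (every frame).
    public List<uint> WalkAllocating(IReadOnlyList<uint> ids, Func<uint, bool> isVisible)
    {
        var result = new List<uint>();
        for (int i = 0; i < ids.Count; i++)
            if (isVisible(ids[i])) result.Add(ids[i]);
        return result;
    }

    // Scratch pattern: same output, no per-frame list allocation after warm-up.
    public IReadOnlyList<uint> WalkIntoScratch(IReadOnlyList<uint> ids, Func<uint, bool> isVisible)
    {
        _walkScratch.Clear();
        for (int i = 0; i < ids.Count; i++)
            if (isVisible(ids[i])) _walkScratch.Add(ids[i]);
        return _walkScratch;
    }
}
```

The trade-off is that the returned list is only valid until the next call, so callers must consume it within the frame rather than hold on to it.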
---
## Files to read before brainstorming
In rough order:
1. **`docs/superpowers/specs/2026-05-09-phase-a5-two-tier-streaming-design.md`** — A.5 spec, full design rationale + Quality Preset system (§4.10) + acceptance criteria reshape (§2). Skim for vocabulary; read §4.10 in full.
2. **`docs/plans/2026-05-10-perf-tiers-2-3-roadmap.md`** — Tier 2 (static/dynamic split) + Tier 3 (GPU compute culling) roadmap. Read for context on where Tier 1 fits in the perf optimization tower.
3. **`docs/ISSUES.md` issues #52, #53, #54** — the three deferred items in tactical-list form.
4. **`memory/project_phase_a5_state.md`** — the 5 gotchas. Critical for avoiding the same traps in this phase.
5. **`src/AcDream.App/Streaming/LandblockStreamer.cs`** — `HandleJob` is where Bug A's patch lives + where ISSUE #54's cleaner fix will go.
6. **`src/AcDream.App/Rendering/Wb/WbDrawDispatcher.cs`** — `WalkEntities` + `Draw`'s inner loop. Where Tier 1's retry will operate.
7. **`src/AcDream.Core/Physics/AnimationSequencer.cs`** — the per-frame animation engine. Read this BEFORE designing Tier 1's retry. Pay specific attention to anywhere it touches `meshRef.PartTransform` or any other field that the dispatcher reads.
8. **`src/AcDream.App/Animation/AnimationHookRouter.cs`** (or similar) — the hook fan-out from animation events. Same audit reason as #7.
---
## Per-priority detail
### Priority 1 — Lifestone missing (ISSUE #52)
**Estimated effort:** 1-3 hours. Could be a 1-line fix or could surface a deeper issue.
The Holtburg lifestone is a Setup-multi-part entity (the spinning blue crystal pillar). User reports it hasn't been rendering since earlier in A.5 development. They noticed but didn't flag during the session.
Hypotheses:
- **Bug A's strip caught a near-tier entity.** The current strip in `LandblockStreamer.HandleJob` only fires when `tier == LandblockStreamTier.Far`. Holtburg's lifestone is in a near-tier LB (Holtburg's center, ~LB 0xA9B4). Should NOT have been stripped. But verify — maybe the LB's tier resolution at first-tick is wrong.
- **Earlier visual regression from a different commit.** User said it was missing in earlier runs too. Could be from N.5b, an N.5b follow-up, or even older. Requires a `git log -- docs/ISSUES.md` correlation with visible state.
- **Setup-rendering edge case.** The lifestone has unusual properties (animated rotation, particle effects on top). Maybe it's a Setup with some sub-mesh that the dispatcher's `SetupParts` walk filters out.
- **Dat-state mismatch.** The lifestone's GfxObj id might be in a part of the dat that's failing decode.
**Investigation steps:**
1. Launch the client + walk to Holtburg lifestone position.
2. Check `[WB-DIAG]` for `meshMissing` count — if non-zero, some entity's mesh isn't loading.
3. Use the cdb attach toolchain (per CLAUDE.md "Retail debugger toolchain") if needed to compare vs retail's lifestone rendering.
4. Compare to ACViewer / WorldBuilder to see if the lifestone renders there. If yes, our renderer has a regression. If no, the issue is dat-side or in shared decode logic.
5. Identify the GfxObj/Setup id for the lifestone (likely well-known retail ID; check `docs/research/named-retail/` or ACViewer reference).
6. Trace: does `_meshAdapter.TryGetRenderData(lifestoneId)` return non-null? Does the resulting `renderData.Batches` have entries?
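Step 6 distinguishes two failure modes, which a temporary diagnostic could classify along these lines. This is a sketch under stated assumptions: `RenderDataStub` and `MeshTraceResult` are hypothetical stand-ins, and the real return type and signature of `_meshAdapter.TryGetRenderData` are not shown in this handoff.

```csharp
using System.Collections.Generic;

// Hypothetical stand-in for whatever TryGetRenderData returns.
public sealed record RenderDataStub(IReadOnlyList<string> Batches);

public enum MeshTraceResult { MeshMissing, NoBatches, Renderable }

public static class LifestoneTraceSketch
{
    // null means the mesh never loaded; empty batches means it loaded
    // but produced nothing for the dispatcher to draw.
    public static MeshTraceResult Classify(RenderDataStub? renderData)
    {
        if (renderData is null) return MeshTraceResult.MeshMissing;
        return renderData.Batches.Count == 0
            ? MeshTraceResult.NoBatches
            : MeshTraceResult.Renderable;
    }
}
```

A `MeshMissing` result would point toward the dat-state or decode hypotheses, while `NoBatches` or `Renderable` would shift suspicion to the Setup walk or later draw stages.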
**Acceptance:** lifestone renders correctly (visible spinning blue crystal at the Holtburg town center).
### Priority 2 — JobKind plumbing through `BuildLandblockForStreaming` (ISSUE #54)
**Estimated effort:** 30 min - 1 hour.
Currently `LandblockStreamer.HandleJob` strips entities POST-load for far-tier:
```csharp
case LandblockStreamJob.Load load:
    var lb = _loadLandblock(load.LandblockId); // full load
    var mesh = _buildMeshOrNull(load.LandblockId, lb);
    var tier = load.Kind == LandblockStreamJobKind.LoadFar
        ? LandblockStreamTier.Far
        : LandblockStreamTier.Near;
    if (tier == LandblockStreamTier.Far && lb.Entities.Count > 0)
    {
        // Strip entities: far-tier ships terrain only.
        lb = new LoadedLandblock(...empty entities...);
    }
    _outbox.Writer.TryWrite(new Loaded(... lb, mesh ...));
    break;
```
The full `_loadLandblock` does:
1. Read `LandBlock` heightmap (cheap).
2. Read `LandBlockInfo` (medium).
3. `LandblockLoader.BuildEntitiesFromInfo` (extract stabs/buildings).
4. Hydrate stab/building meshRefs (medium).
5. Run scenery generation (heavy — ~50-200 procedural entities × meshRef hydration).
6. Build interior cell entities.
For far-tier, only step 1 is needed. Steps 2-6 are wasted CPU on the worker thread.
**Refactor plan:**
1. Change the streamer's `_loadLandblock` factory to take `LandblockStreamJobKind`:
```csharp
private readonly Func<uint, LandblockStreamJobKind, LoadedLandblock?> _loadLandblock;
```
2. In `GameWindow`, the factory closure branches:
```csharp
loadLandblock: (id, kind) => kind == LandblockStreamJobKind.LoadFar
    ? BuildLandblockHeightmapOnly(id)
    : BuildLandblockForStreaming(id),
```
3. New `BuildLandblockHeightmapOnly` returns a `LoadedLandblock` with the heightmap dat record + empty entity list. Cheap — no LandBlockInfo read, no scenery generation.
4. Remove the post-load strip in `HandleJob` (no longer needed).
5. Worker-thread CPU drops measurably; horizon fill on first traversal speeds up.
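The refactor plan above can be sketched end to end with stand-in types. Everything in this block (`JobKindSketch`, `LoadedLandblockSketch`, the builder bodies, the `FullLoads` counter) is hypothetical and exists only to show the shape of the kind-branching factory; the real `LoadedLandblock` and builder signatures are not given in this handoff.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-ins for LandblockStreamJobKind and LoadedLandblock.
public enum JobKindSketch { LoadNear, LoadFar }

public sealed record LoadedLandblockSketch(
    uint Id, float[] Heightmap, IReadOnlyList<string> Entities);

public static class LandblockFactorySketch
{
    // Counts how often the expensive path ran, for verification only.
    public static int FullLoads;

    // Far tier: heightmap placeholder plus an empty entity list.
    public static LoadedLandblockSketch BuildHeightmapOnly(uint id)
        => new(id, Array.Empty<float>(), Array.Empty<string>());

    // Near tier: stands in for the full hydrate + scenery pipeline.
    public static LoadedLandblockSketch BuildFull(uint id)
    {
        FullLoads++;
        return new(id, Array.Empty<float>(), new[] { "stab", "scenery" });
    }

    // The factory closure branches on kind, so far-tier jobs never pay for BuildFull.
    public static Func<uint, JobKindSketch, LoadedLandblockSketch> Create()
        => (id, kind) => kind == JobKindSketch.LoadFar
            ? BuildHeightmapOnly(id)
            : BuildFull(id);
}
```

A counter like `FullLoads` is one cheap way to lock the behavior in with a unit test: a far-tier request should leave it unchanged.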
**Acceptance:**
- Build green; existing 999+ tests pass.
- Streaming worker thread is measurably faster on first-traversal (the user can validate with `[WB-DIAG]` worker queue depth or just feel the responsiveness when walking into virgin region).
- Visible behavior unchanged — far tier looks the same as before.
### Priority 3 — Tier 1 entity-classification cache retry (ISSUE #53)
**Estimated effort:** ~5-7 days. Substantial because the audit step is critical.
This is the BIG perf win remaining for A.5's CPU dispatcher. Math says entity dispatcher 3.5ms → 1-1.5ms = ~300-400 FPS at standstill. Drops the dispatcher inside the spec's 2.0ms budget.
**The first attempt's failure (commit 3639a6f, reverted at 9b49009):**
Cached `meshRef.PartTransform` baked into per-(entity, batch) classification at first-frame visit. For static entities, this is stable forever. For animated entities, `meshRef.PartTransform` is updated EVERY FRAME by `AnimationSequencer` to apply the current skeletal pose. The cache froze the pose.
User-visible symptoms:
- NPCs / players stop animating.
- Some buildings (likely those mistakenly in `animatedEntityIds`) draw at wrong positions.
**The retry's audit step (do this BEFORE designing the cache):**
Read `src/AcDream.Core/Physics/AnimationSequencer.cs` and trace EVERY assignment to `meshRef.PartTransform` (and any other field on `MeshRef`, `WorldEntity`, or related state that the dispatcher reads). Likely write sites:
- `AnimationSequencer.TickAnimations` per-frame skeletal pose update
- `AnimationHookRouter` for hooks like `AnimSetPose`
- Live network handlers that mutate `entity.Position` / `entity.Rotation` (T18 already migrated these to `SetPosition` for AABB invalidation; double-check)
- `EntitySpawnAdapter` for ObjDescEvent / palette swap
For each write site, decide: is this entity STATIC (write only at spawn) or DYNAMIC (write per-frame or in response to network events)?
**Cache design options after the audit:**
(a) **Static-only cache.** Only cache entities where `animatedEntityIds.Contains(entity.Id) == false`. Animated entities use today's per-frame classification path. Cleanest, but requires `animatedEntityIds` to be a stable signal (it is — `_animatedEntities` dict in GameWindow is the source).
(b) **Dynamic-aware cache with invalidation hooks.** Cache everything but expose `InvalidateEntity(uint)` / `RefreshEntityPalette(uint)` for the dispatcher's invalidation. Wire from the network layer (palette swap fires invalidation; ObjDesc event fires invalidation). More complex but might let animated entities also benefit.
(c) **Static-only + animated-bypass + diagnostic check.** Like (a), but in DEBUG builds, log a warning every frame if a cached entity's `meshRef.PartTransform` differs from the cached value (catches mis-classified dynamics). Belt-and-suspenders.
Recommendation: start with (a). Ship Tier 1 for static entities only. Animated path stays slow but correct. If perf gate finds the static-only Tier 1 isn't enough, escalate to (c) for safety + (b) later.
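As a starting point for option (a), a minimal sketch of a static-only cache with an animated bypass and removal invalidation. The class name, the generic classification payload, and the `isAnimated` callback are assumptions for illustration; the real design should come out of the animation audit, not from this block.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sketch of option (a); not the reverted Tier 1 code.
public sealed class StaticClassificationCacheSketch<TClassification>
{
    private readonly Dictionary<uint, TClassification> _cache = new();
    private readonly Func<uint, bool> _isAnimated;

    public StaticClassificationCacheSketch(Func<uint, bool> isAnimated)
        => _isAnimated = isAnimated;

    public int Count => _cache.Count;

    public TClassification GetOrClassify(uint entityId, Func<uint, TClassification> classify)
    {
        // Animated entities always take the per-frame path and are never stored.
        if (_isAnimated(entityId)) return classify(entityId);

        if (_cache.TryGetValue(entityId, out var cached)) return cached;

        var fresh = classify(entityId);
        _cache[entityId] = fresh;
        return fresh;
    }

    // Call on entity removal (or any event that changes static state).
    public bool Invalidate(uint entityId) => _cache.Remove(entityId);
}
```

Checking `isAnimated` before the lookup means an entity that becomes animated after being cached is still bypassed, although its stale entry lingers until `Invalidate` is called. The three acceptance tests listed below map directly onto cache hit, bypass, and invalidate.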
**Acceptance:**
- Build green; existing 999+ tests pass.
- 1-3 new tests covering: cache hit for static entity, cache bypass for animated entity, cache invalidation on entity remove.
- Visual gate: launch + walk Holtburg → North Yanshi at horizon-safe preset; confirm:
  - Animation works (NPCs, player character animate normally)
  - Buildings at correct positions
  - Lifestone (still depending on Priority 1 fix) renders correctly
  - No new visual regressions
- Perf gate (with `[WB-DIAG]`):
  - Entity dispatcher cpu_us median drops from ~3.5ms to ≤2.0ms (matches spec budget).
  - p95 stays ≤ 2.5ms.
---
## What's NOT in this phase
- **Tier 2 (static/dynamic split with persistent groups).** Separate ~2-week phase. See `docs/plans/2026-05-10-perf-tiers-2-3-roadmap.md`.
- **Tier 3 (GPU compute culling).** Separate ~1-month phase. Same roadmap.
- **Full High preset crash investigation beyond casual retest.** Stretch goal: re-test the High preset with Bug A + B fixed, see if it's stable now. If it crashes, file a new issue and continue. Don't deep-dive in this phase.
- **EnvCell modern path migration, Sky/Particles modern path, Shadow mapping** — all later phases.
- **N.6 perf polish (the previously-flagged "next phase").** N.6 was the original CLAUDE.md "Currently in flight" target before A.5. Most of N.6's scope was rolled into A.5 (perf-tier work). What's left of N.6 (persistent-mapped indirect buffer, GPU-side culling) overlaps with Tier 2/3 and should be re-scoped after Tier 1 lands.
---
## Acceptance criteria (whole phase)
- All three priorities (Lifestone, JobKind, Tier 1) shipped or one is explicitly deferred with documented reasoning.
- Build green throughout. 999+ tests pass; 8 pre-existing physics/input failures stay at 8.
- N.5b conformance sentinel intact (TerrainSlot, TerrainModernConformance, Wb*, MatrixComposition, TextureCacheBindless, SplitFormulaDivergence — all clean).
- Visual gate: lifestone renders; animation works; horizon visible at ~2.3km; smooth walking trace; no new artifacts.
- Perf gate (post-Tier-1): entity dispatcher cpu_us median ≤ 2.0ms at horizon-safe preset, ~250-300 FPS at standstill.
- Memory entry written + roadmap "shipped" row updated for the polish phase.
---
## What you'll be doing in the first 30 minutes
1. Read this handoff in full.
2. Verify build green: `dotnet build`. Verify ~999 tests pass: `dotnet test --no-build`.
3. Read `docs/superpowers/specs/2026-05-09-phase-a5-two-tier-streaming-design.md` §2, §4.10, §11 (deferred section).
4. Read `docs/plans/2026-05-10-perf-tiers-2-3-roadmap.md` Tier 1 section.
5. Read `docs/ISSUES.md` issues #52, #53, #54 in full.
6. Read `memory/project_phase_a5_state.md` (5 gotchas).
7. Read `src/AcDream.App/Streaming/LandblockStreamer.cs` HandleJob method.
8. Read `src/AcDream.App/Rendering/Wb/WbDrawDispatcher.cs` Draw + WalkEntitiesInto methods.
9. Skim `src/AcDream.Core/Physics/AnimationSequencer.cs` for write-sites of `meshRef.PartTransform` (Tier 1 retry's audit prerequisite).
10. Decide: which priority to start with? Recommendation order: 1 (lifestone, fast win), 2 (JobKind, easy cleanup), 3 (Tier 1, biggest perf win + most complex).
11. Brainstorm with the user on the chosen priority before writing code.
12. Write a small spec or just the implementation if the priority is small (1 + 2 are small enough to skip a formal spec). Tier 1 (priority 3) needs a spec because of the audit + invalidation design.
Don't skip the audit step on Tier 1. The first attempt failed because of an incomplete read of the animation mutation graph; the second attempt should not repeat that.
---
## Things to NOT do
- **Don't rush Tier 1.** Audit first. Write down which entities are static vs dynamic. Write tests that specifically verify animated entities still animate after caching is enabled.
- **Don't bundle Tier 2 or Tier 3 into this phase.** Those are dedicated multi-week phases with their own brainstorm + spec + plan cycles.
- **Don't break the N.5b conformance sentinel.** Run the filter on every commit:
```
dotnet test --no-build --filter "FullyQualifiedName~TerrainSlot|FullyQualifiedName~TerrainModernConformance|FullyQualifiedName~Wb|FullyQualifiedName~MatrixComposition|FullyQualifiedName~TextureCacheBindless|FullyQualifiedName~SplitFormulaDivergence"
```
Expect 89+ passing, 0 failures.
- **Don't skip the visual gate.** Lifestone fix specifically requires looking at the lifestone in-game. Tier 1 retry requires confirming animation works on a moving NPC.
- **Don't delete the `_walkScratch` field** added in Bug B fix. It's load-bearing — without it, Tier 1 retry would re-introduce the per-frame allocation bug.
- **Don't re-add the `Tier1` cache that was reverted.** Start the retry with a fresh design after the animation audit. Cherry-picking the reverted code will re-introduce the bug.
---
## Reference: A.5's commit chain
Final A.5 commit chain on `claude/hopeful-darwin-ae8b87` (merged into main at `d3d78fa`):
| SHA | Subject |
|---|---|
| 9245db5 | phase(A.5): SHIP — two-tier streaming + horizon LOD + Quality Preset system |
| d93d823 | docs(A.5 T27): roadmap + ISSUES + CLAUDE.md updates for A.5 ship |
| a28a5b7 | docs(A.5 T27): spec + plan amendments for T22.5 + ship |
| 9b49009 | Revert "feat(perf): Tier 1 entity classification cache" |
| 3639a6f | feat(perf): Tier 1 entity classification cache (REVERTED) |
| 462f9d6 | docs(perf): roadmap for Tier 2 + Tier 3 entity-dispatcher optimizations |
| 0ad8c99 | fix(A.5): WalkEntities scratch-list pattern (Bug B — T17 GC pressure) |
| 9217fd9 | fix(A.5): strip far-tier entities in worker (Bug A — far tier optimization) |
| 28d2c60 | feat(A.5 T22.5): wire QualityPreset into renderer + streaming (commit 2/2) |
| afa4200 | feat(A.5 T22.5): QualityPreset schema + tests (commit 1/2) |
| c473fee | feat(A.5 T23): BUDGET_OVER flag in [WB-DIAG] / [TERRAIN-DIAG] |
| 3b684db | feat(A.5 T22): fog wired from N₁/N₂ + ACDREAM_FOG_*_MULT env vars |
| 1488ec6 | test(A.5 T21): lock in depth-write attribution per translucency kind |
| 26b2871 | feat(A.5 T20): MSAA 4x + alpha-to-coverage on foliage |
| 4b84e56 | feat(A.5 T19): mipmaps + 16x anisotropic on TerrainAtlas |
| (...60+ commits earlier in the chain through T1-T18) | (see full log on the merge bubble) |
The merge bubble preserves the full chain. To inspect any A.5 commit:
```
git log d3d78fa^..d3d78fa
git show <sha>
```
---
Good luck. The phase is well-bounded; the audit step on Tier 1 is the single highest-leverage thing to invest in. The lifestone and JobKind cleanup should be quick wins. After this phase ships, the project is in a great position — A.5 + polish + Tier 2/3 roadmap covers the rendering + perf work for the next several months.
Holler at the user if any of the three priorities reveals scope you didn't expect.