docs(post-A.5): cold-start handoff for the next session

Captures the three deferred items from A.5 ship:
- ISSUE #52: lifestone visual missing (1-3 hours, fast win)
- ISSUE #54: JobKind plumbing through BuildLandblockForStreaming
  (~30 min - 1 hour, worker-thread efficiency cleanup)
- ISSUE #53: Tier 1 entity-classification cache retry (~5-7 days,
  biggest perf win remaining; needs animation-mutation audit before
  designing to avoid the freeze-pose bug from the first attempt)

Doc covers: A.5 final state + 3 high-value gotchas, files to read,
per-priority detail with effort estimates and acceptance criteria,
what NOT to do, the first-30-minute workflow, and the full A.5
commit chain for reference.

Phase is sized ~1 week if all three priorities land. The audit
step on Tier 1 is the highest-leverage investment.

Tier 2 + Tier 3 (static/dynamic split + GPU compute culling) are
explicitly out-of-scope for this phase — separate multi-week phases
per docs/plans/2026-05-10-perf-tiers-2-3-roadmap.md.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Erik 2026-05-10 10:16:10 +02:00
parent d3d78fa14f
commit c111312e13


@ -0,0 +1,307 @@
# Phase Post-A.5 Polish — Cold-Start Handoff
**Created:** 2026-05-10, immediately after A.5 SHIP + merge to main (`d3d78fa`).
**Audience:** the next agent picking up post-A.5 polish work.
**Purpose:** give you everything you need to start the polish phase cold, without spelunking through the A.5 session's 200+ messages.
---
## TL;DR
A.5 just shipped. Two-tier streaming is live (N₁=4 near, N₂=12 far) with a 2.3 km fog horizon, off-thread mesh build, entity dispatcher tightening, mipmaps + 16x AF, MSAA 4x + A2C foliage, depth-write audit, BUDGET_OVER diag, and a full Quality Preset system (Low/Medium/High/Ultra) with env-var overrides + F11 mid-session re-apply.
**A.5 was an enormous phase** (29 numbered tasks + T22.5 mid-execution scope add + Bug A + Bug B post-T26 fixes). Spec at `docs/superpowers/specs/2026-05-09-phase-a5-two-tier-streaming-design.md` (~700 lines). Plan at `docs/superpowers/plans/2026-05-09-phase-a5-two-tier-streaming.md` (~2400 lines).
**Three things were intentionally deferred to this phase:**
1. **Lifestone visual missing (ISSUE #52).** The Holtburg lifestone — a known visual landmark — hasn't been rendering since earlier in A.5 development. User confirmed they noticed it earlier but didn't flag it; deferred to post-ship. **Highest user-perception value to fix.**
2. **JobKind plumbing through `BuildLandblockForStreaming` (ISSUE #54).** Bug A's fix patches at the worker output by stripping entities from far-tier `LoadedLandblock`s after the full load runs. The worker still wastes CPU on hydration + scenery generation that gets thrown away. Cleaner fix: make the worker SKIP that work for far-tier loads. ~30 min - 1 hour. **Smallest cleanup, biggest worker-thread efficiency win.**
3. **Tier 1 entity-classification cache retry (ISSUE #53).** First attempt (commit `3639a6f`, reverted at `9b49009`) cached `meshRef.PartTransform` which is mutated per frame for animated entities — froze animations. Retry needs a careful read of `AnimationSequencer` + `AnimationHookRouter` first to map ALL the per-frame mutations of MeshRef state, then design a cache that bypasses animated entities OR caches only the animation-invariant subset. **Biggest perf headroom available** — math says it should drop the entity dispatcher from 3.5ms to 1-1.5ms, hitting the spec's 2.0ms budget.
The phase is sized ~1 week if all three land cleanly. Could be longer if Tier 1's animation audit reveals something subtle.
---
## Where A.5 left things
### Branch state
- `main` is at `d3d78fa` ("Merge branch 'claude/hopeful-darwin-ae8b87' — Phase A.5 SHIP + Quality Preset system").
- A.5 SHIP commit at `9245db5` (one commit before the merge bubble).
- Roadmap entry: A.5 moved from "Phases ahead" → "Phases already shipped" table.
- CLAUDE.md "Currently in flight" updated to "Post-A.5 polish — Tier 1 retry + lifestone fix + JobKind plumbing".
### What works in A.5 (final post-fix state)
- **Two-tier streaming end-to-end:** `StreamingRegion` with `RecenterTo` returning a 5-list `TwoTierDiff` (ToLoadFar/ToLoadNear/ToPromote/ToDemote/ToUnload) with hysteresis radius+2 on both tiers; `StreamingController.Tick` routes by `LandblockStreamJobKind`; `LandblockStreamer` worker thread does dat reads + mesh build off the render thread.
- **Bug A fixed:** `LandblockStreamer.HandleJob` strips entities for `LoadFar` results before posting Loaded. Far-tier ships terrain only as the spec promised.
- **Bug B fixed:** `WalkEntities` uses `_walkScratch` field reused across frames, no per-frame List allocation.
- **Quality Preset system:** Low / Medium / High / Ultra presets with per-preset radii + MSAA + anisotropic + A2C + max-completions. Six env-var overrides, one per field. F11 → Display tab dropdown for mid-session change. `DisplaySettings.Quality` persists in settings.json. `GameWindow.ReapplyQualityPreset` rebuilds the streaming pipeline for radius changes.
- **Visual quality stack:** mipmaps + 16x anisotropic on TerrainAtlas. MSAA 4x + alpha-to-coverage on foliage shader. Depth-write audit + lock-in test (5 cases).
- **Fog horizon:** FogStart = N₁ × 192m × 0.7 ≈ 538m. FogEnd = N₂ × 192m × 0.95 ≈ 2189m. Tunable via `ACDREAM_FOG_START_MULT` / `ACDREAM_FOG_END_MULT`.
- **DIAG:** `[WB-DIAG]` and `[TERRAIN-DIAG]` flag `BUDGET_OVER` when median exceeds the per-subsystem spec budget (entity 2.0ms, terrain 1.0ms).
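
The fog arithmetic above can be sketched as follows. This is a minimal illustration, not the renderer's code: `FogHorizon`, `Compute`, and `ReadMultiplier` are hypothetical names, and the assumption that the env vars hold plain floating-point multipliers is mine. Only the 192m landblock size, the 0.7 / 0.95 defaults, and the env-var names come from this handoff.

```csharp
using System;
using System.Globalization;

// Sketch only: FogHorizon and its members are hypothetical, not the project's API.
public static class FogHorizon
{
    public const float LandblockSizeMeters = 192f;

    // Reads a float multiplier from the environment; falls back when the
    // variable is unset or does not parse (assumed override behaviour).
    public static float ReadMultiplier(string envVar, float fallback)
    {
        var raw = Environment.GetEnvironmentVariable(envVar);
        return float.TryParse(raw, NumberStyles.Float, CultureInfo.InvariantCulture, out float value)
            ? value
            : fallback;
    }

    // FogStart = N1 * 192 m * startMult; FogEnd = N2 * 192 m * endMult.
    public static (float Start, float End) Compute(
        int nearRadius, int farRadius, float startMult = 0.7f, float endMult = 0.95f)
    {
        float start = nearRadius * LandblockSizeMeters * startMult;
        float end = farRadius * LandblockSizeMeters * endMult;
        return (start, end);
    }
}
```

With the defaults, `FogHorizon.Compute(4, 12)` gives roughly 537.6m and 2188.8m. A caller honouring the overrides would pass `ReadMultiplier("ACDREAM_FOG_START_MULT", 0.7f)` and `ReadMultiplier("ACDREAM_FOG_END_MULT", 0.95f)` as the two multipliers.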
### Final perf state at A.5 SHIP (horizon-safe Quality preset)
User hardware: AMD Radeon RX 9070 XT, 240 Hz @ 2560×1440.
Settings tested: `NEAR_RADIUS=4, FAR_RADIUS=12, MSAA=0, A2C=0, ANISOTROPIC=4, MAX_COMPLETIONS=2`.
| Subsystem | cpu_us median | cpu_us p95 |
|---|---|---|
| Entity dispatcher | ~3500 µs (3.5 ms) | ~4000 µs |
| Terrain dispatcher | ~21 µs | ~26 µs |
Total frame time math: ~4-5 ms = ~200-240 FPS at standstill. User reported "Better now" — not the 240Hz spec target but a 5× improvement from the broken pre-Bug-A state (~40 FPS).
The 1.5ms gap to the 2.0ms entity dispatcher budget is what Tier 1 closes (per ISSUE #53 + the perf-tier roadmap).
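
For reference, a budget check of the kind `BUDGET_OVER` describes might look like the sketch below. It assumes a nearest-rank percentile over a window of `cpu_us` samples; the real diag's windowing and percentile method are not stated in this handoff, and `BudgetCheck` and its members are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Sketch only: hypothetical helper, not the [WB-DIAG] implementation.
public static class BudgetCheck
{
    // Nearest-rank percentile over a sorted copy of the samples (0 < p <= 100).
    public static double Percentile(IReadOnlyList<double> samplesUs, double p)
    {
        if (samplesUs.Count == 0)
            throw new ArgumentException("no samples", nameof(samplesUs));

        double[] sorted = samplesUs.OrderBy(x => x).ToArray();
        int rank = (int)Math.Ceiling(p / 100.0 * sorted.Length);
        return sorted[Math.Clamp(rank, 1, sorted.Length) - 1];
    }

    // Flags the subsystem when its median exceeds the per-subsystem budget.
    public static bool IsOverBudget(IReadOnlyList<double> samplesUs, double budgetUs)
        => Percentile(samplesUs, 50) > budgetUs;
}
```

Fed samples around the table's values, the entity dispatcher (median ~3500 µs against a 2000 µs budget) would be flagged, while terrain (median ~21 µs against 1000 µs) would not.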
### What was NOT validated at SHIP
- **Full High preset (radius=4/12, MSAA 4x, A2C on, anisotropic 16x).** Crashed the entire OS on the first attempt earlier in A.5 development. Bug A was likely the trigger (CPU dispatcher saturating + GPU command queue overflowing). With Bug A fixed, this likely works, but it was never re-tested. **Re-testing is a stretch goal for this phase.**
- **Visual gate at full quality.** Same — only validated at horizon-safe settings.
- **Walking trace at any preset.** Brief walking observed but not metric-captured.
### Three high-value gotchas captured in A.5 memory
These are at `~/.claude/projects/.../memory/project_phase_a5_state.md`:
1. **Worker-side JobKind routing was the load-bearing far-tier optimization.** T13/T16 wired the controller side; the worker never branched on Kind. ~5x perf regression that wasn't caught by spec/code reviews.
2. **WalkEntities's "extract a list-producing helper" pattern is a perf antipattern.** ~480 KB / frame allocation. Implementer flagged "future N.6 optimization" in self-review; review should have caught that "future" was actually "now."
3. **Caching mutable per-frame state silently breaks animation.** Tier 1's first attempt. The "trust MeshRefs as the source of truth" comment in the dispatcher is true but misleading — MeshRefs IS the source of truth, but it's mutated EVERY frame for animated entities.
(Full memory entry has 5 gotchas; these three are the load-bearing ones for post-A.5.)
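
To make gotcha 2 concrete, here is a minimal sketch contrasting the list-producing helper with the scratch-list pattern. Only the `_walkScratch` field name comes from the Bug B fix; `EntityWalker`, the `int` ids, and the visibility predicate are illustrative stand-ins for the dispatcher's real types.

```csharp
using System;
using System.Collections.Generic;

// Sketch only: illustrative types, not WbDrawDispatcher itself.
public sealed class EntityWalker
{
    // Reused across frames. List<T>.Clear() keeps the backing array, so once
    // the list has grown to its working size, filling it allocates nothing.
    private readonly List<int> _walkScratch = new();

    // Antipattern: allocates a fresh List (and its backing array) on every call.
    public static List<int> WalkAllocating(IReadOnlyList<int> entityIds, Func<int, bool> isVisible)
    {
        var result = new List<int>();
        foreach (int id in entityIds)
            if (isVisible(id)) result.Add(id);
        return result;
    }

    // Scratch-list pattern: refills the reused field and returns it.
    // The returned list is only valid until the next call.
    public List<int> WalkInto(IReadOnlyList<int> entityIds, Func<int, bool> isVisible)
    {
        _walkScratch.Clear();
        for (int i = 0; i < entityIds.Count; i++)
        {
            int id = entityIds[i];
            if (isVisible(id)) _walkScratch.Add(id);
        }
        return _walkScratch;
    }
}
```

The trade-off is aliasing: callers must consume the result within the frame and must not hold on to it, which is why the scratch list stays a private field of the walker.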
---
## Files to read before brainstorming
In rough order:
1. **`docs/superpowers/specs/2026-05-09-phase-a5-two-tier-streaming-design.md`** — A.5 spec, full design rationale + Quality Preset system (§4.10) + acceptance criteria reshape (§2). Skim for vocabulary; read §4.10 in full.
2. **`docs/plans/2026-05-10-perf-tiers-2-3-roadmap.md`** — Tier 2 (static/dynamic split) + Tier 3 (GPU compute culling) roadmap. Read for context on where Tier 1 fits in the perf optimization tower.
3. **`docs/ISSUES.md` issues #52, #53, #54** — the three deferred items in tactical-list form.
4. **`memory/project_phase_a5_state.md`** — the 5 gotchas. Critical for avoiding the same traps in this phase.
5. **`src/AcDream.App/Streaming/LandblockStreamer.cs`** — `HandleJob` is where Bug A's patch lives + where ISSUE #54's cleaner fix will go.
6. **`src/AcDream.App/Rendering/Wb/WbDrawDispatcher.cs`** — `WalkEntities` + `Draw`'s inner loop. Where Tier 1's retry will operate.
7. **`src/AcDream.Core/Physics/AnimationSequencer.cs`** — the per-frame animation engine. Read this BEFORE designing Tier 1's retry. Pay specific attention to anywhere it touches `meshRef.PartTransform` or any other field that the dispatcher reads.
8. **`src/AcDream.App/Animation/AnimationHookRouter.cs`** (or similar) — the hook fan-out from animation events. Same audit reason as #7.
---
## Per-priority detail
### Priority 1 — Lifestone missing (ISSUE #52)
**Estimated effort:** 1-3 hours. Could be a 1-line fix or could surface a deeper issue.
The Holtburg lifestone is a Setup-multi-part entity (the spinning blue crystal pillar). User reports it hasn't been rendering since earlier in A.5 development. They noticed but didn't flag during the session.
Hypotheses:
- **Bug A's strip caught a near-tier entity.** The current strip in `LandblockStreamer.HandleJob` only fires when `tier == LandblockStreamTier.Far`. Holtburg's lifestone is in a near-tier LB (Holtburg's center, ~LB 0xA9B4). Should NOT have been stripped. But verify — maybe the LB's tier resolution at first-tick is wrong.
- **Earlier visual regression from a different commit.** User said it was missing in earlier runs too. Could be from N.5b, an N.5b follow-up, or even older. Requires correlating `git log -- docs/ISSUES.md` against when the lifestone was last seen rendering.
- **Setup-rendering edge case.** The lifestone has unusual properties (animated rotation, particle effects on top). Maybe it's a Setup with some sub-mesh that the dispatcher's `SetupParts` walk filters out.
- **Dat-state mismatch.** The lifestone's GfxObj id might be in a part of the dat that's failing decode.
**Investigation steps:**
1. Launch the client + walk to Holtburg lifestone position.
2. Check `[WB-DIAG]` for `meshMissing` count — if non-zero, some entity's mesh isn't loading.
3. Use the cdb attach toolchain (per CLAUDE.md "Retail debugger toolchain") if needed to compare vs retail's lifestone rendering.
4. Compare to ACViewer / WorldBuilder to see if the lifestone renders there. If yes, our renderer has a regression. If no, the issue is dat-side or in shared decode logic.
5. Identify the GfxObj/Setup id for the lifestone (likely well-known retail ID; check `docs/research/named-retail/` or ACViewer reference).
6. Trace: does `_meshAdapter.TryGetRenderData(lifestoneId)` return non-null? Does the resulting `renderData.Batches` have entries?
**Acceptance:** lifestone renders correctly (visible spinning blue crystal at the Holtburg town center).
### Priority 2 — JobKind plumbing through `BuildLandblockForStreaming` (ISSUE #54)
**Estimated effort:** 30 min - 1 hour.
Currently `LandblockStreamer.HandleJob` strips entities POST-load for far-tier:
```csharp
case LandblockStreamJob.Load load:
    var lb = _loadLandblock(load.LandblockId);   // full load
    var mesh = _buildMeshOrNull(load.LandblockId, lb);
    var tier = load.Kind == LandblockStreamJobKind.LoadFar
        ? LandblockStreamTier.Far
        : LandblockStreamTier.Near;
    if (tier == LandblockStreamTier.Far && lb.Entities.Count > 0)
    {
        // Strip entities: far-tier ships terrain only.
        lb = new LoadedLandblock(...empty entities...);
    }
    _outbox.Writer.TryWrite(new Loaded(... lb, mesh ...));
    break;
```
The full `_loadLandblock` does:
1. Read `LandBlock` heightmap (cheap).
2. Read `LandBlockInfo` (medium).
3. `LandblockLoader.BuildEntitiesFromInfo` (extract stabs/buildings).
4. Hydrate stab/building meshRefs (medium).
5. Run scenery generation (heavy — ~50-200 procedural entities × meshRef hydration).
6. Build interior cell entities.
For far-tier, only step 1 is needed. Steps 2-6 are wasted CPU on the worker thread.
**Refactor plan:**
1. Change the streamer's `_loadLandblock` factory to take `LandblockStreamJobKind`:
```csharp
private readonly Func<uint, LandblockStreamJobKind, LoadedLandblock?> _loadLandblock;
```
2. In `GameWindow`, the factory closure branches:
```csharp
loadLandblock: (id, kind) => kind == LandblockStreamJobKind.LoadFar
    ? BuildLandblockHeightmapOnly(id)
    : BuildLandblockForStreaming(id),
```
3. New `BuildLandblockHeightmapOnly` returns a `LoadedLandblock` with the heightmap dat record + empty entity list. Cheap — no LandBlockInfo read, no scenery generation.
4. Remove the post-load strip in `HandleJob` (no longer needed).
5. Worker-thread CPU drops measurably; horizon fill on first traversal speeds up.
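
Put together, the refactor plan above could look roughly like this self-contained sketch. The types here are hypothetical stand-ins: the real `LoadedLandblock` shape, the `LoadNear` member name, and the placeholder payloads are assumptions. Only `LoadFar`, `BuildLandblockHeightmapOnly`, `BuildLandblockForStreaming`, and the factory signature come from this handoff.

```csharp
#nullable enable
using System;
using System.Collections.Generic;

// Hypothetical stand-ins for the project's types.
public enum LandblockStreamJobKind { LoadNear, LoadFar }

public sealed record LoadedLandblock(uint Id, byte[] Heightmap, IReadOnlyList<string> Entities);

public sealed class LandblockLoaders
{
    // Counters exist only so the sketch can show which path ran.
    public int FullLoads { get; private set; }
    public int HeightmapOnlyLoads { get; private set; }

    // Far tier: heightmap record plus an empty entity list. No LandBlockInfo
    // read, no hydration, no scenery generation.
    public LoadedLandblock BuildLandblockHeightmapOnly(uint id)
    {
        HeightmapOnlyLoads++;
        return new LoadedLandblock(id, Array.Empty<byte>(), Array.Empty<string>());
    }

    // Near tier: the existing full load (placeholder entities here).
    public LoadedLandblock BuildLandblockForStreaming(uint id)
    {
        FullLoads++;
        return new LoadedLandblock(id, Array.Empty<byte>(), new[] { "stab", "scenery" });
    }

    // The factory the streamer would receive: it branches on the job kind,
    // so the worker never does far-tier work that gets thrown away.
    public Func<uint, LandblockStreamJobKind, LoadedLandblock?> CreateFactory()
        => (id, kind) => kind == LandblockStreamJobKind.LoadFar
            ? BuildLandblockHeightmapOnly(id)
            : BuildLandblockForStreaming(id);
}
```

With the branch living in the factory, the post-load strip in `HandleJob` has nothing left to remove, which is what makes step 4 safe.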
**Acceptance:**
- Build green; existing 999+ tests pass.
- Streaming worker thread is measurably faster on first traversal (the user can validate with `[WB-DIAG]` worker queue depth or just feel the responsiveness when walking into a region that has not been loaded yet).
- Visible behavior unchanged — far tier looks the same as before.
### Priority 3 — Tier 1 entity-classification cache retry (ISSUE #53)
**Estimated effort:** ~5-7 days. Substantial because the audit step is critical.
This is the BIG perf win remaining for A.5's CPU dispatcher. The estimate: entity dispatcher 3.5ms → 1-1.5ms, ~300-400 FPS at standstill, which drops the dispatcher inside the spec's 2.0ms budget.
**The first attempt's failure (commit 3639a6f, reverted at 9b49009):**
Cached `meshRef.PartTransform` baked into per-(entity, batch) classification at first-frame visit. For static entities, this is stable forever. For animated entities, `meshRef.PartTransform` is updated EVERY FRAME by `AnimationSequencer` to apply the current skeletal pose. The cache froze the pose.
User-visible symptoms:
- NPCs / players stop animating.
- Some buildings (likely those mistakenly in `animatedEntityIds`) draw at wrong positions.
**The retry's audit step (do this BEFORE designing the cache):**
Read `src/AcDream.Core/Physics/AnimationSequencer.cs` and trace EVERY assignment to `meshRef.PartTransform` (and any other field on `MeshRef`, `WorldEntity`, or related state that the dispatcher reads). Likely write sites:
- `AnimationSequencer.TickAnimations` per-frame skeletal pose update
- `AnimationHookRouter` for hooks like `AnimSetPose`
- Live network handlers that mutate `entity.Position` / `entity.Rotation` (T18 already migrated these to `SetPosition` for AABB invalidation; double-check)
- `EntitySpawnAdapter` for ObjDescEvent / palette swap
For each write site, decide: is this entity STATIC (write only at spawn) or DYNAMIC (write per-frame or in response to network events)?
**Cache design options after the audit:**
(a) **Static-only cache.** Only cache entities where `animatedEntityIds.Contains(entity.Id) == false`. Animated entities use today's per-frame classification path. Cleanest, but requires `animatedEntityIds` to be a stable signal (it is — `_animatedEntities` dict in GameWindow is the source).
(b) **Dynamic-aware cache with invalidation hooks.** Cache everything but expose `InvalidateEntity(uint)` / `RefreshEntityPalette(uint)` for the dispatcher's invalidation. Wire from the network layer (palette swap fires invalidation; ObjDesc event fires invalidation). More complex but might let animated entities also benefit.
(c) **Static-only + animated-bypass + diagnostic check.** Like (a), but in DEBUG builds, log a warning every frame if a cached entity's `meshRef.PartTransform` differs from the cached value (catches mis-classified dynamics). Belt-and-suspenders.
Recommendation: start with (a). Ship Tier 1 for static entities only. Animated path stays slow but correct. If perf gate finds the static-only Tier 1 isn't enough, escalate to (c) for safety + (b) later.
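
As a starting point for option (a), a static-only cache with an animated bypass might be shaped like the sketch below. This is an assumption-laden illustration, not a design: `StaticOnlyCache<T>`, the `classify` delegate, and the `isAnimated` predicate are hypothetical, and the real classification result is per-(entity, batch) rather than one value per entity.

```csharp
using System;
using System.Collections.Generic;

// Sketch only: hypothetical shape for option (a).
public sealed class StaticOnlyCache<T>
{
    private readonly Dictionary<uint, T> _cache = new();
    private readonly Func<uint, bool> _isAnimated;
    private readonly Func<uint, T> _classify;

    public StaticOnlyCache(Func<uint, bool> isAnimated, Func<uint, T> classify)
    {
        _isAnimated = isAnimated;
        _classify = classify;
    }

    public T Resolve(uint entityId)
    {
        // Animated entities bypass the cache entirely: their per-frame state
        // (e.g. PartTransform) must be read live on every frame.
        if (_isAnimated(entityId))
            return _classify(entityId);

        // Static entities: classify once, then serve from the cache.
        if (_cache.TryGetValue(entityId, out var cached))
            return cached;

        T fresh = _classify(entityId);
        _cache[entityId] = fresh;
        return fresh;
    }

    // Call on entity removal (or on any event that changes a static entity).
    public bool Invalidate(uint entityId) => _cache.Remove(entityId);
}
```

Option (c) would add a DEBUG-only comparison on the cache-hit path: re-run `classify` and log when it disagrees with the cached value. The three behaviours in the sketch (hit, bypass, invalidate) map directly onto the new tests listed under acceptance.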
**Acceptance:**
- Build green; existing 999+ tests pass.
- 1-3 new tests covering: cache hit for static entity, cache bypass for animated entity, cache invalidation on entity remove.
- Visual gate: launch + walk Holtburg → North Yanshi at horizon-safe preset; confirm:
  - Animation works (NPCs, player character animate normally)
  - Buildings at correct positions
  - Lifestone (still depending on Priority 1 fix) renders correctly
  - No new visual regressions
- Perf gate (with `[WB-DIAG]`):
  - Entity dispatcher cpu_us median drops from ~3.5ms to ≤2.0ms (matches spec budget).
  - p95 stays ≤ 2.5ms.
---
## What's NOT in this phase
- **Tier 2 (static/dynamic split with persistent groups).** Separate ~2-week phase. See `docs/plans/2026-05-10-perf-tiers-2-3-roadmap.md`.
- **Tier 3 (GPU compute culling).** Separate ~1-month phase. Same roadmap.
- **Full High preset crash investigation beyond casual retest.** Stretch goal: re-test the High preset with Bug A + B fixed, see if it's stable now. If it crashes, file a new issue and continue. Don't deep-dive in this phase.
- **EnvCell modern path migration, Sky/Particles modern path, Shadow mapping** — all later phases.
- **N.6 perf polish (the previously-flagged "next phase").** N.6 was the original CLAUDE.md "Currently in flight" target before A.5. Most of N.6's scope was rolled into A.5 (perf-tier work). What's left of N.6 (persistent-mapped indirect buffer, GPU-side culling) overlaps with Tier 2/3 and should be re-scoped after Tier 1 lands.
---
## Acceptance criteria (whole phase)
- All three priorities (Lifestone, JobKind, Tier 1) shipped or one is explicitly deferred with documented reasoning.
- Build green throughout. 999+ tests pass; 8 pre-existing physics/input failures stay at 8.
- N.5b conformance sentinel intact (TerrainSlot, TerrainModernConformance, Wb*, MatrixComposition, TextureCacheBindless, SplitFormulaDivergence — all clean).
- Visual gate: lifestone renders; animation works; horizon visible at ~2.3km; smooth walking trace; no new artifacts.
- Perf gate (post-Tier-1): entity dispatcher cpu_us median ≤ 2.0ms at horizon-safe preset, ~250-300 FPS at standstill.
- Memory entry written + roadmap "shipped" row updated for the polish phase.
---
## What you'll be doing in the first 30 minutes
1. Read this handoff in full.
2. Verify build green: `dotnet build`. Verify ~999 tests pass: `dotnet test --no-build`.
3. Read `docs/superpowers/specs/2026-05-09-phase-a5-two-tier-streaming-design.md` §2, §4.10, §11 (deferred section).
4. Read `docs/plans/2026-05-10-perf-tiers-2-3-roadmap.md` Tier 1 section.
5. Read `docs/ISSUES.md` issues #52, #53, #54 in full.
6. Read `memory/project_phase_a5_state.md` (5 gotchas).
7. Read `src/AcDream.App/Streaming/LandblockStreamer.cs` HandleJob method.
8. Read `src/AcDream.App/Rendering/Wb/WbDrawDispatcher.cs` Draw + WalkEntitiesInto methods.
9. Skim `src/AcDream.Core/Physics/AnimationSequencer.cs` for write-sites of `meshRef.PartTransform` (Tier 1 retry's audit prerequisite).
10. Decide: which priority to start with? Recommendation order: 1 (lifestone, fast win), 2 (JobKind, easy cleanup), 3 (Tier 1, biggest perf win + most complex).
11. Brainstorm with the user on the chosen priority before writing code.
12. Write a small spec, or go straight to implementation if the priority is small (priorities 1 and 2 are small enough to skip a formal spec). Tier 1 (priority 3) needs a spec because of the audit + invalidation design.
Don't skip the audit step on Tier 1. The first attempt failed because of an incomplete read of the animation mutation graph; the second attempt should not repeat that.
---
## Things to NOT do
- **Don't rush Tier 1.** Audit first. Write down which entities are static vs dynamic. Write tests that specifically verify animated entities still animate after caching is enabled.
- **Don't bundle Tier 2 or Tier 3 into this phase.** Those are dedicated multi-week phases with their own brainstorm + spec + plan cycles.
- **Don't break the N.5b conformance sentinel.** Run the filter on every commit:
```
dotnet test --no-build --filter "FullyQualifiedName~TerrainSlot|FullyQualifiedName~TerrainModernConformance|FullyQualifiedName~Wb|FullyQualifiedName~MatrixComposition|FullyQualifiedName~TextureCacheBindless|FullyQualifiedName~SplitFormulaDivergence"
```
Expect 89+ passing, 0 failures.
- **Don't skip the visual gate.** Lifestone fix specifically requires looking at the lifestone in-game. Tier 1 retry requires confirming animation works on a moving NPC.
- **Don't delete the `_walkScratch` field** added in Bug B fix. It's load-bearing — without it, Tier 1 retry would re-introduce the per-frame allocation bug.
- **Don't re-add the `Tier1` cache that was reverted.** Start the retry with a fresh design after the animation audit. Cherry-picking the reverted code will re-introduce the bug.
---
## Reference: A.5's commit chain
Final A.5 commit chain on `claude/hopeful-darwin-ae8b87` (merged into main at `d3d78fa`):
| SHA | Subject |
|---|---|
| 9245db5 | phase(A.5): SHIP — two-tier streaming + horizon LOD + Quality Preset system |
| d93d823 | docs(A.5 T27): roadmap + ISSUES + CLAUDE.md updates for A.5 ship |
| a28a5b7 | docs(A.5 T27): spec + plan amendments for T22.5 + ship |
| 9b49009 | Revert "feat(perf): Tier 1 entity classification cache" |
| 3639a6f | feat(perf): Tier 1 entity classification cache (REVERTED) |
| 462f9d6 | docs(perf): roadmap for Tier 2 + Tier 3 entity-dispatcher optimizations |
| 0ad8c99 | fix(A.5): WalkEntities scratch-list pattern (Bug B — T17 GC pressure) |
| 9217fd9 | fix(A.5): strip far-tier entities in worker (Bug A — far tier optimization) |
| 28d2c60 | feat(A.5 T22.5): wire QualityPreset into renderer + streaming (commit 2/2) |
| afa4200 | feat(A.5 T22.5): QualityPreset schema + tests (commit 1/2) |
| c473fee | feat(A.5 T23): BUDGET_OVER flag in [WB-DIAG] / [TERRAIN-DIAG] |
| 3b684db | feat(A.5 T22): fog wired from N₁/N₂ + ACDREAM_FOG_*_MULT env vars |
| 1488ec6 | test(A.5 T21): lock in depth-write attribution per translucency kind |
| 26b2871 | feat(A.5 T20): MSAA 4x + alpha-to-coverage on foliage |
| 4b84e56 | feat(A.5 T19): mipmaps + 16x anisotropic on TerrainAtlas |
| (...60+ commits earlier in the chain through T1-T18) | (see full log on the merge bubble) |
The merge bubble preserves the full chain. To inspect any A.5 commit:
```
git log d3d78fa^..d3d78fa
git show <sha>
```
---
Good luck. The phase is well-bounded; the audit step on Tier 1 is the single highest-leverage thing to invest in. The lifestone and JobKind cleanup should be quick wins. After this phase ships, the project is in a great position — A.5 + polish + Tier 2/3 roadmap covers the rendering + perf work for the next several months.
Holler at the user if any of the three priorities reveals scope you didn't expect.