Merge branch 'claude/hopeful-darwin-ae8b87' — Phase A.5 SHIP + Quality Preset system
Phase A.5 (Two-tier Streaming + Horizon LOD) shipped.

Headline: 2.3 km terrain horizon (radius=4 near + 12 far) with off-thread
mesh build, fog blend at N₁, mipmaps + 16x AF, MSAA 4x + A2C foliage,
depth-write audit, BUDGET_OVER diag, Quality Preset system
(Low/Medium/High/Ultra) with env-var overrides + F11 mid-session re-apply.
~999 tests pass, 8 pre-existing physics/input failures unchanged.

Two structural-to-A.5 bug fixes shipped post-T26:
- Bug A (9217fd9): far-tier worker strips entities (T13/T16 had only wired
  the controller side; far-tier was loading full entity layers, ~71K
  entities instead of ~10K, 5x perf regression).
- Bug B (0ad8c99): WalkEntities scratch list reused across frames (was
  480 KB / frame allocation).

Tier 1 entity-classification cache attempted as polish (3639a6f), reverted
(9b49009): it broke animation by caching mutable per-frame state. Retry
deferred to post-A.5 polish phase (ISSUE #53).

Deferred to post-A.5 polish:
- Tier 1 retry with animation-mutation audit (ISSUE #53)
- Lifestone missing visual (ISSUE #52)
- JobKind plumbing through BuildLandblockForStreaming (ISSUE #54)
- Tier 2 (static/dynamic split) + Tier 3 (GPU compute cull): separate
  multi-week phases. Roadmap at docs/plans/2026-05-10-perf-tiers-2-3-roadmap.md.

SHIP commit: 9245db5.
commit
d3d78fa14f
37 changed files with 6001 additions and 281 deletions
|
|
@@ -46,6 +46,74 @@ Copy this block when adding a new issue:
|
|||
|
||||
# Active issues
|
||||
|
||||
## #54 — A.5/jobkind-plumbing: far-tier worker loads full entity layer then strips
|
||||
|
||||
**Status:** OPEN
|
||||
**Severity:** LOW (correctness/perf; worker wastes CPU on far-tier LandBlockInfo + scenery generation that is immediately discarded)
|
||||
**Filed:** 2026-05-10
|
||||
**Component:** streaming / LandblockStreamer
|
||||
|
||||
**Description:** Bug A's fix (commit `9217fd9`) patches at the worker output: after a far-tier job completes the full `LoadNear` path, the result's entity list is stripped before posting to the completion queue. Far-tier LBs therefore still load `LandBlockInfo`, run `SceneryGenerator`, and call `LandblockLoader.BuildEntitiesFromInfo`, even though those results are thrown away. At N₂=12, that is ~544 far-tier LBs, each doing unnecessary dat reads and scenery math on promotion sequences.
|
||||
|
||||
**Proper fix:** plumb `LandblockStreamJobKind` through `BuildLandblockForStreaming` so far-tier jobs call only `LandBlock` heightmap read + `LandblockMesh.Build`, skipping `LandBlockInfo` + `SceneryGenerator` entirely. The function signature change is ~5 lines; wiring is ~10 lines. Estimated 30 min–1 hour total.
|
||||
|
||||
**Files:**
|
||||
- `src/AcDream.App/Streaming/LandblockStreamer.cs` — `HandleJob` + `BuildLandblockForStreaming`
|
||||
|
||||
**Acceptance:** Far-tier LB worker path reads only the `LandBlock` dat file (no `LandBlockInfo`, no `SceneryGenerator` call). Verified by adding a counter diagnostic or via dotnet-trace showing the dat-read call count per job kind.
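
The proposed plumbing and the counter diagnostic could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the real `LandblockStreamer` code: `StreamerSketch`, `StreamBuildSketch`, the two reader delegates, `InfoReadCount`, and the `LoadFar` enum member are hypothetical names; only `LandblockStreamJobKind`, `LoadNear`, and `BuildLandblockForStreaming` appear in this issue.

```csharp
using System;
using System.Collections.Generic;

// Member names are assumed; the issue only names the enum type and LoadNear.
public enum LandblockStreamJobKind { LoadNear, LoadFar }

// Hypothetical result shape: terrain heights plus the entity IDs to register.
public sealed record StreamBuildSketch(float[] Heights, IReadOnlyList<uint> EntityIds);

public sealed class StreamerSketch
{
    private readonly Func<uint, float[]> _readLandBlock;                  // stands in for the LandBlock dat read
    private readonly Func<uint, IReadOnlyList<uint>> _readLandBlockInfo;  // stands in for LandBlockInfo + scenery

    // Counter diagnostic: how many LandBlockInfo reads the worker performed.
    public int InfoReadCount { get; private set; }

    public StreamerSketch(Func<uint, float[]> readLandBlock,
                          Func<uint, IReadOnlyList<uint>> readLandBlockInfo)
    {
        _readLandBlock = readLandBlock;
        _readLandBlockInfo = readLandBlockInfo;
    }

    public StreamBuildSketch BuildLandblockForStreaming(uint landblockId, LandblockStreamJobKind kind)
    {
        // Both tiers need the heightmap for the terrain mesh.
        float[] heights = _readLandBlock(landblockId);

        if (kind == LandblockStreamJobKind.LoadFar)
        {
            // Far tier: never touch LandBlockInfo or scenery, instead of
            // loading them and stripping the result afterwards.
            return new StreamBuildSketch(heights, Array.Empty<uint>());
        }

        InfoReadCount++;
        return new StreamBuildSketch(heights, _readLandBlockInfo(landblockId));
    }
}
```

With a shape like this, the acceptance check reduces to asserting that the counter stays at zero after a batch of far-tier jobs.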
|
||||
|
||||
---
|
||||
|
||||
## #53 — A.5/tier1-redo: entity-classification cache broke animation (reverted)
|
||||
|
||||
**Status:** OPEN
|
||||
**Severity:** MEDIUM (perf gap; the classification cache would save ~1-2ms/frame but cannot land until animation-mutation audit is done)
|
||||
**Filed:** 2026-05-10
|
||||
**Component:** rendering / WbDrawDispatcher / AnimationSequencer
|
||||
|
||||
**Description:** Tier 1 entity-classification cache (commit `3639a6f`) was reverted at `9b49009` due to an animation regression. The cache stored `meshRef.PartTransform` at first-classify time. For static entities this is stable. For animated entities, `AnimationSequencer` mutates `meshRef.PartTransform` every frame to apply the current skeletal pose. The cache froze the pose, causing NPCs and some animated entities to stop animating (some buildings also showed at wrong positions, likely entities incorrectly flagged as animated).
|
||||
|
||||
**Root cause:** the "trust MeshRefs as the source of truth" comment in the dispatcher gave false confidence. `MeshRef` state is the source of truth, but for animated entities it is mutated every frame, so a value cached at first-classify time goes stale immediately.
|
||||
|
||||
**Next attempt needs:**
|
||||
|
||||
1. Audit `AnimationSequencer` + `AnimationHookRouter` to identify ALL per-frame mutations of `MeshRef` state (not just `PartTransform` — are any other fields mutated?).
|
||||
2. Redesign cache to: (a) bypass animated entities entirely (classify them each frame, cache only static entities), OR (b) cache only the animation-invariant subset of the classification key (group key, texture handle, blend mode) while reading the per-frame pose from the live `MeshRef`.
|
||||
3. Test specifically with a moving animated NPC visible on screen before shipping.
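
Option (b) above could look roughly like the sketch below: cache only the animation-invariant key and read the pose from the live object on every call. All type and member names here (`MeshRefSketch`, `GroupKeySketch`, `ClassificationCacheSketch`) are hypothetical stand-ins, and which fields are truly animation-invariant is exactly what the audit in step 1 has to establish.

```csharp
using System;
using System.Collections.Generic;
using System.Numerics;

// Hypothetical stand-in for the real MeshRef.
public sealed class MeshRefSketch
{
    public uint GfxObjId;
    public uint TextureHandle;
    public int BlendMode;
    public Matrix4x4 PartTransform = Matrix4x4.Identity; // mutated per frame when animated
}

// Assumed animation-invariant subset of the classification key.
public readonly record struct GroupKeySketch(uint GfxObjId, uint TextureHandle, int BlendMode);

public sealed class ClassificationCacheSketch
{
    private readonly Dictionary<uint, GroupKeySketch> _keys = new();

    // Counts how often the expensive classification actually ran.
    public int ClassifyCalls { get; private set; }

    public (GroupKeySketch Key, Matrix4x4 Pose) Resolve(uint entityId, MeshRefSketch meshRef)
    {
        if (!_keys.TryGetValue(entityId, out GroupKeySketch key))
        {
            ClassifyCalls++;
            key = new GroupKeySketch(meshRef.GfxObjId, meshRef.TextureHandle, meshRef.BlendMode);
            _keys[entityId] = key;
        }

        // The pose is never cached: it is read from the live MeshRef each call.
        return (key, meshRef.PartTransform);
    }

    // Called when a palette/ObjDesc event changes the key inputs.
    public void Invalidate(uint entityId) => _keys.Remove(entityId);
}
```

The point of the split is that a stale cache entry can at worst produce a wrong group key, never a frozen pose.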
|
||||
|
||||
**Estimated:** 1 week including audit + redesign + retest.
|
||||
|
||||
**Files:**
|
||||
- `src/AcDream.App/Rendering/Wb/WbDrawDispatcher.cs` — dispatcher classification logic
|
||||
- `src/AcDream.Core/Animation/AnimationSequencer.cs` — mutation source
|
||||
- `src/AcDream.Core/Animation/AnimationHookRouter.cs` — secondary mutation source
|
||||
|
||||
---
|
||||
|
||||
## #52 — A.5/lifestone-missing: Holtburg lifestone not rendering
|
||||
|
||||
**Status:** OPEN
|
||||
**Severity:** MEDIUM (visible missing landmark; lifestone is the player's respawn anchor and should always be visible)
|
||||
**Filed:** 2026-05-10
|
||||
**Component:** streaming / rendering
|
||||
|
||||
**Description:** The Holtburg lifestone (spinning blue crystal) stopped rendering at some point earlier in A.5 development. To reproduce: launch the live client, walk to Holtburg town center, and look toward the lifestone position. Expected: the spinning blue crystal. Actual: nothing is drawn.
|
||||
|
||||
**Root cause (suspected, two candidates):**
|
||||
|
||||
1. Bug A's far-tier strip (commit `9217fd9`) may be incorrectly stripping a near-tier entity. The lifestone's server GUID is `0x5000000A`; its dat object may be registering via the `LandBlockInfo` path but getting stripped as if it were a far-tier entity due to a tier-classification race or incorrect LB-tier tracking.
|
||||
2. Separate regression from earlier in the A.5 development chain — possibly introduced when entity registration was restructured during T13/T16 streaming controller wiring.
|
||||
|
||||
**Investigation approach:**
|
||||
|
||||
1. Add a `[STREAMING-DIAG]` log line when far-tier stripping drops an entity — log the entity's GfxObj ID and LB address so the lifestone's GfxObj ID appears in the log if it is being stripped.
|
||||
2. If not in the strip log, check whether the lifestone's LB is registering as near-tier at all during first-tick bootstrap.
|
||||
3. Bisect to find the commit that broke it if the above two checks don't isolate the cause.
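
Step 1 could be wired up along these lines. This is a sketch only: `EntitySketch`, `FarTierStripSketch`, and the injected `log` callback are hypothetical, and the exact log format is an assumption beyond the `[STREAMING-DIAG]` tag, GfxObj ID, and LB address named above.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical minimal entity shape for the diagnostic.
public readonly record struct EntitySketch(uint GfxObjId, uint Guid);

public static class FarTierStripSketch
{
    // Drops every entity of a far-tier result and logs one line per drop, so a
    // wrongly stripped entity (such as the lifestone) shows up in the log.
    public static List<EntitySketch> StripForFarTier(
        uint landblockId, IReadOnlyList<EntitySketch> entities, Action<string> log)
    {
        foreach (EntitySketch e in entities)
        {
            log($"[STREAMING-DIAG] far-tier strip dropped GfxObj=0x{e.GfxObjId:X8} LB=0x{landblockId:X8}");
        }

        return new List<EntitySketch>(); // far tier keeps terrain only
    }
}
```

Grepping the log for the lifestone's GfxObj ID then answers candidate 1 directly; an empty result moves the investigation to step 2.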
|
||||
|
||||
**Acceptance:** Launch live, walk to Holtburg center, spinning blue crystal visible at the lifestone position. No regression on other static entities in the area.
|
||||
|
||||
---
|
||||
|
||||
## #50 — Road-edge tree at 0xA9B1 visible in acdream but not retail
|
||||
|
||||
**Status:** OPEN
|
||||
|
|
|
|||
|
|
@@ -1,6 +1,6 @@
|
|||
# acdream — strategic roadmap
|
||||
|
||||
-**Status:** Living document. Updated 2026-05-09 for Phase N.5b shipping (terrain on the modern rendering path via Path C — mirror WB's `TerrainRenderManager` pattern, consume `LandblockMesh.Build` for retail formula compliance; closes ISSUE #51). N.6 (perf polish) remains the in-flight phase.
+**Status:** Living document. Updated 2026-05-10 for Phase A.5 shipping (two-tier streaming N₁=4/N₂=12 + QualityPreset system + Bug A/B fixes; closes the two-tier streaming spec). Post-A.5 polish (Tier 1 retry + lifestone fix + JobKind plumbing) is now the in-flight work.
|
||||
**Purpose:** One source of truth for where the project is and where it's going. Every observed defect or missing feature has a named phase that owns it; when something looks wrong in-game, look here to find the phase that'll address it. Implementation details live in per-phase specs under `docs/superpowers/specs/`, not in this file.
|
||||
|
||||
---
|
||||
|
|
@@ -31,6 +31,7 @@
|
|||
| A.1 | Streaming landblock loader — runtime-configurable visible window (default 5×5, `ACDREAM_STREAM_RADIUS`), camera-centered offline / player-centered live, hysteresis-based unloads, pending-spawn list for late CreateObject events | Live ✓ |
|
||||
| A.2 | Frustum culling — per-landblock AABB test (Gribb-Hartmann), terrain + static-mesh renderers skip culled landblocks, perf overlay in window title | Visual ✓ |
|
||||
| A.3 | Background net receive thread — dedicated daemon thread buffers UDP into Channel, render thread drains | Visual ✓ |
|
||||
| A.5 | Two-tier streaming + horizon LOD — N₁=4 (full detail, 81 LBs) + N₂=12 (terrain only, 544 LBs); fog blend at N₁; per-LB entity dispatcher walk tightened (Change #1 animated-walk fix + Change #2 cached AABB); single-worker off-thread mesh build; mipmaps + 16x anisotropic on TerrainAtlas; A2C with MSAA 4x on foliage; depth-write audit + lock-in test; **NEW T22.5: QualityPreset system** (Low/Medium/High/Ultra) with per-preset radii + MSAA + anisotropic + A2C + completions; env-var overrides per field; F11 mid-session re-apply. **Bug fixes post-T26 ship-prep**: (Bug A) far-tier worker now strips entities from far-tier loads — without this fix, far-tier LBs were loading their full entity layer (~71K entities) defeating the two-tier optimization; (Bug B) WalkEntities switched from per-frame fresh-list allocation to caller-provided scratch list (eliminated ~480 KB/frame GC pressure). **Deferred to post-A.5**: Tier 1 entity-classification cache (first attempt broke animation; revert + redo with animation-mutation audit), lifestone visual (missing in render), JobKind plumbing through BuildLandblockForStreaming (proper Bug A fix), Tier 2/3 perf optimizations (roadmap at docs/plans/2026-05-10-perf-tiers-2-3-roadmap.md). Plan archived at docs/superpowers/plans/2026-05-09-phase-a5-two-tier-streaming.md. | Live ✓ |
|
||||
| B.3 | Physics MVP resolver foundation — terrain contact, CellSurface prototype, streaming-populated collision inputs, and first `PhysicsEngine` resolver path. Not the complete retail collision system. | Tests ✓ |
|
||||
| B.2 | Player movement mode — Tab-toggled WASD ground walking, walk/run/idle animations, third-person chase camera, MoveToState + AutonomousPosition outbound, portal entry. Outdoor-only MVP. | Live ✓ |
|
||||
| D.1 | 2D ortho overlay + font rendering (StbTrueTypeSharp atlas + TextRenderer + DebugOverlay) | Visual ✓ |
|
||||
|
|
@@ -82,7 +83,7 @@ Plus polish that doesn't get its own phase number:
|
|||
- **✓ SHIPPED — A.2 — Frustum culling.** Per-landblock AABB test (Gribb-Hartmann plane extraction + positive-vertex AABB test) in both `TerrainRenderer.Draw` and `StaticMeshRenderer.Draw`. Per-entity culling deferred. LOD deferred to Phase C. Performance overlay in window title shows FPS, frame time, visible/total landblock ratio, entity count, animated count. ~160fps uncapped at 5×5 radius.
|
||||
- **✓ SHIPPED — A.3 — Background net receive thread.** Dedicated daemon thread continuously pulls raw UDP datagrams from the kernel buffer into a `Channel<byte[]>`. Render thread's `Tick()` drains the channel. All decode, fragment assembly, ISAAC crypto, event dispatch, and ack-sending remain on the render thread — minimal change that prevents packet drops during frame stalls. Thread starts after `EnterWorld()` completes; `PumpOnce()` during handshake still reads the socket directly.
|
||||
- **A.4 — Async dat decoding.** Folded into the streaming worker — it's the worker's read path, not a separate subsystem. Called out here because regressions in dat caching could land on this surface.
|
||||
-- **A.5 — Two-tier streaming + terrain horizon LOD.** Split `ACDREAM_STREAM_RADIUS` into two: `ACDREAM_TERRAIN_RADIUS` (large, 8-12 cells = 1.5-2.3km) for terrain mesh + `ACDREAM_ENTITY_RADIUS` (small, 2-3 cells, current default) for entities + scenery. Distant landblocks render terrain only — no NPCs, no procedural scenery, no static objects. Tune `SceneLightingUbo`'s `uFogParams` so the far edge fades into sky color (eliminates the hard streaming boundary visible at higher radii). Optional: terrain LOD via mesh decimation for very distant chunks (combine 2×2 landblocks into one decimated mesh; cribs from `references/WorldBuilder/Chorizite.OpenGLSDLBackend/Lib/TerrainRenderManager.cs`). Motivation: at radius=5 today, perf scales from ~810 fps → ~200-300 fps because everything stays full-detail; both retail and WorldBuilder render terrain way out and strip entities/scenery at distance. Enables WB-style horizon visibility. **Estimate: 3-5 days for the radius split + fog tuning; +1 week if terrain LOD is included.** Not yet brainstormed.
+- **✓ SHIPPED — A.5 — Two-tier streaming + horizon LOD.** Shipped 2026-05-10. See shipped table above for full description. Plan archived at `docs/superpowers/plans/2026-05-09-phase-a5-two-tier-streaming.md`.
|
||||
|
||||
**Acceptance:**
|
||||
- Walk across 10+ landblocks in any direction, no crashes, no empty voids.
|
||||
|
|
@@ -665,7 +666,7 @@ for our deletions/additions; merge upstream `master` periodically.
|
|||
manifest at higher radius. Spec acceptance criterion #5 was wrong;
|
||||
amended via `docs/plans/2026-05-09-phase-n5b-perf-baseline.md`. Plan
|
||||
archived at `docs/superpowers/plans/2026-05-09-phase-n5b-terrain-modern.md`.
|
||||
-- **N.6 — Perf polish.** **Currently in flight.**
+- **N.6 — Perf polish.** **Planned (post-A.5 polish takes priority).**
|
||||
Builds on N.5 + N.5b. Legacy renderer retirement was pulled forward
|
||||
into N.5 ship amendment — `InstancedMeshRenderer`, `StaticMeshRenderer`,
|
||||
`WbFoundationFlag` are gone — and the terrain legacy renderer
|
||||
|
|
@@ -676,8 +677,8 @@ for our deletions/additions; merge upstream `master` periodically.
|
|||
is a candidate), GPU-side culling via compute pre-pass (eliminates
|
||||
the per-frame slot walk + DEIC build entirely), GL_TIME_ELAPSED query
|
||||
double-buffering (deferred from N.5 — diagnostic shows `gpu_us=0/0`
|
||||
-under `ACDREAM_WB_DIAG=1`), direct higher-radius perf comparison once
-A.5 lands (where modern's architectural wins manifest), retire the
+under `ACDREAM_WB_DIAG=1`), direct higher-radius perf comparison (A.5
+has now landed — modern's architectural wins are measurable), retire the
|
||||
legacy `Texture2D`/`sampler2D` path in `TextureCache` (currently kept
|
||||
for Sky + Debug + particle paths now that Terrain has migrated).
|
||||
Plan + spec written when work begins. **Estimate: 1-2 weeks.**
|
||||
|
|
|
|||
195
docs/plans/2026-05-10-perf-tiers-2-3-roadmap.md
Normal file
|
|
@@ -0,0 +1,195 @@
|
|||
# Performance Tiers 2 + 3 — Future Roadmap
|
||||
|
||||
**Created:** 2026-05-10 during Phase A.5 polish.
|
||||
**Status:** Future planning — not for current execution.
|
||||
**Context:** A.5 shipped two-tier streaming with the entity dispatcher landing at ~3.5ms median (post-Bug-A and Bug-B fixes). Tier 1 (entity-classification cache) is planned as post-A.5 polish (the first attempt was reverted, see ISSUE #53) and is expected to bring the dispatcher inside the 2.0ms spec budget. Tiers 2 + 3 are the "next big perf wins" beyond Tier 1.
|
||||
|
||||
---
|
||||
|
||||
## Background — why this exists
|
||||
|
||||
Discussion captured 2026-05-10: user observed 200-240 FPS at radius=12 on a Radeon 9070 XT @ 1440p and asked why an "old game like AC" doesn't deliver Unreal-level (1000+ FPS) on this hardware.
|
||||
|
||||
The honest answer: the bottleneck is *architectural*, not hardware. The CPU is single-threaded and rebuilds the entire draw plan from scratch every frame. Modern engines pre-bake static-world batches at content-cook time and rebuild only what changes.
|
||||
|
||||
AC's design — server-spawned per-entity world streamed at runtime — doesn't naturally batch the way Unreal's pre-cooked content does. Closing the gap requires backporting modern techniques while preserving AC's data model. Tiers 2 and 3 are that backporting work.
|
||||
|
||||
---
|
||||
|
||||
## Tier 2 — Static/dynamic split with persistent groups
|
||||
|
||||
**Estimated effort:** ~10-15 days (2-week phase).
|
||||
**Estimated win:** entity dispatcher ~3.5ms → **~0.5-1ms median** at radius=12.
|
||||
**Total frame time:** ~4-5ms → **~2-3ms ≈ 330-500 FPS at standstill.**
|
||||
|
||||
### The core idea
|
||||
|
||||
Today, `WbDrawDispatcher._groups` (the dictionary of "(mesh + texture + blend) → list of instances to draw") is cleared and rebuilt from scratch every frame.
|
||||
|
||||
For trees, rocks, buildings, and other static entities (~95% of the world), the rebuilt groups come out identical every frame. Tier 2 makes the static-group instance buffers **persistent GPU-resident data**, just like Unreal's pre-baked world. The CPU only orchestrates "which groups are visible" per frame.
|
||||
|
||||
### Architectural shift
|
||||
|
||||
```csharp
using System.Collections;          // BitArray
using System.Collections.Generic;  // Dictionary
using System.Numerics;             // Matrix4x4

// Design sketch. GroupKey is the (mesh + texture + blend) key described
// above; its definition is not shown here.
class StaticInstancedGroup
{
    public GroupKey Key;
    public Matrix4x4[] Matrices;               // grown as entities spawn
    public BitArray ActiveSlots;               // for free-list reuse
    public bool NeedsGpuUpload;                // dirty flag for delta upload
    public Dictionary<uint, int> EntityToSlot; // for despawn lookup
    public uint InstanceBufferOffset;          // start of group's slice in global SSBO
}
```
|
||||
|
||||
**On entity spawn (atlas-tier static):** allocate a slot in each relevant group, write the matrix, mark dirty.
|
||||
|
||||
**On entity despawn:** free the slot, mark dirty.
|
||||
|
||||
**Per frame:**
|
||||
- Static groups: LB-cull each group (cheap). For visible groups, flag for draw. **No matrix copy. No list rebuild.**
|
||||
- Dynamic entities (~50 NPCs/players): today's per-frame walk-and-classify. Keeps the existing slow path for things that legitimately change every frame.
|
||||
- Upload only the dirty groups' matrix slices (delta upload, not full reupload).
|
||||
- Issue 2 multi-draw-indirect calls.
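
The spawn/despawn bookkeeping described above could be sketched as a small slot allocator. This is illustrative only: `StaticGroupSlotsSketch` is a hypothetical name, it uses a `Stack<int>` free-list rather than the `BitArray` in the design sketch, and it omits the GPU upload itself.

```csharp
using System;
using System.Collections.Generic;
using System.Numerics;

public sealed class StaticGroupSlotsSketch
{
    private Matrix4x4[] _matrices = new Matrix4x4[4];
    private readonly Stack<int> _freeSlots = new();
    private readonly Dictionary<uint, int> _entityToSlot = new();
    private int _highWater; // first never-used slot index

    public bool NeedsGpuUpload { get; private set; }
    public int ActiveCount => _entityToSlot.Count;

    // Spawn: reuse a freed slot if one exists, otherwise grow at the end.
    public int Spawn(uint entityId, Matrix4x4 world)
    {
        if (_entityToSlot.ContainsKey(entityId))
            throw new InvalidOperationException($"entity {entityId} already has a slot");

        int slot = _freeSlots.Count > 0 ? _freeSlots.Pop() : _highWater++;
        if (slot >= _matrices.Length)
            Array.Resize(ref _matrices, _matrices.Length * 2);

        _matrices[slot] = world;
        _entityToSlot[entityId] = slot;
        NeedsGpuUpload = true; // dirty: this group's slice must be re-uploaded
        return slot;
    }

    // Despawn: return the slot to the free-list. A real implementation would
    // also have to stop the stale matrix in that slot from being drawn.
    public bool Despawn(uint entityId)
    {
        if (!_entityToSlot.Remove(entityId, out int slot))
            return false; // unknown entity: nothing to free, so no double-free

        _freeSlots.Push(slot);
        NeedsGpuUpload = true;
        return true;
    }

    // Called after the delta upload for this group has been issued.
    public void MarkUploaded() => NeedsGpuUpload = false;
}
```

Routing every free through the entity-to-slot map is one way to make the double-free risk listed later structurally impossible, at the cost of a dictionary lookup per despawn.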
|
||||
|
||||
### Sub-decisions
|
||||
|
||||
**Frustum cull granularity at the group level:** at group level you can't reject individual instances; you draw the whole group or none of it. Two strategies:
|
||||
|
||||
- **Per-LB subgroups:** split each group into per-landblock subgroups. LB-frustum-culls reject subgroups whose LB is invisible. ~2K groups × ~5 LBs per group on average = ~10K subgroups. Each subgroup AABB cull is ~0.3 µs → ~3 ms per frame. Roughly a wash with today's per-entity cull.
|
||||
- **Per-instance GPU cull (Tier 3):** compute pre-pass on the GPU writes which instances are visible to a draw-indirect buffer. ~0.05ms CPU. The right long-term answer.
|
||||
|
||||
For Tier 2 alone, per-LB subgroups are the recommended approach — keep CPU culling, just at coarser granularity than per-entity.
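
A per-subgroup cull of this kind could use the same positive-vertex AABB test the roadmap describes for landblocks. The sketch below is an assumption-laden illustration: `SubgroupCullSketch` is a hypothetical name, and it assumes inward-facing frustum planes stored as `System.Numerics.Plane`. At the quoted ~0.3 µs per test, ~10K subgroups cost about 10,000 × 0.3 µs = 3 ms per frame.

```csharp
using System;
using System.Collections.Generic;
using System.Numerics;

public static class SubgroupCullSketch
{
    // Assumes planes point inward: a point p is inside a plane when
    // Dot(normal, p) + D >= 0.
    public static bool IsAabbVisible(IReadOnlyList<Plane> frustum, Vector3 min, Vector3 max)
    {
        foreach (Plane plane in frustum)
        {
            // Positive vertex: the AABB corner farthest along the plane normal.
            var positive = new Vector3(
                plane.Normal.X >= 0 ? max.X : min.X,
                plane.Normal.Y >= 0 ? max.Y : min.Y,
                plane.Normal.Z >= 0 ? max.Z : min.Z);

            // If even the farthest corner is behind the plane, the whole box is outside.
            if (Plane.DotCoordinate(plane, positive) < 0)
                return false;
        }

        return true; // conservative: may keep boxes that only straddle a frustum corner
    }
}
```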
|
||||
|
||||
**Dynamic entities crossing LB boundaries:** when an NPC walks across a landblock boundary, it stays in the same group key but its "spatial bucket" changes. Solution: dynamic entities are tracked in a single global "dynamic group" outside the per-LB structure; they don't need spatial bucketing because there are only ~50 of them.
|
||||
|
||||
**Palette override invalidation:** server event swaps an NPC's clothing color → group key changes. Treat as despawn-from-old + spawn-into-new. NPCs are dynamic so this just rebuckets them.
|
||||
|
||||
**Animation overrides on static entities:** static entities don't animate. Trees don't bend (foliage wave is a vertex shader effect, not a group-key change). Buildings don't move. So the static path never invalidates.
|
||||
|
||||
**EnvCell visibility:** dungeon entities are gated by per-cell visibility state. Need to track which group instances are tied to which cell, and during visibility cull, gate per-cell. Keep using existing `ParentCellId` field on WorldEntity.
|
||||
|
||||
**Streaming load/unload integration:** when an LB unloads, all its static entity matrices need to be removed from their groups. Free-list management. Matches existing `LandblockSpawnAdapter` lifecycle.
|
||||
|
||||
### Effort breakdown
|
||||
|
||||
| Task | Days |
|---|---|
| Design + invariants document | 2 |
| Spawn-time slot allocator + free-list | 3 |
| Per-frame visibility + dirty-flag delta upload | 2 |
| Dynamic entity path (NPCs, projectiles) | 2 |
| Invalidation (palette/ObjDesc events) | 2 |
| EnvCell visibility integration | 1 |
| Streaming load/unload integration | 1 |
| Conformance testing | 2-3 |
| **Total** | **~10-15 days** |
|
||||
|
||||
### Risks
|
||||
|
||||
- **Slot management bugs** = double-frees or leaks (entities draw at random positions — visible).
|
||||
- **Invalidation bugs** = stale matrices (entity teleports back to spawn point when palette changes).
|
||||
- **Dynamic entity tracking** adds complexity around the static/dynamic boundary.
|
||||
|
||||
### Mitigations
|
||||
|
||||
- **Conformance test:** render a fixed scene through both pipelines, compare draw output. Adds CI infrastructure.
|
||||
- **Per-frame validation in debug:** walk all groups, assert no orphan slots.
|
||||
- **Hash invariant test:** static entities should produce stable group keys frame-over-frame. Add a debug assertion that fires once per frame in Debug builds.
|
||||
|
||||
---
|
||||
|
||||
## Tier 3 — GPU-side culling (compute pre-pass)
|
||||
|
||||
**Estimated effort:** ~1 month (longer phase).
|
||||
**Estimated win:** entity dispatcher ~0.5-1ms (post-Tier-2) → **~0.05ms median.**
|
||||
**Total frame time:** ~2-3ms → **~1.5-2ms ≈ 500-670 FPS at standstill.**
|
||||
|
||||
### The core idea
|
||||
|
||||
Today (and after Tier 2), the CPU does per-LB or per-subgroup frustum culling and tells the GPU which groups to draw.
|
||||
|
||||
Tier 3 moves per-instance frustum cull to the GPU via a compute shader pre-pass. The CPU just uploads "here are all 1M instance matrices" once; the GPU compute shader writes which ones are visible to a draw-indirect buffer; the rasterizer draws only those.
|
||||
|
||||
This is the level Unreal is at. With this, per-frame CPU work for the entity dispatcher becomes essentially "tell the GPU what to do" + a tiny scratch upload.
|
||||
|
||||
### Why Tier 3 needs Tier 2 first
|
||||
|
||||
Without Tier 2's persistent group structure, GPU culling has nothing stable to operate on. The compute shader needs an addressable "here are the static instances" buffer to read from; that buffer only exists after Tier 2.
|
||||
|
||||
### Sub-decisions to be made
|
||||
|
||||
**Compute shader API:** OpenGL 4.3+ compute shaders are sufficient. We're already at GL 4.3+ for bindless. No additional capability requirement.
|
||||
|
||||
**Indirect draw command generation:** the compute shader writes a `DrawElementsIndirectCommand[]` buffer per pass. Render thread issues `glMultiDrawElementsIndirect` reading from that buffer. No CPU readback.
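
For reference, the command record and a CPU model of the compaction step might look like the sketch below. The struct mirrors the five 32-bit fields OpenGL expects for `glMultiDrawElementsIndirect`; everything else (`IndirectCompactionSketch`, the `isVisible` callback, the compacted index list) is hypothetical. The real pre-pass would be a GLSL compute shader, so a CPU version like this would mainly serve the "GPU cull matches CPU cull" validation task.

```csharp
using System;
using System.Collections.Generic;
using System.Runtime.InteropServices;

// Field order and sizes follow the OpenGL indirect-draw command layout (20 bytes).
[StructLayout(LayoutKind.Sequential)]
public struct DrawElementsIndirectCommand
{
    public uint Count;         // number of indices
    public uint InstanceCount; // number of instances to draw
    public uint FirstIndex;
    public int BaseVertex;
    public uint BaseInstance;  // first instance slot for this draw
}

public static class IndirectCompactionSketch
{
    // For each group, keeps only the visible instances, packs their indices
    // into one list, and rewrites the command to address that packed range.
    public static (List<DrawElementsIndirectCommand> Commands, List<uint> VisibleInstances) Compact(
        IReadOnlyList<DrawElementsIndirectCommand> groups, Func<uint, bool> isVisible)
    {
        var commands = new List<DrawElementsIndirectCommand>();
        var visible = new List<uint>();

        foreach (DrawElementsIndirectCommand group in groups)
        {
            uint start = (uint)visible.Count;
            for (uint i = 0; i < group.InstanceCount; i++)
            {
                uint instance = group.BaseInstance + i;
                if (isVisible(instance)) visible.Add(instance);
            }

            uint count = (uint)visible.Count - start;
            if (count == 0) continue; // whole group culled: emit no draw

            DrawElementsIndirectCommand cmd = group; // struct copy
            cmd.InstanceCount = count;
            cmd.BaseInstance = start;
            commands.Add(cmd);
        }

        return (commands, visible);
    }
}
```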
|
||||
|
||||
**LOD selection:** opportunity to add per-instance LOD selection in the compute shader (distance-based mesh detail). Not needed for A.5's scope; could be a Tier 4 follow-up.
|
||||
|
||||
**Per-light shadow map culling:** if shadows ship, GPU culling extends naturally to per-light frustum cull. Significant win for shadow rendering.
|
||||
|
||||
### Effort breakdown
|
||||
|
||||
| Task | Days |
|---|---|
| Compute shader design + GLSL implementation | 4 |
| Buffer layout coordination with Tier 2 | 2 |
| Silk.NET compute dispatch integration | 3 |
| Indirect command compaction logic | 4 |
| LOD selection (optional, ~stretch) | 4 |
| Validation: per-instance cull matches CPU cull within epsilon | 3 |
| Conformance + regression testing | 5 |
| **Total** | **~21-25 days, ~1 month** |
|
||||
|
||||
### Risks
|
||||
|
||||
- **GPU stalls** if the compute shader takes longer than expected (esp. on lower-end GPUs).
|
||||
- **Sync overhead** between compute pre-pass and rasterizer pass.
|
||||
- **Debugging difficulty** — GPU compute bugs are harder to diagnose than CPU bugs.
|
||||
|
||||
### Mitigations
|
||||
|
||||
- **Profile-driven design:** measure compute shader runtime on target hardware before committing.
|
||||
- **Fallback path:** keep CPU cull as a runtime-toggleable option (env var) so we can A/B compare.
|
||||
- **GPU debugging tools:** RenderDoc captures + frame-by-frame compute shader inspection.
|
||||
|
||||
---
|
||||
|
||||
## When to schedule these
|
||||
|
||||
**Tier 2:**
|
||||
- Best fit: dedicated 2-week phase after a SHIP cycle. Treat it like a Phase B/C/N (i.e., name it Phase A.6 or N.7).
|
||||
- Trigger: user wants to push radius beyond 12 (e.g., to 15 or 20 for true continent-scale horizon).
|
||||
- Trigger: user wants to add 100+ active NPCs in a city without dropping below 240Hz.
|
||||
|
||||
**Tier 3:**
|
||||
- Best fit: after Tier 2 has been live and stable for at least one cycle.
|
||||
- Trigger: shadow map work begins (GPU cull + shadow cull share the same compute pre-pass infrastructure).
|
||||
- Trigger: user wants 500+ FPS sustained for very-high-refresh scenarios (360Hz monitors, future hardware).
|
||||
|
||||
**Both:**
|
||||
- Don't bundle with other phases. These are dedicated perf phases with their own brainstorm + spec + plan + SHIP cycles.
|
||||
|
||||
---
|
||||
|
||||
## What's "free" or smaller (out of Tier 1/2/3 scope but worth noting)
|
||||
|
||||
- **Plumb `JobKind` properly through `BuildLandblockForStreaming`** (~30 min). Today's Bug A patch wastes worker-thread CPU on hydration that gets thrown away for far-tier. Cleaner code, slight CPU savings on worker.
|
||||
- **Eliminate `ToEntries` adapter allocation in `Draw`** (~15 min). Tiny win (~25 KB / frame). Could fold into Tier 1.
|
||||
- **Persistent-mapped indirect buffer** (~2 days). Today's `glBufferData` per frame becomes a pre-mapped persistent buffer. Marginal win on RDNA 4; meaningful on lower-end GPUs.
|
||||
- **Multi-thread mesh-build worker pool** (~1 day). 2.7s first-traversal horizon-fill drops to 0.7s with 4 workers. UX win on first walk-into-region.
|
||||
|
||||
These are good candidates for a "perf polish" mini-phase or to backfill into Tier 2.
|
||||
|
||||
---
|
||||
|
||||
## The architectural ceiling
|
||||
|
||||
Even with all three tiers, **a faithful AC client written in C# with bindless OpenGL tops out around 800-1500 FPS at radius=12 on RDNA 4 hardware**. Beyond that requires:
|
||||
|
||||
- Native C++ rendering core (eliminate .NET GC + JIT overhead)
|
||||
- DX12/Vulkan API (eliminate driver state validation)
|
||||
- Offline content cooking (eliminate runtime mesh/texture decode)
|
||||
|
||||
Each of those is a several-month undertaking and represents "becoming a different engine." The realistic target for acdream is 240-500 FPS at the user's monitor refresh, comfortably ahead of the visible-stutter threshold. Tier 1 + Tier 2 alone should deliver that for radius=12-15.
|
||||
|
||||
"Unreal-level FPS at full quality" is a different project.
|
||||
2525
docs/superpowers/plans/2026-05-09-phase-a5-two-tier-streaming.md
Normal file
File diff suppressed because it is too large
|
|
@@ -0,0 +1,829 @@
|
|||
# Phase A.5 — Two-tier Streaming + Horizon LOD — Design
|
||||
|
||||
**Created:** 2026-05-09 (immediately after N.5b ship + brainstorm).
|
||||
**Status:** Spec — awaiting user review before plan-writing.
|
||||
**Branch:** `claude/hopeful-darwin-ae8b87` (worktree under `.claude/worktrees/hopeful-darwin-ae8b87`).
|
||||
**Predecessor:** Phase N.5b SHIP at `08b7362`. A.5 handoff at `f7f8867`.
|
||||
|
||||
---
|
||||
|
||||
## 1. Goal
|
||||
|
||||
Scale acdream's visible reach from radius=5 (~1 km) to radius=12 (~2.3 km horizon)
|
||||
while sustaining 240 FPS at standstill on a 240 Hz / 1440p monitor.
|
||||
|
||||
Delivered through:
|
||||
1. Two-tier streaming (near = full detail, far = terrain only).
|
||||
2. Tightening the existing per-LB entity dispatcher walk.
|
||||
3. Off-thread mesh build (single worker).
|
||||
4. Fog blend at the near-tier boundary to mask the scenery cutoff.
|
||||
5. Three nearly-free visual quality wins (terrain mipmaps + anisotropic, A2C
|
||||
with MSAA on foliage, depth-write audit).
|
||||
|
||||
The headline win: walking around Holtburg, the user sees a real horizon
|
||||
(2.3 km of visible terrain) without the client falling off a perf cliff.
|
||||
|
||||
**User goal verbatim (2026-05-09):**
|
||||
> "I just want great smooth HIGH fps visuals. Should look great. As long as
|
||||
> it scales and we get very high FPS"
|
||||
|
||||
---
|
||||
|
||||
## 2. Hardware target + acceptance metrics
|
||||
|
||||
### Target hardware
|
||||
|
||||
- AMD Radeon RX 9070 XT (RDNA 4, ~December 2025).
|
||||
- 240 Hz @ 2560×1440 (verified via `Get-CimInstance Win32_VideoController`).
|
||||
- Frame budget: **4.166 ms** at vsync.
|
||||
|
||||
### Acceptance metrics (as shipped — revised with Quality Preset system)
|
||||
|
||||
1. **Build green; existing tests still green.** N.5b conformance sentinel
|
||||
passes (visual mesh Z = TerrainSurface.SampleZ within 1 mm).
|
||||
2. **Standstill at user's selected preset on user's hardware:**
|
||||
- 95% of frames hit ≤ (1000ms / monitor refresh rate).
|
||||
- No absolute FPS number is required — the Quality Preset system (§4.10)
|
||||
is the user's knob for trading quality vs frame budget.
|
||||
3. **Walking at user's selected preset:**
|
||||
- 95% of frames hit ≤ 1.5× (1000ms / monitor refresh rate).
|
||||
4. **First traversal into virgin region (cold mesh cache):**
|
||||
- Render thread frame time stays within 2× the standstill budget while
|
||||
the worker fills the far-tier horizon (~2.7 s of "horizon filling in" is OK).
|
||||
5. **Visual gate (user-driven, same on all presets):** user launches the
|
||||
client, walks Holtburg → North Yanshi, and confirms:
|
||||
- Horizon visible at ~2.3 km.
|
||||
- Fog blend at N₁ smooths the scenery boundary (no harsh cliff).
|
||||
- Distant terrain does not shimmer (mipmaps work).
|
||||
- Tree edges are smooth (A2C works).
|
||||
- No new z-fighting / depth artifacts (depth-write audit).
|
||||
6. **Per-subsystem regression budgets** (added to `[WB-DIAG]` /
|
||||
`[TERRAIN-DIAG]` output):
|
||||
- Entity dispatcher cpu_us median ≤ **2.0 ms** at standstill.
|
||||
- Terrain dispatcher cpu_us median ≤ **1.0 ms** at standstill (all 625 LBs).
|
||||
7. **N.5b sentinel intact:** TerrainSlot, TerrainModernConformance, Wb*,
|
||||
MatrixComposition, TextureCacheBindless, SplitFormulaDivergence — all
|
||||
pass clean.
|
||||
8. **SHIP record + perf baseline doc + memory entry** mirroring N.5b's pattern.
|
||||
|
||||
A failure on (5) is a SHIP-blocker. A failure on the walking-FPS criterion (3)
is not a direct blocker: it escalates to "fix, or document the tradeoff and
ship N.6 next", with the final call left to user discretion.
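
Metrics (2) through (4) all reduce to the same check with a different multiplier on the refresh-rate budget. A minimal sketch, assuming frame times have already been captured in milliseconds; `FrameBudgetSketch` and its members are hypothetical names, not part of the existing diagnostics:

```csharp
using System;
using System.Linq;

public static class FrameBudgetSketch
{
    // Frame budget in milliseconds for a given refresh rate, e.g. 240 Hz -> ~4.167 ms.
    public static double BudgetMs(double refreshHz) => 1000.0 / refreshHz;

    // True when at least `fraction` of the frames are at or under
    // budget * multiplier (1.0 standstill, 1.5 walking, 2.0 cold traversal).
    public static bool MeetsBudget(double[] frameTimesMs, double refreshHz,
                                   double multiplier = 1.0, double fraction = 0.95)
    {
        if (frameTimesMs.Length == 0) return false; // no data is not a pass

        double limit = BudgetMs(refreshHz) * multiplier;
        int withinBudget = frameTimesMs.Count(t => t <= limit);
        return withinBudget >= fraction * frameTimesMs.Length;
    }
}
```

Applying the 95% fraction to the 2× cold-traversal criterion is an assumption; the spec text for metric (4) does not state a percentile.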
|
||||
|
||||
---
|
||||
|
||||
## 3. Two-tier streaming model
|
||||
|
||||
### Tier definitions
|
||||
|
||||
| Tier | Radius | LB count | Loads | GPU mem |
|---|---|---|---|---|
| **Near** | N₁ = 4 | 9×9 = 81 LBs | terrain mesh + LandBlockInfo (stabs/buildings) + scenery generation + EnvCells + collision data + entity registration with WB dispatcher | scenery instance buffers + per-entity textures (depends on PaletteOverrides) |
| **Far** | N₂ = 12 | 25×25 - 9×9 = 544 LBs | terrain mesh ONLY (LandBlock heightmap + atlas blend) | ~14 MB shared atlas slots |
| **Total** | | 25×25 = 625 LBs | combined | ~30 MB total estimated |
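
The landblock counts in the table follow from square windows of side 2r + 1. A small sketch of that arithmetic; `TierWindowSketch` is a hypothetical helper, and the 192 m landblock edge is an assumption chosen because it matches the ~2.3 km figure quoted for radius 12:

```csharp
using System;

public static class TierWindowSketch
{
    // A radius-r window is a (2r + 1) x (2r + 1) square of landblocks.
    public static int WindowCount(int radius) => (2 * radius + 1) * (2 * radius + 1);

    // Landblocks that are far-tier only: the full far window minus the near window.
    public static int FarOnlyCount(int nearRadius, int farRadius) =>
        WindowCount(farRadius) - WindowCount(nearRadius);

    // Visible reach in meters, assuming a 192 m landblock edge.
    public static double ReachMeters(int radius, double landblockMeters = 192.0) =>
        radius * landblockMeters;
}
```

With N₁ = 4 and N₂ = 12 this gives 81 near, 544 far-only, and 625 total landblocks, and a reach of 2304 m.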
|
||||
|
||||
### Hysteresis (Q7 Option A — match existing radius+2 convention)
|
||||
|
||||
- **Near-tier:** entity load at distance 4, demote (entity unload) at distance 6.
|
||||
- **Far-tier:** terrain load at distance 12, terrain unload at distance 14.
|
||||
|
||||
Both boundaries get the same 2-LB buffer. Phase A.1's existing hysteresis
|
||||
mechanism in `StreamingRegion.RecenterTo` is the reference pattern; A.5
|
||||
extends it from one radius to two.
|
||||
|
||||
### Tier transitions
|
||||
|
||||
| Transition | Trigger | Action |
|---|---|---|
| `null → far` | LB enters far window from outside | Worker reads LandBlock heightmap, builds mesh, posts `LandblockStreamResult.Loaded { Tier = Far }`. Render thread adds slot in `TerrainModernRenderer`. No entity work. |
| `null → near` | LB jumps null → near in one tick (first-tick bootstrap; teleport into virgin region) | Worker reads LandBlock heightmap + `LandBlockInfo`, generates scenery, builds entity list, builds mesh. Posts `LandblockStreamResult.Loaded { Tier = Near }`. Render thread adds terrain slot AND merges entities. |
| `far → near` | LB enters near window from far-resident | Worker reads `LandBlockInfo`, generates scenery, builds entity list. Posts `LandblockStreamResult.Promoted`. Render thread merges entities into `GpuWorldState` for the existing LB (terrain already loaded). |
| `near → far` | LB leaves near window past hysteresis (distance > 6) | Render thread drops the LB's entities from `GpuWorldState` (which fires `_wbSpawnAdapter.OnLandblockUnloaded`). Terrain stays. |
| `far → null` | LB leaves far window past hysteresis (distance > 14) | Render thread removes the terrain slot from `TerrainModernRenderer`. |
|
||||
|
||||
The order matters: when a player walks outward, the same LB goes
|
||||
`near → far → null` over time. Each transition is one event per LB per
|
||||
crossing.
|
||||
|
||||
### Why the player crossing the N₁ boundary works
|
||||
|
||||
The player is always at radius=0 from the streaming center (the streaming
|
||||
center IS the player). The boundary effects are about LBs at the edge of N₁
|
||||
crossing inward/outward as the player moves. Server-spawned NPCs are
|
||||
delivered by ACE's broadcast (radius typically 5-7 LBs ≥ N₁), so when an
|
||||
LB promotes back to near, ACE will already have its NPCs broadcast or
|
||||
re-broadcast as the player moves through. Dat-static entities (stabs,
|
||||
buildings) are reloaded from `LandBlockInfo` on promotion. Scenery is
|
||||
re-generated from the deterministic seed at the same time.
|
||||
|
||||
---
|
||||
|
||||
## 4. Component-by-component design
|
||||
|
||||
### 4.1 `LandblockStreamTier` — new enum
|
||||
|
||||
```csharp
namespace AcDream.App.Streaming;

public enum LandblockStreamTier
{
    Far,  // terrain only
    Near, // full detail (terrain + entities + scenery + EnvCells)
}
```
|
||||
|
||||
### 4.2 `StreamingRegion` — extended to two radii
|
||||
|
||||
```csharp
public sealed class StreamingRegion
{
    public int CenterX { get; }
    public int CenterY { get; }
    public int NearRadius { get; } // N₁ (default 4)
    public int FarRadius { get; }  // N₂ (default 12)

    public IReadOnlyCollection<uint> NearVisible { get; } // 9×9 window
    public IReadOnlyCollection<uint> FarVisible { get; }  // 25×25 window minus near
    public IReadOnlyCollection<uint> Resident { get; }    // hysteresis-retained

    public TwoTierDiff RecenterTo(int newCx, int newCy);
}

public readonly record struct TwoTierDiff(
    IReadOnlyList<uint> ToLoadFar,  // entered far window from null (need terrain only)
    IReadOnlyList<uint> ToLoadNear, // entered near window from null (need terrain + entities; first-tick bootstrap, teleport)
    IReadOnlyList<uint> ToPromote,  // entered near window from far-resident (need entities only; terrain already loaded)
    IReadOnlyList<uint> ToDemote,   // exited near window past hysteresis (drop entities)
    IReadOnlyList<uint> ToUnload);  // exited far window past hysteresis (drop terrain)
```
|
||||
|
||||
The hysteresis math:
|
||||
- Near-unload threshold: `NearRadius + 2` = 6.
|
||||
- Far-unload threshold: `FarRadius + 2` = 14.
|
||||
|
||||
Once promoted, a landblock stays near-resident until its distance exceeds 6;
once loaded, it stays terrain-resident until its distance exceeds 14. Beyond
14, it unloads entirely.
|
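The load and unload thresholds above can be read as a small per-landblock state machine. The following is a minimal sketch, not the project's `StreamingRegion` implementation: `TierState` and `TierMath` are hypothetical names, the distance metric is assumed to be Chebyshev (the square 9×9 / 25×25 windows suggest it), and the handling of a near-resident LB that jumps past the far threshold in one tick is an assumption.

```csharp
using System;

// Hypothetical names for illustration; not the project's StreamingRegion API.
public enum TierState { Null, Far, Near }

public static class TierMath
{
    public const int HysteresisLb = 2; // the radius+2 convention from the text

    // Assumption: square windows imply Chebyshev distance in landblock units.
    public static int Distance(int lbX, int lbY, int centerX, int centerY) =>
        Math.Max(Math.Abs(lbX - centerX), Math.Abs(lbY - centerY));

    public static TierState Next(TierState current, int distance, int nearRadius, int farRadius)
    {
        int nearUnload = nearRadius + HysteresisLb; // 6 at N1 = 4
        int farUnload = farRadius + HysteresisLb;   // 14 at N2 = 12

        switch (current)
        {
            case TierState.Near:
                if (distance <= nearUnload) return TierState.Near; // held by hysteresis
                // Assumption: a jump past the far threshold drops straight to Null.
                return distance <= farUnload ? TierState.Far : TierState.Null;
            case TierState.Far:
                if (distance <= nearRadius) return TierState.Near; // promote
                return distance <= farUnload ? TierState.Far : TierState.Null;
            default: // TierState.Null
                if (distance <= nearRadius) return TierState.Near; // null -> near
                return distance <= farRadius ? TierState.Far : TierState.Null;
        }
    }
}
```

Each change in the returned state corresponds to one entry in one of the five `TwoTierDiff` lists; an unchanged state produces no event, which is what keeps oscillation inside the 2-LB buffer quiet.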
||||
|
||||
### 4.3 `StreamingController` — routes by tier
|
||||
|
||||
```csharp
public sealed class StreamingController
{
    public int NearRadius { get; set; } = 4;
    public int FarRadius { get; set; } = 12;
    public int MaxCompletionsPerFrame { get; set; } = 4;

    // Action signatures change to carry the job kind.
    private readonly Action<uint, LandblockStreamJobKind> _enqueueLoad;
    private readonly Action<uint> _enqueueUnload;
    // ...

    public void Tick(int observerCx, int observerCy)
    {
        // First-tick bootstrap: every near-window LB → ToLoadNear; every
        // far-window-only LB → ToLoadFar.
        // Steady-state RecenterTo: produces 5 transition lists.
        // - ToLoadFar  → _enqueueLoad(id, LandblockStreamJobKind.LoadFar)
        // - ToLoadNear → _enqueueLoad(id, LandblockStreamJobKind.LoadNear)
        // - ToPromote  → _enqueueLoad(id, LandblockStreamJobKind.PromoteToNear)
        // - ToDemote   → _state.RemoveEntities(id) on render thread (no worker job)
        // - ToUnload   → _enqueueUnload(id)
        // Drain completions and route by result variant.
    }
}

public enum LandblockStreamJobKind { LoadFar, LoadNear, PromoteToNear }
```
|
||||
|
||||
The render thread decides the job kind up-front based on its own knowledge
|
||||
of which LBs are currently terrain-resident; the worker never peeks at
|
||||
render-thread state. Three distinct worker paths:
|
||||
|
||||
- **`LoadFar`:** read `LandBlock` heightmap only. Skip `LandBlockInfo`,
|
||||
skip `LandblockLoader.BuildEntitiesFromInfo`, skip
|
||||
`SceneryGenerator`/`WbSceneryAdapter`. Build `LandblockMesh`. Post
|
||||
`LandblockStreamResult.Loaded(Tier=Far, Entities=[], MeshData=mesh)`.
|
||||
- **`LoadNear`:** read `LandBlock` + `LandBlockInfo` + scenery generation
|
||||
+ build mesh. Post `LandblockStreamResult.Loaded(Tier=Near, Entities=...,
|
||||
MeshData=mesh)`. Used for first-tick bootstrap of the inner ring and
|
||||
for the rare null→Near jump (teleport into virgin region).
|
||||
- **`PromoteToNear`:** read `LandBlockInfo` + scenery generation only.
  Skip `LandblockMesh.Build` (mesh already on GPU). Scenery generation still
  needs the `LandBlock` heights, so the worker re-reads the block or receives
  it from the render thread (see §5). Post
  `LandblockStreamResult.Promoted(id, entities)`.
|
||||
|
||||
### 4.4 `LandblockStreamResult` — new variants
|
||||
|
||||
```csharp
public abstract record LandblockStreamResult
{
    public sealed record Loaded(
        uint LandblockId,
        LandblockStreamTier Tier,
        LandBlock Heightmap,
        IReadOnlyList<WorldEntity> Entities, // empty for Far
        LandblockMeshData MeshData           // built off-thread
    ) : LandblockStreamResult;

    public sealed record Promoted(
        uint LandblockId,
        IReadOnlyList<WorldEntity> Entities  // entity layer for an already-loaded far-tier LB
    ) : LandblockStreamResult;

    // Existing:
    public sealed record Unloaded(uint LandblockId) : LandblockStreamResult;
    public sealed record Failed(uint LandblockId, string Error) : LandblockStreamResult;
    public sealed record WorkerCrashed(string Error) : LandblockStreamResult;
}
```
|
||||
|
||||
`Loaded` carries `MeshData` — the mesh is built on the worker thread, NOT
|
||||
in `_applyTerrain` on the render thread. `Promoted` only carries entities;
|
||||
the mesh is already in `TerrainModernRenderer`.
|
||||
|
||||
### 4.5 `LandblockStreamer` — single worker, mesh-build on-worker
|
||||
|
||||
Existing `LandblockStreamer` (today on a single background thread) gets
|
||||
extended to:
|
||||
|
||||
1. Read dat as today (`DatCollection.Get<LandBlock>` etc.).
2. Build `LandblockMesh` on the same thread:
   ```csharp
   var meshData = LandblockMesh.Build(
       block, lbX, lbY, heightTable, _ctx, _surfaceCache);
   ```
3. Post `LandblockStreamResult.Loaded(... MeshData = meshData)` to the
   completion queue.
|
||||
|
||||
Thread-safety implications:
|
||||
- `_ctx` (TerrainBlendingContext) is read-only after init — no change.
|
||||
- `_surfaceCache`: today a plain `Dictionary<uint, SurfaceInfo>`,
|
||||
populated lazily by `LandblockMesh.Build`. Currently safe because
|
||||
Build runs on the render thread; A.5 moves Build to the worker, so
|
||||
the cache must be thread-safe. **Swap to
|
||||
`ConcurrentDictionary<uint, SurfaceInfo>`** with `GetOrAdd` for the
|
||||
populate path. The factory inside `GetOrAdd` may run twice for the
|
||||
same key under contention (acceptable — the result is deterministic).
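The `GetOrAdd` populate path described above could look roughly like the sketch below. `SurfaceCache`, the `SurfaceInfo` fields, and `Resolve` are hypothetical stand-ins (the real `SurfaceInfo` shape is not shown in this spec); the one property that matters is that the factory is a pure function of the key, so a duplicate invocation under contention is harmless.

```csharp
using System.Collections.Concurrent;
using System.Threading;

// Hypothetical stand-in for the project's SurfaceInfo; fields are illustrative only.
public readonly record struct SurfaceInfo(uint SurfaceId, int AtlasLayer);

public sealed class SurfaceCache
{
    private readonly ConcurrentDictionary<uint, SurfaceInfo> _cache = new();
    private int _factoryCalls;

    public int FactoryCalls => _factoryCalls;

    public SurfaceInfo Get(uint surfaceId) =>
        _cache.GetOrAdd(surfaceId, id =>
        {
            // May run more than once per key under contention; only one result is stored.
            Interlocked.Increment(ref _factoryCalls);
            return Resolve(id);
        });

    // Placeholder for the real lookup. It must be a pure function of the key.
    private static SurfaceInfo Resolve(uint id) => new(id, (int)(id % 8));
}
```

With a single worker thread the factory runs once per key; the concurrent container only becomes load-bearing if a second builder thread is added later.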
|
||||
|
||||
### 4.6 `WbDrawDispatcher` — entity bucketing tightening (Q5 Option A)
|
||||
|
||||
Three targeted changes inside the existing `Draw` flow:
|
||||
|
||||
#### Change 1: Animated-entity walk fix
|
||||
|
||||
Today (at lines 197-204 of `WbDrawDispatcher.cs`):
|
||||
|
||||
```csharp
foreach (var entry in landblockEntries) {
    bool landblockVisible = ...;
    if (!landblockVisible && (animatedEntityIds is null || animatedEntityIds.Count == 0))
        continue;

    foreach (var entity in entry.Entities) {
        ...
        if (!landblockVisible && !isAnimated) continue;
```
|
||||
|
||||
The `if (!landblockVisible && ...) continue;` only skips if there are NO
|
||||
animated entities. When `animatedEntityIds` is non-empty, the inner loop
|
||||
walks every entity in the invisible LB just to find the few animated
|
||||
ones. With ~10.7K entities at N₁=4, this is wasted iteration.
|
||||
|
||||
**Fix:** when an LB is invisible, iterate `animatedEntityIds` directly
|
||||
and look each up in a per-LB `Dictionary<uint, WorldEntity>` map (added
|
||||
to `LoadedLandblock` or kept in a parallel structure).
|
||||
|
||||
```csharp
foreach (var entry in landblockEntries) {
    bool landblockVisible = ...;
    if (!landblockVisible) {
        if (animatedEntityIds is null || animatedEntityIds.Count == 0) continue;
        // Walk only animated entities in this invisible LB.
        foreach (var animatedId in animatedEntityIds) {
            if (!entry.AnimatedById.TryGetValue(animatedId, out var entity)) continue;
            // ... draw the entity
        }
        continue;
    }
    foreach (var entity in entry.Entities) { ... }
}
```
|
||||
|
||||
#### Change 2: Per-entity AABB cache at register time
|
||||
|
||||
Today: `Draw` recomputes `aMin = position - 5`, `aMax = position + 5` per
entity per frame. Cheap individually, but across ~16K entities every frame
the total is measurable.
|
||||
|
||||
**Fix:** add `Vector3 AabbMin, AabbMax` fields to `WorldEntity` (or a
|
||||
parallel struct keyed by entity id). Populate at `EntitySpawnAdapter.OnCreate`
|
||||
(server-spawned) and `LandblockLoader.BuildEntitiesFromInfo` (dat-static)
|
||||
time. Static entities never invalidate. Dynamic entities (NPCs, players)
|
||||
update on position change — add `WorldEntity.PositionDirty` flag set by
|
||||
the live position update path; AABB recompute happens lazily on first
|
||||
read after dirty.
|
||||
|
||||
The AABB radius today is hard-coded `PerEntityCullRadius = 5.0f` — keep
|
||||
that as a per-mesh-bucket fallback; future improvement is to compute the
|
||||
real AABB from the mesh, but defer that to a later phase (it's a
|
||||
cross-cutting change).
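A minimal sketch of the dirty-flag AABB cache described in this change, under stated assumptions: `CachedBounds` is a hypothetical standalone type (the spec proposes fields on `WorldEntity` or a parallel struct), and `RecomputeCount` exists only to make the laziness observable.

```csharp
using System.Numerics;

// Hypothetical type; the text proposes equivalent fields on WorldEntity.
public sealed class CachedBounds
{
    public const float PerEntityCullRadius = 5.0f; // fallback radius named in the text

    private Vector3 _position;
    private Vector3 _min, _max;
    private bool _dirty = true;

    public int RecomputeCount { get; private set; } // for the usage check only

    public CachedBounds(Vector3 position) => _position = position;

    public Vector3 Position
    {
        get => _position;
        set { _position = value; _dirty = true; } // live position update path
    }

    public (Vector3 Min, Vector3 Max) Aabb
    {
        get
        {
            if (_dirty)
            {
                var r = new Vector3(PerEntityCullRadius);
                _min = _position - r;
                _max = _position + r;
                _dirty = false;
                RecomputeCount++;
            }
            return (_min, _max);
        }
    }
}
```

A static entity pays for one recompute on first read and none afterwards; a moving entity pays at most one recompute per position change, however many times the dispatcher reads the bounds.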
|
||||
|
||||
#### Change 3: 4×4 sub-LB cell cull for partially-visible LBs
|
||||
|
||||
When an LB is fully visible (its AABB entirely inside the frustum), all
|
||||
its entities are drawn — no per-entity cull needed. Today's per-entity
|
||||
cull is wasted work in this case.
|
||||
|
||||
When an LB is partially visible, today's per-entity cull is the right
|
||||
work — but it walks all ~132 entities. Cheap with the AABB-cache fix
|
||||
(memory read), so the win here is small. Worth doing only if the cache
|
||||
fix alone isn't enough to hit the 2.0ms budget.
|
||||
|
||||
**Add only if needed:** bucket each LB's entities into 4×4 sub-cells
|
||||
(each 48 m). Compute a sub-cell AABB at register time. Per frame: for
|
||||
partially-visible LBs, cull at sub-cell granularity first; walk
|
||||
entities only inside surviving sub-cells.
|
||||
|
||||
Ship change #1 and #2 unconditionally; ship #3 only if the budget
|
||||
isn't hit by #1 + #2.
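If change #3 is needed, the register-time bucketing reduces to a cell-index computation. The sketch below assumes entity positions are available as metres from the landblock's minimum corner and uses a row-major 0..15 index; `SubCellGrid` is a hypothetical helper, not an existing type.

```csharp
using System;

public static class SubCellGrid
{
    public const float LandblockSize = 192f; // metres, as used in the fog formulas
    public const int CellsPerSide = 4;
    public const float CellSize = LandblockSize / CellsPerSide; // 48 m

    // localX/localY are assumed to be metres from the landblock's min corner.
    public static int CellIndex(float localX, float localY)
    {
        int cx = Math.Clamp((int)(localX / CellSize), 0, CellsPerSide - 1);
        int cy = Math.Clamp((int)(localY / CellSize), 0, CellsPerSide - 1);
        return cy * CellsPerSide + cx; // row-major, 0..15
    }
}
```

Clamping keeps entities that sit exactly on the far edge (192 m) inside the last cell instead of producing an out-of-range index.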
|
||||
|
||||
### 4.7 `TerrainModernRenderer` — no structural change
|
||||
|
||||
The slot allocator (`TerrainSlotAllocator`) already grows by power-of-two
|
||||
doubling. At N₂=12 worst case, ~961 slots × ~15 KB per slot = ~14 MB.
|
||||
Allocator handles it without modification.
|
||||
|
||||
Per-LB frustum cull stays per-slot — at ~961 slots × ~0.3 µs/AABB-test
|
||||
the worst-case cull pass is ~0.3 ms. Acceptable inside the 1.0 ms terrain
|
||||
dispatcher budget.
|
||||
|
||||
The DEIC (`DrawElementsIndirectCommand`) array grows accordingly. The
|
||||
existing per-frame `BufferSubData` upload absorbs a 961-entry array
|
||||
without issue (~19 KB).
|
||||
|
||||
### 4.8 Fog tuning (`SceneLightingUbo`)
|
||||
|
||||
Existing fields (Phase G.1+):
|
||||
- `FogStart` — distance at which fog begins (today: somewhere outside the
|
||||
visible terrain range).
|
||||
- `FogEnd` — distance at which fog reaches full opacity.
|
||||
- `FogColor` — sourced from current sky state.
|
||||
|
||||
A.5 change: dynamically tune `FogStart` and `FogEnd` based on the
|
||||
current N₁/N₂:
|
||||
|
||||
- `FogStart = N₁ × LandblockSize × 0.7` ≈ `4 × 192 × 0.7` = **~538 m**.
|
||||
- `FogEnd = N₂ × LandblockSize × 0.95` ≈ `12 × 192 × 0.95` = **~2189 m**.
|
||||
|
||||
The fog color matches the current sky color (already provided by
|
||||
`SkyStateProvider`) — at the far horizon, fog blends terrain into
|
||||
sky, hiding the N₂ edge.
|
||||
|
||||
The 0.7 / 0.95 multipliers are tuning knobs. Iterate during user gate.
|
||||
**Expose as env vars during development** (`ACDREAM_FOG_START_MULT`,
|
||||
`ACDREAM_FOG_END_MULT`) to allow fast iteration without a recompile.
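The fog distances and their development-time overrides can be sketched as follows. `FogTuning` and `ParseMultiplier` are hypothetical names; the env-var names and the 0.7 / 0.95 defaults come from the text, while rejecting non-positive multipliers is an assumption.

```csharp
using System;
using System.Globalization;

public static class FogTuning
{
    public const float LandblockSize = 192f; // metres per landblock

    // Parses a multiplier override; falls back when the value is missing or invalid.
    // Assumption: non-positive multipliers are treated as invalid.
    public static float ParseMultiplier(string? raw, float fallback) =>
        float.TryParse(raw, NumberStyles.Float, CultureInfo.InvariantCulture, out var v) && v > 0f
            ? v
            : fallback;

    public static (float Start, float End) Compute(
        int nearRadius, int farRadius, float startMult = 0.7f, float endMult = 0.95f) =>
        (nearRadius * LandblockSize * startMult, farRadius * LandblockSize * endMult);
}
```

A caller would read `Environment.GetEnvironmentVariable("ACDREAM_FOG_START_MULT")` and `"ACDREAM_FOG_END_MULT"`, pass each through `ParseMultiplier`, and hand the results to `Compute` together with the active radii.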
|
||||
|
||||
### 4.9 Visual quality wins (Q8 Option C — all three)
|
||||
|
||||
#### 4.9.1 Mipmaps + 16x anisotropic on `TerrainAtlas`
|
||||
|
||||
Today: `TerrainAtlas.Upload` uses `GL_LINEAR` minification, no mipmaps.
|
||||
|
||||
A.5 change: after upload, call `glGenerateMipmap(GL_TEXTURE_2D_ARRAY)`.
|
||||
Sampler state: `GL_LINEAR_MIPMAP_LINEAR` (trilinear) +
|
||||
`GL_TEXTURE_MAX_ANISOTROPY = 16`.
|
||||
|
||||
Affects only `TerrainAtlas`. Mesh atlas (entity textures) and other
|
||||
texture caches stay as-is.
|
||||
|
||||
Verification: at N₂=12, walk to a vantage point looking at terrain at
|
||||
range 2 km. With the fix, no shimmer. Without, "moving sparkles" visible
|
||||
at distance.
|
||||
|
||||
#### 4.9.2 Alpha-to-coverage with MSAA on foliage
|
||||
|
||||
Today: `mesh_modern.frag` uses `if (alpha < cutoff) discard;` for ClipMap
|
||||
translucency. Produces hard, pixel-edged tree silhouettes.
|
||||
|
||||
A.5 change:
|
||||
- Enable MSAA 4x on the GL render target (window framebuffer).
|
||||
- In `mesh_modern.frag`, for ClipMap pass: write
|
||||
`gl_SampleMask[0]` based on alpha threshold instead of binary discard.
|
||||
|
||||
Risk: MSAA framebuffer interaction with sky / particles / UI overlay.
|
||||
Audit:
|
||||
- `SkyRenderer` — clears its own framebuffer? If so, must clear the MSAA
|
||||
attachment instead. Investigate.
|
||||
- `ParticleRenderer` — billboards already use alpha-blend; MSAA-friendly.
|
||||
- ImGui overlay — drawn after the 3D pass; must not interact with MSAA
|
||||
resolve.
|
||||
|
||||
If the audit finds blocking issues, ship 4.9.1 + 4.9.3 only and defer
|
||||
4.9.2 to a later phase. Document the result either way.
|
||||
|
||||
#### 4.9.3 Depth-write audit on translucent batches
|
||||
|
||||
Walk all translucent batch paths in `WbDrawDispatcher.Draw` and verify:
|
||||
- Alpha-blend (`AlphaBlend`, `Additive`, `InvAlpha`): `glDepthMask(false)`.
|
||||
- Clip-map (binary alpha): `glDepthMask(true)` (foliage casts depth).
|
||||
- Opaque: `glDepthMask(true)`.
|
||||
|
||||
Today's code at lines 401-433 sets `DepthMask(true)` for opaque,
|
||||
`DepthMask(false)` for transparent. Confirm ClipMap is in the opaque
|
||||
pass (it is, per `IsOpaque` returning true for ClipMap at line 738).
|
||||
|
||||
If audit finds nothing wrong, ship a comment + a unit test that locks in
|
||||
the partition. Cheap insurance against future regression.
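One way to make the partition testable is to express it as a pure function that a unit test can enumerate. This is only a sketch: `BlendKind` and `DepthWritePolicy` are hypothetical names, and the project's actual translucency enum and `IsOpaque` helper may be shaped differently.

```csharp
// Hypothetical enum; the project's translucency type may use different names.
public enum BlendKind { Opaque, ClipMap, AlphaBlend, Additive, InvAlpha }

public static class DepthWritePolicy
{
    // Opaque and ClipMap batches write depth; blended batches do not.
    public static bool WritesDepth(BlendKind kind) => kind switch
    {
        BlendKind.Opaque => true,
        BlendKind.ClipMap => true,
        _ => false,
    };
}
```

A lock-in test then asserts the expected value for every enum member, so adding a new blend mode forces an explicit decision about its depth-write behaviour.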
|
||||
|
||||
### 4.10 Quality Preset System (T22.5 — added mid-execution)
|
||||
|
||||
**Background:** Added between T22 (fog wiring) and T23 (DIAG budgets) at
|
||||
user's direction. The original spec had no preset concept; §2 was written
|
||||
against absolute 240 FPS on fixed N₁/N₂. T22.5 makes both radii and every
|
||||
quality knob user-controllable via a single enum. §2 was amended above to
|
||||
reflect the per-preset, refresh-rate-relative acceptance criteria.
|
||||
|
||||
#### Schema
|
||||
|
||||
```csharp
public enum QualityPreset { Low, Medium, High, Ultra }

public readonly record struct QualitySettings(
    int NearRadius,
    int FarRadius,
    int MsaaSamples,
    int AnisotropicLevel,
    bool AlphaToCoverage,
    int MaxCompletionsPerFrame);
```
|
||||
|
||||
`QualitySettings.From(preset)` returns the canonical values:
|
||||
|
||||
| Preset | NearRadius | FarRadius | MsaaSamples | AnisotropicLevel | AlphaToCoverage | MaxCompletionsPerFrame |
|---|---|---|---|---|---|---|
| Low | 2 | 5 | 0 | 4 | false | 2 |
| Medium | 3 | 8 | 2 | 8 | false | 3 |
| High | 4 | 12 | 4 | 16 | true | 4 |
| Ultra | 5 | 15 | 4 | 16 | true | 6 |
|
||||
|
||||
`QualitySettings.WithEnvOverrides(baseSettings)` applies per-field env-var
|
||||
overrides (see §4.10.3).
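For a sense of how the preset radii translate into landblock counts, the window sizes follow from `(2r + 1)²`. The helper below is a hypothetical illustration, not part of `QualitySettings`, and it counts only the visible windows (hysteresis-retained landblocks are excluded).

```csharp
public static class WindowMath
{
    // Number of landblocks in a square window of the given radius.
    public static int WindowCount(int radius) => (2 * radius + 1) * (2 * radius + 1);

    // Near = inner window, Far = outer window minus inner, Total = outer window.
    public static (int Near, int Far, int Total) TierCounts(int nearRadius, int farRadius)
    {
        int near = WindowCount(nearRadius);
        int total = WindowCount(farRadius);
        return (near, total - near, total);
    }
}
```

For High (4, 12) this gives 81 near, 544 far, and 625 total, matching the tier table in §3; for Ultra (5, 15) it gives 121, 840, and 961.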
|
||||
|
||||
#### Persistence and UI
|
||||
|
||||
`DisplaySettings.Quality` (type `QualityPreset`) persists via the existing
|
||||
`settings.json` infrastructure (Phase L.0). The Settings panel (F11) exposes
|
||||
a Quality dropdown in its Display tab (`SettingsPanel.RenderDisplayTab`).
|
||||
|
||||
#### Wiring (GameWindow.OnLoad + ReapplyQualityPreset)
|
||||
|
||||
1. `GameWindow.OnLoad` resolves the active `QualitySettings`:
|
||||
`QualitySettings.From(displaySettings.Quality).WithEnvOverrides(...)`.
|
||||
2. `StreamingController` and `LandblockStreamer` are built with the preset's
|
||||
`NearRadius` / `FarRadius`.
|
||||
3. `TerrainAtlas.SetAnisotropic(settings.AnisotropicLevel)` called once at
|
||||
load and again on reapply.
|
||||
4. `WindowOptions.Samples = settings.MsaaSamples` applied at window creation
   time only (the window framebuffer's sample count is fixed when the window
   is created, so it cannot change mid-session).
|
||||
5. `WbDrawDispatcher.AlphaToCoverage = settings.AlphaToCoverage`.
|
||||
6. `StreamingController.MaxCompletionsPerFrame = settings.MaxCompletionsPerFrame`.
|
||||
|
||||
Mid-session quality change (F11 dropdown change → Save):
|
||||
|
||||
- `GameWindow.ReapplyQualityPreset` rebuilds `StreamingController` +
|
||||
`LandblockStreamer` with the new radii, re-applies anisotropic and
|
||||
AlphaToCoverage.
|
||||
- If `MsaaSamples` changed, logs a warning that MSAA sample count cannot be
|
||||
changed mid-session; requires restart.
|
||||
|
||||
#### Env-var overrides (§4.10.3)
|
||||
|
||||
Applied by `QualitySettings.WithEnvOverrides` after the base preset is resolved.
|
||||
Each field has one env var; all are optional. Logged at startup.
|
||||
|
||||
| Env var | Field overridden |
|---|---|
| `ACDREAM_NEAR_RADIUS` | `NearRadius` |
| `ACDREAM_FAR_RADIUS` | `FarRadius` |
| `ACDREAM_MSAA_SAMPLES` | `MsaaSamples` |
| `ACDREAM_ANISOTROPIC` | `AnisotropicLevel` |
| `ACDREAM_A2C` | `AlphaToCoverage` (1/0/true/false) |
| `ACDREAM_MAX_COMPLETIONS_PER_FRAME` | `MaxCompletionsPerFrame` |
|
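The per-field override rule (valid value replaces the base setting, anything else falls back) might be factored as two small parsers. This is a sketch under assumptions: `EnvOverride` is a hypothetical helper, negative integers are treated as invalid, and boolean parsing is case-insensitive; the actual `WithEnvOverrides` may differ on all three.

```csharp
using System;

public static class EnvOverride
{
    // Returns the parsed value, or the base setting when raw is missing or invalid.
    // Assumption: negative values are rejected.
    public static int Int(string? raw, int fallback) =>
        int.TryParse(raw, out var v) && v >= 0 ? v : fallback;

    // Accepts 1/0/true/false, per the ACDREAM_A2C row; anything else falls back.
    public static bool Bool(string? raw, bool fallback) =>
        raw?.Trim().ToLowerInvariant() switch
        {
            "1" or "true" => true,
            "0" or "false" => false,
            _ => fallback,
        };
}
```

Usage would be along the lines of `EnvOverride.Int(Environment.GetEnvironmentVariable("ACDREAM_NEAR_RADIUS"), baseSettings.NearRadius)`, applied once per field.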
||||
|
||||
#### Tests
|
||||
|
||||
12 tests in `tests/AcDream.UI.Abstractions.Tests/Settings/QualityPresetTests.cs`
|
||||
cover: canonical preset values per enum member; `WithEnvOverrides` no-op when
|
||||
no env vars set; `WithEnvOverrides` each override individually; invalid env-var
|
||||
value falls back to base setting.
|
||||
|
||||
#### Files
|
||||
|
||||
- `src/AcDream.UI.Abstractions/Settings/QualityPreset.cs` — new
|
||||
- `src/AcDream.UI.Abstractions/Settings/DisplaySettings.cs` — `Quality` field added
|
||||
- `src/AcDream.UI.Abstractions/Panels/Settings/SettingsPanel.cs` — Display tab
|
||||
Quality dropdown (`RenderDisplayTab` method)
|
||||
- `src/AcDream.App/Rendering/GameWindow.cs` — `ReapplyQualityPreset`,
|
||||
`OnLoad` preset wiring
|
||||
- `tests/AcDream.UI.Abstractions.Tests/Settings/QualityPresetTests.cs` — new (12 tests)
|
||||
|
||||
#### Out of scope (deferred)
|
||||
|
||||
- Auto-detect preset on first launch (Phase A.6 / N.6.5).
|
||||
- Adaptive runtime preset drop on budget miss.
|
||||
- Per-feature toggles below preset level.
|
||||
|
||||
Commits: `afa4200` (schema + tests), `28d2c60` (wiring).
|
||||
|
||||
---
|
||||
|
||||
## 5. Data flow
|
||||
|
||||
### Per-frame (steady state)
|
||||
|
||||
```
GameWindow.OnUpdate(dt)
  └─ StreamingController.Tick(playerCx, playerCy)
       ├─ region.RecenterTo(...)    // produces TwoTierDiff if center changed
       ├─ for each ToLoadFar:  _enqueueLoad(id, LoadFar)
       ├─ for each ToLoadNear: _enqueueLoad(id, LoadNear)
       ├─ for each ToPromote:  _enqueueLoad(id, PromoteToNear)
       ├─ for each ToDemote:   _state.RemoveEntities(id)    // on render thread
       ├─ for each ToUnload:   _enqueueUnload(id)
       └─ drainCompletions(MaxCompletionsPerFrame=4)
            ├─ Loaded.Far:   _terrain.AddLandblock(meshData); _state.AddLandblock(...)
            ├─ Loaded.Near:  _terrain.AddLandblock(meshData); _state.AddLandblock(... entities)
            ├─ Promoted:     _state.AddEntitiesToExisting(id, entities)
            ├─ Unloaded:     _terrain.RemoveLandblock(id); _state.RemoveLandblock(id)
            └─ Failed/Crash: log

GameWindow.OnRender
  ├─ TerrainModernRenderer.Draw(camera, frustum)
  │    └─ glMultiDrawElementsIndirect across all near + far slots that pass cull
  └─ WbDrawDispatcher.Draw(camera, gpuWorldState.LandblockEntries, frustum, visibleCellIds, animatedEntityIds)
       ├─ for each LB entry:
       │    ├─ if invisible: walk only animatedEntityIds (Change #1)
       │    └─ if visible: walk entities, AABB cache lookup (Change #2)
       ├─ classify into groups, build SSBO, multi-draw indirect
       └─ flush DIAG every ~5 s
```
|
||||
|
||||
### Worker thread
|
||||
|
||||
```
LandblockStreamer.WorkerLoop
  while running:
    job = jobQueue.dequeue()
    switch job.Kind:
      LoadFar:
        block = dats.Get<LandBlock>(id)
        meshData = LandblockMesh.Build(block, ..., _surfaceCache)
        completionQueue.enqueue(Loaded(id, Far, block, [], meshData))
      LoadNear:
        block = dats.Get<LandBlock>(id)
        info = dats.Get<LandBlockInfo>(...)
        entities = LandblockLoader.BuildEntitiesFromInfo(info)
        scenery = WbSceneryAdapter.GenerateScenery(block, ...)
        meshData = LandblockMesh.Build(block, ..., _surfaceCache)
        completionQueue.enqueue(Loaded(id, Near, block, entities ∪ scenery, meshData))
      PromoteToNear:
        info = dats.Get<LandBlockInfo>(...)
        // The mesh is not rebuilt, but scenery generation needs the LandBlock
        // for height sampling. Either read it again (DatCollection caches the
        // last-read block, so the second access is cheap) OR pass it through
        // from the render thread's terrain-slot snapshot (deferred plan-level
        // decision).
        block = dats.Get<LandBlock>(id)
        entities = LandblockLoader.BuildEntitiesFromInfo(info)
        scenery = WbSceneryAdapter.GenerateScenery(block, ...)
        completionQueue.enqueue(Promoted(id, entities ∪ scenery))
```
|
||||
|
||||
---
|
||||
|
||||
## 6. Threading model
|
||||
|
||||
- **Render thread:** drives `StreamingController.Tick`, drains the
|
||||
completion queue, calls `TerrainModernRenderer.AddLandblock` /
|
||||
`RemoveLandblock`, mutates `GpuWorldState`. All GL calls on this thread.
|
||||
- **One streaming worker thread:** dat reads, mesh build, scenery generation.
|
||||
Owns `_surfaceCache` (now `ConcurrentDictionary`) — render thread does
|
||||
not access it directly.
|
||||
- **Network thread:** unchanged from Phase A.3 — drains UDP into the
|
||||
channel; render thread decodes.
|
||||
|
||||
Synchronization:
|
||||
- Job queue: `Channel<LbStreamJob>` (writer = render thread via
|
||||
`_enqueueLoad`; reader = worker).
|
||||
- Completion queue: `ConcurrentQueue<LandblockStreamResult>` (writer =
|
||||
worker; reader = render thread).
|
||||
- `_surfaceCache`: `ConcurrentDictionary<uint, SurfaceInfo>` populated by
|
||||
`LandblockMesh.Build` on the worker; read by future paths if any
|
||||
(none today).
|
||||
- `TerrainBlendingContext`: read-only post-init. No lock.
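The two queues above can be sketched as a small pipe with a bounded per-frame drain. This is an illustration under assumptions, not the `LandblockStreamer` code: `Job`, `Result`, and `StreamPipe` are hypothetical names, the worker is shown as an async loop rather than a dedicated thread, and the "real work" is replaced by a pass-through.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Channels;
using System.Threading.Tasks;

// Hypothetical shapes standing in for LbStreamJob / LandblockStreamResult.
public readonly record struct Job(uint LandblockId);
public readonly record struct Result(uint LandblockId);

public sealed class StreamPipe
{
    private readonly Channel<Job> _jobs = Channel.CreateUnbounded<Job>(
        new UnboundedChannelOptions { SingleReader = true });
    private readonly ConcurrentQueue<Result> _completions = new();

    // Render thread: enqueue work without blocking.
    public bool Enqueue(Job job) => _jobs.Writer.TryWrite(job);

    public void Shutdown() => _jobs.Writer.Complete();

    // Worker: runs until the channel is completed and drained.
    public async Task WorkerLoopAsync()
    {
        await foreach (var job in _jobs.Reader.ReadAllAsync())
            _completions.Enqueue(new Result(job.LandblockId)); // real work goes here
    }

    // Render thread: apply at most maxPerFrame results, leave the rest queued.
    public int Drain(int maxPerFrame, Action<Result> apply)
    {
        int applied = 0;
        while (applied < maxPerFrame && _completions.TryDequeue(out var r))
        {
            apply(r);
            applied++;
        }
        return applied;
    }
}
```

Capping the drain is what keeps a burst of completed landblocks from landing in a single frame; leftover results simply wait for the next `Tick`.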
|
||||
|
||||
---
|
||||
|
||||
## 7. Error handling
|
||||
|
||||
- **Worker crash:** caught in worker loop, posts
|
||||
`LandblockStreamResult.WorkerCrashed`. Render thread logs to console.
|
||||
(Existing pattern.)
|
||||
- **Dat read failure:** posts `LandblockStreamResult.Failed`. Render
|
||||
thread logs. Streaming continues with the LB skipped — region still
|
||||
tracks it as resident so we don't retry forever, but the slot stays empty.
|
||||
- **AABB cache invalidation race:** dynamic entity moves while the
|
||||
dispatcher is walking. Acceptable — at worst, the entity culls or
|
||||
draws based on the previous frame's position. Position is updated in
|
||||
the network handler (also render-thread today) so no actual race.
|
||||
- **Promotion timing:** when an LB crosses N₁ inward, we enqueue a
  `PromoteToNear` job on the worker. Until it completes, the LB has terrain
  but no scenery / entities. Frame budget is unaffected (only
  `LoadedLandblock` changes, and the dispatcher already handles missing
  entities by walking zero-length lists).
|
||||
- **Unload during in-flight load:** enqueue an unload while a load is
|
||||
in flight. When the load completes, render thread sees the LB is no
|
||||
longer resident — drop the result silently. Same pattern as today.
|
||||
|
||||
---
|
||||
|
||||
## 8. Testing strategy
|
||||
|
||||
### Unit tests (offline, no GL)
|
||||
|
||||
Add to `tests/AcDream.Core.Tests/Streaming/`:
|
||||
- `StreamingRegion_TwoTier_FirstTick_LoadsNearAndFarSeparately` — first
|
||||
call produces `ToLoadNear` populated for inner ring, `ToLoadFar`
|
||||
populated for outer ring, `ToPromote` empty (nothing was previously
|
||||
resident).
|
||||
- `StreamingRegion_TwoTier_NullToFar_OnFarRingEntry` — LB rolls into
|
||||
far window from null. Asserts entry in `ToLoadFar`, not
|
||||
`ToLoadNear`.
|
||||
- `StreamingRegion_TwoTier_FarToNear_OnNearRingEntry` — LB was
|
||||
far-resident, player walks toward it, LB enters near window. Asserts
|
||||
entry in `ToPromote`, not `ToLoadNear`.
|
||||
- `StreamingRegion_TwoTier_NullToNear_OnTeleport` — observer center
|
||||
jumps far enough that an LB goes from null → Near in one frame
|
||||
(e.g., teleport). Asserts entry in `ToLoadNear`, not `ToPromote`.
|
||||
- `StreamingRegion_TwoTier_NearToFar_OnNearBoundaryExitPlusHysteresis` —
|
||||
asserts entry in `ToDemote` only after distance exceeds
|
||||
`NearRadius + 2`.
|
||||
- `StreamingRegion_TwoTier_FarToNull_OnFarBoundaryExitPlusHysteresis` —
|
||||
asserts entry in `ToUnload` only after distance exceeds
|
||||
`FarRadius + 2`.
|
||||
- `StreamingRegion_TwoTier_HysteresisHoldsAcrossOscillation` — walk
|
||||
back-and-forth across N₁ five times within the hysteresis radius;
|
||||
assert no demote events fire.
|
||||
- `StreamingController_TwoTier_DrainsRoutedByVariant` — `Loaded.Far`,
|
||||
`Loaded.Near`, and `Promoted` each route to the right state mutation
|
||||
on the render thread.
|
||||
|
||||
Add to `tests/AcDream.Core.Tests/Rendering/Wb/`:
|
||||
- `WbDrawDispatcher_AnimatedEntities_InInvisibleLb_NoFullEntityWalk` —
|
||||
verify Change #1 (only iterates `animatedEntityIds`, not `Entities`).
|
||||
- `WbDrawDispatcher_PerEntityAabbCached_NotRecomputed` — assert AABB
|
||||
fields are read, not recomputed, for static entities.
|
||||
|
||||
### Conformance tests
|
||||
|
||||
- `TerrainModernConformanceTests` (existing) — must still pass. The
|
||||
visual mesh Z must agree with `TerrainSurface.SampleZFromHeightmap`
|
||||
to within 1 mm across both tiers.
|
||||
- `LandblockMeshTests` (existing) — must still pass. Worker-thread
|
||||
mesh build produces byte-identical results to render-thread build
|
||||
for the same inputs.
|
||||
|
||||
### Perf gate (manual, with `[WB-DIAG]` + `[TERRAIN-DIAG]`)
|
||||
|
||||
- **Standstill bench:** launch with `ACDREAM_WB_DIAG=1`, stand at
|
||||
Holtburg dueling field for 60 s. Read median + p95 + p99 from log.
|
||||
- **Walking bench:** launch with diag, run from Holtburg to North
|
||||
Yanshi, ~60 s. Same metrics.
|
||||
- **First traversal bench:** clear OS file cache (or reboot), launch
|
||||
with diag, walk into a region not previously visited, capture the
|
||||
worker-thread fill duration + render-thread frame time during fill.
|
||||
|
||||
### Visual gate (manual, user-driven)
|
||||
|
||||
User launches the client, walks the standard route, confirms:
|
||||
1. Horizon visible at 2.3 km.
|
||||
2. Fog blend is smooth (no scenery cliff at N₁).
|
||||
3. No shimmer on distant terrain.
|
||||
4. Smooth tree edges (foliage A2C).
|
||||
5. No new z-fighting / depth artifacts.
|
||||
|
||||
---
|
||||
|
||||
## 9. Out of scope (explicitly deferred)
|
||||
|
||||
Per the brainstorm Q10 confirmation:
|
||||
|
||||
- **GPU-side culling** (compute pre-pass) — N.6.
|
||||
- **Persistent-mapped indirect buffer** — N.6.
|
||||
- **Multi-thread mesh-build worker pool** — N.6 if first-traversal fill
|
||||
feels too slow at gate.
|
||||
- **Static/dynamic persistent groups** (Q5 Option B — the "compute the
|
||||
group key once at spawn" architecture change) — separate later phase
|
||||
(likely A.6 or N.6.5).
|
||||
- **Billboard / impostor scenery** at far tier — escalation only if the
|
||||
fog'd terrain horizon looks too bare at gate.
|
||||
- **Wider N₁ hysteresis** (Option C, radius+3) — single-line tweak only
|
||||
if gate finds entity pop-in along the boundary.
|
||||
- **Far-tier terrain mesh LOD** (decimating 2×2 LBs) — not needed at
|
||||
N₂=12; revisit only if N₂ grows beyond 15.
|
||||
- **Sky / particles modern path migration** — N.7+ phases.
|
||||
- **EnvCell modern path migration** — separate phase.
|
||||
- **Shadow mapping** — separate visual phase, later.
|
||||
- **Strict 240 Hz during walking** (Q9 Option A): graduate to it in a
  perf-polish phase if we want to commit to it.
|
||||
|
||||
---
|
||||
|
||||
## 10. Risks

1. **Fog tuning visual gate** *(highest risk).* Hardest non-engineering
   risk. The 0.7 / 0.95 multipliers in §4.8 are first-cut numbers. If
   the fog band is too thin (visible scenery cliff at N₁) or too thick
   (terrain looks washed out), iterate on the multipliers. Mitigation:
   expose `FogStart` / `FogEnd` as tunable env vars during A.5
   development for fast iteration.
2. **A2C / MSAA framebuffer interaction** *(moderate risk).* MSAA on
   the GL render target may break sky / particles / UI rendering.
   Audit during implementation. **Fallback: ship Q8 Option B (mipmaps
   + depth-audit only) if A2C goes sideways.** Document the result.
3. **Worker starvation on first-traversal** *(low-moderate risk).*
   ~2.7 s of sequential mesh build on the first walk into a virgin region.
   Render thread frame time stays in budget; the visible effect is the
   horizon filling in. Acceptable per Q9 Option B; graduate to a
   multi-worker pool in N.6 if the user complains.
4. **Tier-boundary churn** *(low risk).* When the player crosses N₁ in both
   directions, demote→promote→demote fires. Hysteresis (radius+2) is
   the buffer. If thrash is visible, widen to radius+3.
5. **Entity AABB cache invalidation** *(low risk).* Dynamic entities
   must recompute their AABB on position change. A single render
   thread means no concurrent mutation; the dirty-flag pattern is
   straightforward.
6. **Server broadcast radius mismatch** *(low risk).* If ACE's broadcast
   radius is < N₁=4, NPCs in outer near-tier LBs won't be
   server-broadcast (they don't exist in our state). Mitigation:
   N₁=4 is conservative — typical ACE configs broadcast at 5-7 LBs.
   If observed, drop N₁ to 3.
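
Risk 1's mitigation can be sketched as follows. This is a minimal illustration in Python (not the project's C#), and it rests on assumptions the text above does not confirm: that the 0.7 / 0.95 multipliers scale the distance to the N₁ edge, that this distance is N₁ × 192 m (the Asheron's Call landblock size), and that the overrides come from environment variables named `ACDREAM_FOG_START` / `ACDREAM_FOG_END` (hypothetical names; the spec only says `FogStart` / `FogEnd` should be tunable).

```python
import os

LANDBLOCK_METERS = 192.0  # landblock edge length in Asheron's Call


def fog_range(near_radius_lb, start_mul=0.7, end_mul=0.95, env=None):
    """Return (fog_start, fog_end) in meters.

    Defaults are derived from the near-tier radius (assumption: the
    multipliers scale the N1 edge distance). Either value can be
    overridden through the hypothetical env vars for fast iteration.
    """
    env = os.environ if env is None else env
    near_edge = near_radius_lb * LANDBLOCK_METERS
    start = float(env.get("ACDREAM_FOG_START", near_edge * start_mul))
    end = float(env.get("ACDREAM_FOG_END", near_edge * end_mul))
    if end <= start:
        # A zero-width or inverted band would produce a hard cliff.
        raise ValueError(f"fog end ({end}) must exceed fog start ({start})")
    return start, end


if __name__ == "__main__":
    # With N1 = 4 and no overrides: roughly 537.6 m to 729.6 m.
    print(fog_range(4, env={}))
```

Rejecting an inverted band up front keeps a mistyped override from silently reintroducing the scenery cliff the fog is meant to hide.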
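
The hysteresis buffer in risk 4 can be illustrated with a small sketch. It is written in Python for brevity and assumes details the spec does not state here: distance is measured as Chebyshev distance in landblocks, promotion happens at distance ≤ N₁, and demotion only happens beyond N₁ + hysteresis. The function names are hypothetical.

```python
NEAR, FAR = "near", "far"


def chebyshev_distance(a, b):
    """Distance in landblocks on a square grid (assumed metric)."""
    return max(abs(a[0] - b[0]), abs(a[1] - b[1]))


def next_tier(current, distance, near_radius=4, hysteresis=2):
    """Promote at <= near_radius, demote only beyond near_radius + hysteresis."""
    if current == FAR and distance <= near_radius:
        return NEAR
    if current == NEAR and distance > near_radius + hysteresis:
        return FAR
    return current


def count_transitions(player_path, landblock, hysteresis=2, start_tier=FAR):
    """Count tier changes for one landblock as the player moves."""
    tier, transitions = start_tier, 0
    for pos in player_path:
        new_tier = next_tier(
            tier, chebyshev_distance(pos, landblock), hysteresis=hysteresis
        )
        if new_tier != tier:
            transitions += 1
            tier = new_tier
    return transitions


if __name__ == "__main__":
    # Player paces across the N1 boundary of the landblock at (5, 0).
    pacing = [(0, 0), (1, 0), (0, 0), (1, 0)]
    print(count_transitions(pacing, (5, 0)))                # 1
    print(count_transitions(pacing, (5, 0), hysteresis=0))  # 3
```

With the buffer in place the landblock is promoted once and stays near while the player paces; without it, every boundary crossing triggers a promote or demote.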
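
The dirty-flag pattern mentioned in risk 5 is sketched below in Python. The class and member names are hypothetical and the AABB is simplified to a position plus fixed half-extents; the point is only that the box is recomputed lazily, on the first query after a position change, which is safe when a single thread both mutates and reads.

```python
class CachedAabbEntity:
    """Entity whose AABB is recomputed only after its position changes."""

    def __init__(self, position, half_extents):
        self._position = tuple(position)
        self._half_extents = tuple(half_extents)
        self._aabb = None
        self._dirty = True          # nothing cached yet
        self.recompute_count = 0    # instrumentation for the example

    def move_to(self, position):
        position = tuple(position)
        if position != self._position:
            self._position = position
            self._dirty = True      # invalidate; recompute on next query

    def aabb(self):
        if self._dirty:
            lo = tuple(p - h for p, h in zip(self._position, self._half_extents))
            hi = tuple(p + h for p, h in zip(self._position, self._half_extents))
            self._aabb = (lo, hi)
            self._dirty = False
            self.recompute_count += 1
        return self._aabb


if __name__ == "__main__":
    npc = CachedAabbEntity((0, 0, 0), (1, 1, 2))
    npc.aabb()
    npc.aabb()                      # served from cache
    npc.move_to((10, 0, 0))
    print(npc.aabb(), npc.recompute_count)
```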
---

## 11. What was deferred (post-A.5)

The following items were identified during A.5 development but deferred to
post-A.5 phases. They are tracked as OPEN issues in `docs/ISSUES.md`.

1. **Tier 1 entity-classification cache** (commit `3639a6f`, reverted at
   `9b49009`): The first attempt cached `meshRef.PartTransform`, which is mutated
   per frame for animated entities (skeletal pose). The next attempt needs to:
   (a) audit AnimationSequencer + AnimationHookRouter to identify ALL
   per-frame mutations of MeshRef state; (b) redesign the cache to bypass
   animated entities OR cache only the animation-invariant subset; (c) test
   specifically with a moving animated NPC on screen. (`docs/ISSUES.md` #53)

2. **Lifestone missing visual**: The Holtburg lifestone has not rendered since
   earlier in A.5 development. Possibly Bug A's far-tier strip is incorrectly
   catching a near-tier entity, or this is a separate earlier regression.
   (`docs/ISSUES.md` #52)

3. **Plumb JobKind through BuildLandblockForStreaming**: Bug A's fix (commit
   `9217fd9`) strips entities post-load in the worker. Proper fix: skip the
   `LandBlockInfo` + scenery load entirely for far-tier jobs. Estimated ~30 min.
   (`docs/ISSUES.md` #54)

4. **Tier 2 — Static/dynamic split with persistent groups**: ~2-week phase.
   Avoids per-frame entity re-classification by maintaining stable groups
   keyed at spawn time. Roadmap doc at
   `docs/plans/2026-05-10-perf-tiers-2-3-roadmap.md`.

5. **Tier 3 — GPU-side culling via compute pre-pass**: ~1-month phase.
   Same roadmap doc.

6. **Eliminate ToEntries adapter allocation**: tiny win (~25 KB/frame).

7. **InvalidateEntity wiring on palette/ObjDesc events**: needed by the
   Tier 1 retry.

8. **Visual gate at full High preset**: never validated due to the
   GPU+CPU stack-up OS crash earlier in A.5. With Bug A fixed, the crash
   likely won't recur; defer retest to post-A.5 perf polish.
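
The difference between the current strip-after-load fix and the proper fix in item 3 can be sketched as below. This is an illustration in Python, not the project's C#: `build_landblock`, its parameters, and the loader callbacks are hypothetical stand-ins, since the real `BuildLandblockForStreaming` signature is not shown in this document. The sketch only shows the job kind being consulted before the entity load rather than after it.

```python
from enum import Enum


class JobKind(Enum):
    NEAR = "near"
    FAR = "far"


def build_landblock(landblock_id, kind, load_terrain, load_entities):
    """Build a streaming result, skipping entity work for far-tier jobs.

    load_terrain / load_entities are injected callbacks standing in for
    the dat reads and scenery generation described in the text.
    """
    terrain = load_terrain(landblock_id)
    if kind is JobKind.FAR:
        entities = []  # far tier is terrain-only: never load, nothing to strip
    else:
        entities = load_entities(landblock_id)
    return {"id": landblock_id, "terrain": terrain, "entities": entities}


if __name__ == "__main__":
    entity_loads = []

    def fake_terrain(lb):
        return f"terrain-{lb}"

    def fake_entities(lb):
        entity_loads.append(lb)
        return ["tree", "npc"]

    far = build_landblock(1, JobKind.FAR, fake_terrain, fake_entities)
    near = build_landblock(2, JobKind.NEAR, fake_terrain, fake_entities)
    print(far["entities"], near["entities"], entity_loads)
```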

---

## 12. References (formerly §11)

- **Handoff (cold-start):** [`docs/research/2026-05-10-phase-a5-handoff.md`](../../research/2026-05-10-phase-a5-handoff.md)
- **N.5b handoff (predecessor):** [`docs/research/2026-05-09-phase-n5b-handoff.md`](../../research/2026-05-09-phase-n5b-handoff.md)
- **N.5b perf baseline:** [`docs/plans/2026-05-09-phase-n5b-perf-baseline.md`](../../plans/2026-05-09-phase-n5b-perf-baseline.md)
- **Roadmap A.5 entry:** [`docs/plans/2026-04-11-roadmap.md`](../../plans/2026-04-11-roadmap.md)
- **N.5b memory state:** `memory/project_phase_n5b_state.md` (three high-value
  gotchas — bindless uniform-sampler driver quirk, MaybeFlushTerrainDiag
  underflow, visual gate confirmation requirement).
- **Existing streaming files:**
  - [`src/AcDream.App/Streaming/StreamingController.cs`](../../../src/AcDream.App/Streaming/StreamingController.cs)
  - [`src/AcDream.App/Streaming/StreamingRegion.cs`](../../../src/AcDream.App/Streaming/StreamingRegion.cs)
  - [`src/AcDream.App/Streaming/GpuWorldState.cs`](../../../src/AcDream.App/Streaming/GpuWorldState.cs)
  - [`src/AcDream.App/Streaming/LandblockStreamer.cs`](../../../src/AcDream.App/Streaming/LandblockStreamer.cs)
- **Existing dispatcher:** [`src/AcDream.App/Rendering/Wb/WbDrawDispatcher.cs`](../../../src/AcDream.App/Rendering/Wb/WbDrawDispatcher.cs)
- **Existing terrain renderer:** [`src/AcDream.App/Rendering/TerrainModernRenderer.cs`](../../../src/AcDream.App/Rendering/TerrainModernRenderer.cs)
- **Mesh builder (will move off render thread):** [`src/AcDream.Core/Terrain/LandblockMesh.cs`](../../../src/AcDream.Core/Terrain/LandblockMesh.cs)