docs(spec): indoor cell rendering fix — Phase 1 diagnostics

Initial brainstorm assumed N.5 retirement broke EnvCell rendering by
leaving _pendingCellMeshes unconsumed. Pivoted mid-brainstorm:

- WB's PrepareMeshData routes EnvCell dat-record types to
  PrepareEnvCellMeshData (ObjectMeshManager.cs:557) which produces an
  IsSetup=true ObjectMeshData with the floor mesh as EnvCellGeometry.
- WbDrawDispatcher correctly handles IsSetup=true (lines 607-621) by
  iterating SetupParts and drawing each.
- DefaultDatReaderWriter loads region cell dats; ResolveId resolves
  envCellId correctly.
- LandblockSpawnAdapter calls IncrementRefCount on every entity's
  GfxObjId, including envCellId for cell entities. ServerGuid==0 passes
  the atlas-tier filter.

Chain is structurally intact. The bug is somewhere subtler.

Spec pivots to a diagnostics-first phase: ACDREAM_PROBE_INDOOR=1
captures per-frame cell-entity walk + render-data lookup + SetupParts
traversal + composed-transform values. Six hypotheses (WB silently
returns null, empty batches, cull bug, double-spawn, transform
double-apply, dispatcher MeshRefs mismatch) match six concrete fix
shapes. Phase 2 design follows the probe data.

This is more honest than the original "build a new upload path"
design, which would have hidden the actual bug.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Erik 2026-05-19 11:02:05 +02:00
parent 1024ba34e0
commit f6e9c58932


@ -0,0 +1,185 @@
# Indoor Cell Rendering Fix — Design
**Status:** Brainstormed 2026-05-19. Pivoted mid-brainstorm; see §1.5 for
the corrected root-cause analysis. Awaiting user review.

**Scope:** Diagnose + fix the actual break in the EnvCell rendering chain.

**Out of scope this phase:** Cell collision symptoms (no wall collision
when exiting, odd open-air collisions). Filed as a follow-up phase pending
user repro data.

---
## 1. Symptom
Walking into Holtburg Inn: the exterior building stab renders (walls visible
from inside), but the interior cell's own room mesh — floor, inner walls,
ceiling — is missing. The user can walk through the empty interior with no
floor visible underfoot.
## 1.5 What the root cause is NOT (corrected mid-brainstorm)
Initial hypothesis: N.5 retirement (commit
[`dcae2b6`](../../../#) 2026-05-08) deleted the legacy cell-mesh drain path
on the assumption that "WB handles EnvCell geometry through its own
pipeline," and that assumption was wrong.

**Closer inspection during the brainstorm showed that the assumption is
correct, so the initial hypothesis is wrong.**

WB's `ObjectMeshManager.PrepareMeshData(id, isSetup)` at
[`ObjectMeshManager.cs:557`](../../../references/WorldBuilder/Chorizite.OpenGLSDLBackend/Lib/ObjectMeshManager.cs:557)
dispatches on the **dat record type** (not on the `isSetup` parameter).
When the id resolves to a `DBObjType.EnvCell`, it routes to
`PrepareEnvCellMeshData(id, envCell, ct)` at line 1186, which produces an
`ObjectMeshData` with `IsSetup=true`, `SetupParts` = [static objects +
cellGeometry], and `EnvCellGeometry` = the floor/wall/ceiling room mesh.

The dispatcher correctly handles `IsSetup=true` at
[`WbDrawDispatcher.cs:607-621`](../../../src/AcDream.App/Rendering/Wb/WbDrawDispatcher.cs:607):
it iterates `SetupParts`, looks up each part's render data, composes
transforms, and draws each.

`DefaultDatReaderWriter` loads region cell dats during construction
([`DefaultDatReaderWriter.cs:66-89`](../../../references/WorldBuilder/WorldBuilder.Shared/Services/DefaultDatReaderWriter.cs:66))
so `ResolveId(envCellId)` will find the cell record.

`LandblockSpawnAdapter.OnLandblockLoaded` iterates `landblock.Entities` and
calls `_adapter.IncrementRefCount(meshRef.GfxObjId)` for each
([`LandblockSpawnAdapter.cs:75-80`](../../../src/AcDream.App/Rendering/Wb/LandblockSpawnAdapter.cs:75)).
Cell entities have `ServerGuid == 0` (atlas-tier), so they pass the filter
at line 73, and their `MeshRef.GfxObjId` (equal to `envCellId`) reaches
`IncrementRefCount`.

**The chain looks structurally intact.** Floors SHOULD render today. They
don't. Therefore the failure is subtler than "we never invoke the load."
## 2. Real failure point — to be determined by diagnostics
Six hypotheses, in rough order of probability. The first five are untested;
the sixth is ruled out by inspection and listed for completeness:
1. **WB silently fails to build the `ObjectMeshData`.** `PrepareEnvCellMeshData`
returns null when the Environment dat can't resolve, or when
`PrepareCellStructMeshData` returns null (texture issues, surface
resolution failure). WB doesn't log; the failure is invisible.
2. **`SetupParts.cellGeomId` is uploaded but its texture batches are empty.**
`UploadGfxObjMeshData` returning null at line 675 is treated as a
non-fatal substitution — the render data has no draw batches, dispatcher
silently draws nothing.
3. **Cell entity is culled before reaching the dispatcher.** `visibleCellIds`
filter at `WbDrawDispatcher.cs:317-319` rejects entities whose
`ParentCellId` isn't in the visible set. If the cell entity's
`ParentCellId == envCellId` but the visibility BFS doesn't include the
player's current cell (because `FindCameraCell` returns null when camera
is in third-person above the building, etc.), the cell entity is
skipped.
4. **Double-spawn conflict between WB's static-object SetupParts and
acdream's per-stab entity hydration.** `PrepareEnvCellMeshData` iterates
`envCell.StaticObjects` and adds each as a SetupPart. Meanwhile acdream
already hydrates the same static objects as separate `WorldEntity`
instances at [`GameWindow.cs:5390-5439`](../../../src/AcDream.App/Rendering/GameWindow.cs:5390).
WB might be holding extra ref counts on those GfxObj IDs that block
eviction or cause cache thrash. Unlikely to cause "missing floor" but
worth ruling out.
5. **Transform composition bug.** `ComposePartWorldMatrix(entityWorld,
meshRef.PartTransform, partTransform)` — if our cell entity's
`meshRef.PartTransform == cellTransform` and WB's `partTransform`
already bakes the cell origin, the floor lands at `2 × cellOrigin`,
far below or beside the actual cell. The user would describe this
as "missing" because the floor is now outside the visible frustum.
6. **The cell entity's `MeshRefs` only has one entry, but WB expects
multiple.** The dispatcher iterates `entity.MeshRefs`, but each MeshRef
gets its own `TryGetRenderData(meshRef.GfxObjId)` call. For cell
entities we have `MeshRefs = { MeshRef(envCellId, cellTransform) }`.
When the lookup returns an `IsSetup=true` render data, the dispatcher
does the right thing (lines 607-621) and iterates SetupParts, so this
path should work. Ruled out by inspection.
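
To make hypothesis 5 concrete, the following is a minimal sketch using `System.Numerics`, not code from acdream or WB. `Compose` is a hypothetical stand-in for `ComposePartWorldMatrix`; the real multiplication order is not shown in this spec and is assumed here to be part, then mesh-ref, then entity. It only illustrates that applying the cell translation twice doubles the floor's offset.

```csharp
using System.Numerics;

public static class TransformDoubleApplyDemo
{
    // Hypothetical stand-in for ComposePartWorldMatrix (assumed order:
    // part first, then mesh-ref, then entity, in row-vector convention).
    public static Matrix4x4 Compose(
        Matrix4x4 entityWorld, Matrix4x4 meshRefTransform, Matrix4x4 partTransform)
        => partTransform * meshRefTransform * entityWorld;

    // World-space origin of the cell-geometry part for a translation-only cell.
    // partBakesCellOrigin models the case where WB's part transform already
    // contains the cell origin while the MeshRef applies it a second time.
    public static Vector3 FloorOrigin(Vector3 cellOrigin, bool partBakesCellOrigin)
    {
        Matrix4x4 cell = Matrix4x4.CreateTranslation(cellOrigin);
        Matrix4x4 part = partBakesCellOrigin ? cell : Matrix4x4.Identity;
        return Compose(Matrix4x4.Identity, cell, part).Translation;
    }
}
```

With a cell origin of `(10, 20, 5)`, `FloorOrigin(origin, false)` yields `(10, 20, 5)` and `FloorOrigin(origin, true)` yields `(20, 40, 10)`. If the probe's composed matrix shows a translation near twice the expected cell origin, that would point at H5.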
## 3. Solution
**Phase 1: Diagnostics.** Add a runtime-toggleable `ACDREAM_PROBE_INDOOR=1`
env-var (mirrored as a DebugPanel checkbox) that logs the following every
frame:
- Number of cell entities walked by the dispatcher.
- Per-cell-entity: `TryGetRenderData(envCellId)` hit/miss.
- On hit: `renderData.IsSetup`, `renderData.SetupParts.Count`.
- For each SetupPart: `TryGetRenderData(partGfxObjId)` hit/miss.
- The composed world matrix for the cell-geometry part (so we can see
where the floor actually ends up in world space).
- Whether the entity was culled by `visibleCellIds` (and why).
Run the client, walk into Holtburg Inn, capture probe output. The log
tells us exactly which step in the chain is breaking.
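
As a rough illustration of the probe described above, the sketch below shows one way the toggle and a per-cell log line could look. Everything except the `ACDREAM_PROBE_INDOOR` variable name is an assumption: the `IndoorProbe` class, its method names, and the line format are hypothetical and not taken from the acdream codebase.

```csharp
using System;
using System.Numerics;

// Hypothetical helper; names and log format are illustrative assumptions.
public static class IndoorProbe
{
    // The probe is considered on only when the variable is exactly "1".
    public static bool IsEnabled(string? envValue) => envValue == "1";

    public static bool IsEnabledFromEnvironment() =>
        IsEnabled(Environment.GetEnvironmentVariable("ACDREAM_PROBE_INDOOR"));

    // Formats one line for one cell entity. Later fields are omitted when an
    // earlier step already failed, so the line ends where the chain breaks.
    public static string FormatCellLine(
        uint envCellId, bool culled, bool renderDataHit, bool isSetup,
        int partHits, int partCount, Vector3 geomOrigin)
    {
        string head = $"[indoor] cell=0x{envCellId:X8}";
        if (culled) return head + " culled=visibleCellIds";
        if (!renderDataHit) return head + " renderData=miss";
        // Invariant culture keeps decimal separators stable across machines.
        return FormattableString.Invariant(
            $"{head} renderData=hit isSetup={isSetup} parts={partHits}/{partCount} geomOrigin=({geomOrigin.X:F1},{geomOrigin.Y:F1},{geomOrigin.Z:F1})");
    }
}
```

Keeping the formatter free of side effects means it can be exercised without a running client, and the dispatcher only needs to check the toggle before calling it.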
**Phase 2: Fix the specific break.** Once the probe identifies the
failure point, implement the surgical fix. Likely shapes per hypothesis:

| Hypothesis | Fix shape |
|---|---|
| H1 — WB returns null | Add WB logging or pre-check the dat resolution path in WbMeshAdapter |
| H2 — Empty batches | Investigate WB texture pipeline; possibly a missing texture in the cell's surface list |
| H3 — Cull bug | Fix `ParentCellId` assignment OR loosen the visibility filter for cell entities |
| H4 — Double-spawn | Stop WB from spawning static-object parts in EnvCell setups (filter them in PrepareEnvCellMeshData, or skip acdream's per-stab hydration when WB handles the cell) |
| H5 — Transform double-apply | Replace `MeshRef.PartTransform = cellTransform` with `entity.Position+Rotation = cellPosition` |
| H6 — MeshRefs structure | Already ruled out in §2 |

Phase 2's actual code change is small and well-targeted once Phase 1
gives us a definite answer.
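
One way to read the probe output against the hypotheses is sketched below. This is a triage aid under stated assumptions, not a definitive mapping: the `CellProbe` record, the `Hypothesis` enum, and the 1-unit tolerance are hypothetical, and treating a SetupPart lookup miss as the signal for H2 is an assumption, since the probe fields listed in this spec do not include batch counts. H4 is not detectable from these fields and is left out.

```csharp
using System.Numerics;

// Hypothetical triage types; names are illustrative assumptions.
public enum Hypothesis { None, H1NullMeshData, H2EmptyBatches, H3Culled, H5TransformDoubleApply }

public readonly record struct CellProbe(
    bool Culled, bool RenderDataHit, int PartHits, int PartCount,
    Vector3 GeomOrigin, Vector3 ExpectedCellOrigin);

public static class ProbeTriage
{
    // Checks follow the order in which the render chain runs, so the first
    // failing step wins.
    public static Hypothesis Classify(CellProbe p, float tolerance = 1.0f)
    {
        if (p.Culled) return Hypothesis.H3Culled;                 // never reached the lookup
        if (!p.RenderDataHit) return Hypothesis.H1NullMeshData;   // no ObjectMeshData for the cell
        if (p.PartHits < p.PartCount) return Hypothesis.H2EmptyBatches; // assumption: part miss ~ H2
        if (Vector3.Distance(p.GeomOrigin, p.ExpectedCellOrigin) > tolerance)
            return Hypothesis.H5TransformDoubleApply;             // floor drawn in the wrong place
        return Hypothesis.None;                                   // chain looks healthy for this cell
    }
}
```

A result of `None` for a cell whose floor is still invisible would suggest the break lies outside the listed hypotheses, which is itself useful input for the Phase 2 design.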
## 4. Why NOT build a separate cell renderer
The original brainstorm proposed adapting `_pendingCellMeshes` data into
WB via a new `UploadCellMesh` adapter method. **That solution is wrong:**
it would duplicate work WB already does, fragment the rendering pipeline,
and bypass WB's existing GPU memory management. Worse, it would hide
the actual bug rather than fix it.
## 5. Edge cases
| Scenario | Behavior |
|---|---|
| Probe overhead visible during diagnostic capture | Probe is heavy (per-frame, per-entity). Bounded by a short walk; toggle off at runtime when done. |
| Probe spam in production | Default OFF, mirrored to DebugPanel. Same pattern as L.2a `ACDREAM_PROBE_RESOLVE` / `ACDREAM_PROBE_CELL`. |
| Concurrent landblock stream | Probe records per frame across all loaded cells — useful for cross-cell comparison ("does cell X load but cell Y not?"). |
## 6. Testing strategy
**Unit tests:** none in Phase 1. The probe is diagnostic, not behavioral.

**Visual verification (user-driven, end-to-end):**

- Add probe, launch client, walk into Holtburg Inn.
- Read probe output to identify which hypothesis matches.
- Write up Phase 2 in a new design (or amend this one) once the failure
point is known.

**Phase 2 unit tests:** depend on the fix shape. If H5 (transform
double-apply), tests verify the world matrix composition. If H3 (cull
bug), tests verify visibility BFS for indoor entities.
## 7. What's NOT in this phase
- Cell collision symptoms — investigated separately.
- Particle/fire emitter integration — already shipped.
- Light registration — already shipped.
- Stab-leak-through-walls — deferred.
## 8. Acceptance criteria
**Phase 1 (this phase):**
- [ ] `ACDREAM_PROBE_INDOOR=1` env var + DebugPanel mirror.
- [ ] One log line per frame, per cell entity, showing render-data lookup
results, SetupParts traversal, and composed transforms.
- [ ] Probe captured at Holtburg Inn confirms which hypothesis matches.
- [ ] Phase 2 design (amended spec or new spec) documents the surgical fix.

**Phase 2 (next phase, driven by Phase 1 output):**
- [ ] `dotnet build` clean, `dotnet test` clean.
- [ ] Visual verification: walking into Holtburg Inn renders interior floor +
walls correctly.
- [ ] Roadmap updated.
- [ ] Probe left in place for future regressions but defaulted off.