acdream/docs/research/2026-05-19-indoor-cell-rendering-probe-capture.md
Erik 25f009140a docs(research): Phase 1 indoor probe capture — H1 confirmed
Captured at Holtburg landblock 0xA9B4 with ACDREAM_PROBE_INDOOR_ALL=1.

Result: 123 EnvCells in Holtburg get [indoor-upload] requested but
ONLY 97 get a matching [indoor-upload] completed. 26 cells silently
fail in WB's PrepareEnvCellMeshData / PrepareMeshData. The first
interior cell 0xA9B40100 — likely the inn entry or another major
building anchor — is among the failures, exactly matching the
user's "floor missing" symptom.

Other hypotheses ruled out:
- H2 (empty batches): completed cells have cellGeomVerts=14-86.
- H3 (cull bug): walk probe confirms cells pass all visibility filters.
- H4 (double-spawn): partCount values match expected SetupParts.
- H5 (transform double-apply): xform probe shows composedT==meshRefT;
  no double-apply.
- H6 (MeshRefs structure): lookup probe shows isSetup=True and
  partsHit≈partCount for uploaded cells.

Phase 2 plan: wrap PrepareMeshDataAsync with our own catch-and-log
in WbMeshAdapter so the swallowed exception (most likely cause of
the 26 silent failures, per WB ObjectMeshManager.cs:589) becomes
visible. Once we know the actual failure reason, target the fix.

Also flags IsEnvCellId false-positives on GfxObj IDs whose lower 24
bits ≥ 0x0100 — tightening recommended in Phase 2.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-19 12:03:25 +02:00


# Indoor Cell Rendering — Phase 1 Probe Capture
**Date:** 2026-05-19
**Probe:** Phase 1 diagnostic probes from spec `2026-05-19-indoor-cell-rendering-fix-design.md`
**Capture conditions:** `ACDREAM_PROBE_INDOOR_ALL=1`, walk into Holtburg (landblock `0xA9B4`).
**Verdict:** Hypothesis **H1 (WB silently returns null from `PrepareEnvCellMeshData`)** is **CONFIRMED** for ~21% of Holtburg's EnvCells, including the first interior cell `0xA9B40100`.

---

## Probe line breakdown (real EnvCell-format IDs only)
| Probe | Count | Notes |
|---|---|---|
| `[indoor-upload] requested` (0xA9B4 cells) | 123 (unique) | LandblockSpawnAdapter triggers PrepareMeshDataAsync for every cell in Holtburg landblock. |
| `[indoor-upload] completed` (0xA9B4 cells) | **97** (unique) | **26 cells never produce a completed line.** |
| `[indoor-walk]` (cell-room entities, 0xA9B4) | 27,631 | Cell-room entities pass `landblockVisible` + `aabbVisible` + `cellInVis` filters. Walk path is healthy. |
| `[indoor-lookup]` (0xA9B4 cells) | 6,067 | Total dispatcher lookups for Holtburg cells. |
| `[indoor-lookup] hit=True` | 45 | Only ~0.7% hit rate — the rate-limited probe captures one snapshot per cell after rendering stabilizes. |
| `[indoor-lookup] hit=False` | 6,022 | Most are pre-upload-completion frames + the 26 silently-failing cells. |
| `[indoor-xform]` | 97 | One per successfully-uploaded cell. Cell-geom SetupPart's render data is non-null and reaches `ComposePartWorldMatrix`. |
## Hypotheses
### H1 — WB silently returns null from `PrepareEnvCellMeshData` ✅ CONFIRMED
26 of 123 Holtburg cells (21%) get an `[indoor-upload] requested` line but **never** produce an `[indoor-upload] completed` line. This is the classic H1 signature: WB's `ObjectMeshManager.PrepareMeshData` either returns null (lines 568, 583, and 592 of `ObjectMeshManager.cs`) or swallows an exception in its catch block at lines 589-592. In both cases the pending `meshData` never reaches `StagedMeshData`, so the drain in `Tick()` never sees it and no completion line is emitted.
**First 15 cells with no completion:**
```
0xA9B40100, 0xA9B40111, 0xA9B40112, 0xA9B40117, 0xA9B4011B,
0xA9B40121, 0xA9B40123, 0xA9B40129, 0xA9B4012A, 0xA9B4012E,
0xA9B40138, 0xA9B4013F, 0xA9B40141, 0xA9B40143, 0xA9B40147
```
`0xA9B40100` is **the first indoor cell** in the Holtburg landblock. It is most likely the inn entry or another major building's anchor cell, which is exactly where the user reported "floor missing."
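The requested-versus-completed comparison behind this list can be reproduced with a small set difference over the capture log. The following is a minimal sketch, not the tooling actually used for this capture. It assumes each upload probe line has the shape `[indoor-upload] requested cellId=0x...` (the `cellId=` field is taken from the `[indoor-lookup]` sample in this document; the exact upload line format is an assumption), and `missing_completions` is a hypothetical helper name.

```python
import re

# Assumed line shape (not confirmed by the capture):
#   "[indoor-upload] requested cellId=0xA9B40100 ..."
UPLOAD_RE = re.compile(
    r"\[indoor-upload\] (requested|completed)\b.*?cellId=0x([0-9A-Fa-f]+)"
)

def missing_completions(lines, landblock=0xA9B4):
    """Return sorted cell ids that were requested but never completed."""
    requested, completed = set(), set()
    for line in lines:
        match = UPLOAD_RE.search(line)
        if match is None:
            continue
        # Keep only the low 32 bits so ids logged as 64-bit values still match.
        cell_id = int(match.group(2), 16) & 0xFFFFFFFF
        if cell_id >> 16 != landblock:
            continue  # cell belongs to a different landblock
        target = requested if match.group(1) == "requested" else completed
        target.add(cell_id)
    return sorted(requested - completed)
```

Feeding the function the lines of a capture file (for example `missing_completions(open(path))`, with `path` being wherever the log was saved) should yield the 26 ids if the assumed line shape holds.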
### H2 — Empty batches ❌ RULED OUT
For successfully completed cells, `cellGeomVerts` ranges from 14 to 86 and `hasEnvCellGeom=True`. Geometry is non-empty when the upload completes. The 26 failing cells fail BEFORE batch construction, so this isn't an empty-batch problem.
### H3 — Cull bug ❌ RULED OUT
`[indoor-cull]` lines for cell-room entities show `visibleCellIds-miss` reasons only for cells in *other* landblocks (`0xA9B0`, `0xA9B2`, `0xA9B3` etc., visible neighbours of Holtburg but outside the active visibility set). For Holtburg's own cells, the walk probe shows `landblockVisible=true aabbVisible=true cellInVis=true` consistently — the dispatcher reaches them.
### H4 — Double-spawn ❌ RULED OUT
For completed cells, `[indoor-lookup]` reports modest `partCount` values (1 to 46) matching the number of static objects + 1 cell-geom part. No evidence of duplicate registration.
### H5 — Transform double-apply ❌ RULED OUT
`[indoor-xform]` consistently shows `entityWorldT=(0,0,0)`, `partT=(0,0,0)`, and `composedT==meshRefT`. The composed translation equals the cell's world origin — no double-apply. Sample:
```
[indoor-xform] cellGeomId=0x00000001A9B40101
entityWorldT=(0.00,0.00,0.00)
meshRefT=(84.09,131.54,66.02)
partT=(0.00,0.00,0.00)
composedT=(84.09,131.54,66.02)
```
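The comparison applied to each `[indoor-xform]` sample can be written down as a small check. This is a sketch under a simplifying assumption: with identity rotation, applying the cell transform twice would double the translation, so `composedT` would equal twice `meshRefT`. The function name `classify_translation` and the tolerance are hypothetical and not part of the probe.

```python
def classify_translation(mesh_ref_t, composed_t, tol=0.01):
    """Classify a composed translation as 'single', 'double', or 'other'.

    Assumes identity rotation, so a double-applied transform shows up
    as exactly twice the mesh-ref translation.
    """
    def close(a, b):
        return all(abs(x - y) <= tol for x, y in zip(a, b))

    if close(composed_t, mesh_ref_t):
        return "single"
    doubled = tuple(2 * c for c in mesh_ref_t)
    if close(composed_t, doubled):
        return "double"
    return "other"
```

For the sample above, `classify_translation((84.09, 131.54, 66.02), (84.09, 131.54, 66.02))` returns `"single"`; a double-apply would have produced roughly `(168.18, 263.08, 132.04)` instead.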
### H6 — MeshRefs structure mismatch ❌ RULED OUT
For uploaded cells, `[indoor-lookup]` shows `hit=True isSetup=True partsHit≈partCount`. The dispatcher correctly traverses the Setup parts. Sample: `[indoor-lookup] cellId=0xA9B40101 hit=True isSetup=True partCount=10 hasEnvCellGeom=True partsHit=9 partsMiss=1`.

---

## What's special about the 26 failing cells?
Unknown from Phase 1 probes alone. Possible causes (each verifiable with one or two more targeted probes or code reads in Phase 2):
1. **Missing Environment dat record**: `envCell.EnvironmentId` points at an Environment id that `_dats.Portal.TryGet<Environment>` can't find. WB's `PrepareEnvCellMeshData` line 1245 would silently return without populating `cellGeometry`, and the outer Setup path would then produce a result with `hasBounds=false` and an empty `parts` list. That would still emit a `completed` line, just with empty data, **so this would be H2-shaped, not H1-shaped.** Ruled out.
2. **Exception in `PrepareCellStructMeshData`** — texture decode failure, surface ID resolution failure, polygon enumeration crash. The catch-block at `PrepareMeshData` line 589 silently swallows. **Most likely cause.**
3. **`ResolveId(envCellId)` returns empty** — WB's `DefaultDatReaderWriter` can't find the cell record in its loaded dats. Unlikely (all region cells are loaded at construction), but possible if `_wbDats.Portal.TryGet<Region>` skipped the region containing 0xA9B4.
4. **Race condition**: `PrepareMeshData` runs on a background worker; if the same cell id is requested twice in quick succession before the first completes, the second `TryAdd` to `_preparationTasks` returns false and silently skips. Unlikely given LandblockSpawnAdapter's per-landblock dedup at line 68 of `LandblockSpawnAdapter.cs`, but possible if multiple landblocks share state.
---
## Phase 2 — recommended approach
The fix shape per the spec table maps H1 to: *"Add WB logging or pre-check the dat resolution path in WbMeshAdapter."*
Concrete Phase 2 plan:
1. **Targeted probe extension** — add a SECOND probe inside the failing path. Either patch WB to surface the swallowed exception (`PrepareMeshData` line 589 catch block) OR wrap the `PrepareMeshDataAsync` call in WbMeshAdapter with our own try/catch + task continuation that logs the actual `Exception` for EnvCell ids. One launch with this captures the actual failure reason for the 26 cells.
2. **Match the failure to a fix** — once we know the failure mode:
   - If a texture/surface bug → file as a Phase 2 WB-fork patch.
   - If a missing dat reference → check whether the user's `client_cell_1.dat` is up to date.
   - If an exception in our code path → fix the specific bug.
3. **Verify** by re-launching with the probe and confirming `[indoor-upload] completed` appears for previously-missing cells (e.g., `0xA9B40100`).
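The catch-and-log wrapper from step 1 follows a common pattern: await the preparation call, report any exception with the cell id attached, and separately report a null result. The sketch below illustrates that pattern in Python only; the real change would live in the C# `WbMeshAdapter`. The function name `prepare_logged`, the `log` callback, and the `failed` / `null` log tags are all hypothetical.

```python
import asyncio

async def prepare_logged(prepare, cell_id, log):
    """Await prepare(cell_id); report exceptions and null results via log().

    `prepare` stands in for the asynchronous mesh-preparation call and
    `log` for whatever sink the probe writes to (both are placeholders).
    """
    try:
        mesh = await prepare(cell_id)
    except Exception as exc:
        # Surface the failure instead of letting it vanish silently.
        log(f"[indoor-upload] failed cellId=0x{cell_id:08X} "
            f"{type(exc).__name__}: {exc}")
        return None
    if mesh is None:
        # No exception escaped, but nothing was produced either.
        log(f"[indoor-upload] null cellId=0x{cell_id:08X}")
    return mesh
```

Distinguishing the two outcomes matters here: if the exception is swallowed inside WB before it reaches the caller, an outer wrapper like this would only ever observe the null case, which would point toward patching the WB catch block instead.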
---
## Phase 1 leftover observations
- The `IsEnvCellId(ulong id) => (id & 0xFFFFu) >= 0x0100u` helper has false positives on GfxObj IDs whose lower 16 bits (the bits the mask actually tests) happen to be ≥ 0x0100 (e.g., `0x01001841`). This polluted ~95% of probe emissions with non-cell entities. Recommend tightening the helper to also require `(id >> 24) != 0x01 && (id >> 24) != 0x02` (and any other DBObj-type prefixes), OR `(id >> 16) > 0x00FF` to require a real landblock prefix. Note that `0x01001841 >> 16` is `0x0100`, which already exceeds `0x00FF`, so the second form on its own does not reject that example.
- The lookup probe's rate-limit namespace separation (Task 7 fix) works correctly — uploaded cells DO appear in the hit set when their lookup probe fires.
- Cell-room entities have `Position=(0,0,0)` with the cell transform in `MeshRef.PartTransform`. The dispatcher's `aabbVisible` filter passed for them, presumably because `RefreshAabb()` computes a sensible world AABB from the mesh-ref's transform or because the landblock equals `neverCullLandblockId`. Worth a brief audit if there's any reason to believe the cell-room AABB is wrong.
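The `IsEnvCellId` observation above can be checked with plain bit arithmetic. The sketch below transcribes the current helper and the two suggested tightenings into Python and evaluates them on the two IDs mentioned in this document. The function names are hypothetical, and each candidate is assumed to be combined with the existing low-16-bit test.

```python
def is_env_cell_current(i):
    # Transcription of the current helper: low 16 bits >= 0x0100.
    return (i & 0xFFFF) >= 0x0100

def is_env_cell_type_prefix(i):
    # Candidate A: additionally reject the 0x01 / 0x02 top-byte prefixes.
    return is_env_cell_current(i) and (i >> 24) not in (0x01, 0x02)

def is_env_cell_landblock_bound(i):
    # Candidate B: additionally require (id >> 16) > 0x00FF.
    return is_env_cell_current(i) and (i >> 16) > 0x00FF

GFXOBJ_ID = 0x01001841   # false-positive example from the text
ENV_CELL_ID = 0xA9B40100  # first interior cell in Holtburg

results = {
    check.__name__: (check(GFXOBJ_ID), check(ENV_CELL_ID))
    for check in (
        is_env_cell_current,
        is_env_cell_type_prefix,
        is_env_cell_landblock_bound,
    )
}
# results maps each check to (accepts GfxObj id, accepts EnvCell id):
#   is_env_cell_current         -> (True, True)
#   is_env_cell_type_prefix     -> (False, True)
#   is_env_cell_landblock_bound -> (True, True)
```

On these two IDs only the type-prefix form separates the GfxObj from the EnvCell. Whether excluding top bytes `0x01` and `0x02` is safe for every real landblock is a separate question that this sketch does not answer.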