diff --git a/docs/research/2026-05-19-indoor-cell-rendering-probe-capture.md b/docs/research/2026-05-19-indoor-cell-rendering-probe-capture.md new file mode 100644 index 0000000..a41577b --- /dev/null +++ b/docs/research/2026-05-19-indoor-cell-rendering-probe-capture.md @@ -0,0 +1,105 @@ +# Indoor Cell Rendering — Phase 1 Probe Capture + +**Date:** 2026-05-19 +**Probe:** Phase 1 diagnostic probes from spec `2026-05-19-indoor-cell-rendering-fix-design.md` +**Capture conditions:** `ACDREAM_PROBE_INDOOR_ALL=1`, walk into Holtburg (landblock `0xA9B4`). +**Verdict:** Hypothesis **H1 (WB silently returns null from `PrepareEnvCellMeshData`)** is **CONFIRMED** for ~21% of Holtburg's EnvCells, including the first interior cell `0xA9B40100`. + +--- + +## Probe line breakdown (real EnvCell-format IDs only) + +| Probe | Count | Notes | +|---|---|---| +| `[indoor-upload] requested` (0xA9B4 cells) | 123 (unique) | LandblockSpawnAdapter triggers PrepareMeshDataAsync for every cell in Holtburg landblock. | +| `[indoor-upload] completed` (0xA9B4 cells) | **97** (unique) | **26 cells never produce a completed line.** | +| `[indoor-walk]` (cell-room entities, 0xA9B4) | 27,631 | Cell-room entities pass `landblockVisible` + `aabbVisible` + `cellInVis` filters. Walk path is healthy. | +| `[indoor-lookup]` (0xA9B4 cells) | 6,067 | Total dispatcher lookups for Holtburg cells. | +| `[indoor-lookup] hit=True` | 45 | Only ~0.7% hit rate — the rate-limited probe captures one snapshot per cell after rendering stabilizes. | +| `[indoor-lookup] hit=False` | 6,022 | Most are pre-upload-completion frames + the 26 silently-failing cells. | +| `[indoor-xform]` | 97 | One per successfully-uploaded cell. Cell-geom SetupPart's render data is non-null and reaches `ComposePartWorldMatrix`. | + +## Hypotheses + +### H1 — WB silently returns null from `PrepareEnvCellMeshData` ✅ CONFIRMED + +26 out of 123 Holtburg cells (21%) get an `[indoor-upload] requested` line but **never** produce an `[indoor-upload] completed` line. This is the classic H1 signature: WB's `ObjectMeshManager.PrepareMeshData` either returns null (line 568, 583, 592 of `ObjectMeshManager.cs`) or its catch-block swallows an exception at line 589-592. The pending `meshData` never reaches `StagedMeshData`, so `Tick()`'s drain never sees it, no completion line emits. + +**First 15 cells with no completion:** + +``` +0xA9B40100, 0xA9B40111, 0xA9B40112, 0xA9B40117, 0xA9B4011B, +0xA9B40121, 0xA9B40123, 0xA9B40129, 0xA9B4012A, 0xA9B4012E, +0xA9B40138, 0xA9B4013F, 0xA9B40141, 0xA9B40143, 0xA9B40147 +``` + +`0xA9B40100` is **the first indoor cell** in Holtburg landblock. Almost certainly the inn entry or another major building's anchor cell — exactly where the user reported "floor missing." + +### H2 — Empty batches ❌ RULED OUT + +For successfully-completed cells, `cellGeomVerts` ranges 14–86 and `hasEnvCellGeom=True`. Geometry is non-empty when the upload completes. The 26 failing cells fail BEFORE batch construction, so this isn't an empty-batch problem. + +### H3 — Cull bug ❌ RULED OUT + +`[indoor-cull]` lines for cell-room entities show `visibleCellIds-miss` reasons only for cells in *other* landblocks (`0xA9B0`, `0xA9B2`, `0xA9B3` etc., visible neighbours of Holtburg but outside the active visibility set). For Holtburg's own cells, the walk probe shows `landblockVisible=true aabbVisible=true cellInVis=true` consistently — the dispatcher reaches them. + +### H4 — Double-spawn ❌ RULED OUT + +For completed cells, `[indoor-lookup]` reports modest `partCount` values (1–46) matching the number of static objects + 1 cell-geom part. No evidence of duplicate registration. + +### H5 — Transform double-apply ❌ RULED OUT + +`[indoor-xform]` consistently shows `entityWorldT=(0,0,0)`, `partT=(0,0,0)`, and `composedT==meshRefT`. The composed translation equals the cell's world origin — no double-apply. Sample: + +``` +[indoor-xform] cellGeomId=0x00000001A9B40101 + entityWorldT=(0.00,0.00,0.00) + meshRefT=(84.09,131.54,66.02) + partT=(0.00,0.00,0.00) + composedT=(84.09,131.54,66.02) +``` + +### H6 — MeshRefs structure mismatch ❌ RULED OUT + +For uploaded cells, `[indoor-lookup]` shows `hit=True isSetup=True partsHit≈partCount`. The dispatcher correctly traverses the Setup parts. Sample: `[indoor-lookup] cellId=0xA9B40101 hit=True isSetup=True partCount=10 hasEnvCellGeom=True partsHit=9 partsMiss=1`. + +--- + +## What's special about the 26 failing cells? + +Unknown from Phase 1 probes alone. Possible causes (each verifiable with one or two more targeted probes or code reads in Phase 2): + +1. **Missing Environment dat record** — `envCell.EnvironmentId` points at an Environment id that `_dats.Portal.TryGet` can't find. WB's `PrepareEnvCellMeshData` line 1245 would silently return without populating `cellGeometry`, then the outer Setup path produces a result with `hasBounds=false` and an empty `parts` list. Hmm, but that would still produce a `completed` line — just with empty data. **So this would be H2-shaped, not H1-shaped.** Ruled out. + +2. **Exception in `PrepareCellStructMeshData`** — texture decode failure, surface ID resolution failure, polygon enumeration crash. The catch-block at `PrepareMeshData` line 589 silently swallows. **Most likely cause.** + +3. **`ResolveId(envCellId)` returns empty** — WB's `DefaultDatReaderWriter` can't find the cell record in its loaded dats. Unlikely (all region cells are loaded at construction), but possible if `_wbDats.Portal.TryGet` skipped the region containing 0xA9B4. + +4. **Race condition** — `PrepareMeshData` runs on a background worker; if the same cell id is requested twice in fast succession before the first completes, the second `TryAdd` to `_preparationTasks` returns false and silently skips. Unlikely given LandblockSpawnAdapter's per-landblock dedup at line 68 of `LandblockSpawnAdapter.cs`, but possible if multiple landblocks share state. + +--- + +## Phase 2 — recommended approach + +The fix shape per the spec table maps H1 to: *"Add WB logging or pre-check the dat resolution path in WbMeshAdapter."* + +Concrete Phase 2 plan: + +1. **Targeted probe extension** — add a SECOND probe inside the failing path. Either patch WB to surface the swallowed exception (`PrepareMeshData` line 589 catch block) OR wrap the `PrepareMeshDataAsync` call in WbMeshAdapter with our own try/catch + task continuation that logs the actual `Exception` for EnvCell ids. One launch with this captures the actual failure reason for the 26 cells. + +2. **Match the failure to a fix** — once we know the failure mode: + - If a texture/surface bug → file as a Phase 2 WB-fork patch. + - If a missing dat reference → check whether the user's `client_cell_1.dat` is up to date. + - If an exception in our code path → fix the specific bug. + +3. **Verify** by re-launching with the probe and confirming `[indoor-upload] completed` appears for previously-missing cells (e.g., `0xA9B40100`). + +--- + +## Phase 1 leftover observations + +- The `IsEnvCellId(ulong id) => (id & 0xFFFFu) >= 0x0100u` helper has false positives on GfxObj IDs whose lower 24 bits happen to be ≥ 0x0100 (e.g., `0x01001841`). This polluted ~95% of probe emissions with non-cell entities. Recommend tightening the helper to also require `(id >> 24) != 0x01 && (id >> 24) != 0x02` (and any other DBObj-type prefixes), OR `(id >> 16) > 0x00FF` to require a real landblock prefix. + +- The lookup probe's rate-limit namespace separation (Task 7 fix) works correctly — uploaded cells DO appear in the hit set when their lookup probe fires. + +- Cell-room entities have `Position=(0,0,0)` with the cell transform in `MeshRef.PartTransform`. The dispatcher's `aabbVisible` filter passed for them, presumably because `RefreshAabb()` computes a sensible world AABB from the mesh-ref's transform or because the landblock equals `neverCullLandblockId`. Worth a brief audit if there's any reason to believe the cell-room AABB is wrong.