feat(render): Phase A8 — indoor visibility + streaming fixes batch

Lands the working A8 indoor-rendering and streaming fixes accumulated this
session. The user has spot-checked these visually (e.g. lifestone /
translucent meshes confirmed fine under the FrontFace flip; bridge / wall /
collision regressions confirmed fixed after travel), but not every path has
been exhaustively gated. The cellar-flap defect remains OPEN and will be solved
the retail-faithful way via a dedicated brainstorm (see handoff docs).

Rendering core (reviewed, high confidence):
- EnvCellRenderer SSBO stride fix: upload packed Matrix4x4[] (64B) instead of
  the 80B CPU InstanceData struct the shader never expected — fixes the
  transform/texture "explosion" for any draw with >1 instance (cells that
  dedupe to a shared cellGeomId). Real root cause.
- WB-style global FrontFace(CW) + per-batch CullMode carried through the MDI
  layout (GroupKey + BuildIndirectArrays + DrawIndirectRange split into
  same-cull runs with absolute uDrawIDOffset per run).
- EntitySet partitioning (IndoorPass / OutdoorScenery / LiveDynamic) +
  WorldEntity.BuildingShellAnchorCellId so building shells scope to their
  dat-derived building cell instead of rendering everywhere.
- RenderOutsideInAcdream (look into buildings from outside) +
  CollectVisiblePortalBuildings frustum cull of portal bounds.
- Sky-when-inside-building + per-cell audit probe + GL-state probe.

Streaming / perf (test-covered; not independently code-reviewed this session):
- Near/far priority queues so near work wins over far; PromoteToNear carries
  full landblock + mesh data; LandblockEntriesWithoutAnimatedIndex avoids
  rebuilding the animated-lookup dict in the hot draw path. Fixes the
  bridge-not-appearing / missing-walls / broken-collision-after-travel
  regressions and improves post-transition FPS.

Tooling + docs:
- tools/A8CellAudit: offline dat cell/portal/building dumper (portals +
  buildings modes) — reproduces the cellar-flap investigation with no launch.
- docs/research cellar-flap root-cause + option-2 handoff (the didInsideStencil
  double-duty finding + the WB-recursive design decision + brainstorm prompt),
  entity-taxonomy, replan, issue-78 visibility investigation.

Diagnostics retained on purpose: ACDREAM_A8_DIAG_* gates, portal_stencil.vert
provisional pos.w clamp, and the probe families are kept (env-var gated, zero
cost when off) because the pending option-2 cellar-flap brainstorm needs them.
Strip in the option-2 ship commit.

Indoor branch stays behind ACDREAM_A8_INDOOR_BRANCH=1 (default off = pre-A8
visual). Build green; App tests + Core (streaming/dispatcher/loader) tests pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Erik 2026-05-29 10:14:50 +02:00
parent e415bb3863
commit 5dc4140c11
38 changed files with 3965 additions and 277 deletions

@@ -0,0 +1,120 @@
# A8 cellar-flap — structured debugging root cause (2026-05-28 PM)
## Method
Systematic-debugging Phases 1-3, all evidence gathered **offline** via the
`tools/A8CellAudit` tool (extended with a `portals` mode) — no live launches
needed. Deterministic, instant, reproducible.
## Phase 1 — evidence
Scenario (from `launch-a8-probe-normal-20260528-194536.out.log`):
- Camera in cell `0xA9B40171`, `inside=True really=True`.
- `camBldgs=[0xA]`, `visN=7 [0x16F,0x170,0x171,0x172,0x173,0x174,0x175]`.
- Portal stencil mask = 12 verts (not the old over-punch case).
- Bisection (prior session): writer is **Step 4 content**; disabling Step-2
punch does **not** fix it.
Offline audit findings:
**Building grouping** (`A8CellAudit buildings 0xA9B40000`):
```
buildingOrdinal=10 registryId=0xA model=0x01002232 portalCells=[0xA9B4016F,0xA9B40170]
```
Building `0xA`'s LandBlockInfo seed = `{0x16F, 0x170}`. `BuildingLoader` then
BFS-expands through interior portals → all 7 cells (incl. the cellar). The
BFS matches WB's `PortalService` (same algorithm), so the grouping is not the
divergence.
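
The seed-then-expand step can be sketched as a plain breadth-first search. This is a minimal illustration, not `BuildingLoader`'s actual code: the function name and the adjacency table are assumptions. Only the `0x170` <-> `0x171` link comes from the audit output; the other edges are placeholders chosen to agree with the per-cell interior-portal counts.

```python
from collections import deque

# Interior-portal adjacency, keyed by the low 16 bits of the cell id.
# ASSUMPTION: only 0x170 <-> 0x171 is taken from the audit; every other
# edge is a hypothetical placeholder so the example has all 7 cells.
INTERIOR_PORTALS = {
    0x16F: [0x172],
    0x170: [0x171],
    0x171: [0x170, 0x172, 0x173],
    0x172: [0x171, 0x16F],
    0x173: [0x171, 0x174],
    0x174: [0x173, 0x175],
    0x175: [0x174],
}


def expand_building_cells(seed_cells, interior_portals):
    """Breadth-first expansion from the seed cells through interior portals."""
    seen = set(seed_cells)
    queue = deque(seed_cells)
    while queue:
        cell = queue.popleft()
        for neighbour in interior_portals.get(cell, ()):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return sorted(seen)


# Seed = the two portal cells listed for building 0xA.
cells = expand_building_cells([0x16F, 0x170], INTERIOR_PORTALS)
print([hex(c) for c in cells])
```

Because the search follows every interior portal regardless of direction or floor level, the cellar cells end up in the same group as the ground-floor cells.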
**Exit-portal ownership** (`A8CellAudit portals ...`):
| Cell | exit portals (0xFFFF) | interior | role |
|------|----|----|------|
| `0x16F` | **1** | 1 | ground floor (window/door) |
| `0x170` | **1** | 1 | ground floor (window/door) |
| `0x171` (camera) | **0** | 3 | cellar |
| `0x172`-`0x175` | **0** | 1-2 | cellar rooms |
So the 12-vert mask = `0x16F` exit (6v) + `0x170` exit (6v). **The cellar
camera (zero exit portals) is marking the two ground-floor windows.**
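
The vertex arithmetic can be checked with a small sketch. The helper name and the dictionary layout are assumptions made for illustration; the cell list is the `visN=7` set from the log and the 6-vertex figures are the ones quoted above.

```python
# Vertex count of each exit portal per cell (6v each, from the audit above).
# Cells without an entry own no exit portals.
EXIT_PORTAL_VERTS = {0x16F: [6], 0x170: [6]}

# The visN=7 set reported for the cellar camera.
VISIBLE_CELLS = [0x16F, 0x170, 0x171, 0x172, 0x173, 0x174, 0x175]
CAMERA_CELL = 0x171


def flat_mask_vertex_count(cells, exit_portal_verts):
    """Flat marking: sum the exit-portal vertices of every listed cell."""
    return sum(sum(exit_portal_verts.get(cell, [])) for cell in cells)


flat_total = flat_mask_vertex_count(VISIBLE_CELLS, EXIT_PORTAL_VERTS)
camera_only = flat_mask_vertex_count([CAMERA_CELL], EXIT_PORTAL_VERTS)
print(flat_total, camera_only)
```

The flat total is the 12-vertex mask seen in the probe, while the camera cell on its own contributes nothing.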
**Topology**:
```
0x171.portal[0] -> 0x170 (stairwell/hatch, polyId 54)
0x170.portal[1] -> 0x171 (polyId 5)
0x170.portal[0] -> EXIT (window/door to outside, polyId 4)
```
Cellar connects directly up to ground floor `0x170`; `0x16F` is one further hop.
**Occluder geometry** (`A8CellAudit 0xA9B40170` / `0xA9B40171`):
- `0x170` floor poly `0x0002` (n.Z=+1) **emits** — the cellar's ceiling/occluder exists.
- `0x171` has a ceiling `0x0003` (n.Z=-1, emits) AND three `NoPos` polys
`0x0036/0x0037/0x0038` (surface `0x080000DF`) that do **not** emit —
`0x0038` is a ceiling-plane poly = the **stairwell hole** up to the ground floor.
## Phase 2 — pattern vs WB
WB `RenderInsideOut` marks the building's exit portals (flat — same as us) and
relies on **Step-3 cell depth** to occlude them: terrain only survives where the
punched/cleared far-depth isn't overwritten by rendered cell geometry.
Our code matches that structure. The difference that produces the visible flap:
WB's outside view through a portal is the world geometrically behind that
portal; from a cellar, the only un-occluded opening is the **stairwell hole**
(`0x0038`, not rendered). Through that hole, stencil=1 (ground-floor window
marked) and depth=far → **Step 4 draws the entire outdoor world (terrain +
buildings) through the hole**, not a window-sized sliver. The two ground-floor
windows are 1-2 BFS hops above the camera and should contribute essentially
nothing from the cellar, but their full silhouettes are marked.
"Disable Step-2 punch doesn't fix it" is explained: the leak pixels are the
stairwell hole, which has **cleared (far) depth** regardless of the punch
because no cell geometry covers it — terrain passes `DepthFunc.Less` either way.
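
A per-pixel toy model makes the punch-independence explicit. It is a simplified sketch under stated assumptions: depth is cleared to far at frame start, the Step-2 punch only resets marked pixels to far, Step 3 writes cell geometry with a less-than test, and the depth values are invented for illustration.

```python
FAR_DEPTH = 1.0
TERRAIN_DEPTH = 0.6  # hypothetical depth of an outdoor terrain fragment


def depth_after_step3(cell_geometry_depth, punch_enabled):
    """Depth-buffer value at one stencil-marked pixel after Steps 2 and 3."""
    depth = FAR_DEPTH  # frame clear
    if punch_enabled:
        depth = FAR_DEPTH  # Step 2 punch: reset marked pixel to far
    if cell_geometry_depth is not None:
        depth = min(depth, cell_geometry_depth)  # Step 3: cell geometry wins
    return depth


def step4_draws_terrain(stencil, depth_buffer, terrain_depth):
    """Step 4: terrain needs stencil == 1 and must pass a Less depth test."""
    return stencil == 1 and terrain_depth < depth_buffer


# Two marked pixels: one covered by the cellar ceiling, one in the
# stairwell hole where no cell geometry is emitted (None).
PIXELS = {"cellar ceiling": 0.3, "stairwell hole": None}

results = {}
for punch in (True, False):
    for name, geometry_depth in PIXELS.items():
        depth = depth_after_step3(geometry_depth, punch)
        results[(name, punch)] = step4_draws_terrain(1, depth, TERRAIN_DEPTH)
        print(f"punch={punch} {name}: terrain drawn = {results[(name, punch)]}")
```

In this model the hole pixel is painted with or without the punch, and the ceiling pixel is protected either way, which matches the bisection result.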
## Phase 3 — single hypothesis (root cause)
**The inside-out exit-portal stencil mask is built by flat-marking the exit
portals of every visibility-BFS-reached cell. From the cellar, the BFS reaches
the ground-floor cells, whose windows get full-silhouette-marked. Where the
cellar's stairwell hole leaves those silhouettes un-occluded, Step 4 paints the
whole outdoor world through them. There is no constraint tying a deeper cell's
exit portal to the portal chain (here: the narrow stairwell) through which its
cell became visible.**
This is a flat-vs-constrained masking gap. Not a depth bug (occluders emit and
render), not the Step-2 punch, not the camera-side filter (the cellar camera is
geometrically on the interior side of a ground-floor window's plane, so the
per-portal filter passes it).
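
One way to picture the missing constraint is screen-space clipping: a deeper portal may only mark pixels that also lie inside every portal on the chain leading to it. The sketch below uses axis-aligned screen rectangles as a stand-in for real portal polygons; the function names and all coordinates are hypothetical, and it is not a description of WB's implementation.

```python
def intersect(a, b):
    """Intersect two (x0, y0, x1, y1) rects; return None if they do not overlap."""
    x0, y0 = max(a[0], b[0]), max(a[1], b[1])
    x1, y1 = min(a[2], b[2]), min(a[3], b[3])
    if x0 >= x1 or y0 >= y1:
        return None
    return (x0, y0, x1, y1)


def area(rect):
    return 0 if rect is None else (rect[2] - rect[0]) * (rect[3] - rect[1])


def constrained_region(exit_rect, chain_rects):
    """Clip an exit portal's rect by every portal rect on the chain to it."""
    region = exit_rect
    for chain_rect in chain_rects:
        if region is None:
            return None
        region = intersect(region, chain_rect)
    return region


# Hypothetical screen rects: a large window silhouette and a narrow stairwell.
window = (400, 80, 700, 400)
stairwell = (380, 60, 440, 110)

flat = window                                       # flat marking: full silhouette
clipped = constrained_region(window, [stairwell])   # chain-constrained marking
print(area(flat), area(clipped))

# A window that does not overlap the stairwell at all marks nothing.
hidden = constrained_region((100, 50, 300, 200), [stairwell])
print(hidden)
```

With flat marking the mask area depends only on the window; with the constraint it can never exceed the narrowest portal on the chain.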
## Phase 4 — fix options
1. **Camera-cell-scoped mask (minimal, conservative).** Mark only the camera
cell's own exit portals. Cellar (0 exit portals) → empty mask → no leak;
windowed room → marks its own windows. **Risk:** loses daylight through an
*adjacent* cell's window seen across a doorway in multi-cell ground-floor
rooms (e.g. the inn) — a visible-but-minor regression, and the flat approach
was wrong there anyway.
2. **Vertical-portal-aware scoping (targeted).** Don't propagate exit-portal
marking across a floor/ceiling (vertical-normal) portal. The cellar→ground
stairwell is a horizontal-plane portal; suppressing inheritance across it
stops the cellar from marking ground-floor windows while preserving
same-level multi-cell rooms. Needs per-portal polygon-normal classification.
3. **WB recursive/constrained portal masking (faithful, largest).** Constrain
each deeper portal's stencil to the screen region of the portal chain leading
to it. Correct for all cases (cellar + multi-cell rooms) but a substantial
port of WB's recursive RenderInsideOut.
**Recommendation:** option 2 is the best correctness/effort trade — it fixes the
cellar without the inn regression risk of option 1, and is a principled scoping
rule (don't inherit a different vertical level's exterior openings) rather than a
band-aid. Option 3 remains the eventual faithful target if cross-level portal
visibility ever needs to be exact.
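
A rough sketch of the option-2 rule follows, assuming each interior portal carries its polygon normal. The layout is a toy three-cell building, not the audited one; the cell and window names, the `min_abs_z` threshold, and the function names are all hypothetical.

```python
import math
from collections import deque


def is_floor_or_ceiling_portal(normal, min_abs_z=0.7):
    """True when the portal polygon's normal is mostly vertical."""
    nx, ny, nz = normal
    length = math.sqrt(nx * nx + ny * ny + nz * nz)
    return length > 0.0 and abs(nz) / length >= min_abs_z


# Toy layout: a cellar under ground_a, and ground_b next to ground_a.
# Each interior portal is (neighbour cell, portal polygon normal).
INTERIOR = {
    "cellar": [("ground_a", (0.0, 0.0, 1.0))],
    "ground_a": [("cellar", (0.0, 0.0, -1.0)), ("ground_b", (1.0, 0.0, 0.0))],
    "ground_b": [("ground_a", (-1.0, 0.0, 0.0))],
}
EXITS = {"ground_a": ["window_a"], "ground_b": ["window_b"]}


def exit_portals_to_mark(camera_cell):
    """Collect exit portals reachable without crossing a floor/ceiling portal."""
    seen = {camera_cell}
    queue = deque([camera_cell])
    marked = []
    while queue:
        cell = queue.popleft()
        marked.extend(EXITS.get(cell, []))
        for neighbour, normal in INTERIOR.get(cell, []):
            if is_floor_or_ceiling_portal(normal):
                continue  # do not inherit another level's exterior openings
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return marked


print(exit_portals_to_mark("cellar"))
print(exit_portals_to_mark("ground_a"))
```

In this toy layout the cellar camera marks nothing, while a ground-floor camera still marks the adjacent room's window across the doorway, which is the behaviour option 1 would lose.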
## Reproduction / verification assets
- `tools/A8CellAudit` `portals` mode (added this session) dumps any cell's
`CellPortals` offline. `A8CellAudit buildings <lb> <radius>` dumps
building→cell grouping. These make the whole investigation re-runnable in
seconds with zero launches.