acdream/docs/research/2026-05-25-issue-78-visibility-culling-investigation.md
Erik 5dc4140c11 feat(render): Phase A8 — indoor visibility + streaming fixes batch
Lands the working A8 indoor-rendering and streaming fixes accumulated this
session. User has verified these visually to some degree (e.g. lifestone /
translucent meshes confirmed fine under the FrontFace flip; bridge / wall /
collision regressions confirmed fixed after travel); not every path has been
exhaustively gated. The cellar-flap defect remains OPEN and will be solved
the retail-faithful way via a dedicated brainstorm (see handoff docs).

Rendering core (reviewed, high confidence):
- EnvCellRenderer SSBO stride fix: upload packed Matrix4x4[] (64B) instead of
  the 80B CPU InstanceData struct the shader never expected — fixes the
  transform/texture "explosion" for any draw with >1 instance (cells that
  dedupe to a shared cellGeomId). Real root cause.
- WB-style global FrontFace(CW) + per-batch CullMode carried through the MDI
  layout (GroupKey + BuildIndirectArrays + DrawIndirectRange split into
  same-cull runs with absolute uDrawIDOffset per run).
- EntitySet partitioning (IndoorPass / OutdoorScenery / LiveDynamic) +
  WorldEntity.BuildingShellAnchorCellId so building shells scope to their
  dat-derived building cell instead of rendering everywhere.
- RenderOutsideInAcdream (look into buildings from outside) +
  CollectVisiblePortalBuildings frustum cull of portal bounds.
- Sky-when-inside-building + per-cell audit probe + GL-state probe.

Streaming / perf (test-covered; not independently code-reviewed this session):
- Near/far priority queues so near work wins over far; PromoteToNear carries
  full landblock + mesh data; LandblockEntriesWithoutAnimatedIndex avoids
  rebuilding the animated-lookup dict in the hot draw path. Fixes the
  bridge-not-appearing / missing-walls / broken-collision-after-travel
  regressions and improves post-transition FPS.

Tooling + docs:
- tools/A8CellAudit: offline dat cell/portal/building dumper (portals +
  buildings modes) — reproduces the cellar-flap investigation with no launch.
- docs/research cellar-flap root-cause + option-2 handoff (the didInsideStencil
  double-duty finding + the WB-recursive design decision + brainstorm prompt),
  entity-taxonomy, replan, issue-78 visibility investigation.

Diagnostics retained on purpose: ACDREAM_A8_DIAG_* gates, portal_stencil.vert
provisional pos.w clamp, and the probe families are kept (env-var gated, zero
cost when off) because the pending option-2 cellar-flap brainstorm needs them.
Strip in the option-2 ship commit.

Indoor branch stays behind ACDREAM_A8_INDOOR_BRANCH=1 (default off = pre-A8
visual). Build green; App tests + Core (streaming/dispatcher/loader) tests pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-29 10:14:50 +02:00


Issue #78 + cellar-stairs visibility culling — investigation report

Date: 2026-05-25 PM (continuation session)
Status: REPORT-ONLY. Awaiting user (a) camera-rotation falsification test and (b) approach selection before any code work.
Predecessor handoff: docs/research/2026-05-25-issue-100-shipped-and-culling-handoff.md


Symptom

Two visible defects share one root cause:

  1. Cellar-stairs (observed 2026-05-25 PM, evidence for #78): standing in a Holtburg cottage cellar with the camera at certain angles, the outdoor terrain mesh renders as a sharp-edged grass rectangle covering the cellar stair geometry. Clears when camera moves closer (cottage walls + stair treads geometrically occlude). Gameplay unaffected — player can walk up/down normally.
  2. Inn-wall stabs (#78, filed 2026-05-19): standing inside the Holtburg Inn looking at the floor or walls, the user sees other buildings in the distance at their correct world position + scale, visible THROUGH the floor and walls.

The user has NOT yet run the camera-rotation falsification test (Phase 1a of the handoff). Until they do, the diagnosis below is "high confidence" but not certain.

Sibling: #95 (dungeon portal-graph blowup) is the same visibility subsystem but a different specific failure (over-inclusion). scen5 log shows visibleCells per cell reaching 295 (worse than the 135-145 filed).


Hypotheses (ranked)

H1 — Indoor-camera gate missing on outdoor render passes (HIGH confidence)

Mechanism: TerrainModernRenderer.Draw and WbDrawDispatcher render outdoor geometry unconditionally, regardless of whether the camera is inside an EnvCell. Retail and WorldBuilder both gate the outdoor passes on the result of the indoor portal walk. acdream has no equivalent gate.
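The gating idea can be sketched in a few lines. This is a language-neutral Python illustration, not acdream or retail code: the graph shape (`portals` as a dict of cell id to portal target ids) and both function names are assumptions made for the example, and frustum and clip tests on each portal are left out. The only detail taken from the retail anchor is the outdoor sentinel value 0xFFFFFFFF.

```python
from collections import deque

# Portal target meaning "leads outdoors", per the retail anchor.
OUTDOOR_SENTINEL = 0xFFFFFFFF


def count_outside_views(start_cell, portals):
    """Walk the portal graph from start_cell and count portals that lead outdoors.

    `portals` maps a cell id to the list of cell ids its portals target.
    This shape is hypothetical; per-portal visibility tests are omitted.
    """
    seen = {start_cell}
    queue = deque([start_cell])
    outside_views = 0
    while queue:
        cell = queue.popleft()
        for target in portals.get(cell, ()):
            if target == OUTDOOR_SENTINEL:
                outside_views += 1
            elif target not in seen:
                seen.add(target)
                queue.append(target)
    return outside_views


def should_draw_outdoor(camera_cell, portals):
    """Outdoor passes run when the camera is outdoors or some portal sees outside."""
    if camera_cell is None:  # camera is not inside any cell
        return True
    return count_outside_views(camera_cell, portals) > 0
```

With a sealed two-cell cellar the walk finds no outdoor portal and the outdoor pass is skipped; adding one portal to the sentinel turns it back on.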

Evidence FOR (strong):

  • Retail anchor verified: PView::DrawCells at acclient_2013_pseudo_c.txt:432709 gates LScape::draw (outdoor terrain dispatch) by if (outside_view.view_count > 0). outside_view.view_count is only incremented during the indoor portal BFS (PView::ConstructView) when a portal targets other_cell_id == 0xFFFFFFFF (outdoor sentinel). When no portal sees outside, the entire outdoor pass is skipped.
  • Retail's per-mesh draw (RenderDeviceD3D::DrawMesh line 429245) iterates Render::PortalList->view_count and skips meshes that straddle 0 sub-views. No stencil — retail uses screen-space polygon clipping via PView::GetClip.
  • WB anchor verified: VisibilityManager.RenderInsideOut (lines 73-239) uses stencil: mark current-building portals stencil=1, punch portal regions to far depth, draw EnvCells unconditionally, then terrain/scenery/statics gated by glStencilFunc(Equal, 1, 0x01). The top-level loop already skips the unconditional terrain draw via if (!isInside) terrainManager.Render(...) at GameScene.cs:965.
  • acdream audit verified the gate is missing: WbDrawDispatcher.cs:360-362 gates by entity.ParentCellId.HasValue && !visibleCellIds.Contains(...). When ParentCellId == null (outdoor stabs, scenery, live-spawned entities), the boolean short-circuits to cellInVis = true — the entity passes regardless of visibleCellIds.
  • TerrainModernRenderer.Draw (lines 191-208) only does per-slot frustum cull. No visibleCellIds parameter, no indoor-camera awareness.
  • Patch geometry size (~24 m × 24 m rectangle) matches a terrain cell footprint — that's a polygon, not a precision artifact.
  • "Clears when closer" matches geometric occlusion: as the camera approaches, cottage walls + stair treads come to occlude the offending terrain cells in screen space. Precision is not the bottleneck: at 2-5 m camera distance a 24-bit depth buffer resolves sub-millimeter differences, so the 1 cm nudge from #100 sits well above the resolution limit.
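The depth-resolution claim can be checked with a rough calculation. The sketch below assumes a standard perspective projection with a fixed-point depth buffer and uses near = 0.1 m and far = 1000 m; those plane distances are placeholders chosen for illustration, not values read from acdream.

```python
def depth_resolution(z, near, far, bits=24):
    """Approximate smallest eye-space separation (metres) resolvable at distance z.

    For a standard perspective projection the window depth is
        d(z) = far / (far - near) * (1 - near / z)
    so one depth-buffer step of 1 / 2**bits corresponds to
        dz ~= step * z**2 * (far - near) / (far * near).
    """
    step = 1.0 / (2 ** bits)
    return step * z * z * (far - near) / (far * near)


# near/far are assumed values for illustration only.
for z in (2.0, 5.0):
    mm = depth_resolution(z, near=0.1, far=1000.0) * 1000.0
    print(f"{z} m: {mm:.4f} mm")
```

Under these assumptions the resolvable separation at 2-5 m stays far below one millimetre, which is consistent with the 1 cm nudge not being precision-limited. A much smaller near plane or a much larger far plane would change the numbers.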

Evidence AGAINST:

  • User has not yet run the camera-rotation test. If the patch flickers/shimmers when rotating the camera in place, the diagnosis pivots to Z-precision.

How to falsify: Stand at the spot showing the cellar-stairs artifact, look at the grass patch, rotate the camera slowly without moving the character. Polygon-stable edges that track predictably with the view = culling (H1). Flickering / shimmering = Z-precision (H2).

H2 — Residual Z-fight from #100's nudge (LOW confidence)

The 1 cm shader nudge from issue #100 might be insufficient at certain Z values or with shader precision quirks.

Evidence FOR: Same code area was just touched.

Evidence AGAINST: Predecessor research already established 1 cm @ 24-bit depth has sub-mm resolving at gameplay camera distances. Patch is rectangular polygon, not thin Z-fight strip. "Clears when closer" reverses precision direction.

How to falsify: Same camera-rotation test.

H3 — #95 portal-traversal blowup is independent of H1 (HIGH confidence it IS independent)

Mechanism: the BFS over portals in CellVisibility.GetVisibleCells has no termination condition or depth cap. Network hubs expose 100+ outbound portals to disconnected dungeons, and all of them are marked visible. The scen5 log shows up to 295 cells in one visible set.

Evidence FOR independence:

  • H1 is an asymmetric over-render (outdoor passes ignore indoor state).
  • H3 is a symmetric over-inclusion (BFS doesn't terminate properly).
  • A fix to H1 would gate WHEN to render outdoor; H3's fix is to bound WHICH indoor cells the BFS includes.
  • Different code paths: H1 lives in TerrainModernRenderer.Draw + WbDrawDispatcher; H3 lives in CellVisibility.GetVisibleCells.
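To make "bound WHICH indoor cells the BFS includes" concrete, here is one possible shape for a bounded walk. It is a generic depth-capped BFS in Python, not the actual CellVisibility.GetVisibleCells logic; the `portals` dict, the function name, and the `max_depth` parameter are all hypothetical, and a real fix might bound the walk by portal clipping instead of a hop count.

```python
from collections import deque


def visible_cells(start_cell, portals, max_depth):
    """Collect cells reachable within max_depth portal hops of start_cell.

    `portals` maps a cell id to the cell ids its portals lead to
    (hypothetical shape). The visited set prevents revisits on cyclic
    graphs; the depth cap stops the walk from fanning out through hubs.
    """
    visited = {start_cell}
    queue = deque([(start_cell, 0)])
    while queue:
        cell, depth = queue.popleft()
        if depth == max_depth:
            continue  # do not expand past the cap
        for target in portals.get(cell, ()):
            if target not in visited:
                visited.add(target)
                queue.append((target, depth + 1))
    return visited
```

For a hub with 100 outbound portals, a cap of one hop still returns 101 cells, so a hop count alone does not address hubs; it only limits how far the set grows beyond them.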

Conclusion: H1 and H3 should be separate fixes. Closing H1 will close cellar-stairs + the outdoor-stab side of #78 but NOT close #95. The next phase should plan H1 in scope and decide whether H3 fits in the same milestone (M1.5).


What we've ruled out

  • It's not the #100 cell-collapse bug returning. hiddenTerrainCells plumbing was fully removed in a64e6f2; terrain mesh now correctly renders everywhere on the landblock per retail. The new artifact's mechanism is "outdoor geometry visible at all when indoor," not "incorrect terrain mesh shape."
  • It's not a depth-precision issue (high confidence, pending falsification). Patch shape + "clears closer" both contradict Z-fight.
  • It's not a ParentCellId propagation bug. Audit confirmed that interior cell static objects (GameWindow.BuildInteriorEntitiesForStreaming:5476) and cell-mesh entities (line 5416) both receive non-null ParentCellId = envCellId. The dispatcher's existing filter already correctly culls them when the camera is in a different building. The bug runs in the OPPOSITE direction: outdoor entities with ParentCellId == null always pass.
  • It's not WB extraction divergence. Phase O extracted ~33 WB files into src/AcDream.App/Rendering/Wb/ but the VisibilityManager / RenderInsideOut pipeline was NOT extracted — that code never existed in our tree.
  • It's not a missing camera-cell signal at the render layer. cameraInsideCell, visibility.VisibleCellIds, and visibility.HasExitPortalVisible are all already computed in GameWindow.cs:6970-6984 and live in scope at the two Draw call sites (lines 7074 + 7110). No new plumbing required.

Approach options for the fix

Three viable approaches, with tradeoffs:

Approach A (WB-style stencil)

Port VisibilityManager.RenderInsideOut's stencil pipeline to acdream. Two-pass render: (1) mark current-building portal silhouettes in stencil, (2) gate outdoor passes by glStencilFunc(Equal, 1, 0x01).
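The two-pass idea can be illustrated with a tiny CPU simulation. This is not GL code and not a port of WB's pipeline: the function name and the pixel-set inputs are invented for the example, and depth testing (including WB's step of punching portal regions to far depth) is left out, so outdoor pixels are simply assumed to win inside portal silhouettes.

```python
def render_inside_out(width, height, portal_pixels, indoor_pixels, outdoor_pixels):
    """Return {(x, y): label} after a stencil-gated draw (CPU sketch, no depth test)."""
    stencil = [[0] * width for _ in range(height)]
    frame = {}

    # Pass 1: mark the current building's portal silhouettes with stencil = 1.
    for (x, y) in portal_pixels:
        stencil[y][x] = 1

    # Indoor geometry is drawn without any stencil test.
    for (x, y) in indoor_pixels:
        frame[(x, y)] = "indoor"

    # Pass 2: outdoor geometry only lands where stencil == 1,
    # the equivalent of glStencilFunc(Equal, 1, 0x01).
    for (x, y) in outdoor_pixels:
        if stencil[y][x] == 1:
            frame[(x, y)] = "outdoor"

    return frame


# Usage: a 4x4 view with a single-pixel doorway at (1, 1).
everything = [(x, y) for y in range(4) for x in range(4)]
frame = render_inside_out(4, 4, [(1, 1)], everything, everything)
```

In this toy frame the outdoor pass covers the whole screen, yet only the doorway pixel ends up showing outdoor content. That is the behavior the stencil gate is meant to provide for terrain and outdoor entities alike.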

Pros:

  • Closest to acdream's existing modern GL pipeline (the stencil buffer is currently unused, so claiming one stencil bit is cheap).
  • WB is acdream's documented rendering base (per CLAUDE.md). Cross-reference checked against retail confirms WB's intent matches retail's, just via a different mechanism.
  • Handles the "see outside through open door" case correctly — terrain renders through portal silhouettes only.
  • Reusable for both outdoor terrain AND outdoor entities (single stencil gate applies to all subsequent draws).

Cons:

  • Multi-pass render adds GPU cost (small — one stencil pass per current-building's portals).
  • Requires a portal-mesh upload pipeline (WB has one in PortalRenderManager.cs:488-628; we'd port it).
  • More LOC than Approach C.

Estimated scope: 4-6 tasks, 1-2 weeks of implementation + verification.

Approach B — Retail-faithful polygon-clip sub-views

Port PView::ConstructView + PView::GetClip + Render::PortalList from retail. Per-mesh viewport set to clipped portal polygon.

Pros:

  • 100% retail-faithful.

Cons:

  • Requires per-draw viewport scissor changes — current rendering uses bindless + MDI with one viewport per pass. Wedging per-mesh viewport in would break the modern pipeline's batching.
  • Multi-week port. Out of scope for one session.

Estimated scope: 8-12 tasks, 4-6 weeks. Defer to a future milestone if needed.

Approach C — Ship-now binary gate

When cameraInsideCell && !visibility.HasExitPortalVisible, skip outdoor terrain pass entirely and gate WbDrawDispatcher to exclude ParentCellId == null entities.
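A minimal sketch of that gate, written in Python for brevity rather than as the C# change itself. The two booleans correspond to cameraInsideCell and HasExitPortalVisible from the report; the function names and the `skip_outdoor` parameter are hypothetical.

```python
def skip_outdoor_passes(camera_inside_cell, has_exit_portal_visible):
    """True when the outdoor terrain pass should be skipped entirely."""
    return camera_inside_cell and not has_exit_portal_visible


def entity_passes(parent_cell_id, visible_cell_ids, skip_outdoor):
    """Dispatcher filter with the extra gate for outdoor entities.

    parent_cell_id is None for outdoor stabs, scenery, and live-spawned
    entities; today those always pass. Here they are dropped whenever the
    outdoor passes are being skipped.
    """
    if parent_cell_id is None:
        return not skip_outdoor
    return parent_cell_id in visible_cell_ids
```

Interior entities keep their existing ParentCellId filter; only the null-parent branch changes.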

Pros:

  • Smallest change. ~2-3 tasks. Closes the cellar-stairs symptom and the sealed-interior side of #78 immediately.
  • All required state already computed (HasExitPortalVisible from CellVisibility.GetVisibleCells line 404).

Cons:

  • Under-renders when the player can see outside through an open door/window (renders nothing instead of clipping correctly). This is a regression relative to today's behavior for the doorway-view case.
  • Per CLAUDE.md "no workarounds": this is a symptom-gate rather than a root-cause fix. Would need explicit user approval. Approach A is the correct shape; Approach C is a temporary patch.

Estimated scope: 2-3 tasks, 1-2 days.


Recommended next steps

  1. User runs the camera-rotation falsification test (~60 seconds). Spawn at Holtburg, walk into a cottage cellar, find the angle showing the grass patch, rotate the camera in place without moving. Report what happens.

    • Polygon-stable → confirms H1, proceed.
    • Flickering → pivots to H2, this report needs major revision.
  2. If H1 confirmed: user picks Approach A vs C. Recommendation: Approach A (WB-style stencil). Per CLAUDE.md's "no workarounds" rule, the right thing is to port the stencil pipeline, not gate at the symptom site. Approach C is offered only if the user wants to close cellar-stairs immediately and defer doorway-view correctness as known-incomplete; that's an explicit workaround that needs user sign-off.

  3. #95 should NOT be in scope for this work. Different mechanism, different code path. It continues as separate work in M1.5.

  4. Phase identifier: the handoff proposes A8 (visibility) alongside A6 (physics) and A7 (lighting). I'll defer naming to the user.

  5. CLAUDE.md update for #100 ship: the handoff calls this out as pending. Recommendation: add a brief #100 ship entry mentioning the cellar-stairs finding linked to #78. Out of scope for investigate mode; will happen at the start of the implementation session.


What this is NOT

This is NOT a #100 regression. The terrain Z-nudge ship works correctly; the new artifact has a different root cause (indoor-camera gate on outdoor passes was already missing pre-#100 — #100 just made it more visible by removing the terrain-cell hide mechanism that incidentally masked it inside building footprints).

This is NOT a depth-precision fix. The 1cm nudge is correctly sized; larger nudges would break coplanar-floor disambiguation elsewhere.

This is NOT a ParentCellId data fix. Interior entities are correctly tagged.

This is NOT covered by Phase O's WB extraction. The visibility-management code was deliberately NOT extracted.


Reference appendix

Retail anchors (acclient_2013_pseudo_c.txt)

| Line | Symbol | Role |
|---|---|---|
| 92635 | SmartBox::RenderNormalMode | Per-frame top-level dispatcher (indoor vs outdoor branch) |
| 267912 | LScape::draw | Outdoor terrain dispatch |
| 311397 | CEnvCell::find_visible_child_cell | Point-in-visible-cell query |
| 311878 | CEnvCell::grab_visible_cells | Loads outdoor on seen_outside |
| 427843 | RenderDeviceD3D::DrawInside | Indoor entry point |
| 429245 | RenderDeviceD3D::DrawMesh | Per-mesh portal-sub-view loop |
| 430027 | RenderDeviceD3D::DrawBlock | Outdoor landblock dispatch |
| 432709 | PView::DrawCells | The outside_view.view_count > 0 gate |
| 433750 | PView::ConstructView | BFS portal walk |

WorldBuilder anchors

| File:Line | Role |
|---|---|
| references/WorldBuilder/Chorizite.OpenGLSDLBackend/Lib/VisibilityManager.cs:73-239 | RenderInsideOut: full stencil pipeline |
| Same file:241-359 | RenderOutsideIn: outdoor branch |
| Same file:47-71 | PrepareVisibility: visible cell set |
| references/WorldBuilder/Chorizite.OpenGLSDLBackend/GameScene.cs:880-1008 | Main render dispatch (lines 965, 988 are the gates) |
| references/WorldBuilder/Chorizite.OpenGLSDLBackend/Lib/PortalRenderManager.cs:488-628 | Portal mesh upload |
| references/WorldBuilder/Chorizite.OpenGLSDLBackend/Lib/CameraController.cs:142-174 | Camera-cell tracking (portal raycasts) |
| references/WorldBuilder/Chorizite.OpenGLSDLBackend/Shaders/PortalStencil.frag:7-16 | Stencil shader (writes gl_FragDepth = 1.0) |

acdream extension points (audit-verified)

| File:Line | Current behavior | Extension required |
|---|---|---|
| src/AcDream.App/Rendering/CellVisibility.cs:222-232 | Returns VisibilityResult with VisibleCellIds, HasExitPortalVisible, CameraCell | None; state already in place |
| src/AcDream.App/Rendering/GameWindow.cs:6970-6984 | Computes cameraInsideCell and playerInsideCell per frame | None; values already in scope |
| src/AcDream.App/Rendering/Wb/WbDrawDispatcher.cs:360-374 | Gates by ParentCellId ∈ visibleCellIds; outdoor entities (null) always pass | Add second gate: when cameraInsideCell == true and entity is outdoor (ParentCellId == null), require stencil pass or skip entirely |
| src/AcDream.App/Rendering/TerrainModernRenderer.cs:191-208 | Frustum-only cull; renders all loaded landblocks | Add parameter for stencil pass / indoor-camera state |
| src/AcDream.App/Rendering/GameWindow.cs:7074 | _terrain?.Draw(camera, frustum, neverCullLandblockId: playerLb) | Add cameraInsideCell (or equivalent) parameter |
| src/AcDream.App/Rendering/GameWindow.cs:7110 | WbDrawDispatcher.Draw(... visibleCellIds: visibility?.VisibleCellIds, ...) | Add cameraInsideCell parameter |
| src/AcDream.Core/Rendering/RenderingDiagnostics.cs:75-77 | Existing probe flag registry (mirror of PhysicsDiagnostics) | Add ProbeVisibilityEnabled from ACDREAM_PROBE_VIS=1 |

Issues family map

| ID | Symptom | Closes with H1 fix? |
|---|---|---|
| #78 | Outdoor stabs visible through inn floor/walls | YES (same root cause) |
| Cellar-stairs (NEW) | Outdoor terrain visible inside cottage cellar | YES (same root cause; new evidence for #78) |
| #95 | Portal-graph visibility blowup (visibleCells up to 295) | NO; independent (different code path) |
| #79/#80/#81/#93/#94 | Indoor lighting bugs | Maybe; #93 explicitly suspects "indoor visibility culling for lights" sub-cause; lighting subsystem may share infrastructure with visibility-gate but not directly impacted |

Workflow notes (from CLAUDE.md "How to operate")

  • "No workarounds without explicit approval" — Approach C is a workaround; Approach A is the correct shape.
  • Visual verification is the user's job; can't be automated.
  • Phase ID for visibility work is undecided. User picks at implementation-session start.
  • Per the milestones doc, this is M1.5 scope; cellar-stairs is on the M1.5 critical path because it blocks the building/cellar half of the M1.5 demo.