acdream/docs/research/2026-06-07-cutover-flip-render-residuals-diagnosis-handoff.md
2026-06-07 22:12:38 +02:00


Handoff — Cutover FLIP shipped; see-through + oscillation DIAGNOSED (evidence-based) — 2026-06-07 (PM)

CANONICAL PICKUP for the render-unification residuals. Worktree thirsty-goldberg-51bb9b, branch claude/thirsty-goldberg-51bb9b, HEAD 774cb22. The cutover flip is SHIPPED (one render path, no branch-toggle flap). It exposed two residuals — see-through building walls and oscillation — whose root causes are now PROVEN with a live probe (not guessed). The fixes are identified but NOT yet implemented. Read §3 (diagnosis) and §5 (do-not-retry) before touching code.


1. What shipped (committed, keep)

The CUTOVER FLIP from 2026-06-07-render-unification-cutover-flip-handoff.md landed:

| Commit | What |
| --- | --- |
| 5379f6e | Step A: PortalVisibilityBuilder.Build seeds a full-screen OutsideView when the root is the outdoor node (LoadedCell.IsOutdoorNode, set by OutdoorCellNode.Build). +2 UnifiedFloodTests, +2 flag assertions. |
| 445e861 | Step B, the flip: GameWindow.cs:~7387 clipRoot = viewerRoot ?? _outdoorNode. Drops the playerIndoorGate gate. ONE path, no inside/outside branch. Preserves the LiveDynamic draw for the outdoor root. |
| 88caa0d | Depth-clear fix: ClearDepthSlice = null for the outdoor root (the full-screen depth clear was painting the cellar over the player; fixed). |
| 774cb22 | Revert of 0030dac (the slot-0 skip, a FAILED fix; see §5). |

The flip's PRIMARY goal succeeded: [render-sig] shows branch=RetailPViewInside every frame, zero OutdoorRoot frames across a whole session. The two-branch-toggle flap is gone by construction. Baselines: build green, App.Tests 216/0.
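
The "zero OutdoorRoot frames" check can be repeated offline against a captured log. A minimal sketch, assuming each [render-sig] line carries a `branch=<Name>` token; the helper name `branch_counts` and the sample lines are hypothetical, not part of the codebase:

```python
import re
from collections import Counter

# Assumption: a render-sig line looks like "[render-sig] ... branch=<Name> ...".
BRANCH_RE = re.compile(r"\[render-sig\].*?\bbranch=(\w+)")

def branch_counts(lines):
    """Count how many [render-sig] frames reported each branch value."""
    counts = Counter()
    for line in lines:
        match = BRANCH_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts

# Hypothetical sample lines, for illustration only.
sample = [
    "[render-sig] branch=RetailPViewInside",
    "[render-sig] branch=RetailPViewInside",
    "[other] unrelated line",
]
counts = branch_counts(sample)
print(counts)
# The flip's claim holds for a session if counts["OutdoorRoot"] == 0.
```

A Counter returns 0 for a branch that never appeared, so the check reads the same whether or not OutdoorRoot was ever logged.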


2. The two residuals the flip exposed (user-observed)

  1. See-through building walls from outside — standing outside a building you see into it through the walls (doors closed).
  2. Oscillation — the interior/walls flicker between "showing nothing", "see-through", and "full interior" frame-to-frame while standing still.

3. ROOT CAUSE — proven by a live [bshell] probe (NOT guessed)

A throttled probe in RetailPViewRenderer.DrawInside (now stripped; re-add from git history of this doc's session if needed) logged, for the outdoor-node root on a loaded frame at the Holtburg cottage:

[bshell] total=6 withMesh=6 inOutdoorPartition=6 envCellsFlooded=1 outdoorEntities=637

Interpretation (each number is decisive):

  • total=6 withMesh=6 inOutdoorPartition=6 — there ARE 6 building ModelId "shell" entities (WorldEntity.IsBuildingShell, created in LandblockLoader.cs:75-91 from LandBlockInfo.Buildings[].ModelId), ALL carry meshes, ALL land in partition.Outdoor (they have ParentCellId==null; InteriorEntityPartition line 47 → Outdoor; WbDrawDispatcher.EntityPassesVisibleCellGate returns true for null visibleCellIds). So the ModelId exterior DOES render.
  • BUT the earlier "skip all interior shell draws for the outdoor root" experiment (uncommitted, reverted) made the building fully see-through — i.e. drawing ONLY the ModelId shells is NOT a solid building. Therefore the ModelId Setup is a partial frame, and the building's actual WALLS are the EnvCell shell geometry (ObjectMeshManager.PrepareCellStructMeshData, drawn by DrawEnvCellShells).
  • envCellsFlooded=1 — in this frame the outdoor-node flood reached ZERO building interior cells (only the node itself). Earlier [render-sig] frames at the same spot showed ids=[node + ~12 building cells] (≈13). So the flood membership swings between 1 and ~13 frame-to-frame.
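
The reading of the probe line above can be made mechanical. A minimal sketch that parses the `key=value` fields and applies the two checks used in the interpretation; the function names and result keys are hypothetical, and "flood reached an interior" is taken to mean envCellsFlooded > 1 (more than the outdoor node alone), as the text describes:

```python
import re

def parse_bshell(line):
    """Parse a '[bshell] key=value ...' probe line into a dict of ints."""
    if "[bshell]" not in line:
        return None
    return {key: int(value) for key, value in re.findall(r"(\w+)=(\d+)", line)}

def interpret(probe):
    """Apply the two checks from the interpretation of the probe numbers."""
    return {
        # Every ModelId shell has a mesh and lands in the outdoor partition.
        "all_shells_meshed_and_outdoor":
            probe["total"] == probe["withMesh"] == probe["inOutdoorPartition"],
        # A flood of 1 is only the outdoor node itself: no interior cells.
        "flood_reached_interior": probe["envCellsFlooded"] > 1,
    }

line = "[bshell] total=6 withMesh=6 inOutdoorPartition=6 envCellsFlooded=1 outdoorEntities=637"
probe = parse_bshell(line)
print(interpret(probe))
```

For the logged frame this reports that all shells are meshed and outdoor while the flood did not reach any interior cell.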

The two residuals, explained

  1. Oscillation = flood instability gating the walls. The flip made wall-drawing depend on the portal flood reaching each building's interior cells. That flood is unstable (1 ↔ ~13), so the EnvCell walls blink in and out. ("showing nothing" = flood=1, no interior; "full interior / see-through" = flood reached the building.)
  2. See-through = single-sided EnvCell walls. Even when the walls DO draw, the EnvCell wall polys are single-sided for SidesType==CounterClockwise (interior-facing). PrepareCellStructMeshData (ObjectMeshManager ~1299-1310) builds a back face only for SidesType==None (the front face emitted a second time with reversed winding) and SidesType==Clockwise (using the neg surface). A CounterClockwise wall gets a front face only, so from outside the viewer faces its back side, which is culled → see-through.
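
The sidedness argument can be modelled in a few lines. This is a simplified sketch of the behaviour described above, not the real PrepareCellStructMeshData: the enum values are placeholders, surfaces are ignored (the Clockwise back face is treated the same as the None one), and "visible from outside" is assumed to mean that a reversed-winding copy of the triangle exists while back-face culling is on.

```python
from enum import Enum

class SidesType(Enum):
    # Names come from the text; the values are placeholders.
    NONE = "None"
    CLOCKWISE = "Clockwise"
    COUNTER_CLOCKWISE = "CounterClockwise"

def emitted_faces(tri, sides):
    """Return the triangles emitted for one wall triangle, as index tuples."""
    front = tuple(tri)
    back = tuple(reversed(tri))  # reversed winding flips the facing
    if sides in (SidesType.NONE, SidesType.CLOCKWISE):
        return [front, back]
    return [front]  # CounterClockwise: front face only

def visible_from_outside(tri, sides):
    """With back-face culling on, an outside viewer needs the reversed copy."""
    return tuple(reversed(tri)) in emitted_faces(tri, sides)

wall = (0, 1, 2)
for sides in SidesType:
    print(sides.name, visible_from_outside(wall, sides))
```

Under these assumptions only the CounterClockwise wall is missing from the outside view, which matches the see-through symptom.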

4. Fix path (identified, NOT implemented)

Two independent fixes, both needed:

  • F1 — Stabilise the flood membership so a building's interior cells are CONSISTENTLY in/out of the visible set (no 1↔13 swing). This is the same metastability family as the indoor flicker. Likely levers: ground the outdoor-node flood's building membership in the cell stab_list/PVS (stable, precomputed) instead of the per-frame portal-side test + projection; or hysteresis on which buildings are flooded. Probe to re-add: envCellsFlooded per frame (RLE it; it should be constant when standing still).
  • F2 — Make the EnvCell walls solid from outside. Either build the missing back faces for SidesType==CounterClockwise walls in PrepareCellStructMeshData, or render those shells double-sided (CullMode.None) when the viewer is outside the cell. Verify against retail: dump a real Holtburg cell's wall-poly SidesType distribution first.
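
For the F1 probe, run-length encoding the per-frame envCellsFlooded values makes instability obvious at a glance. A minimal sketch; the sample series is hypothetical and only mirrors the 1 ↔ ~13 swing described in §3:

```python
from itertools import groupby

def rle(values):
    """Run-length encode a sequence into (value, run_length) pairs."""
    return [(value, len(list(group))) for value, group in groupby(values)]

def is_stable(values):
    """True when the whole series is a single run (or empty)."""
    return len(rle(values)) <= 1

# Hypothetical per-frame envCellsFlooded values while standing still.
samples = [1, 1, 13, 13, 13, 1, 13]
print(rle(samples))        # [(1, 2), (13, 3), (1, 1), (13, 1)]
print(is_stable(samples))  # False
```

A stable flood while standing still collapses to one pair, for example `[(13, 300)]`; many short runs are the signature of the oscillation.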

Open research question (reconcile before F2): pre-flip the buildings looked SOLID from outside. What drew the solid walls pre-flip — a global EnvCell-shell render, the DrawPortal/BuildFromExterior look-in, or were the ModelId shells solid then? Find what the flip replaced. The old outdoor else block (GameWindow.cs:~7557-7663, now dead-when-clipRoot-non-null but still present) is the place to read. This answers whether F2 is "build back faces" or "restore a pre-flip draw".


5. DO NOT RETRY (failed this session, with evidence)

  • Slot-0 skip (0030dac, reverted 774cb22): "for the outdoor root, skip flooded cells whose clip degenerated to no-clip slot 0." Made the oscillation WORSE — slot-0-ness flickers per frame, so cells blinked. Wrong: the see-through is not the slot-0 fallback.
  • Skip-all-interiors experiment (uncommitted, reverted): "outdoor root draws terrain + ModelId exteriors only, no EnvCell shells." Made buildings FULLY see-through + flashing — proved the ModelId Setup is not the walls (the walls are the EnvCell shells). Do not ship this.
  • Backface-culling-of-shells hypothesis (never coded): plausible but the cull mode is already data-driven (poly.SidesType); the real gap is single-sided geometry (no back face built), not a cull-state bug.
  • The subagent hypothesis "ModelId exterior occludes; interior overdraws it; fix = gate DrawEnvCellShells off for the outdoor root" is disproven — that gate IS the skip-all-interiors experiment, which removed the walls entirely.

6. State + how to resume

  • HEAD 774cb22, tree clean, build green, App.Tests 216/0. The flip + depth-clear are committed; the branch renders with the two residuals (see-through + oscillation).
  • The flip is on this BRANCH only (main is unaffected). To get a stable client meanwhile, revert the flip commits (445e861 Step B is the behaviour change; reverting it alone restores the pre-flip outdoor path — verify Step A 5379f6e is inert without it).
  • Re-add the [bshell] / envCellsFlooded probe (see this session's git reflog for the exact code) to watch flood stability while working F1.
  • Memory: project_indoor_flap_rootcause (update with this corrected diagnosis), reference_render_pipeline_state, feedback_render_downstream_of_membership (the oscillation IS a membership/flood-stability bug, per that note).

7. Next-session kickoff prompt — VERIFY-FIRST (the diagnosis above is a SUSPECT'S statement)

This session reached the §3 diagnosis only AFTER three wrong guesses, so the next session must verify it cold before building any fix. Paste this to start:

Continue acdream M1.5 render unification. Branch claude/thirsty-goldberg-51bb9b, HEAD 774cb22.
PowerShell on Windows; build before launch; live ACE 127.0.0.1:9000, testaccount/testpassword, char
+Acdream (spawns at the Holtburg/Arcanum cottage, landblock 0xA9B4).

The render-unification CUTOVER FLIP is committed. It is CLAIMED to have killed the two-branch render
"flap" but left two residuals — see-through building walls and oscillation — and a root-cause diagnosis
was reached. That diagnosis was only reached AFTER THREE WRONG GUESSES in the same session, so DO NOT
trust it. Your FIRST job is to verify it cold, with fresh primary evidence, and you are explicitly
licensed to REFUTE it.

READ (as a suspect's statement, NOT as truth):
- docs/research/2026-06-07-cutover-flip-render-residuals-diagnosis-handoff.md  (claimed diagnosis +
  do-not-retry list + the probe numbers it rests on)
- memory project_indoor_flap_rootcause (corrected diagnosis claim)

=== TASK 1 — UNBIASED VERIFICATION (complete fully BEFORE proposing any fix) ===
Do not anchor on the handoff's conclusions — re-derive each from independent evidence. For each claim,
report CONFIRMED / REFUTED / CORRECTED with the evidence. If ANY load-bearing claim is refuted, the
diagnosis is wrong: STOP, re-diagnose, do not build a fix on it. Prefer dispatching the verification to a
fresh subagent that has NOT seen the conclusions, to avoid confirmation bias.

1.1 "The branch-toggle flap is gone (one render path)." Launch (ACDREAM_PROBE_FLAP=1); walk
    indoor<->outdoor and pan the camera at a doorway; RLE the [render-sig] `branch` field. Expected if
    true: zero `branch=OutdoorRoot` after spawn. Refute if OutdoorRoot reappears.

1.2 "Oscillation == outdoor-node flood membership instability." Add a probe logging
    pvFrame.OrderedVisibleCells.Count per outdoor-root frame WHILE STANDING PERFECTLY STILL at the
    cottage. Swings frame-to-frame (e.g. 1<->13) -> unstable (confirms). Constant while the user still
    sees oscillation -> DIFFERENT cause (refute + re-diagnose). Correlate the swings with what visibly
    flickers.

1.3 "See-through == single-sided EnvCell walls." Dump the actual sides_type distribution of a REAL
    Holtburg building cell's wall polygons (Environment dat -> CellStruct, focused test/tool over the
    cottage cell). Confirm walls are predominantly single-sided AND that PrepareCellStructMeshData
    (ObjectMeshManager ~1299-1310) builds a back face only for SidesType None/Clockwise (not
    CounterClockwise). FALSIFIABLE cross-check: temporarily force the EnvCell shell pass to CullMode.None
    (double-sided) and confirm THAT alone makes the walls solid from outside; revert after.

1.4 "The building WALLS are the EnvCell shells; the ModelId 'shell' is only a partial frame." Re-add the
    [bshell] probe (total/withMesh/inOutdoorPartition/envCellsFlooded). Independently inspect what the
    building ModelId Setup geometry IS (poly count, bbox) vs the EnvCell shell. Reproduce or refute the
    skip-all-interiors experiment (building went fully see-through).

1.5 (OPEN, decides the fix shape) "Pre-flip buildings were solid from outside — what drew the walls?"
    Check out the pre-flip commit (parent of 445e861), launch, confirm buildings solid from outside,
    trace what drew the solid walls (old outdoor `else` block GameWindow.cs ~7557-7663 /
    DrawPortal+BuildFromExterior / a global EnvCell-shell render). Decides whether F2 is "build missing
    back faces" or "restore a pre-flip draw the flip replaced".

DO-NOT-RETRY (proven failures last session): the slot-0 skip (made oscillation worse); skipping all
interior shells / gating DrawEnvCellShells off for the outdoor root (building fully see-through — already
ran); any render-side debounce/grace (forbidden, no-workarounds).

=== TASK 2 — only AFTER Task 1 confirms or corrects the diagnosis ===
Implement F1 (stabilise flood membership — e.g. ground building membership in the cell stab_list/PVS
instead of the per-frame portal-side test) and F2 (the verified wall-sidedness fix). TDD where possible;
each lands behind a USER VISUAL GATE at the cottage. Do not delete the dead DrawPortal/BuildFromExterior/
outdoor-else paths until the residuals are visually confirmed fixed.

Why verify-first: the fastest single decisive test is the Task 1.3 falsifiable cross-check in the prompt above (force CullMode.None; if that alone makes the walls solid from outside, the single-sided-wall hypothesis is confirmed and F2 is "build back faces"). Run the verification under a fresh subagent so it can't rubber-stamp these conclusions.