acdream/docs/superpowers/specs/2026-06-08-portal-flood-membership-stability-design.md
Erik d6aa526dd3 diag(render/physics): flap root-caused to physics rest µm-jitter; refute prior diagnoses
Apparatus + handoff for the indoor flap. Confirmed (primary evidence): the flap is the
portal-flood clip being µm-sensitive at the threshold, driven by a ~1-8µm jitter in the
player RenderPosition (physics resting position not bit-stable; Lerp surfaces it). REFUTES
the 2026-06-07 see-through/EnvCell/outdoor-node diagnosis (ModelId GfxObj 0x01000A2B IS the
solid exterior) AND an enqueue-once attempt (retail propagates late slices via AddToCell;
the existing PropagatesNewSlicesToExit test caught it; reverted). Adds: Build determinism
test, A8CellAudit gfxobj dump, [pv-input] 6dp probe + [render-sig] outRoot/bshell fields.
No functional fix shipped. Next: higher-precision physics rest trace -> port retail
kill_velocity/contact rest-stability. Canonical: docs/research/2026-06-08-flap-rootcause-physics-rest-handoff.md

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-08 09:16:12 +02:00


Portal-Flood Membership Stability — the indoor "flap" root-cause fix

Date: 2026-06-08
Branch: claude/thirsty-goldberg-51bb9b
Status: ⚠️ §4 (enqueue-once) REFUTED 2026-06-08: retail propagates late slices via AddToCell (decomp :433494); the existing Build_ViewGrowthAfterDoneCell_PropagatesNewSlicesToExit test encodes that and enqueue-once broke it (reverted). The flap's confirmed root is the physics resting position µm-jitter (§6 contingency, now the active direction). CANONICAL PICKUP: docs/research/2026-06-08-flap-rootcause-physics-rest-handoff.md. Keep §1-§3 (mechanism + retail grounding) as accurate diagnosis; treat §4-§5 as a refuted approach.


1. Summary

The indoor render flap (textures "battling" at the doorway threshold) is portal-flood set-membership instability: from a stable viewer cell, the PView BFS includes or excludes a deeper cell cluster frame-to-frame, redrawing a different set each frame. The fix is a verbatim port of retail's enqueue-once traversal (PView::ConstructView/AddViewToPortals): a cell is enqueued only on first discovery; later view-growth into an already-discovered cell is unioned in place (retail AddToCell/FixCellList) and never re-enqueues or re-clips that cell's portals. This removes acdream's MaxReprocessPerCell re-enqueue fixpoint — the documented per-round ProjectToClip drift that lets µm viewpoint jitter re-discover/undiscover the deep cluster. Localized to PortalVisibilityBuilder; no overlap-predicate, no added robustness, no camera/movement/physics/clip-math change. (Contingency: if a residual flap survives — the deep portal's first clip being knife-edge under µm jitter independent of drift — the next retail-faithful step is bit-stabilizing the viewpoint at rest; see §6.)


2. Root cause — confirmed with primary evidence

2.1 What the flap actually is

Live [render-sig] + [pv-input] capture at the Holtburg cottage threshold (landblock 0xA9B4), standing at the doorway:

  • The render root is stable (root=0xA9B40170, outRoot=n, i.e. an interior viewer cell — NOT the outdoor node, NOT a root toggle).
  • The flood cell set oscillates frame-to-frame: ids=[0170,0171,0172,0173,0174,0175] (6) ↔ ids=[0170,0171] (2). The deeper cluster {0172,0173,0174,0175} pops in/out.
  • The oscillation occurs at an eye AND player position that is identical to centimetre precision, e.g. three consecutive frames at eye (155.55,15.45,96.05), player (155.40,13.20,94.00) with flood 6,2,6.

2.2 Why it flips — the mechanism

  1. PortalVisibilityBuilder.Build is a pure static function with all-fresh per-call state (new frame/todo/queued/popCounts every call). Proven deterministic by PortalVisibilityBuilderTests.Build_IsDeterministic_IdenticalInputsGiveIdenticalVisibleSet (passes). So for identical inputs the output cannot flip → the flip requires a varying input.
  2. The high-precision [pv-input] probe (6 dp) shows the camera eye and the player RenderPosition carry perpetual ~1-8 µm float jitter every frame even "standing still" (e.g. player 94.000000 ↔ 94.000008, eye 96.248863 ↔ 96.248871). At most poses this is harmless; the flood is stable.
  3. The per-portal clip is a faithful homogeneous port of retail's polyClipFinish (PortalProjection.ProjectToClip/ClipToRegion, w-aware Sutherland-Hodgman). But the re-enqueue fixpoint (MaxReprocessPerCell) re-clips a cell's view each round, and the codebase documents that this drifts per round (PortalVisibilityBuilder.cs:43,151,732: "ProjectToClip drift keeps a view growing forever").
  4. At the threshold pose a deeper portal is grazing (oblique / near the eye) → it projects to a thin sliver. The per-round drift + the µm viewpoint jitter flip ClipToRegion's surviving-vertex count across the <3 boundary (PortalProjection.cs:118/121) → clippedRegion.Count flips 0 ↔ N → the cull at PortalVisibilityBuilder.cs:235 (if (clippedRegion.Count == 0 && !EyeInsidePortalOpening) continue;) drops the deeper cluster on the empty-clip frames → flood 2 ↔ 6 → the flap.
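
The knife-edge in step 4 can be illustrated with a toy 2D clip. This is only a sketch under stated assumptions: it is not PortalProjection.ClipToRegion (which is homogeneous and w-aware), the names SliverClipDemo, ClipToMaxX and SurvivingVertexCount are hypothetical, and the sliver dimensions are invented for illustration. It shows one Sutherland-Hodgman pass against a single edge, with the "fewer than 3 vertices means empty" gate applied afterwards.

```csharp
using System;
using System.Collections.Generic;

public static class SliverClipDemo
{
    // One Sutherland-Hodgman pass: clip a convex polygon to the half-plane x <= limit.
    public static List<(double X, double Y)> ClipToMaxX(
        IReadOnlyList<(double X, double Y)> poly, double limit)
    {
        var output = new List<(double X, double Y)>();
        for (int i = 0; i < poly.Count; i++)
        {
            var a = poly[i];
            var b = poly[(i + 1) % poly.Count];
            bool aIn = a.X <= limit;
            bool bIn = b.X <= limit;
            if (aIn) output.Add(a);
            if (aIn != bIn)
            {
                // The edge crosses the boundary: emit the intersection point.
                double t = (limit - a.X) / (b.X - a.X);
                output.Add((limit, a.Y + t * (b.Y - a.Y)));
            }
        }
        return output;
    }

    // Hypothetical stand-in for the membership gate: fewer than 3 survivors is an empty clip.
    public static int SurvivingVertexCount(double sliverLeftEdge, double limit)
    {
        // A thin triangle (1 mm wide, 1 m tall) whose left edge grazes the clip boundary.
        var sliver = new List<(double X, double Y)>
        {
            (sliverLeftEdge, -0.5),
            (sliverLeftEdge + 0.001, 0.0),
            (sliverLeftEdge, 0.5),
        };
        var clipped = ClipToMaxX(sliver, limit);
        return clipped.Count < 3 ? 0 : clipped.Count;
    }
}
```

With the boundary at 1.0, a sliver whose left edge sits 4 µm inside (1.0 - 4e-6) survives with four vertices, while the same sliver 4 µm outside (1.0 + 4e-6) is culled entirely. An 8 µm shift in the projected position is enough to flip the gate, which is the shape of the 0 ↔ N flip described above.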

2.3 Why prior fixes did not work

  • boom-snap (camera stabilization, shipped): the jitter is sub-cm and perpetual (it is in the player RenderPosition, propagating to the camera); snapping the boom distance did not make the viewpoint bit-exact, so the knife-edge still flips.
  • w-space clip (ProjectToClip/ClipToRegion, shipped): this made the single clip robust, but the instability is in the re-clip drift across rounds + the membership gate's dependence on the surviving-vertex count, not in a single clip.
  • viewer-cell dead-zone (tried, reverted): the root does not toggle here (root=0170 stable), so a root-resolution dead-zone is irrelevant to this symptom.

2.4 What this REFUTES (the 2026-06-07 handoff diagnosis)

The predecessor handoff (docs/research/2026-06-07-cutover-flip-render-residuals-diagnosis-handoff.md) is wrong on its load-bearing claims; do not act on its F1/F2:

  • "See-through walls from outside" — not reproduced: standing outside with the door closed is stable (user visual gate, 2026-06-08).
  • "The walls ARE the EnvCell shells; the ModelId is a partial frame" — refuted: the cottage ModelId GfxObj 0x01000A2B is a full closed exterior (76 render polys, bbox 20×18×10.4 m, 46 outward-facing walls + roof; cross-checked vs the physics BSP + retail DrawBuilding). The EnvCell shells are interior-facing room surfaces. F2 (build EnvCell back faces / double-side) targets the wrong geometry.
  • "Oscillation = outdoor-node flood instability (1↔13)" — corrected: it is the indoor flood (outRoot=n, stable root) swinging 2↔6. F1 targeted the wrong root.
  • "branch=RetailPViewInside every frame proves the flap is gone" — tautological: post-flip clipRoot = viewerRoot ?? _outdoorNode is essentially never null, so the branch label can no longer report OutdoorRoot. It proves nothing.

3. Retail grounding

Retail PView::ConstructView (decomp acclient_2013_pseudo_c.txt:433750): a cell becomes a draw-set member the moment it is popped from the todo list (:433783). A neighbour is enqueued only if the per-portal ConstructView (:433827) passes: the side-test (:433832-433849, dot(viewpoint, planeN)+d vs a 0.2 mm epsilon → POSITIVE/IN_PLANE/NEGATIVE) AND GetClip (:432344) returns a non-empty clip (:433858 if (arg3 != 0)). GetClip projects via xformStart and clips via ACRender::polyClipFinish (:702749).
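
As a rough illustration of the side-test described above, the three-way classification might look like the following. This is a sketch, not the decompiled code: it assumes positions are in metres (so 0.2 mm is 0.0002), assumes the epsilon band is symmetric around the plane, and uses hypothetical names (PlaneSide, PortalSideTest.Classify). Which of the three results lets a portal pass is not restated here; see the decomp lines cited above.

```csharp
using System;
using System.Numerics;

public enum PlaneSide { Negative, InPlane, Positive }

public static class PortalSideTest
{
    // 0.2 mm expressed in metres; the unit is an assumption of this sketch.
    public const float Epsilon = 0.0002f;

    // Classify the viewpoint against the portal plane (normal . p + d = 0).
    public static PlaneSide Classify(Vector3 viewpoint, Vector3 planeNormal, float planeD)
    {
        float signedDistance = Vector3.Dot(viewpoint, planeNormal) + planeD;
        if (signedDistance > Epsilon) return PlaneSide.Positive;
        if (signedDistance < -Epsilon) return PlaneSide.Negative;
        return PlaneSide.InPlane; // within the epsilon band around the plane
    }
}
```

Note that a viewpoint jittering by a few µm stays well inside a 0.2 mm band, so under these assumptions the side-test itself is far less sensitive to the rest jitter than the clip's surviving-vertex count.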

So retail gates membership on a non-empty clip too — it never flaps because (a) it processes each cell once (enqueue-once; no re-clip drift) and (b) its viewpoint is bit-stable at rest (the authoritative local position does not move). acdream diverges on both (re-enqueue drift + µm viewpoint jitter), and the two combine at the grazing portal.

The fix restores retail's traversal verbatim — enqueue-once on first discovery, union-in-place on growth — so acdream stops diverging from AddViewToPortals and the per-round re-clip drift disappears. No new predicate, no added robustness.


4. The fix (design)

Principle: membership is set by first discovery in distance-priority order (retail InsCellTodoList in the AddViewToPortals update_count == 0 branch, decomp :433478). A cell already discovered is never re-enqueued and never re-clipped; later view-growth into it is unioned in place and only refines that cell's own draw clip / draw-list position (retail AddToCell + FixCellList, :433492-433502). The drift-prone re-clip loop is deleted, so µm viewpoint jitter can no longer re-discover/undiscover a cell.

Change A — enqueue-once (the core fix), PortalVisibilityBuilder.cs ~308-327. Today a neighbour is RE-enqueued whenever its view grew, capped by MaxReprocessPerCell:

bool grew = AddRegion(nview, clippedRegion);                    // union in place (= retail AddToCell)
if (grew && popCounts[neighbourId] < MaxReprocessPerCell        // RE-ENQUEUE on growth ← the divergence
    && queued.Add(neighbourId))
    todo.Insert(neighbour, dist);

New: enqueue a neighbour only on first discovery (no CellViews / processedViewCounts entry yet). On growth into an already-discovered neighbour, union in place (keep AddRegion) and update its draw-list position if already drawn (port FixCellList), but do not re-insert it into the todo list. Remove MaxReprocessPerCell, popCounts, and the per-pop cap — enqueue-once terminates by construction (≤ N cells), matching retail's cell_view_done guarantee (:433784).
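
The traversal shape proposed in Change A can be sketched on a toy graph. Everything here is hypothetical scaffolding: EnqueueOnceFlood is not PortalVisibilityBuilder, views are modelled as sets of string labels rather than clip regions, and a plain FIFO queue stands in for the distance-priority todo list. It only illustrates "enqueue on first discovery, union in place on growth".

```csharp
using System;
using System.Collections.Generic;

public static class EnqueueOnceFlood
{
    // portals: cell id -> exits as (neighbour id, label of the slice reaching that neighbour).
    // Returns the pop order; views receives the unioned slice labels per discovered cell.
    public static List<int> Flood(
        int root,
        IReadOnlyDictionary<int, (int Neighbour, string Slice)[]> portals,
        out Dictionary<int, HashSet<string>> views)
    {
        views = new Dictionary<int, HashSet<string>>
        {
            [root] = new HashSet<string> { "root" },
        };
        var todo = new Queue<int>();
        var popOrder = new List<int>();
        todo.Enqueue(root);

        while (todo.Count > 0)
        {
            int cell = todo.Dequeue();
            popOrder.Add(cell);
            if (!portals.TryGetValue(cell, out var exits)) continue;

            foreach (var (neighbour, slice) in exits)
            {
                if (views.TryGetValue(neighbour, out var existing))
                {
                    existing.Add(slice); // growth: union in place, no re-enqueue
                }
                else
                {
                    views[neighbour] = new HashSet<string> { slice };
                    todo.Enqueue(neighbour); // first discovery: enqueue exactly once
                }
            }
        }
        return popOrder;
    }
}
```

On a diamond with a back edge (1→2, 1→3, 2→4, 3→4, 4→1) each cell is popped once and the flood terminates without a reprocess cap. The sketch also makes the cost visible: a slice that reaches a cell after that cell was popped (the 4→1 slice here) is recorded but never pushed through that cell's own portals, which is the late-slice propagation the status note at the top of this document reports as required by the existing test.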

Change B — exit-portal / OutsideView contribution stays first-process. Retail contributes a cell's exit-portal slice to OutsideView once, when the cell is processed; there is no re-enqueue path in AddViewToPortals to re-contribute a grown view. acdream's OutsideView contribution (line 256) already happens at process time, so removing the re-enqueue makes it match retail. Regression watch: the re-enqueue was added 2026-06-07 "to propagate late-discovered slices to exit portals" — which retail does not do, so dropping it is faithful, but a look-in / outside-view slice could shrink. The existing OutsideView tests (Builder_Cellar_WindowClippedToStairwell, the look-in tests) must stay green; if one shrinks, the fix is retail's AddToCell/FixCellList ordering, not reinstating the re-enqueue.

EyeInsidePortalOpening (lines 235-244) is unchanged by this fix. It is a separate near-degenerate single-clip guard (eye standing in a doorway), orthogonal to the re-enqueue, and stays as-is. No overlap predicate is introduced.

Why this is the flap fix, not a band-aid: the re-enqueue re-clips a popped cell's portals from its grown (drifted) view and can therefore add or drop the deep 0172-0175 cluster as the drift walks across the clip boundary under µm jitter. Enqueue-once decides the cluster's membership once, at first discovery, from the cell's clean first-accumulated view — the same decision retail makes.


5. Verification (TDD)

The flap itself is float-drift-dependent (it manifests only under live µm jitter at a specific grazing geometry), so the visual gate is the acceptance; the unit layer pins enqueue-once correctness and guards regressions.

  1. Enqueue-once correctness + termination (new). A multi-path fixture in PortalVisibilityBuilderTests: a diamond (a cell reachable from two parents, so its view grows after first discovery) and a cycle (portals looping back). Assert the flood (a) terminates with MaxReprocessPerCell removed, (b) yields a deduped OrderedVisibleCells, and (c) each reachable cell is present exactly once. This is the property the re-enqueue cap was protecting; enqueue-once provides it by construction. If a per-cell pop counter is cheap to surface, also assert each cell is popped ≤ 1 (RED under the re-enqueue, GREEN after) — the direct enqueue-once signal.
  2. No membership regression on known geometries. Build_EyeStandingInInteriorPortal_FloodsNeighbour, Build_CollapsedInteriorPortalNearEyeBeyondHalfMeter_FloodsNeighbour, Build_DegeneratePortalToTheSide_NotFlooded_NoOverInclusion (#95 guard), Build_IsDeterministic_*, and the cellar/window/look-in tests stay green (re-enqueue and enqueue-once agree on non-drifting geometry; if one changes, that is the §4 Change-B regression to handle retail's way, NOT by reinstating the re-enqueue).
  3. Visual gate (user) — the acceptance. At the cottage doorway threshold, hold still: the 2↔6 oscillation is gone; the deeper rooms render steadily through the door; walking in/out stays seamless. Re-run the [pv-input]/[render-sig] probes to confirm ids=/flood is stable while standing still.
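
The "ids=/flood is stable while standing still" check in step 3 can be made mechanical once the per-frame cell sets are in memory. The helper below is a hypothetical sketch (FloodStability and CountMembershipChanges are not existing names, and parsing the [render-sig] lines is deliberately left out because the exact log format is not reproduced here); it counts how many consecutive frame pairs disagree on set membership.

```csharp
using System;
using System.Collections.Generic;

public static class FloodStability
{
    // Counts frame-to-frame transitions where the visible cell set changed.
    // Order within a frame is ignored; only membership is compared.
    public static int CountMembershipChanges(
        IReadOnlyList<IReadOnlyCollection<ushort>> frames)
    {
        int changes = 0;
        for (int i = 1; i < frames.Count; i++)
        {
            var previous = new HashSet<ushort>(frames[i - 1]);
            if (!previous.SetEquals(frames[i])) changes++;
        }
        return changes;
    }
}
```

Fed the 6, 2, 6 sequence from §2.1 it reports two changes; a capture taken while standing still after a successful fix would be expected to report zero.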

dotnet build + dotnet test green before the visual gate.


6. Scope / non-goals

  • In scope: PortalVisibilityBuilder enqueue logic — enqueue-once on first discovery; remove the MaxReprocessPerCell re-enqueue, popCounts, and the per-pop cap; union-in-place + draw-list re-position on growth (port retail AddToCell/FixCellList); the new + existing tests.
  • Non-goals (explicitly deferred):
    • No overlap predicate / no added robustness — this is a verbatim retail port, not a new membership rule. EyeInsidePortalOpening (line 235) is untouched.
    • No clip-math rewrite (ProjectToClip/ClipToRegion stay).
    • No camera / movement / interpolation / physics changes in this step.
  • Contingency (next retail-faithful step, only if a residual flap survives the visual gate): bit-stabilize the viewpoint at rest. The live [pv-input] probe shows the player RenderPosition carries ~1-8 µm float noise at rest (e.g. Z 94.000000 ↔ 94.000008), which retail's authoritative local position does not. If enqueue-once leaves a residual flicker (the deep portal's first clip is knife-edge under that jitter), trace the jitter to its source (interpolation residual vs physics contact-settling) and make the local-player viewpoint bit-stable at rest, matching retail. Scoped as a separate step because it touches the movement/physics path; do it only if measured necessary.
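
One observation that may help when tracing the jitter: the spacing between adjacent 32-bit floats in the range 64 to 128 is 2^-17, about 7.6 µm, which is the size of the 94.000000 ↔ 94.000008 step in the probe. Assuming RenderPosition components are stored as float32 in metres (an assumption of this sketch, not something the probe output proves), that example would be consistent with the value toggling between two neighbouring representable floats. The helper name RestJitter.UlpAbove is hypothetical; MathF.BitIncrement is the standard .NET call for the next representable float.

```csharp
using System;

public static class RestJitter
{
    // Distance from a float32 value to the next representable float32 above it,
    // computed in double so the difference is not rounded away.
    public static double UlpAbove(float value) =>
        (double)MathF.BitIncrement(value) - value;
}
```

For example, RestJitter.UlpAbove(94f) evaluates to 2^-17 (7.62939453125e-06). If the trace confirms one-ULP toggling, the source is more likely a recomputation that rounds differently from frame to frame than a physically moving rest position, but that remains to be measured.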

7. Apparatus (diagnostic probes added this session)

  • Keep: PortalVisibilityBuilderTests.Build_IsDeterministic_* (regression value); tools/A8CellAudit gfxobj dump mode (reusable).
  • Strip after the fix is visually verified: the [pv-input] probe + RenderingDiagnostics.ProbePvInputEnabled (GameWindow.cs / RenderingDiagnostics.cs), the outRoot=/bshell= fields added to [render-sig], and launch-bshell-probe.ps1 / launch-pvinput.ps1. All env-var-gated and inert when off; safe to leave until the visual gate passes, then remove.

8. References

  • Diagnosis evidence + refutation: this session's [render-sig]/[pv-input] captures (cottage threshold), the Build_IsDeterministic test, the GfxObj 0x01000A2B render-geometry dump.
  • Retail decomp: PView::ConstructView :433750/:433827, PView::GetClip :432344, ACRender::polyClipFinish :702749 (docs/research/named-retail/acclient_2013_pseudo_c.txt).
  • Superseded: docs/research/2026-06-07-cutover-flip-render-residuals-diagnosis-handoff.md (wrong on see-through / EnvCell-walls / outdoor-node — see §2.4).
  • Memory to correct: project_indoor_flap_rootcause, reference_render_pipeline_state.