diff --git a/docs/superpowers/specs/2026-06-08-portal-flood-membership-stability-design.md b/docs/superpowers/specs/2026-06-08-portal-flood-membership-stability-design.md new file mode 100644 index 00000000..90bb5b98 --- /dev/null +++ b/docs/superpowers/specs/2026-06-08-portal-flood-membership-stability-design.md @@ -0,0 +1,213 @@ +# Portal-Flood Membership Stability — the indoor "flap" root-cause fix + +**Date:** 2026-06-08 +**Branch:** `claude/thirsty-goldberg-51bb9b` +**Status:** design approved (user, 2026-06-08); TDD implementation pending behind a visual gate. + +--- + +## 1. Summary + +The indoor render **flap** (textures "battling" at the doorway threshold) is **portal-flood +set-membership instability**: from a *stable* viewer cell, the PView BFS includes or excludes a +deeper cell cluster frame-to-frame, redrawing a different set each frame. The fix makes set +membership depend on a **stable visibility predicate** (side-test + view-region overlap) instead of +the **drift-prone surviving-vertex count** of the per-portal clip. Localized to +`PortalVisibilityBuilder`; no camera/movement/physics/clip-math rewrite. + +--- + +## 2. Root cause — confirmed with primary evidence + +### 2.1 What the flap actually is + +Live `[render-sig]` + `[pv-input]` capture at the Holtburg cottage threshold (landblock `0xA9B4`), +standing at the doorway: + +- The render root is **stable** (`root=0xA9B40170`, `outRoot=n`, i.e. an interior viewer cell — NOT + the outdoor node, NOT a root toggle). +- The flood cell set **oscillates frame-to-frame**: `ids=[0170,0171,0172,0173,0174,0175]` (6) ↔ + `ids=[0170,0171]` (2). The deeper cluster `{0172,0173,0174,0175}` pops in/out. +- The oscillation occurs **at a byte-identical (to cm) eye AND player position** — e.g. three + consecutive frames at eye `(155.55,15.45,96.05)`, player `(155.40,13.20,94.00)` with flood + `6,2,6`. + +### 2.2 Why it flips — the mechanism + +1. 
`PortalVisibilityBuilder.Build` is a **pure** static function with all-fresh per-call state + (new `frame`/`todo`/`queued`/`popCounts` every call). Proven deterministic by + `PortalVisibilityBuilderTests.Build_IsDeterministic_IdenticalInputsGiveIdenticalVisibleSet` + (passes). **So for identical inputs the output cannot flip** → the flip requires a varying input. +2. The high-precision `[pv-input]` probe (6 dp) shows the camera eye and the **player + `RenderPosition` carry perpetual ~1–8 µm float jitter every frame** even "standing still" + (e.g. player `94.000000 ↔ 94.000008`, eye `96.248863 ↔ 96.248871`). At most poses this is + harmless; the flood is stable. +3. The per-portal clip is a faithful homogeneous port of retail's `polyClipFinish` + (`PortalProjection.ProjectToClip` → `ClipToRegion`, w-aware Sutherland–Hodgman). But the + **re-enqueue fixpoint** (`MaxReprocessPerCell`) re-clips a cell's view each round, and the + codebase documents that this **drifts per round** (`PortalVisibilityBuilder.cs:43,151,732`: + "ProjectToClip drift keeps a view growing forever"). +4. At the threshold pose a deeper portal is **grazing** (oblique / near the eye) → it projects to a + thin sliver. The per-round drift + the µm viewpoint jitter flip `ClipToRegion`'s surviving-vertex + count across the `<3` boundary (PortalProjection.cs:118/121) → `clippedRegion.Count` flips + `0 ↔ N` → the cull at **`PortalVisibilityBuilder.cs:235`** + (`if (clippedRegion.Count == 0 && !EyeInsidePortalOpening) continue;`) drops the deeper cluster + on the empty-clip frames → flood `2 ↔ 6` → the flap. + +### 2.3 Why prior fixes did not work + +- **boom-snap** (camera stabilization, shipped): the jitter is sub-cm and **perpetual** (it is in the + player `RenderPosition`, propagating to the camera); snapping the boom distance did not make the + viewpoint bit-exact, so the knife-edge still flips. 
+- **w-space clip** (`ProjectToClip`/`ClipToRegion`, shipped): this made the *single* clip robust, but + the instability is in the **re-clip drift across rounds** + the membership gate's dependence on the + surviving-vertex count, not in a single clip. +- **viewer-cell dead-zone** (tried, reverted): the root does not toggle here (`root=0170` stable), so + a root-resolution dead-zone is irrelevant to this symptom. + +### 2.4 What this REFUTES (the 2026-06-07 handoff diagnosis) + +The predecessor handoff +(`docs/research/2026-06-07-cutover-flip-render-residuals-diagnosis-handoff.md`) is **wrong** on its +load-bearing claims; do not act on its F1/F2: + +- "See-through walls from outside" — **not reproduced**: standing outside with the door closed is + **stable** (user visual gate, 2026-06-08). +- "The walls ARE the EnvCell shells; the ModelId is a partial frame" — **refuted**: the cottage + ModelId GfxObj `0x01000A2B` is a full closed exterior (76 render polys, bbox 20×18×10.4 m, 46 + outward-facing walls + roof; cross-checked vs the physics BSP + retail `DrawBuilding`). The EnvCell + shells are interior-facing room surfaces. **F2 (build EnvCell back faces / double-side) targets the + wrong geometry.** +- "Oscillation = outdoor-node flood instability (1↔13)" — **corrected**: it is the *indoor* flood + (`outRoot=n`, stable root) swinging **2↔6**. F1 targeted the wrong root. +- "branch=RetailPViewInside every frame proves the flap is gone" — **tautological**: post-flip + `clipRoot = viewerRoot ?? _outdoorNode` is essentially never null, so the `branch` label can no + longer report `OutdoorRoot`. It proves nothing. + +--- + +## 3. Retail grounding + +Retail `PView::ConstructView` (decomp `acclient_2013_pseudo_c.txt:433750`): a cell becomes a draw-set +member the moment it is popped from the todo list (`:433783`). 
A neighbour is enqueued only if the +per-portal `ConstructView` (`:433827`) passes: the **side-test** (`:433832-433849`, `dot(viewpoint, +planeN)+d` vs a 0.2 mm epsilon → POSITIVE/IN_PLANE/NEGATIVE) **AND** `GetClip` (`:432344`) returns a +**non-empty** clip (`:433858` `if (arg3 != 0)`). `GetClip` projects via `xformStart` and clips via +`ACRender::polyClipFinish` (`:702749`). + +So retail gates membership on a non-empty clip **too** — it never flaps because (a) it processes each +cell **once** (enqueue-once; no re-clip drift) and (b) its viewpoint is **bit-stable at rest** (the +authoritative local position does not move). acdream diverges on **both** (re-enqueue drift + µm +viewpoint jitter), and the two combine at the grazing portal. + +The fix restores retail's **intent** — "the portal is visible through the accumulated view" — with a +predicate that is stable under acdream's residual drift/jitter, rather than the literal +drift-sensitive vertex count. + +--- + +## 4. The fix (design) + +**Principle:** set-membership is decided by a **stable** visibility predicate, not by the drift-prone +surviving-vertex count of the clip. The clip still computes the *draw* region; it no longer decides +*whether* a reachable cell is in the set. 
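The principle above leans on the plane-side test from §3 staying steady under the µm jitter described in §2.2. The following is a minimal sketch of that three-way classification, not the acdream or retail code. `PlaneSide`, `SideTestSketch`, and `Classify` are hypothetical names, the demo plane is invented for illustration, and the epsilon assumes world units are metres. Which side counts as "allowed" is not stated here, so the sketch only classifies.

```csharp
using System;
using System.Numerics;

// Hypothetical names throughout; not the identifiers used in acdream or retail.
public enum PlaneSide { Negative, InPlane, Positive }

public static class SideTestSketch
{
    // The 0.2 mm epsilon quoted in §3, assuming world units are metres.
    public const float Epsilon = 0.0002f;

    // Signed distance dot(viewpoint, planeN) + d, bucketed three ways.
    public static PlaneSide Classify(Vector3 viewpoint, Vector3 planeN, float d)
    {
        float signedDistance = Vector3.Dot(viewpoint, planeN) + d;
        if (signedDistance > Epsilon) return PlaneSide.Positive;
        if (signedDistance < -Epsilon) return PlaneSide.Negative;
        return PlaneSide.InPlane;
    }
}

public static class SideTestDemo
{
    public static void Main()
    {
        // Illustrative plane z = 90 (normal +Z, d = -90); not taken from the capture.
        var normal = new Vector3(0f, 0f, 1f);
        const float d = -90f;

        // The two player heights from the §2.2 jitter example.
        var a = SideTestSketch.Classify(new Vector3(155.40f, 13.20f, 94.000000f), normal, d);
        var b = SideTestSketch.Classify(new Vector3(155.40f, 13.20f, 94.000008f), normal, d);

        Console.WriteLine($"{a} {b} same={a == b}");
    }
}
```

With the viewpoint about 4 m from the plane, an 8 µm nudge cannot change the classification. The result can only change when the viewpoint sits within roughly the epsilon of the plane itself, which is a much narrower condition than a thin sliver losing a vertex during clipping.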
+ +**Change — localized to `PortalVisibilityBuilder` (the line-235 gate):** + +- Today (`PortalVisibilityBuilder.cs:235-244`): + ``` + if (clippedRegion.Count == 0) + { + if (!EyeInsidePortalOpening(poly, cell.WorldTransform, cameraPos)) continue; // cull + foreach (var vp in activeViewPolygons) clippedRegion.Add(clone(vp)); // flood with parent view + } + ``` +- New: when `clippedRegion.Count == 0` but the portal **passed the side-test** (already computed: + `sideAllowed`, the stable plane-side test) **and its projection still overlaps the current view + region** (a stable convex-overlap predicate — true for a thin grazing sliver inside the region, + false for an off-screen portal), keep the neighbour by flooding it with the parent's view (the same + substitution the `EyeInsidePortalOpening` branch already does). Otherwise cull as today. + +The drift-prone `clippedRegion.Count` no longer flips membership; a portal that is genuinely visible +through the accumulated view (stable side-test + stable overlap) stays in the set every frame. + +**The stable overlap predicate** (`PortalOverlapsView`, new small helper): does the portal's +projected polygon overlap any of the `activeViewPolygons`? Implemented to be stable for the +near/grazing case (the failure mode is `ClipToRegion` losing a vertex to float noise, NOT the gross +position of the sliver, which sits well inside the view region — so a robust "any-overlap" test +returns a steady boolean). Exact formulation is fixed in TDD (§5); candidates: (a) any portal NDC +vertex inside the region OR any region vertex inside the portal OR any edge crossing; (b) reuse the +existing `EyeInsidePortalOpening` 3D near-region test generalized from "eye in opening" to "eye within +the portal's view cone." The chosen formulation MUST keep the #95 guard test green. 
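Candidate (a) can be sketched as a plain 2D convex-overlap test. This is an illustration under stated assumptions, not the formulation that TDD will fix: it assumes the portal has already been projected to 2D NDC, that both polygons are convex with at least three non-collinear vertices, and it does not handle a portal that crosses the near plane. `PortalOverlapSketch` and the `Vector2` list types are hypothetical; only the name `PortalOverlapsView` comes from this spec.

```csharp
using System;
using System.Collections.Generic;
using System.Numerics;

public static class PortalOverlapSketch
{
    // Z-component of the 2D cross product (b - a) x (c - a).
    static float Cross(Vector2 a, Vector2 b, Vector2 c) =>
        (b.X - a.X) * (c.Y - a.Y) - (b.Y - a.Y) * (c.X - a.X);

    // True when p is inside or on the boundary of a convex polygon (either winding).
    static bool InsideConvex(IReadOnlyList<Vector2> poly, Vector2 p)
    {
        bool sawPositive = false, sawNegative = false;
        for (int i = 0; i < poly.Count; i++)
        {
            float c = Cross(poly[i], poly[(i + 1) % poly.Count], p);
            if (c > 0) sawPositive = true;
            if (c < 0) sawNegative = true;
        }
        return !(sawPositive && sawNegative);
    }

    // True when segments ab and cd cross at a single interior point.
    static bool SegmentsCross(Vector2 a, Vector2 b, Vector2 c, Vector2 d) =>
        Cross(a, b, c) * Cross(a, b, d) < 0 && Cross(c, d, a) * Cross(c, d, b) < 0;

    // Any portal vertex in the region, any region vertex in the portal, or any edge crossing.
    public static bool ConvexOverlap(IReadOnlyList<Vector2> portal, IReadOnlyList<Vector2> region)
    {
        if (portal.Count < 3 || region.Count < 3) return false;
        foreach (var v in portal) if (InsideConvex(region, v)) return true;
        foreach (var v in region) if (InsideConvex(portal, v)) return true;
        for (int i = 0; i < portal.Count; i++)
            for (int j = 0; j < region.Count; j++)
                if (SegmentsCross(portal[i], portal[(i + 1) % portal.Count],
                                  region[j], region[(j + 1) % region.Count]))
                    return true;
        return false;
    }

    // Hypothetical signature: overlap against any polygon of the accumulated view.
    public static bool PortalOverlapsView(
        IReadOnlyList<Vector2> portalNdc,
        IEnumerable<IReadOnlyList<Vector2>> activeViewPolygons)
    {
        foreach (var region in activeViewPolygons)
            if (ConvexOverlap(portalNdc, region)) return true;
        return false;
    }
}

public static class PortalOverlapDemo
{
    static IReadOnlyList<Vector2> Rect(float x0, float y0, float x1, float y1) =>
        new[] { new Vector2(x0, y0), new Vector2(x1, y0), new Vector2(x1, y1), new Vector2(x0, y1) };

    public static void Main()
    {
        var view = new List<IReadOnlyList<Vector2>> { Rect(-1f, -1f, 1f, 1f) };
        var sliver = Rect(0.10f, -0.5f, 0.1001f, 0.5f);   // thin grazing portal inside the view
        var offScreen = Rect(2f, -0.5f, 3f, 0.5f);        // portal well to the side

        Console.WriteLine(PortalOverlapSketch.PortalOverlapsView(sliver, view));
        Console.WriteLine(PortalOverlapSketch.PortalOverlapsView(offScreen, view));
    }
}
```

The demo prints `True` for the sliver and `False` for the off-screen portal. The answer depends on where the sliver sits rather than on how many vertices survive a clip, which is the property the gate needs. Whether this exact test keeps the #95 guard green is for the TDD step to confirm.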
+ +**This subsumes the `EyeInsidePortalOpening` special-case** (a portal the eye stands in trivially +overlaps the full-screen region), so that ad-hoc patch is removed once the general predicate is in +place — fewer special cases, not more. + +**#95 over-inclusion guard preserved:** an off-screen portal (2 m to the side) does not overlap the +view region → still culled. No visible-set blowup. + +--- + +## 5. Verification (TDD) + +Write the failing test first, then the fix. + +1. **RED → GREEN — degenerate-clip membership.** New deterministic test in + `PortalVisibilityBuilderTests`: construct an interior portal that (a) passes the side-test, (b) + whose projection overlaps the view region, but (c) whose `ClipToRegion` returns `<3` verts + (degenerate sliver — the live failure mode), and the eye is NOT standing in the opening. Assert the + neighbour **is** in `OrderedVisibleCells`. RED today (culled at line 235 because not + `EyeInsidePortalOpening`); GREEN after the fix (kept because side-test + overlap). This pins the + gate change without needing to reproduce the exact µm knife-edge. + - *Optional companion (robustness):* if a fixture can be found whose clip flips `<3 ↔ ≥3` under a + µm eye nudge, add a test asserting `OrderedVisibleCells` is identical across the nudge. Skip if + it proves too geometry-sensitive to construct stably — the deterministic test above is the gate. +2. **Stays GREEN — #95 over-inclusion guard.** `Build_DegeneratePortalToTheSide_NotFlooded_NoOverInclusion` + (off-screen portal stays culled). +3. **Stays GREEN — existing behavior.** `Build_EyeStandingInInteriorPortal_FloodsNeighbour`, + `Build_CollapsedInteriorPortalNearEyeBeyondHalfMeter_FloodsNeighbour`, + `Build_IsDeterministic_IdenticalInputsGiveIdenticalVisibleSet`, and the existing cellar/window + clip tests. +4. **Visual gate (user).** At the cottage doorway threshold, hold still — the 2↔6 oscillation is + gone; the deeper rooms render steadily through the door. 
Walking in/out remains seamless. + +`dotnet build` + `dotnet test` green before the visual gate. + +--- + +## 6. Scope / non-goals + +- **In scope:** `PortalVisibilityBuilder` (the line-235 gate + the `PortalOverlapsView` helper), + removal of the now-subsumed `EyeInsidePortalOpening` force-flood branch, the new + existing tests. +- **Non-goals (explicitly deferred):** + - No camera / movement / interpolation / physics changes (the µm viewpoint jitter is left as-is; + the fix is robust to it). + - No clip-math rewrite (`ProjectToClip`/`ClipToRegion` stay). + - **Restoring retail's enqueue-once traversal** (removing the re-enqueue fixpoint, eliminating the + per-round drift at its source) is a real, larger, retail-faithful improvement but a **separate + step** — out of scope here. This fix neutralizes the drift's effect on membership without + restructuring the BFS. + +--- + +## 7. Apparatus (diagnostic probes added this session) + +- **Keep:** `PortalVisibilityBuilderTests.Build_IsDeterministic_*` (regression value); + `tools/A8CellAudit` `gfxobj` dump mode (reusable). +- **Strip after the fix is visually verified:** the `[pv-input]` probe + `RenderingDiagnostics.ProbePvInputEnabled` + (GameWindow.cs / RenderingDiagnostics.cs), the `outRoot=`/`bshell=` fields added to `[render-sig]`, + and `launch-bshell-probe.ps1` / `launch-pvinput.ps1`. All env-var-gated and inert when off; safe to + leave until the visual gate passes, then remove. + +--- + +## 8. References + +- Diagnosis evidence + refutation: this session's `[render-sig]`/`[pv-input]` captures (cottage + threshold), the `Build_IsDeterministic` test, the GfxObj `0x01000A2B` render-geometry dump. +- Retail decomp: `PView::ConstructView` `:433750`/`:433827`, `PView::GetClip` `:432344`, + `ACRender::polyClipFinish` `:702749` (`docs/research/named-retail/acclient_2013_pseudo_c.txt`). 
+- Superseded: `docs/research/2026-06-07-cutover-flip-render-residuals-diagnosis-handoff.md` (wrong on + see-through / EnvCell-walls / outdoor-node — see §2.4). +- Memory to correct: `project_indoor_flap_rootcause`, `reference_render_pipeline_state`.