acdream/docs/superpowers/specs/2026-06-09-portal-flood-bounded-propagation-r-a2b-design.md
Erik 7b8a490da9 docs(render): R-A2b plan — back-portal side-cull (Option B), verify-first B1/B2 pin
Reading retail InitCell (:432896) side test during writing-plans showed retail's flood is acyclic (the back portal fails the side test, so 0171<->0173 can't cycle). Our flood traverses the back portal -> the cycle -> the churn. Option B (user-chosen): cull the back portal like retail, keep the forward-portal void rescue, remove the dead cap. Phase 1 pins WHY the back portal is traversed (B1 eyeInsideOpening bypass vs B2 CameraOnInteriorSide convention) before the fix; spec REVISION updated A->B.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-09 10:25:28 +02:00


R-A2b — Portal-Flood Bounded Propagation (the indoor "flap" fix)

Date: 2026-06-09
Branch: claude/thirsty-goldberg-51bb9b
Phase: full retail render port (Option A) → R-A2b
Status: design, approved direction (Option A, the faithful clip), pending written-spec review.

Revives docs/superpowers/specs/2026-06-08-portal-flood-enqueue-once-port-design.md (REVISION banner = "bounded propagation") and docs/superpowers/plans/2026-06-08-portal-flood-bounded-propagation.md. Both were marked ⛔ SUPERSEDED on the strength of a single maxPop=1 capture that turned out to be the wrong reproduction (camera-turn at rest, not a doorway crossing). This session re-ran the pin with a slow walk-through and measured maxPop=16 on a fifth of frames — the churn is real. Their banners are corrected to redirect here.


⚠️ REVISION (2026-06-09, writing-plans decomp pass): approach changed A → B (back-portal side-cull)

Reading retail PView::InitCell (:432896; side test at :432962) + AddToCell (:433050) during plan-writing showed WHY retail never churns: its per-portal side test culls the "back" portal (the doorway just flooded through — the viewpoint is on its exit side), so retail's flood is an acyclic tree and the 0171↔0173 mutual cycle cannot form. Retail has no eye-in-opening bypass of that cull.

Our flood forms the cycle because the back portal 0173→0171 is traversed where retail culls it ([pv-trace]: pop 0173 p0->0171 grew=True). The re-enqueue churn (what §4 Option A targeted) is a consequence of that non-retail cycle. The user chose the more-faithful Option B: cull the back portal like retail (kill the cycle at its source), keep the forward-portal clip-empty void rescue, and remove the now-dead MaxReprocessPerCell cap. §4 (Option A coverage test) is superseded by §4-B below.
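The difference between the cyclic and the acyclic flood can be illustrated with a toy worklist. This is a minimal sketch under loud assumptions, not the PortalVisibilityBuilder code: the function name `flood`, the `came_from` bookkeeping, and the graph encoding are hypothetical, every traversed portal is assumed to report growth (the drifted near-duplicate case), and "the portal leads back to the cell we arrived from" stands in for the geometric side test.

```python
from collections import deque

MAX_REPROCESS_PER_CELL = 16  # the cap value quoted in this spec


def flood(portals, root, cull_back_portal):
    """Return per-cell pop counts for a toy flood.

    portals: dict mapping a cell id to the list of neighbour cell ids.
    Assumption: every traversed portal reports growth, so only the
    back-portal cull or the cap can stop a mutual cycle.
    """
    pops = {}
    queue = deque([(root, None)])  # (cell, cell we arrived from)
    while queue:
        cell, came_from = queue.popleft()
        if pops.get(cell, 0) >= MAX_REPROCESS_PER_CELL:
            continue  # the cap binds: this cell is not processed again
        pops[cell] = pops.get(cell, 0) + 1
        for neighbour in portals[cell]:
            # Stand-in for the side test: skip the portal just flooded through.
            if cull_back_portal and neighbour == came_from:
                continue
            queue.append((neighbour, cell))
    return pops


two_rooms = {"0171": ["0173"], "0173": ["0171"]}
print(flood(two_rooms, "0171", cull_back_portal=False))  # {'0171': 16, '0173': 16}
print(flood(two_rooms, "0171", cull_back_portal=True))   # {'0171': 1, '0173': 1}
```

Without the cull the two cells ping-pong until the cap stops them; with it each cell is popped once and the cap is never reached, which is why the cap becomes dead code in this toy.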

Open — WHY is the back portal traversed (this pins the exact fix; plan Phase 1 verifies before code):

  • (B1) the bypass: EyeInsidePortalOpening switches off the side-cull (Build lines ~208-216: !CameraOnInteriorSide(...) && !eyeInsideOpening) when the eye is within 1.75 m of a doorway → fix: drop && !eyeInsideOpening from the side-cull (back portals cull; the separate clip-empty rescue at Build ~241-250 still rescues FORWARD portals, so the 2026-06-05 void fix is preserved).
  • (B2) the side test itself: CameraOnInteriorSide (PortalVisibilityBuilder.cs:717-724) returns true for the back portal where retail's InitCell test (eax_9 == portal_side, :432962) culls it → fix: align our side test to retail's convention.
  • Discriminator: the magnitude of the back portal's signed distance D to the doorway plane at the churn frames: |D| > 1.75 m ⇒ B2 (bypass is off; the side test passed on its own); |D| ≤ 1.75 m ⇒ B1 (bypass in play). At root=0171, p1->0173 was measured at D=-2.73 m (|D| > 1.75 m, so the bypass is off), which indicates B2. The churn cluster, however, was at a different eye pose with no captured D, so Phase 1 confirms before the fix.
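The discriminator can be written as a one-line classification. The sketch below assumes the comparison is on the magnitude of D (the D=-2.73 m sample is described as "bypass off", which only holds for |D|); the function name is hypothetical, and the real EyeInsidePortalOpening test may check more than plane distance.

```python
BYPASS_RADIUS_M = 1.75  # eye-in-opening bypass distance quoted in this spec


def classify_back_portal_traversal(signed_distance_m):
    """Return 'B1' when the bypass could be in play, 'B2' when it is off.

    Assumption: only the distance to the doorway plane decides whether
    the bypass is active.
    """
    if abs(signed_distance_m) <= BYPASS_RADIUS_M:
        return "B1"
    return "B2"


print(classify_back_portal_traversal(-2.73))  # B2
```

A Phase 1 capture would feed the D logged at each churn frame through this check; a mix of B1 and B2 results would mean both sub-mechanisms are present.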

1. Summary

The indoor flap (grey/background flashing through doorways while moving) is a portal-flood re-enqueue churn in PortalVisibilityBuilder.Build. When the camera crosses an interior doorway, the two rooms sharing that doorway (0171↔0173 at the Holtburg cottage) mutually re-contribute through the shared portal. Each pass, the near-side re-clip produces a drifted near-duplicate region; the reciprocal leaves it non-empty; AddRegion's exact-polygon dedup doesn't recognize it → grew=true → the neighbour re-enqueues. It ping-pongs to the MaxReprocessPerCell=16 cap, which cuts the flood at an arbitrary depth. Because the cut depth depends on the exact eye position, sub-cm eye creep makes the visible cell set swing (2↔4 cells) frame-to-frame → the grey flap.

The fix — see the REVISION banner above: Option B (back-portal side-cull), not the Option A coverage test described in this paragraph. Retail's flood is acyclic because its per-portal side test culls the back portal; our flood cycles because the back portal is traversed (sub-mechanism B1/B2 pinned by plan Phase 1). Fix: cull the back portal like retail (kill the cycle), keep the forward-portal clip-empty void rescue, remove the now-dead MaxReprocessPerCell + popCounts cap. Scope: PortalVisibilityBuilder only — no camera, rooting, clip-math, or seal change.

(Original Option A text, superseded — kept for the record:) port retail's bounded propagation: a candidate contribution already covered by the neighbour's accumulated view does not count as growth; only the uncovered remainder propagates. Mirrors retail's "redundant → empty before copy_view". This is a non-retail mechanism bounding a cycle retail never forms — Option B removes the cycle instead.


2. Diagnosis — verified this session (the verify-first gate)

The 2026-06-08 handoff made the fix conditional on a measurement gate (docs/research/2026-06-08-indoor-flap-edgeon-vs-camera-position-handoff.md §5). Results:

2.1 §5.3 — retail's clip collapses at edge-on (the "port clip robustness" idea is dead)

PView::GetClip (:432344) → ACRender::polyClipFinish (:702749) bails when the clipped polygon drops below 3 vertices (:702863, no guard band). ClipPortals (:433654) only propagates if (ecx_8 != 0). ConstructView (:433750) rebuilds the flood every frame, no cross-frame hysteresis. Our PortalProjection.ClipToRegion collapses identically. Edge-on area-collapse is geometric — there is no retail clip robustness to port. That option is off the table.
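The "fewer than 3 vertices means empty" behaviour can be shown with a single Sutherland-Hodgman pass against one half-plane. This is a 2D sketch for illustration only, not the retail polyClipFinish nor our ClipToRegion; the function names are hypothetical, and the consecutive-duplicate cleanup is an assumption added so that a polygon that merely touches the clip boundary collapses to a point or a segment.

```python
def clip_to_half_plane(poly, a, b, c):
    """Keep the part of convex `poly` where a*x + b*y + c >= 0."""
    out = []
    n = len(poly)
    for i in range(n):
        p, q = poly[i], poly[(i + 1) % n]
        dp = a * p[0] + b * p[1] + c
        dq = a * q[0] + b * q[1] + c
        if dp >= 0:
            out.append(p)
        if (dp >= 0) != (dq >= 0):  # the edge crosses the boundary
            t = dp / (dp - dq)
            out.append((p[0] + t * (q[0] - p[0]), p[1] + t * (q[1] - p[1])))
    return out


def drop_consecutive_duplicates(poly):
    """Remove repeated neighbours, including the wrap-around pair."""
    out = []
    for v in poly:
        if not out or out[-1] != v:
            out.append(v)
    if len(out) > 1 and out[0] == out[-1]:
        out.pop()
    return out


def clip_or_empty(poly, a, b, c):
    """Return the clipped polygon, or [] when it has fewer than 3 vertices."""
    clipped = drop_consecutive_duplicates(clip_to_half_plane(poly, a, b, c))
    return clipped if len(clipped) >= 3 else []


square = [(0, 0), (1, 0), (1, 1), (0, 1)]
print(len(clip_or_empty(square, 1, 0, -0.5)))  # 4  (x >= 0.5 keeps half the square)
print(clip_or_empty(square, 1, 0, -1))         # [] (x >= 1 leaves only an edge)
```

The second call is the geometric collapse: nothing with area survives, so there is nothing to propagate, and no guard band would change that.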

2.2 §5.1 — the flap is a same-root flood oscillation, not a root-swap

analyze_flap_vis.py over launch-camprobe.log: of ~4,009 vis transitions, 3,984 are same-root vs 25 root-changes (99.4 % same-root). The flap is a flood-membership oscillation inside room 0171, not the "going-outside" root swap, and not the root doorway's D5 rescue flip (3,836/3,984 transitions had no change in the root doorway's clip/D-band/side).
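The split and the percentages above can be reproduced with a few lines. This is not analyze_flap_vis.py; it is a sketch that assumes a "vis transition" is a frame-to-frame change in visible-cell count and that each frame is reduced to a hypothetical (root, visible_count) pair.

```python
def split_transitions(samples):
    """Count vis transitions, split into (same_root, root_change).

    samples: list of (root_cell, visible_cell_count), one per frame.
    Assumption: a transition is any change in visible_cell_count.
    """
    same_root = root_change = 0
    for (r0, v0), (r1, v1) in zip(samples, samples[1:]):
        if v0 == v1:
            continue  # membership size unchanged: not a transition
        if r0 == r1:
            same_root += 1
        else:
            root_change += 1
    return same_root, root_change


def percent(part, whole):
    return round(100.0 * part / whole, 1)


print(percent(3984, 3984 + 25))  # 99.4
print(percent(3836, 3984))       # 96.3
```

The second figure restates the 3,836/3,984 ratio as a share: roughly 96 % of the same-root transitions happened with no change at the root doorway.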

2.3 The mechanism — [pv-trace] in launch-camprobe.log

At a near-stationary eye (157.30, 7.8x, 96.25, ~1 cm creep), one Build call shows 0171 popped ~19× and 0173 ~20×, each round p1->0173 addCell polys=1 clipVerts=4 recip=1->1 grew=True queued=True, the processed watermark climbing 0→1→…→19 until the cap binds at 16. The mutual contribution does not shrink (constant clipVerts=4, polys=1) — it is the same doorway aperture, drifting. Per-cell view counts swing 1↔53 and cells 016F/0172 flicker in/out → the flap.
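Why an exact-polygon dedup reports growth on every round can be seen in a few lines. The sketch uses a hypothetical stand-in for AddRegion (the real signature is not shown in this spec) and an arbitrary 0.1 mm drift chosen only for illustration.

```python
def add_region_exact(regions, candidate):
    """Append candidate unless an identical polygon is stored; return grew."""
    if candidate in regions:
        return False
    regions.append(candidate)
    return True


aperture = ((0.0, 0.0), (1.0, 0.0), (1.0, 2.0), (0.0, 2.0))
drifted = tuple((x + 1e-4, y) for x, y in aperture)  # same doorway, tiny drift

regions = []
print(add_region_exact(regions, aperture))  # True  (first contribution)
print(add_region_exact(regions, aperture))  # False (exact duplicate is caught)
print(add_region_exact(regions, drifted))   # True  (near-duplicate counts as growth)
```

The third call is the spurious grew=True from the trace: the aperture has not changed in any visible way, yet the equality test cannot tell.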

2.4 Confirmation — launch-churn-confirm.log (live walk-through, this session)

analyze_churn_confirm.py: 44.4 % of frames have maxPop ≥ 2; worst maxPop = 16 (cap saturated, 3,745 frames); root 0171 maxPopMax=16; [flap] vis oscillation reproduced (187 transitions, vis 2/3/4). The calm baseline (player at rest, root 0172) sits at maxPop=1, which is exactly the position the 2026-06-08 "refuted (maxPop=1)" capture sampled. The DO-NOT was an unrepresentative sample; the churn is confirmed at flap-time.

2.5 Why retail doesn't churn (termination primitive)

Render::copy_view (:344784), the slice-adder, just appends (with internal consecutive-vertex cleanup); it has no redundancy check (confirmed by reading it). So retail's termination is upstream: a redundant re-contribution does not generate a new propagatable slice, via the clip going empty (GetClip/OtherPortalClip < 3 verts) and/or the monotonic update_count watermark (each slice processed once). The exact primitive (empty-clip vs watermark vs both) is confirmed in the plan by tracing the ClipPortals/AddToCell/AdjustCellView mutual-cycle in full. Either way the observable contract is: a redundant contribution adds no new visible area, so it does not grow the view. Our ApplyReciprocalClip → AddRegion path violates that: it leaves the redundant contribution non-empty (recip=1->1) and AddRegion's polygon-equality dedup can't catch the drifted near-duplicate → spurious grew.


3. Retail grounding (the traversal being matched)

From docs/research/named-retail/acclient_2013_pseudo_c.txt:

  • PView::ConstructView (:433750): per-frame flood — cell_todo_num=0, seed root, pop one cell at a time, append to cell_draw_list (= membership), ClipPortals(cell, 0), then AddViewToPortals.
  • PView::ClipPortals (:433572): processes the cell's view slices [update_count, view_count); per portal GetClip; exit portal → copy_view/landscape; neighbour → OtherPortalClip. Propagates only when the clipped result is non-empty (ecx_8 != 0 / eax_16 != 0).
  • PView::AddViewToPortals (:433446): first discovery (processed_stamp==0) → InitCell + InsCellTodoList (enqueue once); growth (processed_stamp != view_stamp) → AddToCell + FixCellList, then processed_stamp = view_stamp (no re-enqueue).
  • PView::AddToCell (:433050): incremental — clips the cell's portals against only the newly-added slices; does not re-contribute to OutsideView.
  • PView::OtherPortalClip (:433524): reciprocal back-clip; yields empty for a redundant back-contribution.
  • Render::copy_view (:344784): appends a slice; no dedup (confirms the empty-for-redundant decision is upstream, in the clip).

Takeaway: retail re-processes growth (faithful — keep it), but a redundant re-contribution adds no new visible area → no new propagatable slice → termination (via empty clip and/or the monotonic watermark; §2.5). Our divergence is purely that redundant re-contributions stay non-empty and grow the view.
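The watermark half of that takeaway (slices in [update_count, view_count) are processed once) can be sketched as a small data structure. This is built only from the description in this section, not from the decompiled code; the class and method names are hypothetical.

```python
class CellView:
    """Accumulated view slices for one cell, with a processed watermark."""

    def __init__(self):
        self.slices = []       # accumulated view slices, append-only
        self.update_count = 0  # slices below this index were already processed

    def add_slice(self, view_slice):
        self.slices.append(view_slice)

    def take_unprocessed(self):
        """Return slices in [update_count, len(slices)) and advance the watermark."""
        fresh = self.slices[self.update_count:]
        self.update_count = len(self.slices)
        return fresh


view = CellView()
view.add_slice("a")
view.add_slice("b")
print(view.take_unprocessed())  # ['a', 'b']
print(view.take_unprocessed())  # []
view.add_slice("c")
print(view.take_unprocessed())  # ['c']
```

Under this model growth is still re-processed (the late slice "c" is picked up), but no slice is ever clipped through the portals twice, which is the property the takeaway relies on.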


4. The fix (design — Option A)

Scope: PortalVisibilityBuilder only.

4.1 Bounded growth (the core change). A candidate contribution to a neighbour grows the neighbour's view (and may re-enqueue) only by the area not already covered by that neighbour's accumulated view. Concretely, before unioning a candidate region into frame.CellViews[neighbour], intersect/subtract it against the neighbour's existing accumulated regions and keep only the uncovered remainder; grew is true iff that remainder is non-empty. A drifted near-duplicate of an already-covered region has ~zero uncovered area → grew=false → no re-enqueue → the mutual cycle terminates. This is retail's "redundant → empty," expressed on our region representation, and it is drift-tolerant by construction (it tests coverage, not polygon equality — so it is NOT the rejected epsilon-dedup band-aid).
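The coverage idea in 4.1 is easiest to see in one dimension, with intervals standing in for screen-space regions. The sketch below is an illustration of "only the uncovered remainder counts as growth", not a proposed implementation: the names are hypothetical, real regions are 2D polygons, and it does not settle how a sliver-sized remainder left by a candidate drifting slightly outside the covered area should be treated.

```python
def uncovered_remainder(candidate, covered):
    """Return the parts of interval `candidate` outside every interval in `covered`."""
    pieces = [candidate]
    for lo, hi in covered:
        next_pieces = []
        for a, b in pieces:
            if hi <= a or lo >= b:  # no overlap: the piece survives whole
                next_pieces.append((a, b))
                continue
            if a < lo:              # part sticking out on the left
                next_pieces.append((a, lo))
            if hi < b:              # part sticking out on the right
                next_pieces.append((hi, b))
        pieces = next_pieces
    return pieces


def add_region_bounded(covered, candidate):
    """Union only the uncovered remainder; grew is True iff it is non-empty."""
    remainder = uncovered_remainder(candidate, covered)
    covered.extend(remainder)
    return bool(remainder)


view = [(0.0, 1.0)]
print(add_region_bounded(view, (0.1, 0.9)))  # False (fully covered, no growth)
print(add_region_bounded(view, (0.5, 1.5)))  # True  (only (1.0, 1.5) is new)
print(add_region_bounded(view, (0.2, 1.4)))  # False (covered by the union)
```

The third call is the case an equality dedup cannot handle: the candidate matches no stored interval, yet it adds nothing, so it does not re-enqueue.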

4.2 Remove the band-aid. Delete MaxReprocessPerCell and popCounts and the per-pop re-enqueue cap logic in both Build and BuildFromExterior. With redundant contributions no longer growing the view, termination is structural (each cell's genuinely-new slices process a bounded number of times; the flood converges as the aperture is covered).

4.3 Keep re-processing of genuinely-new slices. A contribution that does add uncovered area still grows the view and re-enqueues, so late-discovered slices still reach exit portals (Build_ViewGrowthAfterDoneCell_PropagatesNewSlicesToExit stays GREEN).

Exact code form (→ implementation plan, Task 1). Whether 4.1 is implemented as (i) a polygon coverage test in AddRegion (candidate ⊆ union(existing) → no growth), (ii) an uncovered-remainder set-difference before the union, or (iii) matching retail's ClipPortals slice-watermark + AddToCell in-place growth, is finalized in the plan by reading the retail ClipPortals/AddToCell/AdjustCellView slice loop in full and choosing the smallest faithful form. The principle (redundant/covered → no growth; uncovered remainder propagates; cap removed; genuine re-processing kept) is fixed here.

Unchanged (explicit): ProjectToClip/ClipToRegion, EyeInsidePortalOpening, the reciprocal ApplyReciprocalClip, the OutsideView exit contribution, rooting (clipRoot = viewerRoot ?? _outdoorNode), the camera, and the landscape-through-door seal. No new heuristic, hysteresis, or epsilon.


5. Testing (TDD)

  1. Eye-sweep membership stability (new, the RED→GREEN driver). In AcDream.App.Tests, build the flood at a sequence of eye positions stepping monotonically across a grazing doorway (synthetic two-room + shared-portal topology reproducing the 0171↔0173 mutual aperture). Assert each cell's membership across the sweep is a single contiguous run — no present→absent→present flicker — and, if surfaced, per-cell pop count ≤ a small constant. RED under the churn, GREEN after the bound.
  2. Termination without the cap. Diamond + cycle fixtures: assert the flood terminates with MaxReprocessPerCell removed, OrderedVisibleCells deduped, each reachable cell present once.
  3. No membership regression. Build_ViewGrowthAfterDoneCell_PropagatesNewSlicesToExit, Build_IsDeterministic_*, Build_EyeStandingInInteriorPortal_FloodsNeighbour, Build_DegeneratePortalToTheSide_NotFlooded_NoOverInclusion (#95 over-inclusion guard), and the cellar/window/look-in tests stay GREEN. The 4 physics rest-stability guards stay GREEN.
  4. Visual gate (user) — acceptance. At the cottage doorway: walk through and turn the camera — interior rooms render steadily, no battling/popping; [pv-input] flood stable per eye pose; [portal-churn] maxPop ≤ a small constant (no near-16 churn). Then strip the [portal-churn]/[flap]/[pv-trace] apparatus.

dotnet build + dotnet test green before the visual gate.
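The contiguity assertion in test 1 can be stated precisely as "each cell's presence across the sweep forms at most one unbroken run". The sketch below shows that check in isolation; the real test would live in AcDream.App.Tests and build each frame from the flood, whereas here the per-eye-position visible sets are supplied by hand and the function names are hypothetical.

```python
def is_single_contiguous_run(present):
    """True when the True values in `present` form at most one unbroken run."""
    runs = 0
    previous = False
    for flag in present:
        if flag and not previous:
            runs += 1  # a new run of presence starts here
        previous = flag
    return runs <= 1


def flickering_cells(sweep):
    """sweep: list of visible-cell sets, one per eye position.

    Returns the sorted cells whose membership goes present, absent, present.
    """
    cells = set().union(*sweep)
    return sorted(
        c for c in cells
        if not is_single_contiguous_run([c in frame for frame in sweep])
    )


churning = [{"0171", "0173"}, {"0171"}, {"0171", "0173"}]
steady = [{"0171"}, {"0171", "0173"}, {"0173"}]
print(flickering_cells(churning))  # ['0173']
print(flickering_cells(steady))    # []
```

A cell may legitimately appear or disappear once as the eye crosses the doorway, which is why the check allows a single run rather than requiring constant membership.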


6. Scope / non-goals / risks

  • In scope: PortalVisibilityBuilder bounded-growth (4.1) + cap removal (4.2) in both Build and BuildFromExterior; the new tests.
  • Under-inclusion risk + mitigation: an over-aggressive "covered" test could drop a genuinely-visible cell (a hole). Mitigation: "covered" is conservative (drop a candidate's growth only when fully covered); the #95 over-inclusion guard, the eye-standing/look-in/cellar tests, and the new eye-sweep test (must not drop a cell mid-sweep) bound both directions. Surface any test tension during implementation; do not weaken a test to pass.
  • §4 camera (deferred, separate divergence): the eye floating edge-on (retail's eye is pulled in, collided 93 % at the doorway — flap-cam-measure.log) can make the churn fire more often, but is not required for this fix — the churn is a real flood bug at any eye position. Revisit as a follow-up only if a residual remains after R-A2b.
  • No rooting / clip-math-rewrite / seal / physics change.

7. Apparatus + references

  • Captures (untracked, large): launch-churn-confirm.log (this session's walk-through — maxPop=16, 44 % churn); launch-camprobe.log ([pv-trace] 0171↔0173 churn detail); flap-churn.log (the maxPop=1 camera-turn-at-rest = the wrong reproduction that mis-shelved the spec).
  • Analyzers (throwaway): analyze_flap_vis.py (same-root vs root-swap split), analyze_churn_confirm.py (maxPop distribution + flap reproduction).
  • Probes: ACDREAM_PROBE_FLAP=1 ([flap] / [pv-trace]), ACDREAM_PROBE_PORTAL_CHURN=1 ([portal-churn] per-Build maxPop + reciprocal pre→post). Strip after the visual gate.
  • Retail anchors: ConstructView :433750, ClipPortals :433572, AddViewToPortals :433446, AddToCell :433050, FixCellList :433407, AdjustCellView :433741, OtherPortalClip :433524, copy_view :344784, GetClip :432344, polyClipFinish :702749.
  • Revived (banners redirected here): 2026-06-08-portal-flood-enqueue-once-port-design.md (REVISION = bounded propagation), 2026-06-08-portal-flood-bounded-propagation.md (Phase 1 done; Phase 2 = this).
  • Memory to correct after ship: project_indoor_flap_rootcause — the churn is confirmed at flap-time (maxPop=16); the "churn refuted (maxPop=1)" verdict was a non-flapping (camera-turn-at-rest) sample.