docs: Phase A8.F visual-gate failure handoff + issue #103

A8.F (retail portal-frame port) shipped Tasks 0-8 but failed its visual gate:
indoor branch renders broadly wrong at runtime (terrain over walls, transparent/
invisible walls). Default game unaffected (branch gated behind
ACDREAM_A8_INDOOR_BRANCH). Two compounding root causes documented (OutsideView
under-produces; Job-A/B else-branch floods ungated terrain) + apparatus + a
first-fix hypothesis + pickup prompt. Filed #103.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Erik 2026-05-29 14:43:24 +02:00
parent 7c3ee438bd
commit cf3d49cbd7
2 changed files with 220 additions and 0 deletions


@@ -44,6 +44,35 @@ Copy this block when adding a new issue:
---
## #103 — Phase A8.F portal-frame indoor rendering broken at runtime (visual-gate failure)
**Status:** OPEN
**Severity:** MEDIUM (opt-in branch only — default game unaffected)
**Filed:** 2026-05-29
**Component:** render (indoor visibility)
**Description:** With `ACDREAM_A8_INDOOR_BRANCH=1`, the A8.F retail portal-frame port
renders indoor/outside-in broadly wrong: cottage/cellar interiors covered in outdoor
terrain with transparent walls; invisible walls in other houses from inside and outside.
Default game (env var off) is unaffected — `cameraInsideBuilding = a8IndoorBranchEnabled
&& inside` (GameWindow.cs:7343). The old cellar flap remains in the default path.
**Root cause / status:** Two compounding causes (evidence in the handoff): (1) the
`OutsideView` builder under-produces — `OUTSIDEVIEW polys=0` most frames, and when
non-empty it doesn't recursively narrow (cellar shows ~full window). (2) The Task-6
Job-A/B decoupling draws terrain UNGATED when `OutsideView` is empty (`else` branch),
flooding the cell interior over the (correctly-rendered) walls. Cell walls DO render
(`[opaque]` tris=50-108). Projection math is correct; the builder integration is fragile.
**Files:** `src/AcDream.App/Rendering/PortalVisibilityBuilder.cs` (builder under-produces);
`src/AcDream.App/Rendering/GameWindow.cs` `RenderInsideOutAcdream` Step-4 `else` ungated-terrain (~11142).
**Research:** [docs/research/2026-05-29-a8f-visual-gate-failure-handoff.md](research/2026-05-29-a8f-visual-gate-failure-handoff.md) (root-cause analysis, apparatus, first-fix hypothesis, pickup prompt).
**Acceptance:** Holtburg cottage cellar renders with solid walls and no terrain flood;
terrain shows only through correctly-clipped portal openings; no invisible walls.
Related: #102 (builder dungeon-scaling fixpoint).
# Active issues
---


@@ -0,0 +1,191 @@
# Phase A8.F — visual-gate failure + pickup handoff (2026-05-29)
## TL;DR
The retail portal-frame visibility port (**Phase A8.F**) shipped as code (Tasks 0-8,
committed) but **FAILED its visual gate**. With `ACDREAM_A8_INDOOR_BRANCH=1`, indoor
and outside-in rendering is broadly broken: cottage/cellar interiors are "covered in
outdoor terrain / transparent walls," and walls are invisible in other houses from
both inside and outside.
**The default game is UNAFFECTED.** `cameraInsideBuilding = a8IndoorBranchEnabled &&
(inside a building)` (GameWindow.cs:7343), so `RenderInsideOutAcdream` only runs with
the opt-in env var. Without it, rendering is the pre-A8 path (walls render; only the
old cellar flap remains). **Do not panic — normal play is fine; the A8.F branch is the
broken opt-in.**
The work is committed (not reverted): the GL-free CPU layer is solid and unit-tested;
the **integration** (CPU-built clipped NDC mask → stencil-gate all outdoor terrain/
scenery) is what fails at runtime. This doc has the root-cause analysis, the apparatus,
and a pickup prompt.
## What was built (the A8.F port)
Spec: [`docs/superpowers/specs/2026-05-29-phase-a8f-portal-frame-visibility-design.md`](../superpowers/specs/2026-05-29-phase-a8f-portal-frame-visibility-design.md)
Plan: [`docs/superpowers/plans/2026-05-29-phase-a8f-portal-frame-visibility.md`](../superpowers/plans/2026-05-29-phase-a8f-portal-frame-visibility.md)
Idea: port retail's `PView` recursive portal-clip (`ConstructView`/`ClipPortals`/
`GetClip`) — WB has NO such recursion, so the flat WB stencil can't fix the cellar flap;
retail clips each portal to its portal chain. We built a GL-free CPU builder that walks
the portal graph and produces `OutsideView` (a screen-space NDC region = exit portals
recursively clipped), then stencil-gate outdoor terrain/scenery to it.
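The recursive narrowing described above can be illustrated with a minimal Python sketch. It is not the C# `PortalVisibilityBuilder`: the `Portal` type, the cell names, and the use of axis-aligned NDC rectangles instead of convex polygons are all assumptions made to keep the example short. What it shows is the core idea that each portal is clipped to the region inherited from its portal chain, and only exit portals contribute to the outside view.

```python
from collections import deque
from typing import Dict, List, NamedTuple, Optional, Tuple

# Hypothetical region type: (min_x, min_y, max_x, max_y) in NDC.
Rect = Tuple[float, float, float, float]

class Portal(NamedTuple):
    screen_rect: Rect        # assumed: portal already projected to NDC bounds
    target: Optional[str]    # neighbouring cell id, or None for an exit portal

def intersect(a: Rect, b: Rect) -> Optional[Rect]:
    """Rectangle intersection; None when the overlap has no area."""
    x0, y0 = max(a[0], b[0]), max(a[1], b[1])
    x1, y1 = min(a[2], b[2]), min(a[3], b[3])
    return (x0, y0, x1, y1) if x0 < x1 and y0 < y1 else None

def build_outside_view(cells: Dict[str, List[Portal]], camera_cell: str) -> List[Rect]:
    """Walk the portal graph breadth-first, narrowing the region at every hop."""
    full_screen: Rect = (-1.0, -1.0, 1.0, 1.0)
    outside_view: List[Rect] = []
    queue = deque([(camera_cell, full_screen, frozenset([camera_cell]))])
    while queue:
        cell, region, path = queue.popleft()
        for portal in cells.get(cell, []):
            clipped = intersect(portal.screen_rect, region)
            if clipped is None:
                continue  # not visible through the chain walked so far
            if portal.target is None:
                outside_view.append(clipped)  # exit portal: outdoors shows here
            elif portal.target not in path:
                queue.append((portal.target, clipped, path | {portal.target}))
    return outside_view

# Hypothetical two-cell building: a narrow stairwell portal and a wide window.
EXAMPLE_CELLS = {
    "cellar": [Portal((-0.2, 0.0, 0.2, 0.8), "ground")],
    "ground": [Portal((-0.9, -0.5, 0.9, 0.9), None),
               Portal((-0.2, 0.0, 0.2, 0.8), "cellar")],
}
```

With this data, `build_outside_view(EXAMPLE_CELLS, "cellar")` yields the window narrowed to the stairwell rectangle, while starting from `"ground"` yields the whole window. That narrowed result is the behaviour a flat, non-recursive mask cannot produce.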
Commits (on `claude/strange-albattani-3fc83c`, after baseline `5dc4140`):
- `bb903bc` Task 0 — strip ACDREAM_A8_DIAG_* flags.
- `406307e` Task 1 — ViewPolygon + CellView (GL-free data model). Unit-tested.
- `7f46c27` Task 2 — ScreenPolygonClip (Sutherland-Hodgman convex intersection). Unit-tested.
- `a28a176` + `9ec8330` Task 3 — PortalProjection (NDC + near-plane clip). Unit-tested.
(A near-plane bug was caught + fixed during impl: `w>=WEps` → `w+z>=0`.)
- `0ed462c` + `270c21f` Task 4 — PortalVisibilityBuilder (the BFS). Unit-tested.
(Known dungeon-scaling fast-follow filed as **issue #102**.)
- `d12892b` + `08f6a0c` + `d581f4c` Task 5 — IndoorCellStencilPipeline.MarkAndPunchNdc.
- `9e2eb90` Task 6 — RenderInsideOut rewrite: builder-driven mask + **Job-A/B decouple**.
- `1c02a01` + `5a012c0` Task 7 — wire-in #2 per-cell translucent clip on stencil bit 2.
(A DepthFunc-leak bug was caught + fixed by code review.)
- `e0051e0` + `452ee5b` Task 8 — wire-in #3 cross-building (ungated Step 5, clipped bit-1).
- `7c3ee43` — triage apparatus (this debugging session; see below).
All `dotnet build` + `dotnet test` green throughout (App baseline 108).
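For readers unfamiliar with the clip named in Task 2, here is a minimal Python sketch of Sutherland-Hodgman clipping against a convex polygon. It is an illustration only, not the C# `ScreenPolygonClip`: the function names, the counter-clockwise winding requirement, and the tuple-based point type are assumptions of this sketch.

```python
from typing import List, Tuple

Point = Tuple[float, float]

def clip_convex(subject: List[Point], clip: List[Point]) -> List[Point]:
    """Clip `subject` against convex `clip` (assumed counter-clockwise)."""
    def inside(p: Point, a: Point, b: Point) -> bool:
        # On or to the left of the directed edge a->b.
        return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0]) >= 0.0

    def intersect(p: Point, q: Point, a: Point, b: Point) -> Point:
        # Segment p->q against the infinite line through a and b.
        dx, dy = q[0] - p[0], q[1] - p[1]
        ex, ey = b[0] - a[0], b[1] - a[1]
        t = ((a[0] - p[0]) * ey - (a[1] - p[1]) * ex) / (dx * ey - dy * ex)
        return (p[0] + t * dx, p[1] + t * dy)

    output = list(subject)
    for i in range(len(clip)):
        if not output:
            break  # already clipped away entirely
        a, b = clip[i], clip[(i + 1) % len(clip)]
        current, output = output, []
        prev = current[-1]
        for cur in current:
            if inside(cur, a, b):
                if not inside(prev, a, b):
                    output.append(intersect(prev, cur, a, b))
                output.append(cur)
            elif inside(prev, a, b):
                output.append(intersect(prev, cur, a, b))
            prev = cur
    return output

def area(poly: List[Point]) -> float:
    """Shoelace area; 0.0 for an empty or degenerate polygon."""
    total = 0.0
    for i in range(len(poly)):
        x0, y0 = poly[i]
        x1, y1 = poly[(i + 1) % len(poly)]
        total += x0 * y1 - x1 * y0
    return abs(total) / 2.0
```

Clipping a quad that straddles the right edge of the full-screen NDC square `[(-1,-1),(1,-1),(1,1),(-1,1)]` keeps only the on-screen part, and a quad entirely off-screen comes back as an empty list, which is the "empty region" case that matters for the failure analysis in this handoff.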
## The visual-gate failure — symptoms
With `ACDREAM_A8_INDOOR_BRANCH=1` at Holtburg cottages (camera = `+Acdream`):
1. Outside→in (looking into a cottage from outside): cellar entrance looked correct.
2. Inside the cellar: **covered in outdoor terrain; walls transparent (see-through).** Passable (render-only).
3. Looking out from inside (toward a window): looked roughly normal.
4. Passing inside→out: **buildings + ground disappear; only server-spawned things
(doors/NPCs/particles) remain.**
5. **Invisible walls in OTHER houses, both from inside and outside.**
## Root-cause analysis (evidence-based; see apparatus below)
**Finding 1 — the cell walls DO render.** `[opaque]` probe (opaque cell-render stats,
captured BEFORE the per-cell transparent loop overwrites them): `cells=7 tris=50/60`,
`cells=25 tris=108` in occupied cottage cells. `tris=0` only in transient frustum-culled
frames. So "transparent walls" is **NOT** walls failing to render — it's terrain drawn
*over* them. (NOTE: the older `[envcells]` probe reads stats AFTER the transparent loop,
so its `cells=1 tris=0` is a misleading artifact — ignore it.)
**Finding 2 — `OutsideView` is frequently EMPTY, and when non-empty it doesn't narrow.**
`[pv-dump] OUTSIDEVIEW polys=N`: `polys=0` in the majority of frames; `polys=1` sometimes.
When non-empty, the clipped region ≈ the full source window (e.g. from the cellar, the
`0xA9B40170` window passes through ~unclipped, not narrowed to the stairwell sliver). So
the recursive-clip — the entire point of A8.F — is **not constraining at runtime**.
**Finding 3 — projection/clip MATH is correct; the builder under-produces.** When a
`[pv-dump] EXIT` line fires, the local quad → NDC → clipped chain is sane (window quad
`local=[(5.55,-8.61,0)(7.45,-8.61,0)(7.45,-8.35,2.5)(5.55,-8.35,2.5)]` → reasonable NDC →
clipped region). The `viewProj` is a valid System.Numerics row-vector `view*proj`
(`M33≈M34` because far≫near makes `proj.M33≈-1`; `M44` varies with camera, expected).
`ProjectToNdc` matches the GPU convention (verified algebraically: `Vector4.Transform(v,M)`
== GPU `M*v` for transpose=false upload). **Projection is not the bug.** The bug is the
builder yielding empty/too-wide regions for most real camera positions: the exit-portal
clip comes back empty, and the cause needs a deeper trace (portal-side cull? FullScreen-clip
producing empty? BFS not reaching exit portals from most positions?).
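The convention argument in Finding 3 can be checked with a small Python sketch. It rests on two general facts rather than on acdream's code: a row-major matrix uploaded with transpose=false is read by OpenGL as its transpose, and OpenGL's default clip volume is `-w <= z <= w`, so the near-plane test in clip space is `z + w >= 0`. The helper names are hypothetical and the matrix is an arbitrary test value, not a real `viewProj`.

```python
from typing import List, Sequence, Tuple

Matrix = Sequence[Sequence[float]]

def transform_row(v: Sequence[float], m: Matrix) -> List[float]:
    """Row-vector convention: result[j] = sum_i v[i] * m[i][j]."""
    return [sum(v[i] * m[i][j] for i in range(4)) for j in range(4)]

def transform_col(m: Matrix, v: Sequence[float]) -> List[float]:
    """Column-vector convention: result[i] = sum_j m[i][j] * v[j]."""
    return [sum(m[i][j] * v[j] for j in range(4)) for i in range(4)]

def transpose(m: Matrix) -> List[List[float]]:
    return [list(col) for col in zip(*m)]

def in_front_of_near(clip: Sequence[float]) -> bool:
    """Near-plane test for OpenGL's default clip volume (-w <= z <= w)."""
    return clip[2] + clip[3] >= 0.0

def to_ndc(clip: Sequence[float]) -> Tuple[float, float, float]:
    """Perspective divide; only meaningful after near-plane clipping (w > 0)."""
    x, y, z, w = clip
    return (x / w, y / w, z / w)

# Arbitrary matrix and point, chosen only to exercise the identity.
M = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
V = [1, 2, 3, 1]
```

Here `transform_row(V, M)` and `transform_col(transpose(M), V)` give the same four components, which is the sense in which a CPU row-vector transform agrees with the shader's column-vector multiply on the transposed upload.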
**Finding 4 — the Job-A/B decoupling floods terrain when `OutsideView` is empty (the
proximate cause of "transparent walls").** Task 6 made Step-4 terrain/scenery draw
UNCONDITIONALLY, with only the stencil *state* gated. When `OutsideView` is empty
(`didInsideStencil=false`), the `else` branch **disables the stencil and draws terrain
ungated** (GameWindow.cs ~11142). Combined with Finding 2 (empty most frames), terrain
floods over the (rendered) cell interior → "covered in terrain / transparent walls."
This is exactly the Opus Task-6 code-review **Minor #2** risk, realized at scale.
**Why WB doesn't hit this but we do:** in WB, `didInsideStencil = "inside a building"`
(always true indoors, because it marks the whole building's exit-portal set, which is
non-empty). WB never has the "inside + empty mask" case. Our builder produces empty masks
frequently, so the `else` branch (which WB effectively never exercises with an empty mask)
floods. The CPU-NDC-recursive-clip mask is far more fragile at runtime than WB's flat
building mask.
## The two compounding root causes (summary)
1. **`OutsideView` builder under-produces at runtime** — empty most frames; never narrows
recursively. (Builder/clip integration with real geometry; not the projection math.)
2. **Empty-`OutsideView` → ungated terrain flood** — the Job-A/B decoupling's `else` branch
draws terrain everywhere when the mask is empty, painting over the cell interior.
## Concrete first-fix hypothesis (try this first next session)
The `else` branch is wrong: **an empty `OutsideView` means "no outdoors visible from
here," not "all outdoors visible."** When inside a building with an empty mask, draw NO
outdoor terrain/scenery (or fall back to the pre-A8 "depth-clear-when-inside" behavior),
rather than ungated terrain. That alone should stop the flooding (walls become solid;
you temporarily lose terrain-through-portal until the builder is fixed, but the interior
renders correctly). This decouples the two bugs so each can be fixed independently.
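The intended decision can be written down as a tiny Python sketch. The enum, the function name, and the parameters are hypothetical and do not mirror the C# in `RenderInsideOutAcdream`; the sketch only encodes the rule proposed above, that an empty mask while inside means "draw no outdoors" rather than "draw all outdoors".

```python
from enum import Enum

class OutdoorPass(Enum):
    SKIP = "skip"              # draw no outdoor terrain/scenery this frame
    STENCIL_GATED = "stencil"  # draw only where the OutsideView mask was marked
    UNGATED = "ungated"        # draw everywhere (camera is outdoors)

def choose_outdoor_pass(camera_inside: bool, outside_view_polys: int) -> OutdoorPass:
    """Pick how Step-4 terrain/scenery should be drawn (hypothetical helper)."""
    if not camera_inside:
        return OutdoorPass.UNGATED
    if outside_view_polys == 0:
        # Proposed fix: empty mask indoors means nothing outdoors is visible.
        return OutdoorPass.SKIP
    return OutdoorPass.STENCIL_GATED
```

The behaviour described in Finding 4 corresponds to returning `UNGATED` for the inside-with-zero-polys case; the sketch returns `SKIP` there instead, so terrain-through-portal is lost until the builder is fixed but the interior is no longer painted over.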
Then separately debug Finding 2 (why the builder yields empty/too-wide regions). The
`[pv-dump]` apparatus already traces local→NDC→clipped, and its EXIT-CULLED/EXIT-PROJ/
EXIT-CLIP lines log the side-test result and the per-stage vert counts for ALL exit
portals; read them across many frames to see which gate kills the portals when `polys=0`.
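Reading those lines across many frames is easier with a tally. The Python sketch below is an assumption-heavy helper, not part of the repo: it assumes each dump line contains `[pv-dump]` followed by one of the tags named in this doc, and that the EXIT-* lines of a Build call are printed before that call's `OUTSIDEVIEW polys=N` line. Check both assumptions against a real log before trusting the counts.

```python
import re
from collections import Counter
from typing import Iterable, Tuple

# Longer tags are listed first so "EXIT" does not shadow "EXIT-CULLED" etc.
TAG = re.compile(r"\[pv-dump\]\s+(EXIT-CULLED|EXIT-PROJ|EXIT-CLIP|EXIT|OUTSIDEVIEW)\b")
POLYS = re.compile(r"OUTSIDEVIEW\s+polys=(\d+)")

def tally(lines: Iterable[str]) -> Tuple[Counter, Counter]:
    """Count EXIT-* tags, split by whether the Build call ended with polys=0.

    Returns (counts_for_empty_builds, counts_for_non_empty_builds).
    """
    empty, non_empty = Counter(), Counter()
    pending = Counter()  # tags seen since the previous OUTSIDEVIEW line
    for line in lines:
        match = TAG.search(line)
        if match is None:
            continue  # not a pv-dump line we recognise
        tag = match.group(1)
        if tag != "OUTSIDEVIEW":
            pending[tag] += 1
            continue
        polys = POLYS.search(line)
        if polys is not None:
            bucket = empty if int(polys.group(1)) == 0 else non_empty
            bucket.update(pending)
        pending.clear()
    return empty, non_empty
```

Pass any iterable of lines, for example an open file handle for the captured log. The file's encoding depends on the PowerShell version that wrote it (Windows PowerShell's `Tee-Object` typically writes UTF-16), so choose the `encoding=` argument to `open` accordingly. If one tag dominates the first counter, that stage is the likeliest place where exit portals are being dropped.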
## The architectural question (escalate to the human before a big rewrite)
Is "CPU-build a recursively-clipped NDC region + stencil-gate ALL outdoor terrain/scenery
to it" viable in acdream's pipeline, or is it too fragile (Finding 2)? Options:
- (a) Fix the builder + the else-branch (incremental; the first-fix hypothesis above).
- (b) Reconsider enforcement — e.g., port retail's per-cell screen-space scissor more
literally, or keep WB's flat building mask (accept the cellar flap) and special-case
only the cellar. The user explicitly chose the faithful retail port (option A) at
brainstorm; revisit only if (a) proves intractable.
## Safety / current state
- **Default game safe**: indoor branch gated behind `ACDREAM_A8_INDOOR_BRANCH=1`
(`cameraInsideBuilding = a8IndoorBranchEnabled && inside`, GameWindow.cs:7343).
- Work is **committed, not reverted** (CPU layer is good; integration needs fixing).
- The old cellar flap (the original M1.5 blocker) is **still present** in the default
(pre-A8) path — A8.F did not fix it.
- Tree clean as of `7c3ee43`.
## Apparatus (all env-gated; require `ACDREAM_A8_INDOOR_BRANCH=1` to reach the code)
- `ACDREAM_A8_DUMP_PV=1` → `[pv-dump]` lines: per camera cell, the exit-portal
local→NDC→clipped geometry (EXIT-CULLED / EXIT-PROJ / EXIT-CLIP / EXIT) + `OUTSIDEVIEW
polys=N`. First 2 Build calls per distinct camera cell. (PortalVisibilityBuilder.cs.)
- `ACDREAM_PROBE_ENVCELL=1` → `[opaque]` line: opaque cell-render stats (cells/tris)
BEFORE the transparent loop overwrites `_envCellRenderer.Stats`. One-shot per camera
cell. (GameWindow.cs, after the Step-3 opaque render.)
- `ACDREAM_PROBE_VIS=1` → `[buildings]`/`[draworder]`/`[stencil]`/`[envcells]` (existing).
NOTE `[envcells]` is post-transparent-loop (misleading); `[stencil] verts` reflects the
OutsideView triangle count.
- `tools/A8CellAudit` — offline cell/portal dumper (`portals <cellId>` / `buildings <lb> <radius>`).
Launch (PowerShell), then walk `+Acdream` into a Holtburg cottage ground floor + cellar:
```powershell
$env:ACDREAM_DAT_DIR="$env:USERPROFILE\Documents\Asheron's Call"; $env:ACDREAM_LIVE="1"
$env:ACDREAM_TEST_HOST="127.0.0.1"; $env:ACDREAM_TEST_PORT="9000"
$env:ACDREAM_TEST_USER="testaccount"; $env:ACDREAM_TEST_PASS="testpassword"
$env:ACDREAM_A8_INDOOR_BRANCH="1"; $env:ACDREAM_A8_DUMP_PV="1"; $env:ACDREAM_PROBE_ENVCELL="1"
dotnet run --project src\AcDream.App\AcDream.App.csproj --no-build -c Debug 2>&1 | Tee-Object -FilePath "a8f.log"
```
Cottage cells: `0xA9B40170` (ground floor, has window exit portal), `0xA9B40171` (cellar),
`0xA9B40174/175` (cellar rooms), building `0xA`. Inn vestibule: `0xA9B40164/162`.
## Code anchors
- `src/AcDream.App/Rendering/PortalVisibilityBuilder.cs` — the builder (Finding 2 lives here).
- `src/AcDream.App/Rendering/GameWindow.cs`: `RenderInsideOutAcdream` (~11012);
Step-4 `else` ungated-terrain branch (~11142, Finding 4); call-site gate (~7343, 7636).
- `src/AcDream.App/Rendering/IndoorCellStencilPipeline.cs` — MarkAndPunchNdc + bit-2 helpers.
- `src/AcDream.App/Rendering/PortalProjection.cs` / `ScreenPolygonClip.cs` / `PortalView.cs` — CPU layer (correct).
- `references/WorldBuilder/.../VisibilityManager.cs:73-239` — the WB reference (flat, no recursion).
- Retail oracle: `docs/research/named-retail/acclient_2013_pseudo_c.txt` (`PView::ConstructView` 433750, `ClipPortals` 433572, `GetClip` 432344).
## Pickup prompt
> Read `docs/research/2026-05-29-a8f-visual-gate-failure-handoff.md` and pick up the A8.F
> debugging. The default game is SAFE (indoor branch gated behind ACDREAM_A8_INDOOR_BRANCH).
> Use `superpowers:systematic-debugging`. Two compounding root causes are documented:
> (1) the OutsideView builder under-produces (empty most frames, never narrows); (2) the
> Job-A/B decoupling floods ungated terrain when OutsideView is empty. **Start with the
> first-fix hypothesis**: make the empty-OutsideView case draw NO outdoor terrain/scenery
> when inside (an empty mask = "no outdoors visible," not "all outdoors"), to stop the
> terrain-over-walls flood and isolate the two bugs. Verify via the apparatus
> (ACDREAM_A8_DUMP_PV / ACDREAM_PROBE_ENVCELL) — read the EXIT-CULLED/PROJ/CLIP lines across
> frames to learn which gate kills the exit portals when polys=0. Then fix the builder.
> If the builder proves intractable, escalate the architectural question (handoff §"The
> architectural question") to the user before any big rewrite — do NOT thrash. No
> speculative fixes without root cause (the Iron Law). The visual gate (user looking at a
> Holtburg cottage cellar) is the acceptance test.