docs(render): session-2 handoff — stencil attempt reverted, evidence-first pickup prompt

Net code change this session = 0 (stencil-occlusion T1-T4 implemented, regressed,
reverted to baseline 9bff2b0). Documents the honest failure + lessons (patchwork via
flag-based gate routing; the interior-writes-mask rule breaks outdoors; coded before
screenshotting), the still-useful evidence (cottage = IsBuildingShell GfxObjs not cell
shells; two redundant traversals; retail DrawCells outside_view gate; working window
screenshot tooling), the open questions to answer with pixels first, and a refined
evidence-first pickup prompt.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Erik 2026-06-01 13:51:22 +02:00
parent 9bff2b0462
commit 1d7d8b1de4


@@ -0,0 +1,148 @@
# Render Reset — Session 2 handoff (2026-06-01)
> **Net code change this session: ZERO.** A stencil-occlusion approach was designed,
> implemented (T1-T4), regressed, and **fully reverted** to the session-start baseline
> `9bff2b0`. The indoor render is **still broken exactly as it was at session start**.
> This handoff exists so the NEXT session does not repeat the mistakes that wasted this one.
> Read this together with the original analysis: `docs/research/2026-05-31-render-architecture-reset-handoff.md`.
## 1. Honest state
- **Branch** `claude/thirsty-goldberg-51bb9b`, HEAD = `9bff2b0` (baseline). UNPUSHED.
- **Two git stashes preserved** (`stash@{0}` #98/#101/A8-culling WIP, `stash@{1}` issue98 backup). Do NOT drop.
- The reverted stencil work is recoverable from the **reflog** (commits `1f9cb1b` spec, `7adb3da` plan, `9e988e7` T1-T3, `003dda4` T4). Do **not** resurrect it without a reason that comes from evidence.
- The indoor world is still broken: walls/floor don't seal, outdoor terrain visible from inside the cellar ("world from below"), enclosure reads grey/transparent. The doorway **flap is still fixed** (`0ee328a`, in baseline). Camera-collision commits (`e099b4c`, `3066460`) are in baseline — **parked**, out of scope.
## 2. What was tried, and WHY it failed (the load-bearing part)
**Approach tried:** a "unified stencil-occlusion gate" — interior geometry writes `stencil=1`,
exterior (terrain + outdoor stabs) draws only where `stencil==0`, dynamics ungated. The idea
was: no `cameraInsideBuilding` path-toggle (so no flap), per-pixel mask (so no convex-clip
leak), the interior floor masks the terrain below it (so no "world from below").
**Why it failed — three distinct failures, in order of importance:**
1. **It was patchwork wearing a new hat.** It was bolted onto the **existing three separate
renderers** (terrain / cell-shells / entities) and sorted entities into different gates by
**flags** (`IsBuildingShell` / `ParentCellId` / `ServerGuid`). That per-flag special-casing
*is* the patchwork the reset was supposed to delete. **Rule for next time: if the design
routes geometry into different gates by entity flags, it is not "one gate" — stop.**
2. **The rule is wrong outdoors.** "Interior writes a screen-space mask, exterior tests it"
breaks the outdoor case: a building sits *on* terrain, but making it write the mask makes
the terrain *not draw* in the building's silhouette → **buildings punch holes in the ground,
inside AND outside** (the regression the user saw). Any unified gate must be correct
**indoors AND outdoors AND at the threshold** — the outdoor case is not an afterthought.
3. **It was driven by theory, not pixels.** ~5 relaunches, zero screenshots, until the very end.
Hypotheses were reasoned from code structure + the user's verbal reports, then code was
changed and relaunched. That is the exact anti-pattern the memory `render-one-gate` warns
about ("log-archaeology misled for days; one screenshot cracked it"). **A screenshot at the
cottage on attempt #1 would have shown the geometry reality immediately.**
Meta-cause: the session treated "pick one and execute" as license to start *typing* rather than
license to commit to an *architecture*, and it pivoted from the handoff's prescribed approach
(consolidate the PView clip machinery) to a brand-new mechanism (stencil) under pressure — i.e.
it thrashed mechanisms mid-stream instead of grounding a choice in evidence.
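
Failure 2 can be illustrated with a toy one-scanline model. This is a hedged Python sketch, not acdream code: the function names, the depth values, and the assumption that the building writes the stencil over its whole silhouette regardless of depth ordering against terrain are all illustrative. It compares a plain nearest-surface-wins result with the "interior writes mask, exterior tests it" rule while the camera is outdoors.

```python
from typing import List, Optional


def depth_tested(terrain_z: List[float], building_z: List[Optional[float]]) -> str:
    """Reference result: per pixel, the nearer surface wins (smaller z = nearer)."""
    out = []
    for tz, bz in zip(terrain_z, building_z):
        out.append("B" if bz is not None and bz < tz else "T")
    return "".join(out)


def stencil_gated(terrain_z: List[float], building_z: List[Optional[float]]) -> str:
    """Toy version of the reverted rule (assumption: the building writes
    stencil=1 over its whole silhouette; terrain draws only where stencil == 0)."""
    stencil = [1 if bz is not None else 0 for bz in building_z]
    return "".join("T" if s == 0 else "B" for s in stencil)


# Camera outdoors. Columns 2-5 are the building silhouette; in columns 4-5 the
# building geometry (for example a cellar) lies BEHIND the terrain surface.
terrain_z = [5.0, 5.0, 5.0, 5.0, 5.0, 5.0, 5.0, 5.0]
building_z = [None, None, 3.0, 3.0, 8.0, 8.0, None, None]

reference = depth_tested(terrain_z, building_z)
gated = stencil_gated(terrain_z, building_z)

# Pixels where terrain should have been visible but the mask suppressed it.
holes = [i for i, (r, g) in enumerate(zip(reference, gated)) if r == "T" and g != "T"]

print(reference)  # TTBBTTTT
print(gated)      # TTBBBBTT
print(holes)      # [4, 5]
```

In this model the suppressed columns are exactly the ones where the ground should cover the building, which is one way a screen-space mask can "punch holes" outdoors. Whether the real regression had this cause is something only a screenshot plus probes can confirm.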
## 3. Evidence gathered that IS still useful (don't re-derive)
- **Two redundant portal traversals run per frame** (both player-rooted): `CellVisibility.GetVisibleCells`
produces a binary `VisibleCellIds` set (consumed by `WbDrawDispatcher.EntityPassesVisibleCellGate`);
`PortalVisibilityBuilder.Build` produces screen-space clip regions + `OutsideView` (consumed by
`ClipFrameAssembler` → the clip slots). They can disagree. Interior statics are gated by the
intersection; shells by only the second; outdoor stabs bypass the first (`ParentCellId==null` →
`EntityPassesVisibleCellGate` returns `true`). This is the "3-gate patchwork" in detail.
- **CONFIRMED — the cottage structure is `IsBuildingShell` GfxObj entities, NOT (only) EnvCell
shells.** `src/AcDream.Core/World/LandblockLoader.cs:75-91`: each `info.Buildings` entry becomes
a `WorldEntity { IsBuildingShell = true, ParentCellId = null }`, rendered by `WbDrawDispatcher`.
So the cottage walls/floor/roof are landblock-level building entities that **also exist when you
are outdoors** (where they must not mask terrain). This is the single most important geometry
fact for the next design. ⚠️ **Still unconfirmed by pixels:** whether the cottage *also* has
EnvCell interior shells, and which renderer owns the walls vs. the floor vs. the stairs. The
user's report ("correct on the stairs, broken in the rooms") strongly implies a *mix* — confirm it.
- **Retail `PView::DrawCells` (`acclient_2013_pseudo_c.txt:432709`)** installs the active "view"
(clip region) before each draw; the landscape is drawn **once**, clipped to `outside_view`, and
**only when `outside_view.view_count > 0`** (empty → no landscape — the retail bleed fix). The
outside-view is a **list of convex view polygons** clipped to the union, NOT a single convex set.
- **WB's `RenderInsideOut` is a flat per-pixel STENCIL pass selected by an `isInside` toggle**;
acdream ported it byte-for-byte once (`IndoorCellStencilPipeline`, 792 LOC + `portal_stencil`
shaders + tests) and **deleted it in U.1 (`3fc77be`)** precisely because the `cameraInsideBuilding`
**toggle** caused the flap. The stencil *mechanism* is sound; the *two-pipe toggle structure* is
what was bad. (Recoverable from `3fc77be^` if ever wanted.)
- **Screenshot tooling WORKS** and must be used: PowerShell `Add-Type` user32 `GetWindowRect` +
`System.Drawing` `Graphics.CopyFromScreen` on the `AcDream.App` `MainWindowHandle` → PNG → `Read`
the PNG. Proven this session (captured `shot1.png` of the live client at 1296×759). No excuse to
theorize blind.
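
The first bullet above (two traversals that can disagree) can be made concrete with a small set-based model. This is a hedged Python sketch, not the acdream C# code: the cell ids, the `drawn` helper, and the entity-kind strings are hypothetical, and whether the second gate also applies to outdoor stabs is deliberately not modelled because the notes above do not say.

```python
from typing import Optional, Set

# Hypothetical results of the two per-frame traversals for one frame.
VISIBLE: Set[int] = {0x0101, 0x0102}  # traversal 1: binary visible-cell set
CLIP: Set[int] = {0x0102, 0x0103}     # traversal 2: cells that got a clip region


def passes_visible_cell_gate(parent_cell_id: Optional[int], visible: Set[int]) -> bool:
    """Toy stand-in for the first gate: entities with no parent cell bypass it."""
    if parent_cell_id is None:
        return True
    return parent_cell_id in visible


def drawn(kind: str, cell_id: Optional[int]) -> bool:
    """Which gates each kind of geometry obeys, as described in the bullet above."""
    if kind == "interior_static":  # intersection of both gates
        return passes_visible_cell_gate(cell_id, VISIBLE) and cell_id in CLIP
    if kind == "shell":            # second gate only
        return cell_id in CLIP
    if kind == "outdoor_stab":     # bypasses the first gate (second not modelled)
        return passes_visible_cell_gate(None, VISIBLE)
    raise ValueError(f"unknown kind: {kind}")


# Cells on which the two traversals disagree.
disputed = sorted(VISIBLE ^ CLIP)

print(disputed)                          # [257, 259]
print(drawn("shell", 0x0103))            # True
print(drawn("interior_static", 0x0103))  # False
```

With these made-up sets, cell `0x0103` draws its shell but hides the statics inside it. That is the kind of inconsistency a single visibility decision would rule out by construction.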
## 4. Open questions to answer with EVIDENCE (screenshots) BEFORE any design
1. In the cottage **room**, **stairs**, and **cellar**: what geometry actually draws, and what is
the renderer for the walls / floor / ceiling / stairs (EnvCell shell vs `IsBuildingShell` GfxObj)?
(Use `ACDREAM_PROBE_SHELL` `[shell]` + an entity-inventory probe, paired with a screenshot.)
2. Is `cameraInsideCell` + the visible-cell set **stable** standing still in the rooms/cellar, or
does it flicker? (Add a one-line per-frame probe of `visibility.CameraCell` + `VisibleCellIds.Count`.)
3. **Where exactly** does terrain leak from in the cellar — below the floor (floor not occluding),
through walls (walls not rendering / back-face culled), or terrain drawn ungated (root resolved
outdoor)? A screenshot + the resolve probe answers this; do not guess.
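
For question 2, the per-frame probe only helps if its output is reduced to a yes/no answer. A minimal Python sketch of that reduction follows. The sample shape `(camera cell id, visible-cell count)` and every name here are assumptions for illustration; the real probe format is whatever the per-frame log line ends up containing.

```python
from typing import List, Tuple

# Hypothetical per-frame sample: (camera cell id, visible-cell count).
Sample = Tuple[int, int]


def count_flips(samples: List[Sample]) -> int:
    """Number of frame-to-frame changes in the sampled visibility state."""
    return sum(1 for prev, cur in zip(samples, samples[1:]) if prev != cur)


def is_stable(samples: List[Sample], max_flips: int = 0) -> bool:
    """True if the state changed at most `max_flips` times while standing still."""
    return count_flips(samples) <= max_flips


# Made-up captures taken while the player stands still.
steady = [(0x0102, 4)] * 5
flicker = [(0x0102, 4), (0, 1), (0x0102, 4), (0, 1), (0x0102, 4)]

print(count_flips(steady), is_stable(steady))    # 0 True
print(count_flips(flicker), is_stable(flicker))  # 4 False
```

A non-zero flip count while standing still would point at the cell resolution rather than at the geometry. Pair the numbers with a screenshot taken during the same capture.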
## 5. Pickup prompt (copy-paste for the next session)
```
RENDER PIPELINE — acdream indoor+outdoor. Continue on branch claude/thirsty-goldberg-51bb9b
(do NOT branch/worktree). Preserve the 2 git stashes — do not drop. Branch UNPUSHED — ask
before pushing. Baseline 9bff2b0. A stencil attempt was made and REVERTED last session (reflog
has it; do NOT resurrect without an evidence-based reason).
READ FIRST, in full: docs/research/2026-06-01-render-reset-session2-handoff.md, then the original
analysis docs/research/2026-05-31-render-architecture-reset-handoff.md, then the "Render Pipeline
(SSOT)" section of docs/architecture/acdream-architecture.md, then memory feedbacks
render-one-gate + render-self-contained-gl-state.
GOAL (what we intend): a pipeline where moving outside<->inside is SEAMLESS and looks like
reality — interiors seal (walls/floor/ceiling), you see outside ONLY through real openings
(doors/windows), and OUTDOORS the world renders normally with buildings sitting ON the ground
(never punching holes in terrain). Acceptance test: Holtburg cottage — walk in, stand in the
room, go to the cellar, come back out: no flap, no terrain through floors/walls, no grey void,
no holes in the ground. M1.5 "indoor world feels right."
THIS IS AN ARCHITECTURE TASK, NOT A PATCH JOB. The pipeline is 3 separate renderers (terrain /
cell-shells / entities) each deciding visibility inconsistently. Goal: ONE visibility decision
all geometry obeys the same way. If a fix routes geometry into different gates by flags
(IsBuildingShell / ParentCellId / ServerGuid), that's the patchwork reappearing — stop.
HARD GATE — EVIDENCE BEFORE CODE (this is where last session failed):
- For ANY render/visual question, LOOK AT PIXELS first. Window capture works (PowerShell
System.Drawing CopyFromScreen on the AcDream.App MainWindowHandle -> PNG -> Read it). No
theorizing from code alone.
- BEFORE designing: capture+read screenshots of every broken state (cottage room, stairs,
cellar, outside) and answer §4's three questions with EVIDENCE, not theory.
- AFTER every change: screenshot the cellar and compare yourself. Do NOT "relaunch and ask the
user" as the loop — the user's eyes are the FINAL gate, not the debugger.
PROCESS (in order): (1) INVESTIGATE report-only (use the /investigate skill) — screenshots +
[shell]/[flap]/cell-resolution probes -> a written evidence model of WHAT is broken and WHY;
get the user's nod before any code. (2) DESIGN (superpowers:brainstorming, grounded in #1) —
one visibility decision, one enforcement, correct indoors AND outdoors AND at the threshold
(the seamless transition IS the hard problem). Pick ONE mechanism and COMMIT; candidates are
the retail-PView clip machinery we already own (PortalVisibilityBuilder + ClipFrame) vs a
stencil mask vs another approach — let evidence choose, then don't thrash. Honestly assess
whether the existing machinery is salvageable into the one gate or a deeper restructure is
warranted; don't ASSUME "consolidation." (3) IMPLEMENT the one gate, delete the ad-hoc gates,
small sequential commits, build green each. (4) VERIFY at the cottage cellar with screenshots,
then the user's final sign-off.
DO NOT REPEAT (evidence-disproven): camera/eye as root, cull mode, shell geometry/texture
missing, H1 PVS grounding, H2 PortalSide, zoom-confound, AND the session-2 failures: a
screen-space "interior writes mask" rule that breaks outdoors (buildings hole the ground);
flag-based per-entity gate routing (= patchwork); jumping to code before understanding the
failure from pixels. Only stop for visual verification once you have a build + screenshots.
```
## 6. Confirmed facts ledger (one-liners)
- Cottage walls/floor/roof = `IsBuildingShell` GfxObj entities, `ParentCellId==null` (`LandblockLoader.cs:75-91`).
- Two redundant traversals: `CellVisibility.GetVisibleCells` (VisibleCellIds) + `PortalVisibilityBuilder.Build` (screen regions).
- Retail draws the landscape once, clipped to `outside_view`, only if `view_count>0` (`DrawCells :432709`).
- Stencil mechanism is fine; the `cameraInsideBuilding` two-pipe toggle is what caused the flap (deleted U.1 `3fc77be`).
- Window screenshot via PowerShell `CopyFromScreen` works — use it.
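
The retail landscape rule in the ledger (draw once, clipped to `outside_view`, only if `view_count>0`) reduces to a very small decision. The Python sketch below is a toy reading of that description, not a transcription of the retail pseudo-C: the `OutsideView` class, the polygon representation, and `landscape_draw_calls` are hypothetical names.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

Polygon = List[Tuple[float, float]]  # one convex view polygon in screen space


@dataclass
class OutsideView:
    """Toy stand-in: a LIST of convex view polygons, not a single convex set."""
    views: List[Polygon] = field(default_factory=list)

    @property
    def view_count(self) -> int:
        return len(self.views)


def landscape_draw_calls(outside_view: OutsideView) -> List[str]:
    """Landscape is drawn at most once, and only when some outside view exists."""
    if outside_view.view_count == 0:
        return []  # empty outside view: no landscape at all
    return [f"landscape clipped to {outside_view.view_count} view polygon(s)"]


sealed_room = OutsideView()
room_with_door_and_window = OutsideView(views=[
    [(0.0, 0.0), (0.2, 0.0), (0.2, 0.5), (0.0, 0.5)],  # doorway
    [(0.6, 0.3), (0.8, 0.3), (0.8, 0.5), (0.6, 0.5)],  # window
])

print(landscape_draw_calls(sealed_room))                     # []
print(len(landscape_draw_calls(room_with_door_and_window)))  # 1
```

The point of the sketch is the shape of the rule: a sealed interior produces no landscape draw, and any number of openings still produces a single clipped draw.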