Merge claude/strange-albattani-3fc83c into main — M1.5 work + render-pipeline pivot

Brings ~9 days of post-Phase-O work onto main: A6 indoor physics fidelity, issues
#98/#100/#101, A7 indoor lighting, the A8/A8.F rendering arc, and the 2026-05-30
camera-collision + physics viewer-cap work. Also lands the decision to ABANDON the
two-pipe (inside/outside) render approach in favor of Phase U — a single unified
retail-faithful portal-visibility pipeline (see
docs/research/2026-05-30-unified-render-pipeline-decision-and-handoff.md). The dormant,
gated-off A8 two-pipe code (issue #103) rides along and is deleted as Task 1 of Phase U.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
This commit is contained in:
Erik 2026-05-30 11:37:45 +02:00
commit 48213c5b46
240 changed files with 1640769 additions and 550 deletions

12
.gitignore vendored
View file

@ -31,6 +31,16 @@ launch-*.log
launch.utf8.log
n4-verify*.log
# A6.P5 (2026-05-25) — door-stuck reproduction captures (multi-MB);
# the 3-record fixture extracted from these lives at
# tests/AcDream.Core.Tests/Fixtures/door-bug/over-penetration-capture.jsonl
door-stuck-capture.jsonl
door-stuck-*.launch.log
door-stuck-*.launch.utf8.log
door-fix-*.launch.log
door-fix-*.jsonl
door-walkthrough.*
# ImGui auto-saved window/docking state (per-user, not source)
imgui.ini
@ -51,6 +61,8 @@ tmp/
# The committed reference workflow lives in CLAUDE.md "Retail debugger toolchain";
# session-specific traces should not pollute the repo.
*.cdb
# tools/cdb/ holds committed reference scripts — exempt them from the blanket rule above.
!tools/cdb/*.cdb
launch_*.log
launch_*.err
launch_*.ps1

310
CLAUDE.md
View file

@ -725,17 +725,289 @@ Visual side-by-side passed: Holtburg town, inn interior, dungeon all
render identically to pre-O. Spec:
[`docs/superpowers/specs/2026-05-21-phase-o-dat-path-unification-design.md`](docs/superpowers/specs/2026-05-21-phase-o-dat-path-unification-design.md).
**2026-05-30 — RENDER PIPELINE PIVOT (read this first).** The two-pipe
(inside / outside) render approach is **ABANDONED**. acdream inherited a
WorldBuilder-style split — a normal outdoor draw plus a separate flat
`RenderInsideOut` stencil pass toggled on `cameraInsideBuilding` — and that
split is the root cause of every indoor seam bug (the flap, missing/transparent
walls, terrain bleeding into interiors). Retail has no such split; it renders
through one portal-visibility traversal (`PView`) and is seamless by
construction. We are building **Phase U — a single unified retail-faithful
render pipeline**. This supersedes the A8/A8.F two-pipe arc (issue #103). The
camera-collision work (retail `SmartBox::update_viewer` spring arm) + a
physics viewer-cap fix **SHIPPED this session and are kept** (they're real and
retail-faithful, just not the seam fix). Full decision + scope + next-session
pickup prompt:
[`docs/research/2026-05-30-unified-render-pipeline-decision-and-handoff.md`](docs/research/2026-05-30-unified-render-pipeline-decision-and-handoff.md).
The M1.5 narrative below is history retained for context.
**Currently working toward: M1.5 — Indoor world feels right** (resumed
from 2026-05-20 baseline after Phase O ship). Demo scenario: enter the
Holtburg Sewer through the in-town portal, navigate through (walls
block, stairs work, items block, lighting reads correctly), exit back
to town. Phases: A6 (Indoor physics fidelity, cdb-driven) + A7
(Indoor lighting fidelity, RenderDoc + retail-decomp driven). Issues
in scope: #80, #81, #83, #88, #90 (workaround removal), L-indoor,
L-spotlight, stairs, 2nd-floor, cellar, and the
`TryFindIndoorWalkablePlane` synthesis removal. **M2 ("Kill a drudge")
is deferred until M1.5 lands.** Full M1.5 writeup at the corresponding
block in `docs/plans/2026-05-12-milestones.md`.
from 2026-05-20 baseline after Phase O ship). **A6.P1 + A6.P2 + A6.P3
slice 1 SHIPPED 2026-05-21.** **A6.P3 slice 2 v2 SHIPPED 2026-05-22**
(commit `f8d669b`): tried removing the L622 per-tick CP seed
(`892019b` v1) but it broke BSP step_up at the last step of stairs;
reverted + added a benign no-op-if-unchanged guard inside
`CollisionInfo.SetContactPlane`. Slice 2 outcome: **#96 partially
addressed — accepted as documented retail divergence** (the per-tick
seed is load-bearing for `AdjustOffset` slope-projection on sub-step 1
which BSP step_up depends on; matching retail would require deeper
refactor of AdjustOffset). Slice 2 verification surfaced a NEW
M1.5-blocking bug: **user cannot walk UP out of cottage cellar — stuck
at last step due to cell-resolver ping-pong (filed as issue #98,
Finding 3 family).** **A6.P3 slice 3 SHIPPED 2026-05-22** (commits `8898166` v1 +
`3e140cf` v2): cell-resolver stickiness added in `ResolveCellId`'s
indoor branch (point-in check against `fallbackCellId`'s CellBSP
before falling through to FindCellList). Data confirms ping-pong is
FULLY CLOSED — scen4 cellar capture shows 1 cell-transit (login
teleport) vs 20+ pre-fix. **#90 workaround now redundant — deferred
to A6.P4 removal. #98 APPARATUS COMPLETE 2026-05-23 evening**
(commits `35b37df` triage → `f62a873` cell-dump probe → `3f56915`
fixtures → `856aa78` replay harness → `6f666c1` cdb script →
`28c282a` divergence comparison doc). Four sessions of speculative
fixes (10+ variants) shipped the wrong diagnosis each time; this
session shipped the APPARATUS that turns evidence-driven analysis
into a 200ms test loop. Real divergence: retail's sphere is at
world Z ≈ 94.48 (resting on cottage floor) when find_walkable
accepts; acdream's failing-frame sphere is at world Z ≈ 92.01
(2.47m lower). Retail's ContactPlane writes during cellar-up are
ONLY flat floors (cellar floor or cottage floor), never the ramp.
Retail's find_crossed_edge fires once in 35K BPs; ours uses it
heavily. **Fix targets (priority): (1) Transition.AdjustOffset
slope projection / DoStepUp WalkInterp handling — ramp climb
doesn't gain Z; (2) cottage-cell candidacy using wrong sphere
reference; (3) find_crossed_edge over-use; (4) ramp polygon normal
divergence (low confidence).** Full divergence reading +
fix-plan pickup prompt at
[`docs/research/2026-05-23-a6-p3-issue98-replay-comparison.md`](docs/research/2026-05-23-a6-p3-issue98-replay-comparison.md).
Current A6 phase:
**A6.P3 — PAUSED 2026-05-23 (full day). Trajectory replay harness shipped
but BLOCKED on a new bug surfaced during commissioning.** Read
[`docs/research/2026-05-23-a6-p3-issue98-harness-handoff.md`](docs/research/2026-05-23-a6-p3-issue98-harness-handoff.md)
as the canonical pickup document — it has the chronological commit list,
the apparatus inventory, the exclusion list (do-not-retry), and three
concrete next-session options ranked by recommendation.
The session shipped further apparatus + first failed fix attempt + revert:
`8a232a3` (`[step-walk-adjust]` probe inside `Transition.AdjustOffset`
revealing branch tokens and per-call zGain), `8daf7e7` (findings note
at [`docs/research/2026-05-23-a6-stepwalkadjust-findings.md`](docs/research/2026-05-23-a6-stepwalkadjust-findings.md)
+ capture snapshot), `0cb4c59` (Shape 1 fix: gate `BSPQuery.AdjustSphereToPlane`'s
two `SetContactPlane` call sites by `Normal.Z >= 0.99`), `402ec10`
(revert — Shape 1 broke OnWalkable tracking, sphere went into falling
state on every sloped surface). **Refined diagnosis:** AdjustOffset is
CORRECT (145/146 calls take `into-plane` branch, +0.045 m mean zGain
per call when offset points into ramp); the climb CAPS at world Z ≈
92.80 because step-up's downward step-down probe finds no walkable
within 0.6 m below the proposed position (cottage floor is ABOVE).
Earlier "Fix targets 14" priority list is OBSOLETE — AdjustOffset
projection is not the problem. The actual bug is in the step-up
validation at the ramp top. **Honest next-session moves**: (1) build
deterministic trajectory replay harness so fix attempts iterate in
<500ms instead of 5-minute live-test cycles; (2) pivot to a less-
coupled M1.5 issue while #98 awaits the harness; (3) targeted decomp
research on `CEnvCell::find_env_collisions``BSPTREE::find_collisions`
indoor CP-setting chain (prior research worked on the outdoor
`CLandCell` path; indoor was never fully traced). Session-end ISSUES.md
entry has the full reading and pickup prompt. **NO further #98 fix
attempts until apparatus or research has converged — six+ failed
attempts in the saga is the signal.**
**Late-day extension (2026-05-23 PM):** trajectory replay harness shipped
(commits `4c9290c``5c6bdbe`). Mechanics work — runs 200 ticks in <100 ms.
Five tests pass. NEW finding: the cellar ramp polygon is in a GfxObj
(static building piece), not the cell's PhysicsPolygons. Harness now
includes `RegisterStairRampGfxObj` for synthetic stair construction
and `AttachSyntheticBsp` to wrap hydrated cells (which have BSP=null)
with a one-leaf BSP that exposes the indoor BSP collision path.
**NEW BLOCKER:** even with full apparatus, sphere goes airborne at
tick 1 with `hit=(0,1,0)` (a +Y wall normal matching no registered
geometry). 6 hypotheses tested via the harness, none isolated root cause.
Per systematic-debugging skill's "question architecture" rule, stop and
reflect. Next session: build a side-by-side comparison harness that
captures live PlayerMovementController state and diffs against the
test harness — evidence-first instead of speculation-first.
Findings doc:
[`docs/research/2026-05-21-a6-cdb-capture-findings.md`](docs/research/2026-05-21-a6-cdb-capture-findings.md).
**Evening extension v2 (2026-05-23 PM late) — apparatus shipped + root
cause identified.** Four commits (`fb5fba6``44614ab``0f2db62`
`f29c9d5`). The side-by-side comparison harness was built and exercised:
- `PhysicsResolveCapture` ships a JSON Lines writer for every player-side
`ResolveWithTransition` call. Off by default; turn on via
`ACDREAM_CAPTURE_RESOLVE=<path>`. Filtered to `IsPlayer` so NPC / remote
DR doesn't pollute.
- Two live captures from a cottage-cellar session (41K + 70K records).
- Three `LiveCompare_*` tests load 3 representative records (spawn,
on-ramp, first-cap). Spawn + on-ramp PASS bit-perfect; the first-cap
test originally FAILED with a clear divergence — and that divergence
pinpoints the root cause.
- **The cap is caused by `obj=0xA9B47900` — a landblock-baked cottage
GfxObj.** Cottage floor polygons live in this GfxObj's polygon table
(registered as a ShadowEntry), NOT in any cottage cell. The harness's
cell fixtures (0xA9B40143/146/147) don't include the cottage GfxObj,
so the harness fails to reproduce the live cn=(0,0,-1) cap.
- User's confirming observation: jumping in the cellar caps at the same
Z — purely vertical motion. This rules out every step-up / AdjustOffset
hypothesis from the prior 6-shape saga. The bug is the head sphere
hitting the cottage floor at Z=94.0 from below (math: foot Z=92.74
+ sphereHeight 1.20 = head center 93.94, head top 94.42, intersects
cottage floor Z=94.0).
- The first-cap test is now in documents-the-bug form (PASSES while
bug exists; FAILS when fix lands). Test baseline maintained at
1178 + 8 (serial run).
- 13 new cell fixtures cover the full 0xA9B4014X neighborhood (272 KB).
Findings doc (canonical pickup):
[`docs/research/2026-05-23-a6-p3-issue98-comparison-harness-findings.md`](docs/research/2026-05-23-a6-p3-issue98-comparison-harness-findings.md).
**Evening v2 follow-on — apparatus convergence SHIPPED 2026-05-23 PM.**
Two commits (`cc3afbc``97fec19`):
- `cc3afbc` adds the GfxObj dump infrastructure (`ACDREAM_DUMP_GFXOBJS`)
mirroring the existing `ACDREAM_DUMP_CELLS` pattern, with new
`GfxObjDump`/`GfxObjDumpSerializer` parallel to `CellDump`. The new
env var triggers `PhysicsDataCache.CacheGfxObj` to write the full
resolved polygon table as JSON when a listed id caches. Closes the
gap that the existing `[resolve-bldg]` probe couldn't fill (the BSP
wire site that populates `LastBspHitPoly` was never wired, so the
probe only emitted GfxObj-level metadata, not per-poly geometry).
- `97fec19` lands the cottage GfxObj fixture (`0x01000A2B`, 74 polygons,
BSP radius 13.989m matching live), the new `RegisterCottageGfxObj`
harness helper, and a minimum-stub landblock so
`TryGetLandblockContext` succeeds at the cellar XY. Harness now
reproduces the live `cn=(0,0,-1)` cap bit-perfect. The full per-field
round-trip uncovers ONE residual: live preserves +0.0266m of +X
motion through the cap (edge-slide along the cottage floor); harness
blocks all motion. Captured in
`LiveCompare_FirstCap_ResidualXMotionDivergence_DocumentsNextInvestigation`
in documents-the-bug form.
- All 21 issue-#98-relevant tests (12 harness + 4 GfxObjDumpRoundTrip +
1 new PhysicsDiagnosticsTests + 4 CellDumpRoundTripTests) pass
deterministically in isolation.
- Pre-existing test suite flakiness observed (819 failures across runs
of the same code, from PhysicsResolveCapture / PhysicsDiagnostics
statics leaking between test classes). INDEPENDENT of A6.P3 — verified
by stashing the cottage helper and reproducing the same flaky range.
Out of scope for this session; tracked as follow-up.
**Evening v3 finding (2026-05-23 PM, even later) — NEW root-cause
hypothesis identified:** the cottage-floor cap is a SYMPTOM. The actual
bug is **stale ramp contact plane causing per-tick Z drift** that makes
the cap reachable in the first place.
Evidence:
- Body's contact plane at cap = ramp's plane (n=(0, 0.7190, 0.6950),
d=-69.5035) from the live capture's `bodyBefore`
- Cellar ramp's actual world XY: X∈[129.7, 131.3], Y∈[10.19, 13.09]
(computed from the cellar cell fixture's vertex data + WorldTransform)
- Player position at cap: world (141.5, 7.22, 92.74) — **10 m away**
from the ramp in cell-local X
- `AdjustOffset` projects requested motion along the contact-plane
perpendicular. Math: dot((0.0266, -0.4022, 0), (0, 0.719, 0.695))
= -0.2892 → projected = (0.0266, -0.1943, +0.2010). **+0.201 m of
Z gain per tick**, applied because the engine believes the player
is on the slope.
- Head sphere top at cap = foot Z + 1.68 = 94.42. Cottage floor at
Z=94.00. **Head sphere exceeds cottage floor by 0.42 m** → cap fires
- If the contact plane refreshed to the flat cellar floor when the
player walked off the ramp, AdjustOffset would produce zero Z gain
(no Z component in requested motion + horizontal-plane perpendicular).
No drift, no cap.
How this question surfaced: user asked "we know how retail OPENs it
from above, how hard can it be to know how to open it from below?" —
that reframing made the question "what's different about our state
when walking up vs down?" The answer: **nothing, actually — the
cottage geometry is the same. But our contact plane is wrong.** The
six prior fix attempts were all investigating the cap-event mechanics
(step-up, slope projection at the cap, edge-slide, SidesType, +X
residual). None questioned why the contact plane was the ramp at all
when the player was 10 m from the ramp.
**Next-session move:** verify the stale-contact-plane hypothesis
chronologically against the live capture (walk the JSONL records, find
the last tick the player was on the actual ramp, quantify Z drift),
then locate the walkable-refresh code path in
`Transition.FindEnvCollisions` / `SpherePath.SetWalkable` that's
supposed to detect a new walkable polygon under the sphere and
overwrite the contact plane. Retail decomp anchor:
`CObjCell::find_env_collisions`. Full pickup prompt at the bottom of
[`docs/research/2026-05-23-a6-p3-issue98-comparison-harness-findings.md`](docs/research/2026-05-23-a6-p3-issue98-comparison-harness-findings.md).
**A6.P4 door bug — `pos_hits_sphere` near-miss recording shipped
2026-05-25 PM** (commit `3253d84`). Single-line ordering fix in
`BSPQuery.PosHitsSphere`: `if (hit) hitPoly = poly;` now precedes the
front-face cull, matching retail's `CPolygon::pos_hits_sphere` at
`acclient_2013_pseudo_c.txt:322974-322993` where `*arg5 = this` fires
on static-overlap BEFORE `dot(N, movement) >= 0 → return 0`. With this
ordering, Path 5's existing `if (hitPoly0 is not null)` near-miss
branch (`BSPQuery.cs:1869`) finally fires — `NegPolyHitDispatch`
sets `path.NegPolyHit`, the outer `transitional_insert` loop dispatches
via `slide_sphere`, and the sphere slides along walls it's touching
instead of squeezing through. The handoff hypothesized swept-sphere +
closest-considered-polygon tracking; reading retail showed both
`pos_hits_sphere` and `polygon_hits_sphere_slow_but_sure` are STATIC
tests using motion only for the cull — the fix is just the ordering.
3 new RED→GREEN unit tests in `BSPQueryTests.FindCollisions_Path5_*`
cover: overlap + parallel motion (RED→GREEN), overlap + away motion
(RED→GREEN), overlap + into motion (regression guard, already passed).
Zero regressions in full Core suite — with-fix failure set is a strict
subset of baseline (14 vs 17, the 14 are pre-existing static-leak
flakiness + 2 stale-capture document-the-bug tests). Issue #98
`LiveCompare_FirstCap_FixClosesCottageFloorCap` regression test
passes. **Needs visual verification at Holtburg cottage door inside-
out off-center ~50 cm scenario** before A6.P4 is marked complete —
sphere should block at the door surface with no squeeze-through. The
"runs a bit into the door" over-penetration symptom is hypothesized
to close together with the squeeze-through (continuous near-miss
recording while approaching a wall means the sphere slides along it
substep-by-substep rather than catastrophically penetrating then
recovering), but separate investigation if the symptom persists.
Original demo scenario (Holtburg Sewer end-to-end) is unreachable: sewer
doesn't exist on this server, and **issue #95** (portal-graph visibility
blowup) blocks any substitute dungeon. Revised M1.5 demo split into
building/cellar half (PARTIALLY ACHIEVABLE post-slice-1; cellar-ascent
blocked on #98) + dungeon half (blocked on #95). Issues in scope: #80,
#81, #83, #88, #90 (workaround removal after slice 3), **#95**
(visibility; not A6 scope), **#96** (L622 seed; retail divergence
accepted), **#97** (phantom collisions; may close as #98 side-effect),
**#98** (cellar-ascent stuck; A6.P3 slice 3 target), L-indoor,
L-spotlight, indoor sling-out (Finding 3 family with #98), and the
`TryFindIndoorWalkablePlane` definition deletion (A6.P4). **M2
("Kill a drudge") is deferred until M1.5 lands.** Full M1.5 writeup at
the corresponding block in `docs/plans/2026-05-12-milestones.md`.
**A6.P8 — Mesh-AABB-fallback phantom suppression for GfxObj-only stabs — SHIPPED 2026-05-25.**
Three commits: `f6305b1` (PhysicsDataCache.IsPhantomGfxObjSource + 3 unit tests),
`5240d65` (GameWindow.cs wire-in at line 6127), `6ca872f` (test-class doc
line-ref sync from code review). Issue #101 CLOSED — the 10 phantom stair
cyls on the Holtburg upper-floor cottage staircase are gone; collision
falls through to entity `0x40B50089` (GfxObj `0x01000C16`, `hasPhys=True`
BSP with walkable inclined polygon at `Normal.Z=0.717`, world ramp from
(111.10, 25.50, 94.00)→(107.50, 27.10, 97.50)). Visual-verified end-to-end
2026-05-25: holding W continuously climbs Z=94→97.5 over the full 45°
ramp; no phantom diagonal slides (`[cyl-test]` count on `obj=0x40B500*`
post-fix = 0 vs 7101 pre-fix). Spec:
[`docs/superpowers/plans/2026-05-25-issue-101-stairs-cyl-phantom.md`](docs/superpowers/plans/2026-05-25-issue-101-stairs-cyl-phantom.md).
**Issue #100 — Transparent ground around buildings — SHIPPED 2026-05-25 (primary acceptance);
visibility-culling follow-up handed off.** Three commits: `f48c74a` (terrain shader Z nudge,
retail `zFightTerrainAdjust = 0.00999999978` applied per-vertex in `terrain_modern.vert`),
`a64e6f2` (removed ~50 LOC of `hiddenTerrainCells` / `BuildingTerrainCells` plumbing across
LandblockMesh / LoadedLandblock / LandblockLoader / GameWindow / GpuWorldState /
LandblockStreamer + 2 dead tests), `84e3b72` (docs SHA stabilization follow-up).
Visual-verified 2026-05-25 PM at Holtburg: 24m × 24m transparent rectangles around
every cottage are GONE; ground reads as continuous cobblestone / grass. Plan:
[`docs/superpowers/plans/2026-05-25-issue-100-terrain-cutout.md`](docs/superpowers/plans/2026-05-25-issue-100-terrain-cutout.md);
predecessor research [`docs/research/2026-05-25-issue-100-terrain-cutout-handoff.md`](docs/research/2026-05-25-issue-100-terrain-cutout-handoff.md).
**Secondary finding from visual verification:** outdoor terrain mesh visible inside
cottage cellars at certain camera angles (clears when camera moves closer; gameplay
unaffected). High-confidence root cause: **indoor-cell visibility culling not gating
outdoor terrain** — same family as filed issue #78 (outdoor stabs visible through inn
floor) and #95 (dungeon portal-graph blowup). Per user direction, NOT filed as a new
issue; treated as additional evidence for #78. Next session investigates + ports
retail's `CEnvCell::find_visible_child_cell` (decomp anchor
`acclient_2013_pseudo_c.txt:311397`) and/or WB's `RenderInsideOut` stencil pipeline.
Full handoff with pickup prompt:
[`docs/research/2026-05-25-issue-100-shipped-and-culling-handoff.md`](docs/research/2026-05-25-issue-100-shipped-and-culling-handoff.md).
**Today's pre-M1.5 baseline (2026-05-20).** Five surgical fixes
shipped to close the user-reported "logged in inside the inn, ran
@ -1217,6 +1489,24 @@ via `PlayerMovementController.ApplyServerRunRate`) or from
change: old → new cell, world position, reason tag
(`resolver` / `teleport`). Low volume — only fires on actual cell
crossings. Runtime-toggleable via the same DebugPanel section.
- `ACDREAM_PROBE_PUSH_BACK=1` — A6.P1 cdb probe spike (2026-05-21).
Emits three line types per physics tick: `[push-back]` (per
`BSPQuery.AdjustSphereToPlane` call), `[push-back-disp]` (per
`BSPQuery.FindCollisions` dispatch), `[push-back-cell]` (per
`Transition.CheckOtherCells` off-cell hit). Heavy under motion
(~100500 lines/sec). Pair with retail's cdb breakpoint set at
`tools/cdb/a6-probe.cdb` for the A6.P1 capture protocol.
Runtime-toggleable via the DebugPanel "Diagnostics" section.
- `ACDREAM_CAPTURE_RESOLVE=<path>` — A6.P3 #98 live capture of every
player-side `PhysicsEngine.ResolveWithTransition` call (2026-05-23 PM
apparatus). Each call appends one JSON Lines record with full inputs,
PhysicsBody snapshot before AND after, plus the `ResolveResult`.
Filtered to `IsPlayer` mover flag — NPC / remote DR calls don't
pollute. Pairs with the trajectory replay harness comparison test
(`CellarUpTrajectoryReplayTests.Capture_*`) to diff captured vs harness
state per field — the first divergence pinpoints missing apparatus
state. Capture is OFF when the env var is unset (one null-check
cost per call).
- *(retired 2026-05-05 by L.3 M2/M3)* `ACDREAM_INTERP_MANAGER` was an
env-var gate on an experimental per-tick remote motion path. L.3 M2
(commit 40d88b9) replaced both gates (`OnLivePositionUpdated` +

View file

@ -44,10 +44,112 @@ Copy this block when adding a new issue:
---
## #103 — Phase A8.F portal-frame indoor rendering broken at runtime (visual-gate failure)
**Status:** SUPERSEDED 2026-05-30 by **Phase U (Unified Render Pipeline)**. The
two-pipe (inside/outside) approach this bug lives in is being abandoned wholesale —
the broken `RenderInsideOut` two-pipe path is deleted as Task 1 of Phase U and
replaced by a single unified retail `PView` portal-visibility pipeline. #103 will
not be fixed in place. See
[docs/research/2026-05-30-unified-render-pipeline-decision-and-handoff.md](research/2026-05-30-unified-render-pipeline-decision-and-handoff.md).
**Severity:** MEDIUM (opt-in branch only — default game unaffected)
**Filed:** 2026-05-29
**Component:** render (indoor visibility)
**Description:** With `ACDREAM_A8_INDOOR_BRANCH=1`, the A8.F retail portal-frame port
renders indoor/outside-in broadly wrong: cottage/cellar interiors covered in outdoor
terrain with transparent walls; invisible walls in other houses from inside and outside.
Default game (env var off) is unaffected — `cameraInsideBuilding = a8IndoorBranchEnabled
&& inside` (GameWindow.cs:7343). The old cellar flap remains in the default path.
**Root cause / status:** Two compounding causes (evidence in the handoff): (1) the
`OutsideView` builder under-produces — `OUTSIDEVIEW polys=0` most frames, and when
non-empty it doesn't recursively narrow (cellar shows ~full window). (2) The Task-6
Job-A/B decoupling draws terrain UNGATED when `OutsideView` is empty (`else` branch),
flooding the cell interior over the (correctly-rendered) walls. Cell walls DO render
(`[opaque]` tris=50-108). Projection math is correct; the builder integration is fragile.
**Files:** `src/AcDream.App/Rendering/PortalVisibilityBuilder.cs` (builder under-produces);
`src/AcDream.App/Rendering/GameWindow.cs` `RenderInsideOutAcdream` Step-4 `else` ungated-terrain (~11142).
**Research:** [docs/research/2026-05-29-a8f-visual-gate-failure-handoff.md](research/2026-05-29-a8f-visual-gate-failure-handoff.md) (root-cause analysis, apparatus, first-fix hypothesis, pickup prompt).
**Acceptance:** Holtburg cottage cellar renders with solid walls and no terrain flood;
terrain shows only through correctly-clipped portal openings; no invisible walls.
Related: #102 (builder dungeon-scaling fixpoint).
# Active issues
---
## #102 — A8.F PortalVisibilityBuilder — port retail update_count fixpoint (replace MaxReprocessPerCell cap)
**Status:** OPEN
**Severity:** MEDIUM
**Filed:** 2026-05-29
**Component:** rendering, visibility, EnvCell portal traversal
**Description:** A8.F Task 4 shipped a bounded-BFS port of retail's
`PView::ConstructView``ClipPortals``AddViewToPortals` in
[`src/AcDream.App/Rendering/PortalVisibilityBuilder.cs`](../src/AcDream.App/Rendering/PortalVisibilityBuilder.cs).
Code review found NO correctness bugs (the cellar-flap fix works and the
BFS terminates), but two scaling issues that bite only on CYCLIC /
high-fan-in portal graphs (dungeons, network hubs), NOT on the cottage
cellar (a 2-3 cell chain) which is the current M1.5 goal:
- **I-1 — the cap is load-bearing, not a safety net.** `MaxReprocessPerCell = 4`
is the *actual* termination mechanism for cyclic graphs. The
`if (nview.Polygons.Count > before)` re-enqueue-on-growth guard is a
near-no-op because `CellView.Add` (PortalView.cs) appends
unconditionally and never dedupes, so a cell almost always "grows" and
is re-enqueued — convergence relies entirely on the count hitting 4.
A cell reachable through **>4 contributing portals under-counts**
(drops legitimately-visible contributions).
- **I-2 — duplicate polygons accumulate on cyclic/multi-path graphs.**
Measured on a synthetic 4-room ring: 34 `OutsideView` polygons and
216-poly `CellView`s where retail converges to a small fixed set.
Correctness survives (overlapping stencil marks are idempotent) but
it's per-frame cost feeding the stencil pipeline.
**Root cause / status:** We approximate retail's monotone-fixpoint
convergence with a fixed re-process cap. Retail instead converges via an
`update_count` / `set_view(...,i)` slice watermark — each cell records a
timestamp/watermark of how much of its view has been propagated, so a
re-visit only re-propagates the *new* slice and the graph reaches a true
fixpoint with no duplicate accumulation and no arbitrary cap.
Retail anchors (`docs/research/named-retail/acclient_2013_pseudo_c.txt`):
- `AddToCell` 433050 — `esi[0x11]` update-count/slice watermark on the cell
- `InitCell` — per-cell timestamp init
- `AddViewToPortals` 433446 — change-detection that drives the fixpoint
**Related M-4 stub:** the neighbour-side `OtherPortalClip`
(decomp:433524) is also not yet ported — the builder clips the portal
opening against the *current* cell's view but does NOT additionally clip
against the *neighbour's* matching portal polygon. Its absence can only
ever OVER-include (mark cells/regions visible that retail would cull),
never under-include, so it's deferred. There is a `TODO(A8.F)` marker at
the relevant site in `PortalVisibilityBuilder.cs`.
**Files:**
- `src/AcDream.App/Rendering/PortalVisibilityBuilder.cs` — replace the
`MaxReprocessPerCell` cap + re-enqueue-on-growth guard with a
per-cell slice watermark; honest-limitation comment lives at the
`MaxReprocessPerCell` declaration.
- `src/AcDream.App/Rendering/PortalView.cs``CellView.Add` currently
never dedupes; the fixpoint port either dedupes here or tracks a
propagated-slice index per cell.
**Acceptance:** On a cyclic/hub portal graph (synthetic 4-room ring +
the Town Network dungeon hub), `OutsideView` / `CellView` polygon counts
converge to a small fixed set (no duplicate accumulation), every cell
reachable through any number of contributing portals is included, and
the BFS still terminates. Existing cottage-cellar tests stay green.
**MUST land before A8.F is relied on for dungeons** (dungeons are
currently blocked on #95 regardless).
---
## #87 — Drop WB fork patch by switching to PrepareEnvCellGeomMeshDataAsync
**Status:** OPEN
@ -131,11 +233,11 @@ the indoor-lighting plumbing.
---
## #78 — Outdoor stabs/buildings visible through the rendered floor
## #78 — Outdoor geometry (stabs + terrain mesh) visible inside EnvCells
**Status:** OPEN
**Severity:** HIGH (immediate visual jank now that floors render)
**Filed:** 2026-05-19
**Status:** OPEN — **next-session investigation target (2026-05-25)**
**Severity:** HIGH (immediate visual jank; broadened scope per 2026-05-25 PM finding)
**Filed:** 2026-05-19 (broadened 2026-05-25)
**Component:** rendering, visibility
**Description:** Standing inside Holtburg Inn looking at the floor or
@ -144,30 +246,64 @@ world position + scale — but visible THROUGH the floor and walls. As if
the cell mesh is rendered but doesn't occlude or stencil-cull what's
behind it.
**Additional evidence (2026-05-25 PM, post-#100 visual verification):**
After issue #100 shipped (commits `f48c74a`, `a64e6f2`, `84e3b72`) and
removed the `hiddenTerrainCells` cell-collapse mechanism, the OUTDOOR
TERRAIN MESH is now (correctly per retail) rendered everywhere on the
landblock — including in 3D regions occupied by indoor EnvCell volumes.
Visual verification at a Holtburg cottage cellar showed a sharp-edged
rectangular grass patch (outdoor terrain at Z≈93.99) rendering over the
cellar stair geometry at certain camera angles. Clears when camera
moves closer (cottage walls + stair treads geometrically occlude the
terrain from new vantage points). Gameplay unaffected. **This is the
same root cause as the existing #78 hypothesis #2** ("outdoor stabs not
culled when player in EnvCell"), just with outdoor terrain mesh
affected in addition to outdoor stab entities. Per user direction,
NOT filed as a new issue — additional evidence reinforces #78's
hypothesis #2, broadens scope of the fix to include terrain culling.
**Root cause / status:** Two plausible causes:
1. The `+0.02f` Z bump applied to cell origin at `GameWindow.cs:5362`
pushes the floor mesh 2 cm above terrain, so depth test correctly
occludes terrain. But OUTDOOR STABS (landblock-baked building geometry)
at the same X,Y may have Z values comparable to or higher than the
cell-mesh floor, producing z-fighting / see-through.
2. Outdoor stabs aren't being culled when the player is inside an
2. **(High confidence as of 2026-05-25)** Outdoor geometry (stabs AND
terrain mesh) isn't being culled when the player is inside an
EnvCell — this is the Phase 1 Task 3 deferred work
("Cull outdoor stabs when indoors via VisibleCellIds"). WB has a
`RenderInsideOut` stencil pipeline (`references/WorldBuilder/Chorizite.OpenGLSDLBackend/Lib/VisibilityManager.cs`)
that acdream never invokes.
that acdream never invokes. Retail anchor:
`docs/research/named-retail/acclient_2013_pseudo_c.txt:311397`
(`CEnvCell::find_visible_child_cell` at address `0x0052dc50`,
called from `acclient_2013_pseudo_c.txt:280028`).
**Files:**
- `src/AcDream.App/Rendering/Wb/WbDrawDispatcher.cs` (per-entity walk —
consider gating outdoor stab entities on visible-cell membership).
the dispatcher already filters by `entity.ParentCellId ∈
visibleCellIds` but outdoor stabs have `ParentCellId == null` so they
always pass; needs an explicit indoor-camera gate).
- `src/AcDream.App/Rendering/TerrainModernRenderer.cs` (currently
renders all loaded landblock terrain unconditionally; needs
visibility gating when camera resolves to an indoor cell).
- `src/AcDream.App/Rendering/CellVisibility.cs:222+` (`ComputeVisibility`
returns `VisibleCellIds`; the dispatcher already filters by
`entity.ParentCellId ∈ visibleCellIds` but outdoor stabs have
`ParentCellId == null` so they always pass).
returns `VisibleCellIds`; existing portal-LOS infrastructure to build on).
- `references/WorldBuilder/Chorizite.OpenGLSDLBackend/Lib/VisibilityManager.cs`
(`RenderInsideOut` pipeline — reference implementation, never invoked).
**Acceptance:** Standing inside a sealed-interior cell, no outdoor
geometry is visible through floor/walls. Standing where a cell has a
real outdoor portal (door open, window) outdoor geometry is correctly
visible through the portal.
visible through the portal. Cellar-stairs case (2026-05-25 finding):
standing in a Holtburg cottage cellar at any camera angle, no outdoor
terrain mesh visible over the stair geometry.
**Research:**
[`docs/research/2026-05-25-issue-100-shipped-and-culling-handoff.md`](research/2026-05-25-issue-100-shipped-and-culling-handoff.md)
— full session handoff with cellar-stairs evidence, family map (#78 +
#95 + cellar-stairs), root-cause hypothesis, retail anchors, WB
references, do-not-retry list, and pickup prompt for the
investigation session.
---
@ -574,6 +710,329 @@ Retail oracle for cell-id hysteresis: `acclient_2013_pseudo_c.txt:308742-308783`
---
## #95 — Dungeon portal-graph visibility blowup (see-through-walls / other dungeons rendered)
**Status:** OPEN — **explains user-observed "dungeons are broken"**
**Severity:** HIGH (blocks all dungeon navigation visually)
**Filed:** 2026-05-21
**Component:** rendering, visibility, EnvCell portal traversal
**Description:** When +Acdream enters a dungeon via portal (verified at Town Network hub in A6.P1 scen5), the `visibleCells` count per cell explodes from a normal ~4-7 to **135-145**, and cells from **multiple disconnected landblocks** are loaded simultaneously. Observed result: the player can see through walls, sees geometry from other dungeons rendering inside the current dungeon, and rendering is generally garbled. This single bug is responsible for "dungeons are broken" as a whole — every portal-accessed dungeon hits this on entry.
**Root cause / status:** Suspected: portal-graph traversal in the EnvCell visibility computation walks outbound portals recursively without proper termination, so a network hub (which has many outbound portals to different dungeons) marks 100+ cells from disconnected dungeons as visible. The visibility computation likely needs to (a) cap traversal depth, (b) terminate at portal boundaries to OTHER landblocks, or (c) only include cells that share line-of-sight through a chain of portals from the camera's current cell.
**Evidence (committed):**
- `docs/research/2026-05-21-a6-captures/scen5_sewer_entry/acdream.log` — full trace of the rendering breakdown after portal teleport.
- Pre-teleport: `visibleCells=4` per cell (normal outdoor).
- Post-teleport: `visibleCells=135-145` per cell at landblock 0x0007 + spurious cells from 0x020A and 0x0408 (different worldOrigins, i.e. different dungeons entirely).
- Cell-transit chain: `0xA9B40003 -> 0x00070143 reason=teleport` is the portal entry; everything after the teleport is corrupted.
**Files:**
- `src/AcDream.App/Streaming/` — cell streaming + visibility logic (suspect: cell-cache visibility computation)
- WB-extracted visibility: `src/AcDream.App/Rendering/Wb/` (whichever file owns `visibleCells`)
- Check `EnvCellRenderManager` + `VisibilityManager` in `references/WorldBuilder/` for the WB-original algorithm and where our extraction may have diverged
**Research:** scen5 acdream.log is the primary evidence. Compare against WorldBuilder's original portal-traversal termination logic.
**Acceptance:** After portal entry to any dungeon, `visibleCells` per cell stays in the normal ~4-15 range, cells from non-adjacent landblocks do NOT appear in the cell-cache, and visually no other-dungeon geometry renders through walls.
---
## #96 — Per-tick PhysicsEngine.ResolveWithTransition CP seed (retail divergence)
**Status:** PARTIALLY ADDRESSED — accepted as documented retail divergence
**Severity:** LOW (cosmetic — CP-write counter inflates but behavior is correct)
**Filed:** 2026-05-21
**Component:** physics, ContactPlane retention
**Description:** After A6.P3 slice 1 (commits `5aba071` + `5f7722a` + `39fc037`) stripped the `TryFindIndoorWalkablePlane` synthesis path from `Transition.FindEnvCollisions` indoor branch, scen3 post-fix re-capture showed acdream still writes ContactPlane fields 25,082 times during a flat-floor walk — 24,906 of those (99.3%) come from `PhysicsEngine.ResolveWithTransition` line 622, which seeds `ci.ContactPlane` from `body.ContactPlane` at every transition start when the body is grounded. Retail's equivalent code path fires `set_contact_plane` zero times during the same flat-floor walk (scen3 retail BP7 = 0).
**Slice 2 attempt + outcome (2026-05-22, commits `892019b` + `f8d669b`):**
- **v1 attempt (`892019b`):** Removed the L622 seed entirely to match retail's `CTransition::init` clear-at-start behavior. Verified per-rebuild that the change deployed. CP-write count dropped 91% (30,420 → 2,690). **But broke BSP step_up at the last step of stairs** — sub-step 1's `AdjustOffset` had no ContactPlane to compute the lift direction, BSP step_up thrashed (12,489 push-back-disp + 2,226 push-back-cell signal). User confirmed: "I can't pass the last step of the stairs."
- **v2 fix (`f8d669b`):** Reverted the seed removal + added no-op-if-unchanged guard inside `CollisionInfo.SetContactPlane`. The guard early-returns when called with values identical to current state. **The guard doesn't trigger for the L622 seed** because each tick gets a fresh `Transition` (so `ci.ContactPlaneValid=false` on entry → guard fails → write fires). So slice 2 v2 didn't actually reduce CP-write count for the seed case. It does dedupe within-tick redundant writes (e.g. Mechanism B restoring LKCP that equals current ci.CP), which is a small benign improvement.
**Root cause / status (updated 2026-05-22):** The L622 seed IS load-bearing for `AdjustOffset` slope projection on sub-step 1, which BSP step_up depends on. Retail uses a different architecture (no seed; first sub-step has no CP and BSP path-6 establishes it). Matching retail would require a deeper refactor — making `AdjustOffset` fall back to `body.ContactPlane` when `ci.ContactPlane` is invalid, OR re-architecting the sub-step loop to not require CP for the first iteration. Both are non-trivial.
**Accepting the divergence:** the per-tick seed call is functionally correct — it propagates the player's current contact plane to the transition. The cost is a noisy CP-write counter (cosmetic) but the BEHAVIOR matches retail (player stays grounded on the correct plane, slope-snap works, step_up works). Closing #96 fully is deferred to a future refactor or accepted as is.
**Lessons learned:**
- A counter-based metric (CP-write count) is not always a direct proxy for "behavior matches retail." Retail's set_contact_plane firing rate differs from ours because the call-site structure differs, not because the behavior differs.
- The slice 1 hypothesis "Finding 1 (dispatcher entry frequency) may close as side-effect of Finding 2 (CP-write)" was confirmed by stairs+cellar working post-slice-1. But the slice 2 follow-up assumption "remaining 99.3% of CP writes are also a problem" was partially wrong — those writes are correct state propagation.
**Files:**
- `src/AcDream.Core/Physics/PhysicsEngine.cs:620-626` (the seed call site, retained with updated comment)
- `src/AcDream.Core/Physics/TransitionTypes.cs:259-279` (`CollisionInfo.SetContactPlane` no-op guard, retained as small improvement)
**If revisited:** investigate `AdjustOffset` fallback to `body.ContactPlane` when `ci.ContactPlane` is invalid — that would let us safely remove the seed. Or investigate retail's exact first-sub-step behavior to see if there's a different missing piece in our BSP step_up that would let it work without a seeded CP.
---
## #97 — Phantom collisions + occasional fall-through on indoor 2nd floor (post-slice-1 happy-testing)
**Status:** OPEN — **investigate after issue #96 lands** (hypothesized to be a side-effect)
**Severity:** MEDIUM (intermittent; doesn't block stair-walking which works post-slice-1)
**Filed:** 2026-05-21
**Component:** physics, ContactPlane stability
**Description:** During user happy-testing post-A6.P3 slice 1 (2026-05-21), walking on the inn 2nd floor in acdream produced:
- Intermittent "phantom collisions" — hitting invisible barriers in open floor space.
- One observed "fall-through the floor" — character dropped through the 2nd floor at a specific spot.
These are NOT the indoor stair-climb or cellar-descent symptoms (those WORK post-slice-1). They appear during normal flat-floor walking.
**Root cause / status:** Hypothesis: caused by issue #96 (L622 per-tick CP seed). The seed writes `ci.ContactPlane` every tick from `body.ContactPlane`, which may carry stale values across cell transitions or after the BSP didn't land a fresh plane. If a transient `ci.ContactPlane` value points to a plane that doesn't match the actual current floor geometry, `ValidateWalkable` (called from the outdoor terrain fallback) or downstream physics may briefly believe the player is below the floor → fall-through; OR may believe a wall is present where there isn't one → phantom collision.
Falsifiable: if #96 fix closes #97 as a side-effect, the hypothesis is confirmed. If #97 persists post-#96, deeper investigation needed (possibly cell-resolver stickiness — Finding 3 family).
**Reproduction (informal — needs sharpening):**
- Launch acdream, teleport to inn 2nd floor.
- Walk back and forth across the floor for ~30 seconds in various patterns.
- Phantom collisions appear intermittently — exact reproduction location unknown.
- Fall-through happened at one specific spot; location not recorded.
**Files:**
- `src/AcDream.Core/Physics/PhysicsEngine.cs` (CP seed + body persist)
- `src/AcDream.Core/Physics/TransitionTypes.cs` (`Transition.FindEnvCollisions` indoor branch + `Transition.ValidateTransition`)
- `src/AcDream.Core/Physics/BSPQuery.cs` (Path-6 land write site)
**Acceptance:** Walking on inn 2nd floor for ≥60 seconds in varied patterns produces zero phantom collisions and zero fall-through events.
---
## #98 — [DONE 2026-05-24 · `b3ce505`] Cellar ascent stuck at top (NOT BSP step; per-cell-list architectural divergence)
**Closed:** 2026-05-24
**Commit:** `b3ce505 fix(phys): A6.P3 #98 — gate outdoor shadow radial sweep on indoor primary cell`
**Resolution:** The proximate fix is the indoor-primary radial-sweep
gate in `ShadowObjectRegistry.GetNearbyObjects`. Architectural root
cause: our landblock-wide spatial shadow registry diverges from
retail's per-cell `shadow_object_list` with portal-aware registration —
the cottage GfxObj (registered landblock-wide via cellScope=0) was
returned to sphere queries inside the cellar EnvCell, and its
downward-facing floor poly at world Z=94 head-bumped the climbing
sphere from below.
After ~10 failed speculative fix attempts across four sessions, the
fix landed cleanly once the apparatus converged. The "v3 stale ramp
contact plane" hypothesis was falsified by chronological replay against
`a6-issue98-resolve-capture-2.jsonl` — the player IS on the ramp at the
cap event; the contact plane is correctly the ramp's plane; the head
sphere bumps the cottage GfxObj's floor poly from below (the
evening-v2 finding was correct all along).
Decomp anchors (`docs/research/named-retail/acclient_2013_pseudo_c.txt`):
- 308742+ : `CObjCell::find_cell_list` — indoor/outdoor branch
- 308751-308769 : the branch — indoor adds 1 cell; outdoor calls `add_all_outside_cells`
- 308773-308825 : portal-visible neighbor recursion
- 308916 : `CObjCell::find_obj_collisions(this, ...)` — strict per-cell iteration
**Visual verification 2026-05-24:** user confirmed "Finally I can go up!"
**Knowledge artifacts:**
- Findings doc resolution section: [`docs/research/2026-05-23-a6-p3-issue98-comparison-harness-findings.md`](research/2026-05-23-a6-p3-issue98-comparison-harness-findings.md) (bottom)
- Memory: `feedback_retail_per_cell_shadow_list.md`, `feedback_apparatus_for_physics_bugs.md`
- A6.P4 phase planned to do the full retail-faithful per-cell port and obviate the b3ce505 stopgap
**Known regression introduced:** doors at doorway thresholds — see #99 below.
---
## #99 — Run-through doors at building thresholds (regression from b3ce505)
**Status:** OPEN
**Severity:** HIGH (M1 demo regression — opening doors was previously a working demo target)
**Filed:** 2026-05-24
**Component:** physics, shadow-object collision query
**Description:** With the issue #98 fix (commit `b3ce505`), the
indoor-primary radial-sweep gate causes our engine to miss outdoor-
registered door entities when a sphere has crossed the threshold and
the primary cell resolves to the indoor side. Players can walk through
doors that previously blocked them.
User report 2026-05-24: "I can also run through doors."
**Root cause / status:** This is the doorway edge case explicitly
flagged in the b3ce505 commit message. Doors are server-spawned
entities with their own cylinder collision, registered via
`UpdatePosition` to whichever cell their position resolves to. Doors
at building thresholds typically resolve to **outdoor** cells. The
b3ce505 gate skips the outdoor radial sweep when the sphere's primary
cell is indoor → outdoor-registered doors are not returned → no
collision → walk-through.
Retail handles this case via the portal-visible recursion in
`find_cell_list` (lines 308773-308825 of the named-retail decomp): at
registration time, an object is added to its position's cell PLUS all
portal-visible neighbor cells. So a door at a doorway portal ends up in
both the outdoor cell's shadow list AND the indoor cell's list — a
sphere on either side sees it.
**Fix path:** Closes naturally as part of A6.P4 (per-cell shadow
architecture refactor — see design spec at
`docs/superpowers/specs/2026-05-24-phase-a6-p4-retail-shadow-architecture.md`).
A6.P4 ports retail's `find_cell_list` indoor branch + portal recursion
into `ShadowObjectRegistry.Register`, eliminates the cellScope=0
landblock-wide approximation, and removes the b3ce505 stopgap.
If A6.P4 takes longer than expected, an intermediate "portal-aware
indoor query" patch (~20 lines: walk indoor cells' `VisibleCellIds`,
collect portal-reachable outdoor cells, include in `GetNearbyObjects`
indoor branch) would close #99 without touching registration. Tagged
as fallback option B in the A6.P4 spec.
**Files:**
- `src/AcDream.Core/Physics/ShadowObjectRegistry.cs``GetNearbyObjects` indoor branch
- `src/AcDream.Core/Physics/TransitionTypes.cs:2180+``FindObjCollisions` caller
**Acceptance:** Doors at Holtburg cottage/inn doorways block the player
from both sides (outside walking in, inside walking out). Issue #98's
cellar-up fix remains intact.
**Related:** #98 (sibling — same architectural cause), #97 (phantom
collisions on 2nd floor — also likely closed by A6.P4), Finding 3
family (sling-out — also likely).
---
---
## #98-old-context-preserved-for-reference
(retained from the OPEN form for historical context — superseded by the
DONE resolution above. Skip to next active issue if you've read enough.)
**Status:** OPEN — **NEW diagnosis after A6.P3 slice 3 (2026-05-22)**
**Severity:** HIGH (blocks M1.5 demo cellar half — user can descend but cannot return)
**Filed:** 2026-05-22
**Component:** physics, BSP step_up / step_down at cellar stair geometry
**Diagnosis update 2026-05-22 (post A6.P3 slice 3):** The cell-resolver ping-pong (the original hypothesis when this issue was filed) WAS confirmed and is now FIXED by slice 3 (commits `8898166` v1 + `3e140cf` v2 — point-in stickiness check in `ResolveCellId`). Data confirms: scen4_cottage_cellar_slice3v2 capture shows only 1 cell-transit event (login teleport) vs 20+ pre-fix.
BUT the cellar-up symptom PERSISTS even with the cell-resolver fix. The remaining cause is a BSP step physics issue at the cellar stair geometry. User report: "I'm running up the stairs, at the top it looks like I'm running into something. Still running animation but not going up." Player can climb most of the stair flight but gets blocked at the TOP step where the cellar transitions to the cottage main floor.
**Evidence from slice3v2 capture:**
```
[push-back] site=adjust_sphere in=(*, -0.0752, 0.0077) out=(*, -0.0752, 0.7577)
delta=(0, 0, 0.7500) n=(0, -0.7190, 0.6950) d=-0.1007
r=0.4800 winterp=1.0000->0.0000 applied=True
```
- Surface normal `(0, -0.719, 0.695)` — sloped 44° (walkable per FloorZ=0.664)
- Push-back lifts sphere by 0.75m (step_down probe distance) repeatedly
- `winterp 1.0→0.0` — entire walk interpolation consumed by the lift each tick
- Player Z stays stuck around 0.0077 (relative to cell) → not progressing
**Hypothesis:** the step_down probe at the top of the cellar stair is hitting the sloped TOP step face (or possibly a wall poly), and consuming all walk interp pushing back. No remaining interp to actually walk forward over the top.
**Diagnosis sharpened 2026-05-22 (commit `134c9b8`)** — paired retail+acdream cdb capture confirmed cellar ascent ends with retail's BP7 setting ContactPlane to the cottage main floor (flat plane at world Z=94, 18 BP7 hits all the same plane).
**Diagnosis CORRECTED 2026-05-22 evening (slice 5 `[place-fail]` probe)** — the morning handoff's "Path 5 vs Path 6 in `BSPQuery.FindCollisions`" diagnosis is **WRONG**. The slice-5 probe-driven evidence shows:
- Retail's BP4 trace has every find_collisions hit with `collide=0`. Retail enters the same `(state & 1) Contact` branch our acdream does. There is NO outer-dispatcher path-selection divergence.
- Retail's BP5 fires on the ramp poly 17+ times during the ascent, NOT "30 hits all on flat planes" as the morning claim said. We misread the retail data.
- The actual blocker is polygon **0x0020** in the cellar cell's BSP (`n=(0,0,-1) d=-0.2` in cell-local, world Z=93.82 — the cellar's ceiling). When step-up's step-down probe lifts the sphere onto a 45° walkable surface, the sphere top extends past the ceiling polygon and `SphereIntersectsSolidInternal` correctly rejects.
- Retail succeeds because its `check_cell` transitions to cottage main floor cell 0xA9B40146 during the ascent, where the cellar's ceiling polygon is absent. Our `check_cell` stays at cellar 0xA9B40147.
Full slice 5 evidence + sharpened next-step pickup at [`docs/research/2026-05-22-a6-p3-slice5-handoff.md`](docs/research/2026-05-22-a6-p3-slice5-handoff.md). Capture data at `docs/research/2026-05-21-a6-captures/scen4_cottage_cellar_place_fail/`.
**Diagnosis FINALIZED 2026-05-23 evening** (commit `28c282a`, divergence doc at [`docs/research/2026-05-23-a6-p3-issue98-replay-comparison.md`](docs/research/2026-05-23-a6-p3-issue98-replay-comparison.md)). After 4 sessions of speculative fixes (10+ variants, none worked), apparatus shipped to turn evidence-driven analysis into a 200ms test loop:
- Deterministic replay harness: [`tests/AcDream.Core.Tests/Physics/Issue98CellarUpReplayTests.cs`](tests/AcDream.Core.Tests/Physics/Issue98CellarUpReplayTests.cs) loads the three cottage/cellar cell fixtures (captured live via the new `ACDREAM_DUMP_CELLS` probe) and drives the failing-frame sphere through our walkable predicates. 7 tests, all pass, all reproduce the live failure without a client launch.
- Retail comparison: [`docs/research/2026-05-23-a6-captures/cellar_up_capture_1/retail.decoded.log`](docs/research/2026-05-23-a6-captures/cellar_up_capture_1/retail.decoded.log) — 35K cdb BP hits during the equivalent retail cellar-up.
**REAL divergence**: NOT cell-resolver. NOT path-selection. NOT polygon 0x0020 the cellar ceiling.
- Retail's sphere is at world Z ≈ **94.48** (resting on cottage floor) when `find_walkable` accepts the cottage main floor plane.
- Our failing-frame sphere is at world Z ≈ **92.01** (2.47m lower) when our walkable query rejects the cottage main floor.
- Retail's `ContactPlane` writes during cellar-up are ONLY flat horizontal planes (cellar floor Z=90.95 OR cottage floor Z=94.00). Never the ramp.
- Retail's `find_crossed_edge` fires ONCE in 35K BPs. Acdream uses it heavily.
**Fix targets** (priority order, from the comparison doc):
1. (HIGHEST) Step-up + ramp climb doesn't gain enough Z per tick. Retail climbs gradually across thousands of ticks; ours oscillates at Z≈92. Look at `Transition.AdjustOffset` slope projection + `Transition.DoStepUp` WalkInterp handling.
2. Cottage-cell candidacy uses wrong sphere reference (pre-step-up vs step-lifted center).
3. `find_crossed_edge` over-use in our walkable acceptance path.
4. (LOW) Ramp polygon normal divergence.
**Failed fix attempts (informational):**
- WalkInterp reset before placement_insert (commit `bbd1df4`) — logical retail-faithful improvement but doesn't fix the cellar-up symptom. Keep.
- Slice 3 v1/v2/v3 cell-resolver stickiness — closed ping-pong but didn't help cellar-up. v3 reverted (`8bd3117`).
- Slice 5: `[place-fail]` probe + diagnosis correction. Useful infrastructure; not a fix.
- Slice 6 (2026-05-22 PM): 6 placement-insert bypass variants. None unstuck the player.
- Slice 7 (2026-05-23 AM): terrain hole cutout, multi-sphere CellTransit, building bldg-check, negative-side polygon support, render-vs-physics origin split. Triaged in commit `35b37df`: kept render-physics split + multi-sphere CellTransit + diagnostic probes; reverted neg-poly + bldg-check (didn't fix #98).
**Related:**
- Inn stairs UP works (different geometry, doesn't trigger this specific failure mode)
- Cellar descent works (only ascent fails — direction matters)
- Issue #90 (cell-id ping-pong workaround in `ResolveCellId`) is now superseded by slice 3 v2's stickiness check; can be removed in A6.P4 after broader visual verification
**Description:** Walking UP from a Holtburg cottage cellar in acdream gets stuck "just almost at the last step up." Stairs going UP elsewhere (inn 2nd floor) work fine post-A6.P3 slice 1. Cellar DESCENT works. Only the cellar ASCENT from the bottom back to ground level fails — specifically at the last step where the player should transition from the indoor cellar cell to the cottage ground-floor cell.
**Evidence:** captured in slice 2 v2 verification at `docs/research/2026-05-21-a6-captures/scen3_inn_2nd_floor_slice2v2/acdream.log`. Cell-transit chain shows the resolver ping-ponging between three adjacent cells:
```
0xA9B4014B → 0xA9B4014A → 0xA9B4013F → 0xA9B4014A → 0xA9B4014B → ...
(Z stays ~96.4 throughout the ping-pong — vertical position stable but cell classification oscillating)
```
Eventually the player gives up and returns down: `0xA9B4013F → 0xA9B40143 (Z drops to 94.020) → 0xA9B40146 (Z 93.426) → ...`
Each cell-transit event has `reason=resolver`, meaning `PhysicsEngine.ResolveCellId` is making the decision. The resolver classifies the position into a different cell each tick → `AdjustOffset` operates against a different cell's geometry each tick → can't accumulate forward motion → stuck.
**Root cause / status:** Same family as scen4 sling-out (A6.P2 Finding 3) and issue #90 cell-id ping-pong (which has a workaround). The retail oracle is `CObjCell::find_cell_list` Position-variant at `acclient_2013_pseudo_c.txt:308742-308783`. Retail uses cell-array hysteresis / stickiness to prevent flipping CellId on adjacent-cell boundaries when the sphere is on the boundary.
Our `ResolveCellId` + `CheckBuildingTransit` lack this stickiness — every tick they re-classify based on current position, ignoring "we were already in cell X last tick; if the new position is still close to X, stay in X."
**Fix sketch (slice 3):**
1. Port retail's cell-array hysteresis from `CObjCell::find_cell_list`.
2. Modify `ResolveCellId` to prefer the previous tick's CellId when the sphere is close to (but slightly outside) the previous cell's CellBSP volume.
3. Modify `CheckBuildingTransit` similarly for building-shell transitions.
4. May obsolete issue #90's workaround (the same stickiness mechanism would handle the doorway ping-pong too).
**Related issues:**
- Issue #90 — Cell-id ping-pong at indoor doorway threshold (existing workaround; should be removed if Finding 3 fix lands cleanly)
- Issue #97 — Phantom collisions + fall-through on 2nd floor (may also be the same cell-resolver instability)
- A6.P2 Finding 3 — Indoor cell-resolver sling-out (scen4)
**Files:**
- `src/AcDream.Core/Physics/PhysicsEngine.cs` (`ResolveCellId`)
- `src/AcDream.Core/Physics/CellPhysics.cs` (`CheckBuildingTransit`)
- `src/AcDream.Core/Physics/CellTransit.cs` (cell list iteration; may need stickiness here)
**Acceptance:** User can walk up out of a Holtburg cottage cellar without getting stuck at the last step. Cell-transit log shows no ping-pong on the cellar boundary. Issue #90 workaround can be removed (verified by ping-pong staying absent at the inn doorway too).
**2026-05-23 evening session update — Shape 1 attempted + reverted:**
- New apparatus committed:
- `8a232a3``[step-walk-adjust]` probe inside `Transition.AdjustOffset` (PhysicsDiagnostics.LogStepWalkAdjust + four branch tokens). Reveals which projection branch fires per call.
- `8daf7e7` — captured findings note at [`docs/research/2026-05-23-a6-stepwalkadjust-findings.md`](docs/research/2026-05-23-a6-stepwalkadjust-findings.md) + log snapshot at `docs/research/2026-05-23-a6-captures/stepwalkadjust/acdream.log`.
- **Refined diagnosis (corrects the 2026-05-23 evening "fix targets" priority above):** AdjustOffset is CORRECT — 145/146 calls take the `into-plane` branch with consistent +0.045 m mean zGain per call when offset points into the ramp normal. Sphere world Z climbs monotonically 90.95 → 92.80 across the ramp. **The climb caps at world Z ≈ 92.80** (cottage floor at 94.00 still 1.20 m above) because at the ramp top, the proposed check (Z=92.85) gets rejected by step-up's downward step-down probe — no walkable surface exists below the proposed position within stepDownHeight=0.6 m (cottage floor is ABOVE, not below). 101 `stepdown-reject` hits in the capture vs 1 acceptance.
- **Shape 1 fix attempted (`0cb4c59`, reverted in `402ec10`):** Added `PhysicsGlobals.ContactPlaneFlatThreshold = 0.99f` and gated `BSPQuery.AdjustSphereToPlane`'s two `SetContactPlane` call sites by `worldNormal.Z >= threshold`. The intent: match retail's cdb-observed pattern where CP is ONLY ever set on flat polygons (cellar floor or cottage floor — Normal.Z = 1.0 in all 161 BPE writes). Live test confirmed the fix breaks OnWalkable tracking: 18,916 / 25,671 step-walk lines (74%) ended in `contact=False onWalkable=False cp=n/a walkPoly=False` (the falling state). User report: "can't get up the first step. Jumped, stuck in falling animation." The gate was too aggressive — sloped walkable polygons (stair tops, ramp faces) NEED ContactPlane set for the sphere to register as on a surface.
- **What we learned about Shape 1:** simply skipping `set_contact_plane` on sloped polygons doesn't match retail behavior. Either retail synthesizes a flat CP from a sloped contact (the `step_sphere_down:321203` `Plane::Plane(&plane, esi, &point)` codepath — `esi` may be a synthesized direction, not the polygon's normal), OR retail's gate is upstream of `set_contact_plane` (the polygon never reaches CP-setting in the first place), OR our `OnWalkable` tracking is over-coupled to `ContactPlaneValid` in a way retail's isn't. The named-decomp research did not converge on a definitive answer.
**Session paused 2026-05-23 evening after two days of work.** Apparatus + probe + findings + plan + first failed fix + revert all committed. M1.5 demo's cellar half remains blocked. The honest next-session moves, in order:
1. **Build a deterministic trajectory replay harness** (drives the physics engine through N ticks with mocked input + snapshotted starting state, runs in <500ms). The Issue98 replay tests are half of this they have the cell fixtures. The missing half is the per-tick driver. With a 200ms inner loop instead of 5-minute live-test iteration, evidence-driven fix attempts become tractable.
2. **OR pivot to another M1.5 issue** with less cross-subsystem coupling. The cellar-up bug lives at the seam of AdjustOffset + ContactPlane + WalkInterp + step-up + walkable tracking + OnWalkable + cell-set membership — fixing one piece breaks another. Less-coupled issues (chronic open #2/#4/#28/#29/#37/#41, or #90 workaround removal) would yield faster forward progress.
3. **OR a deeper named-decomp research pass** focused specifically on `CEnvCell::find_env_collisions``BSPTREE::find_collisions` → indoor CP-setting chain. This path was never fully traced; the first two research passes worked on the outdoor (`CLandCell`) path. The indoor path is where the cellar lives.
**Replay tests at [`tests/AcDream.Core.Tests/Physics/Issue98CellarUpReplayTests.cs`](tests/AcDream.Core.Tests/Physics/Issue98CellarUpReplayTests.cs)** document the failing-frame geometry and will be the regression oracle when a real fix lands. They do not currently simulate trajectory.
**2026-05-23 PM extension — trajectory replay harness shipped, blocked on a SECOND bug:**
Commits `4c9290c``5c6bdbe` ship a deterministic N-tick trajectory replay at [`tests/AcDream.Core.Tests/Physics/CellarUpTrajectoryReplayTests.cs`](tests/AcDream.Core.Tests/Physics/CellarUpTrajectoryReplayTests.cs). 200-tick runs complete in <100 ms. 5 tests pass.
- **Finding:** the cellar ramp polygon is NOT in `cellStruct.PhysicsPolygons`. It lives in a separate GfxObj (a static building piece, registered as a ShadowEntry on the landblock). `CellDumpSerializer` correctly captures cell polygons; the ramp comes from a different data source entirely. The harness reconstructs the ramp polygon programmatically from the live capture's polydump data via `RegisterStairRampGfxObj`.
- **Finding:** `CellDumpSerializer.Hydrate` sets `BSP=null` per its xmldoc — so the indoor BSP collision path is skipped for hydrated fixtures. Harness wraps cells with a synthetic one-leaf BSP via `AttachSyntheticBsp` to fire the indoor path.
- **Finding:** `PhysicsBody` seeding requires BOTH `ContactPlane*` AND `WalkablePolygon*` fields. The engine at `PhysicsEngine.cs:665-673` only calls `SpherePath.SetWalkable(...)` if `body.WalkablePolygonValid && body.WalkableVertices.Length >= 3`. Without this the engine treats the sphere as "grounded but anchorless" — a contradictory state.
**NEW BLOCKER (open finding):** Even with the full apparatus (CP + WalkablePolygon seeded body, synthetic BSP, synthetic stair GfxObj registered, stub landblock), the sphere goes airborne at tick 1 with `hit=(0,1,0)` — a +Y wall normal matching no registered geometry. The hit is set by `ValidateTransition` between the `after-insert` and `after-validate` probe sites, but the inner `TransitionalInsert` call sets `ci.CollisionNormal=(0,1,0)` before ValidateTransition runs. 12 different `SetCollisionNormal` call sites in `TransitionTypes.cs` — root cause not yet isolated.
6 hypotheses tested via the harness, all failed to isolate root cause: WalkablePolygon seeding, initial Z lift (0 vs 0.05m), stair GfxObj presence, stub landblock terrain, cell BSP null vs synthetic, body=null vs seeded. Per systematic-debugging skill's "3+ failures = question architecture" rule, stop speculation; next session needs a side-by-side comparison harness against live `PlayerMovementController` state.
**Pickup document:** [`docs/research/2026-05-23-a6-p3-issue98-harness-handoff.md`](docs/research/2026-05-23-a6-p3-issue98-harness-handoff.md) is the canonical resume artifact — has the chronological commit list, apparatus inventory, exclusion list, and three concrete next-session options ranked by recommendation.
---
**Status:** DONE
@ -835,7 +1294,7 @@ or +small fix if different. Not blocking M1.
## #71 — WorldPicker Stage B — polygon refine for retail-accurate clicks
**Status:** OPEN
**Severity:** LOW (Stage A — screen-rect picker — is sufficient for M1)
**Severity:** MEDIUM (Stage A now causes real play mis-picks through open doors/windows)
**Filed:** 2026-05-16
**Component:** selection / picker
@ -855,6 +1314,15 @@ to the visible mesh — under-pick what looks like empty space inside
the rect, catch visible mesh that pokes past the sphere boundary
(creature outstretched arm, sign edge).
**New evidence (2026-05-28 / Phase A8 visual gate):** User stood outside
a Holtburg building, saw a vendor through an open doorway/window, clicked
the visible vendor, and acdream selected the door instead:
`[B.4b] pick guid=0x7A9B4015 name=Door`. This is exactly the Stage A
failure mode: the open door's projected `Setup.SelectionSphere` rect is
closer than the vendor's rect, even though the visible door polygon is not
under the cursor. The fix is polygon refinement against visible GfxObj
triangles plus current animated part transforms; do not special-case doors.
**Acceptance:** Pipe per-part GfxObj visual polygons through a
`PickPolygonProvider` interface (don't duplicate mesh decoding —
hook the existing `ObjectMeshManager` cached data). Two-tier in
@ -865,8 +1333,8 @@ frame edges.
**Estimated scope:** Medium (~4-6 hours). Defer until visual
verification surfaces a Stage A miss in real play. The user
confirmed 2026-05-16 that "I can click on longer ranges now so
good" — Stage A is enough for M1's "click an NPC" demo.
confirmed 2026-05-28 that the door/vendor case is now observable in real
play, so this should be scheduled soon after A8 rather than left as polish.
---
@ -3101,6 +3569,73 @@ Unverified. The likely culprits, ranked by suspected probability:
# Recently closed
## #100 — [DONE 2026-05-25 · f48c74aa + a64e6f2] Transparent rectangular patches around every house (terrain rendering)
**Status:** DONE
**Closed:** 2026-05-25
**Commits:** `f48c74aa`, `a64e6f2`
**Component:** rendering, terrain
**Resolution (2026-05-25 · #100):** Replaced the cell-level
`hiddenTerrainCells` mechanism with retail's per-vertex Z nudge
(`zFightTerrainAdjust = 0.00999999978`) applied inside the modern
terrain vertex shader. Render terrain everywhere; coplanar building
floors win the depth test by being 1 cm higher than the rendered
terrain. Physics path untouched. ~50 LOC of `BuildingTerrainCells`
plumbing removed across LandblockMesh / LoadedLandblock /
LandblockLoader / GameWindow / GpuWorldState / LandblockStreamer
plus the corresponding unit test. Retail anchors:
acclient_2013_pseudo_c.txt:1120769 + :702254.
**Description:** Standing outside any Holtburg house, the ground in a
rectangular footprint around the building appears as a flat dark patch
instead of cobblestone / grass terrain. Visible as a sharp-edged
rectangle the size of the house's outdoor footprint. Same shape on
every house observed.
User report 2026-05-24 (with screenshot): "around every house now I
missing the ground texture, it is transparent. I can see through the
ground."
**Root cause:** Bisect 2026-05-24 — commit `35b37df` is the introducer. It
added a `hiddenTerrainCells` parameter to `LandblockMesh.Build` that collapses
terrain triangles owned by buildings to zero-area degenerates. The hide
mechanism works at outdoor-cell granularity (24 m × 24 m cells), so the entire
cell terrain was hidden but the cottage geometry only covers a smaller area inside
it — leaving a dark transparent rectangle. The fix renders terrain everywhere and
uses retail's Z nudge to ensure building floors win the depth test.
---
## #101 — [DONE 2026-05-25 · 5240d65 + 6ca872f] Stair-step cylinder phantom blocks player on multi-part EnvCell entity
**Closed:** 2026-05-25
**Commits:** `f6305b1` — feat(physics): #101 — add IsPhantomGfxObjSource predicate; `5240d65` — fix(physics): #101 — suppress mesh-aabb-fallback for phantom GfxObj stabs; `6ca872f` — docs(test): #101 — sync stale GameWindow.cs line ref in test class doc
**Component:** physics, dat-handling
**Resolution.** `PhysicsDataCache.IsPhantomGfxObjSource(gfxObjId)` predicate returns `true` when
the entity's `SourceGfxObjOrSetupId` has the GfxObj high byte (`0x01`) AND no cached
`GfxObjPhysics` entry exists (or its `BSP.Root` is null) — i.e., the underlying GfxObj had
`HasPhysics=False` so `PhysicsDataCache.CacheGfxObj` short-circuited. The inline
mesh-AABB-fallback gate at `GameWindow.cs:6127` checks this predicate and skips the shadow-shape
registration entirely when the source is a phantom. The 10 phantom stair cyls from
`GfxObj 0x0100081A` (`hasPhys=False`) that previously blocked the player at the foot of the
Holtburg upper-floor staircase are no longer registered. Collision falls through to entity
`0x40B50089` (GfxObj `0x01000C16`, `hasPhys=True` BSP with walkable inclined polygon at
`Normal.Z=0.717`, world ramp from (111.10, 25.50, 94.00)→(107.50, 27.10, 97.50)). 3 unit tests
in `PhysicsDataCachePhantomSourceTests.IsPhantomGfxObjSource_*` (no BSP → true; has BSP →
false; non-GfxObj high byte → false) shipped alongside the predicate.
**Investigation:** [`docs/research/2026-05-25-a6-stairs-cyl-retail-investigation.md`](research/2026-05-25-a6-stairs-cyl-retail-investigation.md).
**Plan:** [`docs/superpowers/plans/2026-05-25-issue-101-stairs-cyl-phantom.md`](superpowers/plans/2026-05-25-issue-101-stairs-cyl-phantom.md).
**Verification.** Visual-verified at Holtburg upper-floor cottage stairs 2026-05-25 — `[cyl-test]`
count on `obj=0x40B500*` post-fix = 0 (was 7101 pre-fix); `src=0x0100081A` mesh-aabb-fallback
count = 0 (was 28 pre-fix). Player climbed Z=94→97.5 holding W continuously over the full 45°
ramp — no phantom diagonal slides.
---
## #86 — [DONE 2026-05-19 · 3764867 + 4e308d5] Click selection penetrates walls
**Closed:** 2026-05-19

View file

@ -113,12 +113,47 @@ with no code changes lost — M1.5 doesn't touch WB-extracted territory.
### Milestone M1.5 — "Indoor world feels right" (ACTIVE — Phase O shipped; resuming from 2026-05-20 baseline)
The current top of the work order. Two phases (A6 + A7) inside one
milestone. M2 ("kill a drudge") is deferred until M1.5 lands —
drudges live in dungeons and the M2 demo target requires solid indoor
The current top of the work order. M2 ("kill a drudge") is deferred until M1.5
lands — drudges live in dungeons and the M2 demo target requires solid indoor
navigation. Full milestone block in
[`docs/plans/2026-05-12-milestones.md`](2026-05-12-milestones.md).
**2026-05-30 — render-pipeline pivot.** Indoor *rendering* (the seamless in/out
seam: the flap, missing/transparent walls, terrain bleed) is NO LONGER pursued via
the WB-inherited two-pipe (inside/outside) split. That whole approach (Phase A8/A8.F,
issue #103) is **abandoned**. Indoor rendering is now **Phase U** below. Phase A6
(physics) and A7 (lighting) inside M1.5 are unaffected.
#### Phase U — Unified retail-faithful render pipeline (NEW — supersedes A8/A8.F)
**Decision (2026-05-30):** replace the two render paths (outdoor `Draw` +
`RenderInsideOut` stencil, toggled on `cameraInsideBuilding`) with ONE pipeline driven
by retail's portal-visibility view (`PView::ConstructView` / `ClipPortals` / `GetClip`;
`CEnvCell::find_visible_child_cell`). The camera's cell is just the root of a recursive
per-portal clip-region traversal; all visible cells (indoor + outdoor) draw in one pass.
Seamless in/out **by construction** — no inside/outside branch. Modern code, retail
behavior.
- **Why:** the two-pipe split is a WB-editor inheritance, not a game-client design; you
cannot make two pipes hand off seamlessly at a doorway. Retail never splits. The A8.F
attempt to graft retail recursion onto the WB stencil failed its visual gate (#103).
- **Keep:** WB mesh/dat pipeline (ObjectMeshManager/WbDrawDispatcher/terrain), the
2026-05-30 camera-collision + physics work. **Salvage (verify):** the A8.F CPU
clip-builder (PortalProjection/ScreenPolygonClip/ViewPolygon/PortalVisibilityBuilder —
unit-test-correct). **Task 1:** delete the dead two-pipe code (RenderInsideOutAcdream,
the cameraInsideBuilding branch, IndoorCellStencilPipeline, the ACDREAM_A8_INDOOR_BRANCH
kill-switch) — audit first; some A8 commits fixed real bugs (BuildingId stamping, pool
aliasing) that stay.
- **Scope + next-session pickup:**
[`docs/research/2026-05-30-unified-render-pipeline-decision-and-handoff.md`](../research/2026-05-30-unified-render-pipeline-decision-and-handoff.md).
Start with `superpowers:brainstorming`; visual verification at Holtburg
cottage/cellar/inn + a portal dungeon is the acceptance gate (unit tests did not
catch #103).
- **Camera-collision (shipped 2026-05-30, kept):** retail `SmartBox::update_viewer`
swept-sphere spring arm (`CameraDiagnostics.CollideCamera`, `PhysicsCameraCollisionProbe`,
`RetailChaseCamera` integration) + viewer/sight bypass of the 30-step transition cap.
Specs: [`2026-05-29-a8f-camera-collision-design.md`](../superpowers/specs/2026-05-29-a8f-camera-collision-design.md).
**Today's pre-M1.5 baseline** (2026-05-20 — committed in this
session): A4 multi-cell BSP iteration (`691493e`), #89 sphere-overlap
in CheckBuildingTransit (`7ac8f54`), #90 sphere-overlap stickiness in
@ -142,27 +177,95 @@ successfully 2026-04-30 for the steep-roof case. Matching binaries
(acclient.exe v11.4186) + PDB present.
**Sub-pieces (slices):**
- **A6.P1 — cdb probe spike** (~3 days). Build cdb scripts capturing
retail's per-tick state at 9 scenarios:
- 4 building sites: Holtburg inn doorway, inn stairs, inn 2nd floor,
cottage cellar.
- 5 dungeon sites: Holtburg Sewer entry portal, first stair descent,
inter-room portal transition, open central area, dark corridor.
Breakpoints on `set_collide`, `step_sphere_up`, `step_sphere_down`,
`transitional_insert`, `set_contact_plane`, `validate_walkable`.
Mirror with our equivalent probes (`[indoor-bsp]`, `[cp-write]`,
new `[push-back]`).
- **A6.P2 — Analysis report** (~1 day). Quantify the per-call-site
gap. Identify which BSP path(s) over- or under-correct. Output:
13 specific bug findings with retail decomp anchors.
- **A6.P3 — Fix the BSP correction paths** (~35 days). Surgical
fixes informed by A6.P2 data. Likely touches `BSPQuery.AdjustSphereToPlane`,
`AdjustOffsetToPlane`, Path 5 / Path 6 branches, sub-step state
mutation.
- **A6.P4 — Remove workarounds** (~1 day). Revert #90 sphere-overlap
stickiness in `PhysicsEngine.ResolveCellId`. Delete
`Transition.TryFindIndoorWalkablePlane` + its caller in
`FindEnvCollisions`. Verify behavior holds without the workarounds.
- **✓ SHIPPED — A6.P1 — cdb probe spike** (2026-05-21). Built cdb
scripts (`tools/cdb/a6-probe.cdb` v4 with PDB-verified offsets +
hex-bits float output + Python decoder), PowerShell runner with
ASCII encoding, PDB-match verification, and the new
`[push-back]`/`[push-back-disp]`/`[push-back-cell]` acdream probe
family (env `ACDREAM_PROBE_PUSH_BACK=1`). Captured 5 of 9 scenarios
with paired retail+acdream traces (scen1 inn doorway, scen2 inn
stairs, scen3 inn 2nd floor, scen4 cottage cellar, scen5 town
network portal as substitute for Holtburg Sewer entry). Scen6-9
cancelled — Holtburg Sewer doesn't exist on this server, and any
substitute dungeon hits issue #95 (portal-graph visibility blowup)
on portal entry, making physics-only analysis impossible. Five
captures are sufficient evidence for A6.P2. Commits: infra Tasks
1-14 + cdb iterations + scen1 capture (prior session); scen2-5
captures (`a9a427f`, `297d1c5`, `4b5aebc`, `46c6e08`, `35d5c58`)
+ issue #95 filing (`5be784e`) (this session).
- **✓ SHIPPED — A6.P2 — Analysis report** (2026-05-21, `184933d`).
Output: [`docs/research/2026-05-21-a6-cdb-capture-findings.md`](../research/2026-05-21-a6-cdb-capture-findings.md).
Four findings ready for A6.P3: Finding 2 (ContactPlane resynthesis
blowup — 250× to ∞× more CP writes in acdream; primary M1.5 root
cause) is HIGH severity and the highest-confidence single-cause
fix candidate. Finding 1 (dispatcher entry frequency mismatch —
4× to 281× fewer dispatcher entries in acdream) is likely a
secondary effect of Finding 2's missing retention paths. Finding 3
(indoor cell-resolver sling-out captured in scen4) — HIGH severity,
separate fix surface in ResolveCellId/CheckBuildingTransit.
Finding 4 (portal-graph visibility blowup discovered incidentally
in scen5) — filed as issue #95, scope-adjacent, handled outside A6.
Tables 1+2 (per-site push-back delta + path-frequency diff)
deferred to optional A6.P1.5 (entry+exit BPs in cdb script);
not blocking A6.P3. M1.5 symptom coverage matrix shows every
in-scope physics symptom mapped to at least one finding.
- **A6.P3 — Fix the BSP correction paths** (~35 days). Multi-slice.
- **✓ SHIPPED — A6.P3 slice 1 — Indoor ContactPlane retention**
(2026-05-21, commits `ba9655f` plan + `6b4be7f`/`c6bc2b9` T1
research + `869edd9` T2 counter + `36975ef`/`a32f569` T3 test +
`5aba071` T4 Mechanism B + `5f7722a`/`39fc037`/`bd5fe2e` T5 strip
+ `066568a` scen2 postfix proof + `<this commit>` T8 bookkeeping).
Stripped `TryFindIndoorWalkablePlane` synthesis path from
`Transition.FindEnvCollisions` indoor branch (matches retail's
tiny `CEnvCell::find_env_collisions` shape at acclient_2013_pseudo_c.txt:309573).
Added Mechanism B (LKCP restore) in `Transition.ValidateTransition`
matching retail's pattern at acclient_2013_pseudo_c.txt:272565-272583.
Per-unit-of-activity CP-write rate dropped 63×. **Unexpected win:
stairs + cellar descent now WORK in acdream** (user happy-test
confirmed). A6.P2 Finding 1 (dispatcher entry frequency mismatch)
CLOSED as side-effect (dispatcher shape now retail-like). Finding 2
PARTIALLY CLOSED — 99.3% of remaining cp-writes come from L622
per-tick body-CP seed at `PhysicsEngine.ResolveWithTransition:622`
(filed as issue #96 for slice 2).
- **✓ SHIPPED — A6.P3 slice 2 — L622 seed investigation + no-op guard**
(2026-05-22, commits `892019b` v1 + `f8d669b` v2). v1 removed the
L622 seed entirely; broke BSP step_up at the last step of stairs
(user happy-test surfaced the regression). v2 reverted v1 + added
a no-op-if-unchanged guard inside `CollisionInfo.SetContactPlane`.
**#96 PARTIALLY ADDRESSED — accepted as documented retail
divergence.** The seed is load-bearing for `AdjustOffset`
slope-projection on sub-step 1 which BSP step_up depends on.
Matching retail would require deeper refactor (e.g. AdjustOffset
fallback to body.ContactPlane). Guard is benign improvement;
further #96 closure deferred.
- **✓ SHIPPED — A6.P3 slice 3 — cell-resolver stickiness**
(2026-05-22, commits `8898166` v1 + `3e140cf` v2). Added
point-in stickiness check at top of `ResolveCellId`'s indoor
branch. Cell-resolver ping-pong FULLY CLOSED (data: scen4 cellar
capture shows 1 cell-transit vs 20+ pre-fix). **Outcomes:**
Finding 3 (cell-resolver instability) closed. #90 workaround
redundant (deferred A6.P4 removal). #97 phantom collisions
hypothesis pending re-test (likely closed too). #98 cellar-up
symptom PERSISTS but with NEW diagnosis (re-filed in #98 as BSP
step-physics at cellar stair top — sloped step-face mis-handling,
NOT cell-resolver).
- **A6.P3 slice 4 (or A6.P4)? — BSP step-physics at cellar
stair top (#98 new diagnosis)** (NEXT or DEFERRED). Investigate
why step-down probe consumes all walk-interp at cellar stair top.
Evidence: scen4 cottage_cellar_slice3v2 push-back trace. May
require reading BSP step_up + step_down decomp + comparing to
cellar stair geometry. Could be its own slice or merged into a
broader A6.P4 cleanup phase.
- Issue #95 (visibility blowup) NOT in A6.P3 scope — separate work
surface.
- **A6.P4 — Remove workarounds + visual verification** (~1 day after
P3). Revert #90 sphere-overlap stickiness in
`PhysicsEngine.ResolveCellId`. Delete `Transition.TryFindIndoorWalkablePlane`
+ its caller in `FindEnvCollisions`. Visual verification at Holtburg
inn + cellar + (if #95 is also fixed by then) a dungeon. The
original A6.P4 plan named "Holtburg Sewer end-to-end" as the
acceptance walk; since the sewer doesn't exist, the M1.5 demo
scenario needs an alternative (see milestones doc).
#### Phase A7 — Indoor lighting fidelity (RenderDoc + retail-decomp driven)

View file

@ -187,6 +187,17 @@ close range and the player sees "You pick up the X." in chat.
### M1.5 — "Indoor world feels right" — 🔵 ACTIVE (resumed 2026-05-21 after Phase O ship)
**2026-05-30 — render-pipeline pivot.** The indoor *rendering* seam (seamless
in/out: the flap, missing/transparent walls, terrain bleed) will be solved by a
**single unified retail-faithful render pipeline (Phase U)**, replacing the
abandoned two-pipe inside/outside split (A8/A8.F, issue #103). The two-pipe split
is a WorldBuilder inheritance; retail uses one portal-visibility pass and is
seamless by construction. Decision + scope:
[`docs/research/2026-05-30-unified-render-pipeline-decision-and-handoff.md`](../research/2026-05-30-unified-render-pipeline-decision-and-handoff.md).
Camera-collision + a physics viewer-cap fix shipped 2026-05-30 and are kept (they
were a detour from the real seam fix, but retail-faithful and worth keeping). A6
(physics) and A7 (lighting) are unaffected.
**Phase O — DatPath Unification — shipped 2026-05-21.** ONE thing
touches the DATs. ~33 WB files (~7.7K LOC) extracted into
`src/AcDream.{Core,App}/Rendering/Wb/`; project references to
@ -198,18 +209,30 @@ baseline with no code changes lost — Phase O did not touch dat-loading
infrastructure for physics or collision, only the rendering pipeline.
M1.5's planned phases (A6 + A7) are unaffected.
**Demo scenario:** Enter the Holtburg Sewer dungeon through the in-town entry
portal. Navigate to the end (57 rooms with stairs + a multi-Z chamber).
Exit back to town. Throughout the walk:
**Demo scenario (updated 2026-05-21):** The original demo target was
"enter the Holtburg Sewer dungeon" but that location doesn't exist on
this ACE server (discovered during A6.P1 capture session). Additionally,
A6.P1 surfaced **issue #95** (portal-graph visibility blowup) which makes
ANY dungeon visually unusable on entry. Demo scenario revised in two
parts:
**Building/cellar demo (achievable after A6.P3 lands):**
Walk into the Holtburg inn, climb to the 2nd floor, walk around without
sling-out or wall-clip. Enter a cottage cellar, descend without falling
through. Throughout:
- Walls block — no walk-through anywhere, indoor or stab-shell.
- Stairs work — ascend + descend without falling through or stuck-in-falling.
- Items block — sarcophagi, urns, decorations, tables, chests, fireplaces.
- Lighting reads correctly — torchlit rooms are bright, dark corridors are
dark, no spotlights projecting onto walls from held items, no upper-floor
dimming bug, static decorations participate in the day-cycle (outside) and
in per-cell environment lighting (inside).
- Items block — furniture, decorations.
- Cell transitions are smooth — no CellId ping-pong, no flicker.
- Lighting reads correctly — torchlit rooms are bright, no spotlight
artifacts, static decorations participate in env lighting.
**Dungeon demo (blocked on issue #95 fix; promote to post-M1.5 if
the visibility bug isn't addressed in M1.5 scope):**
Enter any dungeon via portal (substitute for "Holtburg Sewer"). Navigate
~3-5 rooms without rendering corruption (no see-through-walls, no
other-dungeons-rendered-inside). Walls block, stairs work, items block,
lighting correct, cell transitions smooth.
**Why this is its own milestone:** M1 landed walkable + clickable as a
specification (the doorways open, NPCs select, items pick up — all visible
@ -238,14 +261,16 @@ patch.
- **#80** — Camera on 2nd floor goes very dark
- **#81** — Static building stabs don't react to atmospheric lighting
- **#83** — Indoor multi-Z walking broken (cellars, 2nd floors, intermittent falling-stuck)
- **#88** — Indoor static objects vibrate (suspected sub-step state corruption)
- **#88** — Indoor static objects vibrate (suspected sub-step state corruption — A6.P2 maps to Finding 2 family)
- **#90** — CellId ping-pong (workaround in place; remove during A6.P4)
- **#95** — Portal-graph visibility blowup (filed 2026-05-21; **blocks the dungeon half of the M1.5 demo** but is NOT in A6 scope; either add a dedicated phase inside M1.5 to fix it OR promote the dungeon demo to post-M1.5)
- **L-indoor** — Lighting indoors broken (file as new # during M1.5 kickoff)
- **L-spotlight** — Items projecting spotlight on walls (file as new # during M1.5 kickoff)
- **Stairs walk-through** — file as new # during M1.5 kickoff
- **2nd-floor walking** — file as new # during M1.5 kickoff
- **Cellar descent** — file as new # during M1.5 kickoff
- **`TryFindIndoorWalkablePlane`** — synthesis workaround removal (Bug A's original goal, finally unblocked)
- **Stairs walk-through** — captured + characterized by A6.P2 (Finding 2 family); fix in A6.P3
- **2nd-floor walking** — captured + characterized by A6.P2 (Finding 2 — scen3 shows infinite CP-write ratio on flat 2nd-floor walk); fix in A6.P3
- **Cellar descent** — same physics family as stairs; fix in A6.P3
- **Indoor sling-out** (new symptom from A6.P1 scen4) — captured + mapped to A6.P2 Finding 3 (cell-resolver in ResolveCellId / CheckBuildingTransit); fix in A6.P3
- **`TryFindIndoorWalkablePlane`** — synthesis workaround removal (Bug A's original goal, finally unblocked; A6.P4)
**Frozen phases during M1.5:** all M0 + M1 phases stay frozen. Plus
specifically the recently-shipped A4 + #89 + #91 + #92 (today's work) — those

File diff suppressed because it is too large Load diff

File diff suppressed because it is too large Load diff

File diff suppressed because it is too large Load diff

File diff suppressed because it is too large Load diff

File diff suppressed because it is too large Load diff

File diff suppressed because it is too large Load diff

File diff suppressed because it is too large Load diff

File diff suppressed because it is too large Load diff

File diff suppressed because it is too large Load diff

File diff suppressed because it is too large Load diff

File diff suppressed because it is too large Load diff

File diff suppressed because it is too large Load diff

File diff suppressed because it is too large Load diff

File diff suppressed because it is too large Load diff

File diff suppressed because it is too large Load diff

File diff suppressed because it is too large Load diff

File diff suppressed because it is too large Load diff

File diff suppressed because it is too large Load diff

File diff suppressed because it is too large Load diff

File diff suppressed because it is too large Load diff

File diff suppressed because it is too large Load diff

File diff suppressed because it is too large Load diff

File diff suppressed because it is too large Load diff

File diff suppressed because it is too large Load diff

View file

@ -0,0 +1,356 @@
# A6.P2 cdb capture findings — 2026-05-21
**Status:** SHIPPED — 5 of 9 scenarios captured; scen6-9 cancelled (see
"Capture inventory" below). Findings 1-4 ready for A6.P3 fix surfacing.
**Spec:** [`docs/superpowers/specs/2026-05-21-phase-a6-indoor-physics-fidelity-design.md`](../superpowers/specs/2026-05-21-phase-a6-indoor-physics-fidelity-design.md).
**PDB match verification:** [`pdb-match-verification.txt`](2026-05-21-a6-captures/pdb-match-verification.txt).
**Prior handoffs:**
- [A6.P1 partial-ship handoff](2026-05-21-a6-p1-partial-ship-handoff.md) (this session continues from there)
## TL;DR — what the 5 captures prove
1. **Finding 2 (ContactPlane resynthesis blowup) is overwhelmingly confirmed.**
Across all 5 scenarios, acdream writes ContactPlane fields **250× to ∞×**
more often than retail. The infinite ratio is scen3 (flat 2nd-floor walk):
retail's `set_contact_plane` fires **zero times** while acdream writes
86,748 field updates. This is the M1.5 root cause for "indoor walking
feels broken." A6.P3 fix surface: stop resynthesizing CP every frame.
2. **Finding 1 (dispatcher entry frequency mismatch) extends to all scenarios**
but is shape-divergent. Retail calls `BSPTREE::find_collisions` 4× to
281× more often than acdream's `BSPQuery.FindCollisions`. Largest gap on
scen5 (Town Network walk: retail 9,552 vs acdream 34 — 281× fewer in
acdream). Suggests retail's `transitional_insert` calls dispatcher
per-sub-step regardless of expected collisions; acdream's modern path
is lazier.
3. **Finding 3 (cell-resolver indoor sling-out) directly captured on scen4.**
+Acdream walked a few meters inside a Holtburg cottage cellar and the
resolver flung the character across a landblock boundary (`0xA9B40148 →
0xA9B40029 → 0xA9B30030`, all `reason=resolver`). Indoor BSP was barely
queried (only 2 `[indoor-bsp]` hits during the sling-out); `[check-bldg]`
fired 5,495 times trying to re-resolve which building +Acdream was in.
This is a distinct failure family from the stair-attempt pattern (scen2)
and the CP-write blowup.
4. **Finding 4 (portal-graph visibility blowup, scope-adjacent).** Discovered
incidentally during scen5: after portal teleport to Town Network hub,
`visibleCells` per cell exploded from ~4 to 135-145 and cells from
disconnected landblocks (0x0007, 0x020A, 0x0408) were cached. Filed
separately as **issue #95** — this is the underlying cause of
user-observed "dungeons are broken (see through walls / other dungeons
rendering)" across the project.
## Capture inventory
| # | Tag | Walk script | Retail | Acdream | Status |
|---|---|---|---|---|---|
| 1 | scen1_inn_doorway | Walk through inn door, stop just inside | ✅ | ✅ | committed (prior session) |
| 2 | scen2_inn_stairs | Walk up 4 steps; acdream re-captured as stair-FAILURE | ✅ | ✅ | committed (this session) |
| 3 | scen3_inn_2nd_floor | Forward 3m, sidestep 1m, walk back (teleport into acdream) | ✅ | ✅ | committed (this session) |
| 4 | scen4_cottage_cellar | Retail ascent + acdream teleport-in + sling-out | ✅ | ✅ | committed (this session) |
| 5 | scen5_sewer_entry | Town Network portal entry (Holtburg sewer doesn't exist) | ✅ | ✅ | committed (this session) |
| 6 | scen6_sewer_first_stair | — | ❌ | ❌ | **cancelled** — Holtburg Sewer doesn't exist on this server |
| 7 | scen7_sewer_inter_room | — | ❌ | ❌ | **cancelled** — same |
| 8 | scen8_sewer_chamber | — | ❌ | ❌ | **cancelled** — same |
| 9 | scen9_sewer_corridor | — | ❌ | ❌ | **cancelled** — same; any substitute dungeon hits issue #95 (visibility blowup) on portal entry, making physics-only analysis impossible |
**Why scen6-9 cancelled:** The A6.P1 design spec assumed the Holtburg Sewer existed and was accessed via portal. Neither is true on this ACE server. Any substitute dungeon (the path we'd normally take) hits the portal-graph visibility bug (issue #95) immediately on portal entry, making the dungeon visually unusable for navigation. A dedicated A6.P1-redux capturing post-#95-fix dungeon physics is a candidate for A8 or M1.5-residual scope; for A6.P2 the 5 captured scenarios provide sufficient evidence.
## Analysis tables
### Table 1 — Per-site push-back delta
**DEFERRED.** The current cdb probe (v4) captures BP5 (`adjust_sphere_to_plane`) at function **entry only**. Computing per-call delta `‖output_center input_center‖` requires a paired epilogue breakpoint that captures the corrected sphere center after the function returns. This was scoped out of A6.P1 to keep the cdb script simple; adding it is a future A6.P1.5 (estimated 1 hour: add `bu` exit breakpoints to v5 of `a6-probe.cdb`).
**What we have instead:** BP5 entry-count is a proxy for "how often does adjust_sphere fire." From the BP5 counts in Table 3 (column 1), acdream's BP5 call rate is divergent from retail's, but without paired entry/exit values we can't quantify the over-correction directly.
**Bug-candidate threshold flagged in the spec (acdream > 3× retail on delta):** unmeasurable from current data; defer to A6.P1.5 or accept Findings 1-3 as sufficient triggers for A6.P3 fixes.
### Table 2 — Path-frequency diff
**DEFERRED.** Same reason as Table 1 — the cdb probe captures `BSPTREE::find_collisions` entry only, not which of the 7 exit paths (`PLACEMENT_INSERT`, `check_walkable`, `step_down`, `collide_with_pt`, `set_collide+slid`, `step_sphere_up`, `find_walkable`) was taken. Adding exit-discriminating breakpoints requires breakpoint after each return site in `find_collisions` — non-trivial cdb scripting work, deferred to A6.P1.5.
**What we have instead:** total dispatcher entries per scenario (Table 3 column 1). Acdream's overall dispatcher call rate is wildly lower than retail's in every scenario — see Finding 1 below.
### Table 3 — ContactPlane lifecycle diff (the smoking gun)
Walk duration was variable per scenario; values below are raw counts over the full capture window. Ratios are comparable across rows because both clients walked similar-duration scenarios per pair.
| Scenario | Retail BP4 dispatcher | Acdream push-back-disp | Acdream/Retail dispatcher ratio | Retail BP7 set_contact_plane | Acdream cp-write | **CP-write ratio (acd/retail)** |
|---|---:|---:|---:|---:|---:|---:|
| 1 inn doorway | 9,289 | 295 | 0.032× (31× fewer) | 18 | 73,304 | **4,072×** |
| 2 inn stairs (acdream: stair-fail) | 47,783 | 4,156 | 0.087× (11× fewer) | 136 | 33,969 | **250×** |
| 3 inn 2nd floor (acdream teleport) | 10,636 | 2,752 | 0.259× (4× fewer) | **0** | 86,748 | **∞** |
| 4 cottage cellar (acdream sling-out) | 12,596 | 82 | 0.007× (154× fewer) | 3 | 35,624 | **11,875×** |
| 5 town network portal | 9,552 | 34 | 0.004× (281× fewer) | 65 | 20,956 | **322×** |
**Geometric mean of CP-write ratio across the 4 finite scenarios (excluding scen3 ∞):** ~1,470×. **Median (excluding scen3):** ~2,200×.
**Verdict:** acdream writes the ContactPlane on order of 1,000× more frequently than retail. The only scenario where ratios are "small" (250×) is scen2's stair-attempt, where acdream's CP-write count is actually LOWER than the other scenarios because the failing physics couldn't synthesize a valid CP — see Finding 2 inversion.
### Table 4 — Sub-step state mutations
**PARTIAL.** Per-field mutation counts require shadow-state diffing across sub-steps, which the v4 probe doesn't emit. What we CAN report is per-tag firing rates that approximate state-mutation pressure:
| Scenario | Retail BP2 step_up | Acdream indoor-bsp | Acdream indoor-walkable | Acdream cell-cache | Acdream check-bldg |
|---|---:|---:|---:|---:|---:|
| 1 inn doorway | (0) | 26 | 18 | 540 | 9,530 |
| 2 inn stairs (fail) | 188 | 1,286 | 859 | 527 | 81 |
| 3 inn 2nd floor | (0) | 1,061 | 707 | 527 | 740 |
| 4 cottage cellar (sling) | 13 | 2 | 2 | 540 | **5,495** |
| 5 town network | 1 | 2 | 2 | 9,642 | 740 |
Notable patterns:
- **Acdream check-bldg fires 1-2 orders of magnitude more than push-back-disp** in scen1 (9,530 vs 295), scen4 (5,495 vs 82), and scen5 (740 vs 34). The `CheckBuildingTransit` machinery is constantly re-resolving "which building is the player in" even when the BSP itself isn't being queried. This is a state-thrash separate from the CP-write blowup.
- **Acdream indoor-bsp and indoor-walkable scale together** in the stair-attempt scenarios (scen2: 1,286/859; scen3: 1,061/707) but stay near zero on outdoor/portal walks (scen4/5: ~2 each). Suggests indoor BSP is gated by something that doesn't fire during normal Holtburg walking but DOES fire during stair attempts.
- **Cell-cache scales with how much landblock streaming happened**: 540 on standstill scenarios, 9,642 on scen5 where the player walked across Holtburg to reach the network portal.
## Per-scenario narrative
### Scenario 1 — Inn doorway (prior session)
User walked through the Holtburg inn front door, stopped just inside. Standard short walk over a threshold.
Retail: 18 set_contact_plane calls (~one per second of walking).
Acdream: 73,304 cp-write events. **Ratio: 4,072×.**
Per-call shape match (BP5 hit#1, vertical step-down probe against ground):
- Plane: (0, 0, 1), d≈0 — identical.
- Sphere radius: 0.48 — identical.
- WalkInterp: 1.0 — identical.
- Sphere.center.z and Movement.z DIFFER between retail and acdream (retail: -0.27 / -0.75; acdream: +0.46 / -0.50). Could be local-space convention difference (retail's `localspace_pos` vs our per-cell transform) OR could be the BSP correction-path divergence the spec hypothesizes. A6.P3 work surface.
### Scenario 2 — Inn stairs (acdream re-captured as stair-FAILURE)
Retail walked successfully up 4 inn stair steps. Acdream re-captured AFTER an initial mislabeled door-walk: user attempted to climb the inn stairs, character failed (couldn't ascend).
**Retail signature:** BP2 step_up=188 — clean stair-climb signature (scen1 doorway had only 1 BP2 hit). BP6 check_walkable=677 (with threshold=FloorZ 0.6642, confirmed by hex decoder).
**Acdream failure signature (stair-attempt vs door-walk):**
| Tag | door-walk | stair-attempt | Ratio |
|---|---:|---:|---:|
| push-back-disp | 1,141 | 4,156 | 3.6× |
| push-back-cell | 87 | 1,478 | **17×** |
| other-cells | 87 | 1,478 | **17×** |
| indoor-bsp | 343 | 1,286 | 3.7× |
| indoor-walkable | 227 | 859 | 3.8× |
| cp-write | 70,244 | 33,969 | 0.5× (inverse!) |
The 17× explosion on push-back-cell / other-cells is the failure: when the indoor BSP query can't resolve a stair-step, the multi-cell fallback fires constantly. The cp-write DROP (half the door-walk volume) is the inverse signal: when no ground plane resolves, no CP gets written. Both are A6.P3 fix-surface indicators.
### Scenario 3 — Inn 2nd floor (acdream via @teleport)
Flat-floor walk: forward 3 m, sidestep 1 m, walk back. Both clients in the same physical space (acdream got there via `@teleport` admin command).
**Retail signature:** BP1=10,217, BP4=10,636, BP5=113, BP6=113, **BP2=0, BP3=0, BP7=0.** No stairs, no walls, no contact plane updates — retail's physics did almost nothing because the 2nd-floor is flat and there's nothing to collide with.
**Acdream signature:** cp-write=86,748, push-back-disp=2,752, indoor-bsp=1,061, push-back=320.
The infinite-ratio CP-write blowup. Retail wrote CP zero times across an entire flat-floor walk; acdream rewrote CP fields 86,748 times. This is the cleanest evidence for Finding 2: the bug fires equally on ordinary flat indoor walking, not just on stair attempts.
### Scenario 4 — Cottage cellar (asymmetric pair)
Retail: walked UP out of cellar (2-step ascent + indoor→outdoor exit).
Acdream: teleported INTO cellar, walked a few meters, resolver flung +Acdream OUTSIDE the cottage.
**Retail signature:** BP2=13 (cellar ascent is 2 steps; gives 13 step_up hits — non-linear vs scen2's 188 hits for 4 stair steps; depends on step height and tick density). BP7=3 (almost no CP updates during ascent + exit).
**Acdream sling-out signature:** distinct from scen2's stair-attempt:
- check-bldg=5,495 (CheckBuildingTransit fired constantly during the sling)
- cell-transit=3 events captured the sling: `0xA9B40148 → 0xA9B40029 → 0xA9B30030`, all `reason=resolver`. The third transit crossed a landblock boundary (`A9B4 → A9B3`).
- indoor-bsp=2 (indoor BSP was barely queried during the sling!)
- push-back=1 (no real sphere-adjustment happened)
The sling-out is the cell-RESOLVER misbehaving, not the BSP. ResolveCellId pushed the player out of indoor space without engaging the indoor BSP collision path at all. The check-bldg storm is the symptom: every tick, CheckBuildingTransit re-tries to figure out which building the player is in, gets it wrong, and the resolver acts on that wrong answer.
### Scenario 5 — Town Network portal entry
Substituted for "Holtburg Sewer entry" (which doesn't exist). Both clients walked to the Town Network Portal in Holtburg, entered it, walked 2 m forward in the network hub.
**Retail signature:** clean walk + portal transition + indoor walking in hub. BP1=13,863, BP4=9,552, BP5=97, BP6=55, BP7=65 (moderate CP updates around the portal threshold), BP2=1 (portal threshold step-up).
**Acdream signature:** clean physics — no failure mode. cp-write=20,956 (still ~322× retail), push-back-disp=34 (very few dispatcher hits — mostly flat-ground walking with no collisions).
**Cell-transit chain — captures the portal entry:**
```
0x00000000 -> 0xA9B30030 reason=teleport (login spawn)
0xA9B30030 -> 0xA9B40029 -> 0xA9B40021 -> 0xA9B40019 ->
0xA9B40011 -> 0xA9B40012 -> 0xA9B4000A -> 0xA9B4000B ->
0xA9B40003 (walked across Holtburg)
0xA9B40003 -> 0x00070143 reason=teleport (PORTAL ENTRY)
0x00070143 -> 0xA9B30016 reason=resolver (post-teleport resolver)
0xA9B30016 -> 0x00060016 reason=resolver (lands at network hub)
```
**Incidental discovery (filed as issue #95):** post-teleport, `[cell-cache]` events showed `visibleCells=135-145` per cell (vs normal ~4-7), with cells cached from 3 separate landblocks (0x0007, 0x020A, 0x0408) — different `worldOrigin`s, i.e. different dungeons entirely. This is the portal-graph visibility blowup. Direct cause of "see through walls / other dungeons rendering" across the project.
## Findings
### Finding 1 — Dispatcher entry frequency mismatch (4× to 281× fewer in acdream)
**Status:** confirmed in all 5 scenarios; severity MEDIUM (probable secondary effect of the v4 probe scope rather than a single fix surface).
**Retail decomp anchor:** [`CTransition::transitional_insert`](docs/research/named-retail/acclient_2013_pseudo_c.txt) — retail's outer loop dispatches `BSPTREE::find_collisions` per sub-step regardless of expected collision. (Spec §1.2 hypothesis.)
**Our suspect code site:** `src/AcDream.Core/Physics/Transition.cs` / `src/AcDream.Core/Physics/BSPQuery.cs` — the modern dispatcher path likely short-circuits when no candidate cell has potential collision geometry.
**Divergence quantified:** retail BP4 hit count vs acdream push-back-disp hit count, per Table 3 column "Acdream/Retail dispatcher ratio." Range: 0.004× (scen5) to 0.259× (scen3). Worst gap on flat-walk scenarios where retail still queries dispatcher constantly.
**Proposed fix sketch:** investigate whether acdream's `Transition` is correctly calling FindCollisions in the per-sub-step inner loop. If `transitional_insert` short-circuits on a "no obvious collision" heuristic, the optimization may be hiding CP retention behavior that ONLY runs in the dispatcher's idle paths (e.g. step_down probe-to-ground that maintains LKCP). Removing the short-circuit may close Finding 2 as a side effect.
**Scenarios affected:** all 5.
### Finding 2 — ContactPlane resynthesis blowup (250× to ∞× more in acdream)
**Status:** confirmed in all 5 scenarios; severity **HIGH (single largest M1.5 root cause)**.
**Retail decomp anchor:** `COLLISIONINFO::set_contact_plane` and the three documented retention mechanisms — Mechanism A (Path-6 land write in `BSPQuery.FindCollisions`), Mechanism B (LKCP-restore in `validate_transition`), Mechanism C (post-OK step-down probe). See spec §1.2.
**Our suspect code site:** `src/AcDream.Core/Physics/Transition.FindEnvCollisions` indoor branch — likely resynthesizes ContactPlane per frame instead of retaining via the three mechanisms. Closely related: the existing `TryFindIndoorWalkablePlane` synthesis workaround (flagged for A6.P4 removal).
**Divergence quantified:** per Table 3 column "CP-write ratio." Median 2,200× across the 4 finite scenarios; infinite ratio on scen3 (retail: 0 writes; acdream: 86,748 writes for the same flat-floor walk).
**Proposed fix sketch:**
1. Audit every site in our physics code that writes `ContactPlane`. There should be at most 3 active sites per the retention mechanisms — likely we have N>>3.
2. Replace per-frame `ContactPlane.Set(...)` calls with the retain-or-restore pattern: at the start of each tick, restore CP from `LastKnownContactPlane` (Mechanism B); only update when Path-6 lands write a new plane (Mechanism A); only re-probe via step-down when the post-OK position is suspect (Mechanism C).
3. The `TryFindIndoorWalkablePlane` synthesis goes away as part of the same change (A6.P4).
4. Verification: after the fix, re-run the scen3 capture. Target: acdream cp-write count drops from 86,748 to ≤ retail's BP7 + some buffer (say ≤ 100). If the drop is large, the change is on the right track.
**Scenarios affected:** all 5 — strongest signal in scen3 (∞× ratio).
### Finding 3 — Indoor cell-resolver sling-out (scen4)
**Status:** confirmed in scen4; severity HIGH (player can't stay inside small indoor spaces).
**Retail decomp anchor:** `CObjCell::find_cell_list` Position-variant (`acclient_2013_pseudo_c.txt:308742-308783` — already cited in CLAUDE.md as the cell-tracking ping-pong oracle for the M1.5 hypothesis).
**Our suspect code site:** `src/AcDream.Core/Physics/PhysicsEngine.ResolveCellId` + `src/AcDream.Core/Physics/CellPhysics.CheckBuildingTransit`. Issue #90 (cell-id ping-pong workaround) is part of this surface and would be removed in A6.P4 once the proper fix lands.
**Divergence quantified:** scen4 captured 3 cell-transit events during a few meters of walking inside a cottage cellar:
- `0xA9B40148 → 0xA9B40029` (indoor cottage → outdoor cell, `reason=resolver`)
- `0xA9B40029 → 0xA9B30030` (crossed landblock boundary, `reason=resolver`)
During the sling, `[check-bldg]` fired 5,495 times (CheckBuildingTransit re-resolving repeatedly), `[indoor-bsp]` fired only 2 times (indoor BSP was barely queried), and `[push-back]` fired only 1 time (no real sphere-adjustment).
**Proposed fix sketch:** ResolveCellId / CheckBuildingTransit should preserve indoor cell membership when the sphere is close to (but slightly outside) the indoor CellBSP volume — the cell-array hysteresis logic retail uses. Port the stickiness logic from the retail decomp anchor above. May obsolete issue #90's workaround.
**Scenarios affected:** scen4 directly; likely scen2/scen3 cellar/inn variants too once the visibility bug (#95) is fixed and we can re-capture.
### Finding 4 — Portal-graph visibility blowup (scope-adjacent; filed as #95)
**Status:** confirmed in scen5; severity HIGH (blocks all dungeon navigation visually); **filed as issue #95**.
Not strictly an A6 physics finding — this surfaced incidentally during scen5 capture and explains the project-wide "dungeons are broken" symptom. Full writeup in `docs/ISSUES.md` issue #95. Mentioned here so A6.P3 sequencing knows about it: any future dungeon-physics work (A8 or M1.5-residual) needs #95 fixed first, because a broken visibility set makes any in-dungeon physics analysis untrustworthy (cells are loaded that shouldn't be, distance/visibility queries return wrong answers, etc).
## M1.5 symptom coverage
Per spec §4.7, every M1.5-in-scope symptom maps to at least one bug candidate OR is explicitly flagged as deferred.
| Symptom | Source | Mapped to finding | Notes |
|---|---|---|---|
| Issue #83 — Indoor multi-Z walking broken | ISSUES.md | Finding 2 (CP-write) + Finding 3 (resolver sling) | scen3 + scen4 evidence |
| Issue #88 — Indoor static objects vibrate | ISSUES.md | Finding 2 (CP-write resynthesis per-tick causes per-tick visible micro-adjustments on static-object physics) | Hypothesis: same root cause |
| Issue #90 — Cell-id ping-pong at indoor doorway threshold | ISSUES.md | Finding 3 (cell-resolver bug) | Issue #90 is the workaround; root cause is Finding 3. A6.P4 removes the workaround. |
| Stairs walk-through (acdream can't climb) | observed | Finding 1 + Finding 2 (the stair-step probe fails because CP isn't retained between sub-steps so step_up's walkability check sees the wrong plane) | scen2 stair-attempt direct evidence |
| 2nd-floor walking (works once teleported) | observed | Finding 2 only (scen3 shows pure flat-floor CP blowup) | Walking itself fine; CP-write is the divergence |
| Cellar descent (acdream can't descend) | observed | Finding 1 + Finding 2 same as stairs | not directly captured (couldn't descend in acdream) but same physics |
| `TryFindIndoorWalkablePlane` synthesis MISS | spec §1.2 | Finding 2 (same family) | A6.P4 removes the workaround as part of the Finding 2 fix |
| Sling-out from inside building | scen4 discovery | Finding 3 (cell-resolver) | NEW symptom not in original M1.5 list — promote to symptom roster |
| "Dungeons are broken" project-wide | user-observed | Finding 4 / issue #95 (NOT A6 scope) | Defer to dedicated visibility-bug fix |
**A6.P2 acceptance test:** every in-scope M1.5 physics symptom has a mapped finding. ✅ Met.
## A6.P3 fix-surface sequencing recommendation
Per spec §5.1: "highest-confidence single-cause fix first."
**Recommended order:**
1. **Finding 2 first** (CP-write resynthesis) — single largest divergence, single largest probable impact, narrowest suspected code site (`Transition.FindEnvCollisions` indoor branch + ContactPlane retention). If Finding 1 IS a secondary effect of CP-write missing the dispatcher idle paths (the hypothesis in Finding 1's fix sketch), then fixing Finding 2 may close Finding 1 automatically. Highest expected value per PR.
2. **Re-run scen1-5 captures after Finding 2 PR lands.** Compute new ratios. If CP-write ratios drop from ~1000× to ~1× (target), Finding 2 is closed.
3. **If Finding 1 dispatcher gap also closed** — proceed directly to Finding 3.
4. **If Finding 1 still wide** — separate PR for the dispatcher-call-rate fix.
5. **Finding 3** (cell-resolver sling-out) — narrower fix; specific to ResolveCellId + CheckBuildingTransit cell-stickiness. PR also removes issue #90 workaround.
6. **A6.P4 visual verification at Holtburg inn → stairs → cellar.** Acceptance per spec §6.3.
7. **Finding 4 / issue #95** is NOT in A6.P3 scope. Handle separately when scheduled for the visibility-bug work.
## Open items / next-session candidates
- **A6.P1.5** (optional, ~1 hour): extend cdb probe with paired entry/exit BPs to capture `adjust_sphere_to_plane` output delta (Table 1) and `find_collisions` exit-path discrimination (Table 2). Only needed if A6.P3 fixes don't close the symptoms and we need sharper data. Defer until after A6.P3 first attempt.
- **Issue #95** (separate work surface): portal-graph visibility blowup. Schedule outside A6 since fixing it unblocks scen6-9 captures and any future dungeon physics work.
- **Symptom roster update:** add "indoor sling-out" to M1.5 symptom list (Finding 3 family); already captured here as a finding, but M1.5 doc should reflect it.
---
## A6.P3 slice 1 — SHIPPED 2026-05-21
Strip-synthesis + Mechanism B (LKCP restore) fix landed in 8 commits across this same session:
| Commit | Task | What |
|---|---|---|
| `ba9655f` | plan | A6.P3 slice 1 implementation plan written |
| `6b4be7f` + `c6bc2b9` | T1 | Research note: retail's `CTransition::validate_transition` LKCP-restore (line 272565-272583) + insertion-point identified in our `Transition.ValidateTransition` at TransitionTypes.cs:2849 |
| `869edd9` | T2 | Test instrumentation: `CollisionInfo.ContactPlaneWriteCount` counter |
| `36975ef` + `a32f569` | T3 | Failing regression: `IndoorContactPlaneRetentionTests` — asserts ≤5 CP writes across 60 flat-floor frames |
| `5aba071` | T4 | Mechanism B (LKCP restore) inserted in `ValidateTransition` + proximity-check sphere bug fix (`GlobalSphere[0]``GlobalCurrCenter[0]`) |
| `5f7722a` + `39fc037` + `bd5fe2e` | T5 | Indoor branch of `FindEnvCollisions` stripped to match retail's tiny `CEnvCell::find_env_collisions` shape; test redesigned as real regression sentinel (validated 60-writes-pre-strip → 0-writes-post-strip) |
| `066568a` | T6/T7 partial | scen2_inn_stairs_postfix acdream capture proves stairs now work |
| (this commit) | T6 + T8 | scen3_inn_2nd_floor_postfix capture + bookkeeping (findings doc + roadmap + CLAUDE.md + issues #96/#97 filed) |
### scen3 re-capture results (postfix)
scen3 (Holtburg inn 2nd floor flat-walk) re-captured in this slice-1 ship commit:
| Metric | Pre-fix (4b5aebc) | Post-fix | Reduction |
|---|---:|---:|---:|
| acdream cp-write (absolute) | 86,748 | 25,082 | 3.5× |
| acdream cell-cache events (proxy for session length) | 527 | 9,629 | 18× longer session |
| **cp-write per cell-cache (normalized)** | **164.61** | **2.60** | **63.2× per-unit-of-activity** |
| retail BP7 set_contact_plane | 0 | 0 | unchanged (oracle) |
Per-unit-of-activity drop is the meaningful number — a longer post-fix session naturally accumulates more total writes, but the rate per "unit of activity" (cell-cache events ~ landblocks traversed) collapsed 63×.
### scen2 re-capture results (postfix — UNEXPECTED WIN)
scen2 (Holtburg inn stairs) acdream re-captured at commit `066568a`. **Pre-fix: physics hammered BSP trying to resolve stairs (failure mode). Post-fix: user walked up and down stairs multiple times with no failure.** Tag shape shifted:
| Tag | Pre-fix (stair FAIL) | Post-fix (stair SUCCESS) | Signal |
|---|---:|---:|---|
| indoor-walkable | 859 | **0** | synthesis gone (as designed) |
| push-back-cell | 1,478 | 879 (-40%) | multi-cell iteration relaxed |
| push-back | 51 | 345 (+577%) | real step_up firing |
| push-back-disp | 4,156 | 6,055 (+46%) | real BSP traversal |
| cp-write | 33,969 | 57,846 | L622 seed (slice 2 work) |
Stairs working post-slice-1 confirms A6.P2's hypothesis that **Finding 1 (dispatcher entry frequency mismatch) was a secondary effect of Finding 2** — fixing CP retention also closes the cell-array iteration storm that prevented stair-step resolution.
### Visual verification (user happy-testing, 2026-05-21)
User report from happy-testing session post-slice-1:
- ✅ 2nd floor walking works (with caveats below)
- ✅ Stairs up + down work (M1.5 demo target unblocked)
- ✅ Cellar descent works (M1.5 demo target unblocked)
- ❌ Phantom collisions occasionally on 2nd floor — filed as **issue #97** (hypothesis: caused by #96)
- ❌ Occasional fall-through on 2nd floor — filed as **issue #97** (same)
- ❌ See-through-walls indoors — **issue #95** (not A6 scope; visibility blowup)
- ❌ Indoor lighting broken — **A7 scope**
### Status of A6.P2 findings post-slice-1
| Finding | Status post-slice-1 |
|---|---|
| Finding 1 — dispatcher entry frequency mismatch | **CLOSED as side-effect of Finding 2 fix** (scen2 dispatcher shape now retail-like) |
| Finding 2 — ContactPlane resynthesis blowup | **PARTIALLY CLOSED.** Synthesis path eliminated (indoor-walkable = 0). Remaining 99.3% of post-fix CP writes come from `PhysicsEngine.ResolveWithTransition` line 622 — a per-tick body-CP seed that retail doesn't do. **Filed as issue #96** for slice 2. |
| Finding 3 — Indoor cell-resolver sling-out | OPEN. Not addressed by slice 1. Needs scen4 re-capture to confirm whether sling-out symptom persists post-slice-1 (possible side-effect close); separate fix surface in ResolveCellId / CheckBuildingTransit otherwise. |
| Finding 4 — Portal-graph visibility blowup | OPEN as issue #95 (not A6 scope; user-confirmed during happy-testing). |
### Slice 2 recommendation
**Highest-value next slice: gate the L622 per-tick CP seed.** It's responsible for 99.3% of remaining post-fix CP writes (24,906 of 25,082 in scen3 postfix). Retail's equivalent code path fires zero `set_contact_plane` calls during flat-floor walks. Either remove the seed entirely (rely on Mechanism A/B for CP propagation) OR gate it to fire only when the body's CP has changed since last seed.
After slice 2, re-test phantom collisions + fall-through (issue #97) — they may close as side-effects (same family of "CP state being unstable across ticks"). If not, that becomes slice 3 territory + Finding 3 work.
A6.P4 (workaround removal + visual verification) can proceed in parallel with slice 2 if scope allows.

View file

@ -0,0 +1,344 @@
# A6.P1 partial-ship handoff — 2026-05-21
**Status:** Infrastructure complete + scenario 1 (Holtburg inn doorway)
captured end-to-end (retail + acdream paired). Scenarios 29 deferred to
next session.
**Pasteable session-start prompt at the bottom of this doc.**
## TL;DR
A6.P1 ships in two milestones:
1. **Infrastructure milestone (DONE today):** `[push-back]` acdream probe (3
helpers + 3 sites + DebugVM mirror + CLAUDE.md docs), cdb probe script
(v4 with PDB-verified offsets + hex-bits float output), PowerShell
runner with ASCII encoding, README, capture-dir scaffolding,
PDB-match verification, type dumper, hex→float decoder.
2. **Capture milestone (PARTIAL):** 1 of 9 scenarios captured. Scenarios
29 user-driven, deferred at user direction to avoid fatigue.
**Scenario 1 already surfaces two strong M1.5 findings** (before any
formal A6.P2 analysis):
| Metric | Retail | acdream | Notes |
|---|---:|---:|---|
| dispatcher entries (find_collisions / BSPQuery.FindCollisions) | 5,818 | 295 | acdream calls dispatcher **20× less often** |
| ContactPlane writes (set_contact_plane fn / per-field writes) | 18 calls | **73,304** field-writes | acdream **rewrites CP every frame/sub-step** vs retail's per-event |
The CP-write blowup directly confirms the spec's hypothesis
([2026-05-21-phase-a6-indoor-physics-fidelity-design.md §1.2](../superpowers/specs/2026-05-21-phase-a6-indoor-physics-fidelity-design.md))
that `FindEnvCollisions` indoor branch resynthesizes CP per frame
instead of retaining via Mechanisms A/B/C. Same family as the
`TryFindIndoorWalkablePlane` workaround.
## State both altitudes (next session)
> **Currently working toward: M1.5 — "Indoor world feels right."**
>
> **Current phase: A6 — Indoor physics fidelity (cdb-driven).**
>
> **Next concrete step: Capture scenarios 29 (paired retail + acdream
> traces). Then run A6.P2 analysis on all 9 captures.**
## What shipped today (16 commits)
### Infrastructure (Tasks 114 from the A6.P1 plan)
| Commit | What |
|---|---|
| `ace9e62`, `ad6c89d` | T1: `ProbePushBackEnabled` toggle + roundtrip test |
| `3a173b9` | T2: `LogPushBackAdjust` helper |
| `eb8a318` | T3: instrument `BSPQuery.AdjustSphereToPlane` |
| `2d1f27d` | T4: `LogPushBackDispatch` helper |
| `35631d1` | T5: instrument `BSPQuery.FindCollisions` |
| `66ee757` | T6: `LogPushBackCellTransit` helper |
| `642734d` | T7: instrument `Transition.CheckOtherCells` |
| `dd95c10` | T8: DebugVM `ProbePushBack` mirror |
| `e1f7efe` | T9: CLAUDE.md `ACDREAM_PROBE_PUSH_BACK` env var docs |
| `7bb799b` | T10: `tools/cdb/a6-probe.cdb` v1 (broken offsets) |
| `1c640eb` | T11: `tools/cdb/a6-probe-runner.ps1` (later patched to ASCII) |
| `df315a9` | T12: `tools/cdb/README-a6-probe.md` |
| `0e21f22`, `22e341f` | T13: PDB-match verification (audit trail) |
| `260c60f` | T14: capture-dir scaffolding + findings doc stub |
### cdb script iteration (T15 dry-runs)
| Commit | What |
|---|---|
| `d0c8c54` | v1→v2 prep: type dumper (`a6-types-dump.cdb` + runner) + ASCII runner |
| `7b9b26f` | v2 cdb script: PDB-verified offsets + BP6 fix to `check_walkable` |
| `1b6d49e` | v3 cdb script: `@@c++(*(float*)addr)` for floats (still produced zeros) |
| `2d841cb` | v4 cdb script: hex-bits float output via `%08X` (WORKS) |
### Scen1 capture + decode tooling
| Commit | What |
|---|---|
| `180b4a5` | scen1 retail.log captured (v4 cdb, 13,552 hits, real hex bits) |
| `8ca718a` | scen1 acdream.log paired (84,130 lines, full probe distribution) |
| `194ed3e` | `decode_retail_hex.py` — Python hex→float decoder + scen1 decoded log |
## Why cdb v1→v4 iteration was necessary
The cdb side hit three landmines we didn't anticipate when writing the
A6.P1 plan:
1. **v1: Stack-arg offsets wrong.** Plan's probe actions used arbitrary
registers (`@edx`, `@edi`) to read function args. `__thiscall` puts
non-this args on the stack (`[esp+N]`), not in arbitrary registers.
All 12 BP5 hits printed `Nx=0 Ny=0 ...` — confirming the read
addresses were wrong. **Fix:** type dumper + double-indirect via
`dwo(poi(@esp+N)+offset)`.
2. **v2: BP6 symbol wrong + PowerShell UTF-16 encoding.** v1's
`validate_walkable` doesn't exist in the PDB (the actual function is
`CTransition::check_walkable`). PowerShell's `Tee-Object` writes
UTF-16 LE by default, making logs ungreppable. **Fixes:** BP6 symbol
corrected, runner switched to `Out-File -Encoding ASCII`. v2 had
correct integer reads (substeps=3, insertType=0) but all `%f` floats
still printed as 0.000000.
3. **v3: `%f` doesn't work with `dwo()`.** Switching to
`@@c++(*(float*)addr)` to force C++ interpretation also produced
0.000000 across all float fields. cdb's `.printf %f` appears to not
reliably handle our float values (possibly varargs promotion, possibly
a deeper limitation). **Workaround (v4):** print all floats as 32-bit
hex bits via `%08X`; Python decoder reinterprets via
`struct.unpack('<f', struct.pack('<I', value))`.
The v4 + decoder pattern works. **Pickup sessions should NOT change
the cdb script** unless adding new BPs. The hex-bits encoding is robust
and the decoder validates against known constants (BP6 threshold = FloorZ).
## Scen1 findings (preliminary — formal A6.P2 to follow)
### Capture pair
- Retail: `docs/research/2026-05-21-a6-captures/scen1_inn_doorway/retail.log` (raw v4 hex) + `retail.decoded.log` (decoded floats).
- Acdream: `docs/research/2026-05-21-a6-captures/scen1_inn_doorway/acdream.log` (84,130 lines).
### BP hit-count distribution (2-sec walk through inn doorway, both clients)
| Site | Retail | Acdream | Ratio (acdream/retail) |
|---|---:|---:|---:|
| transitional_insert / sub-step loop | 7,686 (BP1) | n/a (no acdream probe) | — |
| find_collisions dispatch | 5,818 (BP4) | 295 ([push-back-disp]) | **0.05× (20× fewer)** |
| adjust_sphere_to_plane | 12 (BP5) | 8 ([push-back]) | 0.67× |
| check_other_cells loop | n/a (BP3 zero — no wall hit) | 5 ([push-back-cell]) | — |
| check_walkable / ground verdict | 12 (BP6) | n/a (no acdream probe) | — |
| set_contact_plane / CP writes | 18 (BP7 fn calls) | 73,304 (per-field) | **~1001000× more** |
| step_up | 1 (BP2) | n/a | — |
| set_collide / wall halt | 0 (no wall hit in scen1) | n/a | — |
### Finding 1: dispatcher entry frequency mismatch
Retail's `BSPTREE::find_collisions` fires 5,818 times in ~2 seconds of
walking (~2,900/sec). Acdream's `BSPQuery.FindCollisions` fires only
295 times in the same scenario (~150/sec).
**Possible causes** (investigate during A6.P2):
- Physics tick rate difference (retail 30Hz? per CLAUDE.md
steep-roof finding) vs acdream's tick.
- Different sub-step cadence inside `transitional_insert`
retail's outer loop iterates much more than ours.
- Different number of cells visited per sub-step (retail's CELLARRAY
iteration calls dispatcher per cell; we may only call once
per primary cell).
- Probe scope difference: retail's BP catches `BSPTREE::find_collisions`
(one C++ class). Acdream's `[push-back-disp]` covers
`BSPQuery.FindCollisions` modern overload (one C# method). If our
call paths into dispatcher are differently structured, frequencies
diverge.
### Finding 2: ContactPlane write blowup
Acdream writes 73,304 ContactPlane field-level updates in 30 seconds
(~2,400/sec including the boot phase before the player moved).
Retail's `set_contact_plane` fires 18 times (~6/sec including boot).
Even with a 6× field-write multiplier per `set_contact_plane` call,
that gives ~100 actual CP updates in retail vs ~12,000 in acdream
**100×+ more frequent in acdream**.
**This is the M1.5 hypothesis confirmed empirically.** Per the spec
§1.2, the working hypothesis was that `FindEnvCollisions` indoor
branch rewrites CP every frame instead of retaining it via the three
documented retention mechanisms. The 73K cp-write data confirms.
A6.P3 fix surface: stop rewriting CP every frame; use the existing
LKCP-restore (Mechanism B at `validate_transition`) + Path-6 land
write (Mechanism A) + post-OK step-down probe (Mechanism C).
`TryFindIndoorWalkablePlane` synthesis (the workaround flagged for
A6.P4 removal) is part of the same bad-pattern family.
### Per-call shape match (BP5 hit#1)
| Field | Retail (decoded) | Acdream | Match? |
|---|---|---|---|
| Plane.N | (0, 0, 1) | (0, 0, 1) | ✓ identical |
| Plane.d | -0.0000 | -0.0000 | ✓ identical |
| Sphere.center.x | 0.0046 | -0.4325 | independent walks |
| Sphere.center.y | 10.3072 | 11.0219 | independent walks |
| Sphere.center.z | -0.2700 | 0.4600 | DIFFERENT axis — investigate |
| Sphere.radius | 0.4800 | 0.4800 | ✓ identical |
| WalkInterp (pre) | 1.0000 | 1.0000 | ✓ identical |
| Movement.x | 0.0000 | 0.0000 | ✓ identical |
| Movement.y | -0.0000 | -0.0000 | ✓ identical |
| Movement.z | -0.7500 | -0.5000 | DIFFERENT — investigate |
The shape matches (vertical step-down probe against ground), but two
axis values differ between retail and acdream:
- **Sphere.center.z**: retail -0.27, acdream +0.46. Could be different
local-space conventions (retail's localspace_pos vs acdream's
per-cell transform).
- **Movement.z**: retail -0.75 (the value passed by the call site
in retail's decomp), acdream -0.50 (smaller step-down probe distance).
These could be the BSP correction-path divergence the spec hypothesizes,
or they could be benign convention differences. A6.P2 with the full 9
scenarios will surface which.
## What's deferred (scenarios 29 + A6.P2)
### Scenarios 29 (~40 min user time at ~5 min each)
| # | Tag | Location | Walk script |
|---|---|---|---|
| 2 | scen2_inn_stairs | Holtburg inn, stairs to 2nd floor | Walk up 4 steps, stop on landing |
| 3 | scen3_inn_2nd_floor | Holtburg inn 2nd floor | Walk forward 3 m, sidestep 1 m, walk back |
| 4 | scen4_cottage_cellar | Holtburg cottage with cellar | Walk to cellar opening, descend 2 steps |
| 5 | scen5_sewer_entry | Holtburg sewer entrance | Walk into portal, then walk 2 m forward inside |
| 6 | scen6_sewer_first_stair | Sewer's first stair after entry | Walk down full stair flight |
| 7 | scen7_sewer_inter_room | Between any two sewer rooms via portal | Walk through portal, stop 1 m past |
| 8 | scen8_sewer_chamber | Sewer's multi-Z room | Walk in, traverse center, walk out other side |
| 9 | scen9_sewer_corridor | Sewer narrow corridor | Walk full length end-to-end |
Per-scenario protocol (validated by scen1):
1. User launches retail, navigates character to start point, stops.
2. Run `.\tools\cdb\a6-probe-runner.ps1 -ScenarioTag "scenN_..."`.
Wait for `a6-probe v4 armed:` confirmation in
`docs/research/2026-05-21-a6-captures/scenN_.../retail.log`.
3. User performs the scripted walk in retail.
4. cdb auto-detaches at 50K hits (or kill cdb to release retail —
acclient comes down too, accept and relaunch).
5. User launches acdream with all 5 probe env vars
(`ACDREAM_PROBE_PUSH_BACK=1` + indoor_bsp + cell + cell_cache + contact_plane).
Output to `docs/research/2026-05-21-a6-captures/scenN_.../acdream.log`.
6. User walks acdream through the SAME scripted walk.
7. Close acdream gracefully.
8. Run `py tools/cdb/decode_retail_hex.py docs/research/2026-05-21-a6-captures/scenN_.../retail.log`.
9. Commit `retail.log`, `acdream.log`, `retail.decoded.log` for that scenario.
### A6.P2 (analysis report) — ~1 day after all 9 scenarios are in
Spec §4 of the design doc defines the 4 mandatory tables:
1. Per-site push-back delta (Table 1)
2. Path-frequency diff (Table 2)
3. ContactPlane lifecycle diff (Table 3)
4. Sub-step state mutations (Table 4)
Plus per-scenario narrative + findings section.
**Already have strong evidence for Finding 2 (CP-write blowup)** from
scen1 alone. A6.P2 quantifies + extends across the remaining 8
scenarios + writes the formal A6.P3 fix sketches.
## Known issues + gotchas (lessons from today)
1. **Killing cdb kills retail** (per CLAUDE.md). Either wait for 50K
threshold via `qd` auto-detach (~60 sec under motion) or accept that
killing cdb takes acclient down too. Relaunch is ~30 sec.
2. **PowerShell `Tee-Object` writes UTF-16 LE.** The runner uses
`Out-File -Encoding ASCII` to fix this. Don't revert.
3. **cdb `.printf %f` is unreliable.** v4 uses hex output + Python
decoder. Do NOT try to "simplify" back to `%f`.
4. **Retail binary must match the PDB** (GUID `{9e847e2f-...}`,
linker UTC `2013-09-06`). Verify with
`py tools/pdb-extract/check_exe_pdb.py "C:/Turbine/Asheron's Call/acclient.exe"`
before any capture session.
5. **Hit-rate budget under motion.** ~13K total hits per 2-sec walk.
Threshold of 50K survives ~8 sec of continuous walking before
auto-detach. For longer scenarios (sewer corridor end-to-end),
the walk may need to be broken into multiple captures OR threshold
bumped to 100K (edit `a6-probe.cdb` `.if (@$t0 >= 50000)``100000`).
6. **BP6 fires with FloorZ (0.6642) not cos85 (0.0872).** v4 confirmed
this — `check_walkable` is called with `PhysicsGlobals.FloorZ` for
ground verdicts. The cos85 value (0.0872) is passed in a different
code path (post-set_collide wall-slide) which didn't fire during
scen1 (no wall hits). Will appear when scenarios 29 hit walls.
## Pickup prompt for fresh session
Open a new Claude Code session at this worktree's branch
(`claude/strange-albattani-3fc83c`, HEAD at the latest A6.P1 commit).
Then paste:
---
```
Pick up A6.P1 capture work — scenarios 2 through 9. The infrastructure
shipped today (probe + cdb v4 + decoder all working). Scenario 1 captured
end-to-end with paired retail + acdream traces; preliminary findings
already strong (CP-write blowup confirms the M1.5 hypothesis).
Read FIRST:
docs/research/2026-05-21-a6-p1-partial-ship-handoff.md
Then state both altitudes:
Currently working toward: M1.5 — Indoor world feels right
Current phase: A6.P1 — capture scenarios 2-9
Next concrete step: scenario 2 (Holtburg inn stairs)
Workflow per scenario (validated by scen1):
1. Verify retail binary matches PDB:
py tools/pdb-extract/check_exe_pdb.py "C:/Turbine/Asheron's Call/acclient.exe"
Expect MATCH (GUID {9e847e2f-...}).
2. User launches retail, walks character to scenario start, stops.
3. .\tools\cdb\a6-probe-runner.ps1 -ScenarioTag "scenN_..."
Wait for "a6-probe v4 armed:" in the log file.
4. User performs the scripted walk.
5. Wait for cdb auto-detach (50K hits) OR kill cdb (acclient dies too;
relaunch needed). Hit rate ~6.5K/sec under motion.
6. User launches acdream with all 5 probe env vars + output to
docs/research/2026-05-21-a6-captures/scenN_.../acdream.log
7. User walks acdream through the SAME scripted walk.
8. Close acdream gracefully.
9. py tools/cdb/decode_retail_hex.py docs/research/.../retail.log
10. Commit retail.log + retail.decoded.log + acdream.log for that scenario.
Scenario list per the README at tools/cdb/README-a6-probe.md.
DO NOT modify the cdb script. v4 works (verified by BP6 threshold
decoding to FloorZ 0.6642 exactly). The hex-bits + Python decoder
pattern is the stable approach.
CLAUDE.md rules apply:
- Three failed visual verifications = handoff (we hit this on the
cdb script v1→v2→v3 cycle; v4 broke the streak).
- No workarounds without approval (v4 hex output isn't a workaround,
it's the chosen design after cdb %f proved unreliable).
- Visual verification at the Holtburg Sewer is the M1.5 physics
acceptance test (deferred to A6.P4 after fixes land).
After all 9 captures: proceed to A6.P2 analysis per the design spec
docs/superpowers/specs/2026-05-21-phase-a6-indoor-physics-fidelity-design.md
§4. Note: Finding 2 (CP-write blowup) is already evidence-confirmed
from scen1; A6.P2 just needs to quantify + extend across scenarios.
```
---
## References
- Design spec: [`docs/superpowers/specs/2026-05-21-phase-a6-indoor-physics-fidelity-design.md`](../superpowers/specs/2026-05-21-phase-a6-indoor-physics-fidelity-design.md)
- Implementation plan: [`docs/superpowers/plans/2026-05-21-phase-a6-p1-cdb-probe-spike.md`](../superpowers/plans/2026-05-21-phase-a6-p1-cdb-probe-spike.md)
- cdb script: [`tools/cdb/a6-probe.cdb`](../../tools/cdb/a6-probe.cdb) (v4)
- cdb runner: [`tools/cdb/a6-probe-runner.ps1`](../../tools/cdb/a6-probe-runner.ps1)
- Type dumper: [`tools/cdb/a6-types-dump.cdb`](../../tools/cdb/a6-types-dump.cdb) + [`a6-types-dump.txt`](../../tools/cdb/a6-types-dump.txt) (PDB-extracted offsets)
- Hex decoder: [`tools/cdb/decode_retail_hex.py`](../../tools/cdb/decode_retail_hex.py)
- Scen1 retail: [`docs/research/2026-05-21-a6-captures/scen1_inn_doorway/retail.log`](2026-05-21-a6-captures/scen1_inn_doorway/retail.log) + [`retail.decoded.log`](2026-05-21-a6-captures/scen1_inn_doorway/retail.decoded.log)
- Scen1 acdream: [`docs/research/2026-05-21-a6-captures/scen1_inn_doorway/acdream.log`](2026-05-21-a6-captures/scen1_inn_doorway/acdream.log)
- Findings doc stub (to be filled by A6.P2): [`docs/research/2026-05-21-a6-cdb-capture-findings.md`](2026-05-21-a6-cdb-capture-findings.md)

View file

@ -0,0 +1,264 @@
# A6.P3 Slice 1 — Retail Mechanism B Oracle for Indoor CP Retention
**Date:** 2026-05-21
**Author:** Claude (research agent)
**Task:** Pre-fix research grounding the indoor ContactPlane-retention refactor in
retail's exact LKCP-restore pattern before the synthesis path is removed.
---
## 1. `CEnvCell::find_env_collisions` Shape
Retail decomp at `acclient_2013_pseudo_c.txt` lines 309573309593 (address
`0052c130`). The complete function is 10 functional lines:
```c
// 0052c130 enum TransitionState __thiscall CEnvCell::find_env_collisions(
// class CEnvCell const* this, class CTransition* arg2)
{
// Check entry restrictions (object ethereal? door closed? etc.)
enum TransitionState result = CObjCell::check_entry_restrictions(this, arg2);
if (result == OK_TS) {
// 0052c144 Clear obstruction-ethereal so BSP collision is live.
arg2->sphere_path.obstruction_ethereal = 0;
if (this->structure->physics_bsp != 0) {
// 0052c169 Project sphere into cell-local space.
SPHEREPATH::cache_localspace_sphere(&arg2->sphere_path, &this->pos, 1f);
// 0052c175 Run BSP: INITIAL_PLACEMENT → placement_insert path;
// all other insert_types → find_collisions path.
if (arg2->sphere_path.insert_type != INITIAL_PLACEMENT_INSERT)
result = BSPTREE::find_collisions(this->structure->physics_bsp, arg2, 1f);
else
result = BSPTREE::placement_insert(this->structure->physics_bsp, arg2);
// 0052c1a5 On collision with environment (non-Contact objects only).
if (result != OK_TS && (arg2->object_info.state & 1) == 0)
arg2->collision_info.collided_with_environment = 1;
}
}
return result;
}
```
**Key observation:** `find_env_collisions` itself does **not** write
`contact_plane`. It either returns OK (BSP path OK or no BSP) or returns a
collision state. ContactPlane is written ONLY inside `BSPTREE::find_collisions`
via Path 6 (land/step-down) — that is Mechanism A. There is no per-frame
synthesis path anywhere in this function.
---
## 2. Retail Mechanism B Location
**Function:** `CTransition::validate_transition`
**Retail address:** `0050aa70`
**Decomp line range:** `acclient_2013_pseudo_c.txt` lines 272547272700
**Identified via:** The `validate_transition` function header appears at line
272547 (`0050aa70`). Line 272538 is inside the preceding
`CTransition::check_collisions` function.
The LKCP-restore block runs at lines 272565272582 (addresses `0050aaed``0050ab4c`).
---
## 3. Retail Mechanism B Trigger Condition
Mechanism B fires when ALL of the following are true:
1. **`result > OK_TS && result <= SLID_TS`** — the transition ended in Collided,
Adjusted, or Slid (not OK, not Invalid).
2. **`collision_info.last_known_contact_plane_valid != 0`** — there is a
remembered floor plane from a prior frame.
3. **Proximity guard:** `|dot(global_curr_center, LKCP.N) + LKCP.d| <= radius + 0.000199f`
— the sphere's **current position center** (`global_curr_center`, NOT the
check-position sphere `global_sphere`) is still geometrically close to the
last-known plane.
When all three pass, retail:
```c
// 0050ab37
COLLISIONINFO::set_contact_plane(
&this->collision_info,
&this->collision_info.last_known_contact_plane,
this->collision_info.last_known_contact_plane_is_water);
// 0050ab42
this->collision_info.contact_plane_cell_id =
this->collision_info.last_known_contact_plane_cell_id;
```
Then `result = OK_TS` at `0050ab9f` — the collision is resolved by restoring the
floor and treating the transition as successful.
**After that block**, at `0050acff``0050ad7d`, retail sets
`last_known_contact_plane_valid = contact_plane_valid` (unconditional overwrite,
NOT "only when valid") and then sets `Contact` + `OnWalkable` flags based on
whether `contact_plane_valid` is non-zero. The LKCP update strategy is
**unconditional** in retail (even if current CP is invalid, LKCP gets cleared).
**The epsilon constant:** `0.000199999995f` — effectively `2e-4`. This is a
tight epsilon for floating-point error in the dot product; the sphere radius
already provides the geometric margin.
---
## 4. Our Equivalent Function
From `grep -rn "ValidateTransition" src/AcDream.Core/Physics/`:
```
TransitionTypes.cs:2751 private TransitionState ValidateTransition(TransitionState transitionState)
TransitionTypes.cs:670 transitionState = ValidateTransition(result);
```
Our C# `ValidateTransition` (TransitionTypes.cs lines 27512873) is the
correct equivalent. The call at line 670 is inside `FindTransitionalPosition`'s
step loop: each call to `TransitionalInsert` is immediately followed by
`ValidateTransition(result)`.
---
## 5. Decision — Where to Add Mechanism B in Our Code
### Gap analysis
Our `ValidateTransition` has TWO divergences from retail's Mechanism B:
**Gap 1: Missing `SetContactPlane` write in the Collided/Slid/Adjusted branch.**
Retail's `validate_transition` (lines 272565272582) calls
`COLLISIONINFO::set_contact_plane(LKCP, LKCP_is_water)` and sets
`contact_plane_cell_id = LKCP_cell_id` before returning `OK_TS`.
Our `ValidateTransition` at TransitionTypes.cs:28212866 (the
`else if (ci.LastKnownContactPlaneValid)` block) only reads `LastKnownContactPlane`
to update `oi.State` flags (`Contact`, `OnWalkable`) — it does **not** call
`ci.SetContactPlane(...)`. This means `ContactPlane` stays invalid even when
we know the LKCP is close, while `ci.LastKnownContactPlane` holds the value.
The PhysicsEngine fallback at PhysicsEngine.cs:668674 partially compensates
(it reads LKCP to populate `body.ContactPlane` cross-frame), but it only does
so after `FindTransitionalPosition` returns — not per-step inside the loop.
**Gap 2: Wrong sphere used for proximity dot product.**
Retail uses `global_curr_center` (pointer to the sphere center at the *current*
frame-start position) for the dot product. Our code at TransitionTypes.cs:2843
uses `sp.GlobalSphere[0].Origin` (the *check* position — where we want to move
to). For the proximity check against a retained floor plane, the correct center
is `sp.GlobalCurrCenter[0].Origin`, matching retail's `global_curr_center`.
This distinction matters when the player is near a cell/floor boundary: if the
check position has stepped slightly off the floor but the current position is
still on it, retail correctly restores the CP; our code might fail the proximity
guard spuriously.
### Insertion point (exact)
**File:** `src/AcDream.Core/Physics/TransitionTypes.cs`
**Method:** `ValidateTransition` (line 2751)
**Target block:** The `else if (ci.LastKnownContactPlaneValid)` block at lines
28212866 (the LKCP proximity-guard branch).
**Change required:** Within the `if (radius + PhysicsGlobals.EPSILON > MathF.Abs(angle))` branch (currently at line 2848), BEFORE setting `oi.State` flags:
1. Add `ci.SetContactPlane(ci.LastKnownContactPlane, ci.LastKnownContactPlaneCellId, ci.LastKnownContactPlaneIsWater);`
2. Change the proximity sphere center from `sp.GlobalSphere[0].Origin` (line 2843)
to `sp.GlobalCurrCenter[0].Origin` to match retail's `global_curr_center`.
The addition goes at TransitionTypes.cs approximately **line 2849** (just before
the `oi.State |= ObjectInfoState.Contact` at current line 2852), producing:
```csharp
// Retail Mechanism B (validate_transition:0050ab37): restore CP from LKCP
// when sphere is still near the plane. This writes ContactPlane valid so
// the end-of-function LastKnown-update block (below) re-latches it,
// and ObjectInfoState.Contact is set from contact_plane_valid.
ci.SetContactPlane(ci.LastKnownContactPlane,
ci.LastKnownContactPlaneCellId,
ci.LastKnownContactPlaneIsWater);
// Then set Contact + OnWalkable (same logic as retail's 0050ad6a block):
oi.State |= ObjectInfoState.Contact;
if (ci.LastKnownContactPlane.Normal.Z >= PhysicsGlobals.FloorZ)
oi.State |= ObjectInfoState.OnWalkable;
else
oi.State &= ~ObjectInfoState.OnWalkable;
```
> **Note:** `SetContactPlane` also re-latches `LastKnownContactPlane`, `LastKnownContactPlaneCellId`, and `LastKnownContactPlaneIsWater` (TransitionTypes.cs:258-261). Passing LKCP as the source means the re-latch is a no-op on those fields — functionally safe, but worth knowing if you later decide to inline the writes instead of using `SetContactPlane`.
**Note on the LKCP-update strategy divergence (Gap 3):** Retail's `validate_transition`
at `0050acff` does `last_known_contact_plane_valid = contact_plane_valid`
unconditionally — this means when contact is invalid and stays invalid, LKCP is
cleared. Our code at TransitionTypes.cs:2801 only updates LKCP when current CP
is valid (L.2.3c deliberate divergence from 2026-04-29 to prevent animation
flicker on failed step-ups). **Do not change this in slice 1** — the Mechanism B
`SetContactPlane` call above feeds into the standard contact-valid branch (lines
28012819), which then re-latches LKCP normally. The net effect is equivalent
to retail's unconditional overwrite in the success case, without the flicker
regression of clearing LKCP on transient failures.
---
## 6. Risk — First-Frame Fall-Through
**Scenario:** Player teleports into a new indoor cell (or crosses a cell
boundary). On frame 0 in the new cell: LKCP is invalid (no prior frame data),
BSP returns OK (no wall collision, player is standing on a floor poly). With
the synthesis path stripped (Task 5) and Mechanism B requiring a valid LKCP,
this frame will have `ContactPlane` invalid for the indoor case.
**Consequence:** Frame 0 post-cell-cross → `ContactPlane` invalid → outdoor
terrain fallback fires → ValidateWalkable evaluates outdoor terrain Z → outdoor
Z is below indoor floor (due to +0.02f Z-bump) → player appears 0.02+ m above
the outdoor plane → ValidateWalkable decides they're airborne → `OnWalkable=false`
→ falling animation for one frame. Retail avoids this via Mechanism A: when BSP
Path 6 (step-down/land) fires on the first indoor frame, it writes CP directly
from the floor polygon.
**Assessment for slice 1:** Mechanism A is already wired in `BSPQuery.FindCollisions`
(calls `SetContactPlane` at BSPQuery.cs lines 1204 + 1713 for Path 6). If the
player's foot sphere is close enough to a floor polygon on the first frame
(within `step_sphere_down`'s probe distance), Path 6 will write CP and LKCP
will be primed via the `ci.ContactPlaneValid` branch (TransitionTypes.cs:2801).
Frame 1 will have LKCP valid and Mechanism B can take over.
**Risk is LOW for normal walking** (player stays near the floor, Path 6 fires
on the first frame in any cell). Risk is HIGHER for teleport-into-air edge
cases where the player spawns slightly above the floor and the step-down probe
misses. Accept for slice 1; slice 2 (Mechanism C) adds a direct floor-plane
probe from the new cell's geometry on first entry, closing the gap completely.
**Mitigation hedge:** When stripping `TryFindIndoorWalkablePlane` in Task 5,
do NOT strip the `ValidateWalkable` call — keep it guarded by `walkableHit`
being true. The fall-through to outdoor terrain remains as a last-resort
backstop for the single-frame miss (wrong Z, one frame of falling animation,
then Mechanism A re-grounds on the next frame). This is one visible frame of
glitch vs the current 86,748 CP writes per walk sequence. Acceptable for
slice 1.
---
## Summary Table
| Item | Retail | Our Code (pre-fix) |
|---|---|---|
| `find_env_collisions` writes CP? | No — only via BSP Path 6 (Mechanism A) | Yes — synthesis path writes CP every frame indoors |
| Mechanism B location | `CTransition::validate_transition`, Collided/Slid/Adjusted branch | Present but INCOMPLETE — sets flags only, no `SetContactPlane` call |
| Mechanism B proximity sphere | `global_curr_center` (frame-start center) | `GlobalSphere[0].Origin` (check position — wrong) |
| LKCP update strategy | Unconditional overwrite | Only on valid CP (L.2.3c deliberate fix) |
| First-frame risk | Mechanism C closes; Mechanism A covers normal cases | Same risk; accept for slice 1 |
---
## References
- `acclient_2013_pseudo_c.txt` lines 309570309595 (`CEnvCell::find_env_collisions`)
- `acclient_2013_pseudo_c.txt` lines 272547272700 (`CTransition::validate_transition`)
- `src/AcDream.Core/Physics/TransitionTypes.cs` lines 27512873 (`ValidateTransition`)
- `src/AcDream.Core/Physics/TransitionTypes.cs` lines 15141777 (`FindEnvCollisions`)
- `src/AcDream.Core/Physics/PhysicsEngine.cs` lines 640692 (`RunTransitionResolve`)
- `src/AcDream.Core/Physics/BSPQuery.cs` lines 1204, 1713 (Mechanism A `SetContactPlane`)

View file

@ -0,0 +1,223 @@
# A6.P3 handoff — 2026-05-22
**Status:** A6.P3 slices 1+2+3 SHIPPED. Issue #98 (cellar ascent stuck at top) **diagnosed but NOT fixed.** Sharp Path-5-vs-Path-6 BSP path-selection target identified with paired retail+acdream cdb evidence. Next session: fix #98 at `BSPQuery.FindCollisions` path-selection.
**Pasteable session-start prompt at the bottom of this doc.**
---
## TL;DR
Two full days of A6 work landed:
| Day | Slice | Result |
|---|---|---|
| 2026-05-21 | A6.P1 + A6.P2 + A6.P3 slice 1 (CP retention strip + Mechanism B) | Stairs + cellar descent work in acdream. A6.P2 Finding 1 (dispatcher freq) closed as side-effect of Finding 2 (CP-write blowup). |
| 2026-05-22 morning | A6.P3 slice 2 (L622 seed; v1 reverted; v2 no-op guard) | #96 partially addressed; accepted as documented retail divergence. |
| 2026-05-22 morning | A6.P3 slice 3 (cell-resolver stickiness; v1/v2/v3) | Cell-resolver ping-pong CLOSED. #90 workaround now redundant (defer A6.P4 removal). |
| 2026-05-22 noon | Slice 4 polydump probe + retail cdb capture | **Pinpointed #98 root cause:** our BSP picks Path 5 (Contact→step_up→adjust_sphere push-back) for the cellar ramp polygon when retail picks Path 6 (find_walkable → land on flat floor). |
**User-visible deltas vs Wed morning baseline (2026-05-20):**
- ✅ Inn stairs UP — works (was broken)
- ✅ Cellar descent — works (was broken)
- ✅ 2nd floor walking — works (was broken; with caveats — phantom collisions occasionally)
- ❌ Cellar ASCENT (stuck at top step) — still broken (this is issue #98)
- ❌ Visible-through-walls in dungeons — issue #95 (separate scope)
- ❌ Indoor lighting — A7 scope (separate phase)
## What shipped this session (2026-05-22)
| Commit | What |
|---|---|
| `892019b` | A6.P3 slice 2 v1: removed L622 per-tick CP seed (CP-write 91% reduction BUT broke BSP step_up at last step of stairs) |
| `f8d669b` | A6.P3 slice 2 v2: revert v1 + add no-op-if-unchanged guard inside `CollisionInfo.SetContactPlane` |
| `d868946` | Slice 2 ship docs + filed issue #98 (cellar ascent stuck — originally hypothesized as cell-resolver ping-pong) |
| `8898166` | A6.P3 slice 3 v1: sphere-overlap stickiness in `ResolveCellId` (over-corrected; blocked legitimate cell transitions) |
| `3e140cf` | A6.P3 slice 3 v2: switched to point-in stickiness — cell-resolver ping-pong CLOSED (data confirmed: 1 cell-transit event vs 20+ pre-fix) |
| `ceeb06b` | Slice 3 ship docs + #98 re-diagnosed (cellar-up symptom persists with NEW cause — BSP step-physics, not cell-resolver) |
| `0b44996` | Slice 4: added `[poly-dump]` probe in `AdjustSphereToPlane` — verifies dat fidelity by dumping polygon vertices+plane+sidesType on every push-back |
| `3198472` | Extended `[cell-cache]` probe with `portalTargets` list — shows which cells each portal connects to |
| `8bd3117` | A6.P3 slice 3 v3: REVERTED stickiness entirely (hypothesis-test for #98) — cellar-up symptom persists |
| `bbd1df4` | Slice 4: WalkInterp reset before placement_insert in DoStepDown (retail-faithful improvement; didn't fix #98 but kept as quality fix) |
| `134c9b8` | **Retail cellar-up cdb capture** — paired evidence for the Path-5 vs Path-6 diagnosis |
| `efb5f2c` | Issue #98 updated with sharpened diagnosis + failed-attempt log |
## The sharp diagnosis for issue #98
**Symptom:** User walks UP the Holtburg cottage cellar in acdream. Runs into "an invisible roof or wall" at the top step. Animation plays but no Z progress. Stuck.
**Paired evidence:**
| Metric | Retail (success) | Acdream (stuck) |
|---|---:|---:|
| BP1 transitional_insert | 2,651 | (no acdream BP1 mirror) |
| BP2 step_up | 29 (incl. 1 on ramp slope) | — |
| BP4 find_collisions | 4,032 | push-back-disp ~9000 |
| BP5 adjust_sphere | **30 (ALL on FLAT planes)** | **push-back ~1000 (270 on RAMP slope poly 0x0008)** |
| BP6 check_walkable | 25 | indoor-walkable ~700 |
| BP7 set_contact_plane | **18 (all set same flat plane: (0,0,1) d=-93.9998 = world Z=94 = cottage main floor)** | cp-write 229,300 (varying planes from many sites) |
| step_up_slide | (via BP2 = 29) | 159+ hits |
**The divergence (pinpointed):**
For the cellar ramp polygon (cellar cell 0xA9B40147, poly 0x0008, n=(0,-0.719,0.695), 46° walkable slope):
- **Retail's BSP picks Path 6 (find_walkable → land)** — treats the ramp as a walkable floor. Smoothly LANDS the sphere on the ramp surface during step_down probe. Sets ContactPlane to the cottage main floor (flat plane at world Z=94 — the END goal of the ascent).
- **Acdream's BSP picks Path 5 (Contact → step_sphere_up → adjust_sphere push-back)** — treats the ramp as a wall to push off. The push-back lifts the sphere by 0.75m and consumes all walk-interp. step_up's placement_insert then fails (the lifted position doesn't validate). step_up returns failure → step_up_slide fires → sphere slides along step_up_normal → loop. Player physically stuck.
**Both retail and ours classify the ramp as walkable** (N.Z=0.695 > FloorZ=0.6642). So the divergence isn't in the walkability check itself. It's in the **path-selection logic** inside `BSPQuery.FindCollisions` that decides whether to fire Path 5 vs Path 6 for a given polygon hit.
**Code anchors for the next session:**
- `src/AcDream.Core/Physics/BSPQuery.cs``FindCollisions` dispatcher. Search for "Path 5" + "Path 6" comments. The path selection branches on `ObjectInfo.State` (Contact flag) + `SpherePath.StepDown` + `SpherePath.StepUp`.
- The grounded player has Contact flag set (per `PhysicsEngine.cs:597-598`). So Path 5 fires first. Path 5 calls step_sphere_up → step_up → step_down (with step_up=1) → recursive BSP query.
- The recursive BSP query (with StepDown=1, StepUp=1) should fire Path 6 — but maybe doesn't, OR fires Path 6 but Path 6's adjust_sphere on the ramp is what produces the broken push-back.
- Retail's BSP behavior at the same site: step_up fires (BP2 hits), but adjust_sphere only fires on FLAT planes (BP5 all flat). So retail's step_down inside step_up doesn't push the sphere off the ramp slope.
## Why the failed attempts today didn't land
| Attempt | What we tried | Why it didn't fix #98 |
|---|---|---|
| Slice 2 v1 (`892019b`) — remove L622 seed | Eliminate the per-tick CP seed | The seed is load-bearing for step_up's AdjustOffset slope-projection on sub-step 1; removed it → all step_up broke |
| Slice 2 v2 (`f8d669b`) — no-op guard in SetContactPlane | Make redundant CP writes a true no-op | Guard doesn't fire for the L622 seed because each tick gets a fresh `Transition` (ci.ContactPlaneValid=false on entry); useful for OTHER call sites but not the seed |
| Slice 3 v1 (`8898166`) — sphere-overlap stickiness | Stop cell-resolver ping-pong | Over-corrected: held player in cellar even during legitimate transition; cellar-up still stuck |
| Slice 3 v2 (`3e140cf`) — point-in stickiness | Less aggressive stickiness | CLOSED the ping-pong (data confirmed: 1 cell-transit vs 20+) but cellar-up still stuck — bug isn't cell-resolver |
| Slice 3 v3 (`8bd3117`) — revert all stickiness | Hypothesis test: prove cell-resolver isn't the bug | Confirmed — cellar-up still stuck even without stickiness |
| Slice 4 (`bbd1df4`) — reset WalkInterp before placement_insert | Match retail's walk_interp=1 reset pattern | Logical retail-faithful improvement but doesn't unblock cellar-up; kept in tree as quality fix |
**Common pattern:** I was guessing fixes at higher levels (cell resolution, CP retention, walk_interp) when the actual bug is deeper in BSP path-selection. The paired retail cdb capture finally pinpointed the divergence.
## State of the four A6.P2 findings
| Finding | Status as of 2026-05-22 EOS |
|---|---|
| Finding 1 — dispatcher entry frequency mismatch | CLOSED (as side-effect of slice 1 Finding 2 fix) |
| Finding 2 — ContactPlane resynthesis blowup | PARTIALLY CLOSED (slice 1 stripped synthesis; slice 2 v2 added no-op guard; L622 seed retained as documented retail divergence per #96) |
| Finding 3 — Indoor cell-resolver instability | CLOSED (slice 3 v2 point-in stickiness; ping-pong fully eliminated per data) |
| Finding 4 — Portal-graph visibility blowup | OPEN as issue #95 (not A6 scope) |
## Known open issues touched by A6 work
| Issue | Status |
|---|---|
| #83 — Indoor multi-Z walking broken | Cellars + 2nd floor walking works; cellar-up still blocked by #98 |
| #88 — Indoor static objects vibrate | Unchanged (deferred; hypothesis: closes with Finding 2 family) |
| #90 — CellId ping-pong workaround | Now REDUNDANT after slice 3 v2; defer A6.P4 removal |
| #95 — Portal-graph visibility blowup | OPEN (not A6 scope) |
| #96 — L622 per-tick CP seed | PARTIALLY ADDRESSED, accepted as documented retail divergence |
| #97 — Phantom collisions + fall-through on 2nd floor | OPEN (not re-tested post-slice-3-revert; hypothesis: same Path-5/Path-6 family as #98) |
| #98 — Cellar ascent stuck at top step | OPEN — **sharp Path-5-vs-Path-6 diagnosis ready for next session** |
## Test suite status
1148 pass + 8 pre-existing fail (baseline maintained throughout the session).
## Next session — concrete starting steps
**Goal:** Fix #98 (cellar ascent stuck at top step) by correcting `BSPQuery.FindCollisions` path-selection so the cellar ramp triggers Path 6 (find_walkable land) instead of Path 5 (Contact step_up push-back).
**Approach:**
1. **Read retail's `BSPTREE::find_collisions` dispatcher** at `acclient_2013_pseudo_c.txt` (search for `BSPTREE::find_collisions`). Note exactly which path it picks for a grounded mover hitting a walkable slope. The 6-path dispatcher is at line ~322984 (where BP4 sits).
2. **Read our `BSPQuery.FindCollisions`** at `src/AcDream.Core/Physics/BSPQuery.cs:1500+`. Identify the path-selection branch that decides Path 5 vs Path 6 for the input `(grounded=true, step_down=false, step_up=false, polygon.N.Z=0.695)` case.
3. **Compare line-by-line.** Likely candidates for the divergence:
- Wrong state flag check (e.g. checking Contact when retail checks something else)
- Wrong walkability gate (e.g. requiring N.Z >= LandingZ when retail requires >= FloorZ)
- Wrong polygon-sidedness check (one-sided poly being treated as two-sided or vice versa)
- Off-by-one in path numbering (Path 5 vs Path 6 swapped in our port)
4. **Fix surgically + verify via re-capture.** Re-run the cellar-up scenario in acdream with `ACDREAM_PROBE_POLY_DUMP=1`. Compare the post-fix `[push-back]` distribution against retail's BP5 distribution from `134c9b8` capture. Target: zero push-back hits on the ramp slope; CP set to flat cottage floor (matching retail).
5. **If the fix lands cleanly:** also re-test #97 (phantom collisions + fall-through on 2nd floor — likely closes as side-effect because it's the same family).
**Files almost certainly touched by the fix:**
- `src/AcDream.Core/Physics/BSPQuery.cs` — path-selection in `FindCollisions`
- Possibly `src/AcDream.Core/Physics/PhysicsGlobals.cs` (LandingZ vs FloorZ threshold mismatch)
**Files that DON'T need changing** (already correct per today's investigation):
- `PhysicsEngine.cs` ResolveCellId (cell-resolver works post-slice-3)
- `PhysicsEngine.cs` L622 seed (retail divergence accepted)
- `TransitionTypes.cs` ValidateTransition (Mechanism B works)
- `TransitionTypes.cs` FindEnvCollisions indoor branch (slice 1 strip is correct)
## Captures available for the next session
| Capture | What it shows |
|---|---|
| `docs/research/2026-05-21-a6-captures/scen4_cottage_cellar_polydump/acdream.log` | Acdream stuck-at-cellar trace with `[poly-dump]` lines showing the ramp polygon vertices |
| `docs/research/2026-05-21-a6-captures/scen4_cottage_cellar_portaldump/acdream.log` | Same cellar with `[cell-cache] portalTargets=...` showing the cellar's portals to 0x0146 + 0x0148 |
| `docs/research/2026-05-21-a6-captures/scen4_cottage_cellar_retail_for_issue98/retail.{log,decoded.log}` | **Retail's successful cellar-up cdb trace — the gold-standard comparison data** |
| `docs/research/2026-05-21-a6-captures/scen4_cottage_cellar_slice3v2/acdream.log` | Pre-slice-3-revert cell-transit pattern (closed ping-pong, point-in stickiness) |
| `docs/research/2026-05-21-a6-captures/scen4_cottage_cellar_slice3v3_revert/acdream.log` | Post-slice-3-revert (no stickiness) — cellar-up still stuck → confirms cell-resolver isn't the bug |
## Pickup prompt for fresh session
Open a new Claude Code session at this worktree's branch
(`claude/strange-albattani-3fc83c`, HEAD at `efb5f2c`). Then paste:
---
```
Pick up A6.P3 — fix issue #98 (cellar ascent stuck at top step).
Read FIRST:
docs/research/2026-05-22-a6-p3-handoff.md
docs/ISSUES.md issue #98 entry (sharp diagnosis section)
Then state both altitudes:
Currently working toward: M1.5 — Indoor world feels right
Current phase: A6.P3 — fix issue #98 BSP path-selection
Next concrete step: read retail's BSPTREE::find_collisions
dispatcher (acclient_2013_pseudo_c.txt) + our BSPQuery.FindCollisions
side-by-side; identify why our code picks Path 5 (Contact step_up)
for the cellar ramp polygon when retail picks Path 6 (find_walkable
land). The ramp is walkable (N.Z=0.695 > FloorZ=0.6642) so Path 6 is
the correct choice for both clients.
Sharp diagnosis (from paired cdb captures committed 2026-05-22):
- Retail's adjust_sphere fires 30x ALL on flat planes (Z=94 cottage main floor)
- Acdream's push-back fires 270x on the RAMP slope (cellar 0xA9B40147 poly 0x0008)
- Retail's BP7 set_contact_plane fires 18x with the SAME flat plane
- Acdream cp-write fires 229,300x with varying planes from many sites
Captures available for comparison:
- docs/research/2026-05-21-a6-captures/scen4_cottage_cellar_retail_for_issue98/
(retail cellar-up cdb trace — gold-standard data)
- docs/research/2026-05-21-a6-captures/scen4_cottage_cellar_polydump/
(acdream stuck-at-cellar with [poly-dump] lines)
DO NOT re-attempt the failed fixes from 2026-05-22 (handoff doc has
the full list with reasons each one didn't land). Specifically:
- Don't try removing the L622 seed (breaks step_up)
- Don't try removing slice-3 stickiness (already reverted; didn't help #98)
- Don't try cell-resolver fixes (Finding 3 is closed)
Fix expected in BSPQuery.cs path-selection (the dispatcher branch
that decides Path 5 vs Path 6 for grounded movers hitting walkable
polys). Likely 5-20 lines of code change once the divergence is found.
After fix lands: re-capture scen4_cottage_cellar with the same probe
env vars to verify acdream now matches retail's flat-plane BP7
pattern. Also re-test #97 (phantom collisions + fall-through on 2nd
floor — hypothesized to close as side-effect of #98 fix).
Test suite baseline: 1148 pass + 8 pre-existing fail. Maintain through
the fix.
CLAUDE.md rules apply. No workarounds without explicit user approval.
Three failed visual verifications = handoff (we hit this 4x on the
2026-05-22 session — discipline check before attempting another guess
fix).
```
---
## References
- A6 design spec: [`docs/superpowers/specs/2026-05-21-phase-a6-indoor-physics-fidelity-design.md`](../superpowers/specs/2026-05-21-phase-a6-indoor-physics-fidelity-design.md)
- A6.P2 findings doc: [`docs/research/2026-05-21-a6-cdb-capture-findings.md`](2026-05-21-a6-cdb-capture-findings.md)
- A6.P1 partial-ship handoff (yesterday): [`docs/research/2026-05-21-a6-p1-partial-ship-handoff.md`](2026-05-21-a6-p1-partial-ship-handoff.md)
- ISSUES.md #98 entry (sharp diagnosis section)
- cdb probe + decoder: `tools/cdb/a6-probe.cdb`, `tools/cdb/decode_retail_hex.py`

View file

@ -0,0 +1,174 @@
# A6.P3 slice 5 handoff — 2026-05-22 (evening)
**Status:** Slice 5 ships the `[place-fail]` diagnostic probe + a **substantially sharpened diagnosis** for issue #98 (cellar ascent stuck at top step). Today's handoff's "Path 5 vs Path 6 in `BSPQuery.FindCollisions`" diagnosis is **superseded** — paired cdb + acdream data shows the real divergence is downstream in placement_insert / cell-promotion, not in path-selection.
**Pasteable session-start prompt at the bottom of this doc.**
---
## TL;DR
Today's morning handoff (`2026-05-22-a6-p3-handoff.md`) said: "fix expected in `BSPQuery.FindCollisions` path-selection (5-20 lines once the divergence is found)."
That diagnosis is **incorrect**. The probe-driven evidence collected this evening shows:
1. **Retail's [BP4] dispatcher trace shows every hit has `collide=0`.** Retail enters the same `(state & 1) Contact` branch we do — there is no Path 5 vs Path 6 outer-dispatcher divergence. Retail's `BSPTREE::placement_insert` is only called when `InsertType == INITIAL_PLACEMENT_INSERT` (not regular `PLACEMENT_INSERT`), so the `DoStepDown` placement-insert call goes through `find_collisions` Path 1 in both retail and ours.
2. **Retail's BP5 (adjust_sphere) fires 17+ times on the cellar ramp polygon** (`n=(0,-0.719,0.695) d=-0.1007`), NOT "30 hits all on flat planes" as the morning handoff claimed. We were misreading the retail data.
3. **The actual blocker is polygon `0x0020` in the cellar cell's BSP**: `n=(0,0,-1) d=-0.2` — a ceiling polygon at world Z=93.82, the underside of the cottage main floor's thickness layer. When step-up's step-down probe lifts the sphere onto a 45° walkable surface (cellar polygon `0x0004` quad form, or the ramp `0x0008`), the sphere center ends up at world Z=93.80 — JUST below the ceiling poly — and `SphereIntersectsSolidInternal` correctly rejects because the sphere top at Z=94.28 overlaps the ceiling polygon.
4. **Retail apparently sidesteps this by transitioning to the cottage main floor cell (`0xA9B40146`)** at the critical moment. Retail's BP7 shows ContactPlane being set to `(0,0,1) d=-93.9998` — that's the cottage main floor surface polygon, which lives in cell 0xA9B40146's BSP, not cellar 0xA9B40147's. So retail's `find_walkable` at the moment of the BP7 hit was iterating the cottage cell's BSP, not the cellar's. The cell promotion happens; ours doesn't.
**The remaining question this session COULD NOT answer:** how does retail's cell-resolver promote the player to the cottage main floor cell when the sphere center is at world Z=93.80 (below the cottage floor surface at Z=94)? This is the next-session target.
## What shipped this session
| Commit | What |
|---|---|
| (this session) | A6.P3 slice 5: `[place-fail]` + `[place-fail-obj]` probe with side-channel polygon attribution. Three files: `PhysicsDiagnostics.cs` (probe gate + emitter + side-channel fields), `BSPQuery.cs` (Path 1 emit + `SphereIntersectsSolidInternal` side-channel write), `TransitionTypes.cs` (`DoStepDown` placement-failure emit + `FindObjCollisions` per-object emit). |
The probe runs zero-cost when off (`ACDREAM_PROBE_PLACEMENT_FAIL=0`).
Test baseline: 1148 pass + 8 pre-existing fail (unchanged).
## The capture evidence
Captures archived to `docs/research/2026-05-21-a6-captures/scen4_cottage_cellar_place_fail/`:
- `acdream.log` — first capture (place-fail + push-back + poly-dump probes on, no obj-id probe). 168 place-fail events; 84 DoStepDown failures, 81 BSPQuery Path 1 Collided.
- `acdream_v2_with_obj_probe.log` — second capture with `[place-fail-obj]` added. 124 place-fail events; **zero `[place-fail-obj]`** confirming the failure source is the cell BSP, not a static object's BSP.
### Aggregated breakdown (acdream.log)
```
=== source breakdown ===
84 source=DoStepDown
67 source=Path1.sphere0
17 source=Path1.sphere1
=== polyId distribution in Path1 lines ===
80 polyId=0x0020 ← n=(0,0,-1) d=-0.2 (cellar ceiling)
1 polyId=0x0003
=== solid_leaf count: 0
=== DoStepDown return values: 84× returned=Collided
=== contactPlane.Nz in DoStepDown failures ===
79 contactPlane.Nz=0.7071 ← 45° walkable (poly 0x0004 quad form)
5 contactPlane.Nz=0.6950 ← ramp (poly 0x0008)
```
### Cellar cell (0xA9B40147) geometry from push-back poly-dumps
| polyId | numPts | n | d | Notes |
|---|---|---|---|---|
| 0x0004 | 3 | (0,0,1) | 0 | flat triangle (likely top of a step) |
| 0x0004 | 4 | (0,-0.707,0.707) | -0.247 | **45° walkable quad — the step that triggers step-up** |
| 0x0008 | 4 | (0,-0.719,0.695) | -0.1007 | **the cellar ramp (46° slope)** |
| 0x0018 | 4 | (0,0,1) | 3.05 | cellar floor (world Z = 94.02 + (-3.05) = 90.97) |
| 0x0019 | 4 | (0,0,1) | 3.05 | cellar floor (additional polygon) |
| 0x001B | 4 | (0,0,1) | 3.05 | cellar floor (additional polygon) |
| **0x0020** | — | **(0,0,-1)** | **-0.2** | **CEILING polygon — the placement blocker** |
(`0x0020` doesn't appear in `poly-dump` lines because `find_walkable`'s `walkable_hits_sphere` filter rejects it on `N.up < walkable_allowance`; only the place-fail probe surfaced it.)
### Cellar cell origin (confirmed by direct probe)
`worldOrigin=(130.5, 11.5, 94.02)` for cell 0xA9B40147. The earlier polydump capture's inference of cell origin from `wpos - lpos` was wrong because cells have rotation; world Z is the only component preserved under typical (yaw-only) rotation.
### Spatial layout
- World Z = 90.97 — cellar floor (polygons 0x0018/19/1B)
- World Z = 93.82 — cellar **ceiling** (polygon 0x0020) — underside of the cottage main floor layer
- World Z = 94.00 — cottage main floor surface (in cell 0xA9B40146)
- World Z = 94.48 — sphere center when "resting on" cottage main floor (radius=0.48)
A sphere with center at world Z between 93.34 (= 93.82 0.48) and 94.48 (= 94 + 0.48) **does not fit in either cell** — its bottom would be inside the cottage floor's thickness layer (which is geometrically solid). The place-fail logs show our sphere stuck at Z=93.80 (the bottom of this "tunnel").
## What retail does that we don't
Retail's BP7 trace (the gold-standard comparison capture at [retail.decoded.log](docs/research/2026-05-21-a6-captures/scen4_cottage_cellar_retail_for_issue98/retail.decoded.log)) shows ContactPlane being set 18 times to `(0,0,1) d=-93.9998` — the cottage main floor surface. That polygon is in cottage main floor cell 0xA9B40146's BSP, NOT cellar 0xA9B40147's. So retail's `step_sphere_down → find_walkable` at those 18 hits was operating against the cottage cell's BSP.
**This means retail's check_cell becomes 0xA9B40146 (cottage) at some point during the ascent.** Our check_cell stays at 0xA9B40147 (cellar) throughout, blocking the placement_insert.
The cell-resolver mechanism for the transition is the open question. Hypotheses:
1. **`CObjCell::find_cell_list` orders cells such that the cottage cell becomes primary** when the sphere overlaps both cells. Our `PhysicsEngine.ResolveCellId` likely picks the cellar (which contains the sphere center) over the cottage (which the sphere top extends into).
2. **Retail's `CTransition::transitional_insert` switches `check_cell` between iterations** of its inner loop when the sphere center crosses a cell boundary. Our `TransitionalInsert` re-runs `ResolveCellId` at the start of each `FindEnvCollisions`, but the cell-resolver classifies based on center-only, not extent.
3. **Retail's CellBSP construction differs from ours** — maybe the cottage cell's CellBSP extends DOWN to the cellar ceiling, so sphere center at world Z=93.80 is "inside" the cottage cell's volume. Our parse may have a different boundary.
## Why I didn't ship a fix tonight
Per CLAUDE.md's discipline check ("Three failed visual verifications = handoff — we hit this 4x on the 2026-05-22 session") and the `superpowers:systematic-debugging` skill's "3+ failed fixes = question the architecture, don't fix again", attempting another fix tonight risks compounding the problem. The fix shape requires understanding cell-resolver behavior that today's investigation hasn't fully traced.
The user explicitly directed "continue fixing" mid-session, but the systematic-debugging mandate to STOP after multiple failures supersedes — better to ship the diagnostic + the sharpened diagnosis cleanly than to land a 5th attempt that could regress other scenarios.
## Concrete next-session pickup steps
1. **Capture retail at the cell-transition moment.** Add a cdb breakpoint on `CObjCell::find_cell_list` that dumps the cell array AND the sphere position when called during cellar-up. Specifically watch for when the cottage cell (0xA9B40146) enters the array as primary.
2. **Compare to our `PhysicsEngine.ResolveCellId` behavior** at the same sphere position. Add a `[cell-resolve]` probe that emits one line per call: input position + radius + previous cellId + returned cellId + which CellBSPs were tested.
3. **Likely fix targets (in order of probability):**
- `PhysicsEngine.ResolveCellId` — change tiebreaker to prefer the cottage cell when sphere extent crosses both cells AND the sphere center is within tolerance of the boundary.
- `Transition.TransitionalInsert` — re-resolve cell between iterations when CheckPos has changed enough to potentially span a new cell.
- `PhysicsDataCache.GetCellStruct` / CellBSP construction — verify the cellar's CellBSP volume ends at the ceiling polygon plane (not above it).
4. **DO NOT attempt:**
- Modifying `BSPQuery.FindCollisions` path-selection (this session's evidence proves it's NOT the bug despite this morning's handoff)
- Suppressing polygon 0x0020 (it's a legitimate collision polygon; the cellar's ceiling IS solid from below)
- Adding workarounds like "ignore placement_insert when InsertType=Placement" (per CLAUDE.md: no workarounds without approval)
5. **Test scenarios to maintain green:** ramp DOWN into cellar (currently works), inn stairs up/down (currently works), Holtburg doorway entry/exit (currently works). The fix must preserve these.
## Files touched this session
- [`src/AcDream.Core/Physics/PhysicsDiagnostics.cs`](src/AcDream.Core/Physics/PhysicsDiagnostics.cs) — added `ProbePlacementFailEnabled` + side-channel + `LogPlacementFail`.
- [`src/AcDream.Core/Physics/BSPQuery.cs`](src/AcDream.Core/Physics/BSPQuery.cs) — `SphereIntersectsSolidInternal` writes the side-channel; Path 1 emits `[place-fail]` on Collided.
- [`src/AcDream.Core/Physics/TransitionTypes.cs`](src/AcDream.Core/Physics/TransitionTypes.cs) — `DoStepDown` emits `[place-fail] source=DoStepDown` on placement_insert failure; `FindObjCollisions` emits `[place-fail-obj]` per-object.
## Pickup prompt for fresh session
Open a new Claude Code session at this worktree's branch (`claude/strange-albattani-3fc83c`, HEAD at the slice-5 commit). Then paste:
---
```
Pick up A6.P3 slice 6 — fix issue #98 (cellar ascent stuck at top).
Read FIRST:
docs/research/2026-05-22-a6-p3-slice5-handoff.md
docs/ISSUES.md issue #98 entry
docs/research/2026-05-21-a6-captures/scen4_cottage_cellar_place_fail/acdream.log
Then state both altitudes:
Currently working toward: M1.5 — Indoor world feels right
Current phase: A6.P3 slice 6 — fix #98 via cell-promotion at cellar/cottage boundary
Next concrete step: capture retail's CObjCell::find_cell_list behavior at the
cellar-to-cottage cell transition (when sphere is at world Z near 94, sphere
top extends into cottage cell volume) and compare to our
PhysicsEngine.ResolveCellId. The fix is in cell-resolver, NOT BSPQuery.
Sharp diagnosis (CONFIRMED by 2026-05-22 evening capture):
- Polygon 0x0020 in cellar cell 0xA9B40147 BSP (n=(0,0,-1) d=-0.2, world Z=93.82)
correctly rejects placement_insert when sphere top extends past it.
- Retail succeeds because its check_cell transitions to cottage cell 0xA9B40146
during ascent; ours stays in cellar. Cell-resolver fix needed.
- The 2026-05-22 morning handoff's "Path 5 vs Path 6 in BSPQuery.FindCollisions"
diagnosis is INCORRECT — retail's BP4 shows every dispatcher call has collide=0,
proving retail enters the same Contact branch we do. The bug is downstream.
DO NOT re-attempt:
- Path-selection in BSPQuery.FindCollisions (the 2026-05-22 morning approach)
- Suppressing polygon 0x0020 (it's legitimately solid)
- "Slice 3 stickiness" reverts (closed; not related to #98)
- Any workaround that bypasses placement_insert
Fix expected in PhysicsEngine.ResolveCellId or Transition.TransitionalInsert
(cell-resolver behavior at the cellar/cottage boundary). Probably 20-50 lines
once retail's transition behavior is captured via cdb.
Test baseline: 1148 + 8. Maintain.
CLAUDE.md rules apply. No workarounds without explicit approval.
```

File diff suppressed because it is too large Load diff

File diff suppressed because it is too large Load diff

File diff suppressed because it is too large Load diff

View file

@ -0,0 +1,649 @@
# A6.P3 #98 — Comparison harness shipped, root cause identified
**Session:** 2026-05-23 evening (continuation of full-day session)
**Worktree:** `C:\Users\erikn\source\repos\acdream\.claude\worktrees\strange-albattani-3fc83c`
**Branch:** `claude/strange-albattani-3fc83c`
Read this AFTER the morning's handoff doc
([`2026-05-23-a6-p3-issue98-harness-handoff.md`](2026-05-23-a6-p3-issue98-harness-handoff.md)) —
this picks up from "Option A: build the side-by-side comparison harness" and
documents the FIRST evidence-driven step in the saga.
---
## TL;DR
**Updated 2026-05-23 evening v3: NEW root-cause hypothesis identified —
STALE RAMP CONTACT PLANE causes per-tick Z drift, which is what makes
the cottage-floor cap reachable in the first place.**
- Player position at cap: world (141.5, 7.2, 92.7). The cellar ramp's
actual world XY is X=[129.7, 131.3] — the player is **10 meters away
from the ramp** in cell-local space.
- Body's contact plane: ramp's plane (n=(0, 0.719, 0.695), d=-69.5035).
Stale; should be the flat cellar floor (n=(0,0,1)).
- AdjustOffset projects forward motion along that stale ramp plane.
Mathematically: requested delta (+0.0266, -0.4022, 0) → projected
(+0.0266, -0.1943, +0.2010). **+0.2010 m of Z lift per tick.**
- After enough horizontal-walking ticks, the head sphere rises to
Z=94 and hits the cottage floor's downward-facing back-face polygon.
Cap fires.
- The cap is a SYMPTOM. The root cause is the contact plane not
refreshing when the player walks off the ramp onto the flat cellar
floor. Retail must re-find the walkable plane each tick; we're
keeping the stale ramp seed.
**This explains why six prior fix attempts missed.** Step-up,
AdjustOffset projection, SidesType, edge-slide, +X residual — all
were investigating the cap event mechanics, not the upstream Z drift
that made the cap reachable. The harness convergence (Section "What
shipped 2026-05-23 evening v2") is still valuable as the deterministic
reproduction infrastructure; the new hypothesis is the **next** thing
to verify against that infrastructure.
(Sections below preserve the evening-v2 arc for context: apparatus +
cap-event reproduction.)
- **Evidence-driven apparatus shipped.** `PhysicsResolveCapture` writes one
JSON Lines record per player ResolveWithTransition call when
`ACDREAM_CAPTURE_RESOLVE=<path>` is set. 41,228 records from a single
cellar-walk session.
- **Comparison test reproduces the cap divergence on the first try.** The
new `LiveCompare_*` tests in `CellarUpTrajectoryReplayTests.cs` load three
representative records (spawn, on-ramp, first-cap) and replay them
through the harness engine. Spawn + on-ramp PASS bit-perfect; first-cap
FAILED with a clear divergence — the right divergence.
- **Root cause identified: the cottage GfxObj was missing from the harness.**
Live cap attributes the blocking entity to `obj=0xA9B47900` — a
landblock-baked static building. The cottage's floor polygons live in
this GfxObj's polygon table (registered as a ShadowEntry), NOT in any
cottage CELL.
- **Apparatus convergence (v2 update).** With the cottage GfxObj
`0x01000A2B` extracted via the new `ACDREAM_DUMP_GFXOBJS` infrastructure
and registered as a ShadowEntry in `BuildEngineWithCellarFixtures`, the
harness now reproduces the live `cn=(0,0,-1)` cap exactly. The
full per-field round-trip reveals one residual: live preserves
+0.0266 m of +X motion through the cap; harness blocks all motion.
That's the next investigation target — see the "Residual divergence"
section below.
- **Not a step-up / AdjustOffset bug.** The head sphere (top at Z=foot+1.2)
hits the cottage floor at Z=94.0 from BELOW. Math: cap at foot Z=92.74
matches 94.0 1.2 = 92.80. Confirmed by user reporting same cap when
JUMPING in the cellar (purely vertical motion). The retail comparison
question is now sharpened to "how does live's post-cap edge-slide
preserve the +X component that the harness drops?"
---
## What ran this session (chronological, 3 commits)
| Commit | What |
|---|---|
| `fb5fba6` | Apparatus: `PhysicsResolveCapture` static class + JSON Lines writer + body snapshot record + capture probe in `ResolveWithTransition` + smoke tests (capture writes when IsPlayer + enabled, skips otherwise) |
| `44614ab` | Comparison test: 3 fixture records sampled from live capture + 3 `LiveCompare_*` tests + diagnostic dump that prints cell polygons in world frame |
| `0f2db62` | Converted FirstCap test to documents-the-bug pattern (passes while harness lacks cottage GfxObj; fails when added) |
Live capture launches:
- `launch-a6-issue98-capture.ps1` — first capture run (no probes beyond cell-transit). Produced `a6-issue98-resolve-capture.jsonl` (12 MB, 5789 records when checked mid-session, finished at 91 MB / 41,228 records).
- `launch-a6-issue98-polydump.ps1` — second capture with `ACDREAM_PROBE_POLY_DUMP`, `ACDREAM_PROBE_PUSH_BACK`, `ACDREAM_PROBE_RESOLVE`, `ACDREAM_PROBE_INDOOR_BSP`, and `ACDREAM_DUMP_CELLS` covering 0xA9B40140-0xA9B4014F. Produced `a6-issue98-resolve-capture-2.jsonl` (135 MB, 70,572 records) plus 16 cell-dump JSON fixtures and a launch log with 214 [poly-dump] entries.
---
## The apparatus (committed code)
### `PhysicsResolveCapture` ([`src/AcDream.Core/Physics/PhysicsResolveCapture.cs`](../../src/AcDream.Core/Physics/PhysicsResolveCapture.cs))
Static module. When `ACDREAM_CAPTURE_RESOLVE=<path>` is set, every player-side
`PhysicsEngine.ResolveWithTransition` call appends one JSON Lines record:
```json
{
"tick": 0,
"timestampMs": 40919993,
"input": { ... full inputs ... },
"bodyBefore": { ... full PhysicsBody snapshot ... },
"result": { ... full ResolveResult ... },
"bodyAfter": { ... full PhysicsBody snapshot ... }
}
```
Filtered to `IsPlayer` mover flag so NPC / remote DR calls don't pollute.
Thread-safe writer with per-record flush. Process-exit hook for clean
shutdown.
### Comparison harness ([`tests/AcDream.Core.Tests/Physics/CellarUpTrajectoryReplayTests.cs`](../../tests/AcDream.Core.Tests/Physics/CellarUpTrajectoryReplayTests.cs))
Three `LiveCompare_*` tests + one diagnostic dump:
| Test | Outcome | Meaning |
|---|---|---|
| `LiveCompare_Tick0_Spawn` | PASSES | Spawn at Z=92.5333; engine matches live bit-perfect |
| `LiveCompare_Tick376_OnRamp` | PASSES | Player on ramp at Z=91.49; ramp walkable polygon hydrates correctly, engine reproduces live |
| `LiveCompare_FirstCap_HarnessMissesCottageFloorBecauseCottageGfxObjNotRegistered` | PASSES (documents the bug) | Live cap at Z=92.74 with cn=(0,0,-1); harness does NOT reproduce because cottage GfxObj isn't registered |
| `LiveCompare_FirstCap_DiagnosticDump` | PASSES (probe-only) | Prints cell polygons in world frame + enables every probe — captured stdout shows harness BSP query path |
The diagnostic dump test runs the cap replay with `[poly-dump]`, `[push-back]`,
`[indoor-bsp]`, `[step-walk]` probes ALL enabled. The captured stdout shows:
```
[cell-dump] 0xA9B40147 resolved-poly-count=37
poly id=0x0018 ... worldVerts=[(140.12,11.50,90.95),...(142.10,11.50,90.95)] ← cellar floor
poly id=0x0001 ... worldVerts=[(142.10,11.50,93.80),...(140.50,8.70,93.80)] ← cellar ceiling
[cell-dump] 0xA9B40143 resolved-poly-count=14
poly id=0x0004 ... worldVerts=[(136.70,3.90,94.00),(140.50,3.90,94.00),(140.50,8.70,94.00)] ← cottage floor (triangle)
... more cottage floor triangles, all at world Z=94.00 ...
[other-cells] primary=0xA9B40147 iter=0xA9B40143 wpos=(141.605,7.097,93.351) result=OK poly=n/a
[other-cells] primary=0xA9B40147 iter=0xA9B40146 wpos=(141.605,7.097,93.351) result=OK poly=n/a
```
Both other-cells iterations return OK — the cottage floor polys in
0xA9B40143 don't extend to the sphere's XY (X=141.39 > rightmost-vertex
X=140.50). So the harness sees no collision, even though the live engine
does.
---
## How we identified the missing object (it's NOT a cell)
The second capture pass enabled `ACDREAM_PROBE_RESOLVE=1`, which logs
each call's hit details including the entity guid of the blocking object.
The cap event prints:
```
[resolve] ent=0x000F4240 in=(141.605,7.304,92.656) tgt=(141.624,6.875,92.656)
out=(141.605,7.304,92.656) ok=True groundedIn=True cp=valid
hit=yes n=(0.00,0.00,-1.00) obj=0xA9B47900 walkable=True
```
**obj=0xA9B47900** is in the landblock-baked static range (0xA9B47XXX
guids belong to landblock 0xA9B4's static objects). This is the cottage
BUILDING as a GfxObj registered as a ShadowEntry on the landblock —
NOT a cottage cell.
The harness's `BuildEngineWithCellarFixtures` loads three CELL fixtures
(0xA9B40143, 0xA9B40146, 0xA9B40147) but **does not register any
landblock-baked static**. There IS a `RegisterStairRampGfxObj` helper
that constructs ONE polygon (the ramp), but it's commented out today.
So the missing apparatus is: register the cottage GfxObj as a ShadowEntry
with its FULL polygon table — ramp + walls + floor + ceiling. Once
registered, the harness's multi-cell BSP iteration's
`FindObjCollisions` will query the GfxObj's BSP and find the cottage
floor polygon's downward-facing plane just like live.
---
## The cap geometry (math)
Live capture analysis confirmed the sphere physics:
- Foot sphere center at world Z = foot_z, radius 0.48m
- Head sphere center at world Z = foot_z + sphereHeight = foot_z + 1.2m
- Head sphere top at Z = foot_z + 1.2 + 0.48 = foot_z + 1.68m
Cap point in live capture: foot_z = 92.7390 (from tick 1183).
Predicted head sphere position: head center = 93.9390, head top = 94.4190.
The cottage floor is at world Z = 94.0 (from cell 0xA9B40143's poly 0x04
worldVerts: `(136.70,3.90,94.00)`, etc.).
**Head sphere center at Z=93.94 is BELOW the cottage floor at Z=94.0 by 0.06.**
**Head sphere top at Z=94.42 is ABOVE the cottage floor by 0.42.**
The head sphere PENETRATES the cottage floor. BSP push-back direction
is the negative of the polygon's outward normal (which is +Z facing UP),
so push-back direction is Z (pushes sphere DOWN). That matches the
live cn=(0,0,-1).
The "exact" cap position: foot_z when head center is at Z=94.0 (just
touching). foot_z = 94.0 1.2 = 92.80. The observed cap at foot_z=92.74
is ~0.06 below the predicted (push-back includes epsilon and walk-interp
adjustments).
---
## User's confirming observation
> "I noticed a thing. When I jump in the cellar, I'm getting blocked at
> the same height (I think) as I am when running up the stairs."
This is the key observation that nailed the diagnosis. **Jumping is
pure vertical motion** — no ramp slope, no AdjustOffset projection. If
the cap fires on a pure jump, the obstruction must be a horizontal
geometric obstacle at the cap height. That immediately rules out every
step-up / AdjustOffset hypothesis from the prior 6+6 saga and pinpoints
the bug as a head-sphere head-on collision with a cottage-floor
polygon facing DOWN.
---
## What's NOT yet known
1. **Why retail doesn't have this cap.** Either:
- (a) Retail's cottage GfxObj has a HOLE in the floor above the ramp
(cottage floor polygons stop at the ramp opening; our dat-read
produces a contiguous floor)
- (b) Retail's BSP query treats single-sided polygons correctly
(cottage floor's SidesType allows collision from +Z side only,
not from Z side; we treat it as both-sided)
- (c) Retail uses portal-aware collision: when the sphere is inside
the cellar EnvCell, queries skip polygons that belong to the
cottage portal's "other side"
Need a retail cdb trace at the ramp-top to disambiguate.
2. **The cottage GfxObj's full polygon list.** We have the ramp polygon
(poly 0x0008 in the cottage GfxObj, normal (0,-0.719,0.695)) and we
know the floor polygon is at Z=94.0 with normal (0,0,-1) or (0,0,+1).
We do NOT have:
- the full polygon list of GfxObj 0xA9B47900
- the cottage GfxObj's id, BSP root, or scale/rotation
These can all be extracted by enabling `ACDREAM_PROBE_BUILDING=1` for
a future capture — the `[resolve-bldg]` probe dumps per-poly geometry
when a building shadow entry is hit.
3. **`ACDREAM_PROBE_POLY_DUMP` doesn't fire for the cottage hit.** The
[poly-dump] probe is wired into `AdjustSphereToPlane`, but the
cottage-floor collision goes through `FindObjCollisions`
`BSPQuery.FindCollisions` on the GfxObj's internal BSP — a different
code path. Future probing should use `ACDREAM_PROBE_BUILDING` instead
to capture the per-object collision details.
---
## Next-session pickup
### What shipped 2026-05-23 evening v2 (post-prior-section)
Three commits land apparatus convergence on the cap event:
| Commit | What |
|---|---|
| `cc3afbc` | **GfxObj dump infrastructure.** Mirrors `ACDREAM_DUMP_CELLS`: new env var `ACDREAM_DUMP_GFXOBJS` triggers `PhysicsDataCache.CacheGfxObj` to write the full resolved polygon table as JSON, suffix `.gfxobj.json` so dumps don't collide with cell dumps in the same dir. New `GfxObjDump` DTO + `GfxObjDumpSerializer` parallel to `CellDump`; round-trip tests cover Capture / Write / Read / Hydrate; the Hydrate path constructs a synthetic single-leaf BSP for query coverage. |
| `97fec19` | **Harness reproduces the cottage-floor cap event.** `BuildEngineWithCellarFixtures` now registers a stub landblock 0xA9B40000 (TerrainSurface at z=-1000) so `TryGetLandblockContext` succeeds at the cellar XY, plus a new `RegisterCottageGfxObj` helper that loads the dumped cottage GfxObj fixture, hydrates it with synthetic BSP, and registers as a ShadowEntry at world (130.5, 11.5, 94.0) with 180° Z rotation — matching production's `GameWindow.cs:5893` registration shape for landblock-baked statics. The cottage fixture (74 polys, 6 downward-facing floor triangles, BSP radius 13.989 m) lives at `tests/.../Fixtures/issue98/0x01000A2B.gfxobj.json`; capture launch script is `launch-a6-issue98-cottage-gfxobj-dump.ps1`. |
Test outcome at apparatus convergence:
| Test | Outcome | Meaning |
|---|---|---|
| `LiveCompare_Tick0_Spawn` | PASS | Spawn round-trip preserved by the new landblock + cottage state |
| `LiveCompare_Tick376_OnRamp` | PASS | On-ramp round-trip preserved |
| `LiveCompare_FirstCap_HarnessReproducesCottageFloorCapNormal` | PASS (NEW) | Harness reproduces the live cn=(0,0,-1) cap-event normal exactly |
| `LiveCompare_FirstCap_ResidualXMotionDivergence_DocumentsNextInvestigation` | PASS (documents-the-bug) | Captures the ONE remaining post-cap divergence: live preserves +0.0266 m of +X motion through the cap (edge-slide along the cottage floor in XY); harness blocks ALL motion. Y and Z agree. |
### The residual divergence (next investigation target)
After registering the cottage GfxObj:
```
Live: cn=(0,0,-1), position=(141.3865, 7.2243, 92.7390) ← +X motion preserved
Harness: cn=(0,0,-1), position=(141.3599, 7.2243, 92.7390) ← X stuck at input
Input: currentPos=(141.3599, 7.2243, 92.7390)
targetPos =(141.3865, 6.8221, 92.7390)
requestedDelta=(+0.0266, -0.4022, 0)
```
The cap-event collision normal matches bit-perfect. Position diverges
in X only. Working hypothesis: live's response to a `cn=(0,0,-1)`
head-bump treats it as a Z-only constraint and edge-slides the
remaining XY component along the cottage floor; harness's BSP path is
rejecting the entire move vector instead of computing a slid offset.
That hypothesis is the next-session investigation target — work the
slide path in `Transition.transitional_insert` / `AdjustOffset` against
the production cap-event call. The new
`LiveCompare_FirstCap_ResidualXMotionDivergence_DocumentsNextInvestigation`
test PASSES today (asserting the current residual) and FAILS when the
divergence closes — that's the signal to flip it into
`AssertCallMatchesCapture` form.
### Alternative pickup move: retail cdb trace at the cottage ramp-top
If apparatus polish is enough and the user wants to widen the question
to "how does retail differ?", attach cdb to a running retail acclient
(see CLAUDE.md "Retail debugger toolchain"), set breakpoints on
`BSPTREE::find_collisions` and `CGfxObj::shadow_find_obj_collisions`,
walk up the cottage ramp, and log every BSP query against the cottage
GfxObj. Compare which polygons retail finds vs which polygons our
acdream engine finds. Retail's trace is the ultimate oracle for the
"how does retail differ?" question — but the apparatus-side X residual
investigation is the more focused, faster-feedback next step.
### Pre-existing test flakiness (out of scope but documented)
While verifying the cottage helper, the full `dotnet test` serial run
produced 819 failures across 1192 tests depending on order — the
suite has static-state leakage between test classes (likely from
`PhysicsResolveCapture.CapturePath`, `PhysicsDiagnostics.Probe*Enabled`,
and similar global mutators). The flakiness is **independent of A6.P3**:
stashing the cottage helper out and rerunning produces the same flaky
range. All 21 issue-#98-relevant tests (12 harness + 4
`GfxObjDumpRoundTripTests` + 1 new `PhysicsDiagnosticsTests` + 4
`CellDumpRoundTripTests`) pass deterministically in isolation.
---
## Apparatus that exists to use
| Tool | Location | Status |
|---|---|---|
| `PhysicsResolveCapture` | `src/AcDream.Core/Physics/` | Production-ready; env-var gated; off by default |
| `LiveCompare_*` tests | `tests/.../CellarUpTrajectoryReplayTests.cs` | 4 tests; 1 documents the bug, 3 are matches |
| `live-capture.jsonl` fixture | `tests/.../Fixtures/issue98/` | 3 representative records (spawn, on-ramp, first-cap) |
| `launch-a6-issue98-capture.ps1` | worktree root | Capture-enabled launch (no diagnostic probes) |
| `launch-a6-issue98-polydump.ps1` | worktree root | Capture + poly-dump + push-back + dump-cells launch |
| 16 cell-dump fixtures | `tests/.../Fixtures/issue98/0xA9B4014X.json` | All cells in 0xA9B4014X range from second capture |
| 41K-record live capture | `a6-issue98-resolve-capture.jsonl` (gitignored size) | First capture — full session of cellar movement |
| 70K-record live capture w/ probes | `a6-issue98-resolve-capture-2.jsonl` | Second capture — included poly-dump events |
| `a6-issue98-polydump-launch.log` | worktree root | 56K+ line log with [resolve], [poly-dump], [other-cells], [indoor-bsp] events |
---
## The stale-contact-plane finding — full evidence (2026-05-23 evening v3)
### How the question led to the answer
User asked: "We know how retail OPENs it from above, how hard can it
be to know how to open it from below?" — the implicit question being
"if walking on the cottage floor from above works fine, why doesn't
walking up from below?"
That reframed the investigation. The cottage floor is the SAME
polygon set whether viewed from above (walking on it) or below
(head-bumping it from the cellar). Retail handles both. If our cap
fires from below, what's different about our state?
Tracing the harness's `LiveCompare_FirstCap_DiagnosticDump` output
revealed:
1. **The contact plane the engine started with**: ramp's plane
`n=(0, 0.7190, 0.6950), d=-69.5035`. From the live capture's
`bodyBefore.contactPlane`.
2. **Cellar ramp's actual world position**: vertices computed from
the cellar cell's fixture put the ramp at world
X∈[129.7, 131.3], Y∈[10.19, 13.09], Z∈[92.5, 95.5]. The ramp is
in the +Y corner of the cellar, ~1.6 m wide.
3. **Player position at cap**: world (141.5, 7.22, 92.74). 10+ m
away from the ramp in X.
4. **The +Z drift math**: `AdjustOffset` projects the requested
motion onto the plane perpendicular to the contact-plane normal:
- requested = (+0.0266, -0.4022, 0)
- dot(requested, ramp normal) = 0·0.0266 + 0.719·(-0.4022) +
0.695·0 = -0.2892
- projected = requested - (-0.2892)·rampNormal =
(+0.0266, -0.1943, +0.2010)
- **+0.2010 m of Z gain per tick**, applied because the contact
plane the engine believes the player is on is the slope.
5. **The cap math**: foot Z at cap = 92.74. Head sphere center at
foot Z + sphereHeight 1.2 = 93.94. Head sphere top at
foot Z + 1.68 = 94.42. **Cottage floor at world Z=94.00.** Head
sphere top exceeds cottage floor by 0.42 m → cap fires from
below.
If the contact plane were the flat cellar floor (n=(0,0,1) at
Z=90.95) instead of the ramp, AdjustOffset's projection would
produce zero Z gain (requested motion has no Z component, projection
onto flat-floor plane preserves XY). No drift, no cap.
### Why this fits the user-facing bug
- "Stuck climbing cellar" — the player walks forward, accumulates Z,
bumps cottage floor, can't progress. Matches what the user sees.
- "Pure jump in cellar caps at same Z" — jumping doesn't refresh the
contact plane either. Drift continues. Matches.
- "Six prior fix attempts failed" — all attempted to fix the CAP
mechanics (step-up, slope projection at the cap, edge-slide). None
questioned why the contact plane was the ramp at all.
### What still needs verification (next session's task)
1. **Chronological evidence**: walk the live capture from the start of
the cellar session. When did the player last stand on the actual
ramp? Does `bodyBefore.contactPlane` persist as the ramp's plane
across many ticks of horizontal walking? Quantify the cumulative
Z drift.
2. **The walkable-refresh gap**: where in
`Transition.FindEnvCollisions` / `SpherePath.SetWalkable` /
related is the contact plane supposed to be refreshed when the
sphere is over a different walkable polygon? Retail's
`CObjCell::find_env_collisions` is the decomp anchor — find the
path that detects a NEW walkable and overwrites the contact
plane, and find where our engine skips that.
3. **Retail cdb cross-check** (optional, definitive): attach cdb to a
running retail acclient, walk to a cottage cellar, log the
contact plane each tick. If retail's contact plane refreshes
to (0,0,1) when the player walks off the ramp, hypothesis
confirmed.
---
## Pickup prompt for next session
```
A6.P3 #98 — apparatus convergence landed, NEW root-cause hypothesis
(stale ramp contact plane) needs verification.
Read FIRST (in order, ~15 min):
1. docs/research/2026-05-23-a6-p3-issue98-comparison-harness-findings.md
— start with TL;DR (evening v3 update at top), then the section
"The stale-contact-plane finding — full evidence" near the bottom.
Skip the middle sections (evening v1 + v2 arcs) unless context is
needed.
2. CLAUDE.md "Current A6 phase" block — look for the "Evening v3
finding" paragraph.
3. tests/AcDream.Core.Tests/Physics/CellarUpTrajectoryReplayTests.cs
— the RegisterCottageGfxObj helper + 2 LiveCompare_FirstCap_*
tests are what you'll iterate against.
State both altitudes (one sentence each):
Currently working toward: M1.5 — Indoor world feels right
Current phase: A6.P3 — apparatus convergence shipped (cap event
reproduces bit-perfect). New root-cause hypothesis: stale ramp
contact plane causes per-tick Z drift that makes the cap reachable.
Needs verification.
What was shipped today (3 commits — DO NOT REDO):
- cc3afbc: GfxObj dump infrastructure (ACDREAM_DUMP_GFXOBJS)
- 97fec19: Harness reproduces cottage-floor cap (RegisterCottageGfxObj)
- 7729bdc + (this commit): findings doc + CLAUDE.md updates
The hypothesis with full math:
- Body's contact plane = ramp's plane (n=(0,0.719,0.695), d=-69.5035)
- Player position at cap = world (141.5, 7.22, 92.74)
- Cellar ramp's actual world XY = X∈[129.7, 131.3] — 10m from player
- AdjustOffset projects requested move along contact-plane perpendicular
- Per-tick Z gain ≈ 0.201m from slope projection on STALE ramp plane
- Accumulates over ticks → head sphere reaches Z=94 → bumps cottage
floor → cap fires
- If contact plane refreshed to flat cellar floor (n=(0,0,1)) when
player walks off ramp, no Z drift, no cap
Concrete next moves (in order):
(1) **Verify the hypothesis chronologically.** Walk
a6-issue98-resolve-capture-2.jsonl (or the cottage capture
fixture's full file) from the start. Find when the player last
stood on the actual ramp (within world X∈[129.7, 131.3], Y∈[10.19,
13.09]). Quantify: how many ticks does the body's contact plane
persist as the ramp's plane while the player walks horizontally
away? Compute the cumulative Z drift. Should match observed Z=92.74
at cap if the hypothesis holds. (Probably 30 min PowerShell jq.)
(2) **Locate the walkable-refresh code path.** In
src/AcDream.Core/Physics/TransitionTypes.cs, search for where
Transition.FindEnvCollisions or SpherePath.SetWalkable is supposed
to detect a new walkable polygon under the sphere and overwrite
the contact plane. The fix likely lives at the call site that
EITHER fails to fire OR fires but doesn't replace the existing
contact plane.
(3) **Cross-ref retail decomp.** acclient_2013_pseudo_c.txt's
CObjCell::find_env_collisions + the walkable-detection chain.
Find the path where retail unconditionally replaces
contact_plane when a new walkable is found. Quote the line
numbers in the fix commit.
(4) **Implement the fix + verify against harness.** The harness's
LiveCompare_FirstCap_HarnessReproducesCottageFloorCapNormal test
currently PASSES asserting the cap reproduces. After the fix,
if the contact plane refreshes correctly, the cap should NOT fire
(no Z drift to make it reachable). The test should start FAILING
— that's the signal the fix works.
(5) **Visual verification (user-side).** Launch acdream live, walk
into a Holtburg cottage, down to the cellar, then back up. The
user-facing bug should resolve if the hypothesis is correct.
Decomp grep targets:
- CObjCell::find_env_collisions
- CPhysicsObj::find_object_collisions
- CTransition::find_walkable
- CSpherePath::set_walkable / walkable_hits_sphere
- OBJECTINFO::object → contact_plane writes
CLAUDE.md rules apply throughout:
- NO speculative fixes — the saga's converted to evidence-driven.
Verify hypothesis with chronological capture BEFORE coding.
- Visual verification belongs to the user.
- If the chronological verification (step 1) shows the contact
plane is NOT actually stale across many ticks, the hypothesis is
wrong — pivot to retail cdb trace (definitive oracle).
Out-of-scope but observed: pre-existing test suite has 819 failures
across runs of the same code due to static-state leakage between test
classes (PhysicsResolveCapture, PhysicsDiagnostics statics). Targeted
issue-#98 tests pass deterministically in isolation. Don't touch the
flakiness this session; it's a separate investigation.
Test baseline: harness's 12 CellarUpTrajectoryReplayTests + 4
GfxObjDumpRoundTripTests + 1 new PhysicsDiagnosticsTests + 4
CellDumpRoundTripTests all pass in isolation. Maintain.
Test baseline: 1178 + 8 pre-existing failures (serial run).
Maintain throughout. The previously-failing
LiveCompare_FirstCap_HarnessMissesCottageFloorBecauseCottageGfxObjNotRegistered
test is now in documents-the-bug form (PASSES while bug exists; FAILS
when fix lands) — flip it when the cottage GfxObj is registered.
```
---
## Resolution 2026-05-24
### What was wrong with the evening-v3 hypothesis
The v3 "stale ramp contact plane" hypothesis (top of this doc) was
**FALSIFIED** by chronological walk of `a6-issue98-resolve-capture-2.jsonl`:
- Player position at the first cap event (tick 55101, line 55102 of the
JSONL): world `(141.605, 7.304, 92.656)`
- `bodyBefore.walkableVertices`: the ramp polygon at world
X∈[140.5, 142.1], Y∈[5.80, 8.70], Z∈[90.99, 93.99]
- Player XY is **inside** the ramp polygon's footprint
- `bodyBefore.contactPlane.normal` = (0, 0.7189884, 0.69502217) — the
ramp's plane
The v3 doc claimed "ramp at world X∈[129.7, 131.3], 10m away from
player." That geometry was computed from a wrong source (not the actual
ramp polygon). The live capture's `walkableVertices` are the ground
truth and show the player IS on the ramp at the cap event. The contact
plane is the ramp's plane because the player is on the ramp — correct,
not stale.
Tick 55020 (line 55021) shows the contact plane refreshing in real time
as the player crossed onto the ramp: `bodyBefore` had the previous
polygon's plane, `bodyAfter` had the ramp's plane. The walkable-refresh
chain works. No drift mechanism exists in the way v3 described.
### What the actual mechanism was
The evening-v2 finding was correct: head-sphere bumps the cottage
GfxObj's downward-facing floor poly (poly 0 in the GfxObj fixture, a
triangle covering world X∈[136.3, 142.5], Y∈[3.5, 19.5], Z=94) from
below. Player at (141.605, 7.304) is inside that triangle. Head sphere
top at Z=foot+1.68=94.336 penetrates the cottage floor at Z=94 by
0.336m → cn=(0,0,-1) push-back → stuck.
Why retail doesn't have this cap: decomp grep of
`CObjCell::find_obj_collisions` (line 308916) shows retail iterates
`this->shadow_object_list` — a **per-cell list**. `CObjCell::find_cell_list`
(line 308742) branches indoor/outdoor at registration time: indoor adds
only the indoor cell + portal-visible neighbors; outdoor adds all
overlapping outdoor cells via `add_all_outside_cells`. So a landblock-
baked static like the cottage gets added to outdoor cells'
shadow_object_list only — never to indoor EnvCells like the cellar.
`CEnvCell::find_collisions` therefore never tests the sphere against
the cottage when sphere is inside the cellar.
`sides_type` (the polygon flag the v2 finding option (b) speculated
about) does NOT affect retail's BSP collision code — it only appears in
rendering/mesh-batch code. The collision-path divergence is purely
architectural: per-cell list vs spatial-radius registry.
### What shipped (commit b3ce505)
Smallest behavioral patch matching retail's effect at the query level:
- `ShadowObjectRegistry.GetNearbyObjects` gained an optional
`primaryCellId` parameter. When indoor (≥ 0x0100), the outdoor radial
sweep is skipped — only indoor-scoped shadows from `indoorCellIds` are
returned.
- `Transition.FindObjCollisions` passes `sp.CheckCellId`.
- Harness `LiveCompare_FirstCap_HarnessReproducesCottageFloorCapNormal`
flipped to `LiveCompare_FirstCap_FixClosesCottageFloorCap` — asserts
the downward-facing cottage-floor cap does NOT fire after the fix.
- Residual-X-motion test deleted — it documented post-cap edge-slide,
irrelevant once the cap is gone.
Verified: 11/11 cellar harness tests pass. 55 directly-affected physics
tests pass. Pre-existing static-state leakage failures (819 across
serial runs) unchanged. Full `dotnet build` clean.
Visual verification: user confirmed "Finally I can go up!" in the
Holtburg cottage cellar.
### Known regression caused by b3ce505 + next phase
Doorway edge case (flagged in the commit message): doors are server-
spawned entities with their own cylinder collision, registered via
`UpdatePosition` to whichever cell their position resolves to. Doors at
building thresholds typically resolve to outdoor cells. With the
indoor-primary radial-sweep gate, a sphere inside an indoor doorway-
adjacent cell doesn't see the outdoor door → can walk through.
User reported this: "I can also run through doors."
This regression is the direct consequence of NOT doing retail's full
portal-aware shadow propagation at registration time. Retail's
`find_cell_list` indoor branch recurses through `VisibleCellIds` and
adds the object to all portal-visible cells. Our `Register` doesn't do
this; the b3ce505 stopgap covers cottage-cellar but not doorways.
**Next phase: A6.P4 — port retail's per-cell shadow_object_list
architecture in full.** Design spec at
`docs/superpowers/specs/2026-05-24-phase-a6-p4-retail-shadow-architecture.md`
(this session). Approach: refactor `ShadowObjectRegistry.Register` to
compute the cell set via the retail-faithful indoor/outdoor branch +
portal-visible recursion (using `CellPhysics.VisibleCellIds`). Eliminate
the cellScope=0 spatial approximation. `GetNearbyObjects` becomes pure
per-cell list iteration. Removes the b3ce505 stopgap. Closes the door
regression as a side effect.
Also-likely-closed by A6.P4: #97 (phantom collisions on 2nd floor),
indoor sling-out (Finding 3 family), other indoor/outdoor seam bugs.
### Memory updates (this resolution)
- `feedback_retail_per_cell_shadow_list.md` — the architectural lesson
- `feedback_apparatus_for_physics_bugs.md` — the apparatus pattern that
finally cracked this saga (template for future physics bugs)

View file

@ -0,0 +1,165 @@
# A6.P3 issue #98 handoff — 2026-05-23 (early morning)
**Worktree:** `C:\Users\erikn\source\repos\acdream\.claude\worktrees\strange-albattani-3fc83c`
**Branch:** `claude/strange-albattani-3fc83c`
**HEAD at handoff:** `467a81f` (this doc) on top of `cf3deff` (slice 5 probe + diagnosis)
**Status:** Cellar-up still broken. Tonight's slice 6 attempt at placement-insert bypass (multiple variations) did not converge. Worktree code is at the slice 5 baseline (commit `cf3deff`); none of tonight's bypass variations landed. **Investigation direction needs to pivot** — the placement-insert path is not the right place to fix this.
**Pasteable session-start prompt at the bottom of this doc.**
---
## TL;DR
Three sessions on this bug. Each previous session was confident about the diagnosis; each one was wrong:
| Session | Diagnosis | Outcome |
|---|---|---|
| 2026-05-22 morning | `BSPQuery.FindCollisions` Path 5 vs Path 6 path-selection | **Wrong** — slice 5 probe + retail BP4 data proved every retail `find_collisions` hit has `collide=0`, so retail enters the same Contact branch we do. |
| 2026-05-22 evening (slice 5) | Cellar ceiling polygon 0x0020 blocks placement_insert; cell-promotion would unstick the player | **Sharpened but incomplete** — the probe identified the polygon correctly, but cell-promotion alone doesn't fix it. |
| 2026-05-22 late evening (slice 6 attempts, this handoff) | Bypass placement_insert when blocker is a downward-facing cell-boundary polygon | **6+ variations tried, none unstuck the player.** Each variation produced "bypass fires, player still stuck." |
**The CLEAN finding from tonight:** the placement-insert path is NOT the root cause. Bypassing it (in 6 different ways) doesn't unstick the player. The actual blocker is somewhere else in the resolve chain, OR in the geometry pipeline (terrain mesh hole missing).
**User's most actionable clue (not yet investigated):** "Looking down to the cellar I can see that the entry is covered with outside ground. Like the ground continues and covers only the open path down into the cellar." → suggests a missing hole in the outdoor terrain mesh over the cellar entry. That's a terrain-generation bug, not a physics bug.
## What's committed
- `cf3deff` (slice 5) — `[place-fail]` + `[place-fail-obj]` probe + side-channel polygon attribution + the corrected diagnosis in ISSUES.md #98. **This is the durable value from this work.** It rules out the morning's Path 5/6 hypothesis with hard data and gives any future investigator the diagnostic infrastructure to identify which polygon blocks any placement check.
- The two captures at `docs/research/2026-05-21-a6-captures/scen4_cottage_cellar_place_fail/`:
- `acdream.log` (probe pass 1)
- `acdream_v2_with_obj_probe.log` (probe pass 2 with object-id emit)
## What did NOT work tonight (reverted)
All six variations of placement-insert bypass in `Transition.FindEnvCollisions` + `Transition.DoStepUp`:
| Variant | What it tried | Failure mode |
|---|---|---|
| **Sibling fallback** | If primary cell's Path 1 placement returns Collided, try other cells via portal-graph BFS. Accept if ANY sibling cell's BSP accepts the sphere. | All siblings (0xA9B40143, 0xA9B40146) also returned Collided. No cell accepts the sphere position. |
| **Cell-boundary bypass** (no lift) | When primary's blocker is a downward-facing polygon (N.Z < -0.5), return OK from FindEnvCollisions without modifying CheckPos. | Sphere stayed where the step-down probe left it (cellar walkable at world Z=93.22). Next tick re-runs same logic. Player oscillates at one Z. |
| **Bypass + ceiling-clearance lift** (`+0.05`) | Same as above, but also lift CheckPos to ceiling_world_z + 0.05 (sphere foot just above ceiling). | Sphere foot stuck at 93.87 across 72 events. Cell-resolver did not promote to cottage cell (cottage CellBSP volume might not extend down to sphere center at world Z=94.35, or rotation makes the check fail). |
| **Bypass + aggressive lift** (`+ diameter + 0.05`) | Lift CheckPos by ceiling + 0.96m so sphere clearly clears the cottage floor thickness layer. | 0 bypass events captured. Possibly client-side issue or geometry placement diverged enough to skip the bypass branch entirely. |
| **Override DoStepDown false result via flag** | When DoStepDown returns false AND CellBoundaryBypassActive=true, override stepDown=true. | The flag is reset at start of each TI iteration, and DoStepDown normally returns true via bypass — so the override branch never fires. Same player position. |
| **Per-bypass +0.1m lift in FindEnvCollisions** | Lift CheckPos by 0.1m each bypass fire. Multiple bypass fires per tick = cumulative climb. | Not properly tested — user signaled fatigue with the repeat-test cycle before this could be evaluated. |
**Common pattern across all variants:** bypass mechanically fires (verified via `[place-bypass]` log entries up to 72 per session). But the player's visual position does not progress in world Z. Sphere world-Z stays in the 93.0-93.9 band across hundreds of bypass events.
## What we KNOW (hard data, slice 5 captures)
1. **Blocking polygon identified:** polyId `0x0020` in cellar cell `0xA9B40147`'s BSP. Plane in cell-local: `n=(0,0,-1) d=-0.2`. World Z=93.82. This IS a real polygon in the dat — it's the underside of the cottage main floor's thickness layer (cellar ceiling).
2. **The polygon's "twin":** polyId `0x0004` quad form (n=(0,-0.707,0.707) d=-0.247) is a 45° walkable INSIDE the cellar. The step-down probe's `find_walkable` converges on this polygon and lifts the sphere to world Z ≈ 93.60 (sphere center).
3. **Cell origin (corrected):** cellar cell `0xA9B40147` has `WorldTransform.Translation = (130.5, 11.5, 94.02)`. My earlier inference of cell origin from `wpos - lpos` ignored cell rotation and was wrong. The probe's direct `worldOrigin` capture is authoritative.
4. **Sibling cells via portal graph:** the cellar connects to `0xA9B40146` and `0xA9B40143`. Neither cell's BSP accepts the sphere placement at world Z=93.60 — both have their own geometry that rejects (cottage floor underside, walls).
5. **Retail's player ascent reaches `ContactPlane = cottage main floor`:** retail's BP7 (`set_contact_plane`) fires 18 times during the ascent, all setting ContactPlane to `(0,0,1) d=-93.9998` (world Z=94, the cottage main floor surface). That polygon lives in some BSP — possibly the cottage main floor cell's BSP — and retail's find_walkable reaches it. Our find_walkable doesn't.
6. **Player input is real:** the user's input log shows MovementForward Press events. The user IS walking. The sphere world-Y advances ~0.3m over 22 ticks of bypass events — confirming forward motion IS being applied, just not climbing.
## What we DON'T KNOW (the open questions)
A. **Why our sphere world-Z doesn't progress despite the step-down probe lifting it onto the 45° walkable.** Each TI iteration's `StepSphereDown` adjusts the sphere upward, but successive iterations don't accumulate. After 5 iterations, sphere stops at the 45° walkable's surface. Maybe walk_interp depletion after iter 1 prevents further lift in iter 2-5; if so, retail must do the same and shouldn't progress either — but retail does. **Hypothesis: retail's `find_walkable` reaches a HIGHER walkable polygon than ours, possibly the cottage main floor itself, possibly via multi-cell iteration.**
B. **Why our ResolveCellId doesn't promote to cottage cell after the lift.** Even with sphere center at world Z=94.35 (above ceiling, above cottage main floor at Z=94), `engine.ResolveCellId` returned the cellar cell. Either the cottage cell's CellBSP volume doesn't extend down to Z=94.35 (geometry quirk), our PointInsideCellBsp test is too strict, or the portal-graph BFS doesn't include the cottage cell as a candidate at this position.
C. **The user's terrain-mesh clue: "outside ground covers the cellar entry."** Not investigated. If the outdoor landblock terrain mesh is missing a hole over the cellar entry, the visible terrain would block the player at the cellar's upward exit. This is a TERRAIN GENERATION bug, completely separate from `BSPQuery.FindCollisions` / `Transition.DoStepUp`. Code to inspect: `LandblockMesh.Build`, scenery generation, building stabs, the dat's `LandBlockInfo.CellsHas` flag handling.
D. **Why descending into the cellar WORKS but ascending doesn't.** The descent is the same physics + same dat geometry. Comparing descent vs ascent might reveal what's symmetric and what's not. We haven't captured `[place-fail]` during descent.
## What did the slice-5 captures actually prove?
Re-read carefully: the data identifies the BLOCKER (polygon 0x0020). But it does NOT prove that bypassing the placement_insert is the right fix. The captures show:
- Retail's BP5 (`adjust_sphere`) fires on the ramp polygon during the ascent (17 hits on `n=(0,-0.719,0.695)`). Sphere climbs from `cz=-1.07` to `+1.05` in object-local. **This is the player CLIMBING THE RAMP.**
- Retail's BP7 sets ContactPlane to the cottage main floor (world Z=94) 18 times. **This is the player REACHING the cottage main floor.**
Both happen in retail. In our client, neither happens — the sphere stays in the cellar's middle, oscillating near the 45° walkable. **The bug is in how our physics PROGRESSES the sphere UP THE RAMP**, not in how it handles the placement_insert at the top.
Maybe the placement_insert problem we obsessed over tonight is a SYMPTOM, not a cause. The sphere is stuck near the cellar ramp top → step-up fires → placement check fails. But the FIRST-ORDER question is: why is the sphere stuck in the middle of the cellar instead of climbing the ramp?
## Most promising directions for the next session
**Order matters — investigate in this sequence:**
1. **Investigate the terrain-mesh clue (highest signal, lowest effort).** Open the client at the cottage entrance, look DOWN into the cellar. If there's terrain covering the cellar's upward opening, that's a major suspect for the physical block. Code to inspect: terrain mesh generation, `LandblockMesh.Build`, hole-cutting where indoor cells exist above terrain. ~30 min investigation.
2. **Capture acdream's [place-fail] log during the cellar DESCENT (currently works) and compare to the ASCENT (doesn't work).** Same dat, same physics. The difference will be obvious.
3. **Add a `[step-walk]` probe** that logs sphere position + ContactPlane + WalkInterp at the start and end of each `ResolveWithTransition` call. Use it to see whether the sphere's Z progresses tick-by-tick during forward walking on the cellar ramp. If Z doesn't progress per tick, the bug is in `AdjustOffset` slope-projection, not in step-up.
4. **Capture retail at the cellar DESCENT** via cdb. Compare to ascent. If retail's `[BP1]` `transitional_insert` reaches different polygons during descent vs ascent, that tells us what's asymmetric.
5. **DO NOT** re-attempt any placement-insert bypass variant. Tonight's 6 variants are conclusive evidence that this code path is not the fix.
## Specific files to inspect for direction #1 (terrain mesh)
- `src/AcDream.App/Rendering/Wb/LandblockMesh.cs` — terrain mesh generation, scenery placement
- `src/AcDream.Core/Rendering/Wb/TerrainUtils.cs` — terrain triangle generation, split formula
- Anywhere that handles `LandBlockInfo.CellsHas` (the "this cell has indoor cells above it" flag)
- WorldBuilder's terrain generation as a reference (in `references/WorldBuilder/`)
## Pickup prompt for fresh session
Open a new Claude Code session at this worktree:
- **Path:** `C:\Users\erikn\source\repos\acdream\.claude\worktrees\strange-albattani-3fc83c`
- **Branch:** `claude/strange-albattani-3fc83c`
- **HEAD:** `467a81f` (this handoff doc) on top of `cf3deff` (slice 5 probe)
Then paste:
---
```
Pick up A6.P3 issue #98 — cellar ascent stuck — with a NEW investigation direction.
Read FIRST:
docs/research/2026-05-23-a6-p3-issue98-handoff.md
docs/research/2026-05-22-a6-p3-slice5-handoff.md
docs/ISSUES.md issue #98 entry
Then state both altitudes:
Currently working toward: M1.5 — Indoor world feels right
Current phase: A6.P3 — fix #98 cellar-up
Next concrete step: investigate the terrain-mesh hole over the cellar entry
(user's clue: "outside ground covers only the open path down into the
cellar"). This is direction #1 from the slice 6 handoff.
IMPORTANT: do NOT re-attempt any placement-insert bypass in
BSPQuery.FindCollisions, Transition.FindEnvCollisions, or Transition.DoStepDown.
The 2026-05-22 evening / 2026-05-23 early-morning sessions tried 6 variations
of this approach and none unstuck the player. The slice 5 probe data
identified polygon 0x0020 as the blocker but bypassing it doesn't fix the
underlying issue.
The actual fix is likely in one of these orders of likelihood:
1. Terrain mesh generation missing a "hole" over the cellar entry (#1)
2. Step-down probe's find_walkable doesn't reach the cottage main floor
polygon (which retail's BP7 data confirms IS the eventual ContactPlane)
3. AdjustOffset slope-projection isn't accumulating Z progress on the
cellar ramp (per-tick climb is too slow or zero)
Test baseline: 1148 pass + 8 fail. Maintain through any fix.
CLAUDE.md rules apply. No workarounds without explicit approval.
If the user instructs "continue fixing" after 3+ failed attempts, push back
firmly — the systematic-debugging skill is unambiguous about this, and the
2026-05-22 sessions have proven that swinging through fatigue produces 6+
wasted variations.
```
---
## References
- A6.P3 slice 5 (committed): commit `cf3deff` adds `[place-fail]` probe + diagnosis correction
- Slice 5 handoff: [`docs/research/2026-05-22-a6-p3-slice5-handoff.md`](2026-05-22-a6-p3-slice5-handoff.md)
- Original A6.P3 handoff (morning, since superseded): [`docs/research/2026-05-22-a6-p3-handoff.md`](2026-05-22-a6-p3-handoff.md)
- ISSUES.md #98 entry — has the corrected diagnosis already
- Captures: `docs/research/2026-05-21-a6-captures/scen4_cottage_cellar_place_fail/`
- Retail cellar-up gold-standard data: `docs/research/2026-05-21-a6-captures/scen4_cottage_cellar_retail_for_issue98/`

View file

@ -0,0 +1,229 @@
# A6.P3 #98 — Trajectory Replay Harness handoff
**Session:** 2026-05-23 (full day, 10+ commits)
**Worktree:** `C:\Users\erikn\source\repos\acdream\.claude\worktrees\strange-albattani-3fc83c`
**Branch:** `claude/strange-albattani-3fc83c`
This handoff documents the apparatus committed this session, the things we
learned, the things we ruled out, and the concrete next-session pickup move.
Read this first when you resume.
---
## TL;DR
- **#98 is NOT fixed.** Six fix-shape attempts across this saga (4 prior
sessions + 1 this session's Shape 1) all failed or got reverted.
- **The trajectory replay harness is REAL but blocked.** Mechanically
works — runs 200 physics ticks in <100 ms against pre-loaded cell
fixtures. Blocked on a NEW second bug we surfaced during harness
commissioning (airborne-at-tick-1).
- **The cellar ramp polygon is NOT in the cell** — it's in a separate
GfxObj (a static building piece) registered as a ShadowEntry. The
harness reconstructs the ramp polygon programmatically from the live
capture's polydump data.
- **Per the systematic-debugging skill: 6 hypotheses tested without
convergence = stop and reflect.** The next-session move is NOT
another speculative fix attempt — it's a side-by-side comparison
harness against live PlayerMovementController state.
---
## What ran this session (chronological, 10 commits)
| Commit | What |
|---|---|
| `8a232a3` | `[step-walk-adjust]` probe inside `Transition.AdjustOffset` — names which projection branch fires per call + Z gain |
| `8daf7e7` | Findings note + capture snapshot. **AdjustOffset projection is CORRECT** — sphere climbs 90.95 → 92.80 monotonically. Caps at top of ramp because step-up rejects (cottage floor is ABOVE not below). |
| `0cb4c59` | Shape 1 fix attempt: gate `BSPQuery.AdjustSphereToPlane`'s two `SetContactPlane` call sites by `worldNormal.Z >= 0.99`. |
| `402ec10` | Revert Shape 1 — broke OnWalkable for all sloped walkable surfaces (74% of live capture lines in falling state). |
| `5f3b64c` | Session-pause handoff in ISSUES.md + CLAUDE.md. |
| `4c9290c` | Trajectory replay harness (PhysicsEngine + PhysicsDataCache + PhysicsBody + cell fixtures). Mechanics validated. |
| `3d2d10b` | Harness extension: programmatic synthetic stair GfxObj + ShadowEntry. **Discovery:** ramp polygon lives in GfxObj, not cell. |
| `227a775` | Diagnostic dump + 0.05m initial Z lift experiment. Same airborne behavior. |
| `5c6bdbe` | Deep investigation: 6 hypotheses tested via the harness, none isolated root cause of (0,1,0) hit at tick 1. |
---
## What the harness IS (committed apparatus)
[`tests/AcDream.Core.Tests/Physics/CellarUpTrajectoryReplayTests.cs`](../../tests/AcDream.Core.Tests/Physics/CellarUpTrajectoryReplayTests.cs)
A deterministic trajectory replay that:
1. Loads three issue-#98 cell fixtures (cellar + 2 cottage neighbors) via `CellDumpSerializer.Hydrate`.
2. Wraps each cell with a synthetic single-leaf `PhysicsBSPTree` (`AttachSyntheticBsp`) — needed because Hydrate sets BSP=null and without BSP the indoor branch is skipped.
3. Registers the cellar's stair-ramp polygon as a synthetic `GfxObjPhysics` (`RegisterStairRampGfxObj`) — polygon vertices in WORLD coordinates so the ShadowEntry registers at origin with identity rotation/scale.
4. Constructs a `PhysicsBody` seeded with:
- `ContactPlaneValid=true`, `ContactPlane=(0,0,1,-90.95)` (cellar floor plane)
- `WalkablePolygonValid=true`, `WalkableVertices` = cellar floor poly under sphere XY
- `TransientState = Contact | OnWalkable`
5. Drives N ticks of `PhysicsEngine.ResolveWithTransition` with a constant -Y forward offset (`PerTickOffset = (0, -0.1, 0)`).
6. Returns a per-tick `TrajectoryPoint` list (Tick, Position, CellId, IsOnGround, CpValid).
5 tests, all passing in ~75 ms total. Baseline maintained at 1167 + 5 (harness) = 1172 + 8 pre-existing failures.
### Reusable helpers in the harness
| Helper | Purpose |
|---|---|
| `BuildEngineWithCellarFixtures()` | Full engine setup — cells + synthetic BSPs + (optional) stair GfxObj |
| `AttachSyntheticBsp(CellPhysics)` | Wraps a hydrated cell with a one-leaf BSP referencing every Resolved polygon. **Reusable for any indoor-cell test that needs the indoor BSP path to fire.** |
| `RegisterStairRampGfxObj(engine, cache)` | Constructs a programmatic GfxObj + ShadowEntry for the cellar ramp polygon. **Reusable for any indoor-static-collision test.** |
| `BuildInitialBody()` | PhysicsBody with both ContactPlane AND WalkablePolygon seeded. **The seeding pattern is the discovery** — both must be set or the engine treats the sphere as "grounded but anchorless." |
| `SimulateTicks(engine, body, cellId, N)` | Per-tick driver with proper cross-tick PhysicsBody state. |
---
## Bug 1: #98 — cellar-up freeze (UNFIXED)
The original bug. Sphere climbs the cellar ramp partway (world Z 90.95 → 92.80) then caps. Cottage floor at world Z=94 still 1.2m above.
**Refined diagnosis from this session's `[step-walk-adjust]` probe:**
AdjustOffset's slope projection is CORRECT — 145/146 calls take `into-plane` branch with mean +0.045 m zGain per call. The cap happens because step-up's downward step-down probe at the ramp top finds no walkable surface below (cottage floor is ABOVE). 101 `stepdown-reject` vs 1 acceptance.
**Six fix shapes attempted across the saga, all failed:**
1. Placement-insert bypasses (slice 6, 6 variants)
2. Cell-resolver tiebreaker changes (slice 3)
3. Negative-side polygon handling (slice 7, reverted)
4. Building-check / IsLandblockBuilding flag (slice 7, reverted)
5. Multi-cell BSP iteration (A4, shipped but doesn't address top-of-ramp)
6. **Shape 1: gate ContactPlane assignment by Normal.Z ≥ 0.99** (this session — broke OnWalkable, reverted)
---
## Bug 2: Airborne-at-tick-1 (NEW, surfaced this session)
When the trajectory replay harness drives ResolveWithTransition with a sphere seeded grounded on the cellar floor, **tick 1 reports `hit=yes n=(0,1,0) walkable=False/True` and the body goes airborne**. The sphere then floats horizontally over the cellar floor for the rest of the simulation, never touching the ramp.
This is **structurally different** from #98:
- #98 fails MID-CLIMB at the top of the ramp
- This bug fails AT START — sphere can't even walk a flat floor
This bug blocks the harness from reproducing #98 in test isolation. It must be solved before the harness can drive #98 fix attempts.
### Confirmed via investigation (committed in 5c6bdbe)
| Hypothesis | Outcome |
|---|---|
| WalkablePolygon NOT seeded in body | PARTIAL FIX — `walkable=True` survives but (0,1,0) hit still appears |
| Initial sphere Z lift 0.0 vs 0.05m | NO — same hit either way |
| Synthetic stair GfxObj triggering wall hit | NO — same hit without stair |
| Stub landblock terrain at Z=0 triggering hit | NO — same hit without landblock |
| Cell BSP=null falling through to terrain | NO — same hit with synthetic BSP attached |
| `body=null` vs body-with-CP-seed | NO — same hit either way |
### What we know about the (0,1,0) hit
- It's a +Y world normal — doesn't match any registered geometry (the stair has normal (0, 0.719, 0.695), the cellar floor has normal (0,0,1), the cellar walls have normal in the X/Y/Z axis directions but at known positions far from the sphere).
- It appears at the `after-validate` step-walk probe site — set BY ValidateTransition between `after-insert` and `after-validate`.
- `ValidateTransition`'s default-fallback line sets UnitZ=(0,0,1), not UnitY=(0,1,0). So something INSIDE TransitionalInsert set `ci.CollisionNormal=(0,1,0)` before ValidateTransition ran.
- 12 different `SetCollisionNormal` call sites in TransitionTypes.cs — root cause not isolated to one.
---
## DO NOT DO (next session)
The 5-attempt-failure pattern from #98 saga + this session's 6-hypothesis-failure on the airborne bug = **a long list of dead ends**. Don't retry any of these:
For #98 itself:
- Placement-insert bypasses in `BSPQuery.FindCollisions` / `Transition.FindEnvCollisions` / `Transition.DoStepDown`
- Cell-resolver tiebreaker changes in `PhysicsEngine.ResolveCellId` (slice 3 already shipped a fix)
- Negative-side polygon handling
- bldg-check / IsLandblockBuilding flag propagation
- Gating ContactPlane assignment by Normal.Z in `BSPQuery.AdjustSphereToPlane` (Shape 1 — breaks OnWalkable for sloped walkables)
- Any suppression flag, grace period, retry loop, or `if (problematicState) return early` workaround
For the airborne bug:
- Re-attempting any of the 6 hypotheses listed above
- Speculation about init fields without comparing to a live capture
- Adding more probes randomly — we already have 4+ probes wired
---
## What apparatus exists to use
| Tool | Location | Purpose |
|---|---|---|
| `[step-walk]` probe | TransitionTypes.cs (many call sites) | Per-step-site full state dump |
| `[step-walk-adjust]` probe | TransitionTypes.cs:AdjustOffset | Per-AdjustOffset call branch + zGain |
| `[resolve]` probe | PhysicsEngine.cs end of ResolveWithTransition | Per-call input/output/hit/cp summary |
| `[indoor-bsp]` probe | TransitionTypes.cs:1917-1926 | Per-indoor-BSP-call summary (only when BSP non-null) |
| `[poly-dump]` probe | BSPQuery.cs:402 | Per-AdjustSphereToPlane polygon hit dump |
| `[push-back]` probe | BSPQuery.cs:354-394 | Per-push-back motion details |
| `[place-fail]` probe | TransitionTypes.cs:2908 | Per-DoStepDown placement_insert rejection |
| `Issue98CellarUpReplayTests` | tests/.../Physics/ | 7 tests, single-frame failing-frame geometry |
| `CellarUpTrajectoryReplayTests` | tests/.../Physics/ | 5 tests, N-tick trajectory harness |
| Cell fixtures | tests/.../Fixtures/issue98/*.json | 3 hydratable cells (cellar + 2 cottage neighbors) |
| Retail cdb captures | docs/research/2026-05-23-a6-captures/ | Multiple capture sessions, decoded |
| cdb scripts | tools/cdb/*.cdb + tools/cdb/*.ps1 | Re-runnable retail-side capture infrastructure |
---
## Recommended next-session move
**Build a side-by-side comparison harness against live PlayerMovementController state.**
Concretely:
1. In the live client, attach a probe to `PlayerMovementController.cs:1105-1129` (the production ResolveWithTransition call site) that captures the FULL state passed in (every PhysicsBody field, sphere radius/height, step heights, mover flags, entity id) and the FULL state returned (ResolveResult fields, body state after the call).
2. Walk in a Holtburg cottage cellar. Capture 2-3 ticks of full state.
3. Save the capture as a JSON fixture in `docs/research/`.
4. Add a test to `CellarUpTrajectoryReplayTests.cs` that loads that fixture and feeds the EXACT captured state into ResolveWithTransition. Compare per-field divergence between the captured `ResolveResult` and the harness's result.
5. The divergence WILL exist (otherwise we wouldn't have the airborne bug). The first divergence pinpoints the missing state init step.
This approach is **evidence-driven, not speculation-driven**. The whole reason the 6-hypothesis investigation failed is we kept guessing what the harness was missing. A live capture tells us directly.
**Estimated effort:** 1 hour to wire the production-side probe + capture + JSON dump; 30 min to write the comparison test; 30 min to analyze the first divergence. Total ~2 hours, then the airborne bug should be solvable.
---
## Alternative next-session moves
If the comparison harness investment feels too big, here are smaller alternatives:
1. **Pivot to a different M1.5 issue.** The cellar-up demo isn't the only M1.5 critical path. Other issues in `docs/ISSUES.md` that need work: chronic open issues (#2, #4, #28, #29, #37, #41), the #90 workaround removal (now redundant after slice 3), or one of the Phase C visual fidelity items. Less coupling, faster forward progress.
2. **Pivot to M2 prep.** M1.5 is blocking M2 by policy ("one active milestone at a time"). But if the user authorizes, M2 has nicer scope — inventory panel (F.2), combat math (F.3), dev panels (F.5a). Visible wins, no physics rabbit holes.
3. **Use the harness elsewhere.** The `RegisterStairRampGfxObj` + `AttachSyntheticBsp` patterns are reusable for ANY indoor-static-collision test. If there's a different bug (corpse pickup boundary, door swing collision, etc.) that needs deterministic testing, the harness's apparatus is ready.
---
## Pickup prompt for next session
```
A6.P3 #98 trajectory harness — session paused 2026-05-23.
Read FIRST:
docs/research/2026-05-23-a6-p3-issue98-harness-handoff.md (this file)
tests/AcDream.Core.Tests/Physics/CellarUpTrajectoryReplayTests.cs
(especially the class-doc comment + the 5 [Fact] tests)
State both altitudes:
Currently working toward: M1.5 — Indoor world feels right
Current phase: A6.P3 — trajectory replay harness, blocked on a SECOND
bug (airborne-at-tick-1) that surfaced during commissioning. The
original #98 cellar-up freeze remains unfixed; the harness needs
the airborne bug solved before it can drive #98 fix attempts.
The handoff doc has three options for what to do next:
(A) Build the side-by-side comparison harness — capture live
PlayerMovementController state, replay in test, diff. ~2 hours.
Most retail-faithful path. Recommended.
(B) Pivot to a different M1.5 issue (chronic open issues, #90 removal,
Phase C work). Less coupling, faster wins.
(C) Pivot to M2 prep (requires user authorization — M2 is policy-deferred
until M1.5 lands).
Pick A, B, or C. If A: there's a step-by-step plan in the handoff
doc's "Recommended next-session move" section.
CLAUDE.md rules apply throughout. NO speculative fixes — the saga has
six failed shapes already. Evidence first.
Test baseline: 1172 + 8 (pre-existing failures). Maintain throughout.
```

View file

@ -0,0 +1,334 @@
# A6.P3 issue #98 — acdream replay vs retail cdb comparison
**Date:** 2026-05-23
**Worktree:** `C:\Users\erikn\source\repos\acdream\.claude\worktrees\strange-albattani-3fc83c`
**Status:** Apparatus complete. Divergence identified. Fix plan to follow.
This document closes the loop on Step 5 of
[`C:\Users\erikn\.claude\plans\i-did-some-work-sharded-acorn.md`](../../C:/Users/erikn/.claude/plans/i-did-some-work-sharded-acorn.md).
It compares acdream's deterministic-replay output against the retail
cdb capture taken at the equivalent scenario, and names the
divergence target for the (next) fix plan.
The four prior sessions (2026-05-22 AM + PM, 2026-05-23 AM + PM)
shipped 10+ speculative fixes without data. This session shipped the
apparatus that turns the next attempt into evidence-driven work
(commits `35b37df``6f666c1` on top of slice 5's `cf3deff`).
---
## TL;DR — the divergence target
**Retail's `BSPLEAF::find_walkable` accepts the cottage main floor
polygon when the sphere is RESTING ON TOP of it.** Sphere local
Z = +radius (= +0.48 in the cottage cell). Sphere world Z ≈ 94.48
(cottage floor at world Z=94, plus radius).
**acdream's failing-frame sphere is 0.69m BELOW the cottage main floor
plane** when our walkable query runs. Sphere local Z = -0.6883 in
0xA9B40143. Sphere world Z ≈ 93.31.
Delta: **retail's sphere is 1.17 m higher** at the equivalent decision
point. Either:
1. Our step-up sequence doesn't lift the sphere high enough before
`find_walkable` is called against the cottage cell, OR
2. We're calling `find_walkable` against the cottage cell using the
wrong sphere reference (foot-sphere center instead of the step-
lifted center), OR
3. The cellar→cottage transition in retail happens GRADUALLY across
many physics ticks (the sphere climbs the ramp one step at a time),
and acdream's per-tick climb is too small.
The fix plan needs to choose between (1), (2), and (3) — most likely
(3) given retail's BPE-write distribution.
A surprising secondary finding: **`CPolygon::find_crossed_edge` fires
ONLY ONCE in 35K probe hits in retail.** Our replay harness uses
`FindCrossedEdge` as the primary edge-containment test. Either retail
takes a different path through the walkable predicate cascade, or
acdream is over-reliant on the edge test for a case retail doesn't
hit.
---
## Apparatus shipped this session
Six commits on top of `cf3deff` (slice 5):
| Commit | What |
|---------|------|
| `35b37df` | chore(phys): A6.P3 #98 triage — revert neg-poly + bldg-check experiments. Kept: render-vs-physics origin split (GameWindow), terrain-hole cutout, multi-sphere CellTransit, step-walk diagnostic probes. Reverted: neg-poly path split, bldg-check flag, isBuilding propagation, IsLandblockBuilding. Test baseline restored to 1148+8 base. |
| `f62a873` | feat(phys): Step 2 — cell-dump probe (`ACDREAM_DUMP_CELLS=0xA9B4xxxx,...`) + JSON DTOs (`CellDump`, `PolygonDump`, etc.) + `CellDumpSerializer` (Capture / Read / Write / Hydrate) + 4 round-trip tests. |
| `3f56915` | capture(phys): Three cell fixtures from live capture — 0xA9B40143 (14 polys), 0xA9B40146 (4 polys), 0xA9B40147 (37 polys). All share worldOrigin=(130.5, 11.5, 94.0) with 180° yaw. |
| `856aa78` | test(phys): Step 3 — `Issue98CellarUpReplayTests` — 7 tests reproducing the live failure pattern deterministically (<1ms per test). Confirms 0xA9B40143 poly 0x0004 rejected at the failing-frame sphere; 0xA9B40146 has no walkable candidate at all. |
| `6f666c1` | tools(cdb): Step 4 — `issue98-cellar-up-find-walkable.cdb` + `issue98-runner.ps1` for retail-side capture. BPA/B/C/D/E/F break on find_walkable, walkable_hits_sphere, find_crossed_edge, check_other_cells, set_contact_plane, adjust_sphere_to_plane. |
| (this doc) | Step 5 — divergence comparison. |
---
## Raw data — retail cdb capture
Capture: [`docs/research/2026-05-23-a6-captures/cellar_up_capture_1/retail.log`](2026-05-23-a6-captures/cellar_up_capture_1/retail.log)
(decoded: `retail.decoded.log`)
User ran retail acclient.exe v11.4186 attached via
`tools/cdb/issue98-runner.ps1 -ScenarioTag "cellar_up_capture_1"`. They
walked up and down a Holtburg cottage cellar stair several times. cdb
captured 35,219 BP hits over ~5 seconds of motion.
Hit distribution:
| BP | Function | Hits | Notes |
|-----|----------------------------------------------|--------|-------|
| BPA | `BSPLEAF::find_walkable` | 6,160 | per-leaf walkable query |
| BPB | `CPolygon::walkable_hits_sphere` | 7,028 | per-polygon overlap test |
| BPC | `CPolygon::find_crossed_edge` | **1** | almost never fires! |
| BPD | `CTransition::check_other_cells` | 21,422 | outer dispatcher fires very frequently |
| BPE | `COLLISIONINFO::set_contact_plane` | **161**| ContactPlane writes |
| BPF | `CPolygon::adjust_sphere_to_plane` | 431 | sphere projections |
### BPE — retail's accepted ContactPlanes
Every one of the 161 BPE writes lands on one of TWO planes:
```
n=(0, 0, 1) d=-93.9998 → world Z=94 (cottage main floor)
n=(0, 0, 1) d=-90.9500 → world Z=90.95 (cellar floor)
```
Retail's ContactPlane is **never** set to:
- the cellar ramp (normal ≈ (0, -0.719, 0.695))
- any of the cellar wall polygons
- the cellar ceiling (poly 0x0020 in our nomenclature — normal=(0,0,-1) at world Z=93.82)
The transition cellar floor → cottage main floor happens directly:
ContactPlane shifts from `d=-90.95` to `d=-93.9998` with no
intermediate plane.
### BPA — sphere position at each cottage-floor acceptance
The find_walkable call immediately before each BPE write to the
cottage floor shows a consistent sphere position pattern:
| BPE hit | Last BPA before | Sphere LOCAL | Notes |
|---------|------------------------|-------------------------------|-------|
| #1 | hit#435 (cell B) | (-0.3270, 0.5998, +0.6300) | first cottage-floor accept |
| #50 | hit#2533 (cell B) | (-0.3131, 0.7340, +0.6300) | cz unchanged |
| #100 | hit#3822 (cell B) | (-0.3245, 0.3292, +0.6300) | cz unchanged |
| #160 | hit#6159 (cell B) | (-0.3195, 0.5271, +0.6300) | cz unchanged |
Sphere local Z is consistently **+0.6300** in cell B at the moment
retail accepts. Cell B's cottage floor plane is at local Z=-0.15
(observed from BPB hit#7012 with plane d=-0.15), so the sphere is
0.78m above that floor. Sphere radius 0.48 → sphere bottom is 0.30m
above the floor — close enough that `walkable_hits_sphere` accepts.
The find_walkable hit just BEFORE the cell-B query (hit#433, hit#2532,
hit#3820, hit#6158) lands in a different cell ("cell A") at local
position ≈ `(-11.12, 7.16, +0.48)`. Cell A's cottage floor plane is at
local Z=0 → sphere is 0.48m above (= sphere radius), perfectly resting
on the floor.
**Both cells consistently see the sphere at `local Z = +0.48 to +0.63`
at the acceptance moment.** Sphere world Z ≈ 94.48 — the sphere has
been lifted ABOVE the cottage floor.
---
## acdream replay — sphere position at the equivalent moment
Replay anchor: failing-frame sphere world position
`(141.7164, 8.3937, 92.0093)` r=0.4800, from
[`a6-issue98-negpoly-20260523-135032.out.log`](../../a6-issue98-negpoly-20260523-135032.out.log)
line 11338 (`[walkable-nearest]`) + 11339 (`[issue98-walkable-detail]`).
In cell 0xA9B40143 (cottage neighbour, 14 physics polys):
```
sphere LOCAL = (-11.2892, 4.3653, -0.6883)
nearest walkable: poly 0x0004
plane n=(0,0,1) d=0 (local) → world Z=94 (cottage floor)
verts: [(-6.2, 7.6, 0), (-10.0, 7.6, 0), (-10.0, 2.8, 0)]
signed distance from plane: -0.6883
abs distance: 0.6883
gap (abs - radius): 0.2083
insideEdges: FALSE (sphere XY beyond triangle edge by 1.29 m on X)
overlapsSphere: FALSE (|0.6883| > radius 0.48)
```
In cell 0xA9B40146 (cottage neighbour, 4 physics polys):
```
sphere LOCAL = (similar)
nearest walkable: NONE
(the cell has no Z-up polygon close enough to be selected)
```
In cell 0xA9B40147 (cellar primary, 37 physics polys):
```
sphere LOCAL = (-11.2164, 3.1063, -1.9907)
nearest walkable: the cellar ramp (poly 0x0008 — n=(0,-0.719, 0.695))
→ accepted as ContactPlane
```
Our replay confirms the live failure: cottage-cell walkable queries
return no usable result; cellar ramp is the only ContactPlane we ever
get.
---
## Side-by-side comparison
| Field | Retail (BPE #1) | acdream (negpoly fail) |
|-----------------------------------------|---------------------|-------------------------|
| Sphere world Z | **94.48** | **92.01** |
| Cottage floor plane (world) | Z = 94 | Z = 94 |
| Sphere position vs cottage floor | **+0.48 m ABOVE** | **-1.99 m BELOW** |
| Sphere top vs cottage floor | +0.96 m above | -1.51 m below |
| Walkable accepted in cottage cell? | **YES** — sphere rests on plane | **NO** — sphere far below plane |
| ContactPlane set to cottage floor? | **YES** (161 times) | **NO** (never) |
| find_crossed_edge invocations | 1 (in 35K BPs) | (used heavily by our walkable test) |
| check_other_cells invocations | 21,422 | (per-tick, similar order) |
**Sphere world Z delta: 2.47 m.** Retail's sphere is nearly 2.5 m
higher than ours at the equivalent decision point.
---
## Plausible fix targets, in priority order
These are HYPOTHESES — the fix plan must verify each before changing
code. Each is testable against the replay harness without launching
the client.
### Target 1 (highest confidence): step-up + ramp climb doesn't gain enough Z per tick
Retail's data shows the sphere climbs the ramp GRADUALLY across many
ticks — BPB hits move smoothly from sphere local Z=-2.57 (resting on
cellar floor) through intermediate values up to sphere local Z=+0.48
(resting on cottage floor) over ~7,000 walkable_hits_sphere calls.
Our `[step-walk]` diagnostic from the failing log shows the sphere
oscillating at world Z ≈ 92.0 — never gaining altitude. The ramp's
ContactPlane is being set but `AdjustOffset` is consuming all
WalkInterp on the lift, leaving nothing for forward motion (slice 7
handoff's reading was right on this).
Look at:
- `Transition.AdjustOffset` — when ContactPlane is the ramp, forward
motion should project to ramp-local, gaining Z. Does it?
- `Transition.DoStepUp` — when does step-up fire? Is it lifting by
the right amount? Compare to retail's step_sphere_up.
- The interaction between WalkInterp depletion and step-up — does our
step-up reset WalkInterp like retail does?
### Target 2: cottage-cell candidacy uses wrong sphere reference
Retail iterates cells with the SAME sphere across find_walkable calls
in a tick. The sphere position visible to find_walkable for the
cottage cell is already at the lifted position. acdream's
`CellTransit.FindCellSet` uses `sp.GlobalSphere` — but at what tick
phase? If we use the pre-step-up sphere center to decide cottage-cell
candidacy, but then run the walkable query at the same pre-step-up
position, we'll never see the cottage cell as walkable.
Look at:
- `CheckOtherCells` in `TransitionTypes.cs` — what sphere does it
pass to `BSPQuery.FindCollisions`? Does it use the step-lifted
position or the pre-step position?
- The retail oracle `CTransition::check_other_cells` at
`acclient_2013_pseudo_c.txt:272717-272798`.
### Target 3: find_crossed_edge is over-used in our walkable acceptance
Retail's BPC hit count of 1 in 35K is a striking outlier. Either
retail's walkable acceptance never needs the edge containment test
(because `walkable_hits_sphere` does enough), or `find_crossed_edge` is
gated behind a different code path we're not hitting.
Look at:
- `BSPQuery.FindCrossedEdge` — when is it called? Compare to retail's
`CPolygon::find_crossed_edge`. Maybe we call it in step-up, retail
doesn't.
This is a SECONDARY target — not directly the issue #98 failure mode,
but a code-shape divergence worth investigating once the primary fix
lands.
### Target 4 (low confidence): the cellar ramp normal-Z is wrong
If our cellar ramp polygon has a slightly wrong normal compared to
retail, AdjustOffset's slope projection would compute different Z
gains. The polydump capture shows ramp normal (0, -0.7190, 0.6950);
the JSON fixture has the same. Likely not the bug, but worth
verifying via `dotnet test` after any fix attempt.
---
## What the apparatus delivers for future fix attempts
1. **`Issue98CellarUpReplayTests`** runs in <200ms with no client
launch. Any change to `BSPQuery.FindCrossedEdge`, polygon
containment, or cell transform shows up instantly.
2. **JSON fixtures in `tests/AcDream.Core.Tests/Fixtures/issue98/`**
are real-geometry captures. Any future fix can call
`CellDumpSerializer.Hydrate` to load them and drive the predicates
directly.
3. **`tools/cdb/issue98-runner.ps1`** is reusable. Any new
hypothesis can be re-captured against retail with a 5-minute user
action.
4. **`tools/cdb/decode_retail_hex.py`** decodes the hex-bits format —
no changes needed.
5. The retail comparison data is checked into
`docs/research/2026-05-23-a6-captures/cellar_up_capture_1/`
future analyses can re-grep without re-capturing.
---
## What this plan does NOT do
This document does not ship a fix. The fix is the next plan, scoped to
Target 1 (most likely) or Target 2 (next likely). The user should
review this divergence reading before authorizing implementation.
Per CLAUDE.md and the systematic-debugging mandate: 4 prior sessions
guessed and were wrong. This plan refuses to be the 5th.
---
## Pickup prompt for the fix plan
Open this worktree:
`C:\Users\erikn\source\repos\acdream\.claude\worktrees\strange-albattani-3fc83c`
Then:
```
A6.P3 issue #98 — apparatus complete; ready to write the fix plan.
Read FIRST:
docs/research/2026-05-23-a6-p3-issue98-replay-comparison.md
tests/AcDream.Core.Tests/Physics/Issue98CellarUpReplayTests.cs
docs/research/2026-05-23-a6-captures/cellar_up_capture_1/retail.decoded.log
State both altitudes:
Currently working toward: M1.5 — Indoor world feels right
Current phase: A6.P3 — fix #98 cellar-up (fix plan)
Next concrete step: pick Target 1 (step-up Z gain) or Target 2
(cottage-cell sphere reference) from the comparison doc and write
the fix plan against it. NO speculative fixes — use the replay
harness to verify the hypothesis before writing code.
The fix MUST be evidence-driven. The replay harness gives us a 200ms
test loop; a fix that doesn't change the failing assertions in
Issue98CellarUpReplayTests is not the fix.
Test baseline: 1167 + 8 (with apparatus). Maintain through any fix.
CLAUDE.md rules apply. No workarounds without explicit approval.
```

View file

@ -0,0 +1,185 @@
# A6.P3 #98 — [step-walk-adjust] capture analysis (2026-05-23)
**Capture:** [docs/research/2026-05-23-a6-captures/stepwalkadjust/acdream.log](2026-05-23-a6-captures/stepwalkadjust/acdream.log) (1.3 MB, 6,467 lines)
**Plan ref:** [docs/superpowers/plans/2026-05-23-a6-p3-issue98-cellar-up-fix.md](../superpowers/plans/2026-05-23-a6-p3-issue98-cellar-up-fix.md)
**Probe commit:** `8a232a3` — added `[step-walk-adjust]` site inside `Transition.AdjustOffset` (branch token + zGain per call).
---
## TL;DR
The fix plan's four-branch decision tree (A / B / C / D) **does not match what the data shows**. The diagnostic conclusively proves:
1. **AdjustOffset is correct.** `branch=into-plane` for 145 of 146 calls; `zGain = +0.052 ± 0.001` per call when sphere offset points into the ramp normal `(0, 0.719, 0.695)`. Cumulative theoretical zGain across the climb portion: roughly **+5 m**, far more than the ~2 m the sphere actually climbed.
2. **Z gain accumulates correctly while mid-ramp.** Sphere world Z went 90.95 → 92.80 monotonically across the climb portion.
3. **The climb caps at world Z ≈ 92.80** with the sphere frozen at `cur=(141.5054, 7.1684, 92.7968)`. X drifts by ~0.006/tick from sliding; **Y and Z are nailed**. The cottage floor at world Z=94 is still 1.20 m above.
4. **At the freeze, the per-step rollback mechanism takes the +Z out.** The sequence:
- `find-start` — winterp=1.0, walkPoly=True, CP=ramp ✓
- `[step-walk-adjust]` — input=(0.006,-0.105,0), output=(0.006,-0.051,+0.052), branch=into-plane ✓
- `after-adjust` — adj=(0.006,-0.051,+0.052), CP=ramp ✓
- **CP cleared by the per-step reset at [TransitionTypes.cs:723-725](../../src/AcDream.Core/Physics/TransitionTypes.cs:723-725).**
- `before-insert` — check advanced to (141.5117, 7.1179, **92.8491**), CP=n/a
- Inside `TransitionalInsert(3)`: step-up branch fires (`stepUp=True`), step-down probes by 0.6m downward.
- Step-down finds no walkable below the proposed position (cottage floor is ABOVE, not below).
- **Two `stepdown-reject` fires** inside the insert.
- `after-insert` — check rolled back to `(141.5117, **7.1684, 92.7968**)`. Only X advanced by 0.006. **walkPoly=False, winterp=-0.0000.**
- `find-end` — same state, walkPoly=False.
5. **This is a NEW fix target — call it "Target E."** The plan's decision tree didn't anticipate this mode. AdjustOffset's slope projection works perfectly. The failure is in the step-up validation logic at the **top of the ramp**, where the next walkable surface (cottage floor) is ABOVE the proposed position, not below. The step-down probe inside step-up scans downward and finds nothing → rejects → rollback.
---
## Branch histogram (across the entire capture)
| Branch | Count | % |
|---|---:|---:|
| `into-plane` | 145 | 99.3% |
| `no-cp` | 1 | 0.7% |
| All others (away-plane, slide-crease, slide-degenerate, no-cp-slide, *+safety-push*) | 0 | 0% |
No safety-push annotations. No slide planes ever installed. No CP-cleared mid-climb (except by the deliberate per-step reset).
## zGain summary
- 146 calls total.
- Total zGain: **+6.63 m**.
- Mean per call: **+0.045 m**.
- Cellar-floor calls (CP normal `(0,0,1)`, d=-90.95): zGain=0 (expected — flat floor doesn't tilt motion).
- Ramp calls (CP normal `(0, 0.719, 0.695)`, d=-69.50): zGain ≈ +0.052 to +0.055 per call (very tight distribution).
- Math verified: collisionAngle = dot(input, normal) ≈ -0.076 → result -= normal × collisionAngle → +Z component matches log exactly.
## cur Z trajectory (from `[step-walk] site=after-adjust`)
| Phase | World Z | Notes |
|---|---|---|
| start | 90.9500 | Walking flat across cellar floor (cell 0xA9B40147 floor) |
| climb begins | 90.9500 → 91.013 → 91.068 → ... | Sphere reaches ramp foot |
| climb proceeds | rises by ~0.05/tick | Y decreasing as Z increasing — climbing -Y direction |
| **cap** | **92.7968** | Sphere locks here; X drifts only |
| end-of-capture | 92.7968 | Sphere never escapes |
Max Z reached: 92.7968. Cottage floor: 94.00. **Gap: 1.20 m.** Sphere top (center+radius): 93.28 — still 0.72 m below cottage floor.
## stepdown probe-site counts (across whole capture)
| Site | Count |
|---|---:|
| `stepdown-enter` | 236 |
| `stepdown-after-insert` | 236 |
| `stepdown-after-offset` | 134 |
| `stepdown-reject` | **101** |
| `stepdown-after-placement` | 1 |
101 rejections vs 1 acceptance + 134 offset-only outcomes. **Step-down is failing far more often than succeeding.** This is the failure-frequency signature.
---
## At the freeze: which validation rejects?
Reading [TransitionTypes.cs:2848-2850](../../src/AcDream.Core/Physics/TransitionTypes.cs:2848-2850):
```csharp
if (transitState == TransitionState.OK
&& CollisionInfo.ContactPlaneValid
&& CollisionInfo.ContactPlane.Normal.Z >= walkableZ)
```
The accept condition needs ALL three. At the freeze moment:
- `transitState == OK` — TRUE (per log).
- `CollisionInfo.ContactPlaneValid`**FALSE** (per log: `cp=n/a` at stepdown-after-insert, stepdown-reject).
- `ContactPlane.Normal.Z >= walkableZ` — moot since CP is invalid.
So **`ContactPlaneValid` is the false condition**.
Why is `ContactPlaneValid` false after `TransitionalInsert(5)` (called by DoStepDown at line 2825)?
The CP was set to `(0, 0.719, 0.695)` at `find-start`. Then per-step reset at line 724 cleared it before `TransitionalInsert(3)` ran. Inside that insert, step-up logic fired. Step-up internally calls `DoStepDown(stepDownHeight=0.6, walkableZ=0.6642, runPlacement=true)`. **That nested DoStepDown runs `TransitionalInsert(5)` again**, and inside THAT, the sphere checks for walkable polys. None found below the proposed step-up position → CP stays unset → accept condition fails → `stepdown-reject`.
The retail behavior (from the cdb capture, [retail.decoded.log](2026-05-23-a6-captures/cellar_up_capture_1/retail.decoded.log)):
- **Retail's BPE writes ContactPlane to (0,0,1) d=-93.9998 (cottage floor at world Z=94) DIRECTLY from (0,0,1) d=-90.9500 (cellar floor) with no intermediate.**
- Retail's BPE writes never set CP to the cellar ramp normal.
- Retail's sphere DOES climb across the ramp, but the CP stays on the flat-floor planes the whole time.
So retail's mechanism: the sphere climbs the ramp by step-up SUCCEEDING and landing on cottage floor as the next walkable surface. The ramp itself isn't used as a ContactPlane in retail.
In acdream: the ramp is treated as a walkable surface. When the sphere reaches the top of the ramp, the next required walkable surface (cottage floor) is too far above the proposed position to be acceptable to the step-down probe.
---
## Conclusion: Fix target is "Target E" (new)
The previous decision tree (A / B / C / D) was based on the divergence comparison doc's framing of "no altitude gain." The data shows the climb DOES gain altitude (correctly). The bug is at the **top of the ramp**, in the **step-up + step-down validation**, NOT in `AdjustOffset`.
### Target E definition
**Name:** Step-up validation rejects ramp-climb advances when the next walkable surface (cottage floor) is too high above the proposed step-up position to be acceptable to the downward step-down probe.
**Failure mechanic:** At the top of the cellar ramp:
1. Sphere proposes to advance up the ramp by ~0.10 m horizontal + 0.05 m vertical.
2. The advance puts the sphere bottom AT world Z ≈ 92.37 (still 1.63 m below cottage floor at world Z=94).
3. Step-up logic fires (because there's a +Z component in the offset).
4. Step-up calls DoStepDown with stepDownHeight=0.6 m to find a walkable surface within reach.
5. Step-down probes the sphere downward by 0.6 m to world Z ≈ 91.77, but no walkable polygon exists at that altitude in any of the overlapping cells (0x0147, 0x0143, 0x0146).
6. step-down rejects → step-up rejects → rollback restores sphere Y and Z, advances X by sliding amount.
7. Sphere is now in IDENTICAL state next tick → infinite loop.
### Two candidate fix shapes (TO RESEARCH — DO NOT CODE YET)
**Shape 1 — keep ramp as ContactPlane during the climb.** Match retail's behavior of NOT clearing ContactPlane between AdjustOffset calls when the player is mid-ramp. Retail's BPE shows CP is "sticky" on the cellar floor, then suddenly transitions to cottage floor. Our per-step reset at TransitionTypes.cs:721-725 clears CP every step; this is the documented "ACE order" but may not match retail.
**Shape 2 — fix step-up to look UPWARD for cottage floor.** When step-up fails to find a walkable directly below the proposed position, probe UPWARD by `stepUpHeight` looking for a walkable that the sphere can land on after a vertical lift. This is the natural "climb up a ledge" behavior. The current step-up only probes downward (via DoStepDown).
**Shape 3 — preserve walkPoly across rollback.** When step-up rejects, the rollback should preserve `walkPoly=True` if the PREVIOUS frame had it (the sphere was on a valid walkable). Currently `walkPoly=False` after rollback, which then poisons the next tick's `OnWalkable` check.
These three shapes are NOT mutually exclusive. The fix may need shape 1 + 3, or shape 2 alone, or some combination.
---
## What this rules out
| Hypothesis | Status |
|---|---|
| AdjustOffset projection broken (decision-tree Branch A / B / C / D) | **RULED OUT** — projection works correctly, +zGain per call is consistent and matches the math. |
| WalkInterp depletion gating forward motion | **RULED OUT** — winterp=1.0 at find-start of every freeze tick. Only DEPLETED winterp=-0.0000 appears AFTER stepdown-reject, which is a consequence not a cause. |
| Cell-resolver ping-pong between cellar and cottage | **RULED OUT** — every tick has cell=0xA9B40147→0xA9B40147 (no transition); slice-3 stickiness fix held. |
| Step-down rejected because no walkable found above sphere | **NOT TESTABLE BY THIS PROBE** — this probe is inside AdjustOffset, not inside DoStepDown's accept-condition check. A follow-up probe inside the accept-condition check would prove which of the three accept clauses fails. We CAN see it indirectly: `cp=n/a` at stepdown-after-insert tells us ContactPlaneValid is false at the moment of the check. |
---
## Pickup prompt for the fix plan
```
A6.P3 issue #98 — [step-walk-adjust] capture analysis complete.
Read FIRST:
docs/research/2026-05-23-a6-stepwalkadjust-findings.md
docs/research/2026-05-23-a6-captures/stepwalkadjust/acdream.log
(search for "stepdown-reject" and the freeze tick at line ~3891)
Conclusion: Fix target is "Target E" (new) — step-up validation
rejects ramp-climb advances at the top of the cellar ramp because
the cottage floor is too far ABOVE the proposed step-up position to
be found by the downward step-down probe.
Three candidate fix shapes:
1. Keep ramp ContactPlane sticky across per-step resets (match retail).
2. Make step-up probe UPWARD for the next walkable (climb-up behavior).
3. Preserve walkPoly across rollback to avoid OnWalkable being poisoned.
Next: research which shape matches retail's named decomp at
acclient_2013_pseudo_c.txt (search step_sphere_up, step_sphere_down,
find_walkable). Retail's BPE writes ONLY ever set CP to flat floors
(cellar Z=90.95 then cottage Z=94) — never to the ramp.
The replay harness (Issue98CellarUpReplayTests, <200ms) is the inner
test loop. The cdb capture in cellar_up_capture_1/ is the ground-truth
oracle. The fix MUST flip the failing-frame assertions in the replay
tests — that's the contract.
Test baseline: 1167 + 8. CLAUDE.md rules apply. No workarounds.
```

View file

@ -0,0 +1,330 @@
# A6.P4 — Retail-faithful per-cell shadow_object_list port — pickup handoff
**Date:** 2026-05-24 (end of A6.P3 session, start of A6.P4 plan)
**Status:** Ready to start. Design committed (b55ae83). Pre-flight pending in slice 1's first moves.
**Worktree:** `C:\Users\erikn\source\repos\acdream\.claude\worktrees\strange-albattani-3fc83c`
**Branch:** `claude/strange-albattani-3fc83c`
**Milestone:** M1.5 — "Indoor world feels right" (active)
**Predecessor:** A6.P3 (issue #98 cellar-up) — closed 2026-05-24 by `b3ce505` as a behavioral stopgap. A6.P4 ships the full architectural port and removes the stopgap.
---
## TL;DR for the next session
1. **State both altitudes** in your first message: M1.5 active; current phase A6.P4; first concrete step is the slice-1 pre-flight reads (Q1 + Q2 below).
2. **Read these three documents first** (in this order, ~15 min):
- `docs/superpowers/specs/2026-05-24-phase-a6-p4-retail-shadow-architecture.md` — the design (slices, anchors, risks)
- `docs/research/2026-05-23-a6-p3-issue98-comparison-harness-findings.md` — the Resolution section at the bottom (architectural divergence + b3ce505 stopgap + door regression)
- `docs/ISSUES.md`#98 (DONE, contextual), #99 (OPEN — what slice 1 closes), #100 (OPEN — separate phase after A6.P4)
3. **Resolve the two pre-flight questions** (~20 min total) before touching code.
4. **Slice 1 implements** in ~30 min. Test + visual + commit.
5. **Slices 2-3** follow in subsequent sessions (one per session ideally).
6. **Then #100** (transparent ground around houses) — separate phase.
---
## What's already done (DO NOT REDO)
### Commits on this branch (recent, A6.P3 + handoff)
- `b3ce505` — fix(phys): A6.P3 #98 — gate outdoor shadow radial sweep on indoor primary cell. **Stopgap; slice 3 of A6.P4 removes it.**
- `b55ae83` — docs: A6.P3 #98 resolution + A6.P4 design + #99/#100 filed. **Includes the design doc you'll execute against.**
### Memory entries (out-of-tree at `C:\Users\erikn\.claude\projects\C--Users-erikn-source-repos-acdream\memory\`)
- `feedback_retail_per_cell_shadow_list.md` — the architectural lesson + decomp anchors
- `feedback_apparatus_for_physics_bugs.md` — the apparatus pattern (live capture + dump + harness)
- `MEMORY.md` index updated
### Apparatus in tree (REUSE; don't rebuild)
- `PhysicsResolveCapture` ([`src/AcDream.Core/Physics/PhysicsResolveCapture.cs`](../../src/AcDream.Core/Physics/PhysicsResolveCapture.cs)) — env var `ACDREAM_CAPTURE_RESOLVE=<path>` writes JSON Lines per `ResolveWithTransition` call
- `GfxObjDump` / `GfxObjDumpSerializer` ([`src/AcDream.Core/Physics/GfxObjDump.cs`](../../src/AcDream.Core/Physics/GfxObjDump.cs)) — env var `ACDREAM_DUMP_GFXOBJS=0xHHH,0xHHH,...`
- `CellDump` / `CellDumpSerializer` ([`src/AcDream.Core/Physics/CellDump.cs`](../../src/AcDream.Core/Physics/CellDump.cs)) — env var `ACDREAM_DUMP_CELLS=0xHHH,...`
- Harness: [`tests/AcDream.Core.Tests/Physics/CellarUpTrajectoryReplayTests.cs`](../../tests/AcDream.Core.Tests/Physics/CellarUpTrajectoryReplayTests.cs) — `LiveCompare_*` test pattern
- Fixtures at `tests/AcDream.Core.Tests/Fixtures/issue98/` — 16 cell dumps + cottage GfxObj `0x01000A2B.gfxobj.json` + 3-record `live-capture.jsonl`
---
## Direction: A6.P4 full (slices 13), then #100
**Why this order** (user decision 2026-05-24): #99 (doors) is a regression from b3ce505 that needs prompt fix; slices 2-3 close it architecturally and likely fold in #97 (phantom collisions) + Finding 3 family (sling-out); doing the full port in one phase preserves apparatus + decomp context that would degrade if we paused for #100 in the middle. #100 is cosmetic (visual ground) and doesn't block any demo target.
**User's stated value driving the choice:** "I want retail parity on collision." Quoted in `feedback_no_patching_collision.md`. The b3ce505 stopgap is, by my own commit message, "the smallest behavioral patch matching retail's effect at the query level" — A6.P4 is the actual port.
---
## Slice 1 — query-side portal expansion (1-2 hours)
### Goal
Close issue #99 (run-through doors) by extending the query side of `GetNearbyObjects` to include portal-reachable outdoor cells when the primary cell is indoor. **Minimal change; sets up slice 2's registration-side refactor.**
### Pre-flight (~20 min — answer BEFORE writing code)
**Q1: Does `CellPhysics.VisibleCellIds` include the outdoor cell on the other side of a building doorway?**
- Read [`src/AcDream.Core/Physics/CellPhysics.cs`](../../src/AcDream.Core/Physics/CellPhysics.cs) — find what populates `VisibleCellIds`
- Read [`src/AcDream.Core/World/LandblockLoader.cs`](../../src/AcDream.Core/World/LandblockLoader.cs) — find where portal data hydrates into CellPhysics
- Cross-ref against a real loaded EnvCell — `tests/AcDream.Core.Tests/Fixtures/issue98/0xA9B40143.json` has the cottage main floor; does its CellBSP / portal data list any outdoor cell?
- **Decision branch:**
- If `VisibleCellIds` DOES include outdoor neighbors → slice 1 is straightforward; walk that list, filter by `< 0x0100u` (outdoor), include in indoor query
- If `VisibleCellIds` is indoor-only → walk the cell's `Portals` directly (each `PortalInfo` has an `OtherCellId`); collect those that resolve outdoor
**Q2: Are doors actually registered with outdoor cellScope today?**
- Find the door spawn path. Likely candidates:
- [`src/AcDream.App/Rendering/GameWindow.cs:3139`](../../src/AcDream.App/Rendering/GameWindow.cs:3139) — server-spawned entities register here (Cylinder collision)
- `EntitySpawnAdapter` or `WorldEntityFactory` — the construction path
- Check what `cellScope` is passed. Default: `cellScope = entity.ParentCellId ?? 0u`. For a door at a doorway, `ParentCellId` might be:
- **null** → cellScope=0u → landblock-wide registration → currently registered via outdoor 24m grid → the b3ce505 gate now skips it from indoor queries → walk-through
- **the indoor cell** → cellScope=that-cell-id → registered indoor-scoped → indoor query already finds it (no #99 bug from this door)
- **the outdoor cell** → cellScope=that-cell-id → indoor-scoped registration with an outdoor cellId (an A1.5 corner case) → behavior depends on how `GetNearbyObjects` handles outdoor cellScope (likely treats it as indoor branch and skips it via the `< 0x0100u` filter — needs verification)
- **If Q2 reveals doors aren't outdoor-registered**, the diagnosis is wrong. Stop coding, re-trace the regression via launch + `ACDREAM_CAPTURE_RESOLVE` + the door scenario.
**If Q1 + Q2 both confirm the design**, proceed to implementation. Otherwise adjust slice 1.
### Implementation (~30 min)
Files to touch:
- `src/AcDream.Core/Physics/ShadowObjectRegistry.cs``GetNearbyObjects` gains a new parameter `IReadOnlyCollection<uint>? portalReachableOutdoorCells = null`. When primary is indoor and this is non-null, iterate the outdoor cells listed (each is a regular cell key into `_cells`) and merge into results.
- `src/AcDream.Core/Physics/TransitionTypes.cs:2180+` — in `FindObjCollisions`, after computing `indoorCellIds` via `CellTransit.FindCellSet`, build a `portalReachableOutdoorCells` set by walking each indoor cell's `VisibleCellIds` (or `Portals` per Q1 answer) and filtering outdoor ids (`< 0x0100u` low byte). Pass to `GetNearbyObjects`.
Test:
- New `LiveCompare_DoorThroughDoorway_*` test. Two options:
- **(preferred)** Capture a live tick where a door blocks the player at a Holtburg doorway. `ACDREAM_CAPTURE_RESOLVE=<path>` set. Walk into the inn doorway with door closed. Find the tick where the engine detected the door (`obj=0x...` in the `[resolve]` probe). Add the record to a new fixture.
- **(fallback)** Synthetic harness test: register a fake door Cylinder shadow at a known doorway portal position with the right outdoor cellScope, verify `FindObjCollisions` from the indoor cell returns it. Same shape as the existing harness tests.
Tests must pass:
- 11/11 `CellarUpTrajectoryReplayTests` continue passing
- 19+ `ShadowObjectRegistryTests` continue passing
- New door test passes
Visual verification:
- Launch acdream (use the `Run-WithLogout` pattern from `CLAUDE.md` to avoid 3-minute stuck-session)
- Walk into a Holtburg cottage — door blocks from outside ✓
- Walk inside, walk back toward the doorway — door blocks from inside ✓ (this was the regression)
- Walk into the cellar — cellar climb still works ✓ (no #98 regression)
- Bump into a chair / fireplace inside — still blocks ✓ (no indoor-static regression)
- Bump into a building exterior wall from outside — still blocks ✓ (no outdoor-static regression)
Commit shape:
```
feat(phys): A6.P4 slice 1 — portal-reachable outdoor cells in indoor shadow query
Closes #99. The b3ce505 stopgap (gate outdoor sweep on indoor primary cell)
correctly closes #98 but blocks doors registered to outdoor cells from
being seen by spheres in the adjacent indoor cell. Mirrors retail's
behavior via query-side portal expansion: when primary cell is indoor,
walk indoor cells' VisibleCellIds (or Portals), include any portal-
reachable outdoor cells in the iteration set.
This is slice 1 of A6.P4. Slice 2 ports retail's full Register-side cell-
set computation; slice 3 removes the b3ce505 gate entirely.
Pre-flight Q1+Q2 verified before coding:
- Q1: VisibleCellIds is populated with [populate with answer]
- Q2: doors register with cellScope=[populate]
Verification:
- 11/11 CellarUpTrajectoryReplayTests pass
- new LiveCompare_DoorThroughDoorway test passes
- ShadowObjectRegistry tests pass
- visual: doors block both sides, cellar still climbable, indoor + outdoor
statics unaffected
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
```
---
## Slice 2 — registration-side BuildShadowCellSet (~half day with verification)
### Goal
Port retail's `CObjCell::find_cell_list` indoor/outdoor branch + portal-visible recursion into `ShadowObjectRegistry.Register`. After slice 2, objects are placed in retail-faithful per-cell shadow lists at registration time — the query side becomes pure per-cell list iteration.
### Plan
- New helper `ShadowObjectRegistry.BuildShadowCellSet(boundingSphere, m_positionCellId, landblockContext)` returns the set of cellIds the object should be registered in.
- If `m_positionCellId` is indoor (≥ 0x0100): include that cell, recurse via the cell's portal-visible neighbors (use `VisibleCellIds` or walk `Portals.OtherCellId`)
- If outdoor: enumerate outdoor cells the bounding sphere overlaps — current behavior for cellScope=0
- `Register` deprecates `cellScope` param (Obsolete attribute kept for slice 2). New required param `m_positionCellId`.
- All 6 production registration sites in [`GameWindow.cs`](../../src/AcDream.App/Rendering/GameWindow.cs) updated to pass the entity's m_position cellId:
- `:3139` server-spawned entities — pass `spawn.Position.Value.LandblockCellId` (or analog)
- `:5893` landblock-baked statics — pass the static's resolved cellId (compute from world XY if no `ParentCellId`)
- `:5963, :5999, :6024, :6211` setup-derived primitive shapes — same as 5893
### Tests
- `Register_OutdoorPosition_RegistersInOutdoorCellsOnly` — outdoor m_position, indoor cell list is empty for that entity
- `Register_IndoorPosition_RegistersInThatCellAndPortalNeighbors` — indoor m_position, the cell + portal-visible cells are in the list
- Existing 11/11 harness tests + 19+ ShadowObjectRegistry tests continue passing
- Slice 1's `LiveCompare_DoorThroughDoorway` continues passing
### Risks (call-outs from design doc §5)
- **Two-tier streaming order:** if far-tier cells load BEFORE their portal-visible neighbors are loaded, `BuildShadowCellSet` might miss portal cells that arrive later. Mitigation: verify the streaming order in `StreamingController` + `LandblockStreamer`. Possibly re-register on cell load if a portal-neighbor arrives late.
- **Live entity perf:** `UpdatePosition` runs at 5-10 Hz per visible entity. `BuildShadowCellSet`'s portal-traversal is O(portal_count_per_cell). Measure before/after — should still be sub-microsecond.
### Commit shape
```
feat(phys): A6.P4 slice 2 — BuildShadowCellSet for retail-faithful Register
refactor(phys): A6.P4 slice 2 — production call sites pass m_positionCellId
```
(Two commits — feat for the registry change, refactor for the GameWindow.cs site updates. Keep them in separate commits so a future bisect can attribute regressions cleanly.)
---
## Slice 3 — remove b3ce505 stopgap (~few hours)
### Goal
Delete the `primaryCellId` parameter on `ShadowObjectRegistry.GetNearbyObjects` and the indoor-primary skip gate. After slice 2, the architecture no longer needs query-time gating — the right shadows are returned by per-cell iteration alone.
### Plan
- `ShadowObjectRegistry.GetNearbyObjects`: remove `primaryCellId` param + the `if ((primaryCellId & 0xFFFFu) >= 0x0100u) return;` block
- `TransitionTypes.cs:2180` (`Transition.FindObjCollisions`): drop the `primaryCellId: sp.CheckCellId` argument
- `LiveCompare_FirstCap_FixClosesCottageFloorCap` test docstring: update to attribute the fix to registration-side cell-set computation instead of query-side gate
- Remove slice-1's `portalReachableOutdoorCells` parameter too if slice 2's registration-side fix obsoletes it (verify by running slice 3 without it and confirming doors still work)
### Verification — the load-bearing check
After slice 3, the fix is supposed to live at the registration side, not the query side. Visual verify that:
- Cellar still climbable (#98 still closed)
- Doors still block both sides (#99 still closed)
- Indoor statics still block (chair, fireplace)
- Outdoor statics still block (building walls from outside)
If anything regresses after removing the stopgap, slice 2 didn't fully port the registration-side architecture — investigate before declaring slice 3 done.
### Commit shape
```
refactor(phys): A6.P4 slice 3 — remove b3ce505 indoor-primary gate (stopgap retired)
docs: A6.P4 ship — #98 architectural close, #99 close, likely-closes #97 + Finding 3 family
```
---
## After A6.P4: #100 (transparent ground around houses)
### What we know
- Bisected to commit `35b37df` ("chore(phys): A6.P3 #98 triage")
- Introduced the `hiddenTerrainCells` mechanism in `src/AcDream.Core/Terrain/LandblockMesh.cs:178` — collapses terrain triangles in outdoor cells where buildings sit
- Granularity is 24m × 24m outdoor cell; cottage footprint is ~12m × 12m → entire 24m cell hidden but cottage only fills part of it → dark rectangle around every house
- The hide list comes from `LandblockLoader.BuildBuildingTerrainCells` reading `LandBlockInfo.Buildings`
### Three fix paths (from `docs/ISSUES.md` #100)
1. **Polygon-level terrain occlusion** — build per-building convex-hull cutouts, modify mesh to have a polygon-precise hole. Retail-faithful (probably) but real engineering work in `LandblockMesh.Build`
2. **Drop the hiddenTerrainCells mechanism + Z lift** — accept that buildings sit on terrain and use a render-only Z lift on building floors (same trick env cell floors already use at `GameWindow.cs:5363 + Vector3(0,0,0.02f)`)
3. **Render the building's "yard" mesh** — if retail has a stone-foundation mesh around each building, render it. Need retail visual research
Option 2 is the smallest and probably right; option 1 is the most faithful. Decide via retail visual cross-check at session start.
### Phase shape
File as A6.P5 or N.7 (it's rendering, not physics — should be in a separate phase letter). Likely 1 session (small change + visual verification).
---
## Decomp anchors (one stop reference)
All from `docs/research/named-retail/acclient_2013_pseudo_c.txt`:
| Line | Function | Role |
|---|---|---|
| 308742+ | `CObjCell::find_cell_list(Position, ...)` | Cell list at registration |
| 308751-308769 | (within) indoor/outdoor branch | Indoor adds 1; outdoor calls `add_all_outside_cells` |
| 308773-308825 | (within) visible-cells recursion | Portal traversal via vtable offset 0x80 |
| 282819+ | `CPhysicsObj::add_shadows_to_cells(CELLARRAY)` | Adds to each cell's list |
| 283322, 283369, 283389 | call sites | Build cell array, then add_shadows_to_cells |
| 308584+ | `CObjCell::add_shadow_object` | Per-cell list append |
| 308916 | `CObjCell::find_obj_collisions(this, ...)` | Per-cell iteration at query time |
| 309560 | `CEnvCell::find_collisions` | Indoor entry — env then obj |
| 316951 | `CLandCell::find_collisions` | Outdoor entry — env then sort then obj |
---
## CLAUDE.md rules that apply
- **No workarounds without approval** — A6.P4's purpose IS removing a workaround (b3ce505). Don't add new ones. If slice 2 reveals an architectural mismatch that needs a band-aid, STOP and file an issue with full repro notes.
- **Retail-faithful first; cleaner second** — if a retail-port decision conflicts with a modern-design preference, retail wins.
- **Visual verification belongs to the user** — at the end of each slice, request a launch. Don't claim "fix verified" without it.
- **Work-order autonomy** — Claude picks the next step; user reviews. Don't ask "should I start slice 2?"; do it after slice 1 verifies.
- **Apparatus-first for physics divergences** — if any slice surfaces a new bug, build apparatus before guessing (per `feedback_apparatus_for_physics_bugs.md`).
---
## Pickup prompt for next session
```
A6.P4 — retail-faithful per-cell shadow_object_list port. Three slices,
then issue #100. Worktree open:
C:\Users\erikn\source\repos\acdream\.claude\worktrees\strange-albattani-3fc83c
Read FIRST (in order, ~15 min):
1. docs/research/2026-05-24-a6-p4-pickup-handoff.md — this handoff
(the canonical pickup; everything else expands from it)
2. docs/superpowers/specs/2026-05-24-phase-a6-p4-retail-shadow-architecture.md
— the design doc (slices, anchors, risks)
3. docs/research/2026-05-23-a6-p3-issue98-comparison-harness-findings.md
— Resolution section at the bottom (the saga that led here)
State both altitudes at the start:
Currently working toward: M1.5 — Indoor world feels right
Current phase: A6.P4 slice 1 — query-side portal expansion to close #99
(run-through doors regression from b3ce505)
Direction (user-approved 2026-05-24):
Option B — A6.P4 full (slices 1-3) then issue #100 (transparent ground).
Slice 1 closes #99 fast. Slices 2-3 port retail's Register-side cell-set
computation and remove the b3ce505 stopgap. Likely closes #97 + Finding 3
family as side effects. #100 is a separate phase after A6.P4 (rendering,
not physics).
DO NOT REDO:
b3ce505 — issue #98 cellar fix (visual-verified by user 2026-05-24)
b55ae83 — design doc + #98 resolution + #99/#100 filed + memory entries
Apparatus already in tree: PhysicsResolveCapture, GfxObjDump, CellDump,
CellarUpTrajectoryReplayTests harness + fixtures
Slice 1 first moves (in order):
(1) PRE-FLIGHT Q1 (~10 min): Does CellPhysics.VisibleCellIds include
the outdoor cell on the other side of a building doorway? Read
src/AcDream.Core/Physics/CellPhysics.cs + LandblockLoader.cs.
Cross-ref with tests/AcDream.Core.Tests/Fixtures/issue98/0xA9B40143.json
(cottage main floor cell). If yes, slice 1 walks VisibleCellIds.
If no, slice 1 walks Portals.OtherCellId directly.
(2) PRE-FLIGHT Q2 (~10 min): Are doors actually registered with
outdoor cellScope today? Find the door spawn path (likely
GameWindow.cs:3139 + EntitySpawnAdapter), trace cellScope passed.
If doors aren't outdoor-registered, the #99 diagnosis is wrong;
stop and re-investigate via ACDREAM_CAPTURE_RESOLVE at a Holtburg
doorway.
(3) IMPLEMENT (~30 min if Q1+Q2 confirm):
- ShadowObjectRegistry.GetNearbyObjects gains an optional
portalReachableOutdoorCells parameter
- TransitionTypes.cs:2180 (FindObjCollisions) computes the set
from indoorCellIds + VisibleCellIds/Portals
- New LiveCompare_DoorThroughDoorway_* test (live capture
preferred; synthetic fallback)
- 11/11 CellarUpTrajectoryReplayTests must still pass
(4) VERIFY (user-side): launch acdream, walk cottage cellar (still
climbable), test doors from both sides (block from both sides
now), bump indoor furniture (still blocks), bump outdoor walls
(still blocks).
(5) COMMIT (per slice 1 commit shape in the handoff doc).
Slices 2-3 plans + #100 plan in the handoff doc — execute one slice
per session, visual-verify between, file follow-ups as discovered.
CLAUDE.md rules apply:
- No workarounds (the b3ce505 stopgap is what slice 3 retires; don't
add new ones)
- Apparatus-first if a new bug surfaces (3+ failed attempts = stop)
- Visual verification belongs to user
- Work-order autonomy — keep going through slices without asking
"should I continue?"
Test baseline: 11/11 CellarUpTrajectoryReplayTests + 19+
ShadowObjectRegistry + 4 GfxObjDumpRoundTrip + 4 CellDumpRoundTrip
+ 1 PhysicsDiagnosticsTests pass in isolation. Maintain. Pre-existing
8-19 static-state-leakage failures in serial physics suite are
unchanged from baseline (verified by stash+retest pattern).
```

View file

@ -0,0 +1,215 @@
# Door collision — apparatus replay shipped, root cause identified
2026-05-24 (continuation of the door-collision investigation)
> **SUPERSEDED 2026-05-25** by
> [`docs/research/2026-05-25-door-bug-partial-fix-shipped.md`](2026-05-25-door-bug-partial-fix-shipped.md).
> The root-cause analysis here was correct in direction
> (cell-portal traversal is upstream of BSP query) but missed the
> specific bug: `CellTransit.AddAllOutsideCells` silently failed for
> landblock-local sphere coords (production's convention) because it
> subtracted an absolute-world `lbXf` offset. Diagnosis + fix in the
> 2026-05-25 doc.
## TL;DR
The trajectory-replay apparatus is **wired and useful**. Run the diagnostic
test for the failing tick and the engine's full `[step-walk]` trace
prints, naming the divergence per-field.
**The bug: `CellTransit.FindCellSet` does not surface outdoor cell
`0xA9B40029` (where the door is registered) from indoor primary cell
`0xA9B40150`.** With issue #98's indoor-cell gate on the outdoor radial
sweep, the door is therefore invisible to `GetNearbyObjects` and the
BSP slab is never tested. The player walks through unimpeded.
Cn=(0,1,0) from the harness is **not the door** — it's the seeded
walkable polygon's south edge being treated as a wall when the sphere
falls off it. The harness reproduces production's "door not queried"
behavior, just with an apparatus artifact in place of clean walkthrough.
## What was shipped
1. **Live capture** (`door-walkthrough.jsonl`, 24,310 records ≈ 45 MB).
The capture was driven via `ACDREAM_CAPTURE_RESOLVE` + the existing
`[entity-source]` + `[bsp-test]` probes. **One record per
`PhysicsEngine.ResolveWithTransition` call** with full
`PhysicsBody` snapshots before/after.
2. **Fixture extraction**
([tests/AcDream.Core.Tests/Fixtures/door-bug/live-capture.jsonl](../../tests/AcDream.Core.Tests/Fixtures/door-bug/live-capture.jsonl), 4 KB).
Two representative ticks pulled from the JSONL:
- **Tick 13558** — the walkthrough. Player at (132.36, 16.81, 94) in
**indoor cell 0xA9B40150**, target (132.43, 17.20, 94). Live
result.Position = target with `collisionNormalValid = false`. Door
centered at world XY (132.57, 16.99), BSP radius 1.975, state
`0x00010008` = `PERSISTENT_PS | 0x8` (NO `ETHEREAL_PS = 0x4`
**CLOSED**).
- **Tick 22760** — the working block. Player at (133.14, 18.02, 94)
in **outdoor cell 0xA9B40029**, target (133.10, 17.60, 94). Live
blocks at Y=18.018 with cn=(0, +1, 0). Same door, different
primary cell type.
3. **Replay harness**
([DoorBugTrajectoryReplayTests.cs](../../tests/AcDream.Core.Tests/Physics/DoorBugTrajectoryReplayTests.cs)):
loads tick fixtures, hydrates door GfxObj `0x010044B5` from real dat
(`DatCollection.Get<GfxObj>`), registers a synthetic door via
`ShadowObjectRegistry.RegisterMultiPart` at the captured BSP world
center (`(132.57, 16.99, 95.36)`) with `cellScope=0u` (mirrors
production registration at
[GameWindow.cs:3158-3167](../../src/AcDream.App/Rendering/GameWindow.cs#L3158)).
`AssertCallMatchesCapture` replays the call and prints the first
per-field divergence. Diagnostic variant enables every
`PhysicsDiagnostics.Probe*Enabled` and dumps the full engine trace.
## Chronology (from `door-walkthrough.launch.log`)
Confirmed the door state at the time of every walkthrough:
| Log line | Event |
|---|---|
| 10796 | `[setstate]` door state → `0x0001000C` (PERSISTENT + ETHEREAL = OPEN) |
| 10993 | `[setstate]` door state → `0x00010008` (PERSISTENT, NOT ethereal = CLOSED) |
| 1099511071 | First and last `[bsp-test]` line on door 0x000F4246. All `state=0x00010008` |
So every `[bsp-test]` hit on the door, and every walkthrough event in
the JSONL, is against the **closed** door. The bug is real, not an
ETHEREAL pass-through.
## What the diagnostic test prints (tick 13558)
```
=== Replay tick 13558 (the walkthrough) ===
[step-walk] site=find-start cur=(132.36,16.81,94) ... walkPoly=True
[step-walk-adjust] branch=into-plane input=(0.07,0.39,0.00) output=(0.07,0.39,0.00) zGain=0
[step-walk] site=before-insert ... delta=(0.0744,0.3928,0) cell=0xA9B40150 ... walkPoly=True
[step-walk] site=stepdown-enter ... delta=(0.0744,0.3928,0) stepDown=True walkableZ=0.6642
[step-walk] site=stepdown-after-offset ... delta=(0.0744,0.3928,-0.75) ... walkPoly=True
... (probes down by 0.75, then 1.5; all OK; walkPoly=True)
[step-walk] site=stepdown-enter ... delta=(0.0744,0.0000,0) ... hit=(0,-1,0) walkPoly=False
... (probes down again; hit stays (0,-1,0); walkPoly=False throughout)
[step-walk] site=after-insert state=Collided ... hit=(0,-1,0) walkPoly=False
[step-walk] site=after-validate state=OK ... position back to input
[resolve] in=(132.360,16.811,94) cell=0xA9B40150 tgt=(132.435,17.204,94)
out=(132.360,16.811,94) cell=0xA9B40150 ok=True
hit=yes n=(0,-1,0) walkable=True
=== Harness: pos=(132.36,16.81,94) cn=(0,-1,0) cnValid=True onGround=True cell=0xA9B40150
=== Live: pos=(132.43,17.20,94) cn=(0,0,0) cnValid=False onGround=True cell=0xA9B40150
```
**No `[bsp-test]` line fires.** The door's BSP is never queried. The
hit `(0, -1, 0)` is the engine's "sliding off the south edge of the
seeded walkable polygon" response — not a door collision.
This matches production: at indoor primary cell `0xA9B40150`,
`GetNearbyObjects` returns ZERO shadows because:
1. The captured `cellId` low-nibble `0x150 >= 0x100` → indoor →
issue #98's gate at
[ShadowObjectRegistry.cs:480](../../src/AcDream.Core/Physics/ShadowObjectRegistry.cs#L480)
skips the outdoor radial sweep.
2. `portalReachableCells` (built by `CellTransit.FindCellSet`) lacks
outdoor cell `0xA9B40029`. In the harness, this is because we
register no cell fixture for `0xA9B40150` and the indoor branch at
[CellTransit.cs:403-407](../../src/AcDream.Core/Physics/CellTransit.cs#L403)
early-returns with empty candidates. **In production**, the cell
IS in cache but the traversal still doesn't produce `0xA9B40029`
the cell's exit portal (`OtherCellId=0xFFFF`) either doesn't fire
`exitOutside=true` at the sphere's position, or `AddAllOutsideCells`
isn't computing the right outdoor cell.
## Next investigation move
**Dump cell `0xA9B40150` from the dat and inspect its portal list.**
Two ways:
a) **Dat-direct read in a test** (preferred — no live launch). Pattern
from
[DoorSetupGfxObjInspectionTests](../../tests/AcDream.Core.Tests/Physics/DoorSetupGfxObjInspectionTests.cs):
`dats.Get<EnvCell>(0xA9B40150u)`, then iterate
`envCell.CellPortals` and print each portal's `OtherCellId`,
`PolygonId`, `Flags`. If no portal with `OtherCellId == 0xFFFF`,
`exitOutside` can never be true → bug is in the cell's portal-graph
loading (or the cottage doesn't connect via 0xFFFF exit portals;
it might use the building-shell path via
`BuildingPhysics.CheckBuildingTransit` instead).
b) **Live `ACDREAM_DUMP_CELLS=0xA9B40150,0xA9B4013F,0xA9B40154`**
another launch cycle. Less preferred; we already have what we need
from the dat read.
The dat-direct read can be a new test method in
`DoorSetupGfxObjInspectionTests` (it's the natural home for this
class of dat-introspection checks).
## What NOT to do next
1. **Don't speculate on the fix.** We have the right replay apparatus
now; the next move is **read the dat** to determine the cell's actual
portal structure. Then we'll know whether the bug is in the dat
data, the portal loading, the exit-portal detection in
`FindTransitCellsSphere`, or `AddAllOutsideCells`'s grid math.
2. **Don't modify the replay test to mask the walkable-polygon edge
artifact.** The artifact is harmless (it documents that, given a
single isolated walkable poly, the engine treats its boundary as a
wall — true regardless of the door bug). The interesting finding is
"no `[bsp-test]` line"; the edge artifact just happens to fill the
collision slot.
3. **Don't re-do the registration shape.** Multi-part registration
+ dedup fix + Task 7 wiring are correct. Verified by the harness's
ability to query the door registration (it just isn't reached at
indoor primary cells).
## Files touched this session
**Committed:** none yet — pending commit at session end.
**Uncommitted:**
- `tests/AcDream.Core.Tests/Fixtures/door-bug/live-capture.jsonl`
2 captured ResolveWithTransition records (tick 13558 walkthrough +
tick 22760 outdoor block)
- `tests/AcDream.Core.Tests/Physics/DoorBugTrajectoryReplayTests.cs`
apparatus: 2 LiveCompare tests + 1 Diagnostic dump
- `docs/research/2026-05-24-door-bug-apparatus-shipped-findings.md`
this doc
## Pickup prompt for the next session
```
A6.P4 door bug — apparatus replay shipped. DoorBugTrajectoryReplayTests
loads tick 13558 (walkthrough) and 22760 (block) from a captured fixture
and replays through the engine. Door 0x000F4246 (closed, state=0x00010008,
BSP world (132.57, 16.99, 95.36) radius 1.975) IS registered correctly
in the harness, BUT the engine never queries it from indoor primary cell
0xA9B40150 — no [bsp-test] line fires. Root cause located:
CellTransit.FindCellSet's portal traversal does not surface outdoor cell
0xA9B40029 from indoor cell 0xA9B40150.
Read docs/research/2026-05-24-door-bug-apparatus-shipped-findings.md
State both altitudes:
Currently working toward: M1.5 — Indoor world feels right
Current phase: A6.P4 door bug — cell-portal investigation.
Apparatus shipped; next step is to dump cell
0xA9B40150's portal list (from the dat) and
determine why FindTransitCellsSphere doesn't
add outdoor cell 0xA9B40029 to candidates.
First move: add a test to DoorSetupGfxObjInspectionTests (or a new
CellPortalDatInspectionTests file) that reads EnvCell 0xA9B40150 from
the real dat and prints every portal's OtherCellId, PolygonId, Flags.
Then read 0xA9B4013F (player's other indoor cell from JSONL) and
0xA9B40029 (door's outdoor cell) for cross-comparison. The portal
structure will reveal whether cottages use 0xFFFF exit portals
(FindTransitCellsSphere path) or building-shell portals
(CheckBuildingTransit path). If 0xFFFF exit portals exist but
exitOutside isn't firing, the bug is in the sphere-vs-plane test
at CellTransit.cs:99-112. If they don't exist, the building-shell
path is misconfigured for indoor-primary calls.
DO NOT:
- Modify the replay test to mask the walkable-polygon-edge artifact
- Re-do the registration shape (correct)
- Speculate on the fix without dat evidence
```

View file

@ -0,0 +1,188 @@
# Door collision — end-of-session handoff (2026-05-24, late)
**Branch:** `claude/strange-albattani-3fc83c`
**Worktree:** `C:\Users\erikn\source\repos\acdream\.claude\worktrees\strange-albattani-3fc83c`
## TL;DR — what was actually accomplished
**No user-visible bug was fixed this session.** The door bug the user
reported at the start (center blocks, off-center walks through,
inside-out walks through) is **identically reproducible after the
4 commits** as it was before them.
What changed: infrastructure. Server-spawned doors now register
multi-part shadow shapes (cylinder + BSP slab) instead of one
cylinder approximation. The BSP slab is queried 135 times per door
near approach but produces **zero collision hits**, so observed
behavior is unchanged.
**Don't re-do the infrastructure.** It's correctly built and necessary
for any future fix. The remaining work is downstream of it.
## Commits landed (4)
```
163a1f0 diag(phys): [bsp-test] probe + grounded apparatus test + handoff
ca9341c feat(phys): A6.P4 Task 7 — RegisterLiveEntityCollision uses ShadowShapeBuilder + RegisterMultiPart
3b7dc46 fix(phys): GetNearbyObjects dedup-by-entityId silently drops multi-part shadows
e1d94d7 test(phys): door setup + GfxObj dat-inspection — Hypothesis A falsified
```
**Real-but-latent value from these:**
- The dedup-by-entityId issue (3b7dc46) was a latent footgun: any
future attempt at multi-part shadows (NPCs with hit-region capsules,
multi-part creatures, props with separated collision) would have
silently dropped all but the first shape. Now safe.
- The dat-inspection (e1d94d7) proved part 0 (`0x010044B5`) has a
real 1.9×0.26×2.5 m BSP slab in the dat. A future fix doesn't have
to question whether the data exists.
- The Task 7 wiring (ca9341c) puts the right architecture in place —
doors now register the shapes retail expects (cyl per CylSphere +
cyl per Sphere + BSP per Part-with-BSP).
- The `[bsp-test]` probe (163a1f0) fires before the cache lookup,
distinguishing "cache miss → silent skip" from "queried but no
hit" — neither of which `[resolve-bldg]` ever showed.
**Brutally clear: zero of these commits change observed door behavior.**
## What we now know vs. what we don't
### Known (from this session's probes)
- `0x010044B5` PhysicsBSP has 6 collision-bearing polygons forming a
1.925 × 0.261 × 2.490 m door slab. All `SidesType=Landblock`
(two-sided). Bounding sphere radius 1.975 m. Verified by direct
dat read.
- `0x010044B6` (the two leaf parts) have `HasPhysics=false`,
`PhysicsBSP=null`, `PhysicsPolygons.Count=0`. Visual-only by retail
design — our skip matches retail's
`CPhysicsPart::find_obj_collisions:275051`.
- Live Holtburg doors register with `shapes=cyl1+bsp1`. Cache is
populated. BSP entries are visited (135x for one door at player
approaches as close as 0.42 m).
- The BSP traversal produces ZERO attributed hits during live walking
(all 19 `[resolve-bldg]` lines show `gfxObj=0x00000000`, which is
the Cylinder shape). Whatever is happening inside
`SphereIntersectsPolyInternal` or the dispatch around it is
swallowing the hit silently.
### NOT known (don't speculate further)
- **Whether `DoStepUp` is involved.** The prior handoff doc
(`2026-05-24-door-collision-task7-shipped-but-bug-remains.md`)
asserted "step-up incorrectly succeeds" as the leading hypothesis.
That was over-reach. In the apparatus, `ACDREAM_DUMP_STEPUP=1`
produced no `stepup: ENTER` lines — `DoStepUp` was never called.
So the apparatus shows `hit=yes n=(0,0,1)` from some OTHER path
(terrain step-down? walkable poly preservation?). It does not
confirm step-up is the production bug.
- **Whether the production hit happens at the BSP polygon edge test,
the BSP node traversal, or some other layer.**
- **Whether the production code path is the same as the apparatus
path 5 in the first place.**
The earlier framing of "step-up is the bug" was a guess I inflated
into a conclusion. Treat it as a candidate, not a finding.
## Proper next move
**Same pattern that closed issue #98 after 6+ failed speculation rounds:
live capture + apparatus replay.**
The infrastructure for this already exists in the codebase:
1. `ACDREAM_CAPTURE_RESOLVE=<path>` env var (see
`src/AcDream.Core/Physics/PhysicsResolveCapture.cs`) captures every
player-side `PhysicsEngine.ResolveWithTransition` call as a JSON
Lines record with full `PhysicsBody` before-and-after snapshots.
2. `CellarUpTrajectoryReplayTests.LoadCapturedRecord` +
`AssertCallMatchesCapture` replay a captured record through a
harness engine and emit the first per-field divergence between
live and harness outputs.
The plan:
1. Launch with `ACDREAM_CAPTURE_RESOLVE=door-walkthrough.jsonl`
(no other probes — capture is independent).
2. Walk into a closed Holtburg cottage door 50 cm off-center.
3. Close gracefully. Save the JSONL.
4. Write a new test `LiveCompare_DoorOffCenterWalkthrough` that loads
the failing-tick record and replays it through a harness with the
real `0x010044B5` BSP hydrated + registered via
`RegisterMultiPart`. Compare per-field.
5. The first divergent field names the broken assumption. Fix that.
This is concrete, deterministic, and doesn't ask you to relaunch
multiple times for each fix attempt. The harness round-trip is <500
ms; a fix iteration is ~3 seconds.
## What NOT to do
1. **Do NOT re-do the multi-part registration.** It's correct. The
dedup fix is correct. Task 7 is correct. Verified by 53/53 tests
in the targeted scope.
2. **Do NOT speculate-and-fix.** This session burned cycles on a
"step-up is the bug" hypothesis that wasn't supported by the
evidence. The apparatus-first rule (`feedback_apparatus_for_physics_bugs.md`)
exists for exactly this. Build the apparatus before the fix.
3. **Do NOT re-investigate whether the door has BSP polygons.**
It does. 6 of them. Forming a full door slab. Cached. Visited.
4. **Do NOT relaunch with more probes hoping for an obvious signal.**
The probes we have already say "BSP visited 135 times, no hits."
More log lines won't tell us WHY it doesn't hit. The apparatus
replay will.
## Files to read first
- This doc (you're in it).
- `docs/research/2026-05-24-door-dat-inspection-findings.md` — the
dat data, polygon layout, bounding sphere center vs frame offset.
- `docs/research/2026-05-24-door-collision-task7-shipped-but-bug-remains.md`
— the prior end-of-session handoff. **Read with skepticism** — its
"leading hypothesis" section overstates confidence in the step-up
theory (corrected here).
- `tests/AcDream.Core.Tests/Physics/CellarUpTrajectoryReplayTests.cs`
— the capture+replay pattern to mirror for the door bug. See
`LiveCompare_*` tests.
## State of the M1.5 milestone
Doors at Holtburg cottages: center blocks, off-center walks through,
inside-out walks through. Same as it was 24 hours ago. The walking-
through case is the actual user pain point. Until the apparatus
replay names the divergence, treat M1.5 indoor-world as still
incomplete on the door front.
The infrastructure is in place for the eventual fix. The fix itself
remains future work.
## Pickup prompt for the next session
```
Door collision investigation. Previous session shipped infrastructure
(multi-part registration + GetNearbyObjects dedup fix) but did NOT fix
the user-visible bug: off-center / inside-out approaches still walk
through closed Holtburg cottage doors.
Read docs/research/2026-05-24-door-collision-session-end-handoff.md
State both altitudes:
Currently working toward: M1.5 — Indoor world feels right
Current phase: A6.P4 door bug — apparatus replay phase.
Multi-part registration shipped; need live capture
+ per-field divergence comparison to identify why
the door's BSP slab fires zero attributed hits
despite being visited 135x per approach.
First move: launch the client with ACDREAM_CAPTURE_RESOLVE=<path>,
walk into a closed Holtburg cottage door 50 cm off-center, close
gracefully. Then write a LiveCompare_* test in CellarUpTrajectoryReplayTests
that loads the captured failing tick + replays through a harness
with the door BSP hydrated via the existing 0x010044B5 dat read
pattern and registered via RegisterMultiPart.
DO NOT redo the multi-part registration. DO NOT speculate about
step-up without evidence — the apparatus tested DoStepUp and it
didn't fire. The bug is upstream of step-up. The replay will name
the actual divergence.
```

View file

@ -0,0 +1,265 @@
# Door collision per-part BSP session — handoff
**Date:** 2026-05-24 (long session, multiple phases)
**Branch:** `claude/strange-albattani-3fc83c`
**Worktree:** `C:\Users\erikn\source\repos\acdream\.claude\worktrees\strange-albattani-3fc83c`
This handoff documents an A6.P4-driven session that:
1. Shipped A6.P4 slice 1 (real cleanup, didn't close #99)
2. Investigated why doors don't block (apparatus-first)
3. Brainstormed + speced a per-part BSP collision design
4. Shipped most of the implementation (Tasks 1-6 of 10)
5. Discovered Task 7's per-part BSP doesn't actually fix the door bug
6. Reverted Task 7 and paused for further investigation
---
## TL;DR
**Shipped (real commits):**
- `b49ed90` — A6.P4 slice 1: drop the `< 0x0100u` filter in
`ShadowObjectRegistry.GetNearbyObjects`'s portalReachableCells loop,
rename `indoorCellIds``portalReachableCells`. Real cleanup; the
`FindCellSet`-already-includes-outdoor-cells discovery means doors at
building thresholds should be reachable from indoor primary spheres
via the exit-portal logic. But the user-visible #99 close was wrongly
claimed in the commit message — see below.
- `d71ceab` — design spec: per-part BSP for server-spawned entities.
- `8d4f14c` — 10-task implementation plan.
- `ab4278c` — Task 1: ShadowShape record.
- `7f5c287` — Task 2: ShadowShapeBuilder.FromSetup + 7 unit tests.
- `1454eab` — Task 3: ShadowEntry adds LocalPosition + LocalRotation.
- `fca0a13` — Task 4: ShadowObjectRegistry.RegisterMultiPart + 6 tests +
Deregister clears `_entityShapes` (Task 6 folded in).
- `d5ffb03` — Task 5: UpdatePosition recomposes multi-part transforms
via `_entityShapes`.
- `3e5dc8c` — Task 6 regression test: stray UpdatePosition after
Deregister is no-op.
- `1498697``[cyl-test]` diagnostic probe (broadly useful).
**Reverted (Task 7 staged, then `git restore`):** the
`RegisterLiveEntityCollision` refactor at `GameWindow.cs:3076`. Reverted
because visual verification showed the per-part BSP shape didn't actually
block the door — only the small Cylinder did, and even that only at
dead-center approach.
**Still pending:** Tasks 7-10 in the plan + the real fix for door
collision.
---
## What we learned (apparatus-first findings)
### Door Setup 0x020019FF shape inventory (live dump captured 2026-05-24)
```
[door-setup-dump] setupId=0x020019FF setupRadius=0.141 setupHeight=0.200
cylSpheres=0 spheres=1 parts=3 placementFrames=1
stepUp=0.090 stepDown=0.090
[door-setup-dump] sphere[0] r=0.100 origin=(0.000,0.000,0.018)
[door-setup-dump] part[0] gfxObj=0x010044B5
[door-setup-dump] part[1] gfxObj=0x010044B6
[door-setup-dump] part[2] gfxObj=0x010044B6
```
### Per-shape registration (post-Task-7-experiment)
With `ShadowShapeBuilder.FromSetup` running over Setup 0x020019FF in the
live launch, doors registered 2 shadows each:
1. `type=Cylinder radius=0.100 height=0.200 localPos=(0,0,0.018)` — from
the Sphere converted to short Cylinder.
2. `type=BSP gfxObj=0x010044B5 radius=2.000 localPos=(-0.006,0.125,1.275)`
from part 0 (the frame). The other two parts (`0x010044B6` x2) have
`BSP=null` → skipped.
### Collision behavior (visual verified by user, 2026-05-24)
| Scenario | Result |
|---|---|
| Cellar climb (#98 regression check) | ✅ Works |
| Door from outside, dead center | ⚠️ Partial — only the small Cylinder blocks; player stops at the center |
| Door from outside, ~50 cm off-center | ❌ Pass through |
| Door from outside (Use → swing) | ✅ Swing animation works, door opens |
| Indoor furniture (#91 regression check) | ✅ Works |
| Outdoor exterior wall (regression check) | ✅ Works |
| Door from inside walking out | ❌ Pass through |
### Diagnostic evidence
In 188K+ resolve lines from the launch:
- `Door 0xF4249 : 85 cyl-tests, 13 resolve hits attributed`
- `Door 0xF424F : 227 cyl-tests, 16 resolve hits attributed`
- **Zero `[resolve-bldg]` attributions for any door**
Conclusion: the per-part BSP at `0x010044B5` produces NO collision hits.
Either:
1. The PhysicsBSP at that GfxObj has no collision-bearing polygons
(only visual polys), OR
2. Our world-to-part-local sphere transform is wrong, OR
3. The broadphase rejects it (unlikely with radius=2.0 default).
---
## Why this differs from M1 visual verification on 2026-05-13
The user remembers doors blocking on the M1 demo verification. That
demo was "open the inn door" — clicking + watching the swing animation.
The walking-through-an-open-door part was not deliberately tested. The
closed-door blocking was probably observed accidentally when the user
walked directly at a center-of-doorway cylinder; the 14 cm cylinder is
just wide enough to catch a sphere at exactly the centerline. Today's
careful off-center test exposed the gap.
So nothing regressed since 2026-05-13. The bug has been latent. Our
investigation just exposed it.
---
## Investigation gap to close before the next implementation attempt
The per-part BSP design IS retail-faithful in shape (matches
`CPhysicsObj::FindObjCollisions``CPartArray::FindObjCollisions`
`CPhysicsPart::find_obj_collisions``CGfxObj::find_obj_collisions`).
But it didn't surface a working blocker for the cottage doors. Three
hypotheses, ranked by likelihood:
### Hypothesis A (most likely): Part 0x010044B5 has no collision-bearing PhysicsBSP polygons
The Setup defines visual parts. Some parts (especially decorative
hardware) may have a PhysicsBSP that's just the visual mesh's bounding
volume, with no walls or threshold polygons. The door's collision might
genuinely be just the small Cylinder by retail design, and retail
gets full doorway blocking from the **building's BSP** having a narrow
gap exactly the size of the door's Cylinder (~28 cm × 28 cm).
**How to verify:** Dump `0x010044B5`'s PhysicsBSP polygons via
`ACDREAM_DUMP_GFXOBJS=0x010044B5`. Inspect the polygons. If they're
just an axis-aligned bounding box matching the visual mesh, no useful
collision data exists at the part level.
### Hypothesis B: Building BSP has a wide doorway gap that retail's tiny cylinder doesn't fill
A retail building (e.g., cottage interior 0x020XXXXX) has its walls as
BSP polygons. The doorway is a gap. If the gap is ~2 m wide (visual
opening), the 28 cm cylinder doesn't span it — even retail wouldn't
block.
**How to verify:** Open RenderDoc on retail (or our client) and inspect
the cottage interior GfxObj BSP at the doorway. Measure the gap. If
it's narrow (~30 cm), the small cylinder fills it. If wide (~2 m), the
cylinder is decorative and the actual blocker must come from elsewhere.
### Hypothesis C: Retail uses a different collision mechanism entirely
Doors might use Setup.Radius / Setup.Height (the bounding cylinder
dimensions, 0.141 × 0.200 — slightly larger than our Sphere-derived
0.100 × 0.200) AS THE PRIMARY BLOCKER, not the Sphere. Or retail
overrides shape selection for `ItemType==Door` specifically.
**How to verify:** Attach cdb to a live retail client at a cottage
doorway, set a breakpoint on `CPhysicsObj::FindObjCollisions` for the
door's PhysicsObj, observe which shape branch fires.
---
## Recommended next-session approach
Per the project's "apparatus-first for physics divergences" rule
(`feedback_apparatus_for_physics_bugs.md`):
1. **Stop coding.** Don't try another fix without evidence.
2. **Dump 0x010044B5's PhysicsBSP** via `ACDREAM_DUMP_GFXOBJS=0x010044B5`.
If it has zero floor-touching polygons → Hypothesis A confirmed.
3. **Attach cdb to retail** at a cottage doorway. Trace which shapes
block the player. See `project_retail_debugger.md` for the toolchain.
4. **Cross-reference ACE source** for Door collision (if any) — search
`references/ACE/Source/ACE.Server/Physics/` for door handling.
5. **Re-brainstorm** with the new evidence. The Task 1-6 infrastructure
stays (it's correctly modeling retail's CPhysicsObj-per-entity
with parts iterated for collision). Only the SHAPES we register
need to change.
The infrastructure investment was not wasted. The architecture is right.
We just registered the wrong shapes from the door setup.
---
## What's in the tree right now
```
$ git log --oneline -15
1498697 diag(phys): [cyl-test] probe — log every Cylinder shadow collision test
3e5dc8c test(phys): Task 6 regression — Deregister clears _entityShapes cache
d5ffb03 feat(phys): UpdatePosition handles multi-part entities
fca0a13 feat(phys): ShadowObjectRegistry.RegisterMultiPart
1454eab feat(phys): ShadowEntry adds LocalPosition + LocalRotation
7f5c287 feat(phys): ShadowShapeBuilder.FromSetup
ab4278c feat(phys): add ShadowShape record (no callers yet)
8d4f14c docs(phys): implementation plan — per-part BSP for server-spawned entities
d71ceab docs(phys): design spec — per-part BSP collision for server-spawned entities
b49ed90 feat(phys): A6.P4 slice 1 — portal-reachable cellSet includes outdoor cells
b3ce505 fix(phys): A6.P3 #98 — gate outdoor shadow radial sweep on indoor primary cell
b55ae83 docs: A6.P3 #98 resolution + A6.P4 design + #99/#100 filed
3e3cd77 docs(handoff): A6.P4 pickup handoff — full session-resume artifact
```
All 49+ tests pass:
- 24 ShadowObjectRegistryTests
- 7 ShadowShapeBuilderTests
- 8 ShadowObjectRegistryMultiPartTests
- 11 CellarUpTrajectoryReplayTests
Pre-existing 6-8 baseline static-state-leakage failures in the broader
Physics suite are unchanged from prior sessions.
**No-commit state:** working tree is clean. `git status --short`
shows only untracked investigation logs (`a6-issue98-*.log`,
`launch-task7-*.log`, etc. — these accumulate from launches and don't
get committed).
---
## #99 status: still open
The A6.P4 slice 1 commit message claimed "Closes #99" but the visual
verification today proves that's premature. Slice 1 did a real cleanup
(removed a misleading filter) but didn't fully address the user-visible
door-block bug. Update `docs/ISSUES.md` accordingly (issue #99 remains
OPEN; the per-part BSP architecture is NEW infrastructure built today
that will support the eventual fix once we identify the right shapes).
---
## Pickup prompt for next session
```
Door collision still doesn't fully block in M1.5 Holtburg. Per-part BSP
infrastructure shipped 2026-05-24 (Tasks 1-6 of A6.P4 plan), but the
specific shapes we register from door setup 0x020019FF don't catch the
player. Need apparatus-first investigation:
Read docs/research/2026-05-24-door-collision-session-handoff.md
(this doc — recent session handoff)
State both altitudes:
Currently working toward: M1.5 — Indoor world feels right
Current phase: A6.P4 — investigation phase to find the right door
collision shapes; per-part BSP infrastructure
already shipped; need to verify Hypothesis A/B/C
before any more implementation
First moves (in order):
1. Dump GfxObj 0x010044B5's PhysicsBSP via ACDREAM_DUMP_GFXOBJS.
Does it have collision-bearing polygons or just visual?
2. If yes → debug the per-part transform (likely Hypothesis B/C
wrong); if no → confirm Hypothesis A and pivot strategy.
3. Either way, attach cdb to retail at a cottage doorway to see
what retail actually blocks with.
DO NOT speculate-and-fix again. The session 2026-05-24 already
burned a Task 7 attempt on a hypothesis that turned out wrong. The
6 committed implementation tasks (Tasks 1-6) are correct and stay.
Only Tasks 7-10 of the plan need to change once we know the right
shapes.
```

View file

@ -0,0 +1,247 @@
# Door collision — Task 7 shipped, partial fix, deeper bug remains
**Date:** 2026-05-24 (evening, continuation of door collision investigation)
**Branch:** `claude/strange-albattani-3fc83c`
**Status:** A6.P4 architecture is correct. Multi-part registration works.
The Holtburg door bug PARTIALLY fixed — center blocks, but off-center
and inside-out still walk through. Root cause is downstream in the
engine's grounded BSP collision path (Path 5 + step-up), NOT in the
multi-part registration we just shipped.
---
## TL;DR
**Three commits shipped** (composable foundation):
| SHA | Title | What it does |
|---|---|---|
| `e1d94d7` | dat-inspection test | Confirmed door part `0x010044B5` has full 1.9×0.26×2.5 m BSP slab (6 Landblock polys). Hypothesis A from prior handoff was wrong. |
| `3b7dc46` | `GetNearbyObjects` dedup fix | Changed `HashSet<uint>` (entityId) → `HashSet<ShadowEntry>`. Multi-part shapes no longer silently dropped. |
| `ca9341c` | Task 7 live wiring | `RegisterLiveEntityCollision` uses `ShadowShapeBuilder.FromSetup` + `RegisterMultiPart`. Doors now register cyl+bsp instead of just cyl. |
**Live verification (visual user test):**
| Scenario | Result |
|---|---|
| Dead center, walk into closed door (outside) | ✅ Blocks |
| 50 cm off-center, walk into closed door (outside) | ❌ Walks through |
| Inside walking out (closed door) | ❌ Walks through |
| Use door → swing → walk through | ✅ Works (ETHEREAL flip path) |
**Probe-instrumented live capture confirms multi-part registration works:**
- Every door spawn shows `[entity-source] shapes=cyl1+bsp1` — both shapes register.
- BSP part `0x010044B5` is visited 135 times for a single door at player approaches as close as `distXY=0.415 m`.
- `cacheHit=True` for every visit — the cache is populated.
- BUT: zero `[resolve-bldg]` attributions for the BSP shape (all 19 attributed hits show `gfxObj=0x00000000` = the Cylinder shape).
So the BSP is being QUERIED but never produces an attributed hit. The
sphere walks through despite the BSP geometry being present and
visited.
---
## What's in the tree right now
```
$ git log --oneline -8
ca9341c feat(phys): A6.P4 Task 7 — RegisterLiveEntityCollision uses ShadowShapeBuilder + RegisterMultiPart
3b7dc46 fix(phys): GetNearbyObjects dedup-by-entityId silently drops multi-part shadows
e1d94d7 test(phys): door setup + GfxObj dat-inspection — Hypothesis A falsified
c89df8e docs(handoff): door collision per-part BSP session handoff (2026-05-24)
1498697 diag(phys): [cyl-test] probe — log every Cylinder shadow collision test
3e5dc8c test(phys): Task 6 regression — Deregister clears _entityShapes cache
d5ffb03 feat(phys): UpdatePosition handles multi-part entities
fca0a13 feat(phys): ShadowObjectRegistry.RegisterMultiPart
```
**Uncommitted (to commit next):**
- `src/AcDream.Core/Physics/TransitionTypes.cs` — new `[bsp-test]` probe in
the BSP collision dispatch, mirrors `[cyl-test]`. Fires when a BSP entry
is visited, BEFORE the cache lookup. Distinguishes "cache miss → silent
skip" from "queried but no hit." Gated on `ACDREAM_PROBE_BUILDING=1`.
- `tests/AcDream.Core.Tests/Physics/DoorCollisionApparatusTests.cs`
new test `Apparatus_Grounded_50cmOffCenter_FrontApproach_DocumentsBug`
that attempts to reproduce the production bug with a grounded body +
seeded ContactPlane. Currently fails because the apparatus's behavior
diverges from production (apparatus blocks immediately at tick 0 with
a Z+ normal from the synthetic floor; production walks through).
---
## Path 5 vs Path 6 — the divergence
`BSPQuery.FindCollisions` dispatches to 6 paths based on `ObjectInfo`
state. The crucial difference:
- **Path 6 (Default)** — fires when `obj.State` has no `Contact` flag.
Calls `SphereIntersectsPolyInternal` and `SetCollide` on hit.
**Apparatus tests use this path** (no body, `isOnGround=false`). They
all PASS — the door's BSP blocks the sphere correctly.
- **Path 5 (Contact branch)** — fires when `obj.State.HasFlag(Contact)`.
Calls `SphereIntersectsPolyInternal`; on hit, calls
`StepSphereUp → DoStepUp → DoStepDown` to attempt climbing over the
obstacle. Returns OK if step succeeds, Slid if step fails.
**Production uses this path** (player grounded → `isOnGround=true`
engine sets `Contact` flag at `PhysicsEngine.cs:631`). Production
WALKS THROUGH.
So the bug is somewhere in Path 5's step-up logic. The leading
hypothesis (not yet proven):
> When the player is standing on flat ground in front of the door,
> step-up's `DoStepDown` probes 0.6 m downward from the sphere's
> current position. It finds the SAME flat ground extending to the
> OTHER SIDE of the door (Holtburg cottages have no Z change between
> exterior and interior floor — both at Z=94). `find_walkable`
> declares step-up SUCCESS, the BSP collision returns `OK`, and the
> sphere walks through the door.
>
> The fix probably involves: step-up should reject if a forward probe
> at the lifted height STILL hits the same obstacle. The current
> DoStepDown probes only DOWNWARD; it doesn't verify that the
> forward motion at the lifted height is clear.
This is speculation — needs apparatus verification.
---
## Why the apparatus didn't reproduce the bug
The grounded apparatus test (`Apparatus_Grounded_50cmOffCenter_*`) was
supposed to fail in the same way as production (walk through). Instead
it BLOCKED at tick 0 with normal=(0,0,1) — sphere position unchanged.
Diagnostic output:
```
[bsp-test] obj=0x000F424F gfx=0x010044B5 ... pos=(11.99,12.12,1.27)
distXY=1.234 cacheHit=True
[resolve] in=(12.500,11.000,0.480) tgt=(12.500,11.100,0.480)
out=(12.500,11.000,0.480) ok=True hit=yes n=(0,0,1) walkable=True
```
`ACDREAM_DUMP_STEPUP=1` produced no `stepup: ENTER` lines, so
`DoStepUp` was NOT called. The hit normal `(0,0,1)` came from
somewhere else (likely the seeded walkable polygon or the synthetic
floor interaction with the engine's terrain step-down).
The apparatus's stub terrain (Z=-1000) + synthetic walkable poly at
Z=0 may be causing the engine to take a different code path than
production's real Holtburg terrain. Reproducing production fully
would require:
1. Real terrain heightmap covering the test landblock at Z=94
2. EnvCell or stab geometry near the test door
3. Proper cottage/cell setup so portal-reachable cells include
the door's outdoor cell when player is indoor
This is significant apparatus investment. Worth it IF the bug
requires multi-tick simulation in real geometry to surface. For
now, the apparatus shows the broad shape: with proper grounded
state + seeded body, the engine doesn't take the same path as
the airborne (Path 6) test.
---
## Recommended next steps (ranked)
### Option A — Live diagnostic with ACDREAM_DUMP_STEPUP=1 (cheapest)
Relaunch with `ACDREAM_PROBE_BUILDING=1` + `ACDREAM_DUMP_STEPUP=1`.
Walk into a closed door off-center. The step-up dump will show:
- Whether `DoStepUp` fires at all when the BSP hits
- If so, what the input normal is
- Whether `stepDown` succeeds or fails
If `stepDown` succeeds (i.e., step-up climbs over the door), we've
confirmed the hypothesis above and can target the fix.
### Option B — Build a richer apparatus
Replace the stub terrain with a real heightmap-like surface at Z=94
spanning the test landblock. Replace the synthetic walkable poly with
a proper terrain polygon at the door's world XY. This should let
Path 5 run the SAME way as production. Then iterate on the fix
locally in <500 ms.
Estimated effort: 1-2 hours of apparatus work.
### Option C — Direct retail cdb trace
Attach cdb to a running retail client at a Holtburg cottage doorway,
break on `CTransition::step_up` or `CTransition::step_down`, and
observe how retail handles step-up against a door. Compare against
acdream's behavior.
Estimated effort: 30 min - 2 hours depending on what we find.
### Option D — Pivot to fix-and-verify
Hypothesis-based fix: in `DoStepUp`, reject step-up if the input
collision normal is mostly horizontal AND the obstacle's bounding
sphere height range significantly exceeds the step-up height. The
door has BS radius 1.975 m centered at Z=1.275 → top of BS at Z=3.25,
way above step-up=0.6. If we detect "this obstacle is too tall to
step over," fall back to wall-slide.
Risk: might break stairs / ramps. Need apparatus to verify.
### Recommendation
Option A first (~5 min, no code changes needed). If hypothesis
confirmed, then Option D (with apparatus from Option B for
regression testing).
---
## Files touched this session (cumulative)
**Committed:**
- `src/AcDream.Core/Physics/ShadowObjectRegistry.cs` (dedup fix)
- `src/AcDream.App/Rendering/GameWindow.cs` (Task 7 wiring)
- `tests/AcDream.Core.Tests/Physics/DoorSetupGfxObjInspectionTests.cs` (NEW)
- `tests/AcDream.Core.Tests/Physics/DoorCollisionApparatusTests.cs` (NEW)
- `docs/research/2026-05-24-door-dat-inspection-findings.md` (NEW)
**Uncommitted (this doc + 2 file changes):**
- `src/AcDream.Core/Physics/TransitionTypes.cs` (added `[bsp-test]` probe)
- `tests/AcDream.Core.Tests/Physics/DoorCollisionApparatusTests.cs`
(added grounded test scenario — fails for unrelated apparatus
reasons but the probe wiring is sound)
**Memory updated:** `feedback_dedup_keys_after_cardinality_change.md`
---
## Pickup prompt for next session
```
A6.P4 Task 7 shipped (RegisterLiveEntityCollision uses
ShadowShapeBuilder + RegisterMultiPart) and the foundation fix
(GetNearbyObjects dedup on full ShadowEntry instead of entityId).
Production verification: center blocks, but off-center + inside-out
still walk through closed doors. The multi-part registration is
correct (verified by live probes); the remaining bug is downstream
in BSPQuery Path 5's step-up logic.
Read docs/research/2026-05-24-door-collision-task7-shipped-but-bug-remains.md
State both altitudes:
Currently working toward: M1.5 — Indoor world feels right
Current phase: A6.P4 door collision — step-up misbehavior
investigation. Multi-part registration shipped;
step-up at thin tall obstacles is the remaining bug.
Recommended first move: Option A from the findings doc — relaunch
with ACDREAM_PROBE_BUILDING=1 + ACDREAM_DUMP_STEPUP=1, walk into
a Holtburg cottage door off-center. The step-up dump will reveal
whether DoStepUp is incorrectly succeeding for the door's BSP slab
hit (the leading hypothesis: DoStepDown finds the same flat floor
on the other side of the door, declaring step-up success).
DO NOT re-investigate the multi-part registration or GetNearbyObjects
dedup — both are confirmed working. Focus on the step-up path 5
behavior for thin tall obstacles.
```

View file

@ -0,0 +1,258 @@
# Door collision dat inspection — findings
**Date:** 2026-05-24 (evening, continuation of door collision investigation)
**Branch:** `claude/strange-albattani-3fc83c`
**Status:** Evidence gathered. Hypothesis A from
[`2026-05-24-door-collision-session-handoff.md`](2026-05-24-door-collision-session-handoff.md) **FALSIFIED**.
---
## TL;DR
A deterministic, read-only dat-inspection test
([`DoorSetupGfxObjInspectionTests.cs`](../../tests/AcDream.Core.Tests/Physics/DoorSetupGfxObjInspectionTests.cs))
opens the real client dat and prints the raw state of door Setup
`0x020019FF` + its three referenced GfxObjs.
**Result — Hypothesis A is wrong.** Part 0 (`0x010044B5`) has a complete
1.925 m × 0.261 m × 2.490 m door-sized collision volume in the dat. Six
two-sided (`SidesType=Landblock`) physics polygons form the closed door
slab. Bounding sphere radius 1.975 m. Setup `Flags=HasPhysicsBSP`.
Parts 1, 2 (`0x010044B6`) **are** visual-only by design — `HasPhysics`
flag is clear, `PhysicsBSP` is null, `PhysicsPolygons.Count = 0`. **This
matches retail's `CPhysicsPart::find_obj_collisions`**
([`acclient_2013_pseudo_c.txt:275051`](../research/named-retail/acclient_2013_pseudo_c.txt)),
which explicitly short-circuits when `physics_bsp == 0`. So retail also
runs no collision against `0x010044B6` — and our skip-on-null-BSP
behavior is retail-faithful, not a bug.
**This rewrites the "next-session approach" recommendation in the prior
handoff.** The handoff said "if 0x010044B5's BSP has zero floor-touching
polys → Hypothesis A confirmed → pivot strategy." The BSP has six
collision polygons forming the whole door slab. The pivot is not needed;
we need to figure out why our integration of `0x010044B5`'s BSP didn't
fire during the Task 7 experiment.
---
## Raw dump (verbatim from the test)
```
=== Setup 0x020019FF ===
Flags = HasParent, AllowFreeHeading, HasPhysicsBSP (0x0000000D)
Radius = 0.1414
Height = 0.2000
StepUp = 0.0900
StepDown = 0.0900
CylSpheres = 0
Spheres = 1
[0] r=0.1000 origin=(0.000,0.000,0.018)
Parts = 3
[0] gfxObj=0x010044B5
[1] gfxObj=0x010044B6
[2] gfxObj=0x010044B6
PlacementFrames = 1
[Default] frameCount=3
frame[0] pos=(-0.006,0.125,1.275) rot=(0.000,0.000,0.000,1.000)
frame[1] pos=(0.710,0.000,1.210) rot=(0.000,0.000,0.000,1.000)
frame[2] pos=(0.710,0.247,1.210) rot=(0.000,0.000,1.000,0.000)
=== GfxObj 0x010044B5 === (the door slab — has physics)
Flags = HasPhysics, HasDrawing, HasDIDDegrade (0x0000000B)
HasPhysics = True
VertexArray = non-null, 8 vertices
PhysicsPolygons = 6 polys
PhysicsBSP = non-null
PhysicsBSP.Root = non-null
Root.Type = BPnN
Root.BoundingSphere = (-0.390,-0.056,-0.150) r=1.975
BSP tree total polys (including children) = 6
PhysicsPolygon AABB sweep (first 6 polys):
[0x0000] nVerts=4 sides=Landblock min=(-0.954,-0.134,-1.236) max=(0.971,0.127,-1.236) # bottom face
[0x0001] nVerts=4 sides=Landblock min=(-0.954,-0.134,-1.236) max=(-0.954,0.127,1.255) # left face
[0x0002] nVerts=4 sides=Landblock min=(-0.954,-0.134, 1.255) max=(0.971,0.127,1.255) # top face
[0x0003] nVerts=4 sides=Landblock min=( 0.971,-0.134,-1.236) max=(0.971,0.127,1.255) # right face
[0x0004] nVerts=4 sides=Landblock min=(-0.954,-0.134,-1.236) max=(0.971,-0.134,1.255) # front face
[0x0005] nVerts=4 sides=Landblock min=(-0.954, 0.127,-1.236) max=(0.971,0.127,1.255) # back face
PhysicsPolygons combined AABB: min=(-0.954,-0.134,-1.236) max=(0.971,0.127,1.255)
size=(1.925, 0.261, 2.490)
=== GfxObj 0x010044B6 === (the leaves — visual-only by design)
Flags = HasDrawing, HasDIDDegrade (0x0000000A)
HasPhysics = False
VertexArray = non-null, 40 vertices
PhysicsPolygons = 0 polys
PhysicsBSP = NULL
Polygons (visual) = 87 polys
DrawingBSP = non-null
```
---
## What this means
### The data is right
Part 0's BSP is a six-faced thin slab oriented as a vertical door:
1.925 m wide × 0.261 m thin × 2.490 m tall. Placed at frame[0] offset
`(-0.006, 0.125, 1.275)`, it occupies entity-local Z ∈ `[0.039, 2.530]`
a standard door height. All six faces are
`SidesType=Landblock` (two-sided collision — catches a sphere
approaching from either side).
This is exactly what retail's collision system uses to block doors.
No mystery, no missing data, no need to fall back to a wider Cylinder
approximation.
### The leaves are correctly visual-only
`0x010044B6` is the swinging door leaf (used twice — left + right
panels). It has no physics by retail design. Our `ShadowShapeBuilder`
skipping these parts matches both the dat and retail's
`CPhysicsPart::find_obj_collisions`.
### So the bug is in integration, not data
The previous session's Task 7 experiment registered `0x010044B5`'s BSP
correctly (we saw `type=BSP gfxObj=0x010044B5 radius=2.000
localPos=(-0.006,0.125,1.275)` in the per-shape registration), yet got
**zero `[resolve-bldg]` attributions** during live play. With the data
now confirmed good, that gap must be in:
1. **The BSP collision dispatch never enters for the door entry**
`TransitionTypes.cs:2257` silently `continue`s when
`engine.DataCache.GetGfxObj(obj.GfxObjId)?.BSP?.Root is null`. If the
GfxObj wasn't cached at collision time (race with renderer load), the
entry is invisibly skipped. **No log distinguishes this from
"queried-and-no-hit."**
2. **Broadphase placeholder radius** — Task 2's `ShadowShapeBuilder`
uses `bspRadius = 2f` as a stand-in pending a Task 5/6 caller
replacement. The real dat value is `1.975` — close enough not to be
the blocker, but the placeholder convention means callers MUST
substitute the real BS radius from `cache.GetGfxObj(gfxId).BoundingSphere.Radius`
before registering. If a future caller forgets, the broadphase will
still mostly work but won't be tight.
3. **The broadphase center is the part's FRAME origin, not the BSP's
bounding-sphere center.** Frame origin = `(-0.006, 0.125, 1.275)`;
BS center in part-local = `(-0.390, -0.056, -0.150)`. Distance:
1.48 m. The 2.0 m broadphase radius nominally covers the BS sphere
(radius 1.975) from the frame origin only on the side closest to the
BS center. For approaches on the opposite side, the broadphase
sphere extends 2.0 m + 1.48 m = 3.48 m from the BS center — wider
than needed, but never too tight in the door case. Still, a more
faithful encoding centers the broadphase on the part's BS center +
frame offset, with radius = BS radius.
4. **BSPQuery against `SidesType=Landblock` polys**`BSPQuery.cs`
pass-through-copies `SidesType` (line 2277) but doesn't filter on
it. We have not yet verified that `Landblock`-typed polys actually
produce collision hits in our query pipeline against a thin-slab
geometry. (Note: indoor cells use `SidesType=Single`-typed cell-floor
polys and those work — the cellar replay tests pin that. But Doors'
`Landblock` polys may behave differently — particularly w.r.t.
two-sided collision.)
5. **Entity rotation at the doorway** — Holtburg cottage doors face
non-cardinal directions. The entity's world rotation
`entity_rot` composes with `frame[0].Rotation` (identity for part 0)
to produce `obj.Rotation = entity_rot`. The sphere
transform `inv(entity_rot) * (sphere_world obj.Position)` is
sensitive to that rotation. If we register with identity (forgetting
to plumb the spawn's rotation through), the BSP polys will be
oriented "into the world" wrong — passing tests that approach from
the wrong axis.
---
## Recommended next step
The handoff's "DO NOT speculate-and-fix again" rule still applies. The
right next move is **apparatus-first**, not another implementation
attempt:
**Write a focused unit test** that:
1. Loads the real `0x010044B5` PhysicsBSP from the dat via the
inspection test's pattern (or use `GfxObjDumpSerializer.Hydrate`
for a deterministic fixture).
2. Constructs a synthetic door entity at a known world position
`(132.6, 17.1, 94.08)` with a known rotation (try identity AND a
~90° Z rotation to cover both axes).
3. Sweeps a player sphere at the door from each of the four
compass directions, at off-center positions (50 cm off-center)
AND dead center.
4. Calls `Transition.FindObjCollisions` / `ResolveWithTransition`
directly (apparatus path mirrors the live one).
5. Asserts:
- Dead-center approach → `Collided` / `Adjusted` / `Slid`
with `CollideObjectGuids` containing the door entity.
- 50 cm off-center approach → same.
- From inside walking out → same.
If the test fails: we have a deterministic reproduction of the live
bug in <500 ms, and we can fix the integration with confidence.
If the test passes: the door bug is elsewhere (cell registration,
spawn-time race, etc.).
This is the next apparatus the previous session was building toward
when it ran out of cycles. With the data question now closed by the
dat inspection, it's the highest-information next move.
---
## What's in the tree right now
```
$ git status --short
?? docs/research/2026-05-24-door-dat-inspection-findings.md
?? tests/AcDream.Core.Tests/Physics/DoorSetupGfxObjInspectionTests.cs
[+ untracked launch logs from prior sessions]
```
Build green; existing tests still pass; new test runs in 34 ms and
produces the dump above.
---
## Pickup prompt for next session
```
Door collision dat inspection (2026-05-24 evening) FALSIFIED
Hypothesis A. Part 0 (0x010044B5) has a full door-slab BSP in the
dat — 6 Landblock-typed polys forming a 1.925 m × 0.261 m × 2.490 m
collision volume. Parts 1, 2 (0x010044B6) are visual-only by retail
design (HasPhysics flag clear). Retail and acdream both skip those
in CPhysicsPart::find_obj_collisions — that's not a bug.
Read docs/research/2026-05-24-door-dat-inspection-findings.md
State both altitudes:
Currently working toward: M1.5 — Indoor world feels right
Current phase: A6.P4 — door collision investigation continues.
Per-part BSP infrastructure (Tasks 1-6) ships
already; data is confirmed good in the dat; need
to determine WHY our integration of 0x010044B5's
BSP didn't fire collisions during the Task 7
experiment.
Next moves (in order):
1. Write CellarUpTrajectoryReplay-style apparatus test that
loads 0x010044B5's PhysicsBSP from a dat dump, registers a
synthetic door via RegisterMultiPart, and sweeps a player
sphere at it. Confirm BSP collision fires (or doesn't) in
isolation.
2. If the test passes → bug is in live registration (likely
cell scoping, entity rotation, or race with renderer
loading). Investigate live cell membership for door
entities.
3. If the test fails → bug is in BSPQuery.FindCollisions
against thin-slab Landblock-typed polys. Investigate the
6-path dispatcher for that case.
DO NOT re-attempt Task 7 (per-part BSP wiring in
RegisterLiveEntityCollision) until the apparatus test confirms
the BSP works in isolation. Tasks 1-6 stay; they're correct.
```

View file

@ -0,0 +1,328 @@
# A6.P6 / A6.P7 — Door cylinder + slab interaction handoff
**Date:** 2026-05-25 PM
**Status:** A6.P5 cellSet fix shipped (3b1ae83). A6.P6 cyl step-over shipped
(3d4e63f). Residual symptom remains: sphere can't slide tangentially
past the door's foot cylinder when the cyl's radial collision normal
dominates the slide direction. Three fix options identified; user picked
"investigate retail first" — that's this session's work.
---
## TL;DR
Walking into a closed cottage door from outside in acdream:
| Before A6.P5 | After A6.P5 only | After A6.P6 too |
|---|---|---|
| Sphere walks through (cellSet didn't include door) | Sphere blocks BUT cyl phantom radial-pushes sphere AWAY from target (~10 cm push-out at door center) | Sphere stops at current position when cyl fires — no more push-out, but also can't slide tangentially past the cyl on some headings |
A6.P5 made the door reliably visible from all approach angles (closed
the cellSet bug); A6.P6 routed Contact-grounded cyl collisions through
step-over instead of radial push. Both retail-anchored. But the residual
"can't slide past cyl on certain headings" still happens because:
1. The door has two collision shapes: a tiny foot cylinder (r=0.10,
h=0.20) and the big slab BSP.
2. Our FindObjCollisions tests shapes in registration order. The cyl
gets tested FIRST. When cyl fires, FindObjCollisions returns
immediately — slab BSP never tested in that iteration.
3. The cyl's collision normal is radial (away from cyl axis). For a
sphere wanting to move SE past a door at world (132.6, 17.1), the
cyl-radial normal is roughly (0.86, 0.51, 0). The slide tangent
from that normal points mostly south — INTO the slab. Slab then
blocks (in a downstream iteration). Net: sphere doesn't move.
4. If the slab's clean (0, +1, 0) normal were used instead, the slide
tangent would be pure east. Sphere would slide cleanly along the
door. This is what retail does visibly.
So the question is: how does retail end up with the slab's normal
driving the slide, when retail also has the cyl AND tests it?
---
## What today shipped (DO NOT redo this)
### A6.P5 — cellSet portal expansion fix (commit 3b1ae83)
- File: `src/AcDream.Core/Physics/CellTransit.cs`
- Function: `FindTransitCellsSphere` exit-portal branch + `BuildCellSetAndPickContaining`
- Change: exit portals contribute `exitOutside = true` by topology, not by sphere-plane overlap.
- Retail anchor: `CObjCell::find_cell_list` at `acclient_2013_pseudo_c.txt:308742-308869`.
- Tests: `CellTransitTests.A6P5_BuildCellSetFromIndoorStart_ReachesDoorOutdoorCell` + `A6P5_BuildCellSetFromAlcove_AlsoReachesDoorOutdoorCell`. Both pass.
- Fixture: `tests/AcDream.Core.Tests/Fixtures/door-bug/over-penetration-capture.jsonl` (3 records from the 17 MB live capture).
### A6.P6 — cyl step-over for Contact movers (commit 3d4e63f)
- File: `src/AcDream.Core/Physics/TransitionTypes.cs`
- Function: `CylinderCollision` — added Contact-grounded branch
- Change: when `oi.Contact && !sp.StepUp && !sp.StepDown && engine != null` and cyl height fits step-up-height, attempt `DoStepUp(collisionNormal, engine)`. On failure → `StepUpSlide(this)`. On step-fail, behavior changes from radial push to tangent-along-crease.
- Retail anchor: `CCylSphere::intersects_sphere` at `acclient_2013_pseudo_c.txt:324626-324641` (Contact branch dispatches `step_sphere_up`) + `CCylSphere::step_sphere_up` at `acclient_2013_pseudo_c.txt:324516-324538`.
- Tests: all `A6P5_*` + `Path 5` tests + door directional tests pass in isolation. Full Core suite 17 failures (same as A6.P5 baseline) — diff is documented static-leak flakiness.
### Probes added (still in place — useful for next session)
- `ACDREAM_PROBE_CELLSET=1``[cellset-build]` line per `BuildCellSetAndPickContaining` call.
- `ACDREAM_PROBE_BUILDING=1``[cyl-test]` + `[bsp-test]` (existing).
- `ACDREAM_PROBE_RESOLVE=1``[resolve]` (existing).
- `ACDREAM_CAPTURE_RESOLVE=<path>` → JSONL capture for replay.
### Captures from today (gitignored, on disk)
- `door-stuck-capture.jsonl` (17 MB, 8483 records) — the original phantom reproduction.
- `door-phantom-capture.jsonl` (13 MB, ~7000 records) — captured with cyl/bsp probes ON post-A6.P5.
- `door-a6p6-v2.launch.log` (UTF-16) + `door-a6p6-v2.utf8.log` — most recent diagnostic launch with all 3 probes on after A6.P6 fix landed. Shows residual cyl phantom (12+ resolves with cn=(0.86, 0.51, 0) attributed to door entity 0x000F4245).
---
## The remaining symptom (what to fix)
User walks into a closed cottage door (Setup 0x020019FF, entity at
world ≈ (132.6, 17.1, 94.1)). When the sphere ends up at certain
angles to the door (NE / SE of the cyl center), the cyl's slide
"blocks" the sphere from making tangential progress along the slab
face.
Specific evidence from `door-a6p6-v2.utf8.log` (line ~23553):
```
[resolve-bldg] obj=0x000F4245 ... hitPoly: plane=(0.000,0.000,-1.000,-1.236) ← slab BOTTOM hit, but culled (no Z motion)
[cyl-test] obj=0x000F4245 ... result=Slid ← cyl fired
[resolve] in=(132.777,17.724) tgt=(133.044,17.400) out=(132.777,17.724)
hit=yes n=(0.86,0.51,0.00) obj=0x000F4245 nObj=9
```
The cn=(0.86, 0.51, 0) is the cyl's radial normal (sphere is NE of cyl
axis). The slide direction is perpendicular = (0.51, -0.86, 0) ≈ mostly
south = into the slab. Slab blocks in subsequent iteration. Net: out == in.
Counts from the latest launch (~7K resolves):
- 117 hit=yes attributed to door entity 0x000F4245
- 99 hit=yes attributed to cottage GfxObj 0xA9B47900
- 350 cyl-tests result=Slid (out of 1623 total cyl tests)
- 12 resolves with cn=(0.86, 0.51, 0) on the door — the "phantom slide direction" pattern
---
## The three options (user picked #2-investigation first)
### Option 1: BSP-first per-entity test order (smallest fix)
Within an entity's shapes, test BSP shapes before Cylinder shapes. If
BSP fires, skip the cyl. The slab's clean (0, ±1, 0) normal drives the
slide → sphere slides smoothly along door face.
- ~10 lines in `FindObjCollisions` (sort `nearbyObjs` per-entity).
- Retail-faithful behaviorally; whether it's retail-faithful
architecturally is uncertain (see Option 2 research).
### Option 2: Port retail's per-physobj dispatch (architectural)
Restructure `ShadowObjectRegistry` to group shapes by entity. Implement
retail's `CPhysicsObj::FindObjCollisions` dispatch including the
`state & 0x10000` branch logic (acclient_2013_pseudo_c.txt:276861).
- Large change; touches many files.
- True retail-faithful architecture. **But** behaviorally may end up
producing the same outcome as Option 1 if our state flag mapping
is correct.
### Option 3: Door-cyl-as-informational
Hypothesis: retail's door cyl is for click-target / sound trigger /
foot-slip prevention for non-player entities, NOT a physics blocker
for the standard player. Skip registering it as a collision shape on
entities that also have a BSP.
- Needs retail research to confirm.
- Risk: breaks foot-slip prevention for small entities.
---
## Retail investigation needed (THIS SESSION's main work)
The fundamental question: **what does retail do with the door's cyl
that produces clean sliding past it?** Two specific things to read +
test:
### Investigation 1: What does `state & 0x10000` mean?
Retail's `CPhysicsObj::FindObjCollisions` at
`acclient_2013_pseudo_c.txt:276861`:
```c
if (((this->state & 0x10000) == 0 || ebp_1 != 0) || eax_12 != 0) {
// iterate cylspheres + spheres
} else {
// iterate BSP parts via CPartArray::FindObjCollisions
}
```
Door state at spawn = `0x00010008`. Bit 0x10000 (bit 16) IS set. So
condition `state & 0x10000 == 0` is FALSE. The branch depends on
`ebp_1` and `eax_12`.
**Investigation steps:**
1. Grep `acclient_2013_pseudo_c.txt` for what assigns to `ebp_1` and
what `eax_12` is computed from. Identify which mover/target state
bits drive the branch.
2. Search `docs/research/named-retail/acclient.h` for state flag bit
definitions (look for constants `0x10000`, `OBJECT_USES_PHYSICS_BSP`
or similar around the OBJECTINFO / PhysicsObj state field).
3. Determine which branch fires for: closed door (state 0x10008) +
grounded player.
4. If cyl branch fires for our case: how does retail block player
from passing through the door without the BSP test?
5. If BSP branch fires: why? What state condition is off in our
replica?
Cross-reference with ACE's `PhysicsObj.FindObjCollisions`
`references/ACE/Source/ACE.Server/Physics/PhysicsObj.cs`. ACE might
have cleaner names for the same logic.
### Investigation 2: What does the door's cyl actually DO in retail?
Concrete experiment using cdb on the live retail client:
1. Attach cdb to retail acclient.exe (toolchain in CLAUDE.md "Retail
debugger toolchain" section).
2. Set breakpoint on `CCylSphere::collides_with_sphere` (acclient
address 0x53a880) with action: log entity id + sphere position +
result. Use `qd` after ~5000 hits to detach.
3. Walk retail player into a closed cottage door from outside,
trying to slide along it.
4. Capture trace. Look for:
- Does the door cyl ever fire `collides_with_sphere` returning 1?
If yes → cyl IS active in retail.
- If no → cyl is somehow excluded from physics in retail (Option 3
plausible).
5. Set breakpoint on `BSPTREE::find_collisions` for the same scenario.
Determine if BSP slab is tested.
### Investigation 3: Inspect Setup parsing differences
Compare what our `ShadowShapeBuilder.FromSetup` produces from
`Setup 0x020019FF` vs what retail's PhysicsObj constructs from the
same Setup:
1. `dotnet test --filter "FullyQualifiedName~DoorSetupGfxObjInspectionTests"
--logger "console;verbosity=detailed"` for our parse.
2. Inspect retail's PhysicsObj creation flow (acclient.exe around the
PhysicsObj constructor + part_array initialization). Look for
filtering: does retail include the Setup's cyl in its physics shape
list, or is there a flag-driven include/exclude?
---
## Files to read FIRST next session
| File / location | What to find |
|---|---|
| `docs/research/named-retail/acclient_2013_pseudo_c.txt:276776+` | `CPhysicsObj::FindObjCollisions` (the dispatch + state flag branch) |
| `docs/research/named-retail/acclient_2013_pseudo_c.txt:324558` | `CCylSphere::intersects_sphere` (the per-cyl dispatch for state & 3) |
| `docs/research/named-retail/acclient_2013_pseudo_c.txt:324516` | `CCylSphere::step_sphere_up` (our A6.P6 anchor; verify our port matches) |
| `docs/research/named-retail/acclient.h` | OBJECTINFO state bit constants (esp. `0x10000`) |
| `references/ACE/Source/ACE.Server/Physics/PhysicsObj.cs` | ACE's port — cleaner names |
| `docs/research/named-retail/acclient_2013_pseudo_c.txt:308916` | `CObjCell::find_obj_collisions` (per-cell shadow iteration, calls CPhysicsObj::FindObjCollisions) |
---
## Tests to keep green (do NOT regress)
Run these in isolation when verifying any new fix:
```bash
dotnet test tests/AcDream.Core.Tests/AcDream.Core.Tests.csproj --no-build -c Debug --filter "FullyQualifiedName=AcDream.Core.Tests.Physics.CellarUpTrajectoryReplayTests.LiveCompare_FirstCap_FixClosesCottageFloorCap|FullyQualifiedName=AcDream.Core.Tests.Physics.DoorBugTrajectoryReplayTests.Directional_OutsideIn_SouthApproach_BlocksAtSlabSouthFace|FullyQualifiedName=AcDream.Core.Tests.Physics.DoorBugTrajectoryReplayTests.Directional_InsideOut_NorthApproach_BlocksAtSlabNorthFace|FullyQualifiedName=AcDream.Core.Tests.Physics.DoorBugTrajectoryReplayTests.CornerSlide_AlcoveEastToCottageNorth_ShouldBlock|FullyQualifiedName=AcDream.Core.Tests.Physics.DoorBugTrajectoryReplayTests.Geometric_DoorSlabAtSphereHeight_OverlapsInZ|FullyQualifiedName=AcDream.Core.Tests.Physics.DoorBugTrajectoryReplayTests.InsideOut_Tick3254_WithCottageWalls_ShouldBlock|FullyQualifiedName~BSPQueryTests.FindCollisions_Path5|FullyQualifiedName~CellTransitTests.A6P5|FullyQualifiedName~DoorCollisionApparatusTests.Apparatus_DeadCenter"
```
Expected: all 14 pass.
Full Core suite has 17 documented flaky-in-full-run failures — those
are the static-leak flakiness CLAUDE.md describes, not regressions.
---
## Things NOT to do (do-not-retry list)
1. **Don't reverse cyl/BSP iteration order globally.** Cross-entity
ordering should follow registration sequence (matches retail per-cell
shadow_object_list). Only within-entity ordering needs adjustment.
2. **Don't disable the door cyl unconditionally.** Foot-slipping
matters for small entities even if not for the player.
3. **Don't enlarge `EPSILON` in slide-back-off math** to "give more
margin." The 11mm residual penetration is a separate issue
(`SlideSphere` preserves `currPos.Y` which may already be slightly
penetrating); changing epsilon would mask other bugs.
4. **Don't add per-call workarounds in `CylinderCollision`** (like
"if entity has a sibling BSP, return OK"). Per CLAUDE.md no-workarounds
rule — fix the architectural issue, not the symptom.
5. **Don't break A6.P6 step-over for non-door cyls** (tree trunks, rock
pillars, NPCs). Whatever fix lands must keep cyl-only entities
blocking correctly.
---
## Open issue tracking
Add to `docs/ISSUES.md` after this handoff:
```
- door-cyl-residual-block: After A6.P5 + A6.P6, sphere can still be
blocked at NE/SE headings approaching a closed cottage door because
the cyl's radial collision normal drives the slide direction into
the slab. Three fix options outlined in
docs/research/2026-05-25-a6-door-cyl-investigation-handoff.md;
pending retail investigation to pick the retail-faithful path.
Severity: M1.5 polish (does not block "kill a drudge" demo).
```
---
## Pickup prompt for next session
```
A6.P6 / A6.P7 — door-cyl residual block investigation.
Read first (in this order):
1. docs/research/2026-05-25-a6-door-cyl-investigation-handoff.md
(full context: what landed, what's still broken, the 3 fix options,
do-not-retry list)
2. docs/research/named-retail/acclient_2013_pseudo_c.txt:276776
(CPhysicsObj::FindObjCollisions — the state-flag dispatch)
3. docs/research/named-retail/acclient_2013_pseudo_c.txt:324558
(CCylSphere::intersects_sphere — the cyl dispatch)
State both altitudes:
Currently working toward: M1.5 — Indoor world feels right
Current phase: A6.P7 — retail investigation for door cyl + slab
collision interaction.
The session's main work: retail investigation. NOT implementation.
Specific questions to answer (cite retail line numbers in the report):
1. What does state bit 0x10000 mean? Closed cottage doors have it
set (state = 0x00010008). Retail's FindObjCollisions branches on
`((state & 0x10000) == 0 || ebp_1 != 0) || eax_12 != 0`. What are
ebp_1 and eax_12? Which branch fires for a closed door + grounded
player? (Cross-reference references/ACE/Source/ACE.Server/Physics/
PhysicsObj.cs for cleaner names.)
2. Does the door cyl actually fire collides_with_sphere in retail
when player slides along the door? Set a cdb breakpoint on
CCylSphere::collides_with_sphere (acclient address 0x53a880),
walk a retail player into the cottage door, observe. If cyl
fires: how does retail produce smooth sliding past it? If cyl
doesn't fire: by what mechanism is it excluded?
3. Compare our ShadowShapeBuilder.FromSetup output vs retail's
PhysicsObj shape list for Setup 0x020019FF. Where do they
diverge?
Deliverable: a short report (~2-3 pages) covering the 3 questions with
retail line numbers + cdb trace excerpts. Then propose which of the
3 fix options (BSP-first per-entity / per-physobj dispatch port /
door-cyl-informational) is the most retail-faithful, justified by
the research.
DO NOT implement the fix this session — the brainstorming-only
discipline applies. After the report, the next session will pick
the implementation approach + execute via writing-plans → executing-plans.
Do-not-retry list (in handoff doc) — read it before starting.
Tests to keep green if any code changes happen: see handoff doc.
Reproduction setup ready to relaunch with diagnostics if needed:
ACDREAM_PROBE_BUILDING=1 ACDREAM_PROBE_RESOLVE=1 ACDREAM_PROBE_CELLSET=1
ACDREAM_CAPTURE_RESOLVE=<path>.jsonl
```

View file

@ -0,0 +1,319 @@
# A6.P7 — Retail dispatch investigation for door cyl + slab interaction
**Date:** 2026-05-25 PM
**Mode:** Report-only (per `/investigate` skill). No code edits.
**Predecessor:** [`2026-05-25-a6-door-cyl-investigation-handoff.md`](2026-05-25-a6-door-cyl-investigation-handoff.md)
---
## TL;DR — the smoking gun
**Retail's `CPhysicsObj::FindObjCollisions` dispatches BINARILY between
"BSP-only" and "cyl + sphere" — never both.** The selector is the state
bit `HAS_PHYSICS_BSP_PS = 0x10000` (named verbatim in the retail header).
For a closed cottage door + walking player:
- Door state `0x10008` has `HAS_PHYSICS_BSP_PS` set.
- Player isn't a missile.
- Player isn't a PvP-eligible target of the door.
- → Retail goes to the **BSP-only branch**. **The cyl is never tested.**
Acdream tests both because our dispatch iterates per `ShadowEntry`
(cyl and BSP are separate entries). The residual phantom slide at
NE/SE headings is the predictable consequence: the cyl's radial normal
fires first, drives the slide tangent into the slab face, slab blocks
in a downstream sub-tick, net out=in.
The fix is **~15 LOC at the per-entry test site**, reading
`obj.State & 0x10000u` (which is already populated on every
`ShadowEntry` from ACE's `spawn.PhysicsState`). It is **NOT** an
architectural restructuring of `ShadowObjectRegistry`. The handoff's
"Option 2 = large change" assessment was wrong — Option 2 is the
right answer, but its scope is dramatically smaller than the handoff
feared.
---
## Question 1 — What is `state & 0x10000`? Which branch fires?
**Named flag:** `HAS_PHYSICS_BSP_PS = 0x10000`
[`docs/research/named-retail/acclient.h:2833`](research/named-retail/acclient.h:2833).
The full retail `PhysicsState` enum lives at lines 2815-2843. The flags
implicated by the dispatch:
| Bit | Name | Meaning |
|---|---|---|
| 0x4 | `ETHEREAL_PS` | Non-solid; passes through other objects |
| 0x10 | `IGNORE_COLLISIONS_PS` | Skips collision processing entirely |
| 0x40 | `MISSILE_PS` | Object is a projectile / arrow / spell in flight |
| 0x10000 | `HAS_PHYSICS_BSP_PS` | Object exposes a per-Setup BSP collision mesh |
**Branch logic from
[`acclient_2013_pseudo_c.txt:276861`](research/named-retail/acclient_2013_pseudo_c.txt):**
```c
if (((this->state & 0x10000) == 0 || ebp_1 != 0) || eax_12 != 0)
{
// CYL + SPHERE path (lines 276863-276953):
// iterate part_array's CylSpheres → CCylSphere::intersects_sphere
// fall through label_50f21d:
// iterate part_array's Spheres → CSphere::intersects_sphere
// (BSP is NEVER tested in this branch)
}
else
{
// BSP path (lines 276956-276985):
state_3 = CPartArray::FindObjCollisions(part_array, transition);
// (cyl + sphere are NEVER iterated in this branch)
}
```
**What `ebp_1` and `eax_12` are:**
- `ebp_1` is set at lines 276808-276841. It's non-null **only when**
THIS object's weenie is a player AND the moving transition's
ObjectInfo has the IsPlayer flag AND no PvP exclusion (IsPK match,
IsPKLite match, IsImpenetrable). Effectively: "I am a player and the
incoming mover is also a player I can collide with."
- `eax_12` is `OBJECTINFO::missile_ignore(transition, this)`
[`acclient_2013_pseudo_c.txt:274385`](research/named-retail/acclient_2013_pseudo_c.txt:274385).
Returns non-zero when the moving object is a missile that should
ignore this target. For a walking player vs door: returns 0.
**For our scenario (player walking vs closed door):**
- `state & 0x10000 == 0`: FALSE (door has the bit set).
- `ebp_1 != 0`: FALSE (door is not a player target).
- `eax_12 != 0`: FALSE (walking isn't a missile).
- Condition is FALSE → **ELSE branch fires → BSP-only path.**
The retail client **never calls `CCylSphere::intersects_sphere` on the
door's foot cylinder** when a non-missile, non-PvP mover walks into it.
**ACE cross-reference confirms the truth table exactly.**
[`references/ACE/Source/ACE.Server/Physics/PhysicsObj.cs:412-450`](../../references/ACE/Source/ACE.Server/Physics/PhysicsObj.cs):
```csharp
if (!State.HasFlag(PhysicsState.HasPhysicsBSP) || missileIgnore || exemption)
{
// cyl-then-sphere iteration
}
else if (PartArray != null)
{
var collided = PartArray.FindObjCollisions(transition); // BSP path
// ...
}
```
ACE names the flag `HasPhysicsBSP`; the local variables are `missileIgnore`
(retail's `eax_12`) and `exemption` (retail's `ebp_1`). The structure is
identical bar a `// TODO: reverse this check to make it more readable`
comment at [`PhysicsObj.cs:401`](../../references/ACE/Source/ACE.Server/Physics/PhysicsObj.cs:401)
confirming ACE faithfully transcribed the negated predicate without
adding interpretation.
**Verdict on Q1: the cyl is not tested in retail for our case. Bit
0x10000 means "this object has a BSP — use it exclusively, do not
test the cyl/sphere proxies".**
---
## Question 2 — Does retail's cyl actually fire `collides_with_sphere`?
**Answer derivable from Q1 without running cdb: NO.** The retail dispatch
unambiguously routes a closed-door + walking-player call to
`CPartArray::FindObjCollisions` (the BSP path). The function
`CCylSphere::collides_with_sphere` is reached only via the cyl-and-sphere
path; that path is dead code for our scenario.
A cdb trace would confirm zero hits on `CCylSphere::collides_with_sphere`
for our scenario — but the decomp + ACE agreement is sufficient
evidence to skip the trace. The branch condition is fully resolved by
inspection.
If we wanted defensive verification (recommended only if a fix attempt
fails on first land), the live-trace recipe is:
```
bp acclient!CCylSphere::collides_with_sphere "r $t0=@$t0+1; gc"
bp acclient!CPartArray::FindObjCollisions "r $t1=@$t1+1; gc"
```
Walk into the cottage door from outside for ~10 seconds. Expected:
`@$t0 == 0` (cyl never tested), `@$t1` non-zero. This would settle
the question definitively, but is not blocking the fix.
---
## Question 3 — Compare our ShadowShapeBuilder vs retail's Setup parsing
**Retail STORES both cyl and BSP** for a door whose Setup has both.
The cyl + sphere primitives live in `CPartArray::cylspheres` /
`CPartArray::spheres`, the BSP is per-Part. Retail does not filter at
the storage layer; it filters at the **dispatch** layer via the
`HAS_PHYSICS_BSP_PS` flag.
**Our `ShadowShapeBuilder.FromSetup`** at
[`src/AcDream.Core/Physics/ShadowShapeBuilder.cs:41-110`](../../src/AcDream.Core/Physics/ShadowShapeBuilder.cs)
does the same — emits both a Cylinder shape and per-Part BSP shapes
for Setup `0x020019FF`. **This is correct.** The bug isn't in
registration; it's in dispatch.
**Where we diverge:**
| Step | Retail | Acdream |
|---|---|---|
| Storage | One `CPartArray` per `CPhysicsObj`; cyls + spheres + BSP parts all stored | Flat `ShadowEntry` rows in `_cells[cellId]`; one row per shape, no per-entity grouping at the cell layer |
| Dispatch trigger | `CPhysicsObj::FindObjCollisions` called once per shadow object (per-cell iteration) | `Transition.FindObjCollisions` iterates every `ShadowEntry` in `nearbyObjs` |
| Cyl-vs-BSP branch | Binary on `state & 0x10000` | None — every shape is tested |
| Effect on door | Only BSP tested → clean slab-normal slide | Cyl tested first → radial-normal drives slide into slab |
**Critical observation:** The retail state bit is already on every
acdream `ShadowEntry.State` (uint field), populated at
[`GameWindow.cs:3156`](../../src/AcDream.App/Rendering/GameWindow.cs:3156)
from `spawn.PhysicsState ?? 0u` — ACE delivers it on the wire.
Confirmed via direct check: the door test fixtures
([`DoorBugTrajectoryReplayTests.cs:61`](../../tests/AcDream.Core.Tests/Physics/DoorBugTrajectoryReplayTests.cs:61),
[`DoorCollisionApparatusTests.cs:371`](../../tests/AcDream.Core.Tests/Physics/DoorCollisionApparatusTests.cs:371))
all seed the door with `0x10008u` (= `STATIC_PS | REPORT_COLLISIONS_PS |
HAS_PHYSICS_BSP_PS`). The bit is available — we just don't read it.
---
## Mapping to the three fix options
| Option | Retail-faithful? | Verdict |
|---|---|---|
| **#1 — BSP-first per-entity test order** | NO. Retail isn't "BSP-first"; it's "BSP-only when 0x10000 set." Per-entity ordering would also test the cyl for tree trunks (no BSP) which is correct — but would still test the cyl for doors, which retail doesn't. | Reject. |
| **#2 — Port retail's per-physobj dispatch** | **YES.** This is exactly what retail does. The handoff's scoping ("touches many files; large change") was based on a misread of what Option 2 requires — it does NOT require restructuring `ShadowObjectRegistry` to group shapes by entity. The retail check is per-shape on a state flag already present. | **RECOMMENDED.** ~15 LOC at the per-entry dispatch site. |
| **#3 — Door-cyl-as-informational (skip cyl registration when entity has BSP)** | NO. Retail STORES both shapes in `CPartArray` — registration includes both. Filtering at registration would diverge from retail's data model and risk breaking missile / PvP paths that need the cyl. | Reject. |
The handoff's option-2 worry about "restructure `ShadowObjectRegistry`
to group shapes by entity" is overengineered. The retail check is
local to each shape's `ShadowEntry.State`:
```text
For each ShadowEntry obj in nearbyObjs:
if obj is BSP and (obj.State & HAS_PHYSICS_BSP_PS) is unset, skip (impossible — BSP entries on entities WITH 0x10000 don't need a check; we just need to ensure they DO fire)
if obj is Cylinder/Sphere and (obj.State & HAS_PHYSICS_BSP_PS) is SET and not pvp-target and not missile-ignored, skip
```
Effectively: **suppress cyl/sphere tests when the entity has BSP.**
Implemented as a single `continue` guard inside the existing loop at
[`TransitionTypes.cs:2313`](../../src/AcDream.Core/Physics/TransitionTypes.cs:2313).
No data-structure change. No grouping pass. No new fields.
---
## Recommended next step
**Approve the implementation of a retail-binary dispatch** at the
per-entry site in `Transition.FindObjCollisions`. The fix has these
properties:
1. **Site:** [`src/AcDream.Core/Physics/TransitionTypes.cs:2313`](../../src/AcDream.Core/Physics/TransitionTypes.cs:2313)
(the `if (obj.CollisionType == ShadowCollisionType.BSP) ... else ...`
dispatch).
2. **Guard:** add a continue at the cyl/sphere branch when
`(obj.State & HasPhysicsBspPs) != 0 && !isPvpTarget && !missileIgnore`.
For M1.5 polish we can treat both `isPvpTarget` and `missileIgnore`
as `false` (no PK, no missiles in scope) and add `// TODO: wire
when PK / missiles ship` comments. The simplified guard is
`(obj.State & 0x10000u) != 0`.
3. **Companion constant:** add `HasPhysicsBsp = 0x10000u` to
`PhysicsStateFlags` ([`PhysicsBody.cs:25-43`](../../src/AcDream.Core/Physics/PhysicsBody.cs:25)) —
it's currently absent. Naming matches both retail (`HAS_PHYSICS_BSP_PS`)
and ACE (`HasPhysicsBSP`).
4. **Existing tests that would change outcome under the fix:**
- [`DoorCollisionApparatusTests.Apparatus_Grounded_50cmOffCenter_FrontApproach_DocumentsBug`](../../tests/AcDream.Core.Tests/Physics/DoorCollisionApparatusTests.cs:213)
is in "documents the bug" form — its header comment at lines
285-298 explicitly says "When the fix lands, flip this to
`Assert.True(blocked)`." Fix lands → invert assertion in same
commit.
- Apparatus dead-center + back-approach tests — should remain
PASS (BSP still fires).
- `DoorBugTrajectoryReplayTests` LiveCompare tests — should
remain PASS (BSP-only behavior is closer to live capture).
- `CellarUpTrajectoryReplayTests.LiveCompare_FirstCap_FixClosesCottageFloorCap`
— unrelated path (#98 cottage-floor cap). Unaffected.
- `BSPQueryTests.FindCollisions_Path5_*` — unrelated; tests
`BSPQuery` internals not dispatch. PASS.
- `CellTransitTests.A6P5_*` — unrelated. PASS.
5. **Risks:**
- **Foot-slipping for small entities on the door cyl.** Retail
doesn't have this concern because retail's cyl isn't tested on
the door for the standard mover either — so we won't regress
anything that retail does. If a future fix needs cyl-vs-cyl for
a small dynamic entity (e.g. a chicken bumping the door), that's
a separate problem solved per `MISSILE_PS` / `ebp_1` rules, which
ours already approximate via `CollisionExemption`.
- **Other entities with `0x10000`.** Cottage walls (the static
landblock GfxObj `0xA9B47900`) likely have `HAS_PHYSICS_BSP_PS`
and only register BSP shapes (no cyl) — fix is a no-op there.
NPCs and players have no BSP, no `0x10000`, so the cyl path
continues firing for them — desired.
- **Verification:** run the existing test list from the handoff
(14 tests) post-fix; rerun live launch with all three probes;
expect zero `[cyl-test] obj=0x000F4245` lines in the log.
---
## Verification plan (post-fix)
When the fix lands, a single live launch + 14-test green list is
sufficient verification. The `door-a6p6-v2.utf8.log` showed:
- 117 `hit=yes obj=0x000F4245` resolves
- 350 `[cyl-test] result=Slid` (across all entities)
- 12 phantom `cn=(0.86, 0.51, 0)` resolves attributed to the door
Post-fix expectation in an equivalent capture:
- Door cyl-test count attributed to `obj=0x000F4245`: **0**
- Door BSP `[bsp-test]` calls: unchanged or slightly higher (no
cyl short-circuit)
- `cn=(0.86, 0.51, 0)` phantom on the door: **0**
- Visual confirmation: smooth slide along door slab face from
NE/SE approach.
---
## What this is NOT
- This is **NOT** a recommendation to restructure `ShadowObjectRegistry`.
The flat per-cell list is fine. The retail check is per-shape, not
per-entity.
- This is **NOT** an Option 1 ("BSP-first ordering") fix. Retail does
binary selection, not reordering.
- This is **NOT** an Option 3 ("don't register cyl") fix. Retail
registers both shapes.
- This is **NOT** related to A6.P6's `CCylSphere::step_sphere_up`
port (commit `3d4e63f`). That port is correct — it just doesn't
fire for the door because the cyl is never reached. A6.P6 remains
useful for non-door cylinders (tree trunks, rock pillars).
- This is **NOT** related to the cdb workflow being insufficient — we
could trace it for confirmation but the decomp + ACE agreement makes
inspection sufficient.
- **The cottage-floor cap (#98) is unrelated.** This bug is in entity
collision dispatch; #98 is in cell BSP / GfxObj polygon evaluation.
---
## Citations
| Source | Line(s) | What |
|---|---|---|
| `docs/research/named-retail/acclient.h` | 2815-2843 | `enum PhysicsState``HAS_PHYSICS_BSP_PS = 0x10000` at 2833 |
| `docs/research/named-retail/acclient_2013_pseudo_c.txt` | 276776-276996 | `CPhysicsObj::FindObjCollisions` |
| `docs/research/named-retail/acclient_2013_pseudo_c.txt` | 276861 | Binary dispatch branch |
| `docs/research/named-retail/acclient_2013_pseudo_c.txt` | 276808-276841 | `ebp_1` (PvP-target-player flag) setup |
| `docs/research/named-retail/acclient_2013_pseudo_c.txt` | 274385-274410 | `OBJECTINFO::missile_ignore` |
| `references/ACE/Source/ACE.Server/Physics/PhysicsObj.cs` | 381-454 | ACE's `FindObjCollisions` |
| `references/ACE/Source/ACE.Server/Physics/PhysicsObj.cs` | 412-450 | ACE's binary dispatch (cleaner names) |
| `references/ACE/Source/ACE.Entity/Enum/PhysicsState.cs` | 14, 24 | ACE's `Missile = 0x40` + `HasPhysicsBSP = 0x10000` |
| `src/AcDream.Core/Physics/TransitionTypes.cs` | 2189-2521 | Our `FindObjCollisions` |
| `src/AcDream.Core/Physics/TransitionTypes.cs` | 2313 | Our per-shape dispatch site |
| `src/AcDream.Core/Physics/ShadowShapeBuilder.cs` | 41-110 | Our `FromSetup` (emits both shapes — correct) |
| `src/AcDream.App/Rendering/GameWindow.cs` | 3156 | Where `spawn.PhysicsState` lands on `ShadowEntry.State` |
| `src/AcDream.Core/Physics/ShadowObjectRegistry.cs` | 587 | `ShadowEntry.State : uint` field |
| `src/AcDream.Core/Physics/PhysicsBody.cs` | 25-43 | `PhysicsStateFlags` (currently missing `HasPhysicsBsp`) |
| `tests/AcDream.Core.Tests/Physics/DoorCollisionApparatusTests.cs` | 213, 285-298 | The "documents the bug" fixture; flip-assertion guidance |

View file

@ -0,0 +1,176 @@
# Door bug — retail cdb trace + NegPolyHit dispatch findings
2026-05-25, continuation of door-collision investigation
## TL;DR
cdb attached to retail at a Holtburg cottage door while user walked the
inside-out off-center scenario. The smoking-gun trace identified the
real collision-recording function: **`SPHEREPATH::set_neg_poly_hit`**
fired hundreds of times during the walk; `SPHEREPATH::set_collide`,
`COLLISIONINFO::set_collision_normal`, `set_sliding_normal`,
`add_object` ALL fired zero times.
In our codebase, `NegPolyHitDispatch` exists but **is never called
from any production code path** — it's dead code. The `path.NegPolyHit`
flag is therefore never set. The downstream handler in
`Transition.TransitionalInsert` was a stub that just cleared the flag.
Two-part fix attempted this session:
1. **`BSPQuery.FindCollisions` Path 5** (Contact branch) restructured
to call `NegPolyHitDispatch` when sphere 0 had a near-miss polygon
set but didn't fully penetrate (mirrors retail's `var_5c != 0` case
at `acclient_2013_pseudo_c.txt:0053a6ce-0053a6fb`).
2. **`Transition.TransitionalInsert` NegPolyHit handler** rewritten
to dispatch to `step_up + step_up_slide` (NegStepUp=true) or
record collision normal + return `Collided` (NegStepUp=false).
**Result: fix doesn't fully close the bug.** User still squeezes
through. Diagnostic `[neg-poly-dispatch]` probe shows ZERO hits in
production — the BSP Path 5 changes don't surface NegPolyHit for this
case.
## Why the fix doesn't fire
Retail's `BSPTREE::find_collisions` calls
`vtable->sphere_intersects_poly(localspace_sphere, var_78_6, var_74_6, var_70_8)`
which:
- **Returns `eax_10`**: non-zero on full sphere-vs-poly hit
- **Writes `var_5c`**: closest polygon pointer, set EVEN ON
NEAR-MISS (BSP traversal sets it when entering a leaf containing
candidate polys, regardless of intersection)
So retail records "near miss" polygons during BSP traversal. The
caller dispatches `set_neg_poly_hit(1, var_5c + 0x20)` when sphere 0
returned `eax_10 == 0` but `var_5c != 0`.
Our `SphereIntersectsPolyInternal` only sets `hitPoly` on actual
hits. Near-miss polygons are NOT recorded. So the Path 5 branch
`if (hitPoly0 is not null)` is false → no `NegPolyHitDispatch` call
→ no NegPolyHit set → no dispatch in TransitionalInsert.
## The deeper fix needed
Implement retail's "BSP traversal records closest near-miss polygon"
behavior in `SphereIntersectsPolyInternal` (or a sibling). The
function should return TWO outputs:
- `bool hit` — true if sphere fully penetrates a polygon
- `ResolvedPolygon? closestPoly` — set during traversal to the
polygon that the sphere came closest to (in the BSP node walk),
regardless of whether the full intersection test passed
This requires modifying the BSP recursion to track the "closest
considered" polygon. Retail's sphere_intersects_poly likely tracks
this as a side effect of testing each candidate polygon during the
traversal.
Once that's in place, the existing Path 5 changes + TransitionalInsert
NegPolyHit dispatch should fire correctly and produce the block.
## Second symptom flagged by user (2026-05-25 evening)
User flagged: "we get run a bit into the door as well when it blocks.
That is not retail behavior."
Over-penetration before block = our BSP detects collision AFTER the
sphere has already moved into the surface (static overlap detection)
vs retail's swept-sphere collision (predicts the t-value of first
contact along the motion path and stops the sphere at the surface).
This is the SAME ROOT MECHANISM as the squeeze-through:
sphere_intersects_poly in retail does swept collision with the
motion vector (var_44 = sphere_center - prev_center). Our
`SphereIntersectsPolyInternal` takes a `movement` parameter but the
internal poly-test logic may not actually use it for swept detection.
Verifying: read SphereIntersectsPolyInternal and check whether it
uses the `movement` vector for swept-sphere-vs-poly intersection
testing (computes the t-value where sphere first contacts the poly
along motion), or just does static overlap (sphere center +/- radius
overlaps poly plane). Retail does swept (the `var_44` in
sphere_intersects_poly is the motion delta).
Single fix needed in next session: SphereIntersectsPolyInternal needs to:
1. Implement swept-sphere-vs-poly detection (use the motion vector)
2. Record the closest-considered polygon for near-miss handling
Both feed into the existing Path 5 + TransitionalInsert dispatch
(committed today). Once that single function does its job correctly,
both symptoms close at once.
## What the cdb trace proved
| Symbol | v1 hits | v2 hits | v3 hits |
|---|---|---|---|
| `CPhysicsObj::FindObjCollisions` | 161,081 | 196,608 | 196,608 |
| `CCylSphere::collides_with_sphere` | 35,527 | — | — |
| `SPHEREPATH::set_collide` | **0** | — | — |
| `COLLISIONINFO::set_collision_normal` | — | **0** | — |
| `COLLISIONINFO::set_sliding_normal` | — | **0** | — |
| `COLLISIONINFO::add_object` | — | **0** | — |
| `BSPTREE::slide_sphere` | — | — | **0** |
| `CTransition::cliff_slide` | — | — | **0** |
| **`SPHEREPATH::set_neg_poly_hit`** | — | — | **303+ (fires)** |
| `CTransition::insert_into_cell` | — | — | 3,652 |
Retail records collisions almost exclusively via
`SPHEREPATH::set_neg_poly_hit` during normal-grounded-motion. The
COLLISIONINFO normal/sliding setters fire essentially never for
walking-into-walls scenarios. Our investigation premise was wrong;
the cdb data forced the correction.
## Apparatus + scripts committed
- `tools/cdb/door-inside-out.cdb` — v1 (set_collide check)
- `tools/cdb/door-inside-out-v2.cdb` — v2 (COLLISIONINFO family)
- `tools/cdb/door-inside-out-v3.cdb` — v3 (wide net, found
set_neg_poly_hit)
- `tools/cdb/symbol-probe.cdb` — verifies symbol resolution
## Pickup prompt for next session
```
A6.P4 door inside-out: cdb trace + NegPolyHit dispatch landed
(BSPQuery.FindCollisions Path 5 + TransitionalInsert NegPolyHit
branch) but the fix doesn't fire because our SphereIntersectsPolyInternal
doesn't record near-miss polygons. Retail's sphere_intersects_poly
sets a "closest polygon" output even on non-hits via BSP traversal
side-effect; our equivalent only sets it on full hits.
Read docs/research/2026-05-25-door-bug-cdb-retail-trace-findings.md
State both altitudes:
Currently working toward: M1.5 — Indoor world feels right
Current phase: A6.P4 door bug — implement near-miss polygon
recording in SphereIntersectsPolyInternal.
TWO SYMPTOMS to fix simultaneously (same root cause):
(a) Off-center inside-out: sphere walks (or squeezes) past door
(b) When blocked: sphere visibly penetrates the door before stopping
Both = static overlap detection without near-miss recording.
Retail uses swept-sphere-vs-poly intersection (uses motion vector
to compute t-value of first contact, stops sphere at surface)
AND records the closest near-miss polygon during BSP traversal.
First move: read SphereIntersectsPolyInternal in
src/AcDream.Core/Physics/BSPQuery.cs. Check whether the `movement`
param is actually used for swept-sphere-vs-poly testing. If not
(just static overlap), that's symptom (b). Add swept detection
and a "closestPoly" output param set on ANY polygon considered
during traversal (not just hits). That closes symptom (a) too.
Then the Path 5 branch `if (hitPoly0 is not null)` will fire on
near-miss cases, NegPolyHitDispatch will set NegPolyHit, and the
TransitionalInsert dispatch (already landed) will block the sphere
at the surface (swept-detected t-value), not after penetration.
Retail oracle: BSPTREE::find_collisions + sphere_intersects_poly
vtable call at acclient_2013_pseudo_c.txt:0053a630-0053a6fb.
Visual verification: same scenario (Holtburg cottage door,
inside-out, ~50cm off-center). Should block fully, no squeeze-through.
Outside-in should still work. Issue #98 cellar cap must still pass.
```

View file

@ -0,0 +1,265 @@
# Door bug — inside-out walkthrough: missing cottage exterior wall (geometry gap)
2026-05-25, continuation of door-collision investigation
## TL;DR
The inside-out walkthrough that persisted after the
`AddAllOutsideCells` fix is **NOT a collision-detection bug**. It's a
**collision-geometry GAP**: the cottage's north exterior wall east
(and presumably west) of the doorway opening doesn't exist in any
registered entity our engine knows about. The sphere walks past the
door slab on its east side, clears the doorway alcove cell's small
east wall (Y range [16.5, 17.1]), and then has nothing in front of it
in the collision representation — even though the VISUAL cottage has
a wall there.
## Apparatus diagnostics
Three new tests landed (in `DoorBugTrajectoryReplayTests`):
1. `Directional_OutsideIn_SouthApproach_BlocksAtSlabSouthFace` — sphere
south moving north blocks. PASSES.
2. `Directional_InsideOut_NorthApproach_BlocksAtSlabNorthFace` — sphere
north moving south blocks. PASSES.
3. `Geometric_DoorSlabAtSphereHeight_OverlapsInZ` — pins slab world Z
range = [94.139, 96.630]; sphere top at Z=95.20 IS within slab.
The slab is at sphere height — BSP collision is geometrically active.
4. `InsideOut_Tick3254_WithCottageWalls_ShouldBlock` — hypothesis test
adds cottage GfxObj 0x01000A2B. Result: cottage DID block but with
cn=(0,0,1) — a floor-cap response, NOT a wall response.
5. `Diagnostic_CottagePolys_NearWalkthroughPosition` — dumps cottage
polygons near sphere XY=(133.655, 17.59), any Z. **Result: ZERO
cottage polygons in that area.** The cottage GfxObj has no
geometry where the sphere walks through.
`DoorSetupGfxObjInspectionTests.HoltburgCottage_CellPortals_DatInspection`
extended to dump cell 0xA9B40150's 4 physics polys in world frame:
```
[0] sides=Landblock X=[131.600, 133.500] Y=[16.500, 17.100] Z=[94.000, 94.000] FLOOR
[1] sides=Landblock X=[131.600, 131.600] Y=[16.500, 17.100] Z=[94.000, 96.500] WEST WALL
[2] sides=Landblock X=[131.600, 133.500] Y=[16.500, 17.100] Z=[96.500, 96.500] CEILING
[3] sides=Landblock X=[133.500, 133.500] Y=[16.500, 17.100] Z=[94.000, 96.500] EAST WALL
```
Cell 0xA9B40150 is the **doorway alcove** — a small ~1.9m × 0.6m × 2.5m
volume between the cottage interior and the outdoor area. Its east wall
only extends Y=[16.5, 17.1]. **North of Y=17.1, no wall** in this cell.
The captured failing sphere at (133.655, 17.59) is 0.155m east of the
east wall AND 0.49m NORTH of the wall's Y range. The wall doesn't
reach the sphere.
## The collision-geometry gap
Visual representation (in-client):
- Cottage has a north exterior wall east and west of the doorway opening
- The wall extends Y > 17.1 (north of the alcove)
- User sees their character partially clipping into this wall
Collision representation (what we register):
- Cottage GfxObj 0x01000A2B: **0 polygons** in the area (133.655, 17.59, 94-95.20)
- Cell 0xA9B40150 (alcove): walls only at Y=[16.5, 17.1]
- Door slab: only spans X=[131.635, 133.560] — too narrow to cover the cottage opening
- Outdoor cell 0xA9B40029: outdoor cell, no walls
**Net: no entity has wall polygons at (133.655, Y > 17.1).** Sphere can
walk there freely.
## Verification in production capture
`door-fix-inout2.launch.log` shows:
- Cottage GfxObj `[bsp-test]` fires 425 times during inside-out walking
(so visibility is correct post-fix)
- Door slab `[bsp-test]` fires 245 times
- Captured tick 3254: sphere at (133.655, 17.590), target (133.549,
17.599). Result: position X=133.655 unchanged (blocked westward),
position Y=17.599 (moved north freely). cn=(+1, 0, 0) = slab east
face normal.
- The slab east face blocks WEST motion correctly. The sphere is FREE
to move north because no geometry covers (133.655, Y > 17.1).
## UPDATE (2026-05-25 evening): the wall EXISTS, but isn't blocking
Continued investigation with a wider polygon search in
`Diagnostic_CottagePolys_NearWalkthroughPosition` revealed the cottage
DOES have the missing wall:
```
poly 0x0032 n=(0.00, +1.00, 0.00) X=[133.50, 136.30] Y=[17.10, 17.10] Z=[94.00, 97.00]
poly 0x0033 n=(0.00, +1.00, 0.00) X=[133.50, 136.30] Y=[17.10, 17.10] Z=[94.00, 97.00]
```
(Plus symmetric polys 0x0030, 0x0031, 0x0034, 0x0035 covering X<131.6,
0x0037, 0x0038, 0x003A, 0x003B above the doorway lintel.)
The cottage's north exterior wall east of doorway IS at world (X=[133.5,
136.3], Y=17.10, Z=[94, 97]), normal +Y. **This wall SHOULD block sphere
at X=133.655 (sphere west edge at 133.175 ≤ wall X range, sphere south
edge at 17.110 ≤ wall Y).**
The new question: WHY isn't the wall blocking in production?
Sphere at world (133.655, 17.59) at the captured failing tick:
- Sphere XY: X=[133.175, 134.135], Y=[17.110, 18.070]
- Sphere overlaps wall in X (133.175..134.135 vs 133.5..136.3) by 0.635m
- Sphere south edge at Y=17.110 ALIGNS with wall at Y=17.10 (0.010m past)
- Sphere CENTER at Y=17.59 is 0.49m north of wall
- Distance from sphere center to wall plane: 0.49m. Sphere radius 0.48m.
- |dist| (0.49) ≈ radius (0.48). Sphere is JUST grazing the wall plane.
At this exact tick the sphere CENTER is 0.49m north of wall; sphere
south edge is 0.01m north of wall. Sphere is BARELY past the wall.
So this tick isn't where the walkthrough happens. The walkthrough is
EARLIER — when sphere center Y went from 17.58 (just past wall by reach)
to 17.59. The crossing must have allowed the sphere through.
OR: the sphere never actually crossed the wall — it walked around it.
Cottage wall east of doorway is X=[133.5, 136.3]. Sphere at X=133.655
is barely in the wall's X range. If sphere came from X < 133.5 (where
no east wall exists) and shifted east while sliding along the slab,
it could end up at X > 133.5 having NEVER crossed the wall plane.
Cell transit data confirms: tick 1549 outdoor→indoor at X=132.859,
tick 2586 indoor→outdoor at X=134.022 (way past wall east edge).
**The sphere reached X=134.022 inside cottage geometry somehow.**
Sphere fitting through doorway opening requires center X in
[131.6+0.48, 133.5-0.48] = [132.08, 133.02]. Tight. The user's
off-center test (~50cm east) puts sphere at edge of opening or
past. Sphere is sliding against the slab east face (cn=(+1,0,0))
which gradually pushes it east. Eventually sphere center exceeds
X=133.5 — past the cottage east wall's start. From that position,
sphere can move north WITHOUT crossing the wall plane (sphere
center already north of Y=17.10 from prior sliding).
**This may be retail-faithful behavior** OR a bug in sphere-vs-corner
collision. The corner where alcove east wall (X=133.5, Y=[16.5,17.1])
meets cottage north wall (X=[133.5,136.3], Y=17.10) is a degenerate
edge. Sphere sliding along the alcove east wall (moving +Y) reaches
the corner at (133.5, 17.10) — should encounter the cottage wall
and be stopped. If our engine handles the corner transition
incorrectly, sphere slides past.
## What's next (revised AGAIN — corner test PASSED, bug is state-related)
**Corner-slide hypothesis: FALSIFIED.** `CornerSlide_AlcoveEastToCottageNorth_ShouldBlock`
test runs cottage GfxObj + cell 0x0150 BSP both registered. Places
sphere at (132.95, 16.8, 94) inside alcove near east wall. Walks +Y
50 times at 0.05 m/tick. **Sphere stays put at (132.95, 16.8) for all
50 ticks with cn=(0.71, -0.71, 0)** — the corner normal between
alcove east wall and cottage north wall. **The corner handling works
correctly in the harness.**
So production's walkthrough is **a STATE difference**, not a geometric
or collision-detection bug. The harness's sphere can't reach
X=133.655 inside the cottage geometry. Production's sphere does
reach it somehow.
Differences between harness and production:
- Harness uses identity walkable polygon (big quad). Production uses
real cell walkable polys (small, with edges).
- Harness has stub landblock terrain at Z=-1000. Production has real
terrain.
- Harness uses fresh body each tick. Production has accumulated state
from many prior ticks (velocity, contact plane history, etc.).
- Harness uses sphereRadius=0.48 + sphereHeight=1.20 exactly. Production
matches but might have different stepUp / stepDown.
**Next-session apparatus**: replay the EXACT captured tick 2586's body
state through the corner-blocking test setup. Tick 2586 was where
sphere went from indoor cell 0x0150 to outdoor cell 0x0029 at
PrevPy=17.586, Py=17.586 (no Y motion) with X=134.022 (way past alcove
east wall). That tick is the smoking-gun "how did sphere get to X=134
inside alcove" event. Load its body state into the harness, replay
the call, see what the engine reports about getting to that position.
If the harness blocks (sphere can't reach X=134), then production has
state we're not capturing — probably accumulated push/depenetration
across many earlier ticks. If the harness reproduces sphere at X=134,
the bug is in the specific body state at that moment.
The cleanest path forward is **cdb attach to retail** as the original
handoff recommended. Inspect what retail does FRAME-BY-FRAME at the
same doorway approach. If retail walks the user inside cottage at
off-center approach EXACTLY like we do — the bug isn't a bug, and
we should accept the behavior. If retail blocks cleanly — diff
retail's body state evolution vs ours to find the divergence.
## OLD (superseded) "what's next" candidates
**Identify which entity SHOULD own the cottage's north exterior wall
east of the doorway.** Three candidates:
1. **A different cottage GfxObj.** Holtburg cottages might be
multi-piece (separate GfxObjs for wall sections, doorway frame, roof).
The cottage we have (0x01000A2B) might be one of multiple. Check
the landblock's static-entity list for other GfxObjs at the cottage
position via `[entity-source]` log + Setup file.
2. **A landblock-baked "stab"** (separate static entity registered at
spawn time). LandblockLoader produces these. Check `LandBlockInfo`
dat record for landblock 0xA9B4 — what other entities are at world
(~133, ~18)?
3. **The cottage GfxObj's drawing geometry is wider than its physics.**
If 0x01000A2B has `Polygons` (visual) at the wall location but no
`PhysicsPolygons` (collision), the visual is wider than the
collision. This is a dat-data fact — not fixable without retail
re-engineering of the dat.
For candidates 1-2, the fix is "register the missing entity." For 3,
the bug is dat-side (or retail accepts the same walkthrough we do).
**Cheapest next-step test:** add a method to
`DoorSetupGfxObjInspectionTests` that loads `LandBlockInfo` 0xA9B4FFFE
(landblock-baked statics) and prints every static at world XY in
[131, 135] × [16, 19]. The output will name what other GfxObjs/Setups
are registered at the cottage doorway — if any include the missing
wall, we know what to register additionally.
## Apparatus committed
- `tests/AcDream.Core.Tests/Physics/DoorBugTrajectoryReplayTests.cs`:
faithful door registration, directional collision tests, geometric
pin test, cottage GfxObj hypothesis test, cottage polygon dump.
- `tests/AcDream.Core.Tests/Physics/DoorSetupGfxObjInspectionTests.cs`:
HoltburgCottage_CellPortals_DatInspection extended with cell-poly
world-frame dump.
All tests under `DoorBugTrajectoryReplayTests` and the extended
`DoorSetupGfxObjInspectionTests.HoltburgCottage_CellPortals_DatInspection`
PASS (skip on CI when dat dir absent).
## Pickup prompt for next session
```
A6.P4 door inside-out walkthrough: identified as collision-geometry
gap, NOT collision-detection bug. The cottage's north exterior wall
east+west of the doorway opening isn't represented in any registered
entity. Sphere walks freely at (133.655, 17.59) — no wall to block.
Read docs/research/2026-05-25-door-bug-inside-out-geometry-gap.md
+ Diagnostic_CottagePolys_NearWalkthroughPosition test output
+ HoltburgCottage_CellPortals_DatInspection dump for cell 0x0150
State both altitudes:
Currently working toward: M1.5 — Indoor world feels right
Current phase: A6.P4 door bug — find missing cottage wall entity.
The fix isn't in BSP, cells, or AddAllOutsideCells
— those are correct. The collision geometry has a
gap. Need to identify which entity SHOULD own the
wall and register it.
First move: add a LandblockStatics_DatInspection test to
DoorSetupGfxObjInspectionTests that loads LandBlockInfo 0xA9B4FFFE
+ iterates StaticObjects. Print every entity at world XY in
[131, 135] x [16, 19] — name + setup id + position. Will reveal
what other entities (if any) live at the cottage doorway.
If a wall-bearing entity exists but we're not registering it: fix
the registration path. If nothing exists: the dat doesn't have the
wall, and this might be retail-faithful behavior we have to accept
(or compensate for by widening the door slab via gameplay layer).
```

View file

@ -0,0 +1,232 @@
# Door bug — partial fix shipped (cell visibility), inside-out asymmetric collision remains
2026-05-25
## TL;DR
**Major root cause closed.** `CellTransit.AddAllOutsideCells` was
silently failing for every production caller because it assumed sphere
positions were in absolute world coordinates (subtracting the
landblock's "absolute" world origin `lbXf = 0xA9 * 192 = 32448`), while
production has used landblock-local coordinates since Phase A.1
(streaming-center landblock at world origin → `lbOffset = (0, 0)`).
For outdoor primary cells the bug was masked by `GetNearbyObjects`'s
radial sweep. For indoor primary cells (where issue #98's gate skips
the outdoor sweep), it meant **outdoor cells were never added to
`portalReachableCells`** → cottage door's outdoor cell `0xA9B40029`
invisible from indoor cell `0xA9B40150` → door's BSP never queried
→ player walked through.
**Outside→inside now blocks correctly. Inside→outside REMAINS BROKEN
asymmetrically.** Body partially intersects the door, slides through
visibly. Not retail-faithful. This is a SEPARATE bug in
BSP-collision-response for two-sided polygons — to investigate next
session.
## Apparatus shipped
Full trajectory-replay harness:
1. **Live capture** (`door-walkthrough.jsonl` from previous session; not
committed): 24,310 records of `PhysicsEngine.ResolveWithTransition`
calls including PhysicsBody snapshots before/after.
2. **Fixture extraction**
([tests/AcDream.Core.Tests/Fixtures/door-bug/live-capture.jsonl](../../tests/AcDream.Core.Tests/Fixtures/door-bug/live-capture.jsonl), 4 KB):
tick 13558 (the walkthrough) + tick 22760 (the working outdoor block)
as representative records.
3. **Replay harness**
([DoorBugTrajectoryReplayTests.cs](../../tests/AcDream.Core.Tests/Physics/DoorBugTrajectoryReplayTests.cs)):
- `LiveCompare_*` tests load the failing tick + replay through the
harness + diff result fields vs captured live values.
- `FindTransitCellsSphere_IndoorExitPortal_AddsOutsideForCapturedSpherePos`
— direct unit test for cell-portal traversal at the captured
sphere position. PASSES (cell graph is correct).
- `AddAllOutsideCells_LandblockLocalSphere_AddsDoorOutdoorCell`
— direct unit test that pinpointed the root cause. **Initially
failed** (`AddAllOutsideCells` returned empty when given
landblock-local sphere coords). **Now passes after fix.**
4. **Dat-direct cell-portal inspector**
([DoorSetupGfxObjInspectionTests.HoltburgCottage_CellPortals_DatInspection](../../tests/AcDream.Core.Tests/Physics/DoorSetupGfxObjInspectionTests.cs)):
reads `EnvCell` + `Environment.Cells` + portal `Polygon.Plane` from the
real dat for cells `0xA9B40150` (doorway alcove), `0xA9B4013F`
(cottage interior), `0xA9B40029` (outdoor — confirmed NOT EnvCell).
Output: cell `0xA9B40150` HAS a 0xFFFF exit portal at poly `0x0005`
with plane `n_local=(0, +1, 0), d_local=+5.6`. The sphere-vs-plane
math (sphere world `(132.36, 16.81, 94)` → local `(-1.86, -5.31, 0)`
via 180° Z rotation → `dist = +0.29` within `±rad=0.5` → straddles)
confirmed `exitOutside` SHOULD fire — but `AddAllOutsideCells` then
silently dropped the outdoor cell.
## The fix
[src/AcDream.Core/Physics/CellTransit.cs](../../src/AcDream.Core/Physics/CellTransit.cs)
`AddAllOutsideCells` no longer subtracts the landblock's
"absolute" world origin from the sphere position. Treats
`worldSphereCenter` as landblock-local directly (matching retail's
`CLandCell::add_all_outside_cells` which uses the per-cell 6-byte
position struct, and matching production's universal convention since
Phase A.1).
Existing tests in
[CellTransitAddAllOutsideCellsTests.cs](../../tests/AcDream.Core.Tests/Physics/CellTransitAddAllOutsideCellsTests.cs)
and
[CellTransitFindCellSetTests.cs](../../tests/AcDream.Core.Tests/Physics/CellTransitFindCellSetTests.cs)
updated to use landblock-local sphere coords (they were the only
callers using the world-coord convention; production never did).
## Visual verification
User tested all four combinations at a closed Holtburg cottage door,
~50cm off-center:
| Direction | Speed | Pre-fix | Post-fix |
|---|---|---|---|
| outside → inside | RUN | walks through | **BLOCKS** ✅ |
| outside → inside | WALK | walks through | (presumed BLOCKS — not retested) |
| inside → outside | RUN | walks through | **PARTIAL** ⚠️ body intersects door, sometimes through |
| inside → outside | WALK | walks through | **PARTIAL** ⚠️ same as run |
User quote: *"We have partial blocking from inside out. Can get
through some times. However, char is blocked a bit through the door.
So for example if I'm running towards this from the inside, I can see
parts of the body getting blocked a bit in to the door. This is not
per retail behavior and this is not how it looks when its block from
the outside"*.
The asymmetry is the new diagnostic: outside-in produces a clean block
(no body-into-door intersection visible); inside-out produces a partial
block with visible body intersection. This is the signature of an
**asymmetric collision response** to the door slab's two-sided
polygons (`SidesType=Landblock`), or a **BSP query that handles
sphere-already-overlapping-slab differently from sphere-approaching-slab**.
The `[bsp-test]` probe fires 245 times for the door entity during the
post-fix inside-out attempts — door IS being queried. The
collision-detection mechanics produce the wrong response.
## What's next (separate bug)
**Investigation status (corrected 2026-05-25 late evening).** Two new
directional tests + a geometric pin test all PASS:
- `Directional_OutsideIn_SouthApproach_BlocksAtSlabSouthFace` PASSES.
- `Directional_InsideOut_NorthApproach_BlocksAtSlabNorthFace` PASSES.
- `Geometric_DoorSlabAtSphereHeight_OverlapsInZ` PASSES.
The geometric test reveals (correctly computed this time):
```
Setup 0x020019FF (cottage door) PhysicsPolygons local AABB:
min=(-0.954, -0.134, -1.236) max=(0.971, 0.127, 1.255)
(slab origin at GEOMETRIC CENTER, not the bottom)
partFrame[0].Origin = (-0.006, 0.125, 1.275) → lifts slab origin
1.275 m above entity Z
With entity at world (132.6, 17.1, 94.1) + 180° entity rotation:
partWorldPos = (132.606, 16.975, 95.375)
Slab WORLD AABB:
X: [131.635, 133.560] (1.925 m wide)
Y: [16.848, 17.109] (0.261 m thick)
Z: [94.139, 96.630] (2.491 m tall, bottom JUST above floor)
Player sphere at foot Z=94:
Z: [94, 95.20]
Slab DOES overlap sphere in Z (overlap Z=[94.139, 95.20] = 1.061 m).
```
**The slab IS at sphere height — it should collide.** Both directional
tests prove BSP collision response is symmetric for sphere-to-slab
approach. Yet production shows asymmetric inside-out walkthrough at
off-center positions. The bug must be in one of:
1. **The portal-reachable cells from indoor cell 0x0150 still miss the
door's shadow at certain sphere positions**, despite the
AddAllOutsideCells fix. The user's walkthrough at X=133.655 (1.05 m
east of door center) puts the sphere mostly east of slab X range
[131.635, 133.560]. The sphere's WEST edge (X=133.175) is barely
inside the slab. If GetNearbyObjects's outdoor radial sweep uses
sphere center XY for cell lookup, it computes
gridX = (int)(133.655 / 24) = 5 → cell 0xA9B40029. But AddAllOutsideCells
only adds cells based on the sphere's PRIMARY position. The east-cell
neighbor might not be added if the sphere is wholly within the primary
cell's grid XY. Worth verifying.
2. **The BSP polygon-level test for partial-overlap geometry.** Sphere
half-east-of-slab, sphere south edge at slab north edge, moving +Y:
sphere is on the verge of leaving the slab volume. BSPQuery's polygon
intersection might consider this a "leaving collision" with no
response, even though the sphere body still partially occupies the
slab volume. Retail might handle this as "depenetration push" to
resolve the overlap.
3. **Cell BSP (cell 0x0150's PhysicsPolygons) is missing**. The doorway
alcove cell has 4 physics polygons — likely walls + floor. If retail
relies on the cell's walls to catch sphere-vs-doorway-side-wall
collisions (in addition to the door slab), and we're not loading /
testing the cell BSP correctly for the player's foot at sphere
height, the side walls would miss.
Three candidate investigations, ranked by ROI:
**A. cdb attach to retail** at a Holtburg cottage doorway. Break on
`CTransition::FindObjCollisions` for the door entity. Inspect what
shapes retail actually tests against. THIS IS DEFINITIVE — answers
"what should we be doing differently" in 15-30 min. CLAUDE.md has the
toolchain ready.
**B. Reproduce inside-out walkthrough at unit-test speed.** Load real
cell 0x0150 BSP into the harness (via CacheCellStruct from dat) +
register door at faithful transform + replay captured tick 3262.
If walkthrough reproduces at unit speed, can iterate on the fix in
<500 ms.
**C. Audit GetNearbyObjects radial sweep + AddAllOutsideCells coverage**
for east-neighbor cell when sphere XY is at primary cell boundary.
Recommendation: **A first** (cdb), then **B** to validate the fix at
unit-test speed.
## Commits
[List the commit SHAs of the apparatus + fix once landed.]
## Pickup prompt for the next session
```
Door bug — major root cause closed (CellTransit.AddAllOutsideCells
landblock-local coord convention). Outside→inside now blocks. But
inside→outside has asymmetric BSP collision response: body partially
intersects the door slab, sphere slides through. Same behavior at run
+ walk speed. Bug is in BSP collision response for two-sided polygons
or sphere-already-overlapping-slab handling.
Read docs/research/2026-05-25-door-bug-partial-fix-shipped.md
State both altitudes:
Currently working toward: M1.5 — Indoor world feels right
Current phase: A6.P4 door bug — inside-out asymmetric BSP collision
response. Apparatus is shipped (DoorBugTrajectoryReplayTests).
First major root cause closed. Remaining bug is in
BSP-collision-response mechanics, not cell visibility.
First move: extend the existing DoorBug apparatus with a more
faithful door registration (entity at the actual production world
pos + correct rotation; use the partFrame from the dat). Then write
TWO directional tests: sphere approaching the slab from the south
(outside-in) and sphere approaching from the north (inside-out).
Compare cn normal + resolution for each. The asymmetric response
will reproduce at unit-test speed. From there, inspect
BSPQuery.FindCollisions's handling of two-sided polygons and
sphere-already-overlapping cases. Retail oracle:
CBSPTree::find_collisions family at acclient_2013_pseudo_c.txt.
DO NOT:
- Re-investigate cell visibility (closed by AddAllOutsideCells fix)
- Re-do the registration shape (multi-part registration is correct)
- Speculate on the BSP fix without apparatus
```

View file

@ -0,0 +1,313 @@
# Issue #100 shipped + indoor-cell culling investigation handoff
**Date:** 2026-05-25 PM
**Status:** Issue #100 SHIPPED (visually verified for primary acceptance). Visual verification surfaced a NEW finding in the same family as issue #78 — outdoor terrain mesh visible inside cottage cellars at certain camera angles. Next session: deep investigation + plan + port retail's indoor-cell visibility culling to close the family.
**Branch:** `claude/strange-albattani-3fc83c` (worktree)
**Predecessor handoff:** [docs/research/2026-05-25-issue-100-terrain-cutout-handoff.md](2026-05-25-issue-100-terrain-cutout-handoff.md) (the prior session's smoking-gun research that drove the #100 fix).
---
## TL;DR
**Shipped this session (3 commits):**
- `f48c74a` — Task 1: terrain shader Z nudge (retail `zFightTerrainAdjust = 0.00999999978`)
- `a64e6f2` — Task 2: removed ~50 LOC of `hiddenTerrainCells` / `BuildingTerrainCells` plumbing across 7 source files + 2 test files, closed #100 in ISSUES.md
- `84e3b72` — docs: stabilized Task 2's SHA reference in ISSUES.md (follow-up commit, not amend)
**Visual verification result (Holtburg, live ACE):**
- ✅ **Primary acceptance:** transparent rectangles around houses are GONE. Ground reads as continuous cobblestone / grass around every cottage observed. Issue #100's user-visible symptom is closed.
- ❌ **New finding:** standing inside a cottage cellar with the camera positioned such that the cottage walls don't fully occlude the view, the outdoor terrain mesh renders as a sharp-edged grass rectangle over the cellar stairs and floor. **Clears when the camera moves closer** (camera position changes such that cottage geometry properly occludes). **Gameplay unaffected** — player can walk down/up the stairs normally.
**Root cause hypothesis for the new finding (HIGH CONFIDENCE):** indoor-cell visibility culling is not gating outdoor terrain rendering. The outdoor terrain mesh is now (correctly per retail) rendered everywhere on the 192 m landblock — including in 3D regions occupied by indoor `CEnvCell` volumes. When the camera is in an indoor cell, the outdoor terrain mesh should be EXCLUDED from the draw set unless an outdoor cell is reachable via portal LOS from the camera's cell. acdream does not currently perform this culling.
**This is the same root cause as filed issue #78** ("Outdoor stabs/buildings visible through the rendered floor" at the inn), just with outdoor *terrain* affected instead of outdoor *stabs*. #78 was filed 2026-05-19 with the hypothesis "Outdoor stabs aren't being culled when the player is inside an EnvCell — this is the Phase 1 Task 3 deferred work ('Cull outdoor stabs when indoors via VisibleCellIds')." We never returned to it.
---
## The visibility-culling issue family
Three filed/observed issues likely share infrastructure:
| ID | Symptom | Domain |
|---|---|---|
| **#78** (OPEN) | Inside Holtburg Inn, outdoor stabs/buildings visible THROUGH the floor and walls | Outdoor stabs not culled when camera in indoor cell |
| **Cellar-stairs** (NEW, observed 2026-05-25 PM) | Inside cottage cellar, outdoor terrain mesh visible covering stair geometry at certain camera angles | Outdoor terrain not culled when camera in indoor cell |
| **#95** (OPEN) | Entering dungeon via portal, `visibleCells` per cell jumps from ~4-7 to **135-145**, including cells from other landblocks; see-through walls, other-dungeon geometry visible | Indoor→indoor portal-graph traversal blowup (over-inclusion) |
#78 and the cellar-stairs finding are the **same bug** (outdoor geometry not culled when camera is in an indoor cell) with different geometry classes affected. **They should close together.**
#95 is a sibling — same visibility-culling SUBSYSTEM but different specific failure (indoor→indoor over-inclusion via unrooted portal recursion). It might or might not close as a side effect of the #78/cellar-stairs fix; the next session should determine if the infrastructure overlaps enough to fix both, or whether #95 needs its own work.
Additional adjacent issues (probably NOT same root cause but worth noting):
- **#79, #80, #81, #93, #94** — indoor lighting bugs. Filed under A7 (M1.5 lighting fidelity). Some may share visibility plumbing (e.g., if lights from outdoor entities leak into indoor cells, that's a visibility issue).
---
## Why I'm confident this is culling, not Z-fighting
Three signals, ordered by weight:
1. **Patch geometry is too large.** A Z-precision Z-fight at coplanar 1 cm separation would manifest as a thin ~0.3 m strip on the topmost stair tread (Z=94). The observed patch is sharp-edged rectangular geometry the size of a terrain cell footprint (likely 24 m × 24 m in landblock-local space), covering multiple stair steps and floor area. That's a polygon, not a precision artifact.
2. **"Clears when closer" matches geometric occlusion, not depth precision.** If 1 cm depth-buffer precision were failing, closer camera distance would PASS more cleanly (precision tightens). The user reports the patch clears as they approach the stairs — consistent with cottage walls + stair treads now occluding the terrain in screen space. At 2-5 m camera distance and 24-bit depth buffer, the 1 cm nudge has sub-millimeter resolving power; precision is not the bottleneck.
3. **Exact match for #78's hypothesis #2 mechanism.** #78 ("outdoor stabs visible through cell walls") was filed 2026-05-19 with hypothesis: outdoor stabs aren't culled when player is in an EnvCell; WB has a `RenderInsideOut` stencil pipeline that acdream never invokes. The cellar-stairs case is the same mechanism applied to outdoor terrain mesh.
**One test that could falsify culling-as-cause:** stand at the spot showing the artifact, look at the grass patch, rotate the camera slowly without moving the character. If the patch FLICKERS / shimmers as you turn, that's Z-fight (depth precision unstable across angles). If the patch stays geometrically stable (its polygon edges move predictably with the camera, but it doesn't flicker), that's culling. The screenshot suggested polygon-stable edges — consistent with culling — but rotating the camera is the definitive test, and the next session should do this in the first 60 seconds of visual checking before planning the fix.
---
## Existing apparatus the next session can use
### acdream's current visibility code
**[`src/AcDream.App/Rendering/CellVisibility.cs`](../../src/AcDream.App/Rendering/CellVisibility.cs)** — portal-based interior cell visibility system ported from ACME's `EnvCellManager.cs`. Exposes:
- `FindCameraCell(...)` — resolves which EnvCell the camera is in.
- `PointInCell(...)` — point-in-cell test with `PointInCellEpsilon = 0.01f`.
- `GetVisibleCells(...)` — returns `VisibleCellIds` set for the camera's current cell, via portal-chain traversal.
- `CellSwitchGraceFrameCount = 3` — anti-flicker grace period for cell transitions.
**[`src/AcDream.App/Rendering/Wb/WbDrawDispatcher.cs`](../../src/AcDream.App/Rendering/Wb/WbDrawDispatcher.cs)** — per-entity draw filter. Per ISSUES.md #78 line 165, "the dispatcher already filters by `entity.ParentCellId ∈ visibleCellIds` but outdoor stabs have `ParentCellId == null` so they always pass." This is the gate we need to extend.
**[`src/AcDream.App/Rendering/TerrainModernRenderer.cs`](../../src/AcDream.App/Rendering/TerrainModernRenderer.cs)** — terrain dispatcher. Currently renders ALL loaded landblocks unconditionally. Needs to learn about indoor-camera-state to optionally skip outdoor-cell terrain cells.
### Probes available
From CLAUDE.md "Diagnostic env vars":
- `ACDREAM_PROBE_CELL=1` — one `[cell-transit]` line per `PlayerMovementController.CellId` change. Useful for verifying when the camera is in an indoor vs outdoor cell.
- `ACDREAM_PROBE_RESOLVE=1` — full physics resolver trace.
- Runtime-toggleable via the DebugPanel "Diagnostics" section.
No existing probe instruments the rendering visibility decision — the next session might add one (`ACDREAM_PROBE_VIS=1` that logs the camera's resolved cell + `VisibleCellIds` set per N frames).
### Retail oracle anchors
```
docs/research/named-retail/acclient_2013_pseudo_c.txt:311397
CEnvCell::find_visible_child_cell (address 0x0052dc50)
docs/research/named-retail/acclient_2013_pseudo_c.txt:280028
call site: eax_6 = CEnvCell::find_visible_child_cell(eax_5, &__return, arg5);
```
Grep further for `find_visible`, `visibility`, `cull`, `RenderDeviceD3D::DrawBlock`, `ACRender::draw`, etc. The retail render loop's visibility chain — pre-frame walk-down from the camera's cell through portal-visible neighbours — is the target to port.
### WorldBuilder reference
```
references/WorldBuilder/Chorizite.OpenGLSDLBackend/Lib/VisibilityManager.cs
references/WorldBuilder/Chorizite.OpenGLSDLBackend/GameScene.cs
```
WB has a `RenderInsideOut` mechanism in these files. Per #78's hypothesis, "acdream never invokes" this pipeline. The next session should determine whether to (a) invoke WB's existing code from our render path, (b) port the algorithm to acdream's namespaces, or (c) write a retail-faithful port from the named-retail decomp directly. CLAUDE.md's WB inventory policy applies — read [docs/architecture/worldbuilder-inventory.md](../architecture/worldbuilder-inventory.md) before deciding.
---
## Do-not-retry list for the next session
1. **Don't try to roll back the #100 fix.** The transparent-rectangle bug was a universal symptom on every Holtburg house. The cellar-stairs artifact is conditional and camera-angle-dependent. Reverting #100 trades a worse bug for a less-bad one.
2. **Don't try to solve the cellar-stairs case by lowering the terrain Z further** (e.g., bumping the shader nudge from 0.01 to 0.1 or 1.0). The visible terrain is rendered at its correct Z; the issue is that it's visible AT ALL inside the indoor cell. Bigger nudge doesn't help and would break coplanar-floor disambiguation elsewhere.
3. **Don't try to solve it by hiding terrain cells based on the building footprint again.** That was issue #100's bug — cell-level hiding is too coarse (cottage ~12 m × 12 m in a 24 m × 24 m cell). The right granularity is per-camera-state visibility, not per-cell mesh modification.
4. **Don't try to fix this with depth tricks** (disable depth-write for terrain, etc.) — those break elsewhere and aren't retail-faithful.
5. **Don't conflate this with #82** (some slope terrain lit incorrectly). #82 is about per-vertex normal calculation; the cellar-stairs artifact is about which polygons render at all, not how they're shaded.
6. **Don't try to land a 1-line fix for this.** Indoor-cell visibility culling is a real system to port. Single-line patches at the symptom site (e.g., "if camera in cellar, skip terrain") would close cellar-stairs but not #78 — and would be the kind of workaround CLAUDE.md prohibits. Per the project rule, fix the root cause: port the visibility computation properly.
7. **Don't trust the WB `RenderInsideOut` code blindly.** WB's editor view has known visibility quirks (per the predecessor handoff: "WB has a known Z-fighting issue in the editor view that nobody noticed because it's editor-only"). Cross-reference WB against retail before adopting.
---
## Open questions for the next session to answer
1. **Is the cellar-stairs artifact 100% culling, or partly Z-precision?** The first verification step is the camera-rotation test described above (rotate without moving — flicker = Z-fight, stable = culling). Until this is confirmed, the diagnosis remains "high confidence" but not certain.
2. **Does the #78 + cellar-stairs fix also close #95?** The two are in the same family but #95's specific failure (over-inclusion of indoor cells via portal recursion) might need a separate cap-traversal-depth fix. The next session should map the shared infrastructure before committing to a combined-or-split plan.
3. **What's the right Phase identifier?** M1.5 doesn't have a "visibility" sub-phase yet. A6 is physics; A7 is lighting. Visibility might warrant its own A-letter (A8?) or be slotted under whichever existing structure makes sense. Discuss with user at the start of the next session before naming the work.
4. **Should the cellar-stairs case be documented in #78** as additional evidence, or filed as a separate issue tied to #78? Per user direction (2026-05-25 PM session-end): don't file a new issue; treat as evidence for #78. The next session's investigation should formalize this — possibly by editing #78 to broaden its description to "outdoor geometry (stabs + terrain) visible inside EnvCells."
---
## Pickup prompt for the next session
```
Indoor-cell visibility culling — port retail's mechanism to close
issue #78 (outdoor stabs visible through inn floor) and the new
cellar-stairs visual artifact discovered while visual-verifying
the #100 fix on 2026-05-25.
Read first (in this order):
1. docs/research/2026-05-25-issue-100-shipped-and-culling-handoff.md
(this doc — full session handoff with the family map, root-cause
hypothesis, retail anchors, WB references, do-not-retry list)
2. docs/ISSUES.md #78 (the filed issue; same root cause as the
cellar-stairs finding)
3. docs/ISSUES.md #95 (sibling visibility issue; verify whether
it closes as a side effect)
4. CLAUDE.md — search "currently working toward" to refresh state
State both altitudes:
Currently working toward: M1.5 — Indoor world feels right
Current phase: TBD (visibility culling; new sub-phase to name
with the user at session start — possibly A8 if A6=physics,
A7=lighting follow this naming, OR fits under an existing
A6 sub-phase)
## Session flow (three phases, in order)
### Phase 1 — Investigate (use the /investigate skill)
Independently verify the hypothesis and locate the retail mechanism.
Specifically:
a. Run the camera-rotation falsification test on the cellar-stairs
artifact. Stand in a Holtburg cottage cellar at a position where
the grass overlay is visible, rotate the camera slowly without
moving. If the patch stays geometrically stable (polygon edges
move predictably), confirms culling. If it flickers / shimmers,
pivot the diagnosis to Z-precision.
b. Grep named-retail for the visibility chain. Anchors to start
from:
acclient_2013_pseudo_c.txt:311397 — CEnvCell::find_visible_child_cell
acclient_2013_pseudo_c.txt:280028 — call site
Find: RenderDeviceD3D::DrawBlock (around line 430027 per the
#100 predecessor handoff), the visibility computation that
precedes it, and how it gates outdoor-cell rendering when the
camera is in an indoor cell.
c. Read WorldBuilder's visibility implementation:
references/WorldBuilder/Chorizite.OpenGLSDLBackend/Lib/VisibilityManager.cs
references/WorldBuilder/Chorizite.OpenGLSDLBackend/GameScene.cs
Specifically the RenderInsideOut stencil pipeline that #78
flags as "acdream never invokes." Decide whether to adopt
wholesale, port to our namespaces, or write fresh from
retail.
d. Read acdream's existing visibility code:
src/AcDream.App/Rendering/CellVisibility.cs
src/AcDream.App/Rendering/Wb/WbDrawDispatcher.cs
src/AcDream.App/Rendering/TerrainModernRenderer.cs
Understand the current per-entity gate (filters by
entity.ParentCellId ∈ visibleCellIds, but outdoor stabs
have null ParentCellId so they always pass — that's the bug).
e. Determine whether #95's symptom (visibleCells exploding to
135-145 at network hubs) closes as a side effect or needs
its own work. Read scen5 acdream.log if it's still in the
research tree.
Output of Phase 1: a short report — either "culling is confirmed
and here's the retail anchor / WB code / acdream extension point"
or "diagnosis pivot needed, here's the new shape." Plus a
fix-shape sketch. Get user approval before Phase 2.
### Phase 2 — Plan (use the superpowers:writing-plans skill)
Draft the implementation plan. The shape depends on Phase 1
findings, but likely 4-6 tasks:
- Task 1: Build the diagnostic probe (ACDREAM_PROBE_VIS=1 logging
camera cell + VisibleCellIds + which entities/cells get
rendered) — apparatus first, per CLAUDE.md's "apparatus for
physics bugs" memory note generalized to rendering.
- Task 2: Extend the WbDrawDispatcher per-entity gate to skip
outdoor entities (ParentCellId == null) when the camera's
current cell is indoor AND no outdoor cell is in VisibleCellIds.
- Task 3: Extend the TerrainModernRenderer to skip outdoor
landblocks under the same condition (or to skip individual
cells if the granularity matters — let the retail decomp
decide).
- Task 4: (Possibly) Port the portal-LOS chain that decides
which outdoor cells ARE visible from inside an indoor cell
via doors/windows — so transitions through doorways don't
abruptly cull and re-add geometry. Read retail's clip-plane
portal test for this.
- Task 5: (Possibly) Address #95's traversal-depth cap if
Phase 1 confirms it's not closed by the #78 fix.
- Task 6: Visual verification — at Holtburg cottages (cellar
stairs no longer show terrain), Holtburg Inn (outdoor stabs
no longer visible through walls), and a portal-entry dungeon
(visibleCells stays in a sane range if #95 is in scope).
### Phase 3 — Implement (use superpowers:subagent-driven-development)
Same pattern as the #100 session: fresh subagent per task,
two-stage review per task (spec + code quality), final review
across all commits, visual verification by user as the
acceptance test.
## Constraints
Per CLAUDE.md "no workarounds" rule — fix the root cause, do not
patch symptom sites. Visibility culling is a real system, not a
one-line gate.
Read the do-not-retry list in this handoff doc (7 items) before
starting Phase 2.
Visual verification is the acceptance test. The fix must close
the cellar-stairs artifact AND #78's "outdoor stabs through floor"
AND not regress #100's transparent-rectangle resolution. Be
honest about partial results.
## Reference repo hierarchy reminder
Per CLAUDE.md "Reference repos: cross-check the relevant ones" —
for visibility/culling work, the relevant references are:
- Retail decomp (docs/research/named-retail/) — primary oracle
- WorldBuilder VisibilityManager + GameScene — implementation reference
- ACE has minimal coverage here (it's server-side; client visibility
is not its concern)
- holtburger is TUI, no rendering visibility
- AC2D has fixed-function rendering — limited modern relevance
Cross-reference retail + WB. If they diverge, retail wins.
## What success looks like
After this work lands:
- Standing in a Holtburg cottage cellar at the exact spot of the
2026-05-25 screenshot artifact, no grass overlay on stairs from
ANY camera angle.
- Standing inside Holtburg Inn, no outdoor stabs visible through
floor or walls.
- Entering a dungeon via the Town Network portal, visibleCells
per cell stays in the ~4-15 range (if #95 in scope).
- No regression on issue #100 (no transparent rectangles around
houses).
- dotnet build green; dotnet test failures within the documented
14-23 flaky window.
```
---
## CLAUDE.md update (post-handoff)
Pending. The CLAUDE.md ship paragraph for #100 was deferred to "after visual verification confirms" — visual verification PARTIALLY confirmed (primary acceptance met, secondary artifact in same family as existing #78). The next session can either:
- Add a brief CLAUDE.md ship entry now mentioning #100 closed + cellar-stairs finding linked to #78
- Skip until #78 / cellar-stairs lands, then add a combined paragraph
Recommendation: add it now (issue #100 is genuinely closed by its own criteria). The cellar-stairs work is a NEW investigation, not a continuation of #100.
---
## Files state at session end
```
Branch: claude/strange-albattani-3fc83c
HEAD: 84e3b72 docs: #100 — stabilize Task 2 SHA reference in ISSUES.md
Parent: a64e6f2 refactor: #100 — remove hiddenTerrainCells / BuildingTerrainCells plumbing
Grandparent: f48c74a fix(render): #100 — render terrain 1 cm below physical Z (retail zFightTerrainAdjust)
Before #100: 2fc312e docs: #101 — fix fabricated content in Recently closed entry
Working tree: clean
Untracked: pre-flight-test-baseline.log, issue100-verify-launch.log (logs, can be deleted/gitignored)
```
Both log files are session-scoped; the next session can either delete them or ignore them. They aren't committed.

View file

@ -0,0 +1,406 @@
# Issue #100 — Transparent ground around buildings — investigation handoff
**Date:** 2026-05-25 PM (end of A6.P8 session)
**Status:** Initial research done; **next session is fix-design + implement**. The smoking gun is retail's per-draw `zFightTerrainAdjust = 0.01`. The current acdream code uses a wrong mechanism (cell-level terrain collapse) that creates the transparent rectangles around every Holtburg house.
**Predecessor issue entry:** [`docs/ISSUES.md` #100](../ISSUES.md) (filed 2026-05-24).
---
## TL;DR
The transparent rectangles around every Holtburg house are caused by acdream's
`hiddenTerrainCells` mechanism — a misfire on the Z-fighting problem. The
mechanism collapses entire 24m × 24m outdoor terrain cells to a zero-area
degenerate when any building's `Frame.Origin` lies in them, but cottages are
only ~12m × 12m, so ~75% of each "hidden" cell is bare framebuffer-clear
showing through.
**Retail's mechanism is different and almost trivially small:** retail
**always renders the full terrain mesh, then nudges every terrain vertex Z
down by `0.00999999978 m` (= ~0.01 m) at draw time.** That makes terrain
always lose the depth test against a coplanar building floor — Z-fight
solved, no cells hidden, no cutout polygon needed. Verbatim from the
2013 EoR retail decomp:
| Source | What |
|---|---|
| `docs/research/named-retail/acclient_2013_pseudo_c.txt:1120769` | `float zFightTerrainAdjust = 0.00999999978;` |
| `docs/research/named-retail/acclient_2013_pseudo_c.txt:430113` | `DrawLandCell(esi_3)` — per-cell terrain draw |
| `docs/research/named-retail/acclient_2013_pseudo_c.txt:430124` | `DrawSortCell(esi_3)` — per-cell building draw, **same iteration** |
| `docs/research/named-retail/acclient_2013_pseudo_c.txt:427867` | `ACRender::landPolysDraw(arg2->polygons, 2)` — the `arg2=2` path |
| `docs/research/named-retail/acclient_2013_pseudo_c.txt:006b6402` | `edi_4[1] = (float)((long double)esi_1[2] - (long double)zFightTerrainAdjust);` — the terrain-Z nudge |
**WorldBuilder also renders full terrain** — it does **not** hide cells.
WB has a known Z-fighting issue in the editor view that nobody noticed
because it's editor-only.
[`references/WorldBuilder/Chorizite.OpenGLSDLBackend/Lib/TerrainGeometryGenerator.cs:123-141`](../../references/WorldBuilder/Chorizite.OpenGLSDLBackend/Lib/TerrainGeometryGenerator.cs) iterates all 64 cells unconditionally.
**The fix is path 2 from the issue #100 entry**, refined: drop
`hiddenTerrainCells` entirely + apply `gl_Position.z -= 0.01` (or
equivalent world-Z nudge) in `src/AcDream.App/Rendering/Shaders/terrain_modern.vert`
at line 139. Estimated change: ~15 LOC across 1-2 commits, including
removal of the dead `BuildingTerrainCells` / `hiddenTerrainCells`
plumbing.
---
## Symptom (concrete evidence)
User screenshot 2026-05-25: standing next to a Holtburg cottage. The ground
in a rectangular footprint around the building appears as a flat dark
pink/light patch (the framebuffer clear color) instead of cobblestone /
grass terrain. Visible as a sharp-edged rectangle the size of the
**outdoor terrain cell** (24 × 24 m), not the size of the **cottage's
building footprint** (~12 × 12 m). Same shape on every house observed.
User wording from 2026-05-24 report: "around every house now I missing
the ground texture, it is transparent. I can see through the ground."
---
## Root cause (now confirmed via decomp cross-reference)
### The acdream code that produces the bug
Commit `35b37df` (2026-05-23, A6.P3 #98 triage) kept the
`hiddenTerrainCells` mechanism. The path:
1. **`LandblockLoader.BuildBuildingTerrainCells(LandBlockInfo info)`**
([`src/AcDream.Core/World/LandblockLoader.cs:39-50`](../../src/AcDream.Core/World/LandblockLoader.cs:39))
reads `info.Buildings`, computes
`int cx = clamp(building.Frame.Origin.X / 24f, 0, 7)`,
`int cy = clamp(building.Frame.Origin.Y / 24f, 0, 7)`, and emits
`cy * 8 + cx` per building. Granularity: **one 24m cell per building**.
2. **`LandblockMesh.Build`**
([`src/AcDream.Core/Terrain/LandblockMesh.cs:175-185`](../../src/AcDream.Core/Terrain/LandblockMesh.cs:175))
replaces every index in those cells with the cell's first-vertex index,
producing degenerate (zero-area) triangles that the GPU rasterizer skips.
3. Result: a **24m × 24m hole** in the terrain mesh per building, regardless
of the building's actual size.
A cottage at, say, world `(110, 26)` has `Frame.Origin` at landblock-local
`(110, 26)``cx = 4`, `cy = 1` → outdoor cell index `12`. The hidden
area is `(cx*24, cy*24)` to `((cx+1)*24, (cy+1)*24)` = `(96, 24)` to
`(120, 48)` — a 24×24m square. The cottage footprint is closer to
~12×12m centred near `(110, 26)`. ~75% of the hidden area has no
building geometry to cover it → framebuffer-clear visible.
### What the existing comments said the intent was
[`src/AcDream.Core/Terrain/LandblockMesh.cs:171-174`](../../src/AcDream.Core/Terrain/LandblockMesh.cs:171):
> Indices are trivial 0..383 since we don't deduplicate verts. When a
> building owns an outdoor terrain cell, **keep the fixed 384-index
> contract but collapse its two triangles so the building/stair mesh can
> visually own the hole.**
[`src/AcDream.Core/World/LandblockLoader.cs:33-37`](../../src/AcDream.Core/World/LandblockLoader.cs:33):
> Map LandBlockInfo.Buildings to 8x8 terrain mesh cells (cy * 8 + cx).
> **Retail attaches each CBuildingObj to its outside landcell during
> CLandBlock::init_buildings;** keep this signal separate from stabs so
> ordinary static props do not punch holes in terrain.
The first comment shows the intent: avoid Z-fighting between the building
floor and the terrain below. The second is correct but irrelevant — retail
attaches buildings to a cell for render-order (the `DrawSortCell` step),
NOT to hide that cell's terrain. Our author misread the retail intent.
---
## Retail mechanism (verbatim)
Per the research-agent dispatch this session, the full retail render
sequence is at `RenderDeviceD3D::DrawBlock`
([`acclient_2013_pseudo_c.txt:430027`](../research/named-retail/acclient_2013_pseudo_c.txt)
onwards):
```
for each CLandCell in draw_array (all 64 cells): // line 430113
DrawLandCell(esi_3) // → ACRender::landPolysDraw(polygons, 2)
DrawSortCell(esi_3) // → DrawBuilding(...) for any CBuildingObj attached
// to this cell + the cell's object list
```
`landPolysDraw(polygons, 2)` selects the path that subtracts
`zFightTerrainAdjust` from every terrain vertex Z at upload time. The
constant:
```c
float zFightTerrainAdjust = 0.00999999978; // acclient_2013_pseudo_c.txt:1120769
```
And the application
([`acclient_2013_pseudo_c.txt:006b6402`](../research/named-retail/acclient_2013_pseudo_c.txt)):
```c
edi_4[1] = ((float)(((long double)esi_1[2]) - ((long double)zFightTerrainAdjust)));
```
Where `edi_4[1]` is the output vertex Z and `esi_1[2]` is the source
vertex Z. So every terrain vertex's `Z` becomes `Z - 0.01` at draw time.
**Result:** terrain is uniformly 1 cm lower than its physical height (the
physics path uses the un-nudged Z; only the render path nudges). Building
floors at the physically-correct height always win the depth test
because they're 1 cm higher than the rendered terrain. No cells are
hidden. No cutout is computed. The world reads as one continuous surface.
### Retail's `CLandBlock::init_buildings`
[`acclient_2013_pseudo_c.txt:313854`](../research/named-retail/acclient_2013_pseudo_c.txt)
iterates `lbi->buildings`, calls
`CBuildingObj::makeBuilding(building_id, ...)`, then
`CBuildingObj::add_to_cell(eax_4, landcell)` — attaches the building to
whichever `CLandCell` it physically belongs to. **This is for render
ordering (sort) and physics scoping, not for terrain cutout.** No terrain
modification happens here.
### `BuildInfo` data fields (acclient.h:32035)
```c
struct __cppobj BuildInfo {
IDClass<_tagDataID,32,0> building_id; // Setup DID (0x02xxxxxx)
Frame building_frame; // position + rotation
unsigned int num_leaves; // portal leaf count
unsigned int num_portals;
CBldPortal **portals;
};
```
**There is no explicit footprint polygon, AABB, or terrain-cell list.**
The only geometric anchor is `building_frame.Origin`. Building footprint
must be derived from the Setup's `parts[0]` GfxObj geometry if you needed
it — retail never does, because the depth-nudge mechanism makes it
unnecessary.
---
## Recommended fix shape
### Path 2 (refined) — retail-faithful terrain Z-nudge
**Site:** [`src/AcDream.App/Rendering/Shaders/terrain_modern.vert`](../../src/AcDream.App/Rendering/Shaders/terrain_modern.vert) line 139.
**Change:** replace
```glsl
gl_Position = uProjection * uView * vec4(aPos, 1.0);
```
with
```glsl
// Retail zFightTerrainAdjust (acclient_2013_pseudo_c.txt:1120769, value
// 0.00999999978). Lower terrain by 1 cm so coplanar building floors
// (at the un-nudged physically-correct Z) always win the depth test.
// Cross-ref: docs/research/2026-05-25-issue-100-terrain-cutout-handoff.md.
vec3 terrainPos = vec3(aPos.xy, aPos.z - 0.01);
gl_Position = uProjection * uView * vec4(terrainPos, 1.0);
```
**Cleanup (same commit or follow-up):**
1. Delete `hiddenTerrainCells` parameter and the collapse block at
`LandblockMesh.cs:175-185`.
2. Delete `LoadedLandblock.BuildingTerrainCells` field at
`src/AcDream.Core/World/LoadedLandblock.cs`.
3. Delete `BuildBuildingTerrainCells` at
`LandblockLoader.cs:33-50`.
4. Delete the threading through `GameWindow.cs:1808, 5366, 8761` and
`src/AcDream.App/Streaming/{GpuWorldState,LandblockStreamer}.cs`.
5. Delete `tests/AcDream.Core.Tests/Terrain/LandblockMeshTests.cs`'s
hiddenTerrainCells test cases. Delete or rewrite
`tests/AcDream.Core.Tests/World/LandblockLoaderTests.cs`'s
`BuildBuildingTerrainCells_*` cases.
**Test plan:**
- Add a tiny shader-vertex unit test if there's a precedent (look in
`tests/AcDream.App.Tests/Rendering/` for any shader-correctness tests).
- Visual verification at Holtburg: terrain renders continuously under
cottages, no transparent rectangles. Z-fighting between building floor
and terrain not visible.
- Run the full focused test suite (now 23 tests, will likely shrink by 2-4
when the dead `BuildBuildingTerrainCells` / `LandblockMesh.hiddenTerrainCells`
tests are removed) and confirm green.
**Why this is right:**
- Matches retail mechanism verbatim (1 cm Z nudge on terrain at draw time).
- Removes ~50 LOC of dead plumbing (`BuildingTerrainCells` threading
through 5 files).
- Avoids the per-building-footprint computation that the current code
cannot do correctly without loading the Setup mesh.
### Why NOT path 1 (polygon-level cutout)
- Retail doesn't do this — there is no precedent in the named decomp.
- Building footprint isn't in `BuildInfo` — would require loading the
Setup AND computing a 2D XY footprint polygon from `parts[0]`'s
geometry. Engineering-heavy.
- Even if computed, mesh modifications break the fixed 384-index contract
in `LandblockMesh.Build`.
### Why NOT path 3 (building yard mesh)
- Retail doesn't have this. `BuildInfo` carries no yard polygon.
- Cottage Setups don't appear to include a yard mesh in their geometry
(would need confirmation by dumping a cottage Setup, but the retail
mechanism makes this question moot).
---
## Do-not-retry list
1. **Don't try to compute the building's tight footprint** from
`LandBlockInfo.Buildings`. The struct doesn't carry one. Retail doesn't
either. Any computation would require loading the Setup mesh and
building an XY hull from `parts[0]` — pure engineering with no retail
anchor.
2. **Don't shift the 0.02 m EnvCell render lift** at
`GameWindow.cs:5400` (or equivalent). That lift is for indoor-cell
floor rendering and is correct as-is. The terrain Z nudge is the
reverse direction (lower terrain) and is independent.
3. **Don't disable depth testing** on terrain or building draws. Retail
uses standard depth test (`GL_LESS` equivalent); the Z nudge alone is
the disambiguator.
4. **Don't apply `glPolygonOffset`** to terrain. Retail uses a vertex Z
nudge, not GPU-side polygon offset. Polygon offset has hardware-specific
slope-dependent behavior; the constant 1 cm world-Z is uniform and
well-defined.
5. **Don't keep `hiddenTerrainCells` and add the Z nudge as a "belt and
suspenders"** safety. The hidden-cells path is wrong and should be
deleted in the same commit. Two mechanisms for the same problem is
future technical debt.
6. **Don't touch the physics path.** The Z nudge is render-only. Physics
already uses the un-nudged terrain Z. This is the same render-vs-physics
split that `35b37df` correctly introduced for the `0.02m` EnvCell render
lift (kept item in that commit's "Kept" list).
---
## Files involved (for the next session)
| File | What's there | Action |
|---|---|---|
| `src/AcDream.Core/Terrain/LandblockMesh.cs:175-185` | `hiddenTerrainCells` collapse block | Delete |
| `src/AcDream.Core/Terrain/LandblockMesh.cs:Build` signature | `IReadOnlySet<int>? hiddenTerrainCells` param | Delete param |
| `src/AcDream.Core/World/LoadedLandblock.cs` | `BuildingTerrainCells` field | Delete |
| `src/AcDream.Core/World/LandblockLoader.cs:33-50` | `BuildBuildingTerrainCells` method | Delete |
| `src/AcDream.Core/World/LandblockLoader.cs:Load` | `buildingTerrainCells` local + threading into `LoadedLandblock` ctor | Delete locals + simplify ctor call |
| `src/AcDream.App/Rendering/GameWindow.cs` ~lines 1808, 5366, 8761 | `LandblockMesh.Build(..., lb.BuildingTerrainCells)` call sites | Drop the `hiddenTerrainCells` argument |
| `src/AcDream.App/Streaming/GpuWorldState.cs` | `BuildingTerrainCells` threading | Drop |
| `src/AcDream.App/Streaming/LandblockStreamer.cs` | `BuildingTerrainCells` threading | Drop |
| `src/AcDream.App/Rendering/Shaders/terrain_modern.vert:139` | `gl_Position = ...` | Insert `aPos.z - 0.01` nudge above |
| `tests/AcDream.Core.Tests/Terrain/LandblockMeshTests.cs` | `hiddenTerrainCells` test cases | Delete |
| `tests/AcDream.Core.Tests/World/LandblockLoaderTests.cs` | `BuildBuildingTerrainCells_*` cases | Delete |
---
## Open questions
1. **Old terrain shader removed?** There's a `terrain_modern.vert` and the
build-output mirrors. Confirm there's no older `terrain.vert` that
also needs the nudge applied (the comment at line 4-5 says "Math
identical to terrain.vert"; check whether the legacy shader is still
compiled into the binary or has been fully retired post-N.5b).
2. **Sky / water shaders** — confirm the Z-nudge doesn't accidentally
affect anything else. Should be limited to the terrain shader only.
3. **Building floor render order** — retail also relies on the
`DrawSortCell` per-cell building draw happening after `DrawLandCell`.
Does acdream's current draw order put buildings after terrain? If yes,
nothing else needed. If the order is reversed, the depth-nudge still
works because depth-test is positional, not order-dependent. Just
verify for completeness.
4. **Does WB have a different shader Z nudge we should crib?** The
research agent says no — WB renders full terrain without nudge and
has Z-fighting in the editor view. So we should NOT crib from WB
here; this is one of the cases where WB and retail diverge and
retail wins.
---
## Pickup prompt for next session
```
Issue #100 — Transparent ground around buildings.
Initial research is done by the prior session (the smoking gun is
retail's zFightTerrainAdjust = 0.01). This session: VALIDATE the
research first, then plan, then implement.
Read first (in this order):
1. docs/research/2026-05-25-issue-100-terrain-cutout-handoff.md
(the handoff doc — symptom, retail mechanism, proposed fix
shape, do-not-retry list, files involved)
2. docs/ISSUES.md #100
3. CLAUDE.md — search "currently working toward" to refresh state
State both altitudes:
Currently working toward: M1.5 — Indoor world feels right
Current phase: A6 follow-up — fix issue #100 visual regression
## Session flow (three phases, in order)
### Phase 1 — Investigate (use the /investigate skill)
Independently verify the handoff's claims before committing to the
fix shape. Specifically:
a. Confirm zFightTerrainAdjust = 0.00999999978 at
docs/research/named-retail/acclient_2013_pseudo_c.txt:1120769
and the nudge-application at line 006b6402. The handoff cites
these — read them yourself and cross-check the surrounding
context.
b. Confirm WorldBuilder renders all 64 cells unconditionally at
references/WorldBuilder/Chorizite.OpenGLSDLBackend/Lib/
TerrainGeometryGenerator.cs (handoff says lines 123-141).
c. Read src/AcDream.App/Rendering/Shaders/terrain_modern.vert in
full and confirm line 139 is the right injection point. Check
for any older terrain shader still compiled into the binary
(the handoff flags this as an open question).
d. Check that physics uses the un-nudged Z. Render-vs-physics
split must hold; we cannot let the Z nudge leak into collision.
e. Confirm there's no precedent for glPolygonOffset on terrain
in our codebase (handoff says no, but verify).
Output of this phase: a short report in chat — either "research
confirmed, fix shape stands" or "found X divergence, here's the
revised fix shape." If the research holds, proceed to Phase 2.
### Phase 2 — Plan (use the superpowers:writing-plans skill)
Draft the implementation plan. Expect 3-4 tasks:
Task 1: terrain_modern.vert Z nudge (the one substantive change).
Task 2: delete hiddenTerrainCells / BuildingTerrainCells plumbing
(LandblockMesh.cs, LoadedLandblock.cs, LandblockLoader.cs,
GameWindow.cs call sites, GpuWorldState.cs,
LandblockStreamer.cs). Pure removal — no behavioral
change beyond what Task 1 introduces.
Task 3: delete corresponding tests in LandblockMeshTests +
LandblockLoaderTests that exercise the dead plumbing.
Task 4: visual verification — terrain renders continuously at
Holtburg cottages, no transparent rectangles, no obvious
Z-fighting at building floors.
The handoff doc has a file-by-file action table to seed the plan.
### Phase 3 — Implement (use superpowers:subagent-driven-development)
Execute the plan with fresh subagents per task, two-stage review
between (spec + code quality), final review across all commits.
Pre-flight verification: full focused test suite green. Build clean.
## Constraints
Do-not-retry list in the handoff doc (6 items). Read it before
starting Phase 2.
Visual verification is the acceptance test — the M1.5 milestone is
at stake and any new visual regression in this area would be
obvious. Be honest about what visual verification shows; don't
declare success on partial regressions.
```

View file

@ -0,0 +1,206 @@
# Issue #78 + cellar-stairs visibility culling — investigation report
**Date:** 2026-05-25 PM (continuation session)
**Status:** REPORT-ONLY. Awaiting user (a) camera-rotation falsification test and (b) approach selection before any code work.
**Predecessor handoff:** [`docs/research/2026-05-25-issue-100-shipped-and-culling-handoff.md`](2026-05-25-issue-100-shipped-and-culling-handoff.md)
---
## Symptom
Two visible defects share one root cause:
1. **Cellar-stairs (observed 2026-05-25 PM, evidence for #78):** standing in a Holtburg cottage cellar with the camera at certain angles, the outdoor terrain mesh renders as a sharp-edged grass rectangle covering the cellar stair geometry. **Clears when camera moves closer** (cottage walls + stair treads geometrically occlude). Gameplay unaffected — player can walk up/down normally.
2. **Inn-wall stabs (#78, filed 2026-05-19):** standing inside the Holtburg Inn looking at the floor or walls, the user sees other buildings in the distance at their correct world position + scale, visible THROUGH the floor and walls.
The user has NOT yet run the camera-rotation falsification test (Phase 1a of the handoff). Until they do, the diagnosis below is "high confidence" but not certain.
Sibling: **#95** (dungeon portal-graph blowup) is the same visibility subsystem but a different specific failure (over-inclusion). scen5 log shows `visibleCells` per cell reaching **295** (worse than the 135-145 filed).
---
## Hypotheses (ranked)
### H1 — Indoor-camera gate missing on outdoor render passes (HIGH confidence)
**Mechanism:** `TerrainModernRenderer.Draw` and `WbDrawDispatcher` render outdoor geometry unconditionally regardless of whether the camera is inside an EnvCell. Retail and WorldBuilder both gate the outdoor passes by the indoor portal-walk result. acdream does neither.
**Evidence FOR (strong):**
- Retail anchor verified: `PView::DrawCells` at `acclient_2013_pseudo_c.txt:432709` gates `LScape::draw` (outdoor terrain dispatch) by `if (outside_view.view_count > 0)`. `outside_view.view_count` is only incremented during the indoor portal BFS (`PView::ConstructView`) when a portal targets `other_cell_id == 0xFFFFFFFF` (outdoor sentinel). When no portal sees outside, the entire outdoor pass is skipped.
- Retail's per-mesh draw (`RenderDeviceD3D::DrawMesh` line 429245) iterates `Render::PortalList->view_count` and skips meshes that straddle 0 sub-views. **No stencil** — retail uses screen-space polygon clipping via `PView::GetClip`.
- WB anchor verified: `VisibilityManager.RenderInsideOut` (lines 73-239) uses **stencil**: mark current-building portals stencil=1, punch portal regions to far depth, draw EnvCells unconditionally, then `terrain/scenery/statics` gated by `glStencilFunc(Equal, 1, 0x01)`. The top-level loop already skips the unconditional terrain draw via `if (!isInside) terrainManager.Render(...)` at GameScene.cs:965.
- acdream audit verified the gate is missing: `WbDrawDispatcher.cs:360-362` gates by `entity.ParentCellId.HasValue && !visibleCellIds.Contains(...)`. When `ParentCellId == null` (outdoor stabs, scenery, live-spawned entities), the boolean short-circuits to `cellInVis = true` — the entity passes regardless of `visibleCellIds`.
- `TerrainModernRenderer.Draw` (lines 191-208) only does per-slot frustum cull. No `visibleCellIds` parameter, no indoor-camera awareness.
- Patch geometry size (~24 m × 24 m rectangle) matches a terrain cell footprint — that's a polygon, not a precision artifact.
- "Clears when closer" matches geometric occlusion: cottage walls + stair treads come to occlude the offending terrain cells screen-space as the camera approaches. A 1 cm depth-buffer Z-fight (#100's nudge) at 2-5 m camera distance with 24-bit depth has sub-millimeter resolving power; precision is not the bottleneck.
**Evidence AGAINST:**
- User has not yet run the camera-rotation test. If the patch flickers/shimmers when rotating the camera in place, the diagnosis pivots to Z-precision.
**How to falsify:** Stand at the spot showing the cellar-stairs artifact, look at the grass patch, rotate the camera slowly without moving the character. Polygon-stable edges that track predictably with the view = culling (H1). Flickering / shimmering = Z-precision (H2).
### H2 — Residual Z-fight from #100's nudge (LOW confidence)
The 1 cm shader nudge from issue #100 might be insufficient at certain Z values or with shader precision quirks.
**Evidence FOR:** Same code area was just touched.
**Evidence AGAINST:** Predecessor research already established 1 cm @ 24-bit depth has sub-mm resolving at gameplay camera distances. Patch is rectangular polygon, not thin Z-fight strip. "Clears when closer" reverses precision direction.
**How to falsify:** Same camera-rotation test.
### H3 — #95 portal-traversal blowup is independent of H1 (HIGH confidence it IS independent)
**Mechanism:** `CellVisibility.GetVisibleCells` BFS over portals lacks termination/cap-depth logic. Network hubs expose 100+ outbound portals to disconnected dungeons, all marked visible. scen5 log shows up to 295 cells in one visible set.
**Evidence FOR independence:**
- H1 is an **asymmetric over-render** (outdoor passes ignore indoor state).
- H3 is a **symmetric over-inclusion** (BFS doesn't terminate properly).
- A fix to H1 would gate WHEN to render outdoor; H3's fix is to bound WHICH indoor cells the BFS includes.
- Different code paths: H1 lives in `TerrainModernRenderer.Draw` + `WbDrawDispatcher`; H3 lives in `CellVisibility.GetVisibleCells`.
**Conclusion:** H1 and H3 should be **separate fixes**. Closing H1 will close cellar-stairs + the outdoor-stab side of #78 but NOT close #95. The next phase should plan H1 in scope and decide whether H3 fits in the same milestone (M1.5).
---
## What we've ruled out
- **It's not the #100 cell-collapse bug returning.** `hiddenTerrainCells` plumbing was fully removed in `a64e6f2`; terrain mesh now correctly renders everywhere on the landblock per retail. The new artifact's mechanism is "outdoor geometry visible at all when indoor," not "incorrect terrain mesh shape."
- **It's not a depth-precision issue (high confidence, pending falsification).** Patch shape + "clears closer" both contradict Z-fight.
- **It's not a `ParentCellId` propagation bug.** Audit confirmed that interior cell static objects (`GameWindow.BuildInteriorEntitiesForStreaming:5476`) and cell-mesh entities (line 5416) both receive non-null `ParentCellId = envCellId`. The dispatcher's existing filter already correctly culls them when the camera is in a different building. The bug is the OPPOSITE direction (outdoor entities w/ `ParentCellId == null` always pass).
- **It's not WB extraction divergence.** Phase O extracted ~33 WB files into `src/AcDream.App/Rendering/Wb/` but the `VisibilityManager` / `RenderInsideOut` pipeline was NOT extracted — that code never existed in our tree.
- **It's not a missing camera-cell signal at the render layer.** `cameraInsideCell`, `visibility.VisibleCellIds`, and `visibility.HasExitPortalVisible` are all already computed in `GameWindow.cs:6970-6984` and live in scope at the two `Draw` call sites (lines 7074 + 7110). No new plumbing required.
---
## Approach options for the fix
Three viable approaches, with tradeoffs:
### Approach A — WB-style stencil (recommended for first ship)
Port `VisibilityManager.RenderInsideOut`'s stencil pipeline to acdream. Two-pass render: (1) mark current-building portal silhouettes in stencil, (2) gate outdoor passes by `glStencilFunc(Equal, 1, 0x01)`.
**Pros:**
- Closest to acdream's existing modern GL pipeline (we already use stencil for nothing else; adding one stencil bit is cheap).
- WB is acdream's documented rendering base (per CLAUDE.md). Cross-reference checked against retail confirms WB's intent matches retail's, just via a different mechanism.
- Handles the "see outside through open door" case correctly — terrain renders through portal silhouettes only.
- Reusable for both outdoor terrain AND outdoor entities (single stencil gate applies to all subsequent draws).
**Cons:**
- Multi-pass render adds GPU cost (small — one stencil pass per current-building's portals).
- Requires a portal-mesh upload pipeline (WB has one in `PortalRenderManager.cs:488-628`; we'd port it).
- More LOC than Approach C.
**Estimated scope:** 4-6 tasks, 1-2 weeks of implementation + verification.
### Approach B — Retail-faithful polygon-clip sub-views
Port `PView::ConstructView` + `PView::GetClip` + `Render::PortalList` from retail. Per-mesh viewport set to clipped portal polygon.
**Pros:**
- 100% retail-faithful.
**Cons:**
- Requires per-draw viewport scissor changes — current rendering uses bindless + MDI with one viewport per pass. Wedging per-mesh viewport in would break the modern pipeline's batching.
- Multi-week port. Out of scope for one session.
**Estimated scope:** 8-12 tasks, 4-6 weeks. Defer to a future milestone if needed.
### Approach C — Ship-now binary gate
When `cameraInsideCell && !visibility.HasExitPortalVisible`, skip outdoor terrain pass entirely and gate `WbDrawDispatcher` to exclude `ParentCellId == null` entities.
**Pros:**
- Smallest change. ~2-3 tasks. Closes the cellar-stairs symptom and the sealed-interior side of #78 immediately.
- All required state already computed (`HasExitPortalVisible` from `CellVisibility.GetVisibleCells` line 404).
**Cons:**
- Under-renders when player can see outside through an open door/window (renders nothing instead of clipping correctly). This is regressive vs. today for the doorway-view case.
- Per CLAUDE.md "no workarounds": this *is* a symptom-gate rather than a root-cause fix. **Would need explicit user approval.** Approach A is the correct shape; Approach C is a temporary patch.
**Estimated scope:** 2-3 tasks, 1-2 days.
---
## Recommended next step
1. **User runs the camera-rotation falsification test (~60 seconds).** Spawn at Holtburg, walk into a cottage cellar, find the angle showing the grass patch, rotate the camera in place without moving. Report what happens.
- Polygon-stable → confirms H1, proceed.
- Flickering → pivots to H2, this report needs major revision.
2. **If H1 confirmed: user picks Approach A vs C.** Recommendation: **Approach A (WB-style stencil)**. Per CLAUDE.md's "no workarounds" rule, the right thing is to port the stencil pipeline, not gate at the symptom site. Approach C is offered only if the user wants to close cellar-stairs immediately and defer doorway-view correctness as known-incomplete; that's an explicit workaround that needs user sign-off.
3. **#95 should NOT be in scope for this work.** Different mechanism, different code path. File continues as separate work in M1.5.
4. **Phase identifier:** the handoff proposes A8 (visibility) alongside A6 (physics) and A7 (lighting). I'll defer naming to the user.
5. **CLAUDE.md update for #100 ship:** the handoff calls this out as pending. Recommendation: add a brief #100 ship entry mentioning the cellar-stairs finding linked to #78. Out of scope for investigate mode; will happen at the start of the implementation session.
---
## What this is NOT
This is NOT a #100 regression. The terrain Z-nudge ship works correctly; the new artifact has a different root cause (indoor-camera gate on outdoor passes was already missing pre-#100#100 just made it more visible by removing the terrain-cell hide mechanism that incidentally masked it inside building footprints).
This is NOT a depth-precision fix. The 1cm nudge is correctly sized; larger nudges would break coplanar-floor disambiguation elsewhere.
This is NOT a `ParentCellId` data fix. Interior entities are correctly tagged.
This is NOT covered by Phase O's WB extraction. The visibility-management code was deliberately NOT extracted.
---
## Reference appendix
### Retail anchors (acclient_2013_pseudo_c.txt)
| Line | Symbol | Role |
|---|---|---|
| 92635 | `SmartBox::RenderNormalMode` | Per-frame top-level dispatcher (indoor vs outdoor branch) |
| 267912 | `LScape::draw` | Outdoor terrain dispatch |
| 311397 | `CEnvCell::find_visible_child_cell` | Point-in-visible-cell query |
| 311878 | `CEnvCell::grab_visible_cells` | Loads outdoor on `seen_outside` |
| 427843 | `RenderDeviceD3D::DrawInside` | Indoor entry point |
| 429245 | `RenderDeviceD3D::DrawMesh` | **Per-mesh portal-sub-view loop** |
| 430027 | `RenderDeviceD3D::DrawBlock` | Outdoor landblock dispatch |
| 432709 | **`PView::DrawCells`** | **The `outside_view.view_count > 0` gate** |
| 433750 | `PView::ConstructView` | BFS portal walk |
### WorldBuilder anchors
| File:Line | Role |
|---|---|
| `references/WorldBuilder/Chorizite.OpenGLSDLBackend/Lib/VisibilityManager.cs:73-239` | `RenderInsideOut` — full stencil pipeline |
| Same file:241-359 | `RenderOutsideIn` — outdoor branch |
| Same file:47-71 | `PrepareVisibility` — visible cell set |
| `references/WorldBuilder/Chorizite.OpenGLSDLBackend/GameScene.cs:880-1008` | Main render dispatch (lines 965, 988 are the gates) |
| `references/WorldBuilder/Chorizite.OpenGLSDLBackend/Lib/PortalRenderManager.cs:488-628` | Portal mesh upload |
| `references/WorldBuilder/Chorizite.OpenGLSDLBackend/Lib/CameraController.cs:142-174` | Camera-cell tracking (portal raycasts) |
| `references/WorldBuilder/Chorizite.OpenGLSDLBackend/Shaders/PortalStencil.frag:7-16` | Stencil shader (writes `gl_FragDepth = 1.0`) |
### acdream extension points (audit-verified)
| File:Line | Current behavior | Extension required |
|---|---|---|
| `src/AcDream.App/Rendering/CellVisibility.cs:222-232` | Returns `VisibilityResult` with `VisibleCellIds`, `HasExitPortalVisible`, `CameraCell` | None — state already in place |
| `src/AcDream.App/Rendering/GameWindow.cs:6970-6984` | Computes `cameraInsideCell` and `playerInsideCell` per frame | None — values already in scope |
| `src/AcDream.App/Rendering/Wb/WbDrawDispatcher.cs:360-374` | Gates by `ParentCellId ∈ visibleCellIds`; outdoor entities (null) always pass | Add second gate: when `cameraInsideCell == true` and entity is outdoor (`ParentCellId == null`), require stencil pass or skip entirely |
| `src/AcDream.App/Rendering/TerrainModernRenderer.cs:191-208` | Frustum-only cull; renders all loaded landblocks | Add parameter for stencil pass / indoor-camera state |
| `src/AcDream.App/Rendering/GameWindow.cs:7074` | `_terrain?.Draw(camera, frustum, neverCullLandblockId: playerLb)` | Add `cameraInsideCell` (or equivalent) parameter |
| `src/AcDream.App/Rendering/GameWindow.cs:7110` | `WbDrawDispatcher.Draw(... visibleCellIds: visibility?.VisibleCellIds, ...)` | Add `cameraInsideCell` parameter |
| `src/AcDream.Core/Rendering/RenderingDiagnostics.cs:75-77` | Existing probe flag registry (mirror of `PhysicsDiagnostics`) | Add `ProbeVisibilityEnabled` from `ACDREAM_PROBE_VIS=1` |
### Issues family map
| ID | Symptom | Closes with H1 fix? |
|---|---|---|
| #78 | Outdoor stabs visible through inn floor/walls | YES (same root cause) |
| Cellar-stairs (NEW) | Outdoor terrain visible inside cottage cellar | YES (same root cause; new evidence for #78) |
| #95 | Portal-graph visibility blowup (visibleCells up to 295) | NO — independent (different code path) |
| #79/#80/#81/#93/#94 | Indoor lighting bugs | Maybe — #93 explicitly suspects "indoor visibility culling for lights" sub-cause; lighting subsystem may share infrastructure with visibility-gate but not directly impacted |
### Workflow notes (from CLAUDE.md "How to operate")
- "No workarounds without explicit approval" — Approach C is a workaround; Approach A is the correct shape.
- Visual verification is the user's job; can't be automated.
- Phase ID for visibility work is undecided. User picks at implementation-session start.
- Per the milestones doc, this is M1.5 scope; cellar-stairs is on the M1.5 critical path because it blocks the building/cellar half of the M1.5 demo.

View file

@ -0,0 +1,339 @@
# M1.5 — Broken stairs (cyl-only multi-part entity) — investigation handoff
**Date:** 2026-05-25 PM
**Status:** Filed as issue #101 (post-A6.P7 visual verification surfaced a NEW
bug, not the closed door bug). **Research-only next session.** No
implementation until we know what retail does at this exact stair location.
**Predecessor handoff:** [`2026-05-25-a6-door-cyl-investigation-handoff.md`](2026-05-25-a6-door-cyl-investigation-handoff.md)
(closed by A6.P7 commit `888272a`).
---
## TL;DR
A6.P7 visual verification at Holtburg confirmed the cottage door is fixed.
While exploring, the user found **a different staircase that doesn't work**
sphere can't climb at all. Captures show:
- Stairs are in cells `0xA9B40159` + `0xA9B4015A` (NOT the cottage-cellar
cells `0xA9B40143/146/147` that work post-A6.P3 cellar fix).
- Geometry is a **multi-part entity** `0x0040B500` (entityId; ~150 parts in
the setup; 10 of them are stair-step cylinders).
- Each step is a separate cylinder (`r=0.80m, h=0.80m`) at `Y=26.60`, stepping
up in X and Z (0.25 m per step, Z: 94.22 → 96.47).
- `state=0x00000000` on each cyl part — **no `HAS_PHYSICS_BSP_PS` flag**, so
A6.P7's dispatch gate (`Transition.BspOnlyDispatch`) does NOT skip them.
- The cyls fire 284 `result=Slid` with diagonal radial normals like
`(0.88, -0.47, 0)` — the same phantom shape A6.P7 closed for the cottage
door, but here the cause is per-cyl-without-BSP, not per-entity-with-both.
- **Player Z stayed at 94.00 for the entire 4216-record capture** — never
gained altitude.
This is **NOT** a regression of A6.P7. The fix did exactly what retail does
for entities with `HAS_PHYSICS_BSP_PS`. The stair bug is a separate class:
**cyl-only entities (no BSP) whose cyl geometry shouldn't physically block
the player but does.**
---
## What today shipped (DO NOT redo)
### A6.P7 — retail-binary cyl/BSP dispatch (commit `888272a`)
- File: `src/AcDream.Core/Physics/PhysicsBody.cs` (added
`PhysicsStateFlags.HasPhysicsBsp = 0x00010000`)
- File: `src/AcDream.Core/Physics/TransitionTypes.cs` (added
`Transition.BspOnlyDispatch(uint)` predicate + per-entry guard at the
cyl/sphere branch)
- Test: `tests/AcDream.Core.Tests/Physics/A6P7DispatchRulesTests.cs` (7 tests)
- Investigation:
[`docs/research/2026-05-25-a6-door-cyl-retail-dispatch-investigation.md`](2026-05-25-a6-door-cyl-retail-dispatch-investigation.md).
- **Visual-verified at Holtburg cottage door 2026-05-25.** Captures:
`launch-a6p7.log`, `launch-a6p7-v2.log` — 1187 `[cyl-skip-bsp]`, 0
`[cyl-test]` on the door, 30 axis-aligned hits, no phantom diagonals.
---
## The new bug — captures + evidence
### Captures (on disk, gitignored — DO NOT commit them; treat as live data)
- **Working baseline** (cellar stairs that work): `stairs-working.jsonl`
(16.9 MB, ~22K records). Z range 90.95 ↔ 94.00 (full cellar climb). 12
cell transitions; only 23 `hit=yes` events; no diagonal normals; user
ran up + down twice. Cells `0xA9B40143/146/147`.
- **Broken stairs**: `stairs-broken.jsonl` (8.1 MB, 4216 records). Z stayed
at 94.00 for the entire capture. Cells `0xA9B40159` + `0xA9B4015A`. The
player tried multiple approach angles; never climbed any step.
- **Launch logs with probes**: `stairs-working.launch.log`,
`stairs-broken.launch.log`. Contain `[cyl-test]`, `[cyl-skip-bsp]`,
`[bsp-test]`, `[resolve]`, `[resolve-bldg]` probe lines.
### Reproduction
Login as `+Acdream` at Holtburg. The cellar stairs work (verified). The
broken stairs the user found are at world XY around (110, 26), Z range
94 → 96. Walk west into them — sphere hits something diagonal and gets
stuck oscillating between `n=(0, 1, 0)` and `n=(0.87, -0.49, 0)` slides.
### Geometry summary (from `stairs-broken.launch.log`)
The blocker is multi-part entity `entityId=0x0040B500`. Ten of its parts
are cylinders forming a staircase at `Y=26.60`:
| Part | World XY | Z (cyl bottom) |
|---|---|---|
| `0x40B5008C` (part 140) | (108.72, 26.60) | 96.47 |
| `0x40B5008D` (part 141) | (108.97, 26.60) | 96.22 |
| `0x40B5008E` (part 142) | (109.22, 26.60) | 95.97 |
| `0x40B5008F` (part 143) | (109.47, 26.60) | 95.72 |
| `0x40B50090` (part 144) | (109.72, 26.60) | 95.47 |
| `0x40B50091` (part 145) | (109.97, 26.60) | 95.22 |
| `0x40B50092` (part 146) | (110.22, 26.60) | 94.97 |
| `0x40B50093` (part 147) | (110.47, 26.60) | 94.72 |
| `0x40B50094` (part 148) | (110.72, 26.60) | 94.47 |
| `0x40B50095` (part 149) | (110.97, 26.60) | 94.22 |
Each cyl: `radius=0.80, height=0.80, state=0x00000000`. The entity also
has a BSP part `obj=0xB5008900 gfx=0x01000C16 radius=2.645 pos=(109.30,
26.30, 95.75)` but it's effectively non-physics
(`hasPhys=False bspR=0.00 vAabbR=0.82`) — the `vAabbR` here is the
**visual** AABB radius being borrowed as a cylinder fallback because the
underlying `GfxObj` has no physics BSP.
### What's blocking the player
Sphere at `(112.115, 25.995, 94.00)` wants to move west. The closest cyl
`0x40B50095` is at `(110.97, 26.60, 94.22)`:
- `distXY = 1.295m` (just barely outside reach `0.80 + 0.48 = 1.28m`)
- But during sub-stepping the sphere center crosses 1.28m → cyl overlaps
- Radial normal direction from cyl center to sphere: `(0.884, -0.467, 0)`
matches observed phantom hits `(0.88, -0.47)`, `(0.86, -0.51)`, etc.
The cyl is **too tall (0.80m) to step over** under A6.P6's grounded
step-over check (step-up budget = 0.60m). Falls through to the
wall-slide branch which produces the diagonal radial normal that drives
the sphere's slide tangent into the perpendicular cell wall, then
re-blocks. Net: stuck.
### Why A6.P7 doesn't help
A6.P7 gates the cyl branch on `(state & 0x10000) != 0`. These stair cyls
have `state=0x00000000` — bit not set. Guard does NOT fire. Cyls are
tested. Sphere blocks.
---
## What this session needs — retail investigation
**Mandate:** report-only research, NO implementation. Use the `/investigate`
skill. The fix design comes in a subsequent session once the retail
behavior is settled.
### Question 1 — What does retail DO at this exact staircase?
**Use cdb.** The toolchain in `CLAUDE.md` "Retail debugger toolchain" is
ready. The matching binary + PDB are verified.
Concrete experiment:
1. Have the user run the retail acclient.exe (Microsoft AC official build
v11.4186) at the same world location (cells `0xA9B40159` + `0xA9B4015A`,
XY ≈ (110, 26)). The user needs to be IN the building, AT the foot of
these stairs.
2. Attach cdb with breakpoints:
- `acclient!CCylSphere::collides_with_sphere` at `0x53a880` — counter
`$t0`, log every 100 hits with the `this` pointer and the moving
sphere's position, `gc`. Auto-detach after 5000.
- `acclient!CCylSphere::intersects_sphere` (the dispatch from
`CPhysicsObj::FindObjCollisions` cyl branch) — counter `$t1`, log
entity address.
- `acclient!CObjCell::find_env_collisions` — counter `$t2`. Tells us if
retail uses cell BSP for stair collision.
- `acclient!CPartArray::FindObjCollisions` — counter `$t3`. Confirms BSP
dispatch path.
3. Have the user walk straight into the broken stairs from outside, then
try to climb them. Capture 30 seconds.
4. Detach. Analyze:
- Does `CCylSphere::collides_with_sphere` fire on the stair entity? If
yes → retail's cyls ARE active here, and retail somehow handles them
differently (different step-up threshold? cell-context-aware?). If
no → retail's cyls are excluded by something we don't replicate.
- Does `CObjCell::find_env_collisions` fire heavily? If yes → retail
might be using cell BSP polygons for the stairs (and the entity cyls
are decorative/click-targets only).
### Question 2 — What's the Setup ID? Compare retail's PhysicsObj construction
Our `[resolve-bldg]` lines show the entity is built from GfxObj
`0x0100081A` with `hasPhys=False`. **What's the Setup ID for entity
`0x0040B500`?** Trace through our streaming code to find which Setup
emitted the 150-part build.
Steps:
1. Grep `src/AcDream.App/Rendering/GameWindow.cs` for the
`BuildInteriorEntitiesForStreaming` path (CLAUDE.md says it hydrates
EnvCell static objects with id `0x40xxxxxx`).
2. Add a temporary `[entity-source]` probe that logs the Setup id when an
entity gets registered. Or check existing diagnostic output — the
`gfxObj=0x0100081A` is the part's GfxObj, but we need the parent Setup.
3. With the Setup id in hand, look up retail's behavior:
- Decompile / grep `docs/research/named-retail/acclient_2013_pseudo_c.txt`
for `CPhysicsObj::InitPartArrayFromSetup` or similar to see how retail
builds the part_array from a Setup. Does retail include every part as
a collision shape, or filter by some flag?
### Question 3 — Why is `vAabbR` becoming a cylinder?
The `[resolve-bldg]` line shows `gfxObj=0x0100081A hasPhys=False bspR=0.00
vAabbR=0.82`. We registered a `r=0.80` cyl. The 0.80 ≈ 0.82 match is
suspicious — we're using the **visual AABB radius** as a fallback cyl
radius when there's no physics BSP.
Steps:
1. Find the code path in our tree that does this fallback. Likely in
`src/AcDream.Core/Physics/ShadowShapeBuilder.cs` `FromSetup` or in
`RegisterMultiPart`. Look for cases where `GfxObj.PhysicsBSP` is null
and a cyl is synthesized.
2. Cross-reference retail: does retail synthesize a cyl from visual bounds
when physics is null? Or does retail skip such parts entirely for
collision (visual-only)?
3. ACE check: `references/ACE/Source/ACE.Server/Physics/PhysicsObj.cs`
how does ACE construct the part_array from a Setup with mixed
physics/visual-only parts?
### Question 4 — Cell BSP fallback
If retail's stairs are walked via cell BSP polygons (not entity cyls),
what's in cell `0xA9B40159`'s BSP at this XY/Z? Is there a walkable
polygon staircase that we're not iterating?
Steps:
1. Use `ACDREAM_DUMP_CELLS=0xA9B40159,0xA9B4015A` to dump the cell BSPs to
JSON. (Confirm the env var path; see existing `CellDump` infra near
issue #98's apparatus.)
2. Look for inclined polygons in the dump that form the staircase. If
present → retail likely uses these for collision; our entity cyls are
either a setup misinterpretation or redundant.
---
## Files to read FIRST next session
| Path | Why |
|---|---|
| `docs/ISSUES.md` (#101) | The filed issue with severity + acceptance |
| `docs/research/2026-05-25-a6-door-cyl-retail-dispatch-investigation.md` | A6.P7 background (closed; companion bug) |
| `docs/research/named-retail/acclient_2013_pseudo_c.txt:276776` | `CPhysicsObj::FindObjCollisions` |
| Setup dat reader path in `src/AcDream.Core/Physics/ShadowShapeBuilder.cs` | Cyl synthesis from Setup; the suspected fallback |
| `src/AcDream.App/Rendering/GameWindow.cs::BuildInteriorEntitiesForStreaming` | Entity hydration for EnvCell statics |
| `references/ACE/Source/ACE.Server/Physics/PhysicsObj.cs` | ACE PartArray construction |
| `references/ACE/Source/ACE.Server/Physics/Common/Setup.cs` | ACE Setup → PartArray pipeline |
---
## Tests that must stay green
Same as A6.P7 list:
```
dotnet test tests/AcDream.Core.Tests/AcDream.Core.Tests.csproj --no-build -c Debug --filter "FullyQualifiedName=AcDream.Core.Tests.Physics.CellarUpTrajectoryReplayTests.LiveCompare_FirstCap_FixClosesCottageFloorCap|FullyQualifiedName=AcDream.Core.Tests.Physics.DoorBugTrajectoryReplayTests.Directional_OutsideIn_SouthApproach_BlocksAtSlabSouthFace|FullyQualifiedName=AcDream.Core.Tests.Physics.DoorBugTrajectoryReplayTests.Directional_InsideOut_NorthApproach_BlocksAtSlabNorthFace|FullyQualifiedName=AcDream.Core.Tests.Physics.DoorBugTrajectoryReplayTests.CornerSlide_AlcoveEastToCottageNorth_ShouldBlock|FullyQualifiedName=AcDream.Core.Tests.Physics.DoorBugTrajectoryReplayTests.Geometric_DoorSlabAtSphereHeight_OverlapsInZ|FullyQualifiedName=AcDream.Core.Tests.Physics.DoorBugTrajectoryReplayTests.InsideOut_Tick3254_WithCottageWalls_ShouldBlock|FullyQualifiedName~BSPQueryTests.FindCollisions_Path5|FullyQualifiedName~CellTransitTests.A6P5|FullyQualifiedName~DoorCollisionApparatusTests.Apparatus_DeadCenter|FullyQualifiedName~A6P7DispatchRulesTests"
```
Expected: 20/20 pass.
---
## Things NOT to do (do-not-retry)
1. **Don't lower step-up height** to make A6.P6's grounded step-over fit the
0.80m cyl. Step-up budget = 0.60m is retail-faithful. Tweaking it would
regress every other surface where 0.60m is correct (curbs, low ledges).
2. **Don't extend A6.P7's `BspOnlyDispatch` to entities with `state=0`.**
That gate is retail-specific (`HAS_PHYSICS_BSP_PS`). Skipping cyls
purely because peer parts exist with BSP would diverge from retail and
break NPC cyl-only entities.
3. **Don't disable cyl fallback when `hasPhys=False` without checking
retail.** Until we know how retail handles `GfxObj` with no physics
BSP, "just skip the cyl" might break other content (small decorative
items that DO collide in retail).
4. **Don't add per-entity workarounds** ("if entity id 0x0040B500, skip
cyls"). Per CLAUDE.md no-workarounds rule.
5. **Don't enlarge the sphere's step-up budget for tall cyls.** Retail's
threshold is what it is. If retail steps over 0.80m cyls in this
scenario, the mechanism is something else.
---
## Three fix-shape candidates (for the FOLLOWING session, not this one)
Listed in rough order of retail-faithfulness based on the limited evidence
we have. The retail investigation will decide which is right.
1. **Don't synthesize cyls from visual AABB when `GfxObj.PhysicsBSP` is
null.** Suppress at registration time in `ShadowShapeBuilder.FromSetup`.
Retail-anchored: if retail's `CPartArray` doesn't include such parts in
the collision list, our registration shouldn't either. The cell BSP
would then be the only collision source.
2. **Use cell BSP polygons** for stair geometry; entity cyls are
decorative-only for this entity class. Requires: (a) confirming cell
`0xA9B40159` BSP has walkable stair polys, (b) ensuring our cell BSP
query iterates them. Likely a no-op on our side once (1) is done.
3. **Make `step_sphere_up` cyl-height-tolerant** — if the sphere is on a
walkable plane and a cyl is detected, attempt step-up even when cyl
height > step-up budget IF a walkable surface exists at the top of the
cyl. Retail-anchored ONLY if cdb shows retail does this on these
specific stairs.
---
## Pickup prompt for next session
```
A6 — Broken stairs cyl investigation (issue #101). Investigation-only session.
Read first (in this order):
1. docs/research/2026-05-25-stairs-cyl-investigation-handoff.md
(this file — full context, captures, geometry, do-not-retry list)
2. docs/ISSUES.md #101
3. docs/research/2026-05-25-a6-door-cyl-retail-dispatch-investigation.md
(A6.P7 background — closed; companion bug)
State both altitudes:
Currently working toward: M1.5 — Indoor world feels right
Current phase: A6 — broken-stairs investigation (issue #101)
Session mandate: retail investigation, NOT implementation. Use the
/investigate skill. Specific questions (each must be answered with cited
evidence — retail line numbers, cdb traces, dat dumps):
1. Does retail's CCylSphere::collides_with_sphere fire on the stair-step
cylinders at cells 0xA9B40159/0xA9B4015A when a player walks in to
climb them? If yes — how does retail walk past 0.80m-tall cyls? If
no — what excludes them?
2. What's the Setup ID for entity 0x0040B500? Trace from
GameWindow.cs::BuildInteriorEntitiesForStreaming. Cross-reference how
retail's CPhysicsObj::InitPartArrayFromSetup (or equivalent) builds
the collision shape list — does retail include parts with
hasPhys=False?
3. Why does our ShadowShapeBuilder synthesize an r=0.80 cyl from
vAabbR=0.82 when GfxObj.PhysicsBSP is null? Identify the code path.
Does retail do this?
4. Dump cell 0xA9B40159's BSP polygons (ACDREAM_DUMP_CELLS). Does the
cell BSP have walkable stair polygons? If yes — retail's stair
collision is the cell BSP, not the entity cyls.
Deliverable: a short report (~2-3 pages) covering the 4 questions with
retail line numbers, cdb trace excerpts, code citations. Then propose
which of the 3 fix-shape candidates is most retail-faithful (or a fifth
shape that emerges from the research).
DO NOT implement the fix this session. Save it for the session after.
Do-not-retry list (in handoff doc) — read it before starting.
Tests to keep green if any code changes happen (none expected this
session): see handoff doc.
Reproduction setup for the broken scenario:
ACDREAM_PROBE_BUILDING=1 ACDREAM_PROBE_RESOLVE=1
ACDREAM_CAPTURE_RESOLVE=<path>.jsonl
walk to cells 0xA9B40159/A in Holtburg (XY ≈ 110, 26)
```

View file

@ -0,0 +1,402 @@
# Phase A8 RR2 — `BuildingInfo` data shape + interior-portal walk
**Date:** 2026-05-26 (PM, RR2 spike)
**Predecessor:** [docs/research/2026-05-26-a8-wb-full-port-rr1-shipped-handoff.md](2026-05-26-a8-wb-full-port-rr1-shipped-handoff.md)
**Successor:** RR3 — `Building` + `BuildingRegistry` + `BuildingLoader` implementation
**Status:** SHIPPED — gate at end says "compatible → proceed to RR3"
## TL;DR
The DRW v2.1.7 `BuildingInfo` type exposes everything the design needs, with the field names already used elsewhere in our codebase. WB's `PortalService.GetPortalsByBuilding` (referenced by `PortalRenderManager.GeneratePortalsForLandblock` at lines 488-561) implements a clear BFS walk that translates 1:1 to our `LoadedCell.Portals` graph. Gate decision: **data shape compatible — proceed to RR3.**
Two refinements vs the plan's RR3 pseudocode worth noting (both are minor wording fixes, not algorithm changes):
1. The DRW type for each entry of `BuildingInfo.Portals` is `BuildingPortal` (NOT `BldPortal` as the plan's RR3-S9 test file uses). Plan's RR3 tests should rename `new BldPortal { ... }``new BuildingPortal { ... }`.
2. The exit-portal sentinel `0xFFFF` is the **value of `OtherCellId`** itself (ushort), not "low word of a 32-bit value." Plan code already treats it correctly.
## 1. `BuildingInfo` field shape (DRW v2.1.7)
Verbatim from `ilspycmd "%USERPROFILE%\.nuget\packages\chorizite.datreaderwriter\2.1.7\lib\net8.0\DatReaderWriter.dll" -t DatReaderWriter.Types.BuildingInfo`:
```csharp
namespace DatReaderWriter.Types;
public class BuildingInfo : IDatObjType, IUnpackable, IPackable
{
/// <summary>Either a SetupModel (0x02xxxxxx) or GfxObj (0x01xxxxxx) id.</summary>
public uint ModelId;
/// <summary>The position information (Origin: Vector3, Orientation: Quaternion).</summary>
public Frame Frame;
public uint NumLeaves;
public List<BuildingPortal> Portals = new List<BuildingPortal>();
}
```
Note: **fields, not properties**. All four are mutable but in practice are only populated by `Unpack(DatBinReader)` during dat-file load. `Portals` is initialized inline (never `null`).
`BuildingPortal` (same DLL):
```csharp
public class BuildingPortal : IDatObjType, IUnpackable, IPackable
{
public PortalFlags Flags; // 16-bit enum (PortalFlags.ExactMatch = 0x0001)
public ushort OtherCellId; // LOW WORD of cell id; landblock prefix ORs in
public ushort OtherPortalId;
public List<ushort> StabList = new List<ushort>();
}
```
`Frame` (lives at `DatReaderWriter.Types.Frame`, also a field-based class):
```csharp
public class Frame : IUnpackable, IPackable
{
public Vector3 Origin;
public Quaternion Orientation;
}
```
### What our codebase already does with this
In `src/AcDream.Core/World/LandblockLoader.cs:74-89`, the post-Phase-2 loop already iterates `info.Buildings` and consumes `building.ModelId`, `building.Frame.Origin`, `building.Frame.Orientation`. Plain field access — no surprises.
In `src/AcDream.App/Rendering/GameWindow.cs:5789-5803`, the indoor portal cell-tracking phase already builds our internal `BldPortalInfo` from `BuildingInfo.Portals`. The construction confirms the field types in practice:
```csharp
foreach (var building in lbInfo.Buildings)
{
if (building.Portals.Count == 0) continue; // .Count works → IList
foreach (var bp in building.Portals)
{
bldPortals.Add(new AcDream.Core.Physics.BldPortalInfo(
otherCellId: lbPrefix | (uint)bp.OtherCellId, // ushort → uint cast
otherPortalId: bp.OtherPortalId, // ushort
flags: (ushort)bp.Flags)); // PortalFlags → ushort cast
}
}
```
Conclusion: every field the RR3 design assumes already exists with the assumed semantics; the existing physics phase has been consuming them since 2026-05-19.
## 2. Holtburg cottage `BuildingInfo` — live dump
Live-inspect via `Console.WriteLine` diagnostic at `LandblockLoader.cs:74-89` (reverted after capture, see git diff in this commit's parent). Login at `+Acdream` (server guid `0x5000000A`, pos `(131.7, 26.1, 94.0) @ 0xA9B4002A`); the diagnostic fired for every landblock streamed during initial entry. Captured to `a6-rr2-s3-buildings.log` (gitignored — not committed).
### Holtburg town landblock `0xA9B4FFFF` — 12 BuildingInfo entries
```
idx=0 ModelId=0x01000C1E Frame.Origin=(84.1,131.5,66.0) NumLeaves=64 Portals=10
portal -> OtherCellId=0x0100 OtherPortalId=0x0000 Flags=0x0001 StabList.Count=17
portal -> OtherCellId=0x0100 OtherPortalId=0x0001 Flags=0x0001 StabList.Count=17
portal -> OtherCellId=0x0100 OtherPortalId=0x0005 Flags=0x0001 StabList.Count=17
portal -> OtherCellId=0x0103 OtherPortalId=0x0000 Flags=0x0001 StabList.Count=17
portal -> OtherCellId=0x0102 OtherPortalId=0x0001 Flags=0x0001 StabList.Count=17
portal -> OtherCellId=0x0106 OtherPortalId=0x0000 Flags=0x0001 StabList.Count=17
portal -> OtherCellId=0x0107 OtherPortalId=0x0000 Flags=0x0001 StabList.Count=17
portal -> OtherCellId=0x0109 OtherPortalId=0x0001 Flags=0x0001 StabList.Count=17
portal -> OtherCellId=0x010A OtherPortalId=0x0001 Flags=0x0001 StabList.Count=17
portal -> OtherCellId=0x010B OtherPortalId=0x0000 Flags=0x0001 StabList.Count=17
idx=1 ModelId=0x01000BC3 Frame.Origin=(31.5,159.5,66.0) NumLeaves=36 Portals=3
portal -> OtherCellId=0x0113 OtherPortalId=0x0001 Flags=0x0001 StabList.Count=5
portal -> OtherCellId=0x0114 OtherPortalId=0x0000 Flags=0x0001 StabList.Count=5
portal -> OtherCellId=0x0115 OtherPortalId=0x0000 Flags=0x0001 StabList.Count=5
idx=2 ModelId=0x0100082E Frame.Origin=(154.1,132.7,66.0) NumLeaves=30 Portals=4
portal -> OtherCellId=0x0116 OtherPortalId=0x0001 Flags=0x0003 StabList.Count=9
portal -> OtherCellId=0x0118 OtherPortalId=0x0001 Flags=0x0001 StabList.Count=9
portal -> OtherCellId=0x0119 OtherPortalId=0x0001 Flags=0x0001 StabList.Count=9
portal -> OtherCellId=0x011D OtherPortalId=0x0001 Flags=0x0001 StabList.Count=9
idx=3 ModelId=0x01000830 Frame.Origin=(104.5,135.5,66.0) NumLeaves=31 Portals=5
portal -> 0x011F, 0x0120 (F=0x0003), 0x0122, 0x0124, 0x0125 — all StabList.Count=8
idx=4 ModelId=0x01000827 Frame.Origin=(57.5,133.5,66.0) NumLeaves=101 Portals=8
portal -> 0x012D, 0x0133, 0x0134, 0x0135, 0x0129, 0x012B, 0x012C, 0x0137 — all StabList.Count=17
idx=5 ModelId=0x0100081C Frame.Origin=(132.5,154.0,66.0) NumLeaves=34 Portals=4
portal -> 0x0139, 0x013B, 0x013C, 0x013D — all StabList.Count=7
idx=6 ModelId=0x01000A2B Frame.Origin=(130.5,11.5,94.0) NumLeaves=29 Portals=5
portal -> 0x0145, 0x014C, 0x014E, 0x014F, 0x0150 — all StabList.Count=18
[cottage from issue #98 cellar saga; entry cells 0x0145 + cellar cells 0x014C-0x0150]
idx=7 ModelId=0x01000C17 Frame.Origin=(107.5,36.0,94.0) NumLeaves=39 Portals=3
portal -> 0x0164, 0x0165, 0x015E — all StabList.Count=25
[Holtburg Inn vestibule + ground floor]
idx=8 ModelId=0x01000BC3 Frame.Origin=(79.5,37.5,94.0) NumLeaves=36 Portals=3
portal -> 0x016C, 0x016D, 0x016E — all StabList.Count=5
idx=9 ModelId=0x01002232 Frame.Origin=(161.9,7.5,94.0) NumLeaves=209 Portals=2
portal -> 0x016F (F=0x0003), 0x0170 (F=0x0003) — all StabList.Count=7
[largest building, 209 leaves — probably a multi-floor structure or unique Holtburg landmark]
idx=10 ModelId=0x01002A1B Frame.Origin=(65.2,156.6,66.0) NumLeaves=48 Portals=2
portal -> 0x0178, 0x0177 — both F=0x0003 StabList.Count=3
idx=11 ModelId=0x01000F69 Frame.Origin=(158.2,37.7,94.0) NumLeaves=43 Portals=1
portal -> 0x0179 (F=0x0003) StabList.Count=2
[single-portal building — likely a simple shed / outhouse]
```
### What the dump confirms
| Confirmation | Evidence |
|---|---|
| Every `BuildingInfo.Portals` has `.Count > 0` (no empty buildings observed) | All 12 idx entries Portals=1..10 |
| Every `OtherCellId` is a real interior cell (none == `0xFFFF`) | Inspected 60 portal lines; all OtherCellId in 0x0100-0x0179 range |
| `Flags` values: `0x0001` (ExactMatch) for ~85% of portals; `0x0003` (ExactMatch + bit 1) for ~15% — likely the "exterior-facing portal side" bit | Per `DatReaderWriter.Enums.PortalFlags`: `ExactMatch = 0x0001`; bit 1 is `Side` per our `BldPortalInfo` ctor (which already handles it) |
| All `ModelId` values are GfxObjs (`0x01xxxxxx`) — NO Setups in Holtburg | Matches `LandblockLoader.IsSupported` (currently accepts both — Setup ids would survive the filter if they appeared) |
| `NumLeaves` correlates with building size — 209 for the largest, 29 for a small cottage | The DRW field is metadata; we don't consume it in `BuildingLoader` (only WB's offscreen mesh path uses it) |
| `StabList` populated (3-25 entries per portal) — indices into `LandBlockInfo.Objects` for the stabs (decorations) inside the building's cells | Not used by `BuildingLoader`; informational only |
| Cottage at idx=6 matches the issue #98 cellar saga's geometry: `Frame.Origin=(130.5,11.5,94.0)` is the cottage entry cell `0xA9B40145`; cellar cells `0xA9B4014C/014E/014F/0150` are reached via interior portals (this is exactly the BFS walk WB does in §3) | Cross-ref `docs/research/2026-05-23-a6-p3-issue98-comparison-harness-findings.md` |
### Implications for `BuildingLoader`
1. **No empty-portal building edge case in production data** — the `envCellIds.Count == 0` short-circuit at the end of Step C will essentially never fire for Holtburg. Still wire it (matches WB §89 + handles future content with empty buildings).
2. **Defensive `if (portal.OtherCellId == 0xFFFF) continue` in Step A** — never fires for BuildingInfo.Portals (confirmed across 60 portals); keeping it matches WB's defensive style.
3. **Cottage idx=6 is a known small multi-cell building** — perfect first verification target for RR8 visual gate ("cellar walls solid; cottage floor solid"). The cells it owns (0xA9B40145, 0xA9B4014C, 0xA9B4014E, 0xA9B4014F, 0xA9B40150) are the exact ones from the #98 saga.
## 3. WB's interior-portal walk algorithm
Source: [references/WorldBuilder/WorldBuilder.Shared/Services/PortalService.cs:43-97](../../references/WorldBuilder/WorldBuilder.Shared/Services/PortalService.cs).
```csharp
public IEnumerable<BuildingPortalGroup> GetPortalsByBuilding(uint regionId, ushort landblockId)
{
var lbFileId = ((uint)landblockId << 16) | 0xFFFE;
if (!_dats.CellRegions.TryGetValue(regionId, out var cellDb)) yield break;
if (!cellDb.TryGet<LandBlockInfo>(lbFileId, out var lbi)) yield break;
for (int buildingIdx = 0; buildingIdx < lbi.Buildings.Count; buildingIdx++)
{
var bInfo = lbi.Buildings[buildingIdx];
// --- Step A: seed with BuildingInfo.Portals (entry portals) ---
var discoveredCellIds = new HashSet<uint>();
var cellsToProcess = new Queue<uint>();
foreach (var portal in bInfo.Portals)
{
if (portal.OtherCellId != 0xFFFF)
{
var cellId = ((uint)landblockId << 16) | portal.OtherCellId;
if (discoveredCellIds.Add(cellId))
cellsToProcess.Enqueue(cellId);
}
}
// --- Step B: BFS through interior CellPortals ---
while (cellsToProcess.Count > 0)
{
var cellId = cellsToProcess.Dequeue();
if (cellDb.TryGet<EnvCell>(cellId, out var envCell))
{
foreach (var cellPortal in envCell.CellPortals)
{
if (cellPortal.OtherCellId != 0xFFFF)
{
var neighborId = ((uint)landblockId << 16) | cellPortal.OtherCellId;
if (discoveredCellIds.Add(neighborId))
cellsToProcess.Enqueue(neighborId);
}
}
}
}
// --- Step C: collect EXIT portals from every discovered cell ---
var outsidePortals = new List<PortalData>();
foreach (var cellId in discoveredCellIds)
foreach (var portal in GetPortalsForCell(cellDb, cellId)) // OtherCellId == 0xFFFF
outsidePortals.Add(portal);
if (discoveredCellIds.Count > 0)
yield return new BuildingPortalGroup
{
BuildingIndex = buildingIdx,
Portals = outsidePortals,
EnvCellIds = discoveredCellIds,
};
}
}
```
Where `GetPortalsForCell` walks each cell's `CellPortals`, picks the entries with `OtherCellId == 0xFFFF` (the "to outside" sentinel), looks up the portal polygon via:
```
_dats.Portal.TryGet<DatReaderWriter.DBObjs.Environment>(0x0D000000u | envCell.EnvironmentId, out var environment)
environment.Cells[envCell.CellStructure].Polygons[portal.PolygonId]
```
…then transforms each vertex by:
```
Matrix4x4.CreateFromQuaternion(envCell.Position.Orientation) *
Matrix4x4.CreateTranslation(envCell.Position.Origin)
```
…and yields `PortalData { Vertices = worldVertices, BoundingBox = ... }`.
### Key invariants extracted from the WB code
| Invariant | Evidence (WB line) |
|---|---|
| `0xFFFF` is the exit-portal sentinel for both `BuildingPortal.OtherCellId` and `CellPortal.OtherCellId` | `if (portal.OtherCellId != 0xFFFF)` (l. 58) and `if (cellPortal.OtherCellId != 0xFFFF)` (l. 71) and `if (portal.OtherCellId == 0xFFFF) /* Portal to outside! */` (l. 103) |
| Cell-id full form: `((uint)landblockId << 16) \| portal.OtherCellId` (NOT `landblockId & 0xFFFF0000u \| otherCellId` — but functionally equivalent because the high 16 bits of `landblockId` already encode the landblock x/y) | Lines 59, 72 |
| BFS uses dat-loaded `EnvCell.CellPortals`, NOT pre-resolved cell instances | Line 70 |
| Building's cell set comes from BOTH the entry portals AND the BFS extension — entry portals alone would miss most of a multi-cell building | Lines 57-64 (seed) + 67-79 (BFS) |
| A building with zero entry portals (`bInfo.Portals.Count == 0`) yields nothing — the `discoveredCellIds.Count > 0` gate at l. 89 short-circuits the `yield return` | Line 89 |
| `BuildingPortalGroup` instances correspond 1:1 with `BuildingInfo` entries (via `BuildingIndex`) | Line 91 |
### Edge cases observed in WB
- A cell shared between two `BuildingInfo` entries would be discovered TWICE (once per BFS). WB's `HashSet<uint> discoveredCellIds` is per-building, so each building gets its own copy. The plan's `BuildingRegistry.GetBuildingsContainingCell` already handles the "shared cell" case via `List<Building>`.
- WB walks the dat database (`cellDb.TryGet<EnvCell>(cellId, ...)`) DIRECTLY, regardless of whether cells are already loaded. Our `BuildingLoader.Build` will take `IReadOnlyDictionary<uint, LoadedCell>` so it walks pre-loaded cells. **Difference matters when streaming hasn't loaded a building's cells yet** — see §4.
## 4. Resolved algorithm for acdream's `BuildingLoader`
The plan's RR3-S11 pseudocode is correct in shape. Two updates pin it down precisely:
### 4.1 Type rename
Plan's RR3-S9 test file uses `BldPortal` and `BldPortal.OtherCellId`. Rename to `BuildingPortal` (the actual DRW type) and keep `OtherCellId` (matches DRW). The test helper signature becomes:
```csharp
var portalList = new List<BuildingPortal>();
foreach (var ocid in portals)
{
portalList.Add(new BuildingPortal
{
OtherCellId = (ushort)(ocid & 0xFFFFu),
Flags = 0,
OtherPortalId = 0,
StabList = new List<ushort>(),
});
}
```
### 4.2 Pre-loaded cells vs dat-direct walk
The plan's BuildingLoader walks `IReadOnlyDictionary<uint, LoadedCell> cellsByCellId`, NOT the dat database. This is the correct choice for acdream because:
- Our `LoadedCell.Portals` is already populated with `CellPortalInfo` records (one per `EnvCell.CellPortals` entry) at landblock-load time by `CellMesh` / `PhysicsDataCache.CacheCellStruct`.
- The streaming pipeline (`LandblockStreamer.LoadNear`) loads ALL of a landblock's `EnvCell`s into the dict before `BuildingLoader.Build` runs. So the dict is complete at registry-build time for the loaded landblock.
- Walking the dict avoids a duplicate dat fetch + EnvCell decode per BFS step (perf bonus).
The plan's empty-dict guard (`if (cellsByCellId.Count > 0)`) covers the unit-test case where the loader is invoked without cells. Production never hits that path.
### 4.3 Final pseudocode (carbon copy of plan's RR3-S11 modulo the rename)
```csharp
public static BuildingRegistry Build(
LandBlockInfo info,
uint landblockId,
IReadOnlyDictionary<uint, LoadedCell> cellsByCellId)
{
var reg = new BuildingRegistry();
if (info.Buildings is null || info.Buildings.Count == 0)
return reg;
uint lbMask = landblockId & 0xFFFF0000u;
uint nextId = 1;
foreach (var b in info.Buildings)
{
var envCellIds = new HashSet<uint>();
var exitPortalPolys = new List<Vector3[]>();
// Step A: seed from BuildingInfo.Portals
if (b.Portals is not null)
foreach (var portal in b.Portals)
{
if (portal.OtherCellId == 0xFFFF) continue;
envCellIds.Add(lbMask | portal.OtherCellId);
}
// Step B: BFS through interior CellPortals (preferred — uses pre-loaded LoadedCell.Portals)
var queue = new Queue<uint>(envCellIds);
while (queue.Count > 0)
{
var current = queue.Dequeue();
if (!cellsByCellId.TryGetValue(current, out var cell)) continue;
foreach (var p in cell.Portals)
{
if (p.OtherCellId == 0xFFFF) continue;
uint neighbourId = lbMask | p.OtherCellId;
if (envCellIds.Add(neighbourId))
queue.Enqueue(neighbourId);
}
}
// Step C: collect EXIT portal polygons in world space
foreach (var cellId in envCellIds)
{
if (!cellsByCellId.TryGetValue(cellId, out var cell)) continue;
for (int pi = 0; pi < cell.Portals.Count; pi++)
{
if (cell.Portals[pi].OtherCellId != 0xFFFF) continue;
if (pi >= cell.PortalPolygons.Count) continue;
var localPoly = cell.PortalPolygons[pi];
if (localPoly.Length < 3) continue;
var worldPoly = new Vector3[localPoly.Length];
for (int v = 0; v < localPoly.Length; v++)
worldPoly[v] = Vector3.Transform(localPoly[v], cell.WorldTransform);
exitPortalPolys.Add(worldPoly);
}
}
if (envCellIds.Count == 0) continue; // building has no interior — skip (matches WB §89)
var building = new Building
{
BuildingId = nextId++,
EnvCellIds = envCellIds,
ExitPortalPolygons = exitPortalPolys,
};
reg.Add(building);
foreach (var cellId in envCellIds)
if (cellsByCellId.TryGetValue(cellId, out var cell))
cell.BuildingId = building.BuildingId;
}
return reg;
}
```
`cell.PortalPolygons` is already populated by `CellMesh.Build` / `PhysicsDataCache.CacheCellStruct` from the same dat lookup chain (`Environment.Cells[CellStructure].Polygons[PolygonId]`) — RR3 doesn't have to re-derive it.
## 5. Edge cases
1. **Building with zero portals** — skipped (matches WB `discoveredCellIds.Count > 0` gate at l. 89). The building entity (the cottage shell mesh) still ships via the existing `LandblockLoader` path with `IsBuildingShell = true`; the `BuildingRegistry` just doesn't list it.
2. **Cell shared between two buildings** — handled by `BuildingRegistry._byCellId: Dictionary<uint, List<Building>>` (plan's RR3-S7). `LoadedCell.BuildingId` will be stamped with the LAST building's id; consumers requiring all owners must use `BuildingRegistry.GetBuildingsContainingCell` (plural). RR7's render-path uses the plural lookup.
3. **Building with portals pointing to unloaded cells** — Step B's BFS bails out at the unloaded cell (`!cellsByCellId.TryGetValue`); the building's `EnvCellIds` is short by however many cells weren't loaded. In production this doesn't happen (streaming loads all cells before the registry builds). In tests, the loader still returns a valid (possibly partial) building. Worth a doc comment in RR3's `BuildingLoader.cs`.
4. **`BuildingInfo.Portals[i].OtherCellId == 0xFFFF`** — defensively skipped at Step A. Empirically WB's code includes the same defensive check (l. 58), so the case is anticipated even if not common.
5. **Multi-landblock buildings** — none observed. `BuildingPortal.OtherCellId` is a 16-bit value scoped to the same landblock; the dat-level encoding can't reference a different landblock. Buildings are LB-local.
6. **Dungeon cells** — dungeons are NOT enumerated in `LandBlockInfo.Buildings`. Their cells have `BuildingId == null` and flow through the outdoor render path. The plan calls this out explicitly; nothing changes here.
## 6. Gate decision
✅ **Data shape compatible — proceed to RR3.**
The two corrections vs the plan's RR3 pseudocode (`BuildingPortal` rename, `cell.BuildingId` setter timing) are minor and confined to RR3's test-helper + setter call site. The algorithm is unchanged from the plan's expectation. No re-brainstorm needed.
## 7. References
- DRW v2.1.7 `BuildingInfo`: `%USERPROFILE%\.nuget\packages\chorizite.datreaderwriter\2.1.7\lib\net8.0\DatReaderWriter.dll` (decompiled via `ilspycmd -t DatReaderWriter.Types.BuildingInfo`)
- DRW v2.1.7 `BuildingPortal`: ditto, `-t DatReaderWriter.Types.BuildingPortal`
- DRW v2.1.7 `CellPortal`: ditto, `-t DatReaderWriter.Types.CellPortal`
- WB walk: [references/WorldBuilder/WorldBuilder.Shared/Services/PortalService.cs:43-97](../../references/WorldBuilder/WorldBuilder.Shared/Services/PortalService.cs)
- WB upload: [references/WorldBuilder/Chorizite.OpenGLSDLBackend/Lib/PortalRenderManager.cs:488-628](../../references/WorldBuilder/Chorizite.OpenGLSDLBackend/Lib/PortalRenderManager.cs)
- Retail header for `BuildInfo` (renamed in DRW to `BuildingInfo`): [docs/research/named-retail/acclient.h:32035-32042](../research/named-retail/acclient.h)
- Retail header for `CBldPortal` (renamed to `BuildingPortal`): [docs/research/named-retail/acclient.h:32094-32103](../research/named-retail/acclient.h)
- Existing acdream consumer pattern: [src/AcDream.App/Rendering/GameWindow.cs:5789-5803](../../src/AcDream.App/Rendering/GameWindow.cs)
- Existing `LoadedCell.Portals` shape: [src/AcDream.App/Rendering/CellVisibility.cs:51,79](../../src/AcDream.App/Rendering/CellVisibility.cs)

View file

@ -0,0 +1,137 @@
# Phase A8 re-plan — entity taxonomy investigation
**Date:** 2026-05-26
**Phase:** A8 — Indoor-cell visibility culling RE-PLAN
**Predecessor handoff:** [docs/research/2026-05-26-a8-revert-handoff.md](2026-05-26-a8-revert-handoff.md)
**Status:** Report-only. Awaiting user approval of recommended fix-shape before Phase 2 (plan writing).
**Empirical context (added during investigation):** the bug exists on `main` too — verified by side-by-side launch of `main` vs `HEAD = fef6c61`. Both branches show outdoor buildings/terrain visible through the walls of a cottage when standing inside. The bug is **fundamental**, not a regression in this worktree's 149-commit divergence. The A8 framing in the predecessor handoff stands.
---
## TL;DR
The retail data model, WorldBuilder's data model, and the comment at `GameWindow.cs:5175-5178` all agree on a single architectural fact: **building shells are tagged distinctly from outdoor scenery at the data layer.** acdream's `LandblockLoader` reads both `LandBlockInfo.Objects` (scenery) and `LandBlockInfo.Buildings` (shells) into the same `WorldEntity` pool with no tag, destroying the distinction. The fix is to add `WorldEntity.IsBuildingShell: bool` at the loader, propagate it through hydration, and use it in the `WbDrawDispatcher.EntitySet` partition. This is **retail-faithful** (matches `BuildInfo` array) and **WB-faithful** (matches `SceneryInstance.IsBuilding`).
GL state order from the A8 Round 3 learning (MarkAndPunch BEFORE indoor draw) is confirmed correct by reading WorldBuilder's `VisibilityManager.RenderInsideOut`.
Far-side-portal (WB "Step 5", 3-stencil-bit) is deferred. First-ship approximation: only stencil-mark the **camera's own cell's** portals, not BFS-extended `VisibleCellIds`.
---
## The seven entity classes in acdream's runtime
| # | Class | `ParentCellId` | `Id` prefix | `ServerGuid` | Source field |
|---|---|---|---|---|---|
| 1 | Cell mesh | set | `0x40xxxxxx` | 0 | `EnvCell.EnvironmentId` |
| 2 | Cell static object | set | `0x40xxxxxx` | 0 | `EnvCell.StaticObjects` |
| 3 | **Building shell stab** | **null** | `0xC0xxxxxx` | 0 | **`LandBlockInfo.Buildings`** |
| 4 | **Outdoor scenery stab** | **null** | `0xC0xxxxxx` | 0 | **`LandBlockInfo.Objects`** |
| 5 | Procedural scenery | null | `0x80xxxxxx` | 0 | `SceneryGenerator` (terrain table) |
| 6a | Live animated | null | `0x10xxxxxx` | ≠0 | `CreateObject` packet |
| 6b | Live static | null | `0x10xxxxxx` | ≠0 | `CreateObject` packet |
**Classes 3 and 4 are indistinguishable at runtime today** (identical field shape after hydration). This is the load-bearing wrong assumption from the A8 attempt.
### Code anchors (acdream)
- `src/AcDream.Core/World/LandblockLoader.cs:62-71` — Objects (Class 4) loop
- `src/AcDream.Core/World/LandblockLoader.cs:74-87` — Buildings (Class 3) loop, **same `nextId++` counter, same WorldEntity shape**
- `src/AcDream.App/Rendering/GameWindow.cs:5129-5137` — hydration pass-through, no distinction preserved
- `src/AcDream.App/Rendering/GameWindow.cs:5175-5178` — the comment that proves the distinction is intentional in dat:
> *"Only Buildings suppress scenery. Stabs (LandBlockInfo.Objects) are static scenery placeholders themselves (rocks, tree clusters) that retail does NOT use to suppress scenery generation."*
---
## How retail tags buildings (cross-reference 1)
`CLandBlock::init_buildings` (`acclient_2013_pseudo_c.txt:313854-313920`) reads `CLandBlockInfo::buildings[]` — a **separate `BuildInfo**` array**, NOT a flag bit or ID-range scheme.
- `CLandBlockInfo.num_buildings` + `buildings[]` array (`acclient.h:31893-31905`)
- `BuildInfo` struct: `building_id`, `building_frame`, `num_portals`, `CBldPortal** portals` (`acclient.h:32035-32042`)
- Buildings hydrate via `CBuildingObj::makeBuilding()` (line 313879) and register into the landblock's `stablist[]` (per-landblock visible-cell set, line 313910)
- Visibility uses **stablist (portal PVS)**, NOT AABB-encloses-camera. `CEnvCell::grab_visible` walks `stab_list[i]` directly.
Conclusion: **retail explicitly distinguishes the two via separate dat arrays.** This is the data-model truth we should match.
## How WorldBuilder tags buildings (cross-reference 2)
WB uses **two manager classes** sharing one mesh pool:
- `StaticObjectRenderManager` — handles BOTH `LandBlockInfo.Objects` and `LandBlockInfo.Buildings`, tagging each `SceneryInstance.IsBuilding` (`StaticObjectRenderManager.cs:334-400`).
- `SceneryRenderManager` — handles ONLY procedural terrain-derived scenery (different class entirely, doesn't share the dat path).
Tagging happens at **hydration time** in `GenerateForLandblockAsync` (lines 315-427). The instance is then split into separate `StaticPartGroups` vs `BuildingPartGroups` for draw dispatch.
`BuildingPortalGPU` (`PortalRenderManager.cs:687-701`) holds `EnvCellIds: HashSet<uint>` populated at landblock generation (line 549) — the "this building contains these EnvCells" association. The set is **never re-computed at render time**.
WB's `RenderInsideOut` GL state order (`VisibilityManager.cs:73-239`):
1. Stencil bit 1 ← portal polygons (color/depth masks off)
2. `gl_FragDepth = 1.0` ← portal polygons (depth mask on, depth-func = Always)
3. **Interior EnvCells render WITHOUT stencil restriction** ← key step
4. Stencil-restricted (`Equal, 1`): terrain + scenery + buildings render only at portal silhouettes
5. (Step 5) 3-stencil-bit pipeline for cross-building visibility — DEFER
**WB's order = MarkAndPunch (Step 1 + 2) FIRST, then indoor cells (Step 3).** This matches A8 Round 3's correction. The handoff's GL-state-order conclusion stands.
---
## Recommended fix-shape (synthesized)
### Stage 1: Tag at hydration (`IsBuildingShell` flag)
Add `WorldEntity.IsBuildingShell: bool` (default false). In `LandblockLoader.cs`:
- Objects loop (line 62): `IsBuildingShell = false`
- Buildings loop (line 74): `IsBuildingShell = true`
In `GameWindow.cs:5129-5137` (hydration): copy `IsBuildingShell` from `e` to the hydrated entity. One-line change.
### Stage 2: Refine `WbDrawDispatcher.EntitySet` partition
Replace today's binary `IndoorOnly`/`OutdoorOnly` with:
- `IndoorPass``ParentCellId.HasValue || IsBuildingShell` (Classes 1, 2, 3)
- `OutdoorScenery``!ParentCellId.HasValue && !IsBuildingShell && (ServerGuid == 0)` (Classes 4, 5)
- `LiveDynamic``ServerGuid != 0` (Classes 6a, 6b)
`WalkEntitiesInto` updates one branch (the partition predicate). 26 dispatcher tests will need their fixture entities tagged correctly; otherwise behavior is the same.
### Stage 3: Re-wire render frame with WB's order
When camera is inside a cell:
1. Draw terrain (color in framebuffer)
2. **MarkAndPunch** (stencil = 1 + depth = 1.0 at portal silhouettes)
3. `WbDrawDispatcher.Draw(set: IndoorPass)` — cell mesh + cell statics + building shells. Stencil disabled, depth test normal. These write depth ON TOP of the 1.0 punch, correctly occluding the next stencil-gated pass.
4. Re-draw terrain (color writes only) with `StencilFunc(Equal, 1)` — terrain visible only at portal silhouettes.
5. `WbDrawDispatcher.Draw(set: OutdoorScenery)` with `StencilFunc(Equal, 1)` — outdoor scenery visible only at portal silhouettes.
6. `WbDrawDispatcher.Draw(set: LiveDynamic)` — stencil disabled, depth test on. Live entities draw freely; depth occludes them by walls and cell meshes already in the depth buffer.
When camera is outside: stencil work skipped entirely. Today's all-entities single draw stands (or substitute the three EntitySet calls with stencil disabled — depth still sorts them correctly).
### Stage 4: Far-side-portal approximation (defer Step 5)
Stencil-mark **only the camera's own cell's portals** in Step 2, not the BFS-extended `VisibleCellIds`. This trades cross-cell-portal visibility (rare visually) for correctness in the common case (no "see-through-wall on the other side of the room"). Track as a known limitation; revisit if visual gate flags it.
---
## Reasons for confidence
1. **Triple-cited**: retail (`BuildInfo` array), WB (`IsBuilding` flag), acdream's own code comment (5175-5178) all agree on the distinction.
2. **Tagging cost is microscopic** — one bool on `WorldEntity`, one branch in `LandblockLoader`. No new types, no new managers, no field migration.
3. **`EntitySet` enum is already in place** (dormant from Tasks 1-6). Refactor is reshaping its semantics, not introducing it.
4. **GL state order is validated** by both Round 3 of the A8 attempt and WB's reference. No remaining ambiguity.
5. **Live-dynamic separation handles the Round 1 character-disappears bug** (handoff §Round 1). They draw last, stencil disabled, depth-tested against everything else.
## Open questions for user approval
1. Use `IsBuildingShell` flag (recommended) vs separate `0xC1xxxxxx` ID-namespace? Flag is more explicit, retail-faithful, and trivially greppable. ID-namespace is one less field but invisible at the call site.
2. Defer Step 5 (far-side portals) and stencil-mark only camera's own cell? Recommendation: yes — ship simple, file follow-up.
3. Live-dynamic entities (Class 6b: dropped items) — draw in `LivePass` or accept "invisible from inside" until a richer flag exists? Recommendation: `LivePass`. They're rare visually, and the player benefits from seeing dropped items through the floor (gameplay nicety, not retail violation).
4. Cellar-stairs grass overlay from OUTSIDE: NOT A8 scope (no stencil runs when camera is outside). Open question for a future "deep-cell terrain occlusion" phase. Confirm we file this separately, not bundled.
---
## Reference anchors (still valid from predecessor handoff)
- WB stencil: `references/WorldBuilder/Chorizite.OpenGLSDLBackend/Lib/VisibilityManager.cs:73-239`
- WB building-cell association: `references/WorldBuilder/Chorizite.OpenGLSDLBackend/Lib/PortalRenderManager.cs:518-551`
- Retail building init: `docs/research/named-retail/acclient_2013_pseudo_c.txt:313854-313920`
- Retail building struct: `docs/research/named-retail/acclient.h:31893-31905`, `:32035-32042`, `:32094-32103`
- acdream LandblockLoader: `src/AcDream.Core/World/LandblockLoader.cs:62-87`
- acdream hydration: `src/AcDream.App/Rendering/GameWindow.cs:5093-5148`, `:5175-5178`

View file

@ -0,0 +1,271 @@
# Phase A8 — R3.5 transition-flicker iteration PAUSED. Handoff for restructure session.
**Date:** 2026-05-26 (PM)
**Status:** R1 + R2 + R3 + R3.5 v1 + R3.5 v2 all shipped. Primary #78 indoor fix WORKS. Three distinct transition/sky issues surfaced during R4 visual verification that resist symptom-level patching. **Paused for proper brainstorm → write-plan → execute-plan workflow in a fresh session.**
**Branch:** `claude/strange-albattani-3fc83c` (worktree)
**HEAD:** `2bfeafd`
**Predecessor handoff:** [docs/research/2026-05-26-a8-revert-handoff.md](2026-05-26-a8-revert-handoff.md)
**Original re-plan:** [docs/superpowers/plans/2026-05-26-phase-a8-replan.md](../superpowers/plans/2026-05-26-phase-a8-replan.md)
**Entity-taxonomy fix-shape (approved):** [docs/research/2026-05-26-a8-entity-taxonomy.md](2026-05-26-a8-entity-taxonomy.md)
---
## TL;DR
R1 (IsBuildingShell flag), R2 (EntitySet partition reshape), R3 (render-frame integration of WB-order stencil pipeline) all shipped clean. The primary #78 fix WORKS: standing inside a Holtburg cottage, the walls now block outdoor visibility — no see-through buildings, no see-through scenery. M1.5's "indoor world feels right" is partially achieved.
Visual verification (R4) surfaced **three remaining issues** that are NOT individual bugs — they're symptoms of an **architectural mismatch** between our render frame and WB's `RenderInsideOut` reference. Specifically: we draw terrain unconditionally before the stencil work and use depth-clear-if-inside as a workaround, while WB skips initial terrain entirely when inside and renders terrain ONLY at the stencil-gated step. Two patch attempts (R3.5 v1 and R3.5 v2) papered over parts of the symptom but kept producing new edge cases — the exact "patching symptoms" anti-pattern CLAUDE.md and the predecessor revert handoff explicitly call out.
**Next session must brainstorm the right architecture, write a plan, and execute.** Do NOT continue inline patches.
---
## What shipped this session (5 commits)
| Commit | Task | What it does |
|---|---|---|
| `ed72704` | R1 | Adds `WorldEntity.IsBuildingShell: bool init` set in `LandblockLoader.Buildings` loop; propagated through `GameWindow.cs:5129-5136` hydration. 2 LandblockLoader tests lock the data-layer guarantee. |
| `55f26f2` | R2 (amended) | Reshapes `WbDrawDispatcher.EntitySet` from `IndoorOnly`/`OutdoorOnly` to taxonomy-aware `IndoorPass` / `OutdoorScenery` / `LiveDynamic`. Adds `private static bool EntityMatchesSet(WorldEntity, EntitySet)` truth-table predicate. 7 tests cover the partition. |
| `60f07bc` | R3 | Wires the stencil pipeline into `GameWindow` render frame with WB-order: `MarkAndPunch → IndoorPass → EnableOutdoorPass → terrain re-draw → OutdoorScenery → DisableStencil → LiveDynamic`. Stencil-marks **camera's own cell's exit portals only** (WB Step 5 deferred). |
| `38d5374` | R3.5 v1 | Adds `cameraReallyInside = PointInCell(camPos, visibility.CameraCell)` gate for the stencil branch (kept `cameraInsideCell` for sky / lighting / depth-clear). Attempt to close the exit-transition flicker. |
| `2bfeafd` | R3.5 v2 | Also gates the depth-clear-if-inside on `cameraReallyInside`. Attempt to close the "objects through ground" symptom the v1 fix exposed. |
All 5 commits are kept; none are reverted. Build green at HEAD. Test failures within the documented 14-23 pre-existing flaky window.
---
## What's WORKING (the primary fix)
Standing inside any Holtburg cottage (ground floor or cellar), looking around:
- **Walls are solid.** No outdoor scenery visible through walls. No buildings visible through walls.
- **The original #78 symptom is gone.** This is the primary acceptance criterion for the A8 phase.
- User confirmed: *"Ok better. ... When I look out now from inside it is not showing buildings below or any windows inside the house."*
The architectural win is real:
- `WorldEntity.IsBuildingShell` correctly tags cottage walls at the dat-source boundary (`LandblockLoader.Buildings` loop).
- `WbDrawDispatcher.EntitySet.IndoorPass` correctly routes cell mesh + cell statics + building shells together — fixing the previous Round-3 regression where cottage walls disappeared.
- Camera's-own-cell-portals-only approximation (Step 5 deferred) avoids the "see through wall to another room's outdoor" regression from previous Round 2.
---
## What's NOT WORKING (3 transition/sky issues)
Verbatim user reports from R4 visual verification (post R3.5 v2):
### Issue A — Exit indoor→outdoor: "objects through ground + building parts missing"
> "If I stand outside or just pass outside I get the flicker where objects are visible through ground and walls of other buildings are missing"
**My diagnosis:** during the 3-frame grace window after camera physically exits a cell (`CellVisibility._cellSwitchGraceFrames`), `cameraInsideCell` stays true but `cameraReallyInside` becomes false (PointInCell on the previous cell returns false). With v2:
- Sky still skipped (cameraInsideCell)
- Initial terrain still drawn (unconditional, line 7115)
- depth-clear NOT fired (cameraReallyInside)
- Stencil branch NOT taken (cameraReallyInside)
- Outdoor branch (`Draw(set: All)`) runs
This *should* be correct — terrain depth preserved, all entities depth-tested. But the user still sees the symptom. **Working hypothesis:** with the depth buffer holding terrain Z (~99.99 post the -0.01 nudge from f48c74a), entities at world Z below terrain may still win depth tests in certain camera angles. Or the issue is something else entirely that the v2 didn't address.
### Issue B — Inside looking through window: "Sky don't render"
> "Sky dont render when I look from inside to outside"
**My diagnosis:** when inside, sky pass is skipped (`if (!cameraInsideCell) { _skyRenderer?.RenderSky(...); ... }` at line 7079). The stencil-gated outdoor pass re-draws terrain + outdoor scenery in portal silhouettes, but **NOT sky**. Through a window, the user sees terrain (where it projects in the portal silhouette) and beyond the terrain horizon — fog color (the framebuffer clear color is set to fog haze at line 6894, not sky color).
This is a **known WB-pipeline limitation** — WB itself doesn't draw sky inside-out. To fix in acdream we'd add a stencil-gated `_skyRenderer.RenderSky` call inside the indoor branch between `EnableOutdoorPass` and the terrain re-draw. Not done in any R3.5 patch.
### Issue C — Entry outdoor→indoor: "floor transparent showing cellar + wrong texture"
> "When going from outside to inside flickering so that parts of the floor is transparent so I see the cellar from above and wrong texture on the floor"
**My diagnosis (LOWER CONFIDENCE):** the cottage floor and cellar ceiling are at adjacent world Z values. Both meshes are loaded (cottage cell + cellar cell both in `VisibleCellIds` when standing in the cottage). During the entry transition frame, depth-fight may occur between cottage floor (Z=100.02 with the +0.02 cell origin bump) and cellar ceiling (whatever Z that mesh sits at). "Wrong texture" suggests the cellar ceiling is winning depth at floor pixels and its texture is showing through. This is **likely a pre-existing data-model / multi-cell-Z artifact, not strictly an A8 bug**, but it became visible because the new pipeline doesn't have the depth-clear-if-inside masking it on every frame anymore.
---
## Architectural diagnosis — the root cause
Reading `references/WorldBuilder/Chorizite.OpenGLSDLBackend/Lib/VisibilityManager.cs:73-239` carefully:
**WB's RenderInsideOut order:**
1. (No initial terrain. Depth buffer is empty from frame-start `glClear`.)
2. MarkAndPunch — stencil bit 1 + depth = 1.0 at exit-portal silhouettes only.
3. Render interior EnvCells with stencil OFF, normal `DepthFunc.Less`. Cell mesh wins fresh depth at most pixels.
4. Enable stencil restriction (`StencilFunc Equal 1, 0x01`).
5. **Render terrain + scenery + static objects** — at portal silhouettes ONLY (stencil-restricted). Terrain depth (close, ~99.99) wins against the 1.0 punch in portal areas → outdoor visible through windows.
6. (Step 5: WB's 3-stencil-bit pipeline for cross-building visibility — deferred.)
**Our R3.5 v2 order:**
1. **Terrain drawn unconditionally** (line 7115; color + depth at ~99.99).
2. depth-clear-if-cameraReallyInside (depth → 1.0; redundant with MarkAndPunch).
3. MarkAndPunch (no-op against the depth-cleared 1.0).
4. IndoorPass — cell mesh + statics + building shells.
5. EnableOutdoorPass + terrain RE-draw + OutdoorScenery (stencil-gated).
6. DisableStencil + LiveDynamic.
**The mismatch:** we draw terrain TWICE (initial + re-draw) and have a depth-clear that's a workaround for the initial terrain draw. WB avoids both by skipping the initial terrain entirely when inside. Our pipeline is a "FRANKENSTEIN" — it works in the steady-state indoor case (the primary #78 fix) but breaks at transitions and during grace frames because the interactions between (initial terrain + depth-clear + grace + cameraInsideCell vs cameraReallyInside flag asymmetry) keep producing new edge cases.
**The R3.5 v1 and v2 patches were symptom-fixes**, not root-cause fixes. CLAUDE.md is explicit about this: *"When you spot a bug or encounter a behavioral mismatch, fix the underlying cause — do not ship a band-aid, suppression flag, grace period, retry loop, or any other 'make the symptom go away' shortcut, unless the user has explicitly approved that shape."* The user has now correctly pulled the emergency brake.
---
## Recommended next-session approach
Use the **superpowers full workflow**:
### Phase 1: BRAINSTORM (use `superpowers:brainstorming`)
Settle the design BEFORE writing a plan. Key brainstorm questions:
1. **Should the initial terrain draw be conditional?**
- WB faithfully: yes, skip when `cameraReallyInside`. Terrain draws only at stencil-gated step.
- Hybrid: keep initial terrain unconditional but remove the depth-clear so terrain depth wins against indoor cells at non-portal pixels. *(Would break the #78 fix — cottage floor at +0.02 would lose to terrain at -0.01.)*
- **Probably WB-faithful is the right call.**
2. **Should sky be re-drawn stencil-gated when inside?**
- WB: no. Sky color shows as fog-clear-color through windows.
- acdream enhancement: yes, render `_skyRenderer.RenderSky` between `EnableOutdoorPass` and the terrain re-draw inside the indoor branch.
- **Tradeoff:** WB-faithfulness vs. user's expectation that windows show sky. Retail probably shows sky through windows; investigate retail's polygon-clip scissor approach.
3. **What's the deal with the entry-flicker "floor transparent showing cellar"?**
- Is it depth-fight between cottage floor mesh (Z=100.02) and cellar ceiling mesh (Z=?)? Need a brief investigation to confirm.
- Is it a one-frame visibility-update lag where cottage cell isn't yet in VisibleCellIds during the entry transition frame?
- Is it pre-existing in main (test by reverting all of A8 and entering a cottage on main)?
- **Don't try to fix this in A8.** Identify, file as separate follow-up (likely candidate for #103 family or new #106).
4. **Should we eliminate `cameraInsideCell` vs `cameraReallyInside` asymmetry?**
- Today: `cameraInsideCell` (grace-aware) gates sky/lighting; `cameraReallyInside` (PointInCell, no grace) gates depth-clear + stencil branch.
- The split is a workaround for the grace-mechanism conflict with the render path. With WB-faithful order (no initial terrain, no depth-clear), can we use `cameraReallyInside` everywhere? Or does that introduce sky flicker at the threshold?
- The grace mechanism was added to prevent cell-id flicker at doorways. Does PointInCell with its existing epsilon already provide enough hysteresis?
- **Likely path: unify on `cameraReallyInside` and remove the grace mechanism entirely.** Simpler is better.
5. **Are R3.5 v1 + v2 patches worth keeping or should we revert them before the restructure?**
- v1 (stencil branch gate): subsumed by the restructure since the stencil branch will use `cameraReallyInside`.
- v2 (depth-clear gate): subsumed since depth-clear gets DELETED entirely.
- **Recommendation:** revert v1 and v2 (`git revert 2bfeafd 38d5374` or new commits) at the start of the implementation session, work from the R3 baseline. Cleaner diff, easier review.
### Phase 2: WRITE-PLAN (use `superpowers:writing-plans`)
Expected plan shape (TDD where possible):
- **Task RR1**: Revert R3.5 v1 + v2 (`git revert 38d5374 2bfeafd`). Result: HEAD at logical state of `60f07bc` (R3 baseline).
- **Task RR2**: Restructure render frame to WB-faithful order. Sub-steps:
- Move `cameraReallyInside` computation up next to `cameraInsideCell` (~line 7011-7014).
- Gate the initial terrain draw (line 7115) on `!cameraReallyInside`.
- Delete the depth-clear-if-inside block entirely.
- Decide on `cameraInsideCell` vs `cameraReallyInside` unification (per Phase 1 brainstorm Q4).
- Inside branch: keep existing structure (MarkAndPunch → IndoorPass → EnableOutdoorPass → terrain → OutdoorScenery → DisableStencil → LiveDynamic).
- **Task RR3 (optional)**: Add stencil-gated sky pass for sky-through-windows (per Phase 1 brainstorm Q2). Or defer as #105.
- **Task RR4**: Visual verification matrix (same as R4: cottage interior, cellar, inn, dungeon; PLUS exit transition, entry transition, sky-through-windows).
- **Task RR5**: Ship docs (R5 from original plan; file the genuine follow-ups; close #78).
GL integration tasks are visual-verification-only by nature (the partition logic + EntitySet are already unit-tested). Don't burn cycles writing unit tests for GL state — the existing infrastructure tests (26 dispatcher + 5 stencil + 2 PortalPolygons + 1 ProbeVisibility = 34) already lock the non-GL bits.
### Phase 3: EXECUTE-PLAN (use `superpowers:subagent-driven-development`)
Same pattern as this session: fresh Sonnet subagent per task, two-stage review (spec compliance + code quality). The CRITICAL extra review check beyond default — **add to the spec reviewer prompt**: *"Does the implementation match WB's RenderInsideOut order at `references/WorldBuilder/Chorizite.OpenGLSDLBackend/Lib/VisibilityManager.cs:73-239`? Specifically: NO initial terrain draw when inside, NO depth-clear, terrain rendered ONLY stencil-gated?"*
---
## Pickup prompt for next session
```
Phase A8 — render frame restructure to match WB's RenderInsideOut order
faithfully. R1+R2+R3+R3.5 v1+v2 shipped this session (commits ed72704 →
2bfeafd). Primary #78 fix works (cottage interior solid walls). Three
transition/sky issues remain that resist symptom patching.
Read first (in this order — REQUIRED):
1. docs/research/2026-05-26-a8-r3.5-restructure-handoff.md (this doc — full
story of why we paused; the architectural mismatch; recommended path)
2. docs/research/2026-05-26-a8-entity-taxonomy.md (approved fix-shape)
3. docs/research/2026-05-26-a8-revert-handoff.md (predecessor; the original
A8 attempt's revert lessons — still applies)
4. docs/superpowers/plans/2026-05-26-phase-a8-replan.md (this session's
plan — R1+R2+R3 still apply; R3.5 patches and the WB-faithful
restructure are NEW work)
5. references/WorldBuilder/Chorizite.OpenGLSDLBackend/Lib/VisibilityManager.cs:73-239
(the proven reference — read it verbatim BEFORE designing the restructure)
6. CLAUDE.md — find "currently working toward" to refresh state
State both altitudes:
Currently working toward: M1.5 — Indoor world feels right
Current phase: A8 — render frame restructure to WB-faithful order
HEAD: 2bfeafd (R3.5 v2)
Clean revert points: 60f07bc (R3 baseline) or 55f26f2 (R2)
Test baseline: build green; 1238 pass / 14 fail (documented flaky window)
Session flow — MUST use full superpowers workflow:
### Phase 1 — BRAINSTORM (use superpowers:brainstorming)
Settle the design. Do NOT skip this. The previous session jumped to
patching after R3 and that produced this handoff. Five questions in the
recommended-next-session-approach section of this handoff doc must be
answered before any code is written.
Brainstorm output: a short design note in chat + an updated entry in the
entity-taxonomy doc OR a fresh design doc. Get user approval before
Phase 2.
### Phase 2 — WRITE-PLAN (use superpowers:writing-plans)
Expected: tasks RR1 (revert R3.5), RR2 (restructure render frame), RR3
(optional sky-through-windows), RR4 (visual verification), RR5 (ship
docs). Plan path: docs/superpowers/plans/2026-05-2X-phase-a8-restructure.md
(date when written).
### Phase 3 — EXECUTE (use superpowers:subagent-driven-development)
Fresh Sonnet subagent per task with two-stage review. Add the WB-order
check to the spec reviewer prompt (see handoff doc).
## Constraints
- Per CLAUDE.md "no workarounds without approval" — fix the root cause.
The R3.5 v1+v2 patches were symptom fixes. Do not repeat that pattern.
- Visual verification is the acceptance test. Test scenarios in the
handoff's "What's NOT working" section MUST all be re-tested.
- Existing infrastructure (Tasks 1-6 + R1 + R2 + R3) is correct and
shipped. The restructure is a render-frame surgery, not a partition
reshape or data-layer change.
## What success looks like
After this restructure ships:
- Standing INSIDE cottage / cellar / inn / dungeon: solid walls
(unchanged from this session's R3 win).
- EXITING indoor → outdoor: clean transition. No "objects through
ground." No "buildings missing." Brief lighting transition is OK if
sky-on-cameraInsideCell is kept, otherwise no lighting transition.
- ENTERING outdoor → indoor: clean transition. No floor-transparent
showing cellar. If the floor-cellar-z-fight is pre-existing on
main, file as a separate issue and accept it as not-A8-scope.
- LOOKING THROUGH WINDOWS from inside: terrain visible at the
portal silhouette. Sky visible (if RR3 included) OR fog color (if
RR3 deferred and noted in #105).
- dotnet build green; test failures within the documented 14-23
flaky window.
```
---
## Files state at session end
```
Branch: claude/strange-albattani-3fc83c
HEAD: 2bfeafd fix(render): Phase A8 R3.5 v2 — gate depth-clear on cameraReallyInside too
Parent: 38d5374 fix(render): Phase A8 R3.5 — gate stencil branch on PointInCell containment
GP: 60f07bc feat(render): Phase A8 R3 — wire stencil pipeline into render frame (WB order)
GGP: 55f26f2 feat(render): Phase A8 R2 — WbDrawDispatcher.EntitySet taxonomy partition
GGGP: ed72704 feat(world): Phase A8 R1 — tag WorldEntity.IsBuildingShell at LandblockLoader
Working tree: clean
Build: green (0 warnings, 0 errors)
Tests: 1238 pass / 14 fail (all within documented 14-23 flaky window;
zero new failures attributable to A8 R1/R2/R3/R3.5)
Untracked log files: launch-a8-verify*.log (deletable)
```
The five commits are all NEW additions to main; no destructive history rewrites. Next session can:
- Continue from HEAD with the restructure layered on top (R3.5 patches subsumed by it).
- OR `git revert 38d5374 2bfeafd` for a cleaner diff against R3 baseline.
Either path is valid — pick whichever the brainstorm settles on.

View file

@ -0,0 +1,447 @@
# Phase A8 — Indoor-cell visibility culling — REVERTED. Handoff for re-plan.
**Date:** 2026-05-26 (session began 2026-05-25 PM, continued into 2026-05-26)
**Status:** Task 7 integration REVERTED after three rounds of visual verification surfaced cascading bugs. Infrastructure (Tasks 1-6) RETAINED — all dead-but-correct code, ready to be re-integrated under a different design.
**Branch:** `claude/strange-albattani-3fc83c` (worktree)
**Predecessor handoff:** [docs/research/2026-05-25-issue-100-shipped-and-culling-handoff.md](2026-05-25-issue-100-shipped-and-culling-handoff.md)
**Original plan:** [docs/superpowers/plans/2026-05-25-phase-a8-indoor-cell-visibility-culling.md](../superpowers/plans/2026-05-25-phase-a8-indoor-cell-visibility-culling.md)
**Investigation report (Phase 1):** [docs/research/2026-05-25-issue-78-visibility-culling-investigation.md](2026-05-25-issue-78-visibility-culling-investigation.md)
---
## TL;DR
We tried to close issue #78 (outdoor stabs visible through inn floor/walls) and the cellar-stairs grass-overlay artifact by porting WorldBuilder's stencil-based `RenderInsideOut` pipeline. The plan executed cleanly through 7 tasks (1029 lines of plan, 11 commits including hardening + fixes), and the H1 hypothesis from the investigation was correct — the cellar-stairs artifact IS a culling problem, NOT a Z-fight problem, confirmed by the camera-rotation falsification test.
But the WB stencil approach has **architectural assumptions that don't match acdream's data model**, and three rounds of visual verification surfaced compounding bugs that aren't fixable by patching:
1. **Round 1** (commit `41c2e67` initial integration) — character disappeared indoors. Root cause: player/NPCs have `ParentCellId == null`, got classified as outdoor scenery, stencil-gated to portal silhouettes only.
2. **Round 2** (commit `a2ad5c1` animated-entity fix) — character now visible, but closed doors leaked outside, walls between rooms showed far-side portal openings, character body bled to terrain where it overlapped a portal silhouette on screen.
3. **Round 3** (commit `b76f6d1` order swap — Mark+Punch BEFORE indoor draw, matching WB's actual order) — closed doors now correctly blocked, BUT cottage walls completely disappeared, character rendered head-inside-out, see-through everything. Root cause: cottage walls are **landblock-baked stabs** (`LandBlockInfo.Objects`) with `ParentCellId == null`, classified as outdoor scenery, stencil-gated → visible only at portal silhouettes (windows/doors).
The integration commits `41c2e67`, `a2ad5c1`, `b76f6d1` are now reverted by `fef6c61`, `96f8bd2`, `c897a17`. Tasks 1-6 (infrastructure: `PortalPolygons` field, `RenderingDiagnostics.ProbeVisibilityEnabled`, portal_stencil shaders, `IndoorCellStencilPipeline`, `PortalMeshBuilder`, `WbDrawDispatcher.EntitySet` enum) remain committed and tested. They're dormant — nothing in the runtime invokes them — but they're correct, tested, and ready for a different integration design.
**Current HEAD: `fef6c61`** — render frame back to pre-A8 behavior (terrain → depth-clear-if-inside → dispatcher with all entities). Build green, all infrastructure tests passing (26 dispatcher + 5 stencil-pipeline + 2 PortalPolygons data-class + 1 ProbeVisibilityEnabled toggle).
The next session needs to **re-investigate the entity taxonomy** before re-planning the integration. The plan's binary `IndoorOnly vs OutdoorOnly` partition is wrong; AC's data model has at least four distinct entity classes that need different treatment.
---
## What was tried (chronological)
### Phase 1: Investigation (REPORT-ONLY, before any code)
Dispatched four parallel research agents:
1. Retail decomp visibility chain (`PView::DrawCells`, `RenderInsideOut`, `CEnvCell::find_visible_child_cell`)
2. WorldBuilder `VisibilityManager.RenderInsideOut` reference implementation
3. acdream's existing visibility code (`CellVisibility`, `WbDrawDispatcher`, `TerrainModernRenderer`, render frame integration points)
4. ISSUES.md context for #78, #95, and the lighting family
Findings consolidated in [`docs/research/2026-05-25-issue-78-visibility-culling-investigation.md`](2026-05-25-issue-78-visibility-culling-investigation.md). Two main outputs:
- Confirmed retail and WB use different mechanisms (retail = screen-space polygon-clip scissor, WB = stencil mask), but achieve the same observable behavior. WB's stencil approach is the right fit for acdream's modern GL pipeline.
- Three approach options sketched: A (WB stencil port), B (retail polygon-clip — multi-week), C (binary gate — workaround).
User chose **Approach A** (WB stencil port).
### Phase 1a: Falsification test (visual)
User stood in a Holtburg cottage cellar at the artifact spot and rotated the camera in place. Reported: **no flickering around the edges.** This confirmed H1 (culling) over H2 (Z-fight). The artifact IS a rendered polygon that needs to be culled, not a depth-precision issue.
### Phase 2: Plan written
[`docs/superpowers/plans/2026-05-25-phase-a8-indoor-cell-visibility-culling.md`](../superpowers/plans/2026-05-25-phase-a8-indoor-cell-visibility-culling.md). 8 tasks, TDD-shaped where unit-testable. Architecture: split entities into `IndoorOnly` (`ParentCellId.HasValue`) and `OutdoorOnly` (`ParentCellId == null`); stencil-mark current building's exit portals; gate terrain + outdoor entities by `glStencilFunc(Equal, 1, 0x01)`.
### Phase 3: Subagent-driven execution
Tasks 1-7 implemented by Sonnet subagents, each with two-stage review (spec-compliance + code-quality). Task 4 was sent back once for over-engineering (the implementer added speculative `pos.w` clamp and `out FragColor` declarations not in the spec; subtractively reverted in commit `344034b`). Task 5 received a hardening pass (`a1c393e`) for explicit `Enable(DepthTest)`, `readonly` fields, and an `AllocateVbo` comment.
All 7 implementation tasks shipped clean. Built green, ~36 unit tests added across tasks, all passing.
### Phase 4: Visual verification — three rounds, three regressions
**Round 1 — commit `41c2e67` (initial integration)**
User scenarios:
- Cellar stairs: not visible from outside-to-in (but this turned out to be a NOT-A8 artifact — separate)
- Inn walls: solid (no see-through buildings) ✅
- Character: **DISAPPEARED inside cottages**
- Character at doorway: only parts of body visible, **head rendered backwards**
- Flickering on enter/exit
Diagnosis: animated entities (player, NPCs) have `ParentCellId == null` (server-spawned, not statically tied to a cell). EntitySet partition classified them as OutdoorOnly, so the stencil-gated outdoor pass only let them render where stencil bit 1 was set (= portal silhouettes). Walking around inside, character body crossed in and out of portal silhouettes → partial body visible briefly at doorways, head-on-backwards artifacts where stencil clipped one part of body but not another, fully invisible most of the time.
**Round 2 — commit `a2ad5c1` (animated-entity fix)**
Fix: `animatedEntityIds` overrides the partition. Animated entities go into `IndoorOnly` (stencil OFF), excluded from `OutdoorOnly`.
User scenarios:
- Character: **VISIBLE**
- Closed door: **OUTSIDE STILL VISIBLE through closed door**
- Door from adjacent room: **VISIBLE THROUGH WALL** between rooms ❌
- Character at door opening overlap: **outside bleeds through character body** where body covers the portal silhouette on screen ❌
Diagnosis: my plan had the GL state order WRONG. I had `IndoorOnly draw → MarkAndPunch → terrain stencil-gated`. The `MarkAndPunch` step writes `gl_FragDepth = 1.0` at all stencil-1 pixels, destroying any indoor depth that was just written there. Then terrain at 0.99 wins every depth test at portal-silhouette pixels. WB's actual order is `MarkAndPunch FIRST → indoor cells → terrain stencil-gated`. With WB's order, indoor cells write depth AFTER the punch, so their depth survives and correctly occludes the subsequent stencil-gated terrain pass.
**Round 3 — commit `b76f6d1` (order swap to match WB)**
Fix: swap `IndoorOnly draw` and `MarkAndPunch` so MarkAndPunch runs first.
User scenarios:
- Closed door: **NOW BLOCKS OUTSIDE**
- Door from adjacent room through wall: **STILL VISIBLE** ❌ (worse than expected)
- Character at door: **TOTALLY BROKEN** — character rendered head-inside-out, see-through to distant outdoor objects through where walls should be ❌
Screenshot evidence: user stood on what appeared to be the upper floor of a Holtburg cottage. Visible in the frame: wood stairs/floor (indoor cell mesh), player character in armor, and a small window-shaped opening showing outdoor terrain (correct portal behavior). Beyond that: GREY expanse (clear color) with NPCs and decorations floating in space (= distant outdoor entities visible THROUGH where walls should be).
Diagnosis (the showstopper): **cottage walls are landblock-baked stabs** stored in `LandBlockInfo.Objects`, NOT in the EnvCell's mesh or `StaticObjects`. They're `WorldEntity` instances with `ParentCellId == null`. The EntitySet partition treats them as outdoor scenery and stencil-gates them. Result: cottage interior walls only render at portal silhouettes — i.e., framed in the window openings. The rest of the wall area is just the cleared framebuffer (grey), with distant entities (which DO render unconditionally because they happen to be in screen positions not occluded by walls that don't exist) bleeding through.
The head-inside-out artifact is a cascade — with the depth buffer state and framebuffer being so broken (walls absent, terrain stencil-gated in unexpected places, depth punched then partially overwritten by terrain), the character mesh rendering interacts with these broken depths in ways producing the impossible-anatomy effect. I don't have a single-call explanation; it's "the depth + stencil state is so far from sane that character vertex shader + fragment depth tests produce nonsense."
### Phase 5: REVERT
Decision: continuing to patch was going to keep surfacing edge cases. The fundamental issue (EntitySet partition by `ParentCellId.HasValue` is wrong) requires re-design, not patching.
Three revert commits:
- `c897a17` reverts `b76f6d1` (order swap)
- `96f8bd2` reverts `a2ad5c1` (animated-entity fix)
- `fef6c61` reverts `41c2e67` (Task 7 integration)
After reverts: HEAD = `fef6c61`. GameWindow.cs render frame is back to pre-A8 (terrain → depth-clear-if-inside → dispatcher with all entities). Build green. All infrastructure tests passing.
---
## What was kept (the infrastructure)
Six commits NOT reverted. All internally consistent, all tested, all dormant (nothing invokes them at runtime):
| Commit | What it adds | Status |
|---|---|---|
| `fee878f` | `LoadedCell.PortalPolygons: List<Vector3[]>` field | dormant, tested |
| `d834188` | `BuildLoadedCell` populates `PortalPolygons` from `cellStruct.Polygons[portal.PolygonId].VertexIds` | dormant (data populated, nothing reads it) |
| `6577c0a` | `RenderingDiagnostics.ProbeVisibilityEnabled` flag + DebugVM mirror | dormant (no probe code uses it) |
| `2d31d49``344034b``f3d7b13` | `portal_stencil.vert/.frag` shader pair | dormant (no code loads them) |
| `3973596``a1c393e` | `IndoorCellStencilPipeline` class + `PortalMeshBuilder` static helper, with hardening | dormant, 5 unit tests pass |
| `dcf69a1` | `WbDrawDispatcher.EntitySet { All, IndoorOnly, OutdoorOnly }` enum + `set` parameter on `Draw` + `WalkEntitiesForTest` helper | dormant (`Draw` always called with default `EntitySet.All`), 26 dispatcher tests pass |
These are all correct and useful. They don't need to be re-shipped in the re-plan — they're ready for a new integration to consume.
**However**, the re-plan may want to reshape some of them:
- `EntitySet` enum's binary `IndoorOnly/OutdoorOnly` partition is the load-bearing wrong assumption. The re-plan likely needs more partition values (e.g. `IndoorStatic`, `BuildingShell`, `OutdoorScenery`, `LiveDynamic`) or a different mechanism entirely. The enum can be extended or replaced.
- `IndoorCellStencilPipeline` is correct as a primitive but its current usage assumption ("mark exit portals, gate outdoor passes") may need refinement. For example, it might want a "draw building-shell stabs unconditionally THEN stencil-gate outdoor scenery" split.
---
## Root cause taxonomy (the architectural lesson)
acdream's `WorldEntity` data model has more entity classes than the plan accounted for. The classes encountered:
| Class | `ParentCellId` | Source | Examples | Stencil treatment needed |
|---|---|---|---|---|
| **Cell mesh** | set | `EnvCell` geometry | inn walls, dungeon corridor walls, cellar floor | Always render (unconditional) |
| **Cell statics** | set | `EnvCell.StaticObjects` | inn furniture, dungeon braziers | Always render (unconditional) |
| **Building shell stab** | **null** | `LandBlockInfo.Objects` | cottage walls/roof, smithy walls | Always render WHEN camera is inside the building |
| **Outdoor scenery stab** | null | `LandBlockInfo.Objects` | trees, fences, lampposts, rocks, hitching posts | Stencil-gate (only visible through portals from inside) |
| **Live animated** | null | server `CreateObject` + in `animatedEntityIds` | player, NPCs, monsters, doors mid-animation | Always render (unconditional) |
| **Live static** | null | server `CreateObject`, NOT animated | dropped items, sigils, idle doors after animation ends | Probably always render? Hard to say |
The plan's binary `IndoorOnly = HasValue, OutdoorOnly = !HasValue` partition lumps "building shell stab" with "outdoor scenery stab" — both have null `ParentCellId`. But they need OPPOSITE stencil treatment.
**The 3rd-round disaster came from this conflation specifically.** When camera is inside a cottage, the cottage's walls (building shell stab) need to render UNCONDITIONALLY (just like cell mesh would). My plan classified them with the trees and lampposts → stencil-gated → invisible.
WB handles this via a "building" concept: `BuildingPortalGPU` tracks which `EnvCellIds` belong to each building, and the building's portal mesh + occlusion is treated separately from generic scenery. acdream doesn't have this concept — all landblock stabs go into the same `WorldEntity` pool with no "is-building-shell" flag.
### Why Tasks 1-6 review missed this
The spec / code review focused on:
- Spec compliance (did the implementation match the spec?)
- Code quality (well-structured, clean, etc.)
Neither addressed: **is the spec's architectural premise correct?** The plan stated the partition as a binary based on `ParentCellId`, the reviewers verified the implementation followed that, but no one questioned whether the premise was right. Investigation (Phase 1) didn't catch it either — the audit focused on the EXISTING code paths and didn't go deep on the `WorldEntity` lifecycle / classification.
This is the kind of issue where the plan's "self-review" step + investigation's "what we've ruled out" section should have included an entity-taxonomy audit. Future plans for rendering-pipeline changes should include: "List every kind of `WorldEntity` and what classification it gets, then verify the pipeline treats each correctly."
### A second architectural issue (deferred but real)
Even with the cottage-walls case solved, the WB stencil approach has a known limitation that the predecessor handoff already flagged: **all exit portals in `VisibleCellIds` get marked**, including portals on cells far from the camera. From inside a cottage, if the camera looks at a wall, the portals BEHIND the wall (on the other side of the room) ARE marked in stencil (their silhouettes project to screen positions covered by the wall). Then far-depth is punched at those positions. Then terrain stencil-gated wins over indoor wall depth → "outdoor visible through window on the other side of the room behind a wall."
In Round 2 testing, this surfaced as "I can see the door of the adjacent room through the wall." It's a real geometric over-marking issue.
WB handles it with a 3-stencil-bit pipeline ("Step 5" in WB's RenderInsideOut). My plan explicitly DEFERRED Step 5. With Round 2's order, the issue was masked because indoor wall depth was being destroyed by the late MarkAndPunch anyway, so the punch's far-depth happened to coincide with the bug. With Round 3's order, the punch happens before walls draw, so walls correctly write depth — but now the far-side-portal issue is unmasked.
The re-plan needs to address Step 5 OR accept it as a documented limitation OR find a different mechanism (camera-frustum portal filtering, occlusion query for portals behind walls, etc.).
---
## Things the re-plan must consider (the "do-not-miss" list)
1. **Building shell stabs are NOT outdoor scenery.** They have `ParentCellId == null` but must render unconditionally when the camera is inside the building. The fix is one of:
- (a) **AABB-encloses-camera heuristic**: when an entity's `[AabbMin, AabbMax]` contains `cameraPos`, treat it as building shell. Quick to implement, ~30 min. Works for cottages and inns. Edge cases: very tall buildings with low camera, or buildings the camera isn't quite inside.
- (b) **Tag building stabs at hydration time**: when reading `LandBlockInfo.Objects`, identify objects that have associated `EnvCellIds` (i.e., they're the building shell of those cells). Add `WorldEntity.IsBuildingShell: bool` (or similar). Correct, but requires understanding LandBlockInfo's structure.
- (c) **WB-style building concept**: full `BuildingPortalGPU.EnvCellIds` model. Heavy lift; probably overkill for first ship.
- **Recommended: (a) for first ship, (b) as a follow-up if the heuristic has misses.**
2. **Live entities (player, NPCs, dropped items) need a "always render" path.** Today's `animatedEntityIds` covers the animated subset. Dropped items / idle doors are NOT animated but ARE live. The cleanest model: add `WorldEntity.IsLiveDynamic` flag set at hydration when the entity has a `ServerGuid` (vs landblock-baked). All live entities skip stencil-gating entirely.
3. **GL state order matters: MarkAndPunch BEFORE indoor cells.** Confirmed by Round 3. The far-depth punch must run before indoor geometry draws, so indoor geometry writes depth on top of the 1.0 punch and correctly occludes the subsequent stencil-gated terrain. The Plan had the order wrong; the order-swap (Round 3) is the correct order. Re-plan must reflect this.
4. **Animated/live entities should draw AFTER all stencil work**, with stencil disabled, so their depth never interacts with the punch or the stencil-gated pass. Round 2 showed character body bleeding to terrain when drawn BEFORE the punch (depth destroyed by punch). Drawing them last fixes this naturally.
5. **The "far-side portal visible through wall" problem (WB Step 5)** is real and won't be fixed by the order swap alone. Either implement Step 5 (complex), accept it as a known limitation for first ship, or add a camera-frustum filter on portal triangles (only stencil-mark portals the camera could plausibly see directly).
6. **Cellar-stairs grass artifact from outside-to-in is NOT A8 scope.** This was reported by the user in Round 1 and persisted across all rounds. From outside, no stencil work runs; the artifact is purely terrain-Z-fight against the cellar geometry. The cellar floor is meters below terrain Z; #100's 1cm shader nudge doesn't help. File as a separate issue OR roll into a future "deep-cell terrain occlusion" phase.
7. **Closed doors must block outdoor visibility.** The Round 3 order successfully delivers this — door entities (`ParentCellId == null` but inside the building's AABB) need to draw in the indoor pass AFTER MarkAndPunch, so their depth wins over terrain. Doors actually map cleanly to the building-shell stab solution: a door is functionally part of the building when closed.
8. **The `EntitySet` enum may need refactoring or replacement.** Today it has `All, IndoorOnly, OutdoorOnly`. The taxonomy suggests at least:
- `All` (pre-A8 default)
- `IndoorPass` — cell mesh + cell statics + building shell stabs + live entities (essentially everything that should draw unconditionally when inside)
- `OutdoorPass` — outdoor scenery only, stencil-gated
- `LivePass` (optional) — separate pass for live entities at the very end, no stencil
Or replace the enum with a callback / filter delegate. The current enum is a quick prototype; the production design should reflect the actual taxonomy.
9. **Visual verification scenarios must cover MORE buildings.** Round 1 tested at the inn (which has cell mesh walls); Round 3 tested at a cottage (which has stab walls). Different bugs surfaced. The re-plan's Task 8 must explicitly test: inn, cottage (interior + cellar), dungeon (portal-entry), and at least one mid-size building with multiple rooms. Each likely has a different geometry classification mix.
10. **The flickering on enter/exit** reported across all rounds is unexplained. Likely the `CellSwitchGraceFrameCount = 3` interacting with stencil setup timing — when camera transits a cell boundary, the visibility result toggles between cells over the grace frames, and the stencil mask flips with it. Investigate during re-plan.
---
## Existing apparatus the next session inherits
### Code (committed, dormant)
- **`src/AcDream.App/Rendering/CellVisibility.cs`** — `LoadedCell.PortalPolygons` field populated by `BuildLoadedCell`. The data is ready; nothing reads it.
- **`src/AcDream.App/Rendering/IndoorCellStencilPipeline.cs`** — `PortalMeshBuilder.BuildTriangles(...)` (pure-math, tested) + `IndoorCellStencilPipeline` (GL class, untested at runtime but the GL state machine has been reviewed twice). The `MarkAndPunch` GL sequence is correct per WB; the cleanup state is correct for either pre- or post-indoor-draw scheduling. Re-usable as-is.
- **`src/AcDream.App/Rendering/Shaders/portal_stencil.vert/.frag`** — minimal MVP + `gl_FragDepth = 1.0` writer. Re-usable.
- **`src/AcDream.App/Rendering/Wb/WbDrawDispatcher.cs`** — `EntitySet` enum + `WalkEntitiesForTest` helper + partition logic in `WalkEntitiesInto`. **The current partition is the load-bearing wrong assumption.** Re-plan likely modifies this.
- **`src/AcDream.Core/Rendering/RenderingDiagnostics.cs`** — `ProbeVisibilityEnabled` flag. Re-usable.
### Tests (passing)
- `tests/AcDream.App.Tests/Rendering/CellVisibilityPortalPolygonsTests.cs` — 2 tests, data-class invariants
- `tests/AcDream.App.Tests/Rendering/IndoorCellStencilPipelineTests.cs` — 5 tests, triangle-fan math
- `tests/AcDream.Core.Tests/Rendering/Wb/WbDrawDispatcherEntitySetTests.cs` — 3 tests, EntitySet partition
- `tests/AcDream.Core.Tests/Rendering/RenderingDiagnosticsVisibilityTests.cs` — 1 test, flag toggle
### Documents
- The original plan: [`docs/superpowers/plans/2026-05-25-phase-a8-indoor-cell-visibility-culling.md`](../superpowers/plans/2026-05-25-phase-a8-indoor-cell-visibility-culling.md) — read for Task 1-6 implementation reference; **do NOT re-execute Task 7** as written.
- The investigation report: [`docs/research/2026-05-25-issue-78-visibility-culling-investigation.md`](2026-05-25-issue-78-visibility-culling-investigation.md) — the H1 confirmation + WB/retail/acdream code anchors still apply.
- The original handoff: [`docs/research/2026-05-25-issue-100-shipped-and-culling-handoff.md`](2026-05-25-issue-100-shipped-and-culling-handoff.md) — the family map (#78 + cellar-stairs + #95) is unchanged.
### Reference anchors (still valid)
- **WB stencil:** `references/WorldBuilder/Chorizite.OpenGLSDLBackend/Lib/VisibilityManager.cs:73-239` (RenderInsideOut). **Note: WB's order is MarkAndPunch FIRST, then indoor cells — confirmed by Round 3.**
- **WB building concept:** `references/WorldBuilder/Chorizite.OpenGLSDLBackend/Lib/PortalRenderManager.cs``BuildingPortalGPU.EnvCellIds` is the "this stab belongs to a building" association we're missing.
- **Retail:** `acclient_2013_pseudo_c.txt:432709` (`PView::DrawCells`, `outside_view.view_count > 0` gate). Polygon-clip scissor, not stencil — equivalent observable behavior.
### Issue state
- **#78** — still OPEN. Not fixed by A8 attempt.
- **Cellar-stairs artifact (NEW evidence for #78)** — still happening from outside-to-in (NOT A8 scope) AND from inside (was A8 scope; not fixed).
- **#95** (portal-graph blowup at network hubs) — out of scope, separate work.
- **#79/#80/#81/#93/#94** (indoor lighting family) — unchanged.
---
## Pickup prompt for the next session
```
Phase A8 — Indoor-cell visibility culling — RE-PLAN after revert.
Read first (in this order — REQUIRED):
1. docs/research/2026-05-26-a8-revert-handoff.md (this doc — full
story of the 3-round visual verification failure + reverts)
2. docs/superpowers/plans/2026-05-25-phase-a8-indoor-cell-visibility-culling.md
(original plan — reference Tasks 1-6 implementation; do NOT
re-execute Task 7 as written)
3. docs/research/2026-05-25-issue-78-visibility-culling-investigation.md
(original investigation; H1 culling diagnosis is confirmed)
4. CLAUDE.md — find "currently working toward" to refresh state
State both altitudes:
Currently working toward: M1.5 — Indoor world feels right
Current phase: A8 — Indoor-cell visibility culling RE-PLAN
Previous attempt at HEAD: fef6c61 (reverts of 41c2e67, a2ad5c1, b76f6d1)
Infrastructure preserved: Tasks 1-6 commits (fee878f → dcf69a1 + a1c393e)
Test baseline: build green; 36 A8-infrastructure tests pass dormant
Session flow:
### Phase 1 — RE-INVESTIGATE the entity taxonomy (USE /investigate skill)
DO NOT skip to planning. The original plan's binary IndoorOnly/OutdoorOnly
partition was the load-bearing wrong assumption. Before any new plan:
a. Read src/AcDream.App/Rendering/GameWindow.cs around the entity
hydration paths (BuildInteriorEntitiesForStreaming around line
5409+, and the LandBlockInfo.Objects iteration). Document every
code path that constructs a WorldEntity and what ParentCellId it
gets.
b. Enumerate the actual entity classes that exist in acdream's runtime:
- Cell mesh (ParentCellId set, from EnvCell)
- Cell statics (ParentCellId set, from EnvCell.StaticObjects)
- Building shell stab (ParentCellId == null, from LandBlockInfo.Objects,
represents inn walls / cottage walls / etc)
- Outdoor scenery stab (ParentCellId == null, from LandBlockInfo.Objects,
represents trees / fences / lampposts)
- Live animated (ParentCellId == null, server-spawned, in
animatedEntityIds — player, NPCs, monsters, mid-animation doors)
- Live static (ParentCellId == null, server-spawned, NOT animated —
dropped items, idle doors after animation ends, sigils)
c. For each class, determine: how can the renderer distinguish it from
the other null-ParentCellId classes? Today only animatedEntityIds
separates one class. The re-plan needs distinguishers for the others.
Options:
- WorldEntity.IsBuildingShell (set at LandBlockInfo hydration)
- WorldEntity.IsLiveDynamic (set when ServerGuid != 0)
- AABB-encloses-camera heuristic (runtime, no new field)
- WB-style building association (per-cell building registry)
Spike each option's cost + correctness.
d. Read WB's reference to confirm how WB handles each class.
references/WorldBuilder/Chorizite.OpenGLSDLBackend/Lib/PortalRenderManager.cs
has the BuildingPortalGPU.EnvCellIds association we're missing.
references/WorldBuilder/Chorizite.OpenGLSDLBackend/Lib/StaticObjectRenderManager.cs
might show how outdoor scenery is treated separately from buildings.
e. Decide the entity-distinguisher approach. The cheapest option that
handles cottages + inns + dungeons is likely AABB-encloses-camera
for building shells, animatedEntityIds for live animated, and
accept "dropped items invisible from inside" as a known limitation
for first ship (defer to a follow-up task with a real IsLiveDynamic
flag).
f. Re-confirm the GL state order: MarkAndPunch FIRST, then indoor
cells (including building shells + live animated), then terrain
stencil-gated, then outdoor scenery stencil-gated. Confirmed by
Round 3 of the original A8 attempt.
g. Decide whether to address the "far-side portal visible through wall"
issue (WB Step 5 territory) in this phase or defer. The simplest
ship-now approximation: only stencil-mark portals on the CAMERA'S
OWN CELL (not BFS-extended VisibleCellIds). This restricts stencil
to portals directly adjacent to the camera. Loses cross-cell-portal
visibility (probably acceptable for first ship).
Phase 1 output: a short report (<400 tokens chat, full doc to
docs/research/YYYY-MM-DD-a8-entity-taxonomy.md). Plus a fix-shape
sketch covering all 6 entity classes. Get user approval before Phase 2.
### Phase 2 — Re-PLAN Task 7 (USE superpowers:writing-plans skill)
The re-plan replaces Task 7 (and may reshape `EntitySet` enum semantics
or add new partition values). Expected shape:
- Task R1: Add the entity distinguisher (e.g. AABB-encloses-camera
helper on WbDrawDispatcher, or new WorldEntity.IsBuildingShell field
if going that route).
- Task R2: Update WbDrawDispatcher.EntitySet partition to use the
distinguisher. May rename enum values to reflect new taxonomy.
Update unit tests.
- Task R3: Add a third dispatcher call for live entities AFTER the
stencil work. Either new EntitySet value or a flag parameter.
- Task R4: Re-wire GameWindow render frame with MarkAndPunch FIRST
order. Three (or four) dispatcher calls when inside:
1. Indoor pass (cell mesh + cell statics + building shell stabs)
2. MarkAndPunch
3. Terrain stencil-gated
4. Outdoor scenery pass (stencil-gated)
5. Live entity pass (no stencil, AFTER everything)
- Task R5: Visual verification — MUST test at cottage interior + cellar,
inn interior, AND a dungeon (portal-entry). Each likely surfaces
different bugs.
- Task R6: Ship docs (close #78, update CLAUDE.md A8 paragraph) —
only if all three visual scenarios pass clean.
The infrastructure from Tasks 1-6 is ready. The re-plan only needs to
ship the new integration. Keep tasks small (TDD-shaped where possible);
the GL integration tasks are visual-verification-only by nature.
### Phase 3 — Implement (USE superpowers:subagent-driven-development)
Same pattern as original. Fresh Sonnet subagent per task with two-stage
review. CRITICAL: spec reviewer must ALSO check "does the spec's
architectural premise match the actual entity taxonomy?" Don't repeat
the original's mistake of reviewing implementation without questioning
the spec's foundational assumption.
## Constraints
- Per CLAUDE.md "no workarounds" rule — fix the root cause, do not
patch symptom sites. The re-plan IS the root-cause fix for the
taxonomy issue; Round 1-3 patches were band-aids that didn't address
the underlying classification gap.
- Visual verification is the acceptance test. Test at AT LEAST THREE
building types (cottage, inn, dungeon) before declaring success.
- The cellar-stairs grass artifact FROM OUTSIDE is NOT A8 scope (no
stencil work happens when camera is outside). File as a separate
issue if not already filed, with a note that it's a deep-cell
terrain Z-fight (not solvable by #100's 1cm nudge).
- The "far-side portal visible through wall" issue may be addressed
in this phase or deferred to A8.P2. Decide explicitly during Phase 1.
- DON'T re-revert the infrastructure. Tasks 1-6 commits are kept
intentionally; the re-plan consumes them. The only thing being
re-shipped is the integration design.
## What success looks like
After this re-plan ships:
- Standing inside a Holtburg cottage (any room), all walls are
SOLID — no see-through to outdoor objects, no see-through to
adjacent rooms.
- Standing inside Holtburg Inn, same. No outdoor stabs through
walls/floor (#78's primary acceptance).
- Standing in cottage cellar, no grass overlay on stair geometry
(the cellar-stairs in-to-out half of the artifact; the
out-to-in half is separate).
- Player character + NPCs are FULLY VISIBLE indoors at all camera
angles. No partial body, no head-backwards, no flickering on
enter/exit (or document any residual flickering as a known
issue).
- Closed doors BLOCK outdoor visibility. Open doors SHOW outdoor
through the opening, occluded properly by surrounding wall.
- No regression on issue #100 (no transparent rectangles around
cottages).
- dotnet build green; dotnet test failures within the documented
14-23 flaky window.
## Reference repo hierarchy reminder
Per CLAUDE.md "Reference repos: cross-check the relevant ones" —
for the entity taxonomy / building shell question:
- WB's PortalRenderManager + StaticObjectRenderManager (how WB
splits buildings from outdoor scenery)
- WB's VisibilityManager (the proven stencil pipeline with the
correct GL state order)
- Retail decomp for CLandBlock::init_buildings (the data-model
source — how retail tags building objects vs. scenery)
- ACE (server) has minimal coverage here — buildings are
client-side decoration
Cross-reference WB + retail. The acdream-specific question is HOW
acdream's WorldEntity model can express the building-vs-scenery
distinction.
```
---
## Files state at session end
```
Branch: claude/strange-albattani-3fc83c
HEAD: fef6c61 Revert "feat(render): Phase A8 — wire stencil pipeline into render frame"
Parent: 96f8bd2 Revert "fix(render): Phase A8 — animated entities exempt from stencil-gated outdoor pass"
Grandparent: c897a17 Revert "fix(render): Phase A8 — mark-and-punch BEFORE indoor draw (correct WB order)"
Before reverts: b76f6d1 fix(render): Phase A8 — mark-and-punch BEFORE indoor draw
Infrastructure base: dcf69a1 → a1c393e → 3973596 → 344034b/f3d7b13 → 2d31d49 → 6577c0a → d834188 → fee878f
Working tree: clean
Build: green (0 warnings, 0 errors)
Tests: 26 WbDrawDispatcher + 5 IndoorCellStencilPipeline + 2 PortalPolygons + 1 ProbeVisibility = 34 A8 infrastructure tests passing
Untracked: launch-a8-verify*.log (session logs, can be deleted)
```
The reverts are NEW commits (not destructive history rewrites — original commits remain in history for evidence). The re-plan can `git log b76f6d1..fef6c61` to see exactly what was reverted, or `git diff dcf69a1..fef6c61` to see the net effect on the codebase (should be: only test file is at slightly different state; everything else from Tasks 1-6 is in place).

View file

@ -0,0 +1,69 @@
# A8 RR0 falsification — are Issues A and C pre-existing or A8-caused?
**Date:** 2026-05-26 (PM)
**Method:** three-branch launch + visual repro at Holtburg cottage entry / exit transitions.
## Observations
| Branch | Commit | Issue C (entry transparent floor) | Issue A (exit through-ground / walls missing) | #78 (constant houses-below-terrain visible from inside) |
|---|---|---|---|---|
| HEAD | `2bfeafd` (R3.5 v2) | **YES** | **YES** (varies by building) | (gone — R1+R2+R3 fixed) |
| R3 baseline | `60f07bc` | **YES** | **YES** (same as HEAD) | (gone — R1+R2+R3 fixed) |
| main | `7034be9` | **NO** | **NO flicker** — but DIFFERENT SYMPTOM: houses below terrain visible from inside, constant (not transition) | **YES, constant** |
User screenshots from HEAD captured during the spike:
1. Cottage interior: floor partly see-through to outdoor grass; misplaced textured panel visible
2. Cottage exterior: brown floor/wall panel floating in space; surrounding building walls missing
User quote on main observation:
> "No floor is not transparent."
> "When I now stand in the cottage and look out I can see houses below the terrain. There is no flick when I pass out. They are just visible all the time"
## Diagnosis
Issues A and C are **NOT pre-existing.** They are caused by **R3 (the stencil pipeline wire-in)**:
- R3 successfully closes the original #78 symptom (constant houses-below-terrain visibility from inside) ✓
- R3 introduces two new artifacts as side-effects:
- Issue C — cottage floor transparent showing cellar during entry transition
- Issue A — through-ground objects + walls-missing flicker during exit transition
The R3.5 v1+v2 patches were attempts to mitigate, didn't help (R3 baseline and HEAD show identical A+C symptoms).
## Decision
Per the design's decision gate at RR0-S5:
- [x] **Outcome 2 selected:** Only R3 + HEAD reproduce → A and/or C caused by R3 work specifically. **PAUSE the plan.** Re-brainstorm via `superpowers:brainstorming` to address them; update the design doc; resume.
The original restructure design assumed Issues A and C might be pre-existing and could be filed as separate out-of-A8-scope issues. RR0 invalidates that assumption. The restructure must address them OR accept that A8 trades one bug class for another (which the user has not approved).
## Open questions for the re-brainstorm
1. **Mechanism of Issue C (entry transparent floor):** what about R3's stencil work makes cottage floor transparent during entry? Hypotheses:
- Stencil bit 1 set on portal silhouettes but cleared next frame; during the entry the camera-cell hadn't yet promoted, so VisibleCellIds was empty, so MarkAndPunch had no portals to mark → outdoor pass effectively ungated, terrain re-draw beats indoor cell mesh at the floor pixels.
- Depth-clear-if-inside firing too early or too late, leaving the depth buffer in a bad state.
- The cottage cell's mesh + the cellar cell's mesh both included in IndoorPass at adjacent Z values, Z-fight is fundamental.
2. **Mechanism of Issue A (exit through-ground flicker):** during grace frames after exit, `cameraInsideCell=true` but `cameraReallyInside=false`. Sky skipped, terrain drawn, depth-clear skipped, stencil branch skipped, outdoor Draw(All) runs. Why do entities below terrain win the depth test in these specific frames?
3. **Will the WB-faithful restructure help, hurt, or be neutral on A and C?** The restructure removes the depth-clear and initial-terrain workarounds. During grace frames after exit, it gates terrain on `!cameraInside` (true since cameraInside is strict). So terrain DRAWS unconditionally during grace (because !cameraInside = !false = true → draws). Behavior identical to main during these frames → likely re-introduces #78 main symptom for ~3 grace frames after exit. Trade-off: 3 frames of #78 vs 3 frames of Issue A.
4. **Should we shorten or eliminate the cell-switch grace mechanism?** Currently 3 frames. If 0 frames, the gate flips strict and cleanly at the threshold. PointInCell epsilon (0.01f) provides minimal hysteresis but might be enough.
5. **Is there a third option** between "stencil pipeline gates outdoor visibility" (causes A+C) and "no stencil work" (causes #78)? Possibilities:
- Stencil work but with different cell-set scoping (only camera-cell's portals, not BFS-extended; already in R3).
- Hybrid: stencil-gate outdoor scenery but NOT terrain (let terrain draw unconditionally + accept #78 leak for terrain only).
- Frame-based heuristic: skip stencil for first N frames after entry/exit to mask the transition artifact.
## Logs
- `launch-a8-rr0-head.log` / `launch-a8-rr0-head-take2.log` / `launch-a8-rr0-head-take3.log` — HEAD launches (2bfeafd)
- `launch-a8-rr0-r3.log` — R3 baseline launch (60f07bc, GameWindow.cs single-file checkout)
- `launch-a8-rr0-main.log` — main launch (7034be9, side worktree at .claude/worktrees/tmp-main-baseline with WorldBuilder ref junction)
## Cleanup performed
- Restored HEAD's GameWindow.cs in this worktree (no working-tree changes left)
- Removed Windows junction `tmp-main-baseline/references/WorldBuilder` → strange-albattani-3fc83c
- Removed side worktree `.claude/worktrees/tmp-main-baseline`

View file

@ -0,0 +1,250 @@
# Phase A8 — RR1 cleanup SHIPPED. Handoff for RR2 spike in fresh session.
**Date:** 2026-05-26 (PM, end of session)
**Branch:** `claude/strange-albattani-3fc83c` (worktree at `.claude/worktrees/strange-albattani-3fc83c`)
**HEAD:** `29e306b` (RR1 footer-marks)
**Build:** green (0 errors, 0 warnings in App; 6 warnings in test projects are pre-existing lint)
**Tests:** within documented 14-23 flaky window
---
## TL;DR
This session pivoted Phase A8 from "WB-faithful restructure" to "**full WorldBuilder RenderInsideOut + RenderOutsideIn port**" after RR0 falsification revealed R3+R3.5's bugs were structural (BFS-wide cell rendering), not just workaround-induced.
Shipped this session:
- ✅ RR0 falsification spike (proved Issues A+C are R3-caused, not pre-existing on main)
- ✅ Re-brainstorm with new design doc
- ✅ 12-task implementation plan
- ✅ RR1 cleanup: `[vis]` probe committed; R3+R3.5 v1+v2 reverted; old design+plan footer-marked as superseded
Next: **RR2 spike** — inspect `LandBlockInfo.Buildings` data shape + WB's interior-portal walk algorithm before implementing `BuildingLoader` in RR3.
---
## State altitudes
- **Currently working toward:** M1.5 — Indoor world feels right
- **Current phase:** A8 — full WorldBuilder RenderInsideOut + RenderOutsideIn port
- **Tasks remaining:** RR2 (spike) → RR3-RR11 (impl + visual gates) → RR12 (ship)
- **Estimated remaining:** 8-10 sessions / 1.5-2 weeks calendar
---
## What shipped this session (8 commits)
| SHA | What |
|---|---|
| `f9bab50` | docs(research): RR0 findings — A+C caused by R3, NOT pre-existing on main |
| `ea60d1f` | docs(spec): Full WB RenderInsideOut + RenderOutsideIn port design |
| `651e7e2` | docs(plan): 12-task implementation plan |
| `84c4a70` | diag(render): `[vis]` probe — light up dormant `ProbeVisibilityEnabled` |
| `664ca9c` | Revert R3.5 v2 (`2bfeafd`) |
| `b931038` | Revert R3.5 v1 (`38d5374`) |
| `fd721af` | Revert R3 (`60f07bc`) — with `[vis]` probe preserved through conflict |
| `29e306b` | docs: superseded the prior restructure design + plan |
Kept (NOT reverted): R1 (`ed72704` IsBuildingShell tag) + R2 (`55f26f2` EntitySet partition) + Tasks 1-6 infrastructure (PortalPolygons, IndoorCellStencilPipeline, portal_stencil shaders, ProbeVisibilityEnabled). All consumed as-is by the upcoming work.
---
## Canonical pickup docs (READ FIRST in fresh session)
In order:
1. **[docs/superpowers/specs/2026-05-26-phase-a8-wb-full-port-design.md](../superpowers/specs/2026-05-26-phase-a8-wb-full-port-design.md)** — the approved design. Single source of truth for what's being built.
2. **[docs/superpowers/plans/2026-05-26-phase-a8-wb-full-port.md](../superpowers/plans/2026-05-26-phase-a8-wb-full-port.md)** — 12-task plan. Pickup at RR2.
3. **[docs/research/2026-05-26-a8-rr0-falsification-findings.md](2026-05-26-a8-rr0-falsification-findings.md)** — evidence that triggered the scope expansion.
4. **[references/WorldBuilder/Chorizite.OpenGLSDLBackend/Lib/VisibilityManager.cs:73-330](../../references/WorldBuilder/Chorizite.OpenGLSDLBackend/Lib/VisibilityManager.cs)** — the proven reference (RenderInsideOut Steps 1-5 at 73-239; RenderOutsideIn at 241-330).
5. **[CLAUDE.md](../../CLAUDE.md)** — "Currently working toward" section.
---
## RR2 — the spike
### Goal
Before implementing `BuildingLoader` in RR3, verify (a) what fields `DatReaderWriter.Types.BuildingInfo` exposes (specifically the portal list field name + the per-portal `OtherCellId`); (b) how WB's `PortalRenderManager` actually computes a building's full cell set from BuildingInfo entries (the interior-portal walk algorithm).
The plan's RR3 tests reference `building.Portals` and `BldPortal.OtherCellId` — RR2 confirms or corrects those names.
### Steps (per plan §RR2, 6 sub-steps)
1. **RR2-S1** — Inspect `BuildingInfo` struct shape:
```bash
grep -rn "class BuildingInfo\|struct BuildingInfo\|record BuildingInfo" references/Chorizite.DatReaderWriter/ 2>/dev/null | head -5
```
Or look in NuGet cache: `~/.nuget/packages/chorizite.datreaderwriter/*/lib/*/BuildingInfo.cs`. Also document what `LandblockLoader.cs:74-87` references.
2. **RR2-S2** — Read WB `PortalRenderManager.cs:518-551` (or grep `BuildingPortalGPU` + `EnvCellIds`):
```bash
grep -n "BuildingPortalGPU\|EnvCellIds" references/WorldBuilder/Chorizite.OpenGLSDLBackend/Lib/PortalRenderManager.cs | head -10
```
Document the interior-portal walk algorithm.
3. **RR2-S3** — Live-inspect a Holtburg cottage's BuildingInfo. Add a temporary diagnostic in `src/AcDream.Core/World/LandblockLoader.cs:74-87` (the Buildings loop):
```csharp
Console.WriteLine($"[building-shape] lb=0x{landblockId:X8} idx={i} ModelId=0x{building.ModelId:X8} " +
$"Frame.Origin=({building.Frame.Origin.X:F1},{building.Frame.Origin.Y:F1},{building.Frame.Origin.Z:F1}) " +
$"Portals={building.Portals?.Count ?? 0}");
if (building.Portals is not null)
{
foreach (var p in building.Portals)
Console.WriteLine($"[building-shape] portal -> OtherCellId=0x{p.OtherCellId:X8} " +
$"(remaining fields: {p.GetType().GetProperties().Length} total)");
}
```
Build + launch + walk to Holtburg. Capture log. Then `git checkout HEAD -- src/AcDream.Core/World/LandblockLoader.cs` to revert.
4. **RR2-S4** — Write `docs/research/2026-05-26-a8-buildings-data-shape.md` with 5 sections (per plan).
5. **RR2-S5** — Commit findings.
6. **RR2-S6** — Gate decision:
- Data shape compatible with design → proceed to RR3.
- Data shape incompatible → STOP, re-brainstorm.
### Expected duration
~30-60 minutes including live-inspect cycle.
### Human-in-the-loop step
RR2-S3 (live-inspect) needs the user to launch + walk into a cottage + report.
---
## Quick context primer for a fresh session
### Why this phase
RR0 falsification proved:
- HEAD (R3.5 v2): Issues A + C reproduce
- R3 baseline (60f07bc): same A + C (R3.5 patches didn't help)
- main (7034be9, no A8 work): A + C don't reproduce, BUT constant #78 (houses below terrain from inside)
R3's stencil work fixed #78 but introduced A + C by rendering all 16 BFS-reachable cells at full screen extent. The fix is **per-portal recursive culling** (what retail + WB both do). For acdream's stack, WB's stencil approach is closer to existing infrastructure than retail's polygon-clip scissor.
### Why full WB port (not minimum)
The user explicitly chose "full WB port now" over (a) revert all A8 and live with #78 or (b) keep R1+R2 and revert only R3+R3.5. Decision recorded in the design doc's "Brainstorm outcomes" section.
### What's in scope (per design)
- WB `RenderInsideOut` Steps 1-5 (including 3-stencil-bit cross-building visibility, Step 5)
- WB `RenderOutsideIn` (cottage interiors visible through windows from outside)
- Per-building cell association (Building + BuildingRegistry + LoadedCell.BuildingId — Option C dual-indexed per user's Q2 answer)
- Single strict `cameraInsideBuilding` gate (drop grace for render path; CellVisibility's grace stays alive for non-render consumers)
- Stencil-gated sky inside indoor branch (acdream enhancement over WB)
### What's NOT in scope
- Retail polygon-clip scissor port (multi-week alternative)
- Cell-side `BuildingId` as SOLE data source (Option B — rejected for awkward API)
- Reverting R1+R2 (kept — orthogonal infrastructure)
---
## Files state at session end
```
Branch: claude/strange-albattani-3fc83c
HEAD: 29e306b (RR1 footer-marks)
Working tree: clean (only untracked log files + research docs from prior sessions)
Build: green
Tests: within flaky window
Uncommitted predecessor docs (intentionally not committed by previous sessions):
docs/research/2026-05-26-a8-entity-taxonomy.md
docs/superpowers/plans/2026-05-26-phase-a8-indoor-cell-visibility-culling.md
docs/superpowers/plans/2026-05-26-phase-a8-replan.md
(plus several A6 / issue-78 / issue-101 / cellar saga docs)
These are not blocking — they were referenced by handoff docs that DID get
committed, so the chain works at the file-system level. If a fresh session
wants tidy git history, run a single tidy-up commit gathering these.
```
---
## Pickup prompt for fresh session
```
Phase A8 — full WB RenderInsideOut + RenderOutsideIn port. RR1 cleanup
shipped 2026-05-26 PM (commits 84c4a70 → 29e306b). Pick up at RR2 (spike).
Read first (REQUIRED, in order):
1. docs/research/2026-05-26-a8-wb-full-port-rr1-shipped-handoff.md
(this doc)
2. docs/superpowers/specs/2026-05-26-phase-a8-wb-full-port-design.md
(the approved design)
3. docs/superpowers/plans/2026-05-26-phase-a8-wb-full-port.md
(12-task plan; pick up at RR2)
4. docs/research/2026-05-26-a8-rr0-falsification-findings.md
(evidence triggering the scope expansion)
5. references/WorldBuilder/Chorizite.OpenGLSDLBackend/Lib/VisibilityManager.cs:73-330
(the proven reference)
6. CLAUDE.md — "Currently working toward" line + A8 paragraph
State both altitudes:
Currently working toward: M1.5 — Indoor world feels right
Current phase: A8 — full WB port
HEAD: 29e306b (RR1 footer-marks)
Test baseline: build green; ~14-23 flaky test window (pre-existing)
Session flow:
### RR2 — spike (this session)
Per plan §RR2, 6 steps:
S1: inspect DatReaderWriter.BuildingInfo via grep/nuget
S2: read WB PortalRenderManager:518-551
S3: live-inspect a Holtburg cottage's BuildingInfo (temp diagnostic in
LandblockLoader.cs, launch, user walks to cottage, capture log,
revert diagnostic)
S4: write docs/research/2026-05-26-a8-buildings-data-shape.md
S5: commit findings
S6: gate — if data shape compatible, proceed to RR3; else re-brainstorm
Expected ~30-60 min. Single commit.
### RR3-RR12 — subsequent sessions
Subagent-driven (one fresh Sonnet subagent per code task with two-stage
review). Direct orchestration for RR8/RR10/RR12 visual gates.
## Constraints
- Per CLAUDE.md "no workarounds" rule — fix root causes. The new design
is the proper fix; don't iterate on R3.5-style symptom patches.
- WB code under references/WorldBuilder/ is MIT-licensed and the same
stack as acdream (Silk.NET, .NET); port verbatim where possible with
WB line refs in comments.
- Visual verification is the acceptance test. RR8 gate must close #78
+ R4 Issues A + C BEFORE proceeding to RR9 (Step 5).
- DO NOT re-revert R1 (ed72704) + R2 (55f26f2) — they're orthogonal
infrastructure consumed by the upcoming work.
## What success looks like
After all 12 tasks ship (~1.5-2 weeks calendar):
- Standing inside Holtburg cottage / cellar / inn / dungeon: all walls
solid. Sky visible through windows.
- Exit/entry transitions clean (Issues A + C closed).
- Cross-building visibility (Step 5) working — inn → cottage interior.
- Cottage interiors visible from outside through windows
(RenderOutsideIn).
- #78 closed; #102 closed; no regression on #100.
- M1.5 indoor scope fully shipped.
```
---
## Why pause now
The session has covered ~8 hours of brainstorm + design + plan + cleanup. Context budget is substantial. A fresh session for RR2 (the spike) lets the upcoming long stretch of RR3-RR12 implementation start with maximum context room for the subagents to consume. This is exactly the milestone-discipline rhythm CLAUDE.md describes.
Tomorrow's session opens with the pickup prompt above. The work is well-shaped: RR2 is small (≤1 hour); each of RR3-RR11 is one or two sessions of subagent-dispatched work with two-stage review; RR8 + RR10 + RR12 are visual gates that need ~30 min each of you driving the test client.
Good night. M1.5 is closer than it was this morning.

View file

@ -0,0 +1,485 @@
# Phase A8 RR7 reverted — full WB port handoff (2026-05-27)
## TL;DR for next session
RR7 (render-frame integration) shipped 4 times in one session; all 4 broke
the visual differently. **All four are reverted.** Branch is back to the
pre-A8 visual ("looks good"). RR3-RR6 infrastructure (`Building`,
`BuildingRegistry`, `BuildingLoader`, `WbDrawDispatcher.Draw(cellIds:)`
overload, `IndoorCellStencilPipeline` 3-bit + occlusion-query) remains
shipped + tested in isolation.
**The fundamental mistake:** RR3-RR7 ported WB's RenderInsideOut Steps 1-4
**conceptually** but routed cell-mesh rendering through our
`ObjectMeshManager` / `WbDrawDispatcher.Draw(IndoorPass)` pipeline. WB
doesn't do that — WB has a separate `EnvCellRenderManager` (862 LOC) that
renders cells via a different path. Without extracting that, the indoor
branch fires (gate works post-RR7.2) but cell interiors never render →
flat fog-color floors.
**Next session's mission:** port WB **verbatim**, including extracting
`EnvCellRenderManager.cs` + dependencies into our tree. No conceptual
adaptations. No "modern equivalent" decisions. Follow WB byte-for-byte
where the algorithm runs, just as Phase O extracted WB's mesh path.
User direction (verbatim, 2026-05-27):
> "Either we port exact behavior from retail or we port exact behavior
> from WB. ... Make a detailed plan to port WB verbatim behaviour to fix
> this. No quickfixes or fixes that might cause issues down the line ...
> use superpowers but DONT stop me for questions, be perfect, no
> band-aids. When you have a visual test ready with all rendering fix
> for this you launch the client for me to verify."
User decision: **WB**. (See decision rationale in
"Why WB and not retail" below.)
## Session log — what was tried and why it failed
This session opened picking up RR2 (BuildingInfo data-shape spike,
shipped clean) and then drove RR3 → RR4 → RR5 → RR6 → RR7 as planned.
The four RR7-variant fix attempts came after the user reported broken
visuals at the first visual gate.
### Commits shipped this session, before revert
| SHA | Phase | Status now | What it did |
|---|---|---|---|
| `f44a9bf` | RR2 | **KEPT** | Findings doc — `BuildingInfo` data shape + WB walk algorithm |
| `f125fdb` | RR3 | **KEPT** | `Building` + `BuildingRegistry` + `BuildingLoader` + 10 unit tests |
| `f8d0499` | RR4 | **KEPT** | `LoadedCell.BuildingId` + landblock-load wiring + 1 test |
| `3361933` | RR5 | **KEPT** | `WbDrawDispatcher.Draw(cellIds:)` overload + 2 tests |
| `6a7894a` | RR6 | **KEPT** | `IndoorCellStencilPipeline` 3-bit + 9 occlusion-query/state methods |
| `3d28d70` | RR7 | **REVERTED** by `4fa3390` | GameWindow render-frame restructure |
| `a1a3e0e` | RR7.1 | **REVERTED** by `21dc72b` | `AllLoadedCells` + late-stamp on drain |
| `efe3520` | RR7.2 | **REVERTED** by `9aaae02` | `_buildingRegistries` key normalization |
| `56673e1` | RR7.3 | **REVERTED** by `07c5981` | Dat-driven BFS in BuildingLoader |
Net infrastructure shipped: 5 commits, ~1100 LOC of production + 13
unit tests. All correct in isolation. None of the integration code
remains on the branch.
### Visual-gate launches and what they revealed
**Launch v1 — RR7 alone (commit `3d28d70`)**
- User reported: "Yes looks good!"
- `[vis]` log: `branch=indoor` count = **0** (out of 47,266 outdoor
decisions). 17,748 frames had `inside=True really=True` (camera in an
indoor cell) — but the gate's `BuildingId is not null` check failed
every time.
- **Why "looks good" was misleading:** RR7's call site used
`drainedCells` (the per-frame `_pendingCells` drain). Cells streamed
in over many frames, but `BuildingLoader.Build` ran once per landblock
load with whatever was in drainedCells THAT frame. Most building cells
were stamped on a frame when they weren't yet drained, so
`BuildingId` stayed null. Then `cameraInsideBuilding=false`, the
outdoor branch ran with full sky + initial terrain. Visually
indistinguishable from pre-A8.
- **My process failure:** declared visual gate passed without reading
the `[vis]` data first. "Looks good" without diagnostic correlation
is not verification.
**Launch v2 — RR7 + RR7.1 (`a1a3e0e`)**
- User reported: "All textures are missing, ground, sky only buildings
and objects are visible. Looks much worse."
- `[vis]` log: `branch=indoor` STILL 0 of 163,670 (with 125,476
`inside=True`).
- **Why it got worse:** RR7.1 made `BuildingLoader.Build` use
`_cellVisibility.AllLoadedCells` (every loaded cell, not just the
drain) which stamped MORE cells with `BuildingId`. That made
`cameraInsideBuilding=true` for more frames. But the registry-key
lookup at the gate STILL missed (storage at `0xA9B4FFFF`, lookup at
`0xA9B40000` — see RR7.2 below). So `cameraInsideBuilding=true`
sky + initial terrain GATED OFF → indoor branch's inner gate
(`camBuildings.Count > 0`) FAILED → outdoor branch ran WITHOUT sky
and terrain → black through windows.
**Launch v3 — RR7 + RR7.1 + RR7.2 (`efe3520`)**
- User reported: missing texture indoors (screenshot shows light-grey
fog-color areas where cell interior surfaces should be).
- `[vis]` log: `branch=indoor` = **119,471** vs outdoor 2,910. Indoor
branch finally fires.
- **Why it still broke:** RR7.2 fixed the registry key. Indoor branch
fires, `MarkAndPunch` runs, `Draw(IndoorPass, cellIds: camCellIds)`
runs. Building shells (cottage walls / inn walls — the
`IsBuildingShell` entities) render. But cell-mesh entities
(registered with `MeshRef(envCellId, ...)`) don't produce a textured
floor. The `[vis]` data confirms the gate works; the visual confirms
the cell-mesh path doesn't.
**Launch v4 — RR7 + RR7.1 + RR7.2 + RR7.3 (`56673e1`)**
- User reported: still flat grey areas.
- **Why it still broke:** RR7.3 made BFS dat-driven so building
EnvCellIds is complete regardless of cell load timing. Confirmed
BFS short-circuiting was NOT the cause — `camCellIds` contains the
user's current cell, the cell-mesh entity is walked, but the floor
doesn't appear.
### Root cause (only fully understood at session end)
WB's `VisibilityManager.RenderInsideOut`
(`references/WorldBuilder/Chorizite.OpenGLSDLBackend/Lib/VisibilityManager.cs:73-239`)
renders the inside-building cells via:
```csharp
envCellManager!.Render(pass1RenderPass, _currentEnvCellIds);
```
This calls into a **separate manager class**
`EnvCellRenderManager.cs`, 862 LOC, also in WB — that handles cell
rendering with its own GL pipeline, separate from
`ObjectMeshManager.cs`. The two managers exist because cell rendering
has different requirements (per-cell texture batching, different
transparency handling, cell-portal-aware geometry) from per-GfxObj
rendering.
Our RR7 collapsed Steps 3 (cell rendering) and Step 4 (stencil-gated
outdoor) into:
```csharp
_wbDrawDispatcher!.Draw(camera, ..., cellIds: camCellIds,
set: EntitySet.IndoorPass);
```
The dispatcher's `IndoorPass` walks entities including cell-mesh
entities (created in `GameWindow.BuildInteriorEntitiesForStreaming` at
line ~5441 with `MeshRefs = new[] { cellMeshRef }` where
`cellMeshRef.GfxObjId = envCellId`). But `ObjectMeshManager`'s draw
path is fundamentally per-GfxObj batched + MDI; it has a dat-side
`PrepareEnvCellMeshData` path (line ~1184 of WB's ObjectMeshManager,
also in our extracted copy) but that path's output isn't wired into
the dispatcher's instance-buffer layout the same way GfxObj meshes
are. Building shells render (they ARE GfxObj entities with proper
mesh refs after hydration at line ~5160). Cell meshes don't render
correctly.
In short: **the cell-mesh entity scheme we use is an architectural
mismatch with WB's render algorithm.** WB renders cells through
`EnvCellRenderManager.Render(cellIdSet)` — a per-cell rendering call.
We render cells through `Dispatcher.Draw(set: IndoorPass)` — a
per-entity batched call. The two are not interchangeable.
## Why WB and not retail
User asked decisively: "Either we port exact behavior from retail or we
port exact behavior from WB. What do you want?"
I chose WB. Reasons:
1. **Retail's algorithm doesn't fit modern GL.** Retail's
`PView::DrawCells` at `acclient_2013_pseudo_c.txt:432709` uses
software polygon-clip rects (set per portal during recursive cell
traversal). Porting verbatim requires either (a) inventing a
modern-equivalent — which is what WB already did — or (b)
implementing per-fragment shader-discard against portal polygons,
which is expensive and non-trivial.
2. **WB is already our rendering base.** Phase N.4 (2026-05-08) adopted
WB as our rendering oracle. Phase N.5 made WB's bindless +
`glMultiDrawElementsIndirect` mandatory. Phase O (2026-05-21)
extracted WB's mesh + dat-handling code into our tree
(`references/WorldBuilder/` remains as read-reference, but the
actual pipeline files live at `src/AcDream.App/Rendering/Wb/`).
Adopting WB's `EnvCellRenderManager` + `VisibilityManager` is the
natural continuation.
3. **Modern code, retail behavior** — WB is the existing "modern code,
retail-equivalent behavior" port. WB's stencil-based RenderInsideOut
is the modern-GL realization of retail's polygon-clip algorithm.
The observable behavior matches.
4. **Same exact stack.** WB is MIT-licensed Silk.NET + .NET 10 +
DatReaderWriter — verbatim our stack. No translation cost.
5. **Tested by WB's developers.** WB's RenderInsideOut works in their
tool. Faithful porting means we inherit their validation.
## What WB's render frame actually does (the spec for the redo)
The render frame algorithm lives at
`references/WorldBuilder/Chorizite.OpenGLSDLBackend/Lib/VisibilityManager.cs:73-239`.
The `RenderInsideOut` method takes managers as parameters
(`portalManager`, `envCellManager`, `terrainManager`, `sceneryManager`,
`staticObjectManager`, `sceneryShader`). Each step:
### Step 1: Stencil bit 1 at our building's portals (lines 78-97)
- `Enable(StencilTest)`, `ClearStencil(0)`, `Clear(StencilBufferBit)`.
- `Disable(CullFace)`, `StencilFunc.Always(1, 0xFF)`,
`StencilOp(Keep, Keep, Replace)`, `StencilMask(0x01)`,
`ColorMask(false×4)`, `DepthMask(false)`, `Enable(DepthTest)`,
`DepthFunc.Always`.
- For each building containing the current cell: `portalManager?.RenderBuildingStencilMask(building, snapshotVP, false)`.
### Step 2: Punch depth at portals (lines 99-105)
- `DepthMask(true)`, `DepthFunc.Always`.
- For each building containing the current cell: `RenderBuildingStencilMask(building, snapshotVP, true)`.
### Step 3: Render OUR cells (stencil OFF) (lines 107-127)
- `ColorMask(true, true, true, false)` (note: alpha bit OFF — WB intentional choice).
- `DepthMask(true)`, `Disable(StencilTest)`, `DepthFunc.Less`.
- `sceneryShader?.Bind()`.
- Collect `_currentEnvCellIds` from `_buildingsWithCurrentCell.SelectMany(b => b.EnvCellIds)`.
- `envCellManager!.Render(pass1RenderPass, _currentEnvCellIds)`.
- If transparency enabled: `DepthMask(false)`, render transparent pass, `DepthMask(true)`.
### Step 4: Stencil-gated outdoor — terrain + scenery + static objects (lines 129-154)
- If `didInsideStencil` (we had buildings): `Enable(StencilTest)`,
`StencilFunc.Equal(1, 0x01)`, `StencilOp(Keep, Keep, Keep)`,
`StencilMask(0x00)`, `ColorMask(true, true, true, false)`,
`DepthMask(true)`, `Enable(CullFace)`, `DepthFunc.Less`.
- `terrainManager.Render(snapshotView, snapshotProj, snapshotVP, snapshotPos, snapshotFov)`.
- `sceneryShader?.Bind()`.
- If scenery enabled: `sceneryManager?.Render(pass1RenderPass)`.
- If static-objects/buildings shown: `staticObjectManager?.Render(pass1RenderPass)`.
### Step 5: Other-buildings' cells through portals (lines 156-232)
- Collect `_otherBuildings` from `_visibleBuildingPortals` filtering OUT
buildings that contain `currentEnvCellId`.
- For each other-building (per `_otherBuildings`):
- Read back previous frame's occlusion query
(`GetQueryObject(building.QueryId, ResultAvailable)`,
`GetQueryObject(... Result)`). Update `building.WasVisible`.
- Start new query: `BeginQuery(SamplesPassed, building.QueryId)`,
`building.QueryStarted = true`.
- **a. Mark Bit 2 (Ref=3, Mask=0x02) where Bit 1 set**
(`StencilFunc.Equal(3, 0x01)`, `StencilOp Replace`,
`StencilMask 0x02`, `ColorMask off`, `DepthMask off`,
`Disable(CullFace)`).
`portalManager?.RenderBuildingStencilMask(building, snapshotVP, false)`.
- `EndQuery(SamplesPassed)`.
- **b. Clear depth where Stencil == 3** (`StencilFunc.Equal(3, 0x03)`,
`StencilMask 0x00`, `DepthMask true`, `DepthFunc.Always`).
`RenderBuildingStencilMask(building, snapshotVP, true)`.
- **c. Render other-building's EnvCells gated by Stencil == 3**
(`ColorMask(true, true, true, false)`, `DepthFunc.Less`,
`Enable(CullFace)`). `sceneryShader.Bind()`.
`envCellManager.Render(pass1RenderPass, building.EnvCellIds)` (+ transparent).
- **d. Reset Bit 2 back to 0** for next iteration
(`StencilMask 0x02`, `StencilFunc.Always(1, 0x02)`,
`StencilOp Replace`, `ColorMask off`, `DepthMask off`).
`RenderBuildingStencilMask(building, snapshotVP, false)`.
### Cleanup (lines 234-238)
- `Disable(StencilTest)`, `StencilMask(0xFF)`, `ColorMask(true×3, false)`.
## Why our RR7 didn't match this
1. **No `envCellManager.Render(...)` call.** We routed cells through
`Dispatcher.Draw(IndoorPass)`, which is per-GfxObj-batched, not
per-cell.
2. **No separate transparency pass for cells.** Step 3's
`DepthMask(false) + Render(Transparent)` was missing.
3. **No `sceneryShader.Bind()` between passes.** WB's algorithm
assumes a specific shader is bound at each step; we never did.
4. **Step 5 missing entirely.** Cross-building visibility (cottage
cellar visible from cottage above, inn rooms visible through doors)
not implemented. Would have shipped in RR9 but RR7 should have at
least scaffolded the order.
5. **ColorMask alpha-bit pattern not preserved.** WB uses
`ColorMask(true, true, true, false)` deliberately — alpha-bit OFF.
Our outdoor branch's `Draw(All)` doesn't toggle alpha bit, but
WB's path does. Could affect alpha-to-coverage downstream.
## The plan for the next session
### Phase 1: Extract `EnvCellRenderManager` into our tree (~862 LOC)
Mirror Phase O's pattern:
1. Read `references/WorldBuilder/Chorizite.OpenGLSDLBackend/Lib/EnvCellRenderManager.cs`
in full.
2. Identify its dependencies — likely `GlobalMeshBuffer`,
`ObjectMeshManager` (already extracted), `TextureAtlasManager`,
`IRenderManager`, `RenderPass`, `SceneData`. Extract any missing
dependencies.
3. Copy `EnvCellRenderManager.cs` to
`src/AcDream.App/Rendering/Wb/EnvCellRenderManager.cs`.
4. Adapt namespaces (`Chorizite.OpenGLSDLBackend.Lib`
`AcDream.App.Rendering.Wb`).
5. Resolve any references to types we don't have. Stub or extract as
needed.
6. Build green. No tests yet at this step.
### Phase 2: Wire `EnvCellRenderManager` into the existing landblock load
`EnvCellRenderManager.Register(envCell, cellStruct, worldTransform, ...)` is
how cells join its registry. Currently we call `CellMesh.Build` at
`GameWindow.BuildInteriorEntitiesForStreaming` (line ~5423). Replace
that with the `EnvCellRenderManager` registration path — cell meshes
flow through ITS pipeline, not through ObjectMeshManager via fake-
GfxObj-id MeshRefs.
The `WorldEntity` we create with `MeshRefs = [cellMeshRef]` (line 5441)
becomes irrelevant for cell rendering — the EnvCellRenderManager owns
the cells, the dispatcher renders only entities that have real GfxObj
mesh refs.
### Phase 3: Replicate `VisibilityManager.RenderInsideOut` byte-for-byte
In `GameWindow.cs` render frame (after the per-frame `glClear` +
visibility computation), replace the `if (cameraInsideBuilding)
{ ... } else { ... }` block we shipped + reverted with a call to a
new method `RenderInsideOutAcdream` that follows WB's Steps 1-5 line by
line.
`PortalRenderManager.RenderBuildingStencilMask(building, vp, punch)` is
the other dependency. Extract from
`references/WorldBuilder/Chorizite.OpenGLSDLBackend/Lib/PortalRenderManager.cs`
(702 LOC) — at minimum the stencil-mask method + its mesh upload path.
The plumbing may be reusable for our existing `IndoorCellStencilPipeline`.
Our `IndoorCellStencilPipeline` already implements WB's Steps 1+2 +
Step 5 a/b/c/d. The mismatch is **what calls them** — our code calls
them with `_indoorStencilPipeline.MarkAndPunch(...)` etc. WB calls them
via `portalManager.RenderBuildingStencilMask(building, vp, punch)`.
The pipelines are equivalent in spirit but the entry point differs.
Map our pipeline methods onto WB's interface signature so the
RenderInsideOut algorithm can call them by name.
### Phase 4: Probes BEFORE visual launches
Mandatory before any visual gate. Add (gated on
`ACDREAM_PROBE_VIS=1` or a new `ACDREAM_PROBE_ENVCELL=1` flag):
- **`[envcells]` per frame**: count of cells walked by
`EnvCellRenderManager.Render`, count of triangles drawn, the
cellId set being rendered.
- **`[stencil]` per frame**: vertex count uploaded for MarkAndPunch
(the existing pipeline emits this internally — surface it).
- **`[draworder]` per frame**: assertion that the algorithm ran each
step in the right order with the right GL state on entry.
When a visual gate fires:
- ALWAYS read the probe data FIRST. Confirm indoor branch fired,
envcells were rendered, stencil mask was non-empty.
- Compare probe data to expected (the design doc has the algorithm
spelled out).
- ONLY THEN ask the user for visual confirmation.
### Phase 5: Visual gate (single)
Once Phases 1-4 done + probe data confirms correct behavior:
launch the client for the user to verify. ONE gate. Not four.
## Open questions for the next session to investigate
These DON'T require user input — investigate during execution:
1. **`PortalRenderManager.RenderBuildingStencilMask` mesh upload.**
Does WB upload exit portal polygons differently than we do? Our
`UploadBuildingPortalMesh` (Phase A8 RR6) might map cleanly to
WB's expectation, or might need adjustment.
2. **`EnvCellRenderManager.Register` API.** What does it accept?
Compare to our `_pendingCellMeshes[envCellId] = cellSubMeshes`.
Identify the seam.
3. **Transparency pass.** WB's Step 3 has an `if
(state.EnableTransparencyPass)` second `Render(Transparent)` call.
We don't have a state object yet; need to either add one or pick
the default (likely enabled, since indoor transparency matters for
stained glass, ornate furniture).
4. **Occlusion queries (RR9 scope).** RR7's job was Steps 1-4 only;
RR9 was supposed to add Step 5. But WB's RenderInsideOut has Step 5
inline — we shouldn't split it. Land Steps 1-5 together in the
next attempt. RR9 becomes a no-op or absorbed.
5. **`OutdoorScenery` EntitySet.** WB's Step 4 calls
`sceneryManager.Render(pass1RenderPass)` and
`staticObjectManager.Render(pass1RenderPass)` separately. We've
collapsed both into `Draw(EntitySet.OutdoorScenery)`. Need to
verify our `OutdoorScenery` partition matches what WB's two
managers cover, OR split them into two dispatch calls.
## Process rules for the next session (carved from this session's mistakes)
1. **No visual-gate launch without probe data first.** If the probe
says branch=indoor count = 0, the user's "looks good" doesn't
confirm A8 is working. Read the probe BEFORE asking the user.
2. **No partial WB ports.** Extract the manager. Wire it. Implement
the algorithm in full. No "Steps 1-4 now, Step 5 later." The steps
are interdependent; partial implementations have wrong cumulative
state.
3. **No conceptual adaptations of WB.** If WB does X, do X. If our
stack has a different way of doing it, either extract the WB way
into our stack OR use the existing analog 1:1 without "improvement."
No new abstractions invented mid-port.
4. **Trust-but-verify after every subagent dispatch.** Subagents
compile + pass tests in their isolation but don't verify visual
correctness. The harness pattern from #98 saga applies: build the
apparatus first, then trust evidence over plausible-looking code.
5. **Acknowledge the cost-of-failure asymmetry.** Each "fix" that
doesn't work costs the user a launch cycle, screenshot review,
bug-report write-up. Three wrong fixes in a row > one fully-thought
fix. Slow down at the brainstorming step, not at the implementation
step.
## Files that remain shipped (RR3-RR6 infrastructure)
These work in isolation and stay on the branch:
| File | LOC | Tested |
|---|---|---|
| `src/AcDream.App/Rendering/Wb/Building.cs` | 57 | 2 tests |
| `src/AcDream.App/Rendering/Wb/BuildingRegistry.cs` | 73 | 4 tests |
| `src/AcDream.App/Rendering/Wb/BuildingLoader.cs` | 144 | 5 tests |
| `src/AcDream.App/Rendering/Wb/WbDrawDispatcher.cs` (additions: `Draw(cellIds:)` overload + `WalkEntitiesForTestByCellIds`) | +153 | 2 tests |
| `src/AcDream.App/Rendering/IndoorCellStencilPipeline.cs` (additions: 4 stencil-3-bit methods + 4 occlusion-query methods + UploadBuildingPortalMesh) | +243 | 0 tests (GL required) |
The `LoadedCell.BuildingId` field also persists (from RR4) — that's a
1-property addition to `CellVisibility.cs`. RR4's wire-in in
`GameWindow.cs` (the `_buildingRegistries` dict + the
`BuildingLoader.Build(...)` call at line ~5876 + the RemoveLandblock
callbacks) is **also reverted** by the RR7 revert chain — the dict and
all references to it are gone now. Confirm via:
```
grep -n _buildingRegistries src/AcDream.App/Rendering/GameWindow.cs
```
If zero matches, the revert is complete. If matches remain, RR4 needs
manual cleanup (likely a stray field declaration the revert didn't
catch).
## Pickup prompt for next session
> Read this entire handoff doc, then read these in order:
>
> 1. `references/WorldBuilder/Chorizite.OpenGLSDLBackend/Lib/VisibilityManager.cs:73-239` (the RenderInsideOut algorithm we're porting verbatim)
> 2. `references/WorldBuilder/Chorizite.OpenGLSDLBackend/Lib/EnvCellRenderManager.cs` (the manager to extract — 862 LOC)
> 3. `references/WorldBuilder/Chorizite.OpenGLSDLBackend/Lib/PortalRenderManager.cs` (the other dependency — extract the stencil-mask method + any infrastructure)
> 4. `docs/architecture/worldbuilder-inventory.md` (what we've already extracted from WB and where it lives)
> 5. `docs/superpowers/plans/2026-05-26-phase-a8-wb-full-port.md` (the original A8 plan — IGNORE its RR7 design, follow this handoff doc's plan instead)
>
> Then brainstorm + write a fresh detailed plan covering:
> - The exact extraction list (every WB file to copy into our tree)
> - The exact wire-in points in GameWindow.cs
> - The probe trail with format specifications
> - The expected visual outcomes per step
> - The order of execution (extraction → wiring → probes → visual gate)
>
> Use the superpowers:writing-plans skill. The plan goes to
> `docs/superpowers/plans/2026-05-28-phase-a8-wb-render-inside-out-port.md`.
>
> Once the plan is written, execute it without stopping. No questions
> to the user mid-flight. When the visual test is ready, launch the
> client for visual confirmation. Read probe data BEFORE accepting any
> "looks good" report.
>
> User authorization (verbatim 2026-05-27): "use superpowers but DONT
> stop me for questions, be perfect, no bandaids."
## Key references
- Plan we deviated from: `docs/superpowers/plans/2026-05-26-phase-a8-wb-full-port.md`
- Design doc: `docs/superpowers/specs/2026-05-26-phase-a8-wb-full-port-design.md`
- WB extraction precedent (Phase O): commit `6a7894a`'s parent chain
- WB code root: `references/WorldBuilder/Chorizite.OpenGLSDLBackend/Lib/`
- This session's RR1 handoff (still relevant for project context):
`docs/research/2026-05-26-a8-wb-full-port-rr1-shipped-handoff.md`
- RR2 findings (BuildingInfo data shape — still accurate, useful for
understanding the building model):
`docs/research/2026-05-26-a8-buildings-data-shape.md`

View file

@ -0,0 +1,168 @@
# Phase A8 Cellar-Flap Handoff - 2026-05-28
## Status
The remaining cottage/cellar artifact is still visible. The user sees a short-lived green terrain-like rectangle/flap over the cellar entrance or floor opening when entering/exiting a building and when the chase camera angle changes.
This thread hit the project rule: 3 visual-gated fixes failed for this specific artifact. Stop here. Do not ship a fourth speculative renderer change. The next step must be source-provenance instrumentation or an architecture comparison.
The client was stopped after the failed visual gate to avoid build/file locks.
## Last Good Context
The broader A8 indoor renderer is much improved and shippable aside from this artifact:
- Theory A front-face alignment plus per-batch `CullMode` fixed major missing/inside-out geometry.
- The `InstanceData` stride fix explained the texture explosion/distortion root cause.
- Building shell scoping, portal bounds, streaming/promotion fixes, and FPS fixes remain important.
- The current artifact is narrow: a green terrain-like flap at/near cellar entrance transitions.
## Failed Fixes In This Micro-Loop
### 1. Camera branch gate relaxation
Hypothesis: strict `PointInCell` camera-inside-building gating was causing the renderer to switch branches while the chase camera crossed walls/portals.
Change tried: allow the inside-out branch to continue based on `CameraCell.BuildingId`, without the strict `PointInCell` check.
Result: visual gate failed. User still saw the artifact. Change was reverted back to strict `PointInCell`.
Evidence ruled out: the artifact is not solely caused by the render branch dropping out when the camera is slightly outside the cell.
### 2. Portal stencil `pos.w` clamp
Hypothesis: near-zero clip W for portal stencil triangles was exploding the screen-space mask as the camera crossed a portal plane.
Change tried: restored WorldBuilder-style clamp in `src/AcDream.App/Rendering/Shaders/portal_stencil.vert`:
```glsl
vec4 pos = uViewProjection * vec4(aPosition, 1.0);
if (abs(pos.w) < 0.001)
pos.w = pos.w < 0.0 ? -0.001 : 0.001;
gl_Position = pos;
```
Result: visual gate failed. User still saw the artifact.
Interpretation: this may still be WB parity and may be worth keeping, but it is not the root cause of the cellar flap.
### 3. Visible-cell portal mask
Hypothesis: `RenderInsideOutAcdream` was punching outdoor terrain through all exit portals in the camera building, not just exits reached by the current portal traversal. A window/door portal could project over the cellar opening and let terrain draw through it.
Changes tried:
- Added `CellVisibility.TryGetCell(uint, out LoadedCell?)`.
- Added `IndoorCellStencilPipeline.DrawUploadedPortalMesh(...)` to draw the already-uploaded visible-cell portal mesh.
- Changed `RenderInsideOutAcdream` Step 1/2 to upload portal triangles from `visibleCellIds`, not `camBuildings`.
- Wrapped Step 4 terrain/outdoor scenery draw inside `if (didInsideStencil)`.
- Added pure math test `BuildTriangles_OnlyIncludesProvidedVisibleCells`.
Verification before launch:
```powershell
dotnet build src\AcDream.App\AcDream.App.csproj -c Debug --no-restore
dotnet test tests\AcDream.App.Tests\AcDream.App.Tests.csproj -c Debug --filter FullyQualifiedName~IndoorCellStencilPipelineTests --no-restore
```
Both passed after shutting down build servers. The first parallel test attempt only failed because `VBCSCompiler` locked `AcDream.App.dll` while a parallel build was running.
Visual result: failed. User still saw the flap.
Evidence ruled out: the artifact is not explained only by building-wide vs visible-cell exit portal masking.
## Current Uncommitted Changes From This Micro-Loop
These are not visually validated as a fix:
- `src/AcDream.App/Rendering/Shaders/portal_stencil.vert`
- WB-style `pos.w` clamp.
- `src/AcDream.App/Rendering/CellVisibility.cs`
- `TryGetCell`.
- `src/AcDream.App/Rendering/IndoorCellStencilPipeline.cs`
- `DrawUploadedPortalMesh`.
- `src/AcDream.App/Rendering/GameWindow.cs`
- visible-cell portal mask in `RenderInsideOutAcdream`.
- Step 4 fail-closed when no portal mesh uploaded.
- `tests/AcDream.App.Tests/Rendering/IndoorCellStencilPipelineTests.cs`
- visible-cell-only triangle generation test.
Do not commit these as "the fix" unless a later source-provenance check proves they are correct and harmless. If the next architecture route goes elsewhere, revert or split them honestly.
## What We Know About The Cellar Entrance
Prior inspection found:
- The cellar transition around Holtburg cottage cells `0xA9B40143 -> 0x0146 -> 0x0147` is indoor-to-indoor.
- The relevant transition portals are not literal `OtherCellId == 0xFFFF` outside exits.
- The portal cap polygons inspected there are `NoPos`.
- That made "cell mesh uploads the outdoor floor polygon" unlikely, but not proven impossible for the visible green patch.
## What The Failed Fixes Suggest
The green pixels may not be caused by the portal mask at all. We need to prove which render pass writes them before touching behavior again.
Candidate explanations now:
1. **The green patch is not Step 4 terrain.**
It could be an EnvCell surface with the wrong texture/material, stale texture binding, or an indoor static object surface resolving to a grass texture.
2. **The patch is Step 4 terrain, but the stencil/depth mask source is not the exit portal list.**
It may be a stale stencil/depth state leak, a Step 3/4 depth ordering issue, or a missing retail occlusion/scissor lifecycle detail.
3. **The cellar opening has a missing or late-loaded indoor object.**
If a hatch/stair/floor object should cover or occlude that area and is missing for one or more frames, outdoor/terrain pixels behind it become visible.
4. **Camera clipping is exposing an intentionally hidden surface.**
Retail camera collision is incomplete in acdream. If the camera clips through the wall/floor volume, the renderer may be showing a view retail never permits. This does not explain all cases by itself, because the user also sees it during transitions, but it must be separated from a renderer bug.
## Next Step - Evidence Only
Add a provenance diagnostic, not a fix. One visual launch should answer what writes the green flap.
Recommended diagnostic:
- Add an env-gated pass tint or disable switch for only the inside-out Step 4 terrain draw.
- Add a separate tint/disable for Step 4 `OutdoorScenery`.
- Add a tint/disable for Step 3 EnvCell opaque.
- Keep default behavior unchanged.
Example env names:
```text
ACDREAM_A8_DIAG_DISABLE_INSIDE_STEP4_TERRAIN=1
ACDREAM_A8_DIAG_DISABLE_INSIDE_STEP4_OUTDOOR=1
ACDREAM_A8_DIAG_DISABLE_INSIDE_ENVCELL_OPAQUE=1
```
Run only one diagnostic at a time:
- If disabling Step 4 terrain removes the green flap, the issue is still terrain leaking through the indoor view mask.
- If disabling Step 4 terrain does not remove it, stop looking at portal exits and inspect EnvCell/static texture/material assignment for the specific surface.
- If disabling EnvCell opaque removes it, dump the exact `CellStruct` polygon/material under the camera/screenshot area.
Do not ship the diagnostic switches. Strip them after the source is known.
## Launch/Test Notes
Last clean visual launch:
```powershell
$env:ACDREAM_DAT_DIR = "$env:USERPROFILE\Documents\Asheron's Call"
$env:ACDREAM_LIVE = "1"
$env:ACDREAM_TEST_HOST = "127.0.0.1"
$env:ACDREAM_TEST_PORT = "9000"
$env:ACDREAM_TEST_USER = "testaccount"
$env:ACDREAM_TEST_PASS = "testpassword"
$env:ACDREAM_A8_INDOOR_BRANCH = "1"
dotnet run --project src\AcDream.App\AcDream.App.csproj --no-build -c Debug
```
The log files from the last launch were empty because this was a clean no-probe run:
- `a8-visible-cell-portalmask-cellar-flap-20260528-161702.out.log`
- `a8-visible-cell-portalmask-cellar-flap-20260528-161702.err.log`
## Stop Rule
This artifact has now consumed three visual failures in this micro-loop. Next session should not make another fix until it has hard evidence identifying the pass/polygon/source texture responsible for the green pixels.

View file

@ -0,0 +1,176 @@
# A8 cellar-flap — option-2 handoff + brainstorming kickoff (2026-05-28 PM)
## Purpose
The cellar flap is the **last** A8 indoor-rendering defect. Its root cause is
fully understood (offline-confirmed). The targeted fix (option 1) was tried,
**failed**, and the failure revealed a deeper architectural coupling. The
decision is to fix it the **retail-faithful way (option 2: WB-style recursive
portal visibility)** via a fresh `superpowers:brainstorming` session. This doc
is the single pickup point for that session.
## Current tree state (do NOT reset)
- Worktree: `.claude/worktrees/strange-albattani-3fc83c/`, branch
`claude/strange-albattani-3fc83c`, tip `e415bb3` — **all A8 work is
uncommitted** in a dirty tree.
- Build green. App tests pass (90 baseline; the 3 option-1 tests were removed).
- The option-1 code (`PortalMeshBuilder.CollectSameLevelPortalCells` +
`IsVerticalPortal` + 3 tests + the GameWindow call-site change) has been
**reverted/removed** — tree is back to the working-with-cellar-flap baseline.
- `tools/A8CellAudit` gained a `portals` mode this session (offline cell/portal
dumper) — **kept**, it's the investigation workhorse.
### What WORKS in the dirty tree (the valuable A8 batch — keep)
- EnvCellRenderer SSBO **stride fix** (mat4 upload, not 80-byte InstanceData).
- WB-style global `FrontFace(CW)` + per-batch `CullMode` through MDI.
- `EntitySet` partitioning (IndoorPass / OutdoorScenery / LiveDynamic) +
`BuildingShellAnchorCellId` scoping.
- `RenderOutsideInAcdream` (look into buildings from outside).
- `CollectVisiblePortalBuildings` frustum cull of portal bounds.
- Streaming near/far priority queues + `PromoteToNear` + the
`LandblockEntriesWithoutAnimatedIndex` hot-path fix (fixed bridge/wall/collision
regressions after travel).
- Temporary `ACDREAM_A8_DIAG_*` flags (strip before any commit).
### What DOESN'T work
- **Cellar flap** + the broader inside-out fragility (see the coupling below).
> **Decision point for the human:** the working A8 batch is large and
> uncommitted. Consider committing it (after stripping the `ACDREAM_A8_DIAG_*`
> flags) so the option-2 work starts from a clean baseline. Deferred per
> "don't commit yet," but flagged.
## The cellar flap — root cause (confirmed)
Full evidence: [`docs/research/2026-05-28-a8-cellar-flap-root-cause.md`](2026-05-28-a8-cellar-flap-root-cause.md).
Short version: the inside-out stencil mask flat-marks the exit portals
(windows/doors) of **every** visibility-BFS-reached cell. From the cellar
(`0xA9B40171`, **zero** exit portals), the BFS reaches the ground-floor cells
(`0x16F`, `0x170`) up the stairwell and marks **their** windows. Step 4 then
paints the whole outdoor world through those silhouettes wherever the cellar's
stairwell hole leaves them un-occluded. There is no constraint tying a deeper
cell's exit portal to the portal chain (the narrow stairwell) it was reached
through.
## ⭐ THE KEY FINDING — `didInsideStencil` double-duty coupling
This is the expensive lesson; do not re-pay it.
`RenderInsideOutAcdream` Step 4 (GameWindow.cs ~11167) wraps **both** the
terrain draw **and** the entire `OutdoorScenery` dispatcher draw (which includes
neighbor **building shells**, scenery, and the depth-repair pass) in:
```csharp
if (didInsideStencil) { ... terrain + OutdoorScenery ... }
```
where `didInsideStencil == (camera-side-filtered exit-portal mask is non-empty)`.
So the portal mask is doing **two jobs at once**:
- **Job A (intended):** gate "paint terrain/sky *through* the portal openings."
- **Job B (accidental):** decide "draw exterior geometry (shells/scenery/depth-repair) **at all**."
**Why option 1 failed:** option 1 correctly shrank the mask (same-level cells
only) so the cellar's mask went empty → `didInsideStencil=false` → **Step 4
skipped entirely** → exterior shells + terrain vanished → "walls transparent,
sky behind, terrain gone." The old flat mask (all visN cells) *papered over*
this by almost always keeping the mask non-empty.
**Consequence for ANY fix:** correctly scoping/clipping the portal mask is not
enough on its own — it will empty the mask in legitimate cases (looking at an
interior wall, sealed cellar) and kill exterior rendering. **Job A and Job B
must be decoupled** so exterior geometry draws regardless of whether any portal
is currently visible. This is true for option 2 as much as option 1.
## Decision: option 2 (WB-faithful recursive portal visibility)
Chosen over option 1 (decouple-only) because:
- The project mandate is faithful WB/retail porting; option 1 is a structural
deviation from WB's RenderInsideOut, and prior "cleaner redesign" deviations
were reverted.
- Option 2 handles every case (cellar, stacked floors, deep dungeons) without
per-case special-casing.
- It is large enough to deserve design-first (brainstorm), not a mid-session patch.
Note: option 2 still has to solve the Job-A/Job-B decoupling above — it's not
optional.
## Open design questions for the brainstorm (resolve BEFORE coding)
1. **Does WB even render a sealed sub-cell (cellar) via inside-out?** Check how
WB derives `_buildingsWithCurrentCell` (VisibilityManager.PrepareVisibility +
PortalRenderManager.GetBuildingPortalsByCellId). If WB *excludes* a cell with
no exit portals from the inside-out path, the "fix" may be a classification
change, not recursion. **Verify against WB source — don't assume recursion exists.**
2. **How does WB ACTUALLY constrain per-portal visibility?** Re-read
`VisibilityManager.cs` (RenderInsideOut/RenderOutsideIn) + `PortalRenderManager.cs`
end-to-end. Is the clipping (a) recursive portal traversal, (b) the 3-bit
stencil Step-5 pipeline, (c) pure Step-3 depth occlusion, or (d) the BSP
portal-graph in PrepareVisibility? Our port copied the *flat* Steps 1-4; the
constraint mechanism may live in code we didn't port.
3. **Job-A/Job-B decoupling.** Design how exterior geometry (shells + scenery +
depth-repair) draws independent of the portal mask, while terrain-through-portal
stays stencil-gated. This must land regardless of the recursion design.
4. **Stencil-bit budget + occlusion-query lifecycle** if the full WB Step-5
cross-building path is adopted (currently gated off via `ACDREAM_A8_STEP5`).
## Key source references for the brainstorm
WB (the algorithm being ported):
- `references/WorldBuilder/Chorizite.OpenGLSDLBackend/Lib/VisibilityManager.cs`
`PrepareVisibility` (47-71), `RenderInsideOut` (73-239), `RenderOutsideIn` (241+).
- `references/WorldBuilder/Chorizite.OpenGLSDLBackend/Lib/PortalRenderManager.cs`
`RenderBuildingStencilMask`, `GetVisibleBuildingPortals`, `GetBuildingPortalsByCellId`.
- `references/WorldBuilder/Chorizite.OpenGLSDLBackend/Lib/EnvCellRenderManager.cs`.
acdream (the current port):
- `src/AcDream.App/Rendering/GameWindow.cs``RenderInsideOutAcdream`
(~11012), `RenderOutsideInAcdream` (~196), the Step-4 `didInsideStencil` gate
(~11167). **This is where the Job-A/Job-B coupling lives.**
- `src/AcDream.App/Rendering/IndoorCellStencilPipeline.cs``PortalMeshBuilder`
(camera-side filter), `RenderBuildingStencilMask`, `DrawUploadedPortalMesh`.
- `src/AcDream.App/Rendering/Wb/BuildingLoader.cs` — building cell-set BFS
(mirrors WB PortalService; cellar IS expanded into building 0xA).
Retail oracle (if WB is ambiguous):
- `docs/research/named-retail/acclient_2013_pseudo_c.txt``CObjCell::find_visible_child_cell`
(≈311397), `PView::DrawCells` (≈432709). Retail uses screen-space polygon-clip
scissor recursion — the conceptual ancestor of "clip each portal to the chain."
Offline tooling:
- `tools/A8CellAudit``dotnet run -- portals <cellId...>` dumps a cell's
CellPortals (exit vs interior); `-- buildings <lb> <radius>` dumps
building→cell grouping. Reproduces the whole investigation in seconds, no launch.
## Brainstorming kickoff prompt (copy-paste into a fresh session)
> Use the `superpowers:brainstorming` skill. We're designing the retail-faithful
> fix for the A8 "cellar flap" — the last A8 indoor-rendering defect.
>
> Read first, in order:
> 1. `docs/research/2026-05-28-a8-cellar-flap-option2-handoff.md` (this doc) —
> current state, the confirmed root cause, the `didInsideStencil` double-duty
> finding, the decision, and the open design questions.
> 2. `docs/research/2026-05-28-a8-cellar-flap-root-cause.md` — the offline evidence.
> 3. The WB + acdream source references listed in the handoff.
>
> The goal: design how acdream's indoor visibility should render outside-through-
> portals **correctly clipped to the portal chain** (so a sealed cellar shows no
> terrain, a windowed room shows its own windows, deep rooms show only the sliver
> visible through the doorway chain) **AND** decouple "draw exterior geometry at
> all" from "is the portal mask non-empty" (the coupling that made the targeted
> fix regress).
>
> Brainstorm MUST resolve the 4 open design questions in the handoff before any
> code — especially Q1 (does WB even render a sealed cellar inside-out?) and Q2
> (what is WB's ACTUAL per-portal clipping mechanism — verify against source,
> don't assume recursion). Output a written design/plan; do not start coding
> until the design is agreed.
>
> Process rules still in force: no workarounds/band-aids; faithful WB/retail
> port; one visual gate only when a complete fix is ready; the broken indoor
> branch is behind `ACDREAM_A8_INDOOR_BRANCH=1` (default off = pre-A8 visual).
> The dirty tree has valuable uncommitted A8 work — decide whether to commit it
> (strip `ACDREAM_A8_DIAG_*` first) before starting.

View file

@ -0,0 +1,120 @@
# A8 cellar-flap — structured debugging root cause (2026-05-28 PM)
## Method
Systematic-debugging Phases 1-3, all evidence gathered **offline** via the
`tools/A8CellAudit` tool (extended with a `portals` mode) — no live launches
needed. Deterministic, instant, reproducible.
## Phase 1 — evidence
Scenario (from `launch-a8-probe-normal-20260528-194536.out.log`):
- Camera in cell `0xA9B40171`, `inside=True really=True`.
- `camBldgs=[0xA]`, `visN=7 [0x16F,0x170,0x171,0x172,0x173,0x174,0x175]`.
- Portal stencil mask = 12 verts (not the old over-punch case).
- Bisection (prior session): writer is **Step 4 content**; disabling Step-2
punch does **not** fix it.
Offline audit findings:
**Building grouping** (`A8CellAudit buildings 0xA9B40000`):
```
buildingOrdinal=10 registryId=0xA model=0x01002232 portalCells=[0xA9B4016F,0xA9B40170]
```
Building `0xA`'s LandBlockInfo seed = `{0x16F, 0x170}`. `BuildingLoader` then
BFS-expands through interior portals → all 7 cells (incl. the cellar). The
BFS matches WB's `PortalService` (same algorithm), so the grouping is not the
divergence.
**Exit-portal ownership** (`A8CellAudit portals ...`):
| Cell | exit portals (0xFFFF) | interior | role |
|------|----|----|------|
| `0x16F` | **1** | 1 | ground floor (window/door) |
| `0x170` | **1** | 1 | ground floor (window/door) |
| `0x171` (camera) | **0** | 3 | cellar |
| `0x172``0x175` | **0** | 12 | cellar rooms |
So the 12-vert mask = `0x16F` exit (6v) + `0x170` exit (6v). **The cellar
camera (zero exit portals) is marking the two ground-floor windows.**
**Topology**:
```
0x171.portal[0] -> 0x170 (stairwell/hatch, polyId 54)
0x170.portal[1] -> 0x171 (polyId 5)
0x170.portal[0] -> EXIT (window/door to outside, polyId 4)
```
Cellar connects directly up to ground floor `0x170`; `0x16F` is one further hop.
**Occluder geometry** (`A8CellAudit 0xA9B40170` / `0xA9B40171`):
- `0x170` floor poly `0x0002` (n.Z=+1) **emits** — the cellar's ceiling/occluder exists.
- `0x171` has a ceiling `0x0003` (n.Z=-1, emits) AND three `NoPos` polys
`0x0036/0x0037/0x0038` (surface `0x080000DF`) that do **not** emit —
`0x0038` is a ceiling-plane poly = the **stairwell hole** up to the ground floor.
## Phase 2 — pattern vs WB
WB `RenderInsideOut` marks the building's exit portals (flat — same as us) and
relies on **Step-3 cell depth** to occlude them: terrain only survives where the
punched/cleared far-depth isn't overwritten by rendered cell geometry.
Our code matches that structure. The difference that produces the visible flap:
WB's outside view through a portal is the world geometrically behind that
portal; from a cellar, the only un-occluded opening is the **stairwell hole**
(`0x0038`, not rendered). Through that hole, stencil=1 (ground-floor window
marked) and depth=far → **Step 4 draws the entire outdoor world (terrain +
buildings) through the hole**, not a window-sized sliver. The two ground-floor
windows are 12 BFS hops above the camera and should contribute essentially
nothing from the cellar, but their full silhouettes are marked.
"Disable Step-2 punch doesn't fix it" is explained: the leak pixels are the
stairwell hole, which has **cleared (far) depth** regardless of the punch
because no cell geometry covers it — terrain passes `DepthFunc.Less` either way.
## Phase 3 — single hypothesis (root cause)
**The inside-out exit-portal stencil mask is built by flat-marking the exit
portals of every visibility-BFS-reached cell. From the cellar, the BFS reaches
the ground-floor cells, whose windows get full-silhouette-marked. Where the
cellar's stairwell hole leaves those silhouettes un-occluded, Step 4 paints the
whole outdoor world through them. There is no constraint tying a deeper cell's
exit portal to the portal chain (here: the narrow stairwell) through which its
cell became visible.**
This is a flat-vs-constrained masking gap. Not a depth bug (occluders emit and
render), not the Step-2 punch, not the camera-side filter (the cellar camera is
geometrically on the interior side of a ground-floor window's plane, so the
per-portal filter passes it).
## Phase 4 — fix options
1. **Camera-cell-scoped mask (minimal, conservative).** Mark only the camera
cell's own exit portals. Cellar (0 exit portals) → empty mask → no leak;
windowed room → marks its own windows. **Risk:** loses daylight through an
*adjacent* cell's window seen across a doorway in multi-cell ground-floor
rooms (e.g. the inn) — a visible-but-minor regression, and the flat approach
was wrong there anyway.
2. **Vertical-portal-aware scoping (targeted).** Don't propagate exit-portal
marking across a floor/ceiling (vertical-normal) portal. The cellar→ground
stairwell is a horizontal-plane portal; suppressing inheritance across it
stops the cellar from marking ground-floor windows while preserving
same-level multi-cell rooms. Needs per-portal polygon-normal classification.
3. **WB recursive/constrained portal masking (faithful, largest).** Constrain
each deeper portal's stencil to the screen region of the portal chain leading
to it. Correct for all cases (cellar + multi-cell rooms) but a substantial
port of WB's recursive RenderInsideOut.
**Recommendation:** option 2 is the best correctness/effort trade — it fixes the
cellar without the inn regression risk of option 1, and is a principled scoping
rule (don't inherit a different vertical level's exterior openings) rather than a
band-aid. Option 3 remains the eventual faithful target if cross-level portal
visibility ever needs to be exact.
## Reproduction / verification assets
- `tools/A8CellAudit` `portals` mode (added this session) dumps any cell's
`CellPortals` offline. `A8CellAudit buildings <lb> <radius>` dumps
building→cell grouping. These make the whole investigation re-runnable in
seconds with zero launches.

View file

@ -0,0 +1,260 @@
# Phase A8 — EnvCellRenderer line-by-line audit findings (2026-05-28 PM)
## TL;DR
The post-Wave-5 visual chaos (flickering colors, missing walls, GPU 100%, ~10 FPS)
is caused by **two interconnected pool-management bugs** in
`src/AcDream.App/Rendering/Wb/EnvCellRenderer.cs`. Both were introduced by the
WB port author's "simplification" — dropping `PostPreparePoolIndex` from the
snapshot (calling it scenery-only) and dropping `list.Clear()` from
`GetPooledList`. Either bug alone is enough to corrupt rendering; together they
explain every reported symptom precisely.
Neither bug was found by the five post-Wave-5 speculative fixes because none
of them inspected the pool-management code path. The bug is in code the
subagent wrote that nobody had read line-by-line against WB. The audit pattern
in the handoff doc (Process retrospective §3 "Trust-but-verify on subagent
work") found it in under 30 minutes once actually applied.
A third minor bug in `PopulateRecursive` is also documented for completeness.
## Bug #1`GetPooledList` is missing `list.Clear()`
### Code as shipped
`src/AcDream.App/Rendering/Wb/EnvCellRenderer.cs:959-969`:
```csharp
private List<InstanceData> GetPooledList()
{
lock (_listPool)
{
if (_poolIndex < _listPool.Count) return _listPool[_poolIndex++]; // MISSING Clear()
var fresh = new List<InstanceData>();
_listPool.Add(fresh);
_poolIndex++;
return fresh;
}
}
```
### WB canonical implementation
`references/WorldBuilder/Chorizite.OpenGLSDLBackend/Lib/ObjectRenderManagerBase.cs:1221-1233`:
```csharp
protected List<InstanceData> GetPooledList() {
lock (_listPool) {
if (_poolIndex < _listPool.Count) {
var list = _listPool[_poolIndex++];
list.Clear(); // ← CRITICAL: clears stale data from prior frames
return list;
}
var newList = new List<InstanceData>();
_listPool.Add(newList);
_poolIndex++;
return newList;
}
}
```
### Effect of the bug
`PrepareRenderBatches`'s merge phase pattern (`EnvCellRenderer.cs:494-501`) is:
```csharp
if (!gfxDict.TryGetValue(gfxKvp.Key, out var list))
{
list = GetPooledList();
gfxDict[gfxKvp.Key] = list;
}
list.AddRange(gfxKvp.Value); // ← appends to whatever was in list
```
Each frame, when `GetPooledList` returns a previously-used pool list:
- That list STILL contains prior frames' instance data
- `list.AddRange(gfxKvp.Value)` appends the new frame's data on top
- Lists grow unbounded across frames
- The GPU receives buffers with N frames' worth of instance data
After ~50 frames at 60Hz, every batch's instance count is ~50× what it should
be. The GPU processes 50× more vertices per draw. Hence GPU 100% + low FPS.
Worse: the per-instance transforms in the appended data are STALE — they
represent where instances were in prior frames, not now. Hence "flickering
colors" (transforms appear/disappear randomly) and "missing walls" (instances
whose latest data has been buried under stale entries).
## Bug #2`_poolIndex = snapshot.BatchedByCell.Count` instead of `snapshot.PostPreparePoolIndex`
### Code as shipped
`src/AcDream.App/Rendering/Wb/EnvCellRenderer.cs:614`:
```csharp
lock (_renderLock)
{
var snapshot = _activeSnapshot;
_shader.Use();
_poolIndex = snapshot.BatchedByCell.Count; // reset point (mirrors WB line 405) ← WRONG
```
And `EnvCellVisibilitySnapshot.cs:11-13` (the snapshot definition):
```csharp
/// The scenery-side VisibleGroups / VisibleGfxObjIds / IntersectingLandblocks
/// / PostPreparePoolIndex are dropped — we render scenery through
/// WbDrawDispatcher, not through this snapshot.
```
### WB canonical implementation
`references/WorldBuilder/Chorizite.OpenGLSDLBackend/Lib/EnvCellRenderManager.cs:405`:
```csharp
_poolIndex = snapshot.PostPreparePoolIndex;
```
And `references/WorldBuilder/Chorizite.OpenGLSDLBackend/Lib/VisibilitySnapshot.cs:31`:
```csharp
public int PostPreparePoolIndex { get; init; }
```
Stored at `EnvCellRenderManager.cs:367` during the Prepare→snapshot swap:
```csharp
_activeSnapshot = new VisibilitySnapshot {
BatchedByCell = newBatchedByCell,
...
PostPreparePoolIndex = _poolIndex // ← captured before _poolIndex reset
};
```
### Effect of the bug
The snapshot author's reasoning ("PostPreparePoolIndex is scenery-side, we
don't need it") is **wrong**. PostPreparePoolIndex has nothing to do with
scenery — it's the pool-index high-water mark from Prepare's merge phase,
which tells Render's filter-path where the pool's safe region begins.
Specifically:
- After Prepare, `_listPool[0..K-1]` hold the data referenced by
`snapshot.BatchedByCell`. K = `_poolIndex` after merge.
- For Render to call `GetPooledList` safely (without trampling
snapshot data), `_poolIndex` must be set to K, so new pool calls return
`_listPool[K..]` — past the snapshot's region.
Our broken code sets `_poolIndex = BatchedByCell.Count`, which has NO RELATION
to the pool's high-water mark. For a typical indoor scene at Holtburg with
~5-20 cells visible, `BatchedByCell.Count` ≈ 18, but Prepare may have used
50-100 pool lists (lists per unique (cellId, gfxObjId) combo).
When Render's filter-path (`EnvCellRenderer.cs:684`) calls `GetPooledList`,
it returns a list at index 18 — which IS inside `snapshot.BatchedByCell`. With
Bug #1 fixed (Clear added), `Clear()` wipes that snapshot list. With Bug #1
unfixed, `AddRange` corrupts it. Either way: snapshot data destroyed mid-Render.
## Combined effect — why both bugs together explain the symptoms
| Symptom | Bug #1 alone | Bug #2 alone | Both together |
|---|---|---|---|
| GPU 100%, ~10 FPS | YES (lists grow per frame) | partial | YES (compounding) |
| Flickering colors | YES (stale + new transforms mixed) | YES (snapshot lists overwritten mid-render) | YES (both effects) |
| Missing walls | YES (new instances buried under stale) | YES (snapshot data wiped before draw) | YES (both effects) |
| Black diagonal sliver (final report) | YES (lists hit memory pressure / driver limits) | YES (snapshot corruption produces degenerate transforms) | YES |
The "ColorMask alpha-bit off" / "cull-cache stale" / "double terrain" / "missing
IndoorPass" fixes from the five post-Wave-5 attempts may all be real (some are
real bugs, just not THE bug). They didn't help visually because the pool
aliasing was always the dominant cause.
## Bug #3 (minor) — `PopulateRecursive` always passes `isSetup: false`
### Code as shipped
`src/AcDream.App/Rendering/Wb/EnvCellRenderer.cs:358`:
```csharp
foreach (var (partId, partTransform) in rd.SetupParts)
{
var combined = partTransform * transform;
PopulateRecursive(group, partId, isSetup: false, combined, cellId); // ← always false
}
```
### WB canonical implementation
`references/WorldBuilder/Chorizite.OpenGLSDLBackend/Lib/ObjectRenderManagerBase.cs:813`:
```csharp
foreach (var (partId, partTransform) in renderData.SetupParts) {
PopulateRecursive(groups, partId, (partId >> 24) == 0x02, partTransform * transform, cellId, flags);
}
```
WB detects nested Setups by checking `(partId >> 24) == 0x02` (Setup IDs use
the 0x02 high byte in retail). Our code always treats child parts as plain
GfxObjs.
### Effect
For most cells in Holtburg, this doesn't matter — typical cells reference
simple GfxObj parts (high byte 0x01). But if any Setup contains a nested
Setup, our code adds a stub InstanceData with the nested Setup's ID into the
group dict, then `TryGetRenderData(nestedSetupId)` returns a Setup record,
and Render's `if (renderData != null && !renderData.IsSetup)` check fails →
instance silently dropped. Some renderings missing in specific cases but not
the catastrophic chaos.
Fixed as a defense-in-depth measure to match WB exactly.
## Fix plan
### Required (closes Bugs #1 + #2):
1. **`EnvCellRenderer.cs:959-969`** — Add `list.Clear()` before return in
`GetPooledList` reuse branch.
2. **`EnvCellVisibilitySnapshot.cs`** — Add `public int PostPreparePoolIndex { get; init; }`.
3. **`EnvCellRenderer.cs:523-534`** — Capture `_poolIndex` into snapshot's
`PostPreparePoolIndex` field during the atomic swap.
4. **`EnvCellRenderer.cs:614`** — Read `snapshot.PostPreparePoolIndex`
instead of `snapshot.BatchedByCell.Count`.
### Recommended (closes Bug #3):
5. **`EnvCellRenderer.cs:358`** — `isSetup: (partId >> 24) == 0x02` to
correctly detect nested Setups.
### Apparatus (safety net for any unidentified bug):
6. **GL state assertion probe extension** — extend `EmitDrawOrderProbe` to
log full GL state at each step boundary and compare against WB-expected
state. Lifts cost of finding the NEXT bug (if any) from a multi-hour
speculation cycle to a one-launch evidence cycle.
## Regression coverage
The pool aliasing bug is hard to unit-test without a GL context (Render is the
mutation site, and Render requires a shader and global mesh buffer). A targeted
test for the simpler invariant — `GetPooledList` returns a cleared list — is
sufficient: it directly verifies Bug #1's fix and provides a guard against
future regressions. Bug #2 is verified by the visual gate.
## Process retrospective
The "trust-but-verify subagent work" item in the handoff doc's Process
retrospective §3 names exactly this failure mode: the subagent wrote
`EnvCellRenderer.cs` (Wave 2, ~1013 LOC), build-test-green checks passed, but
the audit never happened. The five post-Wave-5 speculative fixes were chasing
symptoms because the audit step was skipped.
The cost of the missed audit: ~1 hour of speculation across five fix attempts,
plus the user's five failed visual gates.
The cost of the audit, when actually performed: ~30 minutes to find both root
cause bugs via line-by-line comparison.
**Rule for subagent-written code touching production paths:** always read the
diff line-by-line against the cited WB / retail source before merging. Bugs
in subagent code that "build clean and pass tests" can still be 100%-wrong on
the algorithm — the test coverage was not designed to catch them.

View file

@ -0,0 +1,218 @@
# Phase A8 — Session 2: pool fix shipped, 4 more fixes shipped, residual visuals remain (2026-05-28 PM)
## TL;DR for next session
The session-1 handoff said "BUILD APPARATUS, NOT MORE SPECULATIVE FIXES." I
built apparatus (per-step GL state probe + per-cell mesh audit + pool
diagnostics) AND, before the apparatus was used, line-by-line audited
`EnvCellRenderer.cs` against WB source. The audit found **two
high-confidence bugs** (pool aliasing) in 30 minutes — these were the
root cause of the post-Wave-5 catastrophic visual chaos. Pool fix shipped
(`9559726`) and the visual went from "thin black diagonal sliver, GPU
100%, 10 FPS, can't see anything" to "walls + objects + sky render
cleanly, FPS normal."
Five more targeted fixes shipped across visual gates #1-#5. The first
four landed real bugs. The fifth (cull-restore revert) was based on a
hypothesis the [draworder] probe data invalidated — gate-#5 showed cull
state was already off at Step 3 before EnvCellRenderer.Render ran, so
the propagation theory didn't apply.
**Per systematic-debugging skill's `≥3-failures → question architecture`
rule, I stopped and wrote this handoff rather than ship a 6th speculative
fix.** The remaining symptoms (transparent floor, texture warping,
distortion) point to architectural-level issues that need a different
investigation approach.
## Visual progress chronicle
| Gate | Symptoms reported | Cause if known |
|------|-------------------|---|
| Pre-session (from session-1 handoff) | "Thin black diagonal sliver, GPU 100%, 10 FPS, can't see anything" | Pool aliasing (cleared by session-2 commit `9559726`) |
| Gate #1 (`375f9a7` + sky-fix not yet) | Walls + objects render, no flicker, FPS normal. No sky through windows. Char + doors missing. Floor missing. Purple tint on walls. | Pool fixed (huge win). LiveDynamic/sky/cull not yet addressed. |
| Gate #2 (sky fix + audit probe) | Sky visible through windows ✓. Char + doors still missing. Floor still missing. Purple still. | Sky fix worked. Audit dumped per-cell render data. |
| Gate #3 (LiveDynamic + cull-disable A/B) | Char + doors visible ✓. Floor sometimes visible. See-through-head (cull-off side effect). | LiveDynamic fix worked. Cull-disable proved cull was hiding floor. |
| Gate #4 (Landblock→None + cull-restore) | "BROKEN textures, floor is now transparent" — sky visible through floor | Cull-restore at exit propagated cull-back to dispatcher's IndoorPass, culling cottage shell's floor poly. |
| Gate #5 (revert cull-restore) | "No change at all, textures warped, missing textures, floors transparent and flickering" | Revert didn't help — [draworder] probe shows cull was already off at Step 3 entry, so removing my cull-restore at exit doesn't change inherited state. |
## What's shipped this session
| SHA | Description | Status |
|-----|-------------|--------|
| `9559726` | Pool aliasing root cause fix (Clear + PostPreparePoolIndex + nested-Setup detection) + 4 regression tests + audit findings doc | **KEPT — closes the post-Wave-5 chaos** |
| `375f9a7` | Full GL state probe + pool diagnostics extension (option-1 apparatus) | **KEPT — apparatus** |
| `772d69c` | Sky-when-cameraInsideBuilding fix + per-cell audit probe | **KEPT — sky through windows works** |
| `b19f3c1` | LiveDynamic dispatcher call in indoor branch + ACDREAM_A8_DISABLE_CULL A/B gate | **KEPT — chars + doors visible inside** |
| `0940d79` | Cell-mesh Landblock CullMode → None + cull-state restore at exit | **PARTIALLY KEPT — Landblock→None is good; cull-restore was wrong (reverted in d5deeb3)** |
| `d5deeb3` | Revert cull-restore at EnvCellRenderer exit | **KEPT — leaves cull-off propagating** |
## What's still wrong (visual gate #5 state)
User-reported symptoms with kill-switch ON (`ACDREAM_A8_INDOOR_BRANCH=1`):
1. **Floor transparent** — sky color visible where floor should be. Cell
mesh has Landblock→None override that should render cell polys
double-sided, but the floor poly either (a) isn't in the upload, (b)
has wrong winding/orientation, or (c) is being rendered but z-fails or
alpha-discards.
2. **Texture warping** — vague but visible in screenshots. Some surfaces
show wrong texture or texture appears stretched/distorted.
3. **Flickering** — surfaces alternate between visible/invisible across
frames. Could be Z-fighting (cell mesh vs cottage shell at same depth),
alpha-test threshold instability, or animated camera causing
per-frame frustum-test results to differ.
4. **General distortion** — overall scene "looks broken." Possibly purple
tint on lighting (mentioned in gates #1-#3, not explicitly in #5).
## Apparatus state
These probes are wired and operate when env vars are set:
- `ACDREAM_PROBE_VIS=1` — emits `[draworder]` (per-step GL state),
`[stencil]` (per stencil mark/punch), `[buildings]` (camera-building
list), `[envcells]` (cells + tris + pool stats).
- `ACDREAM_A8_AUDIT=1` — one-shot per (cellId, gfxObjId) pair dump of
render data: batches count, total IndexCount, CullModes encountered,
IsTransparent + IsAdditive flags, BindlessTextureHandle == 0 count.
Sample audit data captured in gate-#2 (`a8-visual-gate-2.log`):
```
[a8-audit] cell=0xA9B4013F gfx=0x7F852B220B93AD instances=1 isSetup=False batches=4 totalIdx=144 cull=[Landblock] translucent=0 additive=0 zeroHandle=0
```
Every cell mesh batch has CullMode=Landblock (uniform). Render data
loads correctly (no nulls, no zero handles).
Sample [draworder] data captured in gate-#5 (`a8-visual-gate-5.log`):
```
[draworder] frame=155 step=3 stencil=off depthFn=0x201 depthMask=True cull=off(back) blend=0x302/0x303 sFunc=0x207:1:0xFF sOp=0x1E00/0x1E00/0x1E01 sMask=0x1 cMask=(RGB-) vao=0 prog=6
```
Cull is OFF at Step 3 entry (Step 1's `gl.Disable(EnableCap.CullFace)`
already disabled it; my cull-restore-at-exit revert had no effect on
incoming state).
## Root-cause analysis — why the speculative fixes can't close it
### Theory A: AC's polygon winding requires `glFrontFace(CW)`
WB sets `glFrontFace(GLEnum.CW)` globally at
[GameScene.cs:843](references/WorldBuilder/Chorizite.OpenGLSDLBackend/GameScene.cs:843).
Our `WbDrawDispatcher.cs:1056` sets `glFrontFace(CCW)` in the transparent
pass with a comment claiming "our fan triangulation emits pos-side polys
as (0, i, i+1) — CCW." But the actual triangulation in
`BuildCellStructPolygonIndices` ([ObjectMeshManager.cs:1518-1586](src/AcDream.App/Rendering/Wb/ObjectMeshManager.cs:1518))
emits `(i, i-1, 0)` — the REVERSE of (0, i, i+1). The comment is wrong
about our actual winding.
If AC's polys are wound CCW from their PosSurface side (the "front" side
in retail convention), our triangulation produces CW-from-PosSurface
triangles. WB's `FrontFace=CW` makes CW = front, so cull-back removes
the back side correctly. Our `FrontFace=CCW` makes CCW = front, so
cull-back removes the WRONG side — hiding polys whose PosSurface is
camera-facing.
**Verification approach**: change `FrontFace` to CW globally (matching
WB at GameScene.cs:843) and audit every consumer (sky, particles, UI,
translucent crystal mesh) for impact. The dispatcher's CCW set at
line 1056 has a comment about a Phase 9.2 fix (lifestone crystal
see-through-hollow-interior) — that fix might have papered over the
underlying FrontFace mismatch instead of fixing it properly.
**Risk**: changing FrontFace globally might re-introduce the
hollow-interior bug for closed-shell translucent meshes. Needs careful
audit and possibly per-renderer FrontFace push/pop.
### Theory B: Cell polys' floor is filtered out at upload time
`PrepareCellStructMeshData` ([ObjectMeshManager.cs:1295-1306](src/AcDream.App/Rendering/Wb/ObjectMeshManager.cs:1295)):
```csharp
if (!poly.Stippling.HasFlag(StipplingType.NoPos))
AddSurfaceToBatch(poly, poly.PosSurface, false);
bool hasNeg = poly.Stippling.HasFlag(StipplingType.Negative) ||
poly.Stippling.HasFlag(StipplingType.Both) ||
(!poly.Stippling.HasFlag(StipplingType.NoNeg) && poly.SidesType == CullMode.Clockwise);
if (hasNeg)
AddSurfaceToBatch(poly, poly.NegSurface, true);
```
For a floor poly with `Stippling=NoPos + SidesType=Landblock + no
Negative/Both flag`, NEITHER side is uploaded → no rendering at all.
Plausible if AC encodes floor polys this way.
**Verification approach**: dump per-poly Stippling + SidesType + PosSurface
+ NegSurface values for cells. Add to the audit probe.
### Theory C: cottage shell has no floor poly + cell mesh's floor is broken
In retail AC, the cottage's "shell" GfxObj (from `info.Buildings[i].ModelId`)
contains walls + roof + door frame. The floor is provided entirely by the
cell's CellStruct PosSurface polygons. If our cell mesh's floor poly is
broken (winding, missing, wrong texture), nothing else fills in.
**Verification approach**: run WB's executable against the same dat,
take a screenshot from the same camera position inside the same cottage,
diff against our screenshot. Identifies whether the floor source is
the cell mesh or somewhere else.
## Process retrospective — what worked this session
1. **Audit BEFORE apparatus**: line-by-line read of EnvCellRenderer vs
WB source found the pool bug in 30 min. The handoff doc warned about
subagent-written code never being audited; that was the right warning.
2. **Apparatus shipped alongside fix**: GL state probe + audit dumps
captured concrete data that informed subsequent fixes. Gates #1-#5
all relied on probe data, not pure visual.
3. **Stopping after 4 fixes**: per systematic-debugging skill. The
alternative (a 6th speculative attempt) would have either burned more
user testing cycles or shipped another band-aid.
## What this session did NOT do (in scope for next session)
- Match WB's `glFrontFace(CW)` globally + audit consumers.
- Inspect per-poly Stippling/SidesType for cell floors.
- WB renderer side-by-side comparison.
- Investigate purple tint on walls (lighting / scene UBO).
- Investigate texture warping (UV / sampler issues).
- Investigate flickering (Z-fighting / alpha threshold).
- Remove the ACDREAM_A8_INDOOR_BRANCH kill-switch (still needed; default
OFF restores pre-A8 behavior).
## Pickup prompt for next session
> Phase A8 indoor branch is partially working as of `d5deeb3`. Pool
> aliasing root cause is fixed. Sky-through-windows, LiveDynamic chars,
> cell-mesh double-sided rendering all work. But the floor is transparent
> (sky visible through it), textures warp, and the scene has residual
> distortion + flickering.
>
> Read this doc end-to-end. Then pick ONE of the three theories above
> and verify before any code change:
>
> 1. **Theory A (FrontFace=CW)**: highest-leverage. WB sets CW globally;
> we set CCW. Audit translucent crystal + sky shaders' winding
> assumption first. If safe, set FrontFace=CW globally and visual-gate.
>
> 2. **Theory B (cell-poly filtered)**: extend the existing
> `ACDREAM_A8_AUDIT=1` probe to dump per-poly Stippling + SidesType
> + PosSurface/NegSurface for a few cells. Live-capture data; check
> if any floor poly is "no upload" per the conditional.
>
> 3. **Theory C (WB side-by-side)**: build WB's executable from
> `references/WorldBuilder/`, point at same dat dir, screenshot same
> cottage interior. Compare. Confirms or rules out our cell mesh
> upload as the source of the bug.
>
> The kill-switch (`ACDREAM_A8_INDOOR_BRANCH=1`) remains the way to
> reproduce the indoor branch. Pre-A8 behavior (kill-switch unset) is
> still the default and unchanged.
>
> User authorization: "use superpowers but DONT stop me for questions,
> be perfect, no bandaids." The "no bandaids" rule is why this session
> stopped at fix #5 and wrote the handoff instead of attempting fix #6.
> Carry that discipline forward.

View file

@ -0,0 +1,293 @@
# Phase A8 WB port — scaffolding shipped, indoor branch broken — handoff (2026-05-28)
## TL;DR for next session
The Phase A8 WB RenderInsideOut port shipped across six waves of commits.
The indoor branch FIRES correctly per probe data, but renders incorrectly
on visual inspection (texture flicker, missing walls, GPU 100%, ~10 FPS).
Five post-Wave-5 speculative fixes did not resolve the visual breakage.
**Default behavior is now kill-switched back to pre-A8 (working build).
Set `ACDREAM_A8_INDOOR_BRANCH=1` to re-enable the indoor branch for
further investigation.**
Today's session shipped 16 commits on top of `3e9ff7a` (the RR7 revert
chain). The WB scaffolding is solid (RenderPass enum, Frustum, SceneryInstance/
EnvCellLandblock, VisibilityVisibilitySnapshot, EnvCellRenderer, IndoorCellStencilPipeline
extension). The render-frame integration is broken in a way that
required-more-iteration than this session had budget for.
**Next session's mission:** stop speculating, build the apparatus.
Either a deterministic replay harness for the render frame (record frame
inputs, replay offline, diff per-step GL state) or a side-by-side
comparison harness against WB's own renderer. The pattern is the same one
that finally cracked issue #98 after six months of failed speculative fixes.
User direction (verbatim, 2026-05-27): "no quickfixes or fixes that
might cause issues down the line ... use superpowers but DONT stop me
for questions, be perfect, no bandaids."
Today's session had no bandaids in the WB extraction work (Waves 1-5).
The five post-Wave-5 fix attempts WERE band-aids — each chased a
plausible-looking symptom without confirming root cause. After five in a
row failed, I stopped and shipped the kill-switch instead. That's the
process correction: an apparatus comes before the next attempt.
## What shipped today
| SHA | Wave | Description | Status |
|---|---|---|---|
| `fc68d6d` | 1 | WB scaffolding extraction — RenderPass enum, Frustum + WbBoundingBox + FrustumTestResult, SceneryInstance + EnvCellLandblock, EnvCellVisibilitySnapshot, IndoorCellStencilPipeline.RenderBuildingStencilMask low-level | **KEPT** |
| `f16b8e9` | 2 | EnvCellRenderer (WB EnvCellRenderManager port, 1013 LOC + inline RenderModernMDIInternal) | **KEPT** |
| `aad9ed4` | 2.1 | Shader API: GLSLShader → legacy Shader (Bind→Use, SetUniform→SetInt etc.) | **KEPT** |
| `4b4f687` | 3 | Wire EnvCellRenderer into landblock streaming (BuildInteriorEntitiesForStreaming + FinalizeLandblock + RemoveLandblock) | **KEPT** |
| `f9a644a` | 4 | RenderInsideOutAcdream byte-for-byte WB port in GameWindow.cs | **KEPT** |
| `8532c84` | 5 | Probes — [envcells]/[stencil]/[draworder]/[buildings] | **KEPT** |
| `0fc6003` | 5.1 | Stamp BuildingId on already-loaded cells (richer cellsByCellId dict to BuildingLoader.Build) | **KEPT** |
| `5d41876` | 5.2 | Normalize _buildingRegistries key (`& 0xFFFF0000u`) — RR7.2 saga, landed properly | **KEPT** |
| `9c59910` | 6 | Step 5 gate-off-by-default + GL state restore at cleanup | **KEPT** |
| `9ee42d4` | 7 | Invalidate `_currentVao` / `_currentCullMode` static caches at Render entry | **KEPT** |
| `f143ece` | 8 | Skip line-7200 terrain.Draw when cameraInsideBuilding (avoid double-terrain Z-fight) | **KEPT** |
| `2bf5013` | 9 | Render IndoorPass entities (IsBuildingShell building walls) between Step 3 and Step 4 | **KEPT** |
| `<this commit>` | 10 | **Kill-switch:** ACDREAM_A8_INDOOR_BRANCH gate; default OFF; pre-A8 depth-clear workaround restored when off | **KEPT** |
All ~1,830 LOC of new code remains in tree, accessible for the next-session apparatus to test against.
## Visual-gate chronicle
**Launch 1** (after Waves 1-5 + key/BuildingId fix at `5d41876`)
- `[buildings] camCell=0xA9B40143 camBldgs=[0x7] otherBldgs=109 totalKnown=110`
- `[envcells] cells=18 tris=74932 ourBldgs=1 otherBldgs=109 filterCnt=18`
- `[stencil] op=mark/punch bld=0x7 verts=39`
- `[draworder]` showed full Step 1 → 2 → 3 → 4 → 5{a,b,c,d} cycle
- Probes confirmed the WB pipeline firing — user verification gate triggered.
- User report: **"completely unplayable, textures flickering, can barely move, ACdream crashed especially indoor."**
**Launch 2** (after `9c59910` Step 5 disable + ColorMask restore at cleanup)
- User report: **"still chaos, GPU 100%"**
**Launch 3** (after `9ee42d4` cull-cache invalidation)
- User report: **"Cant see anything, flickering colors, sometimes I see textures and sometimes I see inside, the house is missing lots of walls. 10 FPS."**
**Launch 4** (after `f143ece` skip-double-terrain)
- User report: **"No difference"**
**Launch 5** (after `2bf5013` IndoorPass added)
- User screenshot showed: mostly fog-color screen with a thin black diagonal sliver. "Basically the same. Screen is flickering like this."
- This was the call to ship the kill-switch instead of continuing.
## Root-cause hypotheses tested and falsified
Each of these was a plausible single-cause hypothesis; the visual didn't
improve materially after fixing them. They may be PARTIALLY correct
(some of them ARE bugs the next session needs to keep fixed), but none
were THE cause of the visual chaos.
1. **Step 5 perf / TDR (commit `9c59910`)** — Step 5 was iterating 109
other-buildings/frame with no frustum cull, 545 GL draws/frame, plausible
driver TDR cause. Disabled → "still chaos, GPU 100%." Was a perf risk but
not the visual root cause.
2. **ColorMask alpha-bit off at cleanup (`9c59910`)** — WB exits with
ColorMask(t,t,t,FALSE). Our pipeline uses alpha-to-coverage; subsequent
passes need alpha writes. Fix didn't visually improve anything.
3. **`_currentCullMode` / `_currentVao` stale (`9ee42d4`)** — these are
STATIC caches in EnvCellRenderer; other consumers (dispatcher, terrain
renderer) change actual GL state without updating them, leaving the
cache lying. WB invalidates at Render entry; our port missed that.
Confirmed-real bug, fix didn't visually improve anything.
4. **Double-terrain draw (`f143ece`)** — line 7200 unconditional terrain
+ Step 4 stencil-gated terrain = terrain drawn twice indoor. Was a
perf doubler + Z-fight cause. Fixed; no visual improvement.
5. **Missing IndoorPass (`2bf5013`)** — pre-A8 indoor walls came from
`IsBuildingShell` entities rendered via `Draw(set: IndoorPass)`.
A8's Step 4 calls `Draw(set: OutdoorScenery)` which excludes
shells. EnvCellRenderer provides cell mesh (floor + CellStruct walls)
but landblock-baked exterior wall shells were never drawn. Fixed;
no visual improvement.
## What's STILL likely wrong
The remaining unknowns (any combination could be the visual root cause):
1. **EnvCellRenderer.RenderModernMDIInternal** — the inlined `RenderModernMDI`
extraction from `BaseObjectRenderManager.cs:709-848`. This is single-slot
(vs WB's 3-slot ring) and may have subtle differences from WB. Particularly
suspect: the buffer orphan-and-write pattern, the SSBO bind layout, the
`MemoryBarrier` placement, the `glMultiDrawElementsIndirect` offset
arithmetic. Without a per-draw comparison against WB output, we can't tell.
2. **`InstanceData.CellId` population** — our subagent stored cellId in
`EnvCellSceneryInstance.InstanceId` (a `uint`) because WB's ObjectId
struct wasn't easily portable. Whether `instance.CurrentPreviewCellId != 0
? instance.CurrentPreviewCellId : instance.InstanceId` resolves to the
same value WB expects needs verification — a subtle off-by-one or
wrong-bits here would cause the per-cell filter logic in PrepareRenderBatches
to misclassify which cells go in which group.
3. **Cell-mesh async upload race**`PrepareEnvCellGeomMeshDataAsync` is
fire-and-forget. By the time `Render` runs, `TryGetRenderData(cellGeomId)`
may return null for some cells (mesh not uploaded yet). Render silently
skips those, producing "missing walls until upload completes" — but
the user reports persistent flicker, not transient initial-load black.
4. **Shader uniform leakage between consumers**`_meshShader` is shared
between EnvCellRenderer and WbDrawDispatcher. Each consumer sets
uniforms (`uRenderPass`, `uDrawIDOffset`, `uHighlightColor`,
`uFilterByCell`). They might leak between calls in ways that subtly
affect output. The dispatcher's own `SetInt` calls should overwrite
but ordering matters.
5. **PrepareRenderBatches `filter == null` semantics** — we call
PrepareRenderBatches with `filter = null` at the top of the frame, then
call Render(filter: cellIds) inside Step 3. Per WB EnvCellRenderManager
line 282-285, when PrepareRenderBatches has a non-null filter, it
skips instances whose CellId isn't in the filter. We pass null so it
includes EVERYTHING. Then per-Render(filter:) the filter narrows for
that draw. This is the WB pattern. BUT in our PrepareRenderBatches we
ALWAYS pass null — even WB sometimes passes the visible-cell set into
PrepareRenderBatches. Worth checking if our approach causes the
per-Render filter to silently miss data.
6. **Building shells filter** — IsBuildingShell entities have no
ParentCellId, so they pass the visibleCellIds filter unconditionally
in the IndoorPass call we added. But the dispatcher's `EntityMatchesSet`
logic might filter them OUT in another path. Verify by adding a
`[indoorpass]` probe that counts how many IsBuildingShell entities
actually drew.
## Recommended next-session apparatus
This is the saga that finally needs the apparatus pattern. From
`docs/research/2026-05-23-a6-p3-issue98-comparison-harness-findings.md`
(the #98 saga that took six speculative fixes before the apparatus shipped):
> The right move is to build a deterministic apparatus that captures
> live state, replays it offline, and diffs per-step. Iteration in
> <500ms beats five-minute live-test cycles every time.
For A8, the analog apparatus is:
1. **Frame-replay harness for the indoor branch.** Capture every frame's
inputs to RenderInsideOutAcdream (viewProj, camPos, camera-buildings,
other-buildings, GL state on entry), persist as JSONL. Replay test
in xUnit: instantiate a headless GL context, run RenderInsideOutAcdream
on the captured inputs, snapshot the output framebuffer to PNG, diff
against an "expected" PNG. Five seconds of in-game capture = 300
frames = 300 replayable tests.
2. **Per-step GL state assertion probe.** Extend `EmitDrawOrderProbe` to
also log the full relevant GL state (cull mode, blend mode, polygon
offset, depth range, stencil func/op/mask, color mask). Compare
step-by-step against WB's expected state (lifted from
`references/WorldBuilder/.../VisibilityManager.cs:73-239` line by
line). Any divergence = the bug.
3. **Side-by-side with WB's own renderer.** WB renders Holtburg correctly.
Compile WB's own executable, point it at the same dat, capture a
screenshot from a known camera position inside a Holtburg cottage.
Compare against acdream's screenshot from the same camera. Pixel-diff
tells us how far off we are. If WB's screenshot also looks broken,
the bug is in our dat handling. If WB's looks right, the bug is in
our render-frame integration.
4. **Mesh-data audit.** Add a probe that, when ACDREAM_A8_AUDIT=1,
walks every cell in a camera-building and dumps:
- cellGeomId
- `_meshManager.TryGetRenderData(cellGeomId)` non-null?
- `renderData.Batches.Count`
- `renderData.Batches[i].IndexCount`, `BaseVertex`, `FirstIndex`
- `renderData.Batches[i].CullMode`, `IsTransparent`, `IsAdditive`,
`BindlessTextureHandle`, `TextureIndex`
If batches' IndexCount is 0 or `BindlessTextureHandle` is 0, the mesh
upload is incomplete. If everything is non-zero but visual is wrong,
the bug is in the draw path.
The apparatus pattern means: NO MORE LIVE LAUNCHES until the apparatus
can reproduce the bug deterministically. Five-minute launch cycles
masking the problem are exactly the trap that ate today's session.
## Process retrospective — what the next session should NOT do
These are the patterns from today that drove the session into the saga
trap:
1. **Treating "[probes fire correctly] but [visual broken]" as a small
gap to bridge with one more fix.** That gap was actually a large
integration bug under the surface; each speculative fix wedged
another fragment of behavior in without addressing it.
2. **Pattern-matching old RR7 saga symptoms to today's situation.** The
key/BuildingId fixes WERE real RR7-saga bugs that needed re-fixing.
But the post-key visual breakage was NEW — applying RR7-era fix
templates (ColorMask, depth-clear) without re-deriving from evidence
was lazy.
3. **Trust-but-verify on subagent work.** Both Task 5 (EnvCellRenderer
port) and Task 8 (render-frame integration) were dispatched to
subagents. I verified compile + tests-pass but did NOT line-by-line
read the diff for WB-faithfulness. The flicker bug may be in code I
never actually read.
4. **Counting [draworder] frames as "correctness".** Probe data showed
1,595 indoor frames with valid stencil/cells/buildings outputs. That
counted as "process complete" — but the visual still chaos. The
probes are NECESSARY but NOT SUFFICIENT.
## Files touched
| File | Status |
|---|---|
| `src/AcDream.App/Rendering/Wb/WbRenderPass.cs` | NEW (verbatim) |
| `src/AcDream.App/Rendering/Wb/WbFrustum.cs` | NEW (verbatim + tests) |
| `src/AcDream.App/Rendering/Wb/EnvCellSceneryInstance.cs` | NEW (verbatim with renames + tests) |
| `src/AcDream.App/Rendering/Wb/EnvCellVisibilitySnapshot.cs` | NEW (verbatim narrowed) |
| `src/AcDream.App/Rendering/Wb/EnvCellRenderer.cs` | NEW (1013 LOC, WB EnvCellRenderManager port + tests) |
| `src/AcDream.App/Rendering/Wb/WbMeshAdapter.cs` | MODIFIED (MeshManager accessor) |
| `src/AcDream.App/Rendering/IndoorCellStencilPipeline.cs` | MODIFIED (RenderBuildingStencilMask low-level + probe fields) |
| `src/AcDream.App/Rendering/CellVisibility.cs` | MODIFIED (GetCellsForLandblock for richer stamping dict) |
| `src/AcDream.App/Rendering/GameWindow.cs` | MODIFIED (extensive: ctor wiring, RegisterCell call site, RenderInsideOutAcdream method, probe emitters, kill-switch) |
| `src/AcDream.Core/Rendering/RenderingDiagnostics.cs` | MODIFIED (ProbeEnvCellEnabled flag) |
| `docs/superpowers/plans/2026-05-28-phase-a8-wb-render-inside-out-port.md` | NEW (plan committed at session start) |
| `docs/research/2026-05-28-a8-wb-port-shipped-but-broken-handoff.md` | NEW (this doc) |
## Kill-switch usage
To re-enable the indoor branch for investigation (next session):
```powershell
$env:ACDREAM_A8_INDOOR_BRANCH = "1" # enable the broken indoor branch
$env:ACDREAM_PROBE_VIS = "1" # emit [envcells]/[stencil]/[draworder]/[buildings]
$env:ACDREAM_A8_STEP5 = "1" # optional: enable Step 5 cross-building visibility
```
When `ACDREAM_A8_INDOOR_BRANCH` is unset or != "1":
- `cameraInsideBuilding` is forced false regardless of actual state
- The outdoor branch (`Draw(set: All)`) runs for indoor cells too
- The pre-A8 `if (cameraInsideCell) Clear(DepthBufferBit)` workaround is
restored
- Visual behavior matches pre-A8 baseline
- All A8 code remains in tree (probes silent, indoor branch unreachable)
## Pickup prompt for next session
> Read `docs/research/2026-05-28-a8-wb-port-shipped-but-broken-handoff.md`
> in full. Then read the kill-switch implementation in `GameWindow.cs`
> at the `cameraInsideBuilding` declaration (~line 7110) to see how
> default behavior is restored.
>
> Your mission: BUILD APPARATUS, not more speculative fixes. The
> apparatus options (frame-replay harness / per-step GL state probe /
> WB-renderer side-by-side / mesh-data audit) are listed in the
> "Recommended next-session apparatus" section. Pick one, build it,
> use it.
>
> Do NOT launch the client live with `ACDREAM_A8_INDOOR_BRANCH=1` more
> than ONCE before the apparatus is in place. The session's failure
> mode is exactly the pattern of "one more fix, then test live again"
> that ate six attempts in the #98 saga before the apparatus saved it.
>
> User authorization remains: "no questions, no band-aids." The
> apparatus build is NEITHER — it's the proper investigation path.

View file

@ -0,0 +1,392 @@
# Phase A8.F — camera-collision root cause + handoff (2026-05-29, session 2)
## TL;DR
The A8.F "flap" (building walls / ground blinking in and out) and the
missing/transparent-wall symptoms are **not** a builder or enforcement bug.
Their root cause is the **3rd-person camera EYE passing through walls**: the
A8.F renderer computes "am I inside a building?" and "which side of each
doorway am I on?" from the camera eye position, so when the eye clips a wall
those decisions flip frame-to-frame.
This session: (1) **fixed + committed** the worst symptom — the cellar terrain
flood (commit `9417d3c`); (2) established the recursive-clip **builder actually
works** for most cells (the prior "Bug B" framing was wrong); (3) **reframed the
root cause** to the camera, with the user's help; (4) researched the camera
system (Opus agent + my verification).
**The fix is retail-faithful (CORRECTED 2026-05-29 PM):** retail's camera **does**
avoid walls — `SmartBox::update_viewer` (0x00453ce0) sweeps a 0.3 m collision
sphere (`viewer_sphere`) from the player head-pivot to the desired eye via
`CTransition::find_valid_position` and uses the stopped position; it **also** fades
the player to translucent when that eye is super close (`CameraSet::UpdateCamera`).
Both are retail-faithful, and acdream already owns the `Transition` swept-sphere
engine to port the collision (the fade is already ported). So the next session
implements a faithful swept-sphere camera collision — **no divergence, no sign-off
needed.** (An earlier pass in this doc wrongly said "retail has no camera
collision"; that was a research error — see the KEY FINDING section.)
---
## What this session shipped / established
- **Bug A FIXED + committed (`9417d3c`):** an empty `OutsideView` while inside a
building now draws **no** outdoor terrain/scenery (empty mask = "no outdoors
visible," not "all outdoors"). Previously the `else` branch disabled the
stencil and flooded ungated terrain over the cell interior. **Visual-confirmed
by the user: cottage cellar walls are solid now, no terrain bleed-through.**
App tests 108/108. All behind `ACDREAM_A8_INDOOR_BRANCH=1`; default play
unaffected.
- **"Bug B" (builder under-produces) is substantially NOT real.** The pv-dump
census showed `PortalVisibilityBuilder` produces correctly *narrowed*
`OutsideView` regions for most cells (`0172/0173/0162/015E/0165/016F`
`polys=1`). The empty cases are mostly legitimate (a windowless cellar can't
see daylight) or driven by the camera being in an invalid position (see root
cause). The handoff predecessor's Finding 2 ("never narrows") does not hold.
- **Root cause reframed → the camera.** Confirmed below.
- **Camera research done** (Opus agent, findings verified against the decomp +
the code by this session).
### Diagnostics added (committed in `9417d3c`, all opt-in)
- `PortalVisibilityBuilder.Build` now emits a **`CAMPORTAL[i]` census** under
`ACDREAM_A8_DUMP_PV=1`: per camera-cell portal, before the BFS guards, it logs
`other=`, `polyLen=`, `hasPlane=`, `interiorSide=`, `planeN=`. This is what let
us see that the cottage front-door exit portal was *culled by the side test*
(`interiorSide=False`) rather than missing.
- `[opaque]` probe (`ACDREAM_PROBE_ENVCELL=1`) — opaque cell-render stats
(`cells`/`tris`) **before** the transparent loop overwrites them (the existing
`[envcells]` line reads post-loop and is misleading — ignore its `cells=1
tris=0`).
- `tools/A8CellAudit` `portals <cellId>` now **replicates `BuildLoadedCell`'s
polygon-vertex resolution** and prints `BUILDER_SEES=OK/EMPTY` per portal, so
exit-portal validity is checkable offline without a launch.
---
## The visual symptoms (user-observed, 2026-05-29, with `ACDREAM_A8_INDOOR_BRANCH=1`)
1. Cellar walls **solid** (Bug A fix confirmed). ✓
2. **Flap**: buildings/ground disappear when passing from inside to outside.
3. Cellar entrance no longer covered by terrain (Bug A fix). ✓
4. Looking into certain windows from **outside**: back walls missing.
5. Inside **other** buildings: external + internal walls transparent (showing
sky / NPCs / particles).
**The user's key correlation:** items 2/4/5 happen **when the camera passes
through a door, or when standing inside and panning the camera through a wall**.
They're intermittent — not always reproducible — which fits a camera-position
trigger, not a static rendering bug.
---
## Root cause: the camera eye drives A8.F visibility, and it clips through walls
`camPos` is the **camera eye**, extracted from the inverse of the active
camera's View matrix:
```
GameWindow.cs:7270-7271
Matrix4x4.Invert(camera.View, out var invView);
var camPos = new Vector3(invView.M41, invView.M42, invView.M43); // = the eye
```
(`camera.View` is `CreateLookAt(_dampedEye, …)`, so the translation is the
chase-camera eye — **not** the player avatar position, which is tracked
separately for interior lighting at `GameWindow.cs:7415-7417`.)
That eye then drives all three A8.F visibility decisions:
1. **Camera-cell + portal BFS**`_cellVisibility.ComputeVisibility(camPos)`
(`GameWindow.cs:7323`) → `FindCameraCell` via `PointInCell(camPos, …)`. The
BFS portal side-test (`CellVisibility.cs:466-481`) culls portals the eye is on
the wrong side of.
2. **Strict inside-building gate** (`GameWindow.cs:7343-7346`):
`cameraInsideBuilding = a8IndoorBranchEnabled && PointInCell(camPos,
CameraCell) && CameraCell.BuildingId != null`. Picks the inside-out vs
outside-in render branch.
3. **Per-portal interior-side cull in the recursive-clip builder**
`PortalVisibilityBuilder.CameraOnInteriorSide(cell, i, cameraPos)`
(`PortalVisibilityBuilder.cs:196-203`; cull site ~`:124`):
transforms `cameraPos` to cell-local and dot-tests against the portal plane.
**The flap mechanism:** when the eye damps to a position outside the room (or in
the next room), `PointInCell(eye)` flips and `CameraOnInteriorSide` inverts.
The camera-cell ping-pongs, the inside/outside branch switches, and the exit
portal through a doorway is culled-then-uncovered frame-to-frame → walls/ground
blink. The 3-frame `CellSwitchGraceFrameCount` hysteresis (`CellVisibility.cs:167`)
only masks single-frame blips; a sustained multi-frame clip defeats it.
**Hard evidence (this session's capture):** at cottage cell `0xA9B40170` the
front-door exit portal was valid (`polyLen=4`) but the census showed
`interiorSide=False` — i.e. the eye was on the *outdoor* side of the door plane
`(0,-0.995,0.105)`. The plane math puts the cull threshold at eye Y < 8.50 with
the door at Y≈8.5: **the eye had poked out through the front wall while the
player stood inside.** The same portal projected fine when the BFS reached
`0170` from a deeper room (`0173`) where the eye was well inside.
---
## KEY FINDING: retail DOES collide the camera (swept sphere) — plus the close-up fade
Retail avoids walls via a **swept-sphere spring arm**, then fades the player when
very close. Three stages — an earlier research pass conflated stages 1+3 and missed
stage 2, producing a wrong "no collision" conclusion that this section corrects.
- **Stage 1 — compute the *desired* eye (no collision here).**
`CameraManager::UpdateCamera` (0x00456660, decomp `:95505-95953`) computes
eye = `pivot + viewer_offset`, damps it (`Frame::interpolate_origin`, `:95922`),
and stores it as `SmartBox::viewer_sought_position` (a `Position`,
`acclient.h:35196`). This *producer* does no raycast — which is ALL the earlier
pass read, hence its wrong conclusion.
- **Stage 2 — collide the desired eye (the camera pull-in the user observes).**
`SmartBox::update_viewer` (0x00453ce0, decomp `:92761-92892`) — the *consumer* of
`viewer_sought_position` — runs it through a swept-sphere `CTransition`:
`makeTransition``init_object(player, 0x5c)``init_sphere(1, &viewer_sphere, 1f)`
`init_path(cell, pivot, sought)`**`find_valid_position`** → on success
`set_viewer(sphere_path.curr_pos)` (the STOPPED position). Fallbacks:
`CPhysicsObj::AdjustPosition`, then snap to the player's position. `viewer_sphere`
is a global `CSphere`, **radius 0.3 m**, center (0,0,0) (decomp `:93308-93314`,
`:1144645`).
- **Stage 3 — fade the player when the (collided) eye is super close.**
`CameraSet::UpdateCamera` (0x00458ae0) calls
`CPhysicsObj::SetTranslucencyHierarchical(player, …)` (decomp `:97679/97698/97725/97737`)
— opaque at ≥0.45 m, transparent at ≤0.20 m. **acdream already ports this** as
`RetailChaseCamera.ComputeTranslucency` (`RetailChaseCamera.cs:367-376`).
**Verification (mine, this session):** confirmed `viewer` / `viewer_sought_position`
are `Position` structs (`acclient.h:35193/35196`); read `SmartBox::update_viewer` in
full and confirmed the `CTransition` swept-sphere from pivot→sought + the
`set_viewer(sphere_path.curr_pos)` use of the collided position; confirmed
`viewer_sphere.radius = 0.300000012f` (`:93314`); confirmed the player-fade call
sites in `CameraSet::UpdateCamera`. The earlier "no collision" finding was **wrong**
it traced the *producer* (`CameraManager::UpdateCamera`) but not the *consumer*
(`update_viewer`), where the collision lives. **Caught by the user**, who has played
retail and observed the camera pulling in at walls. (Lesson worth keeping: when the
decomp says "no X" but a domain expert says X exists, trace the consumer of the
computed value, not just the producer.)
**Implication:** a swept-sphere camera collision is the **retail-faithful** fix, NOT a
divergence. No special divergence sign-off is needed; this is a straight port, and
acdream already owns the swept-sphere machinery.
---
## acdream's current camera (file:line)
All under `src/AcDream.App/Rendering/` unless noted.
- **Two chase cameras, toggled per-frame:** legacy `ChaseCamera.cs` (rigid) and
default `RetailChaseCamera.cs` (damped). Selected by
`CameraController.Active` (`CameraController.cs:20-33`) via
`CameraDiagnostics.UseRetailChaseCamera` (default ON,
`src/AcDream.Core/Rendering/CameraDiagnostics.cs`).
- **Eye computation (`RetailChaseCamera.Update`, `:86-142`):**
`pivotWorld = playerPos + (0,0,1.5)`; `targetEye = pivotWorld +
forward*(-Distance·cosPitch) + up*(Distance·sinPitch)` (`:113-117`); then
exponential damping `_dampedEye = Lerp(_dampedEye, targetEye, alpha)` (`:121-133`);
published as `Position` + `View = CreateLookAt(_dampedEye, …)` (`:136-137`).
**No geometry test anywhere between target and publish.**
- **The "input-lag for turning/jumping" port** = the RetailChaseCamera's three
smoothing mechanisms, all verified faithful to the decomp:
- Exponential damping (follow-lag): `:121-133` + `ComputeDampingAlpha:323-329`
↔ retail `:95866-95923` (`stiffness*dt*10` clamped).
- 5-frame velocity-averaged, slope-aligned heading (the jump/slope feel):
`:94-107`, `:290-314`, `ComputeHeading:211-257` ↔ retail `old_velocities[5]`
`:95644-95677`.
- Mouse low-pass filter (input lag on flicks): `FilterMouseDelta:164-177` +
`FilterMouseAxis:340-358` ↔ retail `CameraSet::FilterMouseInput` (0x00457530,
`:96250-96279`); wired at `GameWindow.cs:1171-1199`.
- **Constants** (from retail `CameraSet::SetDefaultOffsets` 0x00458F80,
`:97916-97967`): `PivotHeight=1.5`, `Distance=2.61` (=|(0,2.5,0.75)|),
`Pitch=0.291 rad`, distance clamp `[2,40]`, pitch clamp `[0.7,1.4]`. **No
collision-radius constant exists** (no collision is done).
- **No 1st-person mode** (retail `SetInHead` 0x00458CE0 unported).
- Spec: `docs/superpowers/specs/2026-05-18-retail-chase-camera-design.md`
explicitly scopes collision OUT (`:454-457`: "we don't attempt 'camera
collides with wall' — same as retail").
---
## Integration: acdream already has the collision machinery
A camera "spring-arm" sweep can reuse the player's existing swept-sphere engine
(`src/AcDream.Core/Physics/`):
- **`PhysicsEngine.ResolveWithTransition(curPos, targetPos, cellId, radius,
height, stepUp, stepDown, isOnGround, body, moverFlags, entityId)`**
(`PhysicsEngine.cs:589`) → `Transition.FindTransitionalPosition`
(`TransitionTypes.cs:653`) → per sub-step `FindEnvCollisions`
(`:1933`, indoor branch fetches `GetCellStruct(cellId)``cellPhysics.BSP.Root`
`BSPQuery.FindCollisions`).
- **`BSPQuery`** primitives (`BSPQuery.cs`): `PointInsideCellBsp` (`:1034`),
`SphereIntersectsCellBsp` (`:1077`), `FindCollisions` (`:1637`),
`SphereIntersectsPoly` (`:2085`). A purpose-built pivot→eye cast off these is
likely a **better fit** than the full `Transition` (which carries unwanted
step-up / walkable / gravity semantics).
- **`CellPhysics`** (`PhysicsDataCache.cs:511-564`): per-cell `BSP` (collision),
`CellBSP` (point-in-cell), polys/planes, transforms. Fetched via
`GetCellStruct(cellId)`.
- Player sweep params for reference: `radius 0.48`, `height 1.2`
(`PlayerMovementController.cs:1107-1108`). A camera probe wants a **small**
radius (e.g. `PhysicsGlobals.DummySphereRadius = 0.1`), `height 0` (single
sphere), `isOnGround:false`, `body:null` (no contact-plane persistence).
- **Slot-in point:** retail collides *after* damping (`CameraManager::UpdateCamera`
damps → `viewer_sought_position`; `SmartBox::update_viewer` then collides it). So in
acdream, sweep from `pivotWorld` (`RetailChaseCamera.cs:113`) to the **damped** eye
and replace the published eye — i.e. **after `:131` (damp), before `:136`
(publish)**. This requires injecting a collision probe into `RetailChaseCamera`
(currently GL-free, no engine ref); App→Core is the allowed dependency direction, so
inject `PhysicsEngine` or a narrow `ICameraCollisionProbe` interface. Check whether
acdream has a `find_valid_position` equivalent (retail uses that, not the movement
sweep) or whether `FindTransitionalPosition` must be adapted.
---
## The fix: a retail-faithful swept-sphere camera collision (no divergence)
Port retail's stage-2 collision (stage 1 = the damped desired eye, and stage 3 =
the close-up fade, are both already in acdream):
1. Keep the existing damped desired-eye computation (`RetailChaseCamera`).
2. **Add the swept-sphere collision** (the missing piece): sweep a 0.3 m sphere from
the head-pivot to the *damped* eye via acdream's `Transition` swept-sphere, and
publish the **stopped** position as the camera eye. Mirror retail's fallbacks
(`AdjustPosition`, then snap to the player) when no valid spot is found.
3. Verify the already-ported player fade (`ComputeTranslucency`) still triggers once
the eye stops clipping (it will fade *less* — the eye now stays out of walls).
This fixes both the render (eye no longer behind walls) **and** the A8.F visibility
(stable camera-cell + side-tests, since the eye stays in valid space). Because it
matches retail, **no divergence sign-off is needed**. Per roadmap discipline, file it
as a phase before coding (a short brainstorm to confirm scope is fine, but the
behavior question is settled: this is what retail does).
---
## Open design questions for the implementation session
1. **`find_valid_position` equivalent:** retail uses `CTransition::find_valid_position`
(place/validate a sphere, not the movement sweep). Confirm acdream has an
equivalent, or adapt `Transition.FindTransitionalPosition` / a `BSPQuery`-level
cast. (Behavior question is settled — this is the API question.)
2. **Sweep radius:** retail's `viewer_sphere` is **0.3 m** — start there to stay
faithful; tune only if the eye hugs walls / near-plane clips in tight rooms.
3. **Which primitive:** retail uses the full `CTransition` (`init_object` on the
player + `find_valid_position`). A purpose-built `BSPQuery.FindCollisions`
ray/sphere cast may be a leaner fit but diverges from retail's exact call —
prefer matching retail's `Transition` path first.
4. **Indoor vs outdoor geometry:** indoor walls are in `CellPhysics.BSP` (per
cell). **Cottage exterior shells live in landblock-baked GfxObjs**, not cells
(cf. issue #98/#101 — cottage floors/walls in GfxObj `0x01000A2B` etc.). A
cell-BSP-only sweep fixes the indoor case but misses outdoor shells (the
`FindObjCollisions` / ShadowEntry path would be needed for those). Decide scope.
5. **1st-person fallback:** none today; a spring arm must no-op at distance 0 if
1st-person is added.
6. **Interaction with the ported translucency fade** (`ComputeTranslucency`) —
verify the fade still behaves once the eye stops clipping.
7. **Pivot reference:** sweep from `pivotWorld` (head, `:113`), not the feet.
---
## Apparatus / diagnostics (committed `9417d3c`; opt-in)
Launch (PowerShell), then walk `+Acdream` into a Holtburg cottage:
```powershell
$env:ACDREAM_DAT_DIR="$env:USERPROFILE\Documents\Asheron's Call"; $env:ACDREAM_LIVE="1"
$env:ACDREAM_TEST_HOST="127.0.0.1"; $env:ACDREAM_TEST_PORT="9000"
$env:ACDREAM_TEST_USER="testaccount"; $env:ACDREAM_TEST_PASS="testpassword"
$env:ACDREAM_A8_INDOOR_BRANCH="1"; $env:ACDREAM_A8_DUMP_PV="1"; $env:ACDREAM_PROBE_ENVCELL="1"
dotnet run --project src\AcDream.App\AcDream.App.csproj --no-build -c Debug 2>&1 | Tee-Object -FilePath "a8f.log"
```
- `ACDREAM_A8_DUMP_PV=1``[pv-dump]` per camera cell incl. the **`CAMPORTAL[i]`
census** (polyLen + interiorSide per portal, before the guards) + the
EXIT-CULLED/PROJ/CLIP trace + `OUTSIDEVIEW polys=N`.
- `ACDREAM_PROBE_ENVCELL=1``[opaque]` cell-render stats (reliable; pre-loop).
- `ACDREAM_PROBE_VIS=1``[buildings]`/`[draworder]`/`[stencil]`/`[envcells]`
(heavy: ~17 GL queries × several calls per frame; keep captures short).
- `tools/A8CellAudit portals <cellId>` — offline portal dump with
`BUILDER_SEES=OK/EMPTY` per portal.
**A future "camera-cell ≠ player-cell" probe** would directly confirm the
flap-by-frame: log when `PointInCell(eye)` disagrees with the player's cell.
---
## Current state / safety
- **Default game safe** — everything gated behind `ACDREAM_A8_INDOOR_BRANCH=1`.
- Bug A fix + diagnostics committed (`9417d3c`); App tests 108/108.
- Builder works (not the bug); camera-collision work **not started** (awaiting the
design decision).
- Tree clean as of `9417d3c` plus untracked launch logs (`a8f-*.log`).
---
## Pickup prompt
> Read `docs/research/2026-05-29-a8f-camera-collision-handoff.md`. The A8.F flap /
> missing-walls are caused by the 3rd-person camera EYE clipping through walls and
> destabilizing the camera-cell + portal side-tests (the eye drives `PointInCell` /
> `CameraOnInteriorSide` via `camPos` at GameWindow.cs:7271). Bug A (cellar terrain
> flood) is already fixed + committed (`9417d3c`); the recursive-clip builder works.
> The fix is **retail-faithful** (verified against the decomp): port retail's
> swept-sphere camera collision — `SmartBox::update_viewer` (0x00453ce0) sweeps a
> 0.3 m `viewer_sphere` via `CTransition::find_valid_position` from the head-pivot to
> the (damped) eye and uses the **stopped** position; the close-up player fade is
> already ported (`RetailChaseCamera.ComputeTranslucency`). acdream already owns the
> `Transition` swept-sphere engine. Slot the sweep in after `RetailChaseCamera.cs:131`
> (damp), before `:136` (publish): sweep `pivotWorld``_dampedEye`, radius 0.3 m, with
> retail's fallbacks (`AdjustPosition`, then snap to player). Check whether acdream has
> a `find_valid_position` equivalent or must adapt `FindTransitionalPosition`. Watch
> the open questions (outdoor building shells live in GfxObjs not cells; 1st-person;
> the fade interaction). A short brainstorm to confirm scope is fine; then file a
> roadmap phase. No divergence sign-off needed — this is what retail does.
---
## Reference index
**acdream code** (`src/…`):
- `AcDream.App/Rendering/RetailChaseCamera.cs` — eye `:113-137`, damping
`:121-133`/`:323-329`, heading `:211-257`, mouse filter `:340-358`,
translucency `:367-376`.
- `AcDream.App/Rendering/ChaseCamera.cs` (legacy), `CameraController.cs:20-33`,
`ICamera.cs`.
- `AcDream.Core/Rendering/CameraDiagnostics.cs``UseRetailChaseCamera`, tunables.
- `AcDream.App/Rendering/GameWindow.cs` — eye `:7270-7271`, visibility `:7323`,
`cameraInsideBuilding` `:7343-7346`, camera updates `:6851`/`:6862`, mouse-filter
`:1171-1199`.
- `AcDream.App/Rendering/PortalVisibilityBuilder.cs``CameraOnInteriorSide`
`:196-203`, cull `:124`, CAMPORTAL census in `Build`.
- `AcDream.App/Rendering/CellVisibility.cs``ComputeVisibility` `:272`,
`PointInCell` `:367`, BFS side-test `:466-481`, grace `:167`.
- `AcDream.Core/Physics/``PhysicsEngine.cs:589`, `TransitionTypes.cs:653/1933`,
`BSPQuery.cs:1034/1077/1637`, `PhysicsDataCache.cs:511-564`,
`PlayerMovementController.cs:1107-1108`.
- Spec: `docs/superpowers/specs/2026-05-18-retail-chase-camera-design.md` (collision
out-of-scope `:454-457`).
**Retail decomp** (`docs/research/named-retail/acclient_2013_pseudo_c.txt`):
- `CameraManager::UpdateCamera` @ 0x00456660 (`:95505-95953`) — computes the
*desired* eye (no collision here); damping `:95866-95923`; velocity ring
`:95644-95677`; result stored as `viewer_sought_position`.
- `SmartBox::update_viewer` @ 0x00453ce0 (`:92761-92892`) — **THE camera collision**:
swept `viewer_sphere` via `CTransition` (`init_sphere`/`init_path`/`find_valid_position`)
pivot→sought; `set_viewer(sphere_path.curr_pos)`; fallbacks `AdjustPosition` then snap
to player.
- `viewer_sphere` — global `CSphere`, radius 0.3 m, center (0,0,0) (`:93308-93314`;
decl `:1144645`).
- `CameraSet::UpdateCamera` @ 0x00458AE0 (`:97643-97742`) — player-fade
`SetTranslucencyHierarchical` `:97679/97698/97725/97737`.
- `CameraSet::FilterMouseInput` @ 0x00457530 (`:96250-96279`).
- `CameraSet::SetDefaultOffsets` @ 0x00458F80 (`:97916-97967`) — pivot (0,0,1.5),
viewer (0,2.5,0.75).
- `SmartBox::PlayerPhysicsUpdatedCallback` @ 0x00452d60 (`:91842`) — writes the
damped desired eye to `viewer_sought_position` (collision happens later, in
`update_viewer`).
- `CameraManager` struct `acclient.h:35238-35263` — no collision fields (the
collision lives in `update_viewer`, not `CameraManager`); `viewer` /
`viewer_sought_position` are `Position`s (`acclient.h:35193/35196`).

View file

@ -0,0 +1,191 @@
# Phase A8.F — visual-gate failure + pickup handoff (2026-05-29)
## TL;DR
The retail portal-frame visibility port (**Phase A8.F**) shipped as code (Tasks 08,
committed) but **FAILED its visual gate**. With `ACDREAM_A8_INDOOR_BRANCH=1`, indoor
and outside-in rendering is broadly broken: cottage/cellar interiors are "covered in
outdoor terrain / transparent walls," and walls are invisible in other houses from
both inside and outside.
**The default game is UNAFFECTED.** `cameraInsideBuilding = a8IndoorBranchEnabled &&
(inside a building)` (GameWindow.cs:7343), so `RenderInsideOutAcdream` only runs with
the opt-in env var. Without it, rendering is the pre-A8 path (walls render; only the
old cellar flap remains). **Do not panic — normal play is fine; the A8.F branch is the
broken opt-in.**
The work is committed (not reverted): the GL-free CPU layer is solid and unit-tested;
the **integration** (CPU-built clipped NDC mask → stencil-gate all outdoor terrain/
scenery) is what fails at runtime. This doc has the root-cause analysis, the apparatus,
and a pickup prompt.
## What was built (the A8.F port)
Spec: [`docs/superpowers/specs/2026-05-29-phase-a8f-portal-frame-visibility-design.md`](../superpowers/specs/2026-05-29-phase-a8f-portal-frame-visibility-design.md)
Plan: [`docs/superpowers/plans/2026-05-29-phase-a8f-portal-frame-visibility.md`](../superpowers/plans/2026-05-29-phase-a8f-portal-frame-visibility.md)
Idea: port retail's `PView` recursive portal-clip (`ConstructView`/`ClipPortals`/
`GetClip`) — WB has NO such recursion, so the flat WB stencil can't fix the cellar flap;
retail clips each portal to its portal chain. We built a GL-free CPU builder that walks
the portal graph and produces `OutsideView` (a screen-space NDC region = exit portals
recursively clipped), then stencil-gate outdoor terrain/scenery to it.
Commits (on `claude/strange-albattani-3fc83c`, after baseline `5dc4140`):
- `bb903bc` Task 0 — strip ACDREAM_A8_DIAG_* flags.
- `406307e` Task 1 — ViewPolygon + CellView (GL-free data model). Unit-tested.
- `7f46c27` Task 2 — ScreenPolygonClip (Sutherland-Hodgman convex intersection). Unit-tested.
- `a28a176` + `9ec8330` Task 3 — PortalProjection (NDC + near-plane clip). Unit-tested.
(A near-plane bug was caught + fixed during impl: `w>=WEps``w+z>=0`.)
- `0ed462c` + `270c21f` Task 4 — PortalVisibilityBuilder (the BFS). Unit-tested.
(Known dungeon-scaling fast-follow filed as **issue #102**.)
- `d12892b` + `08f6a0c` + `d581f4c` Task 5 — IndoorCellStencilPipeline.MarkAndPunchNdc.
- `9e2eb90` Task 6 — RenderInsideOut rewrite: builder-driven mask + **Job-A/B decouple**.
- `1c02a01` + `5a012c0` Task 7 — wire-in #2 per-cell translucent clip on stencil bit 2.
(A DepthFunc-leak bug was caught + fixed by code review.)
- `e0051e0` + `452ee5b` Task 8 — wire-in #3 cross-building (ungated Step 5, clipped bit-1).
- `7c3ee43` — triage apparatus (this debugging session; see below).
All `dotnet build` + `dotnet test` green throughout (App baseline 108).
## The visual-gate failure — symptoms
With `ACDREAM_A8_INDOOR_BRANCH=1` at Holtburg cottages (camera = `+Acdream`):
1. Outside→in (looking into a cottage from outside): cellar entrance looked correct.
2. Inside the cellar: **covered in outdoor terrain; walls transparent (see-through).** Passable (render-only).
3. Looking out from inside (toward a window): looked roughly normal.
4. Passing inside→out: **buildings + ground disappear; only server-spawned things
(doors/NPCs/particles) remain.**
5. **Invisible walls in OTHER houses, both from inside and outside.**
## Root-cause analysis (evidence-based; see apparatus below)
**Finding 1 — the cell walls DO render.** `[opaque]` probe (opaque cell-render stats,
captured BEFORE the per-cell transparent loop overwrites them): `cells=7 tris=50/60`,
`cells=25 tris=108` in occupied cottage cells. `tris=0` only in transient frustum-culled
frames. So "transparent walls" is **NOT** walls failing to render — it's terrain drawn
*over* them. (NOTE: the older `[envcells]` probe reads stats AFTER the transparent loop,
so its `cells=1 tris=0` is a misleading artifact — ignore it.)
**Finding 2 — `OutsideView` is frequently EMPTY, and when non-empty it doesn't narrow.**
`[pv-dump] OUTSIDEVIEW polys=N`: `polys=0` in the majority of frames; `polys=1` sometimes.
When non-empty, the clipped region ≈ the full source window (e.g. from the cellar, the
`0xA9B40170` window passes through ~unclipped, not narrowed to the stairwell sliver). So
the recursive-clip — the entire point of A8.F — is **not constraining at runtime**.
**Finding 3 — projection/clip MATH is correct; the builder under-produces.** When a
`[pv-dump] EXIT` line fires, the local quad → NDC → clipped chain is sane (window quad
`local=[(5.55,-8.61,0)(7.45,-8.61,0)(7.45,-8.35,2.5)(5.55,-8.35,2.5)]` → reasonable NDC →
clipped region). The `viewProj` is a valid System.Numerics row-vector `view*proj`
(`M33≈M34` because far≫near makes `proj.M33≈-1`; `M44` varies with camera, expected).
`ProjectToNdc` matches the GPU convention (verified algebraically: `Vector4.Transform(v,M)`
== GPU `M*v` for transpose=false upload). **Projection is not the bug.** The bug is the
builder yielding empty/too-wide regions for most real camera positions — the exit-portal
clip produces empty (needs deeper trace: portal-side cull? FullScreen-clip producing
empty? BFS not reaching exit portals from most positions?).
**Finding 4 — the Job-A/B decoupling floods terrain when `OutsideView` is empty (the
proximate cause of "transparent walls").** Task 6 made Step-4 terrain/scenery draw
UNCONDITIONALLY, with only the stencil *state* gated. When `OutsideView` is empty
(`didInsideStencil=false`), the `else` branch **disables the stencil and draws terrain
ungated** (GameWindow.cs ~11142). Combined with Finding 2 (empty most frames), terrain
floods over the (rendered) cell interior → "covered in terrain / transparent walls."
This is exactly the Opus Task-6 code-review **Minor #2** risk, realized at scale.
**Why WB doesn't hit this but we do:** in WB, `didInsideStencil = "inside a building"`
(always true indoors, because it marks the whole building's exit-portal set, which is
non-empty). WB never has the "inside + empty mask" case. Our builder produces empty masks
frequently, so the `else` branch (which WB effectively never exercises with an empty mask)
floods. The CPU-NDC-recursive-clip mask is far more fragile at runtime than WB's flat
building mask.
## The two compounding root causes (summary)
1. **`OutsideView` builder under-produces at runtime** — empty most frames; never narrows
recursively. (Builder/clip integration with real geometry; not the projection math.)
2. **Empty-`OutsideView` → ungated terrain flood** — the Job-A/B decoupling's `else` branch
draws terrain everywhere when the mask is empty, painting over the cell interior.
## Concrete first-fix hypothesis (try this first next session)
The `else` branch is wrong: **an empty `OutsideView` means "no outdoors visible from
here," not "all outdoors visible."** When inside a building with an empty mask, draw NO
outdoor terrain/scenery (or fall back to the pre-A8 "depth-clear-when-inside" behavior),
rather than ungated terrain. That alone should stop the flooding (walls become solid;
you temporarily lose terrain-through-portal until the builder is fixed, but the interior
renders correctly). This decouples the two bugs so each can be fixed independently.
Then separately debug Finding 2 (why the builder yields empty/too-wide regions) — the
`[pv-dump]` apparatus already traces local→NDC→clipped; extend it to log the side-test
result and the per-stage vert counts for ALL exit portals (the current dump's EXIT-CULLED/
EXIT-PROJ/EXIT-CLIP lines do this — read them across many frames to see which gate kills
the portals when `polys=0`).
## The architectural question (escalate to the human before a big rewrite)
Is "CPU-build a recursively-clipped NDC region + stencil-gate ALL outdoor terrain/scenery
to it" viable in acdream's pipeline, or is it too fragile (Finding 2)? Options:
- (a) Fix the builder + the else-branch (incremental; the first-fix hypothesis above).
- (b) Reconsider enforcement — e.g., port retail's per-cell screen-space scissor more
literally, or keep WB's flat building mask (accept the cellar flap) and special-case
only the cellar. The user explicitly chose the faithful retail port (option A) at
brainstorm; revisit only if (a) proves intractable.
## Safety / current state
- **Default game safe**: indoor branch gated behind `ACDREAM_A8_INDOOR_BRANCH=1`
(`cameraInsideBuilding = a8IndoorBranchEnabled && inside`, GameWindow.cs:7343).
- Work is **committed, not reverted** (CPU layer is good; integration needs fixing).
- The old cellar flap (the original M1.5 blocker) is **still present** in the default
(pre-A8) path — A8.F did not fix it.
- Tree clean as of `7c3ee43`.
## Apparatus (all env-gated; require `ACDREAM_A8_INDOOR_BRANCH=1` to reach the code)
- `ACDREAM_A8_DUMP_PV=1``[pv-dump]` lines: per camera cell, the exit-portal
local→NDC→clipped geometry (EXIT-CULLED / EXIT-PROJ / EXIT-CLIP / EXIT) + `OUTSIDEVIEW
polys=N`. First 2 Build calls per distinct camera cell. (PortalVisibilityBuilder.cs.)
- `ACDREAM_PROBE_ENVCELL=1``[opaque]` line: opaque cell-render stats (cells/tris)
BEFORE the transparent loop overwrites `_envCellRenderer.Stats`. One-shot per camera
cell. (GameWindow.cs, after the Step-3 opaque render.)
- `ACDREAM_PROBE_VIS=1``[buildings]`/`[draworder]`/`[stencil]`/`[envcells]` (existing).
NOTE `[envcells]` is post-transparent-loop (misleading); `[stencil] verts` reflects the
OutsideView triangle count.
- `tools/A8CellAudit` — offline cell/portal dumper (`portals <cellId>` / `buildings <lb> <radius>`).
Launch (PowerShell), then walk `+Acdream` into a Holtburg cottage ground floor + cellar:
```powershell
$env:ACDREAM_DAT_DIR="$env:USERPROFILE\Documents\Asheron's Call"; $env:ACDREAM_LIVE="1"
$env:ACDREAM_TEST_HOST="127.0.0.1"; $env:ACDREAM_TEST_PORT="9000"
$env:ACDREAM_TEST_USER="testaccount"; $env:ACDREAM_TEST_PASS="testpassword"
$env:ACDREAM_A8_INDOOR_BRANCH="1"; $env:ACDREAM_A8_DUMP_PV="1"; $env:ACDREAM_PROBE_ENVCELL="1"
dotnet run --project src\AcDream.App\AcDream.App.csproj --no-build -c Debug 2>&1 | Tee-Object -FilePath "a8f.log"
```
Cottage cells: `0xA9B40170` (ground floor, has window exit portal), `0xA9B40171` (cellar),
`0xA9B40174/175` (cellar rooms), building `0xA`. Inn vestibule: `0xA9B40164/162`.
## Code anchors
- `src/AcDream.App/Rendering/PortalVisibilityBuilder.cs` — the builder (Finding 2 lives here).
- `src/AcDream.App/Rendering/GameWindow.cs``RenderInsideOutAcdream` (~11012);
Step-4 `else` ungated-terrain branch (~11142, Finding 4); call-site gate (~7343, 7636).
- `src/AcDream.App/Rendering/IndoorCellStencilPipeline.cs` — MarkAndPunchNdc + bit-2 helpers.
- `src/AcDream.App/Rendering/PortalProjection.cs` / `ScreenPolygonClip.cs` / `PortalView.cs` — CPU layer (correct).
- `references/WorldBuilder/.../VisibilityManager.cs:73-239` — the WB reference (flat, no recursion).
- Retail oracle: `docs/research/named-retail/acclient_2013_pseudo_c.txt``PView::ConstructView` 433750, `ClipPortals` 433572, `GetClip` 432344.
## Pickup prompt
> Read `docs/research/2026-05-29-a8f-visual-gate-failure-handoff.md` and pick up the A8.F
> debugging. The default game is SAFE (indoor branch gated behind ACDREAM_A8_INDOOR_BRANCH).
> Use `superpowers:systematic-debugging`. Two compounding root causes are documented:
> (1) the OutsideView builder under-produces (empty most frames, never narrows); (2) the
> Job-A/B decoupling floods ungated terrain when OutsideView is empty. **Start with the
> first-fix hypothesis**: make the empty-OutsideView case draw NO outdoor terrain/scenery
> when inside (an empty mask = "no outdoors visible," not "all outdoors"), to stop the
> terrain-over-walls flood and isolate the two bugs. Verify via the apparatus
> (ACDREAM_A8_DUMP_PV / ACDREAM_PROBE_ENVCELL) — read the EXIT-CULLED/PROJ/CLIP lines across
> frames to learn which gate kills the exit portals when polys=0. Then fix the builder.
> If the builder proves intractable, escalate the architectural question (handoff §"The
> architectural question") to the user before any big rewrite — do NOT thrash. No
> speculative fixes without root cause (the Iron Law). The visual gate (user looking at a
> Holtburg cottage cellar) is the acceptance test.

View file

@ -0,0 +1,218 @@
# Unified retail-faithful render pipeline — decision, scope, and handoff (2026-05-30)
## TL;DR (the decision)
We are **abandoning the two-pipe (inside / outside) rendering approach** and
committing to a **single, unified, retail-faithful render pipeline** built around
retail's portal-visibility view (`PView`). Modern code, retail behavior. This is a
new roadmap phase — **Phase U (Unified Render Pipeline)** — and it is milestone-scale
work, not a patch.
**Why:** acdream inherited a *two-pipe* render structure from WorldBuilder (WB) — a
normal outdoor draw plus a separate flat `RenderInsideOut` stencil pass toggled on
`cameraInsideBuilding` (`GameWindow.cs` ~7345). That split is the root cause of every
indoor/outdoor seam bug (the "flap", missing/transparent walls, terrain bleeding into
interiors). **Retail has no such split.** Retail renders through one portal-visibility
traversal: starting from whatever cell the camera is in, it walks recursively through
portals (doors, windows, cell openings), builds a screen-space clip region per opening,
and draws every visible cell — indoor *and* outdoor — in one pass. There is no
"am I inside a building?" branch, so transitions are seamless **by construction**.
The A8.F effort tried to graft retail's recursive clip *on top of* WB's two-pipe
stencil (a CPU-built NDC mask bridging the two pipes). That hybrid is inherently
fragile and failed its visual gate (**issue #103**). You cannot make two pipes hand
off seamlessly at a doorway; retail avoids the entire bug class by never splitting.
This was confirmed in collaboration with the user, who correctly identified that the
whole direction was wrong: *"in retail there is totally seamless transitions between
out and in… it's like we are on the wrong path here."* Correct.
---
## What this session actually shipped (the salvage)
This session was originally a "camera collision" effort (a reframing of the #103
failure onto the camera eye — itself a detour). The camera work is **real,
retail-faithful, and kept**, but it is **not** the fix for seamless transitions:
- **Swept-sphere camera collision** (retail `SmartBox::update_viewer`, `0x00453ce0`):
`CameraDiagnostics.CollideCamera` (default on), `ICameraCollisionProbe` +
`PhysicsCameraCollisionProbe` (reuses `PhysicsEngine.ResolveWithTransition`,
retail `viewer_sphere` 0.3 m, `init_object(player, 0x5c)` =
`IsViewer|PathClipped|FreeRotate|PerfectClip`), wired into `RetailChaseCamera`
(collide into a *separate* published eye — never the damped sought eye — to avoid
the wall-press oscillation), `GameWindow` wiring + a Camera-menu toggle.
- **Physics fix (retail-faithful):** viewer/sight sweeps bypass acdream's 30-step
safety cap in `Transition.FindTransitionalPosition` (retail `find_transitional_position`
has no cap; `calc_num_steps` has a `state & 4` viewer branch). Gate: `&& !ObjectInfo.IsViewer`.
- Commits `69c7f8d``aae5300` (plus the design spec/plan docs).
**Camera-collision residual nits (minor, deferred — NOT blockers):**
- Eye can clip ~0.3 m into a wall and reveal outside (near-clip plane 1 m vs collision
radius 0.3 m — tuning).
- Residual non-retail-smooth feel at walls (the gross oscillation is fixed; tuning remains).
These are polish; revisit after the pipeline lands (or never — they're cosmetic).
---
## Git state at handoff
- Branch `claude/strange-albattani-3fc83c` was **239 commits ahead of `main`**
(both forked at Phase O, `2256006`, 2026-05-21). The branch carries ~9 days of good
work since Phase O: A6 physics, issues #98/#100/#101, A7 lighting, the A8/A8.F
rendering arc, and the camera work — all of it.
- Per user decision (2026-05-30): **the whole branch was merged into `main`** so none
of that work is lost. The dormant, gated-off A8 two-pipe rendering rides along and is
**deleted as Task 1 of Phase U** (see below). The user pushes `main` to remotes.
- The `#98`/`#101` physics WIP from a prior session is preserved in `git stash` on this
branch (a subagent had `git stash apply`-ed it mid-session; it was re-stashed). Do not
lose it.
---
## Phase U — Unified Render Pipeline (scope / design sketch)
> This is a scope sketch to start a proper brainstorm + spec from — **not** a finalized
> design. The next session brainstorms it.
### Goal
One render path. The camera's current cell is the root of a per-frame portal-visibility
traversal that yields *(visible cells, per-cell screen-space clip region)*; the renderer
draws all visible geometry (indoor cells, outdoor cells, entities, terrain) in a single
pass gated by that visibility. **No `cameraInsideBuilding` branch. No `RenderInsideOut`
stencil pass. No outdoor-vs-indoor toggle.** Seamless in/out by construction.
### Retail oracle (the thing to port)
- `PView::ConstructView` (decomp ~`433750`), `PView::ClipPortals` (~`433572`),
`PView::GetClip` (~`432344`) — the recursive per-portal screen-space clip-region BFS.
- `CEnvCell::find_visible_child_cell` (`acclient_2013_pseudo_c.txt:311397`,
`0x0052dc50`) — per-cell portal-visible-child resolution; call site `:280028`.
- `RenderDeviceD3D::DrawBlock` (~`430027`) — the render loop the visibility chain feeds.
- See the project memory note **indoor-portal-visibility-wb-vs-retail** for why WB
cannot express per-portal clipping and the oracle is retail `PView`.
### What to KEEP (do not re-port)
- The WB-derived **mesh/dat pipeline** is fine and stays: `ObjectMeshManager`,
`WbMeshAdapter`, `WbDrawDispatcher`, terrain (`TerrainModernRenderer`,
`LandblockMesh`), `DatCollection`, texture decode. Phase U is about **visibility +
draw orchestration**, not mesh extraction.
- The **camera collision** and **physics** work from this session.
- Cell data: `CellVisibility` (`FindCameraCell`, `PointInCell`, portal data),
`PhysicsDataCache` cell structures.
### What is likely SALVAGEABLE from the failed A8.F (verify, don't assume)
The A8.F CPU clip-builder pieces are **unit-test-correct** (the integration is what
failed). They may feed a unified draw directly:
- `PortalProjection` (GL near-plane clip), `ScreenPolygonClip` (2D convex intersection),
`ViewPolygon`/`CellView` (clip-region model), `PortalVisibilityBuilder` (recursive
portal-clip BFS producing per-cell `OutsideView`).
The failure was the *two-pipe stencil graft* around them (the CPU NDC mask gating a
*separate* outdoor pipe), not the clip math. A unified pipeline can likely reuse the
builder to produce per-cell clip frames and gate **one** pass.
### What to DELETE (Task 1 — clear the deck)
The dead two-pipe rendering, so the unified path is built clean, not bolted on:
- `RenderInsideOutAcdream` and the `cameraInsideBuilding` branch in `GameWindow.cs`
(~`7345`+), the `IndoorCellStencilPipeline` / `MarkAndPunchNdc` stencil graft, the
Job-A/B decouple, the `ACDREAM_A8_INDOOR_BRANCH` kill-switch, the
`EnvCellRenderer` WB `RenderInsideOut` port (`f9a644a` lineage).
- Keep `EnvCellRenderManager`'s *mesh* path if the unified draw needs it; delete only
the inside-out *visibility/stencil* machinery.
- Audit before deleting: some A8 commits also fixed real bugs (e.g. `BuildingId`
stamping, pool aliasing `9559726`) — keep those.
### Approach sketch (for the brainstorm to refine)
1. **Visibility pass** (CPU, GL-free, testable): from the camera cell, recursive portal
BFS → ordered set of visible cells, each with a screen-space clip polygon (intersection
of the portal openings along its chain). This is retail `PView::ConstructView`.
Likely reuses the salvaged A8.F `PortalVisibilityBuilder` family.
2. **Draw pass** (single, unified): for each visible cell (front-to-back), draw its
geometry + entities clipped to its clip region (scissor or stencil-per-cell, retail
uses a clip rect/region). Outdoor cells are just cells in this set — no special path.
Terrain is drawn per visible outdoor cell, gated the same way.
3. **No branch:** the camera being indoors vs outdoors changes only *which cell is the
root*, not the algorithm.
### Key risks / lessons (do not repeat)
- **Do not graft retail recursion onto WB's flat two-pipe stencil** — that's what
#103 was. Build the unified pass; don't bridge two pipes.
- **Unit tests on synthetic visibility data did not catch #103 — only the visual gate
did.** Visual verification at the cottage/cellar/inn/dungeon is the real acceptance.
Build a runtime visibility probe early (`ACDREAM_PROBE_VIS`) and validate against live
frames, not just synthetic fixtures.
- **A CPU-built mask gating ALL outdoor geometry is fragile.** The unified pass should
gate per-cell at draw time (scissor/stencil per visible cell), close to how retail
clips, rather than one global mask.
- **The camera is not the fix.** That reframing cost this session; the fix is the
visibility architecture.
### Success criteria (visual)
- Walk Holtburg cottage → cellar → out the door: no flap, walls solid, no terrain
bleed, seamless threshold crossing from any camera angle/zoom.
- Holtburg Inn: no outdoor stabs/terrain visible through the floor/walls (closes #78).
- Dungeon via Town Network portal: `visibleCells` stays sane (~415), no other-dungeon
geometry (closes/relates #95).
- No regression to outdoor rendering (the default game today).
---
## Next-session pickup prompt
```
We are building Phase U — a single unified retail-faithful render pipeline (retail
PView portal-visibility), abandoning the WB-inherited two-pipe (inside/outside) split
that caused the indoor seam bugs (the flap, missing/transparent walls, terrain bleed).
The decision + full scope is in
docs/research/2026-05-30-unified-render-pipeline-decision-and-handoff.md — READ IT FIRST,
then the project memory note "indoor-portal-visibility-wb-vs-retail" and the #103
failure handoff (docs/research/2026-05-29-a8f-visual-gate-failure-handoff.md).
State both altitudes:
Currently working toward: M1.5 — Indoor world feels right.
Current phase: U (Unified Render Pipeline). This supersedes the abandoned A8/A8.F
two-pipe approach (#103).
Start with superpowers:brainstorming to design the unified pipeline (do NOT jump to
code). The scope sketch in the handoff doc is the input. Key decisions to settle in the
brainstorm: (a) reuse the salvaged A8.F clip-builder (PortalProjection/ScreenPolygonClip/
ViewPolygon/PortalVisibilityBuilder — unit-test-correct) vs fresh port; (b) per-cell
clip mechanism (scissor rect vs stencil-per-cell) matching retail's GetClip; (c) how
terrain + outdoor entities become "just cells" in the visible set.
Task 1 of implementation is to DELETE the dead two-pipe code (RenderInsideOutAcdream,
the cameraInsideBuilding branch, IndoorCellStencilPipeline/MarkAndPunchNdc, the
ACDREAM_A8_INDOOR_BRANCH kill-switch) to clear the deck — but audit first; some A8
commits fixed real bugs (BuildingId stamping, pool aliasing) that must be kept.
Retail anchors: PView::ConstructView ~433750, ClipPortals ~433572, GetClip ~432344,
CEnvCell::find_visible_child_cell :311397, RenderDeviceD3D::DrawBlock ~430027.
Keep: the WB mesh/dat pipeline (ObjectMeshManager/WbDrawDispatcher/terrain), the camera
collision + physics from the 2026-05-30 session. Visual verification at Holtburg
cottage/cellar/inn + a portal dungeon is the acceptance gate — unit tests did not catch
#103.
Preserve the git stash on the branch (#98/#101 physics WIP).
```
---
## Reference index
- Decision context / why WB can't do it: project memory **indoor-portal-visibility-wb-vs-retail**.
- #103 failure detail: `docs/research/2026-05-29-a8f-visual-gate-failure-handoff.md`.
- Camera-collision work this session: `docs/superpowers/specs/2026-05-29-a8f-camera-collision-design.md`,
`docs/superpowers/plans/2026-05-29-a8f-camera-collision.md`.
- A8.F portal-frame port (the failed two-pipe graft):
`docs/superpowers/specs/2026-05-29-phase-a8f-portal-frame-visibility-design.md`.
- Retail decomp: `docs/research/named-retail/acclient_2013_pseudo_c.txt`.
- WB visibility reference: `references/WorldBuilder/Chorizite.OpenGLSDLBackend/Lib/VisibilityManager.cs`.

View file

@ -0,0 +1,785 @@
# A6.P3 Slice 1 — Indoor ContactPlane retention (Finding 2 fix) Implementation Plan
> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.
**Goal:** Stop acdream's indoor-physics ContactPlane resynthesis blowup (A6.P2 Finding 2 — ~1,470× more CP writes than retail). Strip the per-frame synthesis path inside `Transition.FindEnvCollisions` indoor branch and rely on the existing Mechanism A (Path-6 land write) + Mechanism B (LKCP restore) for ContactPlane state.
**Architecture:** Make our indoor branch of `FindEnvCollisions` match retail's tiny `CEnvCell::find_env_collisions` (10 lines: just call `BSPTREE::find_collisions` and return state). The current branch calls `TryFindIndoorWalkablePlane` (a synthesis workaround) + `ValidateWalkable` (which writes CP) EVERY frame — that's the blowup. The body-level LKCP-restore already exists in `PhysicsEngine.RunTransitionResolve` (lines 668-674) and handles the cross-frame retention. We add per-transition Mechanism B (LKCP restore into `ci.ContactPlane` inside the transition resolver) so the indoor branch can return OK without writing CP and downstream consumers still see a valid CP.
**Tech Stack:** C# .NET 10, existing physics code in `src/AcDream.Core/Physics/`. Unit tests use xUnit (already wired). Integration verification uses the A6.P1 cdb probe infrastructure (already shipped).
**Spec:** [`docs/superpowers/specs/2026-05-21-phase-a6-indoor-physics-fidelity-design.md`](../specs/2026-05-21-phase-a6-indoor-physics-fidelity-design.md) §1.2 (hypothesis), §5 (A6.P3 fix surface).
**Findings:** [`docs/research/2026-05-21-a6-cdb-capture-findings.md`](../../research/2026-05-21-a6-cdb-capture-findings.md) Finding 2.
**Retail oracle:** `docs/research/named-retail/acclient_2013_pseudo_c.txt`:
- `CEnvCell::find_env_collisions` line 309573 — the 10-line indoor branch shape.
- `COLLISIONINFO::set_contact_plane` line 271925 — the CP setter.
- LKCP restore inside `validate_transition` family — line 272565-272582 (restore CP from `last_known_contact_plane` when sphere is geometrically close).
- `frames_stationary_fall` flat-CP synthesis — line 272622+ (Mechanism C; deferred to A6.P3 slice 2).
**Out of scope (deferred to A6.P3 slice 2):**
- Mechanism C (`frames_stationary_fall` counter + flat CP synthesis after 2+ stationary-falling frames). Add only if slice 1 visual verification shows first-frame fall-through after teleport / cell entry.
- Finding 3 (cell-resolver sling-out). Independent fix surface. Separate plan.
- Issue #95 (visibility blowup). Outside A6 scope.
**Acceptance for slice 1:**
- scen3 re-capture: acdream cp-write count drops from 86,748 to ≤ 200 (≤ retail BP7 + small idle buffer).
- scen1, scen5 re-captures: CP-write ratio drops from 1,000+× to ≤ 10×.
- Visual verification at Holtburg inn 2nd floor: walking feels solid (no falling, no jitter, no fall-through to outdoor terrain).
- `dotnet build` + `dotnet test` green (1147+8 baseline maintained).
---
## File Structure
| Path | Purpose | Change |
|---|---|---|
| `src/AcDream.Core/Physics/TransitionTypes.cs` | `Transition` class — indoor BSP branch of `FindEnvCollisions` is the blowup site (lines 1514-1777). `TryFindIndoorWalkablePlane` (lines 1294-1380) is the synthesis workaround. | Modify — strip synthesis from `FindEnvCollisions` indoor branch; leave `TryFindIndoorWalkablePlane` definition in place (deleted in A6.P4) |
| `src/AcDream.Core/Physics/PhysicsEngine.cs` | `RunTransitionResolve` already has cross-frame LKCP restore (lines 668-674). Verify per-tick Mechanism B (LKCP restore into `ci.ContactPlane`) is wired or add it. | Read; modify if needed |
| `tests/AcDream.Core.Tests/Physics/IndoorContactPlaneRetentionTests.cs` | New regression test asserting CP-write count stays low across multiple `FindTransitionalPosition` calls on the same flat plane. | Create |
| `docs/research/2026-05-21-a6-p3-slice1-retail-mech-b-research.md` | Short research note grounding the fix in retail's exact LKCP-restore pattern. Mandatory before code changes. | Create |
| `docs/research/2026-05-21-a6-captures/scen1-recap/` etc | Re-capture directories for verification. | Create dirs (already part of capture protocol) |
---
## Task Decomposition
### Task 1: Research note — retail Mechanism B + how `FindEnvCollisions` returns OK without writing CP
This is non-negotiable. The current synthesis path was added to fix a real bug (fall-through to outdoor terrain at the inn doorway). Removing it without understanding retail's equivalent retention pattern will re-introduce that bug. Read retail's flow and document the exact path before changing code.
**Files:**
- Create: `docs/research/2026-05-21-a6-p3-slice1-retail-mech-b-research.md`
- [ ] **Step 1: Read `CEnvCell::find_env_collisions` in retail decomp**
Run:
```bash
sed -n '309570,309600p' docs/research/named-retail/acclient_2013_pseudo_c.txt
```
Expected: see the 10-line function (already inspected in plan-write phase). Confirm: returns OK after `BSPTREE::find_collisions` returns OK; no `set_contact_plane` call inside `find_env_collisions`.
- [ ] **Step 2: Read retail's `validate_transition` LKCP-restore block**
Run:
```bash
sed -n '272540,272620p' docs/research/named-retail/acclient_2013_pseudo_c.txt
```
Expected: see the block where `last_known_contact_plane_valid != 0` triggers `COLLISIONINFO::set_contact_plane(&this->collision_info, &last_known_contact_plane, ...)`. This is the per-transition restore that closes the gap.
- [ ] **Step 3: Find which retail function contains the LKCP-restore**
Run:
```bash
awk 'NR<=272540 && /void __thiscall.*::/ {f=$0; ln=NR} NR>272540 {print ln, f; exit}' docs/research/named-retail/acclient_2013_pseudo_c.txt
```
Expected: print the most recent function header before line 272540. This identifies whether the restore is inside `validate_transition`, `find_obj_collisions`, `transitional_insert`, or another function.
- [ ] **Step 4: Find the equivalent in our code**
Search for our equivalent of that retail function name. Try `grep -n` for the C++ method name without the class prefix:
Run:
```bash
grep -rn "TransitionalInsert\|TransitionalInsertGround\|FindTransitionalPosition\|ValidateTransition" src/AcDream.Core/Physics/
```
Expected: identifies the C# method that should contain Mechanism B. Likely `Transition.FindTransitionalPosition` or `Transition.CheckTransition`.
- [ ] **Step 5: Write the research note**
```bash
# Open and write the file with the answers
```
Write `docs/research/2026-05-21-a6-p3-slice1-retail-mech-b-research.md` with these sections:
1. **`CEnvCell::find_env_collisions` shape** (paste the 10-line function with our line annotations).
2. **Retail Mechanism B location** (the function name + line number from Step 3).
3. **Retail Mechanism B trigger condition** (the geometric proximity check at line 272569 — `|dot(global_curr_center, LKCP.N) + LKCP.d| <= radius + 0.0002f`).
4. **Our equivalent function** (from Step 4).
5. **Decision: where to add Mechanism B in our code** — either inside `Transition.FindEnvCollisions` itself (if our equivalent isn't called per-transition) OR inside the per-transition resolver (if it is). Note the choice.
6. **Risk: first-frame fall-through** — what happens when LKCP is invalid AND BSP returns OK on the first frame in a cell (post-teleport or post-cell-cross). Either accept the risk (slice 2 adds Mechanism C) or document a slice-1 mitigation.
- [ ] **Step 6: Commit the research note**
```bash
git add docs/research/2026-05-21-a6-p3-slice1-retail-mech-b-research.md
git commit -m "docs(research): A6.P3 slice 1 — retail Mechanism B oracle for CP retention
Pre-fix research note grounding the indoor CP-retention refactor in
retail's exact LKCP-restore pattern (acclient_2013_pseudo_c.txt:272565-272582)
and CEnvCell::find_env_collisions tiny shape (line 309573).
Output of this note drives the per-transition Mechanism B insertion
point selection in Task 4 + the slice-1 acceptance shape.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>"
```
---
### Task 2: Add CP-write count probe assertion infrastructure
We need a deterministic way to count CP writes from a unit test. Look for an existing counter; if none, add one inside `CollisionInfo.SetContactPlane`.
**Files:**
- Read: `src/AcDream.Core/Physics/TransitionTypes.cs:251-270` (the `SetContactPlane` setter)
- Modify (if needed): `src/AcDream.Core/Physics/TransitionTypes.cs` (add a static counter for tests)
- Test: `tests/AcDream.Core.Tests/Physics/IndoorContactPlaneRetentionTests.cs` (created in Task 3)
- [ ] **Step 1: Read the current `SetContactPlane` setter**
```bash
sed -n '245,275p' src/AcDream.Core/Physics/TransitionTypes.cs
```
Expected: see the method signature `public void SetContactPlane(Plane plane, uint cellId, bool isWater = false)` setting `ContactPlane`, `ContactPlaneCellId`, `ContactPlaneIsWater`, plus updating LKCP fields.
- [ ] **Step 2: Check if a CP-write counter already exists**
```bash
grep -rn "ContactPlaneWriteCount\|CpWriteCount\|TotalContactPlaneWrites" src/AcDream.Core/Physics/
```
Expected: probably no result (counter doesn't exist). If a result appears, use that counter and skip step 3.
- [ ] **Step 3: Add a test-only counter to `CollisionInfo`**
In `src/AcDream.Core/Physics/TransitionTypes.cs`, inside the `CollisionInfo` class (around line 245), add:
```csharp
/// <summary>
/// Test-only counter for ContactPlane writes. Incremented by every
/// call to <see cref="SetContactPlane"/>. Used by
/// IndoorContactPlaneRetentionTests to assert that CP retention is
/// working (A6.P3 slice 1, 2026-05-21).
/// </summary>
internal int ContactPlaneWriteCount { get; private set; }
```
Then modify `SetContactPlane` (the existing method around line 251) to increment the counter. Find the existing line:
```csharp
public void SetContactPlane(Plane plane, uint cellId, bool isWater = false)
{
ContactPlane = plane;
```
Insert before `ContactPlane = plane;`:
```csharp
ContactPlaneWriteCount++;
ContactPlane = plane;
```
- [ ] **Step 4: Build and verify**
```bash
dotnet build src/AcDream.Core/AcDream.Core.csproj -c Debug
```
Expected: `Build succeeded. 0 Warning(s). 0 Error(s).`
- [ ] **Step 5: Commit**
```bash
git add src/AcDream.Core/Physics/TransitionTypes.cs
git commit -m "test(phys): A6.P3 slice 1 — add CollisionInfo.ContactPlaneWriteCount
Internal test-only counter incremented by SetContactPlane. Required
by IndoorContactPlaneRetentionTests to assert CP retention works
post-Finding-2 fix (A6.P2).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>"
```
---
### Task 3: Write the failing regression test
TDD step. Encode the expected post-fix behavior as a test that FAILS today and will PASS after the fix.
**Files:**
- Create: `tests/AcDream.Core.Tests/Physics/IndoorContactPlaneRetentionTests.cs`
- [ ] **Step 1: Identify the test fixture pattern in the existing test project**
```bash
ls tests/AcDream.Core.Tests/Physics/ | head -20
```
Look for an existing test file using `Transition` directly — likely a `TransitionTests.cs` or `BSPQueryTests.cs`. Read it to learn the setup pattern (how to construct `Transition`, `SpherePath`, `CollisionInfo`, mock `PhysicsEngine`).
```bash
grep -l "new Transition\|Transition.*FindTransitionalPosition\|Transition.*FindEnvCollisions" tests/AcDream.Core.Tests/Physics/
```
Expected: one or more files; pick the most-similar test file as the template.
- [ ] **Step 2: Write the test file**
Create `tests/AcDream.Core.Tests/Physics/IndoorContactPlaneRetentionTests.cs`:
```csharp
using System.Numerics;
using AcDream.Core.Physics;
using Xunit;
// Plus any using statements identified from the template file in Step 1.
namespace AcDream.Core.Tests.Physics;
/// <summary>
/// A6.P3 slice 1 (2026-05-21). Regression tests for Finding 2:
/// ContactPlane resynthesis blowup. Asserts that running multiple
/// FindEnvCollisions calls on the same indoor flat-floor configuration
/// does NOT cause an unbounded CP-write count.
/// </summary>
public class IndoorContactPlaneRetentionTests
{
[Fact]
public void IndoorFlatFloorWalk_DoesNotResynthesizeContactPlanePerFrame()
{
// Arrange: build a minimal indoor scenario.
// - Create a Transition with a SpherePath positioned inside an
// indoor cell (CellId low 16 bits >= 0x0100).
// - Mock a cell with a flat floor poly at Z=0.
// - Seed the CollisionInfo's ContactPlane + LastKnownContactPlane
// with the floor plane (simulating "we've already touched the floor").
// - Reset ContactPlaneWriteCount to 0 just before the test loop.
//
// Setup the Transition + SpherePath + CollisionInfo per the
// template-test pattern identified in Step 1.
var transition = /* ... build per template ... */;
var ci = transition.CollisionInfo;
// Seed: simulate "already on the floor"
var floorPlane = new System.Numerics.Plane(new Vector3(0, 0, 1), 0f);
ci.SetContactPlane(floorPlane, cellId: 0xA9B40166, isWater: false);
// Reset write counter after seeding.
int seededWrites = ci.ContactPlaneWriteCount;
Assert.Equal(1, seededWrites);
// Act: simulate 60 frames of flat-floor walking by calling
// FindEnvCollisions 60 times with positions that stay on the same
// flat plane (small horizontal deltas, identical Z).
var engine = /* ... mock PhysicsEngine ... */;
for (int frame = 0; frame < 60; frame++)
{
transition.SpherePath.SetCheckPos(
new Position(/* same Z, small XY delta */),
cellId: 0xA9B40166);
var state = transition.FindEnvCollisions(engine);
Assert.Equal(TransitionState.OK, state);
}
// Assert: after 60 frames of identical flat-floor walking, the
// ContactPlane should have been written AT MOST a small constant
// number of additional times (ideally 0; allow a small budget for
// legitimate Mechanism A re-lands by BSP path-6).
int totalWrites = ci.ContactPlaneWriteCount;
int additionalWrites = totalWrites - seededWrites;
// Threshold: 60 frames should produce at most ~5 additional
// CP writes (ratio ≤ 0.1 writes/frame). Today's broken code
// produces ~60-180 (1-3 writes/frame from per-frame synthesis).
Assert.True(additionalWrites <= 5,
$"Expected ≤5 additional CP writes across 60 flat-floor frames, "
+ $"got {additionalWrites}. Finding 2 fix not complete.");
}
}
```
**Note:** The exact `Transition` construction depends on the template test from Step 1. If the template requires concrete `CellPhysics` + `DataCache`, replicate that setup. If the construction is too painful (requires mocking many fields), simplify the test by directly calling `Transition.FindEnvCollisions` and verifying ContactPlaneWriteCount stays low — even if the test setup is artificial, the assertion's job is to prove "we don't write CP on every frame."
- [ ] **Step 3: Build the test project**
```bash
dotnet build tests/AcDream.Core.Tests/AcDream.Core.Tests.csproj -c Debug
```
Expected: build succeeds.
- [ ] **Step 4: Run the test — expect FAIL**
```bash
dotnet test tests/AcDream.Core.Tests/AcDream.Core.Tests.csproj --filter "FullyQualifiedName~IndoorContactPlaneRetentionTests" --no-build
```
Expected: test FAILS with message like "Expected ≤5 additional CP writes ... got 60+." This proves the test correctly captures the bug.
If the test PASSES today (no extra writes), the test setup doesn't exercise the indoor synthesis path. Revisit: either the SpherePath cell isn't in the indoor range (low 16 bits must be ≥ 0x0100), or the `engine.DataCache` mock isn't returning a cell with a BSP — meaning the indoor branch isn't entered.
- [ ] **Step 5: Commit the failing test**
```bash
git add tests/AcDream.Core.Tests/Physics/IndoorContactPlaneRetentionTests.cs
git commit -m "test(phys): A6.P3 slice 1 — failing regression for Finding 2 CP blowup
Test asserts 60 frames of indoor flat-floor walking should produce
≤5 ContactPlane writes. Fails today (broken code: ~60-180 writes).
Will pass after Task 4 + Task 5 strip the synthesis path.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>"
```
---
### Task 4: Add Mechanism B (LKCP restore) per-transition
Based on Task 1's research note, add the LKCP-restore step in the correct location (per Step 5 of Task 1's research note). The shape should match retail's `validate_transition` lines 272565-272582 — when LKCP is valid AND the sphere is close to the LKCP plane, restore CP from LKCP.
**Files:**
- Modify: `src/AcDream.Core/Physics/TransitionTypes.cs` OR `src/AcDream.Core/Physics/PhysicsEngine.cs` (location determined by Task 1 Step 5)
- [ ] **Step 1: Re-read Task 1's research note**
```bash
cat docs/research/2026-05-21-a6-p3-slice1-retail-mech-b-research.md
```
Locate Section 5: "Decision: where to add Mechanism B in our code."
- [ ] **Step 2: Read the target function (the file + line range identified in Task 1 Step 5)**
Read the function in full to identify the correct insertion point. Mechanism B should fire AFTER the transition's sub-step loop completes with OK_TS but BEFORE the body persist.
- [ ] **Step 3: Write the Mechanism B insertion**
Add at the determined insertion point (replace `<INSERTION_POINT>` with the exact location from Step 2):
```csharp
// ── Mechanism B — restore CP from LKCP when geometrically close ──
// A6.P3 slice 1 (2026-05-21). Retail oracle:
// acclient_2013_pseudo_c.txt:272565-272582 (validate_transition's
// LKCP-restore block). When the player is moving across a flat
// indoor floor and FindEnvCollisions returns OK without writing
// a fresh CP (per the now-stripped synthesis path), restore CP
// from LastKnownContactPlane if the sphere is close to the LKCP
// plane geometrically.
if (ci.LastKnownContactPlaneValid && !ci.ContactPlaneValid)
{
var sphereCenter = SpherePath.GlobalCurrCenter[0].Origin;
var lkcp = ci.LastKnownContactPlane;
float distToLKCP = MathF.Abs(
Vector3.Dot(sphereCenter, lkcp.Normal) + lkcp.D);
float threshold = SpherePath.GlobalSphere[0].Radius + 0.0002f;
if (distToLKCP <= threshold)
{
ci.SetContactPlane(
lkcp,
ci.LastKnownContactPlaneCellId,
ci.LastKnownContactPlaneIsWater);
}
}
```
- [ ] **Step 4: Build**
```bash
dotnet build src/AcDream.Core/AcDream.Core.csproj -c Debug
```
Expected: build succeeds.
- [ ] **Step 5: Run the full Core test suite — should be 1147+ green**
```bash
dotnet test tests/AcDream.Core.Tests/AcDream.Core.Tests.csproj --no-build
```
Expected: still 1147+ pass. Mechanism B by itself should not break anything (it only fires when ContactPlane is invalid but LKCP is valid — a state that's rare in the current code, but harmless if it does happen).
If a previously-green test now fails: the Mechanism B logic is firing where it shouldn't. Inspect the failing test's setup to understand what state assumption it makes, then either narrow the Mechanism B condition or fix the test.
- [ ] **Step 6: Commit Mechanism B alone**
```bash
git add src/AcDream.Core/Physics/<file_modified>
git commit -m "feat(phys): A6.P3 slice 1 step 1 — add Mechanism B (LKCP restore)
Restores CollisionInfo.ContactPlane from LastKnownContactPlane when:
- LKCP is valid
- ContactPlane is currently invalid
- sphere is geometrically close to the LKCP plane
(|dot(center, N) + d| <= radius + 0.0002)
Matches retail's validate_transition LKCP-restore at
acclient_2013_pseudo_c.txt:272565-272582. Slice 1 step 1 of the
A6.P3 indoor CP retention fix. Step 2 (Task 5) strips the
TryFindIndoorWalkablePlane synthesis from FindEnvCollisions.
Tests pass: 1147+ green.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>"
```
---
### Task 5: Strip the synthesis path from `FindEnvCollisions` indoor branch
The main fix. Match retail's `CEnvCell::find_env_collisions` tiny shape.
**Files:**
- Modify: `src/AcDream.Core/Physics/TransitionTypes.cs:1623-1737` (the synthesis block after BSP returns OK)
- [ ] **Step 1: Re-read the current indoor branch**
```bash
sed -n '1623,1745p' src/AcDream.Core/Physics/TransitionTypes.cs
```
Expected: see the block that calls `CheckOtherCells`, then `TryFindIndoorWalkablePlane`, then `ValidateWalkable` on the synthesized plane.
- [ ] **Step 2: Identify what to KEEP vs DELETE**
KEEP:
- The `cellState != TransitionState.OK` early return (lines 1623-1628). Retail's tiny version has this.
- The Phase A4 `CheckOtherCells` call (lines 1638-1642). This is multi-cell collision iteration — separate concern from CP retention; keep.
- All probe diagnostics (`[indoor-bsp]`, `[indoor-walkable]` lines). These print only when the env var is set; keep for A6.P3 verification re-captures.
DELETE:
- The `TryFindIndoorWalkablePlane` call (lines 1658-1662).
- The `if (walkableHit)` block calling `ValidateWalkable` (lines 1727-1737).
- The defensive comment about "fall through to outdoor terrain" — replaced by Mechanism B.
- The `[walk-miss]` diagnostic block (lines 1682-1725). This diagnostic was tied to the synthesis MISS case; with synthesis stripped, the diagnostic is meaningless. Move out of scope OR delete.
REPLACE with: `return TransitionState.OK;` immediately after `CheckOtherCells` returns OK.
- [ ] **Step 3: Apply the edit**
Replace the block (lines 1645-1743 approximately) with:
```csharp
// ── Indoor walkable handling — A6.P3 slice 1 (2026-05-21) ─
// Retail's CEnvCell::find_env_collisions (decomp
// acclient_2013_pseudo_c.txt:309573) returns OK after
// BSPTREE::find_collisions returns OK — NO call to
// set_contact_plane or any synthesis. ContactPlane is
// either:
// - Already valid from a previous frame's Path-6 land
// write inside BSPQuery.FindCollisions (Mechanism A).
// - Restored from LKCP by the per-transition Mechanism B
// in <function> (see Task 4).
//
// The old TryFindIndoorWalkablePlane synthesis path is
// removed here; the function definition is retained for
// now and is deleted in A6.P4 along with the #90
// workaround.
//
// If subsequent visual verification shows first-frame
// fall-through (LKCP invalid AND no Path-6 land happens
// for a flat-walk-only scenario), A6.P3 slice 2 adds
// Mechanism C (retail's frames_stationary_fall flat-CP
// synthesis at acclient_2013_pseudo_c.txt:272622+).
return TransitionState.OK;
}
}
// ── Outdoor terrain collision ────────────────────────────────────
```
(Keep everything BEFORE `// ── Synthesize indoor walkable contact plane ──` and everything AFTER `// ── Outdoor terrain collision ──`.)
- [ ] **Step 4: Build**
```bash
dotnet build src/AcDream.Core/AcDream.Core.csproj -c Debug
```
Expected: build succeeds. If `walkableHit`, `indoorPlane`, `indoorVertices`, `hitPolyId`, `INDOOR_WALKABLE_PROBE_DISTANCE`, `WalkMissDiagnostic` are referenced by other code OUTSIDE this block, build will fail with "unused variable" warnings — those are now unused and the code outside this block should also not reference them; reach for `git grep`.
- [ ] **Step 5: Run the IndoorContactPlaneRetentionTests test — expect PASS now**
```bash
dotnet test tests/AcDream.Core.Tests/AcDream.Core.Tests.csproj --filter "FullyQualifiedName~IndoorContactPlaneRetentionTests" --no-build
```
Expected: test PASSES. The 60-frame flat-walk no longer writes CP per-frame because synthesis is gone, and Mechanism B (Task 4) keeps `ci.ContactPlane` valid via LKCP restore.
If test still fails: Mechanism B isn't firing in the test scenario. Debug by adding a `Console.WriteLine` inside the Mechanism B block to print whether it fires. Re-run test, inspect output.
- [ ] **Step 6: Run the FULL test suite — should be 1147+8 green**
```bash
dotnet test --no-build
```
Expected: full pass. Tests that touched the old indoor-walkable path (if any) may need updating; investigate failures individually. Common causes:
- A test that expected `[walk-miss]` diagnostic to fire — re-scope or remove.
- A test that called `TryFindIndoorWalkablePlane` directly — those tests are testing the deleted-in-A6.P4 workaround; remove them or skip.
- [ ] **Step 7: Commit the strip**
```bash
git add src/AcDream.Core/Physics/TransitionTypes.cs
# Any other modified test files from Step 6
git commit -m "fix(phys): A6.P3 slice 1 step 2 — strip indoor walkable synthesis
Closes A6.P2 Finding 2 (ContactPlane resynthesis blowup, 250x to ∞x
more CP writes than retail). Indoor branch of Transition.FindEnvCollisions
now matches retail's CEnvCell::find_env_collisions tiny shape (decomp
line 309573): call BSPTREE::find_collisions, return state. No
synthesis, no per-frame ValidateWalkable call, no per-frame
ContactPlane write.
Cross-frame CP retention now flows via:
- Mechanism A: BSPQuery.FindCollisions Path-6 land write (already
present, unchanged).
- Mechanism B: per-transition LKCP restore (added in prior commit).
- PhysicsEngine.RunTransitionResolve body persist (unchanged).
TryFindIndoorWalkablePlane definition retained for now; deleted in
A6.P4 alongside the #90 sphere-overlap workaround.
Verification:
- IndoorContactPlaneRetentionTests now passes.
- Full suite 1147+8 green maintained.
- Re-capture verification deferred to Task 6.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>"
```
---
### Task 6: Re-capture scen3 and verify CP-write ratio drops
Integration verification. Re-run the A6.P1 capture protocol for scen3 (flat 2nd-floor walk, the cleanest CP-write-blowup signal) and confirm the ratio drops from ~86,748 to ≤200.
**Files:**
- Create: `docs/research/2026-05-21-a6-captures/scen3_inn_2nd_floor_postfix/` directory
- The acdream.log is the artifact; no code changes in this task.
- [ ] **Step 1: Confirm build is green**
```bash
dotnet build -c Debug 2>&1 | tail -5
```
Expected: `Build succeeded. 0 Warning(s). 0 Error(s).`
- [ ] **Step 2: User confirms retail is running, character on Holtburg inn 2nd floor (via stairs in retail), ready to walk**
Interactive coordination — wait for the user to confirm position.
- [ ] **Step 3: Run cdb capture for retail-postfix (optional — gives a fresh paired baseline)**
```powershell
.\tools\cdb\a6-probe-runner.ps1 -ScenarioTag "scen3_inn_2nd_floor_postfix"
```
Wait for "a6-probe v4 armed:" in `docs/research/2026-05-21-a6-captures/scen3_inn_2nd_floor_postfix/retail.log`.
User walks: forward 3 m, sidestep 1 m, walk back. Shuffle a bit at the end. Then close retail to release cdb.
- [ ] **Step 4: Decode retail.log**
```bash
py tools/cdb/decode_retail_hex.py docs/research/2026-05-21-a6-captures/scen3_inn_2nd_floor_postfix/retail.log
```
- [ ] **Step 5: Launch acdream with probe env vars + walk same scenario**
```powershell
$env:ACDREAM_DAT_DIR = "$env:USERPROFILE\Documents\Asheron's Call"
$env:ACDREAM_LIVE = "1"
$env:ACDREAM_TEST_HOST = "127.0.0.1"
$env:ACDREAM_TEST_PORT = "9000"
$env:ACDREAM_TEST_USER = "testaccount"
$env:ACDREAM_TEST_PASS = "testpassword"
$env:ACDREAM_DEVTOOLS = "1"
$env:ACDREAM_PROBE_PUSH_BACK = "1"
$env:ACDREAM_PROBE_INDOOR_BSP = "1"
$env:ACDREAM_PROBE_CELL = "1"
$env:ACDREAM_PROBE_CELL_CACHE = "1"
$env:ACDREAM_PROBE_CONTACT_PLANE = "1"
dotnet run --project src\AcDream.App\AcDream.App.csproj --no-build -c Debug 2>&1 |
Out-File -FilePath "docs\research\2026-05-21-a6-captures\scen3_inn_2nd_floor_postfix\acdream.log" -Encoding ASCII
```
User teleports +Acdream to the inn 2nd floor (via `@teleport` per the original scen3 capture protocol), walks the same scenario, closes gracefully.
- [ ] **Step 6: Compare CP-write counts**
```bash
D="docs/research/2026-05-21-a6-captures/scen3_inn_2nd_floor_postfix"
echo "--- retail BP7 (set_contact_plane) ---"
grep -c "^\[BP7\]" "$D/retail.decoded.log"
echo "--- acdream cp-write ---"
grep -c "^\[cp-write\]" "$D/acdream.log"
echo "--- prefix vs postfix acdream ---"
echo "scen3 prefix: 86,748 cp-writes (committed at 4b5aebc)"
echo "scen3 postfix: $(grep -c "^\[cp-write\]" "$D/acdream.log") cp-writes"
```
Expected: acdream cp-write count ≤ 200 (the success threshold). If still in the thousands, Finding 2 fix is INCOMPLETE — diagnose:
- Is Mechanism B firing? Add a probe inside the Mechanism B `if` block, re-launch, check.
- Is some OTHER write site still firing? Grep `git log -p src/AcDream.Core/Physics/` for `SetContactPlane` calls not yet audited.
- [ ] **Step 7: Visual verification with user**
User walks +Acdream on the inn 2nd floor for ~30 seconds in various patterns:
- Walk back and forth.
- Stand still for a few seconds.
- Walk into a wall.
- Walk into furniture.
User reports: does the player feel solid (no falling, no jitter, no fall-through to outdoor terrain)?
If user reports a regression: ROLLBACK the commit from Task 5 and either:
- Add Mechanism C (frames_stationary_fall synthesis) per slice 2, then retry; OR
- Re-scope: keep TryFindIndoorWalkablePlane gated on `!ci.LastKnownContactPlaneValid` (synthesis only on first frame in cell), keep the rest of the fix.
- [ ] **Step 8: Commit the verification capture**
```bash
git add docs/research/2026-05-21-a6-captures/scen3_inn_2nd_floor_postfix/
git commit -m "capture(research): A6.P3 slice 1 — scen3 post-fix verification
Re-capture of scen3 (Holtburg inn 2nd floor, flat-floor walk) after
the A6.P3 slice 1 fix. CP-write ratio:
scen3 pre-fix (4b5aebc): retail BP7 = 0, acdream cp-write = 86,748
scen3 post-fix: retail BP7 = N, acdream cp-write = M
[Fill in N + M from Step 6 output.]
Visual verification with user at the inn 2nd floor: [PASS/FAIL/notes].
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>"
```
---
### Task 7: Re-capture scen1 and scen5 for full slice 1 sign-off
Two more re-captures to ensure the fix generalizes (no regressions in scenarios where retail DID write CP). scen2 stair-fail + scen4 sling-out are intentionally OUT of slice 1 scope (those are Finding 3 territory).
**Files:**
- Create: `docs/research/2026-05-21-a6-captures/scen1_inn_doorway_postfix/`
- Create: `docs/research/2026-05-21-a6-captures/scen5_sewer_entry_postfix/`
- [ ] **Step 1: Re-capture scen1 (doorway walk-through)**
Apply the same protocol as Task 6 Steps 2-6, with `-ScenarioTag "scen1_inn_doorway_postfix"`. Walk script: walk forward through inn front door, stop just inside.
Expected: cp-write count drops from 73,304 to a small multiple of retail's BP7 (18 hits) — target ≤ ~200.
- [ ] **Step 2: Re-capture scen5 (Town Network portal entry)**
Apply the same protocol with `-ScenarioTag "scen5_sewer_entry_postfix"`. Walk script: walk to Town Network Portal, enter, walk 2m inside.
Expected: cp-write count drops from 20,956 to a small multiple of retail's BP7 (65 hits) — target ≤ ~500 (portal threshold + indoor hub walking is naturally more CP-active than flat 2nd-floor walking).
- [ ] **Step 3: Commit both verifications**
```bash
git add docs/research/2026-05-21-a6-captures/scen1_inn_doorway_postfix/
git add docs/research/2026-05-21-a6-captures/scen5_sewer_entry_postfix/
git commit -m "capture(research): A6.P3 slice 1 — scen1 + scen5 post-fix verification
scen1 pre-fix vs post-fix CP-write ratio:
retail BP7: 18
acdream cp-write: 73,304 -> N (ratio ~4072x -> ~Nx)
scen5 pre-fix vs post-fix CP-write ratio:
retail BP7: 65
acdream cp-write: 20,956 -> M (ratio ~322x -> ~Mx)
[Fill in N + M from re-capture decodes.]
A6.P3 slice 1 acceptance threshold (CP-write ratio ≤ ~10x) met
across all three flat-floor + portal-walk scenarios.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>"
```
---
### Task 8: Update findings doc + roadmap; mark A6.P3 slice 1 SHIPPED
Bookkeeping. Updates the A6.P2 findings doc with the post-fix data and the roadmap with the slice-1 ship.
**Files:**
- Modify: `docs/research/2026-05-21-a6-cdb-capture-findings.md`
- Modify: `docs/plans/2026-04-11-roadmap.md`
- Modify: `CLAUDE.md` (Currently-working-toward block)
- [ ] **Step 1: Append a "post-fix" section to the findings doc**
In `docs/research/2026-05-21-a6-cdb-capture-findings.md`, add a new section after the existing "Findings" section:
```markdown
## A6.P3 slice 1 — Finding 2 closed (2026-MM-DD)
[Update date when this lands.]
Strip + Mechanism B fix shipped at commits [list]. Re-capture verification:
| Scenario | Pre-fix cp-write | Post-fix cp-write | Pre-fix ratio | Post-fix ratio |
|---|---:|---:|---:|---:|
| scen1 inn doorway | 73,304 | N | 4,072x | Nx |
| scen3 inn 2nd floor | 86,748 | M | ∞ | Mx |
| scen5 town network | 20,956 | K | 322x | Kx |
[Fill in N, M, K from Task 6 + Task 7 outputs.]
Finding 2 closed. Finding 1 dispatcher entry frequency mismatch
[status — closed as side effect / still wide / TBD per re-capture].
Next: Finding 3 (cell-resolver sling-out from scen4). Separate plan
when A6.P3 slice 2 is scoped.
```
- [ ] **Step 2: Update the roadmap A6.P3 entry**
In `docs/plans/2026-04-11-roadmap.md` find the `- **A6.P3 — Fix the BSP correction paths**` line and update to show slice 1 SHIPPED with the commits.
- [ ] **Step 3: Update CLAUDE.md Currently-working-toward block**
In `CLAUDE.md` find the M1.5 + A6.P3 block. Update the "Current phase" line to reflect slice 1 ship + next slice (slice 2 = Finding 3 OR slice 2 = Mechanism C if needed).
- [ ] **Step 4: Commit the bookkeeping**
```bash
git add docs/research/2026-05-21-a6-cdb-capture-findings.md
git add docs/plans/2026-04-11-roadmap.md
git add CLAUDE.md
git commit -m "docs(roadmap+findings): A6.P3 slice 1 — SHIPPED
CP-write resynthesis blowup (Finding 2) closed. scen1/3/5 re-captures
confirm ratio drop from 250-∞x to ≤10x. Strip-synthesis + Mechanism B
land. TryFindIndoorWalkablePlane retained pending A6.P4 deletion.
Next: assess Finding 1 (dispatcher entry frequency) post-fix; if
still wide, scope as A6.P3 slice 2. Otherwise proceed to Finding 3
(cell-resolver sling-out) as slice 2.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>"
```
---
## Self-review checklist (for the implementer, before declaring slice 1 done)
- [ ] Task 1's research note exists and is committed.
- [ ] CollisionInfo.ContactPlaneWriteCount counter exists and is incremented by SetContactPlane.
- [ ] IndoorContactPlaneRetentionTests exists, fails pre-fix, passes post-fix.
- [ ] Mechanism B (LKCP restore) is wired at the correct per-transition site.
- [ ] FindEnvCollisions indoor branch is stripped to match retail's CEnvCell::find_env_collisions shape.
- [ ] TryFindIndoorWalkablePlane definition is NOT deleted (deferred to A6.P4).
- [ ] Build green; full test suite 1147+8 green.
- [ ] scen3 re-capture cp-write count ≤ 200.
- [ ] scen1 + scen5 re-capture cp-write ratios ≤ 10x.
- [ ] Visual verification at Holtburg inn 2nd floor passes (no fall-through, no jitter).
- [ ] Findings doc + roadmap + CLAUDE.md updated.
- [ ] If visual verification revealed a regression: rolled back and either added Mechanism C (slice 2 promotion) or applied a narrower gating fix.
If all 12 items check ✅, slice 1 is shipped. Next: either Finding 3 (separate plan) or Mechanism C if first-frame fall-through was reported.

File diff suppressed because it is too large Load diff

View file

@ -0,0 +1,481 @@
# A6.P3 issue #98 — cellar-up fix plan (diagnostic-first)
> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.
**Goal:** Close issue #98 (sphere stuck on cottage cellar ramp) by identifying the EXACT failure point through a focused diagnostic probe, then writing a minimal evidence-driven fix. The user can walk up and out of a Holtburg cottage cellar in the live client.
**Architecture:** Three phases. Phase 0 re-reads the existing capture for everything the divergence comparison missed — no code. Phase 1 adds ONE focused probe at the only site where existing instrumentation is silent (inside `AdjustOffset`). Phase 2 runs the probe and decides which of three branches the fix must take. Phase 3 writes the fix against the replay harness as TDD oracle.
**Tech Stack:** C# .NET 10. The deterministic replay loop ([Issue98CellarUpReplayTests.cs](../../tests/AcDream.Core.Tests/Physics/Issue98CellarUpReplayTests.cs), <200ms) is the inner test loop. The cdb capture script ([tools/cdb/issue98-runner.ps1](../../tools/cdb/issue98-runner.ps1)) is the outer ground-truth loop.
**Risk:** Four prior sessions guessed-and-shipped (10+ variants). The pattern: convinced of diagnosis → ship fix → user reports "still can't pass." This plan refuses to ship code until the diagnostic data NAMES the failure site.
---
## What the existing log already proves (Phase 0 already partially done)
The slice 7 handoff and the divergence comparison both claimed *"our sphere oscillates at world Z ≈ 92.01 with no altitude gain."* That framing is **wrong**. Scanning the 2,589 `[step-walk]` lines in [a6-issue98-negpoly-20260523-135032.out.log](../../a6-issue98-negpoly-20260523-135032.out.log) shows:
| What the log shows | Implication |
|---|---|
| Sphere `cur` Z climbs **90.00 → 92.79** across the capture (2.79 m gain). | The climb works for most of the ramp. Z gain per `[step-walk] site=after-adjust` is +0.197 to +0.227 when offset points INTO the ramp normal. |
| Climb **caps at world Z ≈ 92.79**, then descends back. | Something at the top of the ramp prevents the last ~1.21 m of climb to cottage floor (world Z=94). |
| At the peak (line 1745817485 in the log): `before-insert` has `check=(141.58, 7.18, 92.79)`, `walkPoly=True`. `after-insert` has same position but **`walkPoly=False`** + `winterp=-0.0000`. | `DoStepDown` at the peak burned `WalkInterp` from 1.0 → 0 in one tick and the sphere ended up off the walkable polygon. |
| `cell=0xA9B40147->0xA9B40147` for every line. | Our sphere never transitions to a cottage cell (`0xA9B40143` / `0xA9B40146`) during the climb. Retail's BPA capture shows queries against cottage cells starting at BPA hit#430. |
| ContactPlane normal is `(0, +0.719, +0.695)` in our log. The divergence doc says retail's ramp normal is `(0, -0.719, +0.695)` (sign-flipped) — but retail's BPE NEVER writes the ramp as ContactPlane at all (only flat cellar floor or flat cottage floor). | Retail does not use the ramp as a ContactPlane. Our code does. This is a SHAPE-of-the-fix question for later; do not act on it before the diagnostic confirms it matters. |
**Revised hypothesis (informed by the data, not the divergence doc):**
The climb works while the sphere is mid-ramp. It **fails at the top of the ramp** where the polygon ends. Three mutually-exclusive failure modes are plausible:
1. **Geometric cap.** The ramp polygon physically ends at world Z ≈ 92.79; there is no further walkable surface within the step-up reach. Retail's sphere reaches the cottage floor by a transition we never trigger.
2. **Cell-set divergence.** At world Z ≈ 92.79 the sphere overlaps cottage cell volumes, but our `CheckOtherCells` either doesn't query the cottage cell, or queries it with the wrong sphere reference (foot vs lifted). Retail's BPA hit#430 / hit#434 show queries against TWO different leaves at SIGNIFICANTLY different sphere positions (cell A: local 0.48, cell B: local 0.63) — that's the retail two-step-up pattern at work, which we may not be doing.
3. **WalkInterp depletion before forward motion applies.** The peak `[step-walk]` line at 17485 shows `winterp=-0.0000` after `DoStepDown`. If `DoStepDown` consumed all WalkInterp on the lift (the slice 7 handoff's reading), the next tick starts WalkInterp at 1.0 again — so this isn't the inter-tick blocker the handoff thought. But INTRA-tick, the next call's `AdjustOffset` runs with no remaining WalkInterp, which could legitimately gate further motion.
**The point of Phase 1 is to disambiguate between (1), (2), and (3).**
---
## Phase 1 — focused diagnostic (no code logic changes)
Add ONE probe site inside `AdjustOffset` and rerun. The existing `[step-walk]` probe instruments the call from the outside (req → adj across the call) but never reveals which **branch** AdjustOffset took. The new probe reveals the branch; that's the missing signal.
### Task 1.1: Add `[step-walk-adjust]` probe call inside AdjustOffset
**Files:**
- Modify: `src/AcDream.Core/Physics/PhysicsDiagnostics.cs` (add one new helper)
- Modify: `src/AcDream.Core/Physics/TransitionTypes.cs:2635-2739` (add 4 probe calls inside AdjustOffset's branches)
- [ ] **Step 1: Add the diagnostic emitter to PhysicsDiagnostics.cs**
Add this method right after `LogStepWalk` (currently ending around line 700). It is intentionally separate from `LogStepWalk` so the line format stays short and greppable.
```csharp
/// <summary>
/// A6.P3 issue #98 (2026-05-23) — focused probe INSIDE AdjustOffset
/// revealing which branch was taken and the Z gain per call. Pair with
/// <c>[step-walk] site=after-adjust</c> at the call site to triangulate
/// where the projection ends up. Caller MUST guard with
/// <c>if (!ProbeStepWalkEnabled) return;</c> before calling.
/// </summary>
public static void LogStepWalkAdjust(
string branch,
Vector3 input,
Vector3 output,
Plane? contactPlane,
bool slidingValid,
Vector3 slidingNormal,
float collisionAngle,
float walkInterp)
{
var culture = System.Globalization.CultureInfo.InvariantCulture;
string cpDesc = contactPlane is { } cp
? string.Format(culture,
"n=({0:F4},{1:F4},{2:F4}) d={3:F4}",
cp.Normal.X, cp.Normal.Y, cp.Normal.Z, cp.D)
: "n/a";
string slideDesc = slidingValid
? string.Format(culture,
"({0:F4},{1:F4},{2:F4})",
slidingNormal.X, slidingNormal.Y, slidingNormal.Z)
: "n/a";
Console.WriteLine(string.Format(culture,
"[step-walk-adjust] branch={0} input=({1:F4},{2:F4},{3:F4}) " +
"output=({4:F4},{5:F4},{6:F4}) zGain={7:F4} " +
"cp={8} slide={9} colAngle={10:F4} winterp={11:F4}",
branch,
input.X, input.Y, input.Z,
output.X, output.Y, output.Z,
output.Z - input.Z,
cpDesc, slideDesc, collisionAngle, walkInterp));
}
```
- [ ] **Step 2: Wire the probe at the four AdjustOffset branches**
Open [TransitionTypes.cs](../../src/AcDream.Core/Physics/TransitionTypes.cs) at line 2635 (`private Vector3 AdjustOffset(Vector3 offset)`). The body currently has FOUR exit/branch points:
1. **no-cp** path (line 26542659): `if (!ci.ContactPlaneValid) return result;`
2. **slide** path (lines 26652676): `if (checkSlide) { ... result = ... }`
3. **into-plane** path (lines 26772681): `else if (collisionAngle <= 0f) { result -= ... }`
4. **away-plane** path (lines 26822687): `else { result -= ... }` — same arithmetic as `into-plane`, kept distinct for the probe.
At the FINAL return (line 2738 `return result;`), emit ONE probe line that captures the branch taken and the final result.
Replace lines 26352739 with a body that tracks the branch token (initialize to `"unknown"`) and assigns it at each branch point. Concretely:
```csharp
private Vector3 AdjustOffset(Vector3 offset)
{
var sp = SpherePath;
var ci = CollisionInfo;
Vector3 result = offset;
bool checkSlide = false;
string branch = "init"; // ← new
// Check if we should apply sliding.
float slidingAngle = Vector3.Dot(result, ci.SlidingNormal);
if (ci.SlidingNormalValid)
{
if (slidingAngle < 0f)
checkSlide = true;
else
ci.SlidingNormalValid = false;
}
// No contact plane — simple slide projection.
if (!ci.ContactPlaneValid)
{
if (checkSlide)
{
result -= ci.SlidingNormal * slidingAngle;
branch = "no-cp-slide"; // ← new
}
else
{
branch = "no-cp"; // ← new
}
if (PhysicsDiagnostics.ProbeStepWalkEnabled) // ← new
PhysicsDiagnostics.LogStepWalkAdjust(
branch, offset, result,
contactPlane: null,
slidingValid: ci.SlidingNormalValid,
slidingNormal: ci.SlidingNormal,
collisionAngle: 0f,
walkInterp: sp.WalkInterp);
return result;
}
// Have a contact plane — project movement onto the contact surface.
float collisionAngle = Vector3.Dot(result, ci.ContactPlane.Normal);
Vector3 slideOffset = Vector3.Cross(ci.ContactPlane.Normal, ci.SlidingNormal);
if (checkSlide)
{
// Project movement along the crease between contact and slide planes.
float slideLen = slideOffset.Length();
if (slideLen < PhysicsGlobals.EPSILON)
{
result = Vector3.Zero;
branch = "slide-degenerate"; // ← new
}
else
{
slideOffset /= slideLen;
result = Vector3.Dot(slideOffset, result) * slideOffset;
branch = "slide-crease"; // ← new
}
}
else if (collisionAngle <= 0f)
{
// Moving into the contact plane: remove component into the plane.
result -= ci.ContactPlane.Normal * collisionAngle;
branch = "into-plane"; // ← new
}
else
{
// Moving away from contact plane: snap to plane surface.
result -= ci.ContactPlane.Normal * collisionAngle;
branch = "away-plane"; // ← new
}
// ── (Existing safety check unchanged — keep lines 26892736 verbatim) ──
if (ci.ContactPlaneCellId != 0 && !ci.ContactPlaneIsWater)
{
Vector3 globCenter = sp.GlobalSphere[0].Origin;
float radius = sp.GlobalSphere[0].Radius;
float dist = Vector3.Dot(globCenter, ci.ContactPlane.Normal)
+ ci.ContactPlane.D;
float naturalRestingDist = radius * ci.ContactPlane.Normal.Z;
if (dist < naturalRestingDist - PhysicsGlobals.EPSILON)
{
float zDist = (naturalRestingDist - dist) / ci.ContactPlane.Normal.Z;
if (radius > MathF.Abs(zDist))
{
sp.AddOffsetToCheckPos(new Vector3(0f, 0f, zDist));
branch += "+safety-push"; // ← new — branch annotation
}
}
}
if (PhysicsDiagnostics.ProbeStepWalkEnabled) // ← new
PhysicsDiagnostics.LogStepWalkAdjust(
branch, offset, result,
contactPlane: ci.ContactPlane,
slidingValid: ci.SlidingNormalValid,
slidingNormal: ci.SlidingNormal,
collisionAngle: collisionAngle,
walkInterp: sp.WalkInterp);
return result;
}
```
**What this adds:** ONE log line per AdjustOffset call (probe-gated) naming the branch and showing Z gain. Nothing else changes — the math is identical.
- [ ] **Step 3: Run dotnet build to confirm the edit compiles**
```powershell
dotnet build src\AcDream.Core\AcDream.Core.csproj -c Debug
```
Expected: green build. The diff is additive (one new diagnostic helper + four single-line probe gates inside an existing method).
- [ ] **Step 4: Run dotnet test to confirm the apparatus tests stay green**
```powershell
dotnet test tests\AcDream.Core.Tests\AcDream.Core.Tests.csproj -c Debug --filter "FullyQualifiedName~Issue98CellarUpReplayTests"
```
Expected: all 7 Issue98CellarUpReplayTests pass (they document the bug; the failing-frame assertions still pin the current behavior). No new failures elsewhere.
- [ ] **Step 5: Commit the probe-only change**
```powershell
git add src\AcDream.Core\Physics\PhysicsDiagnostics.cs src\AcDream.Core\Physics\TransitionTypes.cs
git commit -m "diag(phys): A6.P3 #98 — [step-walk-adjust] probe inside AdjustOffset
Adds one log line per AdjustOffset call (gated by ACDREAM_PROBE_STEP_WALK)
naming the branch taken (no-cp / no-cp-slide / slide-degenerate /
slide-crease / into-plane / away-plane, optionally +safety-push) plus
zGain = output.Z - input.Z.
No math changes — pure observability so the next capture can disambiguate
the three failure-mode hypotheses for the cellar-ramp climb cap at
world Z ≈ 92.79."
```
---
### Task 1.2: Capture with the new probe
- [ ] **Step 1: Confirm ACE is up and the test character is in the cottage cellar**
The character is `+Acdream` (server guid `0x5000000A`). Stand on the cellar ramp, facing the top.
- [ ] **Step 2: Launch the client with the step-walk probe enabled**
```powershell
$env:ACDREAM_DAT_DIR = "$env:USERPROFILE\Documents\Asheron's Call"
$env:ACDREAM_LIVE = "1"
$env:ACDREAM_TEST_HOST = "127.0.0.1"
$env:ACDREAM_TEST_PORT = "9000"
$env:ACDREAM_TEST_USER = "testaccount"
$env:ACDREAM_TEST_PASS = "testpassword"
$env:ACDREAM_PROBE_STEP_WALK = "1"
$timestamp = Get-Date -Format "yyyyMMdd-HHmmss"
dotnet run --project src\AcDream.App\AcDream.App.csproj --no-build -c Debug 2>&1 |
Tee-Object -FilePath "a6-issue98-stepwalkadjust-$timestamp.out.log"
```
- [ ] **Step 3: User walks SLOWLY up the cellar ramp until stuck — no 180° turn-around**
The previous capture (negpoly log) polluted the trajectory because the user turned around and walked back. This capture must be a clean monotone climb.
- [ ] **Step 4: User closes the client (graceful — Alt-F4 / window close, not Ctrl-C)**
Per the logout-before-reconnect rules in CLAUDE.md — hard kill costs 3 minutes of ACE session recovery.
- [ ] **Step 5: Snapshot the capture to docs/research**
```powershell
$captureDir = "docs\research\2026-05-23-a6-captures\stepwalkadjust"
New-Item -ItemType Directory -Force $captureDir | Out-Null
Copy-Item "a6-issue98-stepwalkadjust-$timestamp.out.log" "$captureDir\acdream.log"
```
---
### Task 1.3: Analyze the new probe data
- [ ] **Step 1: Extract the [step-walk-adjust] lines from the climb portion only**
```powershell
Select-String "step-walk-adjust" "docs\research\2026-05-23-a6-captures\stepwalkadjust\acdream.log" |
Select-Object -ExpandProperty Line |
Set-Content "docs\research\2026-05-23-a6-captures\stepwalkadjust\adjust-only.log"
```
- [ ] **Step 2: Cross-correlate [step-walk-adjust] with [step-walk] site=after-adjust by line position**
For each ramp-climb tick, you should see the pattern:
```
[step-walk] site=after-adjust cur=(...) req=(rX,rY,0) adj=(aX,aY,aZ) ...
[step-walk-adjust] branch=... input=(rX,rY,0) output=(aX,aY,aZ) zGain=aZ cp=(...) ...
```
- [ ] **Step 3: Classify the climb by branch token**
Walk forward through the lines from the start of the climb to the peak (world Z ≈ 92.79). Build a histogram:
```
branch=into-plane — how many calls? Avg zGain?
branch=away-plane — how many? Avg zGain?
branch=slide-crease — how many? Avg zGain?
branch=no-cp — how many? (means ContactPlaneValid cleared mid-climb)
+safety-push annotation — how often? At what zGain?
```
- [ ] **Step 4: Decide which Phase 2 branch the fix takes**
Decision tree (read the histogram + the climb-cap moment together):
| Observation | Implies fix target | Phase 2 branch |
|---|---|---|
| `into-plane` dominates the climb, +zGain ~0.2/call, then at peak the branch flips to `no-cp` or `away-plane` with zero zGain | **Target A: ContactPlane is being cleared / replaced at the ramp top.** Fix: investigate why the ramp's CP is dropping when the climb is incomplete. | **Branch A** |
| `into-plane` continues at peak but zGain becomes near-zero (offset.Y trends to zero) | **Target B: forward motion is being consumed elsewhere.** Either by step-up burning WalkInterp before AdjustOffset gets the offset, or by the offset being slid horizontally by a wall hit. Look at `[step-walk] site=before-insert` for a Y-collapsing pattern. | **Branch B** |
| `into-plane` at peak with correct zGain but CurPos doesn't advance | **Target C: the offset is computed correctly but never committed.** Look at `TransitionalInsert` / `Insert` for a path that returns before commit. | **Branch C** |
| Any branch with frequent `slide-crease` mid-climb | **Side-finding: a SlidingNormal is being set against the ramp.** Should not happen on a clean slope — points at a wall poly being mis-classified. | **Branch D (side-investigation, not the fix)** |
- [ ] **Step 5: Write a 1-page findings note**
Save to `docs/research/2026-05-23-a6-stepwalkadjust-findings.md`. Format:
```markdown
# A6.P3 #98 — [step-walk-adjust] capture analysis (YYYY-MM-DD)
## Climb branch histogram (90.00 → peak)
- into-plane: X calls, avg zGain=Y
- away-plane: ...
- ...
## At the climb cap (world Z ≈ ZZZ):
- Branch flipped from X to Y at log line NNN
- ContactPlane normal at cap: (...)
- WalkInterp at cap: ...
## Conclusion: Fix target is Branch A / B / C / D.
## Reason (one paragraph)
...
```
- [ ] **Step 6: Commit the findings note**
```powershell
git add docs\research\2026-05-23-a6-stepwalkadjust-findings.md docs\research\2026-05-23-a6-captures\stepwalkadjust\
git commit -m "research(phys): A6.P3 #98 — [step-walk-adjust] capture + findings
Identifies fix target Branch [A/B/C/D] for the cellar-up climb cap."
```
---
## Phase 2 — branch to fix target (data-driven)
**Do NOT execute Phase 2 until Phase 1's findings note exists and the user has reviewed it.** This is the explicit lesson from the 4-session failure pattern: shipping speculative fixes wastes a session each time.
Per CLAUDE.md (no-workarounds rule) the fix must address the root cause. Each branch below names the LIKELY fix shape. The actual code lands as a separate plan once Phase 1 confirms the branch.
### Branch A — ContactPlane clearing / replacement at ramp top
If the climb works while CP=ramp and FAILS when CP changes mid-climb:
**Investigate first (no code):**
- Where in `Transition.FindEnvCollisions` / `Transition.SetContactPlane` is the ramp CP dropped? Likely candidates: `[TransitionTypes.cs:1814](../../src/AcDream.Core/Physics/TransitionTypes.cs:1814)` (FindEnvCollisions write), `[TransitionTypes.cs:1837](../../src/AcDream.Core/Physics/TransitionTypes.cs:1837)` (post-stepdown write).
- Compare against retail's BPE pattern: retail's CP toggles cellar-floor → cottage-floor with **no intermediate** (per the divergence comparison table). It never sets CP to the ramp. Our code DOES. The fix MAY be making our ramp-CP behavior match retail (never set CP=ramp; set CP=cellar-floor while climbing, then transition CP=cottage-floor when the sphere is above cottage floor) — but this is a SHAPE-of-the-fix question; verify it's actually needed before changing the contract.
**Probable fix file:** [src/AcDream.Core/Physics/TransitionTypes.cs](../../src/AcDream.Core/Physics/TransitionTypes.cs) around `FindEnvCollisions` (line 17001900).
**Acceptance:** Phase 1's findings note's branch histogram shows the failure point. The fix must produce a log where `cur` Z monotonically increases from 90.00 to ≥ 94.00 across the climb.
### Branch B — forward motion consumed before AdjustOffset
If `into-plane` continues at peak but zGain → 0 because input offset Y → 0:
**Investigate first:**
- Look at `[step-walk] site=before-insert` in the existing log around line 17458 at the peak. Compare the `req` value to the next tick's `req`. If `req.Y` decays to zero across consecutive ticks, the motion source itself is being cut. Trace back to where the per-tick offset is generated — `[PlayerMovementController](../../src/AcDream.Core/Physics/PlayerMovementController.cs)` and the input → velocity → offset chain.
- Compare against retail's BPF (adjust_sphere_to_plane) cadence — 431 hits over 35K BPs suggests retail re-projects relatively frequently but not in a way that zeros out motion. Our code might be over-projecting.
**Probable fix file:** [src/AcDream.Core/Physics/PlayerMovementController.cs](../../src/AcDream.Core/Physics/PlayerMovementController.cs) or the velocity-source upstream of the physics tick.
**Acceptance:** Same as Branch A.
### Branch C — offset computed but never committed (CurPos doesn't advance)
If `[step-walk-adjust] zGain` is correct per call but `[step-walk] cur` doesn't accumulate across calls:
**Investigate first:**
- `TransitionalInsert` in [TransitionTypes.cs](../../src/AcDream.Core/Physics/TransitionTypes.cs) — find the path that returns without committing `sp.CurPos += sp.GlobalOffset`. Likely guard: a collision state that retreats.
- The `[step-walk] delta=(0,0,0)` pattern across consecutive lines is the smoking gun if it persists at the peak.
**Probable fix file:** `TransitionTypes.cs`, the per-step commit at the bottom of the step loop (around line 689760).
**Acceptance:** Same as Branch A.
### Branch D — sliding-normal mis-classification (side-investigation only)
If `slide-crease` appears mid-climb, the ramp polygon is being treated as a wall by some upstream call. This is unlikely to be the fix for #98 — slide is the right answer for walls, the question is which polygon triggered it.
**Investigate:** which polygon's normal got installed as `SlidingNormal`? Add a one-line probe at `ci.SlidingNormal = ...` write sites and capture again.
---
## Phase 3 — write the fix (TDD against replay tests)
Once Phase 1 names the branch and the findings note is committed, write a fresh plan for the fix:
`docs/superpowers/plans/2026-MM-DD-a6-p3-issue98-cellar-up-fix-impl.md`
Use [superpowers:test-driven-development](../../.claude/superpowers/skills/test-driven-development/SKILL.md) discipline:
1. **Flip the replay test assertions FIRST** so they document the FIXED behavior. Two assertions to invert:
- `FailingFrame_CottageNeighborA_NearestWalkableIsOutsideSphereAndEdges` — at minimum one of `OverlapsSphere` / `InsideEdges` must become `true`.
- `FailingFrame_NoCottageNeighbourYieldsAcceptedWalkable` — at least one neighbour cell must accept.
2. Run `dotnet test`; the inverted assertions fail (red).
3. Implement the minimal fix in the file Phase 2 identified.
4. Run `dotnet test`; the inverted assertions pass (green).
5. Run the live client; the user verifies they can walk up out of the cottage cellar.
---
## Acceptance for Phase 1 (the only Phase landing in this plan)
- [ ] `dotnet build` green
- [ ] `dotnet test --filter "FullyQualifiedName~Issue98CellarUpReplayTests"` — 7 passing (no regressions)
- [ ] `dotnet test` overall — baseline 1167 + 8 maintained (no new failures)
- [ ] Diagnostic commit landed (one commit, additive only — no math/control-flow changes)
- [ ] Capture commit landed (acdream.log + findings note)
- [ ] Findings note names ONE branch (A / B / C / D) as the Phase 2 fix target
**Acceptance for Phase 3 (the eventual fix, captured here for the next session):**
- [ ] `Issue98CellarUpReplayTests.FailingFrame_NoCottageNeighbourYieldsAcceptedWalkable` — assertion inverted to require acceptance; test passes.
- [ ] `Issue98CellarUpReplayTests.FailingFrame_CottageNeighborA_NearestWalkableIsOutsideSphereAndEdges` — assertion inverted to require sphere-overlap; test passes.
- [ ] `dotnet build` green.
- [ ] `dotnet test` baseline 1167 + 8 (or better) maintained.
- [ ] User walks up out of the Holtburg cottage cellar in the live client — **visual verification required**.
---
## What this plan does NOT do (per CLAUDE.md no-workarounds rule)
- Does not reattempt placement-insert bypasses in `BSPQuery.FindCollisions`, `Transition.FindEnvCollisions`, or `Transition.DoStepDown`. Six variants already tried; none worked.
- Does not reattempt cell-resolver tiebreaker changes in `PhysicsEngine.ResolveCellId`. Slice 3 already shipped a stickiness fix; the bug persists.
- Does not reattempt negative-side polygon handling. Reverted in `35b37df`; the neg-poly branch fired zero times in the failing log.
- Does not reattempt the bldg-check / IsLandblockBuilding flag propagation. Reverted in `35b37df`.
- Does not add suppression flags, grace periods, retry loops, or `if (problematicState) return` symptom guards.
If at any point during Phase 1 implementation you feel certain about the fix without having seen the new probe's branch histogram, **STOP**. The 4-session pattern was: convinced of diagnosis → ship fix → user reports "still can't pass." Verifying via the probe is what breaks the pattern.
---
## Self-review (writing-plans skill discipline)
**Spec coverage.** The user's prompt asks for: state altitudes (done in handoff), diagnostic-first verification (Task 1.11.3), branch-to-fix decision (Phase 2), TDD against replay tests (Phase 3), forbidden shortcut list (final section). ✓
**Placeholder scan.** No "TBD" / "add error handling" / "similar to Task N" left in the plan. Exact file paths and line numbers throughout. ✓
**Type consistency.** `LogStepWalkAdjust` matches `LogStepWalk`'s signature pattern. Branch tokens are stringly-typed but enumerated explicitly. `[step-walk-adjust]` prefix is distinct from `[step-walk]`. ✓
**Realism.** Phase 1 ships one diagnostic commit + one capture commit. ~30 minutes if everything goes smoothly. Phase 2 is research (no code). Phase 3 is the actual fix — separate plan. The whole arc fits in 12 working days, broken at the user-review checkpoint after Phase 1's findings note.

File diff suppressed because it is too large Load diff

View file

@ -0,0 +1,646 @@
# Issue #100 — Transparent Ground Around Buildings — Implementation Plan
> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.
**Goal:** Replace acdream's cell-level `hiddenTerrainCells` mechanism (which produces 24m × 24m transparent rectangles around every Holtburg house) with retail's per-vertex Z nudge (`zFightTerrainAdjust = 0.00999999978`). Render terrain everywhere and let coplanar building floors win the depth test by being 1 cm higher than the rendered terrain.
**Architecture:** One-line change to [src/AcDream.App/Rendering/Shaders/terrain_modern.vert:139](src/AcDream.App/Rendering/Shaders/terrain_modern.vert:139) — pre-subtract 0.01 from `aPos.z` before the projection multiply, so every terrain vertex renders 1 cm below its physical Z. Physics path untouched (reads the un-nudged heightmap via [TerrainSurface](src/AcDream.Core/Physics/TerrainSurface.cs)). Then delete the `hiddenTerrainCells` / `BuildingTerrainCells` plumbing that's been threading through `LandblockMesh.Build`, `LoadedLandblock`, `LandblockLoader`, `GameWindow`, `GpuWorldState`, and `LandblockStreamer` — ~50 LOC of dead surface area once the nudge replaces it.
**Tech Stack:** GLSL 460 (terrain shader), C# .NET 10 (Core + App layers), xUnit tests, dotnet CLI.
**Retail oracle:**
- `docs/research/named-retail/acclient_2013_pseudo_c.txt:1120769``float zFightTerrainAdjust = 0.00999999978`
- `docs/research/named-retail/acclient_2013_pseudo_c.txt:702254` (address `006b6402`) — `edi_4[1] = ((float)(((long double)esi_1[2]) - ((long double)zFightTerrainAdjust)));` inside `ACRender::landPolysDraw(arg2=2)`
**Predecessor research:** [docs/research/2026-05-25-issue-100-terrain-cutout-handoff.md](docs/research/2026-05-25-issue-100-terrain-cutout-handoff.md). Phase 1 verification: see chat transcript (research confirmed end-to-end).
---
## Constraints
1. **Render-only.** The Z nudge MUST land in the shader, NOT in `LandblockMesh.Build` vertex output. Physics reads terrain Z from the same source — if we modify the mesh data, physics breaks too.
2. **Constant value `0.01f`.** Match retail's literal `0.00999999978` to single-precision: `0.01f` is bit-identical when round-tripped. Don't use `glPolygonOffset` (slope-dependent, hardware-variable); use a constant world-Z subtract in the vertex shader.
3. **No belt-and-suspenders.** Per [handoff §do-not-retry #5](docs/research/2026-05-25-issue-100-terrain-cutout-handoff.md), don't keep the `hiddenTerrainCells` mechanism alongside the nudge — delete it.
4. **One commit per logical change.** Don't bundle the shader nudge with the plumbing removal — keep the bisect window honest if a regression appears.
5. **Test-suite baseline.** Pre-flight `dotnet test` produces a baseline number for the focused suites (some pre-existing flakiness exists per the A6.P3 evening v2 follow-on note in CLAUDE.md). Each task must not increase the failure count.
---
## File Structure
**Files modified:**
| File | Role | Change |
|---|---|---|
| `src/AcDream.App/Rendering/Shaders/terrain_modern.vert` | Terrain vertex shader | Add Z nudge before projection multiply |
| `src/AcDream.Core/Terrain/LandblockMesh.cs` | Terrain mesh builder | Drop `hiddenTerrainCells` parameter + collapse block |
| `src/AcDream.Core/World/LoadedLandblock.cs` | Loaded-landblock DTO | Drop `BuildingTerrainCells` field |
| `src/AcDream.Core/World/LandblockLoader.cs` | Dat → LoadedLandblock | Drop `BuildBuildingTerrainCells` method + its call |
| `src/AcDream.App/Rendering/GameWindow.cs` | Runtime wiring (3 sites) | Drop the field reference at each Build / ctor site |
| `src/AcDream.App/Streaming/GpuWorldState.cs` | World-state owner (6 ctor sites) | Drop the 4th arg at each ctor site |
| `src/AcDream.App/Streaming/LandblockStreamer.cs` | Worker-side hydration | Drop the 4th arg at the one ctor site |
| `tests/AcDream.Core.Tests/World/LandblockLoaderTests.cs` | Loader unit tests | Delete `BuildBuildingTerrainCells_*` test |
| `docs/ISSUES.md` | Issue tracker | Close #100 with the commit SHA |
**Files NOT modified (verified):**
- `src/AcDream.Core/Physics/TerrainSurface.cs` — physics reads un-nudged Z; unaffected.
- `src/AcDream.App/Rendering/TerrainModernRenderer.cs` — consumes `LandblockMeshData` (vertices unchanged).
- `tests/AcDream.Core.Tests/Terrain/LandblockMeshTests.cs` — no `hiddenTerrainCells` test exists here (handoff was wrong about this surface; the `LandblockMesh.Build` test surface is for triangle/index correctness, not the cell-hide feature).
- All `references/WorldBuilder/**` — read-reference; unchanged.
---
## Pre-flight
- [ ] **Step 0.1: Confirm working tree is on the expected branch**
```powershell
git rev-parse --abbrev-ref HEAD
```
Expected: `claude/strange-albattani-3fc83c` (or whatever branch the user is operating on; not `main`).
- [ ] **Step 0.2: Establish the pre-fix test baseline**
```powershell
dotnet build
```
Expected: `Build succeeded.` (warnings OK, no errors).
```powershell
dotnet test --nologo --no-build --verbosity quiet
```
Capture the total / passed / failed numbers. **Record the baseline failure count** — that's the regression sentinel for Tasks 1 and 2. Per CLAUDE.md, some pre-existing static-state-leak flakiness (819 failures across runs) is known; it's independent of issue #100. The plan's regression check is "failures didn't grow."
---
## Task 1: Terrain shader Z nudge
**Files:**
- Modify: [src/AcDream.App/Rendering/Shaders/terrain_modern.vert:139](src/AcDream.App/Rendering/Shaders/terrain_modern.vert:139)
The single substantive change. After this commit the rectangles around buildings are still there (the `hiddenTerrainCells` plumbing still collapses the terrain inside each building's outdoor cell). Task 2 removes that.
- [ ] **Step 1.1: Edit the terrain vertex shader**
In `src/AcDream.App/Rendering/Shaders/terrain_modern.vert`, replace the final two lines of `void main()` (currently lines 138139, the blank line before `gl_Position` and the `gl_Position` write itself):
```glsl
gl_Position = uProjection * uView * vec4(aPos, 1.0);
```
with:
```glsl
// Retail zFightTerrainAdjust (acclient_2013_pseudo_c.txt:1120769 = 0.00999999978,
// applied per terrain vertex inside ACRender::landPolysDraw at line 702254,
// address 006b6402). Render terrain 1 cm below its physical Z so coplanar
// building floors win the depth test. Physics path is unaffected — it reads
// the un-nudged heightmap via TerrainSurface.SampleZ.
// Closes issue #100; supersedes the hiddenTerrainCells cell-collapse hack.
vec3 terrainPos = vec3(aPos.xy, aPos.z - 0.01);
gl_Position = uProjection * uView * vec4(terrainPos, 1.0);
```
- [ ] **Step 1.2: Build to verify the shader file compiles into the binary**
The shader is copied to `bin/Debug/net10.0/Rendering/Shaders/terrain_modern.vert` at build time (it's a `CopyToOutputDirectory` content item). The C# build doesn't statically validate GLSL, but it does confirm the file is in the right place.
```powershell
dotnet build
```
Expected: `Build succeeded.` (warnings OK, 0 errors).
- [ ] **Step 1.3: Run the focused test suites to make sure we haven't broken anything**
```powershell
dotnet test --nologo --no-build --verbosity quiet
```
Expected: total failure count ≤ the baseline from Step 0.2. The shader change cannot affect any non-rendering test, so this is a sanity check that nothing else regressed during the build.
- [ ] **Step 1.4: Commit**
```powershell
git add src/AcDream.App/Rendering/Shaders/terrain_modern.vert
git commit -m @'
fix(render): #100 — render terrain 1 cm below physical Z (retail zFightTerrainAdjust)
Subtract 0.01 from every terrain vertex Z in the modern terrain vertex
shader, matching retail's per-draw nudge applied inside
ACRender::landPolysDraw(arg2=2). Coplanar building floors now always win
the depth test against the rendered terrain, so the visual "ground at
the building floor" reads as the building's floor, not as Z-fighting.
Constant 0.01f bit-equals retail's float literal 0.00999999978 when
rounded to single precision.
Render-only — physics reads the un-nudged heightmap via
TerrainSurface.SampleZ / SampleZFromHeightmap. The same render-vs-
physics split is already established for EnvCell render lift
(+0.02m at GameWindow.cs around the cell-mesh draw).
Retail anchors:
docs/research/named-retail/acclient_2013_pseudo_c.txt:1120769
docs/research/named-retail/acclient_2013_pseudo_c.txt:702254
Cross-ref:
docs/research/2026-05-25-issue-100-terrain-cutout-handoff.md
docs/superpowers/plans/2026-05-25-issue-100-terrain-cutout.md
Followed by Task 2 (delete the hiddenTerrainCells / BuildingTerrainCells
plumbing). Visible result of this commit alone: building floors stop
Z-fighting, but the 24m × 24m transparent rectangles persist until the
plumbing is removed.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
'@
```
---
## Task 2: Remove `hiddenTerrainCells` / `BuildingTerrainCells` plumbing
**Files:**
- Modify: `src/AcDream.Core/Terrain/LandblockMesh.cs` (drop parameter + collapse block)
- Modify: `src/AcDream.Core/World/LoadedLandblock.cs` (drop field)
- Modify: `src/AcDream.Core/World/LandblockLoader.cs` (drop method + call)
- Modify: `src/AcDream.App/Rendering/GameWindow.cs` (lines 1809, 5149, 8806)
- Modify: `src/AcDream.App/Streaming/GpuWorldState.cs` (6 ctor sites)
- Modify: `src/AcDream.App/Streaming/LandblockStreamer.cs` (line 231235)
- Modify: `tests/AcDream.Core.Tests/World/LandblockLoaderTests.cs` (delete `BuildBuildingTerrainCells_*` test)
Pure removal. The cell-collapse mechanism is the cause of the 24m transparent rectangles around buildings; Task 1's shader nudge replaces it. After this commit the rectangles are gone and terrain renders continuously under every building.
**Order matters within this task:** start at the deepest leaf (`LandblockMesh.Build`'s parameter and collapse block), then work outward through the data type (`LoadedLandblock` record), the loader, and finally the callers. Build at each step to catch shifted-line surprises. The commit at the end captures the whole removal.
- [ ] **Step 2.1: Remove the `hiddenTerrainCells` parameter from `LandblockMesh.Build`**
In `src/AcDream.Core/Terrain/LandblockMesh.cs`:
Delete the docs line for the parameter (currently line 43):
```csharp
/// <param name="hiddenTerrainCells">Optional cell indices (cy * 8 + cx) to draw as zero-area triangles.</param>
```
Change the `Build` signature (currently lines 4451) from:
```csharp
public static LandblockMeshData Build(
LandBlock block,
uint landblockX,
uint landblockY,
float[] heightTable,
TerrainBlendingContext ctx,
System.Collections.Generic.IDictionary<uint, SurfaceInfo> surfaceCache,
System.Collections.Generic.IReadOnlySet<int>? hiddenTerrainCells = null)
```
to:
```csharp
public static LandblockMeshData Build(
LandBlock block,
uint landblockX,
uint landblockY,
float[] heightTable,
TerrainBlendingContext ctx,
System.Collections.Generic.IDictionary<uint, SurfaceInfo> surfaceCache)
```
Replace the index-build loop (currently lines 171185, between the closing `}` of the cell loop and the `return new LandblockMeshData(...)`):
```csharp
// Indices are trivial 0..383 since we don't deduplicate verts. When
// a building owns an outdoor terrain cell, keep the fixed 384-index
// contract but collapse its two triangles so the building/stair mesh
// can visually own the hole.
for (uint i = 0; i < VerticesPerLandblock; i++)
{
int cellIdx = (int)i / VerticesPerCell;
if (hiddenTerrainCells is not null && hiddenTerrainCells.Contains(cellIdx))
{
indices[i] = (uint)(cellIdx * VerticesPerCell);
continue;
}
indices[i] = i;
}
```
with:
```csharp
// Indices are trivial 0..383 since we don't deduplicate verts.
for (uint i = 0; i < VerticesPerLandblock; i++)
indices[i] = i;
```
**Don't build yet** — `LoadedLandblock.BuildingTerrainCells` references will still be present in other files; we'll align them in the next steps before the first build.
- [ ] **Step 2.2: Remove the `BuildingTerrainCells` field from `LoadedLandblock`**
In `src/AcDream.Core/World/LoadedLandblock.cs`, replace the entire file:
```csharp
using DatReaderWriter.DBObjs;
namespace AcDream.Core.World;
public sealed record LoadedLandblock(
uint LandblockId,
LandBlock Heightmap,
IReadOnlyList<WorldEntity> Entities);
```
- [ ] **Step 2.3: Remove `BuildBuildingTerrainCells` from `LandblockLoader` and update the `Load` caller**
In `src/AcDream.Core/World/LandblockLoader.cs`, replace the entire body of the `Load` method (currently lines 1631) with:
```csharp
public static LoadedLandblock? Load(DatCollection dats, uint landblockId)
{
var block = dats.Get<LandBlock>(landblockId);
if (block is null)
return null;
var info = dats.Get<LandBlockInfo>((landblockId & 0xFFFF0000u) | 0xFFFEu);
var entities = info is null
? Array.Empty<WorldEntity>()
: BuildEntitiesFromInfo(info, landblockId);
return new LoadedLandblock(landblockId, block, entities);
}
```
Then delete the `BuildBuildingTerrainCells` method entirely (currently lines 3350, the `/// <summary>` block and the method body). The deleted lines look like:
```csharp
/// <summary>
/// Map LandBlockInfo.Buildings to 8x8 terrain mesh cells (cy * 8 + cx).
/// Retail attaches each CBuildingObj to its outside landcell during
/// CLandBlock::init_buildings; keep this signal separate from stabs so
/// ordinary static props do not punch holes in terrain.
/// </summary>
public static IReadOnlySet<int> BuildBuildingTerrainCells(LandBlockInfo info)
{
var result = new HashSet<int>();
foreach (var building in info.Buildings)
{
int cx = Math.Clamp((int)(building.Frame.Origin.X / 24f), 0, 7);
int cy = Math.Clamp((int)(building.Frame.Origin.Y / 24f), 0, 7);
result.Add(cy * 8 + cx);
}
return result;
}
```
- [ ] **Step 2.4: Update the two `LandblockMesh.Build` call sites in `GameWindow.cs`**
In `src/AcDream.App/Rendering/GameWindow.cs`:
**Site 1 (around line 18081809):** replace
```csharp
return AcDream.Core.Terrain.LandblockMesh.Build(
lb.Heightmap, lbX, lbY, _heightTable, _blendCtx, _surfaceCache, lb.BuildingTerrainCells);
```
with
```csharp
return AcDream.Core.Terrain.LandblockMesh.Build(
lb.Heightmap, lbX, lbY, _heightTable, _blendCtx, _surfaceCache);
```
**Site 2 (around line 88058806):** replace
```csharp
return AcDream.Core.Terrain.LandblockMesh.Build(
lb.Heightmap, lbX, lbY, _heightTable, _blendCtx, _surfaceCache, lb.BuildingTerrainCells);
```
with
```csharp
return AcDream.Core.Terrain.LandblockMesh.Build(
lb.Heightmap, lbX, lbY, _heightTable, _blendCtx, _surfaceCache);
```
**Site 3 (around line 51455149):** the `new LoadedLandblock(...)` ctor that passes `baseLoaded.BuildingTerrainCells`. Replace
```csharp
return new AcDream.Core.World.LoadedLandblock(
baseLoaded.LandblockId,
baseLoaded.Heightmap,
merged,
baseLoaded.BuildingTerrainCells);
```
with
```csharp
return new AcDream.Core.World.LoadedLandblock(
baseLoaded.LandblockId,
baseLoaded.Heightmap,
merged);
```
- [ ] **Step 2.5: Update the six `LoadedLandblock` ctor sites in `GpuWorldState.cs`**
In `src/AcDream.App/Streaming/GpuWorldState.cs`, each of the six ctor sites currently passes a 4th argument referencing `BuildingTerrainCells`. Drop that argument at each site. The line numbers may shift as edits land — use Grep with pattern `BuildingTerrainCells` to find each site, and edit each one to drop its 4th argument and the trailing comma on the previous line.
For example, the site around line 176180 changes from:
```csharp
landblock = new LoadedLandblock(
landblock.LandblockId,
landblock.Heightmap,
merged,
landblock.BuildingTerrainCells);
```
to:
```csharp
landblock = new LoadedLandblock(
landblock.LandblockId,
landblock.Heightmap,
merged);
```
The site around line 344 is a one-line form:
```csharp
_loaded[kvp.Key] = new LoadedLandblock(lb.LandblockId, lb.Heightmap, newList, lb.BuildingTerrainCells);
```
becomes:
```csharp
_loaded[kvp.Key] = new LoadedLandblock(lb.LandblockId, lb.Heightmap, newList);
```
Apply the same transform to all six sites. After this step there must be **zero** matches for `BuildingTerrainCells` in `src/AcDream.App/Streaming/GpuWorldState.cs`.
- [ ] **Step 2.6: Update the one `LoadedLandblock` ctor site in `LandblockStreamer.cs`**
In `src/AcDream.App/Streaming/LandblockStreamer.cs`, around line 231235, replace
```csharp
lb = new LoadedLandblock(
lb.LandblockId,
lb.Heightmap,
System.Array.Empty<AcDream.Core.World.WorldEntity>(),
lb.BuildingTerrainCells);
```
with
```csharp
lb = new LoadedLandblock(
lb.LandblockId,
lb.Heightmap,
System.Array.Empty<AcDream.Core.World.WorldEntity>());
```
- [ ] **Step 2.7: Delete the `BuildBuildingTerrainCells_*` test**
In `tests/AcDream.Core.Tests/World/LandblockLoaderTests.cs`, delete the test method `BuildBuildingTerrainCells_UsesBuildingsOnlyAndMapsToMeshCellIndex` (currently lines 120147 — the `[Fact]` attribute, the method declaration, the body, and the trailing blank line). The deleted block looks like:
```csharp
[Fact]
public void BuildBuildingTerrainCells_UsesBuildingsOnlyAndMapsToMeshCellIndex()
{
var info = new LandBlockInfo
{
Objects =
{
new Stab
{
Id = 0x02000001u,
Frame = new Frame { Origin = new Vector3(120, 72, 0) },
},
},
Buildings =
{
new BuildingInfo
{
ModelId = 0x020000AAu,
Frame = new Frame { Origin = new Vector3(141.5f, 7.2f, 94f) },
},
},
};
var cells = LandblockLoader.BuildBuildingTerrainCells(info);
Assert.Single(cells);
Assert.Contains(5, cells); // cy=0, cx=5 => mesh index cy * 8 + cx.
}
```
Leave the rest of the test class untouched.
- [ ] **Step 2.8: Sweep for any remaining `BuildingTerrainCells` / `hiddenTerrainCells` / `BuildBuildingTerrainCells` references**
```powershell
# Should return ONLY hits in docs/ (handoff doc, ISSUES.md historical entries, archived plans).
# Zero hits in src/ and tests/.
```
Use Grep with pattern `BuildingTerrainCells|hiddenTerrainCells|BuildBuildingTerrainCells` over the repo. Confirm `src/` and `tests/` are clean; `docs/` may still reference these names historically and that's fine.
- [ ] **Step 2.9: Build + test**
```powershell
dotnet build
```
Expected: `Build succeeded.` 0 errors. Warnings should be similar in count to the pre-flight baseline (no new warnings introduced — if a warning appears, the removal missed a site).
```powershell
dotnet test --nologo --no-build --verbosity quiet
```
Expected: failure count ≤ baseline minus 1 (the deleted `BuildBuildingTerrainCells_*` test was passing, so failure count stays at the baseline; total count drops by 1).
- [ ] **Step 2.10: Close issue #100 in `docs/ISSUES.md`**
In `docs/ISSUES.md`, locate the `#100 — Transparent rectangular patches around every house (terrain rendering)` block (around line 764) and update its **Status:** line from:
```markdown
**Status:** OPEN
```
to:
```markdown
**Status:** DONE
**Closed:** 2026-05-25
**Commits:** `<TASK_1_COMMIT_SHA>`, `<TASK_2_COMMIT_SHA>`
```
Replace `<TASK_1_COMMIT_SHA>` with the first 7 chars of Task 1's commit, and `<TASK_2_COMMIT_SHA>` with the first 7 chars of this task's commit (you'll know it after Step 2.11 — re-edit ISSUES.md if needed, OR get the SHAs via `git log --oneline -5` and amend before pushing).
Then move the closed block to the **Recently closed** section at the bottom of `docs/ISSUES.md`, following the format used by the other DONE entries (e.g. #84, #85, #87).
Append a one-line resolution paragraph immediately under the **Commits:** line:
```markdown
**Resolution (2026-05-25 · #100):** Replaced the cell-level
`hiddenTerrainCells` mechanism with retail's per-vertex Z nudge
(`zFightTerrainAdjust = 0.00999999978`) applied inside the modern
terrain vertex shader. Render terrain everywhere; coplanar building
floors win the depth test by being 1 cm higher than the rendered
terrain. Physics path untouched. ~50 LOC of `BuildingTerrainCells`
plumbing removed across LandblockMesh / LoadedLandblock /
LandblockLoader / GameWindow / GpuWorldState / LandblockStreamer
plus the corresponding unit test. Retail anchors:
acclient_2013_pseudo_c.txt:1120769 + :702254.
```
- [ ] **Step 2.11: Commit**
```powershell
git add src/AcDream.Core/Terrain/LandblockMesh.cs `
src/AcDream.Core/World/LoadedLandblock.cs `
src/AcDream.Core/World/LandblockLoader.cs `
src/AcDream.App/Rendering/GameWindow.cs `
src/AcDream.App/Streaming/GpuWorldState.cs `
src/AcDream.App/Streaming/LandblockStreamer.cs `
tests/AcDream.Core.Tests/World/LandblockLoaderTests.cs `
docs/ISSUES.md
git commit -m @'
refactor: #100 — remove hiddenTerrainCells / BuildingTerrainCells plumbing
Retired in favour of Task 1's retail-faithful terrain shader Z nudge.
Pure removal — ~50 LOC of dead surface area across:
- src/AcDream.Core/Terrain/LandblockMesh.cs (drop parameter +
cell-collapse block)
- src/AcDream.Core/World/LoadedLandblock.cs (drop field)
- src/AcDream.Core/World/LandblockLoader.cs (drop method + call)
- src/AcDream.App/Rendering/GameWindow.cs (3 sites)
- src/AcDream.App/Streaming/GpuWorldState.cs (6 ctor sites)
- src/AcDream.App/Streaming/LandblockStreamer.cs (1 ctor site)
- tests/AcDream.Core.Tests/World/LandblockLoaderTests.cs (drop test)
No retail anchor — the deleted mechanism never had one; this commit
rolls our code back to the actual retail behaviour established in
the prior commit's shader nudge.
ISSUES.md #100 moved to Recently closed.
Cross-ref:
docs/research/2026-05-25-issue-100-terrain-cutout-handoff.md
docs/superpowers/plans/2026-05-25-issue-100-terrain-cutout.md
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
'@
```
After committing, fix up the SHA references in `docs/ISSUES.md` if Step 2.10 used placeholders:
```powershell
git log --oneline -3
```
Capture the two SHAs (Task 1 and Task 2). Edit `docs/ISSUES.md` to replace `<TASK_1_COMMIT_SHA>` and `<TASK_2_COMMIT_SHA>` with the real values, then amend:
```powershell
git add docs/ISSUES.md
git commit --amend --no-edit
```
---
## Task 3: Visual verification at Holtburg
**This is the acceptance test.** The M1.5 milestone explicitly states visual verification is the acceptance gate. The two unit tests we have (`dotnet build` and `dotnet test`) prove the code compiles and the focused suites still pass — they don't prove the bug is gone.
- [ ] **Step 3.1: Launch the client against the live ACE server**
Per CLAUDE.md "Running the client against the live server" — graceful-close any prior session first.
```powershell
$proc = Get-Process -Name AcDream.App -ErrorAction SilentlyContinue
if ($proc) {
$proc.CloseMainWindow() | Out-Null
if (-not $proc.WaitForExit(5000)) { $proc | Stop-Process -Force }
}
Start-Sleep -Seconds 3
$env:ACDREAM_DAT_DIR = "$env:USERPROFILE\Documents\Asheron's Call"
$env:ACDREAM_LIVE = "1"
$env:ACDREAM_TEST_HOST = "127.0.0.1"
$env:ACDREAM_TEST_PORT = "9000"
$env:ACDREAM_TEST_USER = "testaccount"
$env:ACDREAM_TEST_PASS = "testpassword"
dotnet run --project src\AcDream.App\AcDream.App.csproj --no-build -c Debug 2>&1 |
Tee-Object -FilePath "issue100-verify-launch.log"
```
Launch in the background; give the window ~8 seconds to reach in-world state.
- [ ] **Step 3.2: Inspect each acceptance scenario**
| Scenario | Expected outcome | Pass/Fail |
|---|---|---|
| Stand outside Holtburg cottage (any cottage) | Ground around the building reads as continuous cobblestone / grass — no dark rectangular patch | |
| Walk in a circle around a cottage | Terrain stays continuous on all four sides | |
| Approach the inn from the south | No transparent rectangle in front of the inn | |
| Approach the inn from the north | No transparent rectangle behind the inn | |
| Building floors at threshold height | No Z-fighting flicker between terrain and building floor | |
| Walk inside the cottage cellar | (Regression check) Z-fight inside / floor still walkable | |
Each row must read PASS. If any row reads FAIL — stop, capture a screenshot, file as a follow-up, and ASK before "fixing" anything new.
- [ ] **Step 3.3: Close the client cleanly**
```powershell
$proc = Get-Process -Name AcDream.App -ErrorAction SilentlyContinue
if ($proc) {
$proc.CloseMainWindow() | Out-Null
if (-not $proc.WaitForExit(5000)) { $proc | Stop-Process -Force }
}
```
- [ ] **Step 3.4: User sign-off**
This is the *user's* acceptance gate. After Step 3.2 produces all-PASS, report to the user with concrete observations from at least 3 cottages + the inn, then await the user's explicit confirmation before declaring #100 closed.
If the user observes any new visual regression (e.g. terrain visibly sinking into water polygons, or scenery objects appearing to float), pause and investigate — that's a sign the 0.01 nudge interacts with something we haven't anticipated. Do not retry or paper over.
---
## Self-review (post-write)
**Spec coverage:**
| Requirement | Task | Coverage |
|---|---|---|
| Add 1 cm Z subtract in terrain vertex shader at line 139 | Task 1 | ✔ |
| Reference retail anchors in code comment | Task 1.1 | ✔ |
| Delete `hiddenTerrainCells` parameter + collapse block | Task 2.1 | ✔ |
| Delete `BuildingTerrainCells` field on `LoadedLandblock` | Task 2.2 | ✔ |
| Delete `BuildBuildingTerrainCells` method + `Load` call | Task 2.3 | ✔ |
| Update all `LandblockMesh.Build` call sites | Task 2.4 | ✔ (3 sites in GameWindow.cs) |
| Update all `LoadedLandblock` ctor sites | Tasks 2.4, 2.5, 2.6 | ✔ (1 in GameWindow.cs + 6 in GpuWorldState.cs + 1 in LandblockStreamer.cs) |
| Delete dead unit test | Task 2.7 | ✔ |
| Sweep for stragglers | Task 2.8 | ✔ |
| Close issue #100 in ISSUES.md | Task 2.10 | ✔ |
| Visual verification at Holtburg cottages | Task 3.2 | ✔ |
| Don't touch physics | Constraint 1, Task 1.1 comment | ✔ |
| Don't use glPolygonOffset | Constraint 2 | ✔ |
| Don't keep both mechanisms | Constraint 3 | ✔ |
**Placeholder scan:** Done — no "TBD", "TODO", "implement later", or "similar to Task N" references. Every step has the actual code.
**Type consistency:** `LandblockMesh.Build` signature appears in Tasks 2.1 + 2.4; both drop the 7th parameter consistently. `LoadedLandblock` ctor appears in Tasks 2.2 + 2.4 + 2.5 + 2.6; all use the 3-argument form. `BuildingTerrainCells` field referenced in Tasks 2.4 + 2.5 + 2.6 + 2.8; all removals consistent.
**Spec compliance check:** plan matches the user's session brief structure (3 tasks ≈ "expect 34 tasks" — the original brief's Tasks 2 + 3 are merged here because the test deletion has a compile dependency on the plumbing removal). Visual verification is the explicit acceptance test, matching the brief's "visual verification is the acceptance test" sentence. Do-not-retry items from the handoff doc are honored as Constraints + do-not-fix language in Step 3.4.

File diff suppressed because it is too large Load diff

File diff suppressed because it is too large Load diff

File diff suppressed because it is too large Load diff

File diff suppressed because it is too large Load diff

View file

@ -0,0 +1,628 @@
# A8.F Swept-Sphere Camera Collision — Implementation Plan
> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.
**Goal:** Stop the 3rd-person camera eye from clipping through walls by sweeping a 0.3 m collision sphere from the head-pivot to the desired eye and publishing the stopped position — porting retail's `SmartBox::update_viewer` spring arm. This stabilizes the A8.F indoor-visibility decisions (which key off the eye) and fixes the flap / missing-wall symptoms.
**Architecture:** A narrow `ICameraCollisionProbe` is injected into `RetailChaseCamera`. After the camera damps the desired eye and before it publishes, it asks the probe to sweep `pivot→eye`. The concrete `PhysicsCameraCollisionProbe` wraps the existing `PhysicsEngine.ResolveWithTransition`, which already collides against both indoor cell walls (`FindEnvCollisions`) and outdoor/baked GfxObj shells (`FindObjCollisions`). Gated by `CameraDiagnostics.CollideCamera` (default ON).
**Tech Stack:** C# / .NET 10, Silk.NET, xUnit. Spec: `docs/superpowers/specs/2026-05-29-a8f-camera-collision-design.md`.
**Reference (read before starting):**
- Spec: `docs/superpowers/specs/2026-05-29-a8f-camera-collision-design.md`
- Camera: `src/AcDream.App/Rendering/RetailChaseCamera.cs` (eye `:113`, damp `:131`, publish `:136`, fade `:367`)
- Engine: `src/AcDream.Core/Physics/PhysicsEngine.cs:589` (`ResolveWithTransition`, returns `sp.CheckPos` as `.Position` at `:846`/`:865`)
- Sphere convention: `src/AcDream.Core/Physics/TransitionTypes.cs:517-547` (`InitPath` sets `LocalSphere[0].Origin = (0,0,radius)`)
- Player self-skip: `src/AcDream.App/Input/PlayerMovementController.cs` (`CellId` `:133`, `LocalEntityId` `:144`)
---
## Task 1: Add `CameraDiagnostics.CollideCamera` flag
**Files:**
- Modify: `src/AcDream.Core/Rendering/CameraDiagnostics.cs`
- Test: `tests/AcDream.Core.Tests/Rendering/CameraDiagnosticsTests.cs`
- [ ] **Step 1: Write the failing test**
Add to `CameraDiagnosticsTests.cs` (inside the `CameraDiagnosticsTests` class):
```csharp
[Fact]
public void CollideCamera_DefaultOn_AndPersistsRuntimeChanges()
{
CameraDiagnostics.CollideCamera = true;
Assert.True(CameraDiagnostics.CollideCamera);
CameraDiagnostics.CollideCamera = false;
Assert.False(CameraDiagnostics.CollideCamera);
CameraDiagnostics.CollideCamera = true; // reset so other tests aren't poisoned
}
```
- [ ] **Step 2: Run test to verify it fails**
Run: `dotnet test tests/AcDream.Core.Tests/AcDream.Core.Tests.csproj --filter "FullyQualifiedName~CameraDiagnosticsTests.CollideCamera_DefaultOn"`
Expected: FAIL — compile error, `CollideCamera` does not exist.
- [ ] **Step 3: Add the property**
In `CameraDiagnostics.cs`, add after the `UseRetailChaseCamera` property (after line 28):
```csharp
/// <summary>
/// When true (default), the chase camera sweeps a 0.3 m collision
/// sphere from the head-pivot to the desired eye and stops it at the
/// first wall (retail <c>SmartBox::update_viewer</c> spring arm), so
/// the eye never sits behind/inside geometry. Initial state from
/// <c>ACDREAM_CAMERA_COLLIDE</c>; default-on if unset, off only when
/// explicitly set to <c>"0"</c>.
/// </summary>
public static bool CollideCamera { get; set; } =
Environment.GetEnvironmentVariable("ACDREAM_CAMERA_COLLIDE") != "0";
```
- [ ] **Step 4: Run test to verify it passes**
Run: `dotnet test tests/AcDream.Core.Tests/AcDream.Core.Tests.csproj --filter "FullyQualifiedName~CameraDiagnosticsTests.CollideCamera_DefaultOn"`
Expected: PASS.
- [ ] **Step 5: Commit**
```bash
git add src/AcDream.Core/Rendering/CameraDiagnostics.cs tests/AcDream.Core.Tests/Rendering/CameraDiagnosticsTests.cs
git commit -m "feat(render): Phase A8.F — add CameraDiagnostics.CollideCamera flag (default on)"
```
---
## Task 2: Camera-collision probe interface + `PhysicsCameraCollisionProbe`
**Files:**
- Create: `src/AcDream.App/Rendering/ICameraCollisionProbe.cs`
- Create: `src/AcDream.App/Rendering/PhysicsCameraCollisionProbe.cs`
- Test: `tests/AcDream.App.Tests/Rendering/PhysicsCameraCollisionProbeTests.cs`
- [ ] **Step 1: Write the failing test**
Create `tests/AcDream.App.Tests/Rendering/PhysicsCameraCollisionProbeTests.cs`:
```csharp
using System.Numerics;
using AcDream.App.Rendering;
using AcDream.Core.Physics;
using Xunit;
namespace AcDream.App.Tests.Rendering;
public class PhysicsCameraCollisionProbeTests
{
// The probe must convert the desired eye path (where the SPHERE CENTER
// should travel) into the foot-capsule path InitPath expects (which offsets
// sphere0 up by radius), then invert it on the result. Verify the round trip.
[Fact]
public void SpherePathOffset_RoundTrips()
{
var p = new Vector3(10f, 20f, 30f);
const float r = 0.3f;
var path = PhysicsCameraCollisionProbe.ToSpherePath(p, r);
Assert.Equal(p.Z - r, path.Z, 5);
Assert.Equal(p.X, path.X, 5);
Assert.Equal(p.Y, path.Y, 5);
var back = PhysicsCameraCollisionProbe.FromSpherePath(path, r);
Assert.Equal(p.X, back.X, 5);
Assert.Equal(p.Y, back.Y, 5);
Assert.Equal(p.Z, back.Z, 5);
}
// cellId == 0 means "no starting cell" — the probe must short-circuit and
// return the desired eye without touching the engine.
[Fact]
public void SweepEye_NoStartingCell_ReturnsDesiredEyeUnchanged()
{
var probe = new PhysicsCameraCollisionProbe(new PhysicsEngine());
var pivot = new Vector3(0f, 0f, 1.5f);
var eye = new Vector3(-2f, 0f, 2.2f);
var result = probe.SweepEye(pivot, eye, cellId: 0, selfEntityId: 0);
Assert.Equal(eye, result);
}
}
```
- [ ] **Step 2: Run test to verify it fails**
Run: `dotnet test tests/AcDream.App.Tests/AcDream.App.Tests.csproj --filter "FullyQualifiedName~PhysicsCameraCollisionProbeTests"`
Expected: FAIL — `ICameraCollisionProbe` / `PhysicsCameraCollisionProbe` do not exist.
- [ ] **Step 3: Create the interface**
Create `src/AcDream.App/Rendering/ICameraCollisionProbe.cs`:
```csharp
using System.Numerics;
namespace AcDream.App.Rendering;
/// <summary>
/// Sweeps a small sphere from the camera pivot (player head) toward the
/// desired eye and returns the stopped (non-penetrating) eye. The seam that
/// lets <see cref="RetailChaseCamera"/> collide its eye without depending on
/// the physics engine directly (and stay unit-testable with a fake).
/// </summary>
public interface ICameraCollisionProbe
{
/// <summary>
/// Roll a collision sphere from <paramref name="pivot"/> to
/// <paramref name="desiredEye"/>; return the position it reaches without
/// penetrating geometry. Returns <paramref name="desiredEye"/> unchanged
/// when nothing blocks the path or when <paramref name="cellId"/> is 0.
/// </summary>
Vector3 SweepEye(Vector3 pivot, Vector3 desiredEye, uint cellId, uint selfEntityId);
}
```
- [ ] **Step 4: Create the implementation**
Create `src/AcDream.App/Rendering/PhysicsCameraCollisionProbe.cs`:
```csharp
using System.Numerics;
using AcDream.Core.Physics;
namespace AcDream.App.Rendering;
/// <summary>
/// <see cref="ICameraCollisionProbe"/> backed by the player's swept-sphere
/// engine. Ports retail's <c>SmartBox::update_viewer</c> (0x00453ce0): sweep
/// the 0.3 m <c>viewer_sphere</c> from the head-pivot to the desired eye via a
/// <c>CTransition</c> and use the stopped position. Reusing
/// <see cref="PhysicsEngine.ResolveWithTransition"/> collides against indoor
/// cell walls (<c>FindEnvCollisions</c>) AND outdoor/baked GfxObj shells
/// (<c>FindObjCollisions</c>) in one faithful path.
/// </summary>
public sealed class PhysicsCameraCollisionProbe : ICameraCollisionProbe
{
/// <summary>Retail <c>viewer_sphere</c> radius (acclient :93314).</summary>
public const float ViewerSphereRadius = 0.3f;
private readonly PhysicsEngine _physics;
public PhysicsCameraCollisionProbe(PhysicsEngine physics) => _physics = physics;
public Vector3 SweepEye(Vector3 pivot, Vector3 desiredEye, uint cellId, uint selfEntityId)
{
// No starting cell → nothing to sweep against; keep the desired eye.
if (cellId == 0) return desiredEye;
// SpherePath.InitPath puts sphere0's center at pathPos + (0,0,radius)
// (the player foot-capsule convention). Retail's viewer_sphere center is
// (0,0,0), so shift the path DOWN by the radius to make the SPHERE CENTER
// travel pivot→eye, then add it back to the swept stop position.
Vector3 begin = ToSpherePath(pivot, ViewerSphereRadius);
Vector3 end = ToSpherePath(desiredEye, ViewerSphereRadius);
var r = _physics.ResolveWithTransition(
currentPos: begin,
targetPos: end,
cellId: cellId,
sphereRadius: ViewerSphereRadius,
sphereHeight: 0f, // single sphere (no head sphere)
stepUpHeight: 0f, // no step-up for a camera
stepDownHeight: 0f, // no step-down / ground snap
isOnGround: false, // no contact-plane / walkable semantics
body: null, // no cross-frame persistence
// Retail init_object(player, 0x5c) = IsViewer|PathClipped|FreeRotate|
// PerfectClip (pseudo-C :92864). PathClipped = hard-stop at first contact
// (the spring arm, not edge-slide); IsViewer = eye passes through creatures,
// colliding only with world geometry. NOT IsPlayer -> stays out of the #98
// capture filter. (Updated from ObjectInfoState.None during implementation
// per the Task-10 code-quality review; shipped in fcea05f / spec §5.1.)
moverFlags: ObjectInfoState.IsViewer | ObjectInfoState.PathClipped
| ObjectInfoState.FreeRotate | ObjectInfoState.PerfectClip,
movingEntityId: selfEntityId); // skip the player's own ShadowEntry
return FromSpherePath(r.Position, ViewerSphereRadius);
}
/// <summary>Eye/pivot point → InitPath path point (subtract the sphere-center offset).</summary>
internal static Vector3 ToSpherePath(Vector3 spherePoint, float radius)
=> spherePoint - new Vector3(0f, 0f, radius);
/// <summary>InitPath path point → eye point (add the sphere-center offset back).</summary>
internal static Vector3 FromSpherePath(Vector3 pathPoint, float radius)
=> pathPoint + new Vector3(0f, 0f, radius);
}
```
- [ ] **Step 5: Run test to verify it passes**
Run: `dotnet test tests/AcDream.App.Tests/AcDream.App.Tests.csproj --filter "FullyQualifiedName~PhysicsCameraCollisionProbeTests"`
Expected: PASS (both tests).
- [ ] **Step 6: Commit**
```bash
git add src/AcDream.App/Rendering/ICameraCollisionProbe.cs src/AcDream.App/Rendering/PhysicsCameraCollisionProbe.cs tests/AcDream.App.Tests/Rendering/PhysicsCameraCollisionProbeTests.cs
git commit -m "feat(render): Phase A8.F — PhysicsCameraCollisionProbe (swept-sphere eye via ResolveWithTransition)"
```
---
## Task 3: Wire the probe into `RetailChaseCamera`
**Files:**
- Modify: `src/AcDream.App/Rendering/RetailChaseCamera.cs` (property, `Update` signature, sweep call)
- Test: `tests/AcDream.App.Tests/Rendering/RetailChaseCameraTests.cs`
- [ ] **Step 1: Write the failing tests**
Add to `RetailChaseCameraTests.cs` (inside the class). These need a fake probe and exercise `Update`:
```csharp
// ── Camera collision (A8.F) ───────────────────────────────────────
private sealed class FakeProbe : ICameraCollisionProbe
{
public int Calls;
public Vector3 ReturnEye;
public Vector3 SweepEye(Vector3 pivot, Vector3 desiredEye, uint cellId, uint selfEntityId)
{
Calls++;
return ReturnEye;
}
}
[Fact]
public void Update_WithProbeAndFlagOn_PublishesCollidedEye()
{
CameraDiagnostics.CollideCamera = true;
var collided = new Vector3(1f, 2f, 3f);
var probe = new FakeProbe { ReturnEye = collided };
var cam = new RetailChaseCamera { CollisionProbe = probe };
cam.Update(
playerPosition: Vector3.Zero, playerYaw: 0f, playerVelocity: Vector3.Zero,
isOnGround: true, contactPlaneNormal: Vector3.UnitZ, dt: 1f / 60f,
cellId: 0x100, selfEntityId: 0x5);
Assert.True(probe.Calls >= 1);
Assert.Equal(collided, cam.Position);
}
[Fact]
public void Update_FlagOff_DoesNotConsultProbe()
{
CameraDiagnostics.CollideCamera = false;
var probe = new FakeProbe { ReturnEye = new Vector3(99f, 99f, 99f) };
var cam = new RetailChaseCamera { CollisionProbe = probe };
cam.Update(
playerPosition: Vector3.Zero, playerYaw: 0f, playerVelocity: Vector3.Zero,
isOnGround: true, contactPlaneNormal: Vector3.UnitZ, dt: 1f / 60f,
cellId: 0x100, selfEntityId: 0x5);
Assert.Equal(0, probe.Calls);
Assert.NotEqual(new Vector3(99f, 99f, 99f), cam.Position);
CameraDiagnostics.CollideCamera = true; // reset
}
[Fact]
public void Update_NullProbe_DoesNotThrow()
{
CameraDiagnostics.CollideCamera = true;
var cam = new RetailChaseCamera { CollisionProbe = null };
// Should run with no collision and publish a valid view.
cam.Update(
playerPosition: Vector3.Zero, playerYaw: 0f, playerVelocity: Vector3.Zero,
isOnGround: true, contactPlaneNormal: Vector3.UnitZ, dt: 1f / 60f,
cellId: 0x100, selfEntityId: 0x5);
Assert.NotEqual(default, cam.View);
}
```
- [ ] **Step 2: Run tests to verify they fail**
Run: `dotnet test tests/AcDream.App.Tests/AcDream.App.Tests.csproj --filter "FullyQualifiedName~RetailChaseCameraTests.Update_"`
Expected: FAIL — `CollisionProbe` property and the `cellId`/`selfEntityId` `Update` parameters do not exist.
- [ ] **Step 3: Add the `CollisionProbe` property**
In `RetailChaseCamera.cs`, add to the public tunables region (after the `PivotHeight` property, around line 53):
```csharp
/// <summary>
/// Optional spring-arm collision probe. When set (and
/// <see cref="CameraDiagnostics.CollideCamera"/> is true), the damped eye
/// is swept from the head-pivot and stopped at the first wall. Null leaves
/// the eye uncollided (the default for tests and the legacy path).
/// </summary>
public ICameraCollisionProbe? CollisionProbe { get; init; }
```
- [ ] **Step 4: Extend the `Update` signature**
In `RetailChaseCamera.cs`, change the `Update` signature (line 86-92) to add two optional params at the end:
```csharp
public void Update(
Vector3 playerPosition,
float playerYaw,
Vector3 playerVelocity,
bool isOnGround,
Vector3 contactPlaneNormal,
float dt,
uint cellId = 0,
uint selfEntityId = 0)
```
- [ ] **Step 5: Insert the sweep between damp and publish**
In `RetailChaseCamera.cs`, between the end of the damping block (line 133 `}`) and the `// 6. Publish renderer surface.` comment (line 135), insert:
```csharp
// 5b. Spring-arm collision (A8.F). Retail SmartBox::update_viewer
// (0x00453ce0) sweeps viewer_sphere from the head-pivot to the
// desired eye and uses the stopped position. Keeps the eye out of
// walls so the A8.F camera-cell + portal side-tests stay stable.
// A null probe or disabled flag leaves the eye unchanged.
if (CameraDiagnostics.CollideCamera && CollisionProbe is not null)
_dampedEye = CollisionProbe.SweepEye(pivotWorld, _dampedEye, cellId, selfEntityId);
```
(The fade at step 7, line 140, already reads `_dampedEye`, so it now uses the collided eye automatically.)
- [ ] **Step 6: Run tests to verify they pass**
Run: `dotnet test tests/AcDream.App.Tests/AcDream.App.Tests.csproj --filter "FullyQualifiedName~RetailChaseCameraTests"`
Expected: PASS (the new `Update_*` tests plus all existing `Heading_*` / `BuildBasis_*` tests).
- [ ] **Step 7: Commit**
```bash
git add src/AcDream.App/Rendering/RetailChaseCamera.cs tests/AcDream.App.Tests/Rendering/RetailChaseCameraTests.cs
git commit -m "feat(render): Phase A8.F — RetailChaseCamera consumes the camera-collision probe"
```
---
## Task 4: Wire the probe in `GameWindow`
**Files:**
- Modify: `src/AcDream.App/Rendering/GameWindow.cs` (two camera constructions, one `Update` call)
No unit test — `GameWindow` wiring is verified by build + the visual acceptance in Task 7.
- [ ] **Step 1: Inject the probe at the first construction site**
In `GameWindow.cs`, the construction around line 10693 currently reads:
```csharp
_retailChaseCamera = new AcDream.App.Rendering.RetailChaseCamera
{
Aspect = _chaseCamera.Aspect,
};
```
Change it to:
```csharp
_retailChaseCamera = new AcDream.App.Rendering.RetailChaseCamera
{
Aspect = _chaseCamera.Aspect,
CollisionProbe = new AcDream.App.Rendering.PhysicsCameraCollisionProbe(_physicsEngine),
};
```
- [ ] **Step 2: Inject the probe at the second construction site**
In `GameWindow.cs`, the construction around line 10826 currently reads:
```csharp
_retailChaseCamera = new AcDream.App.Rendering.RetailChaseCamera
{
Aspect = _window!.Size.X / (float)_window.Size.Y,
};
```
Change it to:
```csharp
_retailChaseCamera = new AcDream.App.Rendering.RetailChaseCamera
{
Aspect = _window!.Size.X / (float)_window.Size.Y,
CollisionProbe = new AcDream.App.Rendering.PhysicsCameraCollisionProbe(_physicsEngine),
};
```
- [ ] **Step 3: Pass cell + self-entity into the per-frame `Update`**
In `GameWindow.cs`, the camera update around line 6862 currently ends with `dt: (float)dt);`. Change the call to:
```csharp
_retailChaseCamera!.Update(result.RenderPosition, _playerController.Yaw,
playerVelocity: _playerController.BodyVelocity,
isOnGround: result.IsOnGround,
contactPlaneNormal: _playerController.ContactPlane.Normal,
dt: (float)dt,
cellId: _playerController.CellId,
selfEntityId: _playerController.LocalEntityId);
```
- [ ] **Step 4: Build to verify the wiring compiles**
Run: `dotnet build`
Expected: build succeeds (0 errors).
- [ ] **Step 5: Commit**
```bash
git add src/AcDream.App/Rendering/GameWindow.cs
git commit -m "feat(render): Phase A8.F — wire camera-collision probe + cell/self id into GameWindow"
```
---
## Task 5: Add the live-toggle menu item
**Files:**
- Modify: `src/AcDream.App/Rendering/GameWindow.cs` (the `Camera` ImGui menu, ~line 7934)
- [ ] **Step 1: Add the checkbox menu item**
In `GameWindow.cs`, the `Camera` menu (around line 7934) currently reads:
```csharp
if (ImGuiNET.ImGui.BeginMenu("Camera"))
{
if (_cameraController is not null)
{
string flyLabel = _cameraController.IsFlyMode
? "Exit Free-Fly Mode" : "Enter Free-Fly Mode";
if (ImGuiNET.ImGui.MenuItem(flyLabel, "Ctrl+Shift+F"))
ToggleFlyOrChase();
}
ImGuiNET.ImGui.EndMenu();
}
```
Insert the toggle before `ImGuiNET.ImGui.EndMenu();`:
```csharp
if (ImGuiNET.ImGui.BeginMenu("Camera"))
{
if (_cameraController is not null)
{
string flyLabel = _cameraController.IsFlyMode
? "Exit Free-Fly Mode" : "Enter Free-Fly Mode";
if (ImGuiNET.ImGui.MenuItem(flyLabel, "Ctrl+Shift+F"))
ToggleFlyOrChase();
}
// A8.F: spring-arm camera collision (live A/B toggle).
if (ImGuiNET.ImGui.MenuItem("Collide Camera (spring arm)", "",
AcDream.Core.Rendering.CameraDiagnostics.CollideCamera))
AcDream.Core.Rendering.CameraDiagnostics.CollideCamera =
!AcDream.Core.Rendering.CameraDiagnostics.CollideCamera;
ImGuiNET.ImGui.EndMenu();
}
```
- [ ] **Step 2: Build to verify**
Run: `dotnet build`
Expected: build succeeds (0 errors).
- [ ] **Step 3: Commit**
```bash
git add src/AcDream.App/Rendering/GameWindow.cs
git commit -m "feat(render): Phase A8.F — Camera menu toggle for spring-arm collision"
```
---
## Task 6: Correct the prior camera spec's collision note
**Files:**
- Modify: `docs/superpowers/specs/2026-05-18-retail-chase-camera-design.md` (lines 454-457)
- [ ] **Step 1: Mark the stale note as superseded**
In `2026-05-18-retail-chase-camera-design.md`, replace the bullet at lines 454-457:
```markdown
- **Camera-vs-world collision.** Retail's per-frame update doesn't
raycast world geometry (see investigation report 2026-05-18 in chat).
The auto-fade handles "camera passes through player"; we don't
attempt "camera collides with wall" — same as retail.
```
with:
```markdown
- **Camera-vs-world collision.** ~~Retail's per-frame update doesn't
raycast world geometry; we don't attempt "camera collides with wall"
— same as retail.~~ **SUPERSEDED 2026-05-29:** this was a research
error — retail DOES collide the camera in `SmartBox::update_viewer`
(0x00453ce0), which the earlier pass missed by tracing only the
desired-eye producer. Implemented as a swept-sphere spring arm; see
`docs/superpowers/specs/2026-05-29-a8f-camera-collision-design.md`.
```
- [ ] **Step 2: Commit**
```bash
git add docs/superpowers/specs/2026-05-18-retail-chase-camera-design.md
git commit -m "docs(render): Phase A8.F — supersede the old 'no camera collision' note"
```
---
## Task 7: Full verification + acceptance
**Files:** none (verification only).
- [ ] **Step 1: Full build**
Run: `dotnet build`
Expected: 0 errors.
- [ ] **Step 2: Full test suite**
Run: `dotnet test`
Expected: green. Note the App.Tests baseline should increase by the new camera tests; no regressions in Core/Net.
- [ ] **Step 3: Visual verification (the real acceptance — requires the user)**
Launch against the live ACE server with the A8.F branch on (PowerShell):
```powershell
$env:ACDREAM_DAT_DIR="$env:USERPROFILE\Documents\Asheron's Call"; $env:ACDREAM_LIVE="1"
$env:ACDREAM_TEST_HOST="127.0.0.1"; $env:ACDREAM_TEST_PORT="9000"
$env:ACDREAM_TEST_USER="testaccount"; $env:ACDREAM_TEST_PASS="testpassword"
$env:ACDREAM_A8_INDOOR_BRANCH="1"
dotnet run --project src\AcDream.App\AcDream.App.csproj --no-build -c Debug 2>&1 | Tee-Object -FilePath "a8f-cameracollide.log"
```
Walk `+Acdream` into a Holtburg cottage and down into its cellar, panning the camera through walls and crossing the doorway inside↔outside. Confirm:
- the flap is gone — walls/ground stay solid while panning and while crossing the doorway;
- back walls no longer go missing when looking through a window from outside;
- the player fades (rather than the camera sitting inside the player mesh) when backed into a corner.
Then toggle `Collide Camera (spring arm)` off via the Camera menu (or relaunch with `ACDREAM_CAMERA_COLLIDE=0`) and confirm the flap returns — proving the fix is what closed it.
- [ ] **Step 4: Update the roadmap / milestones on visual pass**
After the user confirms the visual result, update the A8.F entry in `CLAUDE.md` (the M1.5 "currently working toward" block) and `docs/plans/2026-04-11-roadmap.md` shipped table to note the swept-sphere camera collision shipped + visual-verified, and move/close the related A8.F flap notes. Commit:
```bash
git add CLAUDE.md docs/plans/2026-04-11-roadmap.md
git commit -m "docs(render): Phase A8.F — camera collision shipped + visual-verified"
```
---
## Notes for the implementer
- **Do not** re-implement collision in the probe. The whole point of reusing
`ResolveWithTransition` is that the env+obj sweep is already tested. The probe
is param-marshalling + the z-offset round trip.
- **Self-skip is load-bearing.** The sweep starts at the player's head, inside
the player's own 0.48 m collision sphere / ShadowEntry. Passing
`selfEntityId` (= `LocalEntityId`) is what stops the eye from snapping onto
the head every frame. If the eye appears glued to the player, this is the
first thing to check.
- **Slide vs hard-stop (open question).** Reusing the transition gives the
player path's edge-slide (the eye glides along a wall, no jitter). If visual
verification shows the eye behaving oddly, read retail's `find_valid_position`
and match its stop/slide semantics — but do not change the architecture for it.
- **If the eye hugs/penetrates in a tight room**, the spec's optional
`AdjustPosition` fallback (spec §7) is the escalation; add it only if needed.

File diff suppressed because it is too large Load diff

Some files were not shown because too many files have changed in this diff Show more