fix(phys): A6.P4 door bug — AddAllOutsideCells coord convention + replay apparatus

CellTransit.AddAllOutsideCells assumed sphere coords were absolute world
coords (subtracting lbXf = 0xA9 * 192 = 32448 from the sphere position).
Production has used landblock-local coords since Phase A.1
(streaming-center landblock at world origin), so the subtraction
produced localX = -32316, gridX = -1346 → out-of-range → early return
→ ZERO outdoor cells added.

For outdoor primary cells the bug was masked by GetNearbyObjects's
radial sweep. For indoor primary cells (where #98 gates the outdoor
sweep), the door's outdoor cell 0xA9B40029 never reached
portalReachableCells, the door's BSP was never queried, and the player
walked through Holtburg cottage doors unimpeded.

Fix: AddAllOutsideCells treats worldSphereCenter as landblock-local
directly. Matches retail CLandCell::add_all_outside_cells which uses
the per-cell 6-byte landblock-relative position struct.

Existing CellTransitAddAllOutsideCellsTests + CellTransitFindCellSetTests
updated to use landblock-local sphere coords (they were the only callers
using the world-coord convention; production never did).

Apparatus shipped:
- DoorBugTrajectoryReplayTests — live-capture-driven replay harness
  that pinpointed the bug per-field at unit-test speed (<500ms iteration)
- AddAllOutsideCells_LandblockLocalSphere_AddsDoorOutdoorCell — direct
  unit test that demonstrates the fix
- FindTransitCellsSphere_IndoorExitPortal_AddsOutsideForCapturedSpherePos
  — verifies cell-portal traversal at the captured sphere position
- DoorSetupGfxObjInspectionTests.HoltburgCottage_CellPortals_DatInspection
  — dat-direct EnvCell + Environment.Cells + portal-poly inspector
- Fixture: tests/AcDream.Core.Tests/Fixtures/door-bug/live-capture.jsonl
  (tick 13558 walkthrough + tick 22760 outdoor block)

Visual verification (user-driven at Holtburg cottage door, ~50cm off-center):
- outside→inside RUN: now BLOCKS (was: walks through)
- outside→inside WALK: presumed blocks (not retested)
- inside→outside RUN: PARTIAL — body intersects door, sphere slides through
- inside→outside WALK: same partial behavior

The remaining inside→outside asymmetry is a SEPARATE bug in BSP
collision response for two-sided polygons. The [bsp-test] probe now
fires 245 times for the door entity from indoor (was 0 pre-fix) —
door IS being queried; the BSP polygon-level collision response is
the new bug. Handoff at
docs/research/2026-05-25-door-bug-partial-fix-shipped.md.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Erik 2026-05-25 07:53:34 +02:00
parent 6a2c432e5a
commit 28cd97be62
8 changed files with 1134 additions and 40 deletions


@@ -0,0 +1,215 @@
# Door collision — apparatus replay shipped, root cause identified
2026-05-24 (continuation of the door-collision investigation)
> **SUPERSEDED 2026-05-25** by
> [`docs/research/2026-05-25-door-bug-partial-fix-shipped.md`](2026-05-25-door-bug-partial-fix-shipped.md).
> The root-cause analysis here was correct in direction
> (cell-portal traversal is upstream of BSP query) but missed the
> specific bug: `CellTransit.AddAllOutsideCells` silently failed for
> landblock-local sphere coords (production's convention) because it
> subtracted an absolute-world `lbXf` offset. Diagnosis + fix in the
> 2026-05-25 doc.
## TL;DR
The trajectory-replay apparatus is **wired and useful**. Run the diagnostic
test for the failing tick and the engine's full `[step-walk]` trace
prints, naming the divergence per-field.
**The bug: `CellTransit.FindCellSet` does not surface outdoor cell
`0xA9B40029` (where the door is registered) from indoor primary cell
`0xA9B40150`.** With issue #98's indoor-cell gate on the outdoor radial
sweep, the door is therefore invisible to `GetNearbyObjects` and the
BSP slab is never tested. The player walks through unimpeded.
cn=(0,-1,0) from the harness is **not the door**: it's the seeded
walkable polygon's south edge being treated as a wall when the sphere
falls off it. The harness reproduces production's "door not queried"
behavior, just with an apparatus artifact in place of clean walkthrough.
## What was shipped
1. **Live capture** (`door-walkthrough.jsonl`, 24,310 records ≈ 45 MB).
The capture was driven via `ACDREAM_CAPTURE_RESOLVE` + the existing
`[entity-source]` + `[bsp-test]` probes. **One record per
`PhysicsEngine.ResolveWithTransition` call** with full
`PhysicsBody` snapshots before/after.
2. **Fixture extraction**
([tests/AcDream.Core.Tests/Fixtures/door-bug/live-capture.jsonl](../../tests/AcDream.Core.Tests/Fixtures/door-bug/live-capture.jsonl), 4 KB).
Two representative ticks pulled from the JSONL:
- **Tick 13558** — the walkthrough. Player at (132.36, 16.81, 94) in
**indoor cell 0xA9B40150**, target (132.43, 17.20, 94). Live
result.Position = target with `collisionNormalValid = false`. Door
centered at world XY (132.57, 16.99), BSP radius 1.975, state
`0x00010008` = `PERSISTENT_PS | 0x8` (no `ETHEREAL_PS = 0x4` bit →
**CLOSED**).
- **Tick 22760** — the working block. Player at (133.14, 18.02, 94)
in **outdoor cell 0xA9B40029**, target (133.10, 17.60, 94). Live
blocks at Y=18.018 with cn=(0, +1, 0). Same door, different
primary cell type.
3. **Replay harness**
([DoorBugTrajectoryReplayTests.cs](../../tests/AcDream.Core.Tests/Physics/DoorBugTrajectoryReplayTests.cs)):
loads tick fixtures, hydrates door GfxObj `0x010044B5` from real dat
(`DatCollection.Get<GfxObj>`), registers a synthetic door via
`ShadowObjectRegistry.RegisterMultiPart` at the captured BSP world
center (`(132.57, 16.99, 95.36)`) with `cellScope=0u` (mirrors
production registration at
[GameWindow.cs:3158-3167](../../src/AcDream.App/Rendering/GameWindow.cs#L3158)).
`AssertCallMatchesCapture` replays the call and prints the first
per-field divergence. Diagnostic variant enables every
`PhysicsDiagnostics.Probe*Enabled` and dumps the full engine trace.
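
The "first per-field divergence" idea can be sketched in a few lines. This is an illustrative Python sketch, not the C# `AssertCallMatchesCapture` itself: the function name `first_divergence`, the dict layout, and the tolerance are assumptions. The values are the tick 13558 harness and live results quoted in this doc.

```python
import math

def first_divergence(harness: dict, live: dict, tol: float = 1e-3):
    """Return (field, harness_value, live_value) for the first mismatch, else None."""
    for field, expected in live.items():
        actual = harness.get(field)
        if isinstance(expected, tuple):
            # Vector fields are compared component-wise with an absolute tolerance.
            same = (isinstance(actual, tuple) and len(actual) == len(expected)
                    and all(math.isclose(a, e, abs_tol=tol)
                            for a, e in zip(actual, expected)))
        else:
            same = actual == expected
        if not same:
            return field, actual, expected
    return None

# Tick 13558: the live run reaches the target, the harness stays at the input.
live = {"pos": (132.43, 17.20, 94.0), "cnValid": False, "cell": 0xA9B40150}
harness = {"pos": (132.36, 16.81, 94.0), "cnValid": True, "cell": 0xA9B40150}
print(first_divergence(harness, live))
```

Reporting only the first mismatch keeps the failure message short; later fields usually diverge as a consequence of the first one.
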
## Chronology (from `door-walkthrough.launch.log`)
Confirmed the door state at the time of every walkthrough:
| Log line | Event |
|---|---|
| 10796 | `[setstate]` door state → `0x0001000C` (PERSISTENT + ETHEREAL = OPEN) |
| 10993 | `[setstate]` door state → `0x00010008` (PERSISTENT, NOT ethereal = CLOSED) |
| 10995–11071 | First and last `[bsp-test]` line on door 0x000F4246. All `state=0x00010008` |
So every `[bsp-test]` hit on the door, and every walkthrough event in
the JSONL, is against the **closed** door. The bug is real, not an
ETHEREAL pass-through.
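
The open/closed reading of the two logged state values comes down to one bit test. A minimal Python sketch, assuming only the `ETHEREAL_PS = 0x4` value quoted in the fixture notes; the helper name `is_ethereal` is hypothetical:

```python
ETHEREAL_PS = 0x4  # bit value quoted in the fixture notes

def is_ethereal(state: int) -> bool:
    """True when the ETHEREAL bit is set, i.e. the door is open and passable."""
    return (state & ETHEREAL_PS) != 0

# The two states seen in the launch log: open first, then closed.
for state in (0x0001000C, 0x00010008):
    print(f"{state:#010x} ethereal={is_ethereal(state)}")
```
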
## What the diagnostic test prints (tick 13558)
```
=== Replay tick 13558 (the walkthrough) ===
[step-walk] site=find-start cur=(132.36,16.81,94) ... walkPoly=True
[step-walk-adjust] branch=into-plane input=(0.07,0.39,0.00) output=(0.07,0.39,0.00) zGain=0
[step-walk] site=before-insert ... delta=(0.0744,0.3928,0) cell=0xA9B40150 ... walkPoly=True
[step-walk] site=stepdown-enter ... delta=(0.0744,0.3928,0) stepDown=True walkableZ=0.6642
[step-walk] site=stepdown-after-offset ... delta=(0.0744,0.3928,-0.75) ... walkPoly=True
... (probes down by 0.75, then 1.5; all OK; walkPoly=True)
[step-walk] site=stepdown-enter ... delta=(0.0744,0.0000,0) ... hit=(0,-1,0) walkPoly=False
... (probes down again; hit stays (0,-1,0); walkPoly=False throughout)
[step-walk] site=after-insert state=Collided ... hit=(0,-1,0) walkPoly=False
[step-walk] site=after-validate state=OK ... position back to input
[resolve] in=(132.360,16.811,94) cell=0xA9B40150 tgt=(132.435,17.204,94)
out=(132.360,16.811,94) cell=0xA9B40150 ok=True
hit=yes n=(0,-1,0) walkable=True
=== Harness: pos=(132.36,16.81,94) cn=(0,-1,0) cnValid=True onGround=True cell=0xA9B40150
=== Live: pos=(132.43,17.20,94) cn=(0,0,0) cnValid=False onGround=True cell=0xA9B40150
```
**No `[bsp-test]` line fires.** The door's BSP is never queried. The
hit `(0, -1, 0)` is the engine's "sliding off the south edge of the
seeded walkable polygon" response — not a door collision.
This matches production: at indoor primary cell `0xA9B40150`,
`GetNearbyObjects` returns ZERO shadows because:
1. The captured `cellId`'s low 16 bits `0x0150 >= 0x100` → indoor →
issue #98's gate at
[ShadowObjectRegistry.cs:480](../../src/AcDream.Core/Physics/ShadowObjectRegistry.cs#L480)
skips the outdoor radial sweep.
2. `portalReachableCells` (built by `CellTransit.FindCellSet`) lacks
outdoor cell `0xA9B40029`. In the harness, this is because we
register no cell fixture for `0xA9B40150` and the indoor branch at
[CellTransit.cs:403-407](../../src/AcDream.Core/Physics/CellTransit.cs#L403)
early-returns with empty candidates. **In production**, the cell
IS in cache but the traversal still doesn't produce `0xA9B40029`:
the cell's exit portal (`OtherCellId=0xFFFF`) either doesn't fire
`exitOutside=true` at the sphere's position, or `AddAllOutsideCells`
isn't computing the right outdoor cell.
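
The indoor test in item 1 can be written out explicitly. A small Python sketch under the assumption that the cell number is the low 16 bits of the cell ID and that values at or above `0x100` are indoor cells, as the `0x150 >= 0x100` comparison above implies; `is_indoor_cell` is a hypothetical name, not the registry's actual method:

```python
INDOOR_CELL_MIN = 0x100  # threshold taken from the comparison quoted above

def is_indoor_cell(cell_id: int) -> bool:
    """Classify a full 32-bit cell ID by its low 16 bits."""
    return (cell_id & 0xFFFF) >= INDOOR_CELL_MIN

for cell_id in (0xA9B40150, 0xA9B4013F, 0xA9B40029):
    print(f"{cell_id:#010x} indoor={is_indoor_cell(cell_id)}")
```

Under this classification both of the player's cells are indoor and the door's cell `0xA9B40029` is outdoor, which is why the gate matters for this door.
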
## Next investigation move
**Dump cell `0xA9B40150` from the dat and inspect its portal list.**
Two ways:
a) **Dat-direct read in a test** (preferred — no live launch). Pattern
from
[DoorSetupGfxObjInspectionTests](../../tests/AcDream.Core.Tests/Physics/DoorSetupGfxObjInspectionTests.cs):
`dats.Get<EnvCell>(0xA9B40150u)`, then iterate
`envCell.CellPortals` and print each portal's `OtherCellId`,
`PolygonId`, `Flags`. If no portal with `OtherCellId == 0xFFFF`,
`exitOutside` can never be true → bug is in the cell's portal-graph
loading (or the cottage doesn't connect via 0xFFFF exit portals;
it might use the building-shell path via
`BuildingPhysics.CheckBuildingTransit` instead).
b) **Live `ACDREAM_DUMP_CELLS=0xA9B40150,0xA9B4013F,0xA9B40154`**:
another launch cycle. Less preferred; we already have what we need
from the dat read.
The dat-direct read can be a new test method in
`DoorSetupGfxObjInspectionTests` (it's the natural home for this
class of dat-introspection checks).
## What NOT to do next
1. **Don't speculate on the fix.** We have the right replay apparatus
now; the next move is **read the dat** to determine the cell's actual
portal structure. Then we'll know whether the bug is in the dat
data, the portal loading, the exit-portal detection in
`FindTransitCellsSphere`, or `AddAllOutsideCells`'s grid math.
2. **Don't modify the replay test to mask the walkable-polygon edge
artifact.** The artifact is harmless (it documents that, given a
single isolated walkable poly, the engine treats its boundary as a
wall — true regardless of the door bug). The interesting finding is
"no `[bsp-test]` line"; the edge artifact just happens to fill the
collision slot.
3. **Don't re-do the registration shape.** Multi-part registration
+ dedup fix + Task 7 wiring are correct. Verified by the harness's
ability to query the door registration (it just isn't reached at
indoor primary cells).
## Files touched this session
**Committed:** none yet — pending commit at session end.
**Uncommitted:**
- `tests/AcDream.Core.Tests/Fixtures/door-bug/live-capture.jsonl`:
  2 captured ResolveWithTransition records (tick 13558 walkthrough +
  tick 22760 outdoor block)
- `tests/AcDream.Core.Tests/Physics/DoorBugTrajectoryReplayTests.cs`:
  apparatus (2 LiveCompare tests + 1 Diagnostic dump)
- `docs/research/2026-05-24-door-bug-apparatus-shipped-findings.md`:
  this doc
## Pickup prompt for the next session
```
A6.P4 door bug — apparatus replay shipped. DoorBugTrajectoryReplayTests
loads tick 13558 (walkthrough) and 22760 (block) from a captured fixture
and replays through the engine. Door 0x000F4246 (closed, state=0x00010008,
BSP world (132.57, 16.99, 95.36) radius 1.975) IS registered correctly
in the harness, BUT the engine never queries it from indoor primary cell
0xA9B40150 — no [bsp-test] line fires. Root cause located:
CellTransit.FindCellSet's portal traversal does not surface outdoor cell
0xA9B40029 from indoor cell 0xA9B40150.
Read docs/research/2026-05-24-door-bug-apparatus-shipped-findings.md
State both altitudes:
Currently working toward: M1.5 — Indoor world feels right
Current phase: A6.P4 door bug — cell-portal investigation.
Apparatus shipped; next step is to dump cell
0xA9B40150's portal list (from the dat) and
determine why FindTransitCellsSphere doesn't
add outdoor cell 0xA9B40029 to candidates.
First move: add a test to DoorSetupGfxObjInspectionTests (or a new
CellPortalDatInspectionTests file) that reads EnvCell 0xA9B40150 from
the real dat and prints every portal's OtherCellId, PolygonId, Flags.
Then read 0xA9B4013F (player's other indoor cell from JSONL) and
0xA9B40029 (door's outdoor cell) for cross-comparison. The portal
structure will reveal whether cottages use 0xFFFF exit portals
(FindTransitCellsSphere path) or building-shell portals
(CheckBuildingTransit path). If 0xFFFF exit portals exist but
exitOutside isn't firing, the bug is in the sphere-vs-plane test
at CellTransit.cs:99-112. If they don't exist, the building-shell
path is misconfigured for indoor-primary calls.
DO NOT:
- Modify the replay test to mask the walkable-polygon-edge artifact
- Re-do the registration shape (correct)
- Speculate on the fix without dat evidence
```


@@ -0,0 +1,171 @@
# Door bug — partial fix shipped (cell visibility), inside-out asymmetric collision remains
2026-05-25
## TL;DR
**Major root cause closed.** `CellTransit.AddAllOutsideCells` was
silently failing for every production caller because it assumed sphere
positions were in absolute world coordinates (subtracting the
landblock's "absolute" world origin `lbXf = 0xA9 * 192 = 32448`), while
production has used landblock-local coordinates since Phase A.1
(streaming-center landblock at world origin → `lbOffset = (0, 0)`).
For outdoor primary cells the bug was masked by `GetNearbyObjects`'s
radial sweep. For indoor primary cells (where issue #98's gate skips
the outdoor sweep), it meant **outdoor cells were never added to
`portalReachableCells`** → cottage door's outdoor cell `0xA9B40029`
invisible from indoor cell `0xA9B40150` → door's BSP never queried
→ player walked through.
**Outside→inside now blocks correctly. Inside→outside REMAINS BROKEN
asymmetrically.** Body partially intersects the door, slides through
visibly. Not retail-faithful. This is a SEPARATE bug in
BSP-collision-response for two-sided polygons — to investigate next
session.
## Apparatus shipped
Full trajectory-replay harness:
1. **Live capture** (`door-walkthrough.jsonl` from previous session; not
committed): 24,310 records of `PhysicsEngine.ResolveWithTransition`
calls including PhysicsBody snapshots before/after.
2. **Fixture extraction**
([tests/AcDream.Core.Tests/Fixtures/door-bug/live-capture.jsonl](../../tests/AcDream.Core.Tests/Fixtures/door-bug/live-capture.jsonl), 4 KB):
tick 13558 (the walkthrough) + tick 22760 (the working outdoor block)
as representative records.
3. **Replay harness**
([DoorBugTrajectoryReplayTests.cs](../../tests/AcDream.Core.Tests/Physics/DoorBugTrajectoryReplayTests.cs)):
- `LiveCompare_*` tests load the failing tick + replay through the
harness + diff result fields vs captured live values.
- `FindTransitCellsSphere_IndoorExitPortal_AddsOutsideForCapturedSpherePos`
— direct unit test for cell-portal traversal at the captured
sphere position. PASSES (cell graph is correct).
- `AddAllOutsideCells_LandblockLocalSphere_AddsDoorOutdoorCell`
— direct unit test that pinpointed the root cause. **Initially
failed** (`AddAllOutsideCells` returned empty when given
landblock-local sphere coords). **Now passes after fix.**
4. **Dat-direct cell-portal inspector**
([DoorSetupGfxObjInspectionTests.HoltburgCottage_CellPortals_DatInspection](../../tests/AcDream.Core.Tests/Physics/DoorSetupGfxObjInspectionTests.cs)):
reads `EnvCell` + `Environment.Cells` + portal `Polygon.Plane` from the
real dat for cells `0xA9B40150` (doorway alcove), `0xA9B4013F`
(cottage interior), `0xA9B40029` (outdoor — confirmed NOT EnvCell).
Output: cell `0xA9B40150` HAS a 0xFFFF exit portal at poly `0x0005`
with plane `n_local=(0, +1, 0), d_local=+5.6`. The sphere-vs-plane
math (sphere world `(132.36, 16.81, 94)` → local `(-1.86, -5.31, 0)`
via 180° Z rotation → `dist = +0.29` within `±rad=0.5` → straddles)
confirmed `exitOutside` SHOULD fire — but `AddAllOutsideCells` then
silently dropped the outdoor cell.
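
The straddle check in item 4 can be reproduced from the quoted numbers. A Python sketch assuming the plane convention `dist = n · p + d`, which is consistent with the `+0.29` above; the world→local transform is not reproduced, so the sketch starts from the already-local sphere center. Function names are hypothetical.

```python
def signed_distance(normal, d, point):
    """Signed distance of a point from the plane n . p + d = 0 (assumed convention)."""
    return sum(n * p for n, p in zip(normal, point)) + d

def straddles(normal, d, center, radius):
    """True when the sphere overlaps the plane, i.e. |dist| <= radius."""
    return abs(signed_distance(normal, d, center)) <= radius

# Exit-portal plane of cell 0xA9B40150 and the captured sphere, both in cell-local space.
n_local, d_local = (0.0, 1.0, 0.0), 5.6
center_local, radius = (-1.86, -5.31, 0.0), 0.5

print(round(signed_distance(n_local, d_local, center_local), 2))
print(straddles(n_local, d_local, center_local, radius))
```
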
## The fix
[src/AcDream.Core/Physics/CellTransit.cs](../../src/AcDream.Core/Physics/CellTransit.cs):
`AddAllOutsideCells` no longer subtracts the landblock's
"absolute" world origin from the sphere position. It treats
`worldSphereCenter` as landblock-local directly, matching retail's
`CLandCell::add_all_outside_cells` (which uses the per-cell 6-byte
position struct) and production's universal convention since
Phase A.1.
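
The arithmetic behind the failure is easy to replay. The Python sketch below is not the C# implementation: it handles only the cell under the sphere center (the real function adds every overlapped cell), subtracts the origin on X only (which is already enough to go out of range), and assumes 8 × 8 outdoor cells per 192-unit landblock with cell numbering `gridX * 8 + gridY + 1`. Those assumptions do reproduce both the quoted `gridX = -1346` and the door's cell low word `0x29`.

```python
LANDBLOCK_SIZE = 192.0  # from the fix notes: lbXf = 0xA9 * 192
CELLS_PER_SIDE = 8      # assumption: 8 x 8 outdoor cells per landblock
CELL_SIZE = LANDBLOCK_SIZE / CELLS_PER_SIDE  # 24.0 under that assumption

def center_outdoor_cell(x, y, subtract_world_origin, landblock_x=0xA9):
    """Low 16 bits of the outdoor cell under the sphere center, or None if out of range."""
    if subtract_world_origin:
        # Pre-fix convention: treat x as absolute world and remove the landblock origin.
        x -= landblock_x * LANDBLOCK_SIZE
    grid_x = int(x / CELL_SIZE)  # truncation, which reproduces the quoted gridX = -1346
    grid_y = int(y / CELL_SIZE)
    if not (0 <= grid_x < CELLS_PER_SIDE and 0 <= grid_y < CELLS_PER_SIDE):
        return None  # models the early return that added zero outdoor cells
    return grid_x * CELLS_PER_SIDE + grid_y + 1  # assumed outdoor cell numbering

# Captured sphere position from tick 13558, already landblock-local.
print(center_outdoor_cell(132.36, 16.81, subtract_world_origin=True))
print(hex(center_outdoor_cell(132.36, 16.81, subtract_world_origin=False)))
```

With the subtraction the lookup returns nothing; without it the result is the low word of `0xA9B40029`, the door's outdoor cell.
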
Existing tests in
[CellTransitAddAllOutsideCellsTests.cs](../../tests/AcDream.Core.Tests/Physics/CellTransitAddAllOutsideCellsTests.cs)
and
[CellTransitFindCellSetTests.cs](../../tests/AcDream.Core.Tests/Physics/CellTransitFindCellSetTests.cs)
updated to use landblock-local sphere coords (they were the only
callers using the world-coord convention; production never did).
## Visual verification
User tested all four combinations at a closed Holtburg cottage door,
~50cm off-center:
| Direction | Speed | Pre-fix | Post-fix |
|---|---|---|---|
| outside → inside | RUN | walks through | **BLOCKS** ✅ |
| outside → inside | WALK | walks through | (presumed BLOCKS — not retested) |
| inside → outside | RUN | walks through | **PARTIAL** ⚠️ body intersects door, sometimes through |
| inside → outside | WALK | walks through | **PARTIAL** ⚠️ same as run |
User quote: *"We have partial blocking from inside out. Can get
through some times. However, char is blocked a bit through the door.
So for example if I'm running towards this from the inside, I can see
parts of the body getting blocked a bit in to the door. This is not
per retail behavior and this is not how it looks when its block from
the outside"*.
The asymmetry is the new diagnostic: outside-in produces a clean block
(no body-into-door intersection visible); inside-out produces a partial
block with visible body intersection. This is the signature of an
**asymmetric collision response** to the door slab's two-sided
polygons (`SidesType=Landblock`), or a **BSP query that handles
sphere-already-overlapping-slab differently from sphere-approaching-slab**.
The `[bsp-test]` probe fires 245 times for the door entity during the
post-fix inside-out attempts — door IS being queried. The
collision-detection mechanics produce the wrong response.
## What's next (separate bug)
**Investigate BSPQuery.FindCollisions's response for two-sided polygons
when the sphere is already overlapping the slab.** Retail's
`CBSPTree::find_collisions` family handles this specifically — the
sphere's path through the slab faces gets traced and the FIRST face
crossed in motion direction is the collision. With two-sided polygons,
both faces are collidable; the front-vs-back determination is by
sphere-velocity vs face-normal dot product.
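
The front-vs-back rule can be stated as a one-line filter. This Python sketch only illustrates the dot-product test described above; it is not retail's or `BSPQuery`'s traversal, and the slab's ±Y face normals are an assumption based on the captured `cn=(0, +1, 0)` outdoor block.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def faces_opposing_motion(velocity, face_normals):
    """Normals of the faces the motion runs into, i.e. dot(velocity, normal) < 0."""
    return [n for n in face_normals if dot(velocity, n) < 0]

# Assumed two-sided door slab: one face pointing +Y (outdoor side), one pointing -Y.
slab = [(0.0, 1.0, 0.0), (0.0, -1.0, 0.0)]

outside_in = (-0.04, -0.42, 0.0)     # tick 22760 delta (target minus position)
inside_out = (0.0744, 0.3928, 0.0)   # tick 13558 delta from the trace

print(faces_opposing_motion(outside_in, slab))
print(faces_opposing_motion(inside_out, slab))
```

For the outside-in delta the selected face is the +Y one, which agrees with the live `cn=(0, +1, 0)` block. A symmetric response would select the -Y face for the inside-out delta.
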
Likely files:
- `src/AcDream.Core/Physics/BSPQuery.cs` — the BSP traversal +
sphere-poly intersection logic.
- Retail decomp anchors:
`acclient_2013_pseudo_c.txt:BSPTREE::find_collisions` +
`SPHEREPATH::sphere_intersects_poly` family.
Apparatus to write next: a focused test that registers the door at its
actual production world transform (entity origin + partFrame offset
from the dat, with correct rotation) and replays a sphere passing
through it from EACH side at various speeds. Compare collision normal
+ position-resolution per side. The asymmetric response will be
reproducible at unit-test speed.
## Commits
[List the commit SHAs of the apparatus + fix once landed.]
## Pickup prompt for the next session
```
Door bug — major root cause closed (CellTransit.AddAllOutsideCells
landblock-local coord convention). Outside→inside now blocks. But
inside→outside has asymmetric BSP collision response: body partially
intersects the door slab, sphere slides through. Same behavior at run
+ walk speed. Bug is in BSP collision response for two-sided polygons
or sphere-already-overlapping-slab handling.
Read docs/research/2026-05-25-door-bug-partial-fix-shipped.md
State both altitudes:
Currently working toward: M1.5 — Indoor world feels right
Current phase: A6.P4 door bug — inside-out asymmetric BSP collision
response. Apparatus is shipped (DoorBugTrajectoryReplayTests).
First major root cause closed. Remaining bug is in
BSP-collision-response mechanics, not cell visibility.
First move: extend the existing DoorBug apparatus with a more
faithful door registration (entity at the actual production world
pos + correct rotation; use the partFrame from the dat). Then write
TWO directional tests: sphere approaching the slab from the south
(outside-in) and sphere approaching from the north (inside-out).
Compare cn normal + resolution for each. The asymmetric response
will reproduce at unit-test speed. From there, inspect
BSPQuery.FindCollisions's handling of two-sided polygons and
sphere-already-overlapping cases. Retail oracle:
CBSPTree::find_collisions family at acclient_2013_pseudo_c.txt.
DO NOT:
- Re-investigate cell visibility (closed by AddAllOutsideCells fix)
- Re-do the registration shape (multi-part registration is correct)
- Speculate on the BSP fix without apparatus
```