acdream/docs/research/2026-05-20-phase-a4-shipped-cell-pingpong-finding.md
Erik 1534990102 docs(roadmap): A4 shipped + #90 cell-tracking ping-pong filed
Phase A4 (multi-cell BSP iteration) ships in three commits (e6369e2,
493c5e5, 691493e — with revert 3add110 + reapply during visual
verification that proved A4 is not the cause of the issue surfaced).
1139 + 8 baseline maintained. 10 new unit tests pass. Wires retail's
CTransition::check_other_cells (acclient_2013_pseudo_c.txt:272717-272798)
into Transition.FindEnvCollisions.

Visual verification at the Holtburg inn vestibule surfaced a separate,
pre-existing M2 blocker (filed as #90): CellId ping-pongs between
outdoor 0xA9B40022 and indoor 0xA9B40164 on every wall push-back
because the push-back exits the indoor CellBSP volume, causing the
resolver to flip back to outdoor and bypass walls on outdoor ticks.
Indoor BSP results (Collided/Adjusted/Slid all firing) prove walls ARE
detected when the player is indoor; the aggregate "walls walk through"
appearance comes from CellId classification instability, not from
collision detection.

Bug reproduces fully with A4 reverted (launch-revert2.log captured 18
cell-id flips between 0xA9B40022 ↔ 0xA9B40164, 11 inside=True
building-transit events, 61 indoor-bsp queries firing the full
result distribution). A4 is correct and tested but dormant in
practice until #90 is fixed.

Updates:
  - docs/research/2026-05-20-phase-a4-shipped-cell-pingpong-finding.md (new)
  - docs/plans/2026-04-11-roadmap.md (A4 shipped row added)
  - CLAUDE.md (Indoor walking Phase A4 paragraph + next-step pointer
    to #90 with retail oracle anchor at acclient_2013_pseudo_c.txt:308742-308783)
  - docs/ISSUES.md (#90 filed, HIGH severity, M2-blocker)
  - docs/research/2026-05-21-open-items-pickup-prompt.md (landscape
    table updated — A4 closed, #90 promoted to top blocker)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-20 20:10:29 +02:00


# Phase A4 shipped + cell-tracking ping-pong finding — 2026-05-20
**Status:** A4 (multi-cell BSP iteration) shipped in 3 commits, 1 revert, 1 reapply,
and 1 doc. Build green; test baseline unchanged from pre-A4 (1139 passing + 8
pre-existing failures).
A4 is **dormant in practice** because of a separate, pre-existing cell-tracking
bug at the inn doorway that prevents the player from stably remaining in an
indoor cell.
## TL;DR
- A4 ports retail's `CTransition::check_other_cells` (`acclient_2013_pseudo_c.txt:272717-272798`).
After the primary cell's BSP returns OK, every other cell the foot-sphere overlaps
is queried via `BSPQuery.FindCollisions`. Halt on first
Collided/Adjusted/Slid; Slid clears the contact-plane fields. Matches retail
exactly.
- 10 new tests pass (9 unit + 1 integration); the full test suite holds at the prior
8-failure baseline. Three commits land the slices (`FindCellSet` overload →
`CheckOtherCells` helper → `FindEnvCollisions` wire-up).
- **Visual verification surfaced a different bug**: walking into the Holtburg
inn ping-pongs the player's CellId between indoor `0xA9B40164` and outdoor
`0xA9B40022` rapidly. Indoor BSP DOES detect walls (Collided / Adjusted /
Slid all fire on push-back), but the push-back moves the sphere outside the
indoor CellBSP's volume → `ResolveCellId` reclassifies the player as outdoor
→ next tick bypasses indoor BSP entirely → player advances freely → re-enters
→ repeats.
- Because the player never STAYS in an indoor cell, A4's multi-cell pass is
rarely (if ever) actually exercised in production. The user's reported
"walls walk through everywhere in the inn" reproduces fully with A4 wire-up
reverted, confirming A4 is not the cause.
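
The halt semantics in the first bullet can be sketched as follows. This is a minimal Python model, not the C# port: `ContactPlane`, `query_cell_bsp`, and the cell ids `0xA9B40165` / `0xA9B40166` are hypothetical stand-ins chosen for illustration, and the real contact-plane fields may be named and shaped differently.

```python
from dataclasses import dataclass
from enum import Enum


class TransitionState(Enum):
    OK = "OK"
    COLLIDED = "Collided"
    ADJUSTED = "Adjusted"
    SLID = "Slid"


@dataclass
class ContactPlane:
    # Hypothetical stand-in for the contact-plane fields that Slid clears.
    valid: bool = False
    cell_id: int = 0


def check_other_cells(primary_cell_id, overlapped_cell_ids, query_cell_bsp, contact_plane):
    """Query every overlapped cell except the primary; halt on the first non-OK result."""
    for cell_id in overlapped_cell_ids:
        if cell_id == primary_cell_id:
            continue  # the primary cell already returned OK before this pass runs
        state = query_cell_bsp(cell_id)
        if state is TransitionState.OK:
            continue
        if state is TransitionState.SLID:
            # Slid clears the contact-plane fields before halting.
            contact_plane.valid = False
            contact_plane.cell_id = 0
        return state
    return TransitionState.OK


# Usage: the second neighbour slides, so the third is never queried.
results = {
    0xA9B40162: TransitionState.OK,
    0xA9B40165: TransitionState.SLID,      # hypothetical neighbour
    0xA9B40166: TransitionState.COLLIDED,  # hypothetical neighbour
}
visited = []


def fake_query(cell_id):
    visited.append(cell_id)
    return results[cell_id]


plane = ContactPlane(valid=True, cell_id=0xA9B40164)
state = check_other_cells(
    0xA9B40164,
    [0xA9B40164, 0xA9B40162, 0xA9B40165, 0xA9B40166],
    fake_query,
    plane,
)
print(state.value, [hex(c) for c in visited], plane.valid)
```

The early return is the point of the sketch: once any neighbouring cell reports Collided, Adjusted, or Slid, the remaining cells are not consulted.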
## What shipped
| SHA | Phase | Description |
|---|---|---|
| `b100d54` | A4 spec | docs/superpowers/specs/2026-05-20-phase-a4-multi-cell-bsp-design.md |
| `a8a0366` | A4 plan | docs/superpowers/plans/2026-05-20-phase-a4-multi-cell-bsp.md |
| `e6369e2` | A4 slice 1 | `CellTransit.FindCellSet` overload + 3 unit tests |
| `493c5e5` | A4 slice 2 | `Transition.CheckOtherCells` + `ApplyOtherCellResult` + 6 unit tests |
| `967d065` | A4 slice 3 | Wire `CheckOtherCells` into `FindEnvCollisions` + 1 integration test |
| `3add110` | A4 revert | Temporary revert of slice 3 to confirm A4 wasn't the cause |
| `691493e` | A4 reapply | Restored slice 3 after revert test proved A4 not the cause |

Total: ~380 LOC added (3 new test files + helper methods); the 1139-passing +
8-failing baseline held throughout.
## Visual verification — what we tested
Launched twice with the light-probe set (`ACDREAM_PROBE_INDOOR_BSP`,
`ACDREAM_PROBE_CELL`, `ACDREAM_PROBE_CELL_CACHE`).
### Launch 1 — A4 wire-up active (`launch-a4.log`, 782 lines)
User walked from spawn toward the Holtburg inn. Log captured:
- Player CellId stayed at outdoor `0xA9B4002A` the entire session.
- 0 indoor-bsp probes fired.
- 0 other-cells probes fired (A4 wire-up only runs for indoor cells).
- User reported "all interior walls in the inn can be walked through; going
from indoor to outdoor broken."

However, the A4 wire-up only fires when `cellLow >= 0x0100`, and the player never
reached that state in this session, so A4 cannot be the cause of the reported behavior.
### Launch 2 — A4 wire-up reverted (`launch-revert2.log`, 18490 lines)
User walked into and out of the inn multiple times. Log captured:
- 18 cell-transit events: outdoor cells `0xA9B40021` / `0xA9B40022` /
`0xA9B4002A` ping-ponging with indoor cell `0xA9B40164` (vestibule).
- 11 `[check-bldg] inside=True` events — player crossed the building threshold.
- 61 indoor-bsp queries against `0xA9B40164` (58) and `0xA9B40162` (3).
- Indoor BSP results: 40 OK + 7 Adjusted + 7 Collided + 7 Slid.
- User confirmed: "walls still walked through (same bug)" with A4 reverted.

**The bug reproduces with A4 reverted, proving A4 is not responsible.**
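
As a quick consistency check on the Launch 2 numbers, the per-cell query counts and the per-result counts should both sum to the 61 reported queries. The snippet below only re-adds the figures quoted above; it does not parse the log, whose line format is not reproduced in this note.

```python
# Figures copied from the Launch 2 summary above.
queries_by_cell = {0xA9B40164: 58, 0xA9B40162: 3}
results = {"OK": 40, "Adjusted": 7, "Collided": 7, "Slid": 7}

total_queries = sum(queries_by_cell.values())
total_results = sum(results.values())
non_ok = total_results - results["OK"]

print(total_queries, total_results, non_ok, f"{non_ok / total_results:.0%}")
# prints: 61 61 21 34%
```

So roughly a third of the indoor queries in that session hit a wall, and the remaining two thirds returned OK.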
## The actual bug — cell-tracking ping-pong at doorway threshold
The repeating cycle observed in the revert log:
1. Player at outdoor cell `0xA9B40022`, walking toward inn door.
2. `CheckBuildingTransit` returns `inside=True` for portal to `0xA9B40164`.
3. `ResolveCellId` promotes CellId to `0xA9B40164`.
4. Next tick: indoor branch of `FindEnvCollisions` fires. BSP query against
`0xA9B40164`'s walls returns Adjusted/Collided/Slid. The sphere is pushed
back (Adjusted/Slid) or halted (Collided).
5. The push-back moves the sphere's world position BACK toward the outdoor
side, beyond the indoor CellBSP's volume.
6. `ResolveCellId` re-evaluates: indoor CellBSP no longer contains the sphere
center → falls through to outdoor resolution → returns `0xA9B40022`.
7. CellId flips back to outdoor. Next tick: indoor BSP not queried, player
keeps advancing.
8. Player re-crosses the building threshold → go to step 2.

Net effect: the player visually moves through the doorway zone, walls
intermittently push them back, but most ticks classify them as OUTDOOR and
those ticks bypass wall collision entirely. The aggregate behavior LOOKS LIKE
"walls walk through" even though wall hits are firing.
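
The cycle can be reproduced in a toy one-dimensional model. This is a sketch under stated assumptions, not the engine's logic: positions are integers in centimetres along the walk axis, the indoor containment volume is the interval `[0, 300]`, the wall face sits 40 cm past the threshold, and the player advances 20 cm per tick. All of those numbers and the names `resolve_cell` and `simulate` are hypothetical; only the ~0.48 m sphere radius comes from this note.

```python
RADIUS_CM = 48                          # foot-sphere radius (~0.48 m)
STEP_CM = 20                            # hypothetical per-tick advance
WALL_X_CM = 40                          # hypothetical wall face, closer to the threshold than one radius
INDOOR_MIN_CM, INDOOR_MAX_CM = 0, 300   # hypothetical indoor containment interval


def resolve_cell(x_cm):
    """Classify by sphere centre only: inside the interval means indoor."""
    return "indoor" if INDOOR_MIN_CM <= x_cm <= INDOOR_MAX_CM else "outdoor"


def simulate(ticks, start_x_cm=-28):
    """Return (cell-id flips, outdoor ticks where the sphere overlapped the wall unchecked)."""
    x_cm = start_x_cm
    cell = resolve_cell(x_cm)
    flips = 0
    unchecked_overlaps = 0
    for _ in range(ticks):
        x_cm += STEP_CM
        overlaps_wall = x_cm + RADIUS_CM > WALL_X_CM
        if cell == "indoor":
            if overlaps_wall:
                # Indoor tick: the wall is detected and the sphere is pushed back,
                # which lands the centre outside the containment interval.
                x_cm = WALL_X_CM - RADIUS_CM
        elif overlaps_wall:
            # Outdoor tick: no indoor BSP query, so the overlap goes unnoticed.
            unchecked_overlaps += 1
        new_cell = resolve_cell(x_cm)
        if new_cell != cell:
            flips += 1
        cell = new_cell
    return flips, unchecked_overlaps


print(simulate(10))  # prints: (9, 5)
```

In ten ticks the classification flips nine times, and on five of those ticks the sphere overlaps the wall while classified as outdoor. The 1D toy does not reproduce the walk-through itself, since there is no lateral direction to escape into, but it does show the two ingredients described above: the push-back exits the containment volume, and the following tick skips the wall check.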
### Why A4 doesn't help here
A4 multi-cell iteration only runs when the primary cell BSP returns OK. In the
ping-pong cycle, the primary cell BSP returns non-OK (Collided/Adjusted/Slid)
on the indoor frames where a wall is actually hit (21 of the 61 indoor queries in
`launch-revert2.log`), so on those frames A4 short-circuits at the existing
`if (cellState != TransitionState.OK) return cellState;` path. A4 would help if the
player were STABLY indoor (`cellLow >= 0x0100`) AND the primary cell's BSP had sparse
geometry that missed walls in adjacent cells. The ping-pong prevents both
conditions from being met in practice.
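
The gating described above can be sketched as a small control-flow model. It is illustrative Python rather than the C# in `TransitionTypes.cs`: the function signature and the `outdoor_pass` / `other_cells_pass` callables are hypothetical, and `cellLow` is assumed to be the low 16 bits of the cell id, which is consistent with the ids quoted in this note.

```python
from enum import Enum


class TransitionState(Enum):
    OK = "OK"
    COLLIDED = "Collided"
    ADJUSTED = "Adjusted"
    SLID = "Slid"


INDOOR_CELL_LOW_MIN = 0x0100


def find_env_collisions(cell_id, primary_state, other_cells_pass, outdoor_pass):
    """Model of the gating only; primary_state is the primary cell's BSP result."""
    cell_low = cell_id & 0xFFFF  # assumption: cellLow is the low 16 bits
    if cell_low < INDOOR_CELL_LOW_MIN:
        return outdoor_pass()          # outdoor tick: indoor BSP is never consulted
    if primary_state is not TransitionState.OK:
        return primary_state           # early return: the A4 pass is skipped
    return other_cells_pass()          # only here does the multi-cell pass run


calls = []


def other_cells_pass():
    calls.append("other-cells")
    return TransitionState.OK


def outdoor_pass():
    return TransitionState.OK


outdoor = find_env_collisions(0xA9B40022, TransitionState.OK, other_cells_pass, outdoor_pass)
wall_hit = find_env_collisions(0xA9B40164, TransitionState.COLLIDED, other_cells_pass, outdoor_pass)
calls_before_clear_frame = len(calls)
clear = find_env_collisions(0xA9B40164, TransitionState.OK, other_cells_pass, outdoor_pass)
print(calls_before_clear_frame, len(calls))  # prints: 0 1
```

Of the three frames, only the indoor frame with an OK primary result reaches the multi-cell pass.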
### Why this is a Bug A cousin
The 2026-05-20 Bug A investigation
([docs/research/2026-05-20-indoor-walking-bug-a-handoff.md](2026-05-20-indoor-walking-bug-a-handoff.md))
documented a similar doorway-edge problem: indoor cell floor polys don't
extend past the doorway threshold, causing free-fall when stepping out.
The current ping-pong is the same family of bug, different symptom: the
indoor CellBSP volume doesn't extend past the doorway threshold either, so
the push-back from a wall collision exits the cell's containment volume,
and the cell-id resolver bounces the player back to outdoor.

Hypothesis: the inn's vestibule cell `0xA9B40164` has a CellBSP that's tightly
bounded to the room's interior volume. The doorway threshold is right at the
boundary. Walking against an interior wall pushes the foot-sphere back toward
the boundary → exits CellBSP → outdoor classification.
## Next steps (not blocking A4 ship)
Two paths to investigate the ping-pong, both out of A4's scope:
1. **CellBSP-volume retention.** Match retail's behaviour: once a player enters
an indoor cell, don't flip back to outdoor until they cross the EXIT portal
plane, not just because they exited the CellBSP volume on a push-back.
Likely a `ResolveCellId` modification that prefers the previous indoor
classification when sphere is "close enough" to the indoor CellBSP volume.
2. **CellBSP-volume expansion.** Pad the indoor cell's CellBSP volume by the
sphere radius (~0.48m) on all sides. The push-back stays within the
padded volume. Risk: may incorrectly classify nearby outdoor positions as
indoor.

The retail oracle for cell-id stickiness is at
`acclient_2013_pseudo_c.txt:308742-308783` (`CObjCell::find_cell_list`,
Position-variant) and the cell-array hysteresis logic around it. Not yet ported in
detail.
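
To compare the two candidate paths, here is a toy one-dimensional sketch. Everything in it is an assumption made for illustration: integer centimetre positions, an indoor interval of `[0, 300]`, a wall face at 40 cm, a 20 cm step, and a stickiness margin equal to the sphere radius. It models "close enough to the indoor volume" as a distance test and does not model the exit portal plane, so it is not a port of retail's logic.

```python
RADIUS_CM = 48                          # foot-sphere radius (~0.48 m)
STEP_CM = 20                            # hypothetical per-tick advance
WALL_X_CM = 40                          # hypothetical wall face
INDOOR_MIN_CM, INDOOR_MAX_CM = 0, 300   # hypothetical indoor containment interval


def distance_to_indoor(x_cm):
    """Distance from the sphere centre to the indoor interval (0 when inside)."""
    if x_cm < INDOOR_MIN_CM:
        return INDOOR_MIN_CM - x_cm
    if x_cm > INDOOR_MAX_CM:
        return x_cm - INDOOR_MAX_CM
    return 0


def resolve_strict(x_cm, previous_cell):
    # Current behaviour in the toy: containment of the centre decides.
    return "indoor" if distance_to_indoor(x_cm) == 0 else "outdoor"


def resolve_sticky(x_cm, previous_cell):
    # Path 1 (retention): keep a previous indoor classification while close enough.
    if previous_cell == "indoor" and distance_to_indoor(x_cm) <= RADIUS_CM:
        return "indoor"
    return resolve_strict(x_cm, previous_cell)


def resolve_padded(x_cm, previous_cell):
    # Path 2 (expansion): pad the volume by the sphere radius for everyone.
    return "indoor" if distance_to_indoor(x_cm) <= RADIUS_CM else "outdoor"


def count_flips(resolve, ticks=10, start_x_cm=-28):
    x_cm, cell, flips = start_x_cm, "outdoor", 0
    for _ in range(ticks):
        x_cm += STEP_CM
        if cell == "indoor" and x_cm + RADIUS_CM > WALL_X_CM:
            x_cm = WALL_X_CM - RADIUS_CM  # wall push-back on indoor ticks only
        new_cell = resolve(x_cm, cell)
        flips += new_cell != cell
        cell = new_cell
    return flips


print(count_flips(resolve_strict), count_flips(resolve_sticky), count_flips(resolve_padded))
# prints: 9 1 1
print(resolve_strict(-28, "outdoor"), resolve_padded(-28, "outdoor"))
# prints: outdoor indoor
```

Both variants stop the flip-flop in this toy, leaving only the single entry transition. The last line illustrates the risk noted for path 2: a position 28 cm outside the door that has never been indoors is classified as indoor by the padded volume, whereas the sticky variant only extends the volume for a player who was already inside.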
## Why ship A4 anyway
- **Correctness.** A4 matches retail's `check_other_cells` exactly. Nine unit
tests pin the `FindCellSet` overload and the halt semantics, and one integration
test verifies the wire-up. Pure port, no design improvisation.
- **No regressions.** 1139-passing + 8-pre-existing-failing baseline holds.
All A1 / A1.5 / A1.6 / A1.7 / Bug B fixes remain green.
- **Foundation for A3.** A3 (synthesis removal) is unblocked by A4 being in
place — it can rely on multi-cell BSP coverage for floor synthesis once the
ping-pong is fixed and players stay indoor long enough.
- **Reverting it would lose work.** A4 is correct and tested. The dormant
state is caused by an unrelated bug. Reverting would just make the
unrelated bug harder to investigate (no multi-cell foundation to build on).
## What this is NOT
This is **NOT** a fix for the user's "walls walk through" report. That bug is
pre-existing, caused by cell-tracking instability at doorway thresholds.

This is **NOT** a regression introduced by A4. The bug reproduces fully with
A4's wire-up reverted (verified by `launch-revert2.log`).

This is **NOT** the same as Bug A (synthesis removal). Bug A's symptom was
free-fall on doorway exit; this is wall walk-through due to CellId classification
flipping back to outdoor on each push-back.
## Code anchors
- Phase A4 wire-up: [src/AcDream.Core/Physics/TransitionTypes.cs:1614-1631](../../src/AcDream.Core/Physics/TransitionTypes.cs#L1614).
- `CheckOtherCells` + `ApplyOtherCellResult`: TransitionTypes.cs (search `CheckOtherCells`).
- `FindCellSet` overload: [src/AcDream.Core/Physics/CellTransit.cs](../../src/AcDream.Core/Physics/CellTransit.cs) (search `FindCellSet`).
- ResolveCellId outdoor branch (where the ping-pong happens): [src/AcDream.Core/Physics/PhysicsEngine.cs:259-329](../../src/AcDream.Core/Physics/PhysicsEngine.cs#L259).
## Probe captures
- `launch-a4.log` (782 lines) — A4 active, player stayed outdoor (didn't reach
inn). Confirms A4's indoor branch never fired in that session.
- `launch-revert.log` (1.2M lines) — A4 reverted, player parked at outdoor cell
with 400K+ `[check-bldg]` probes all returning `inside=False`. Player never
moved.
- `launch-revert2.log` (18490 lines) — A4 reverted, player walked into inn
multiple times. Captured the ping-pong cycle. Indoor BSP results breakdown:
40 OK + 7 Adjusted + 7 Collided + 7 Slid. 11 `inside=True` building-transit
events.
## How to start a fresh session
Open a new Claude Code session, then:
```
Pick up the cell-tracking ping-pong investigation that blocked Phase A4
from being exercised in practice.
1. Read docs/research/2026-05-20-phase-a4-shipped-cell-pingpong-finding.md
FIRST. It documents A4 ship (correct, dormant) + the ping-pong bug it
surfaced.
2. A4 is shipped (3 commits at e6369e2, 493c5e5, 691493e). Don't touch it.
1139 + 8 baseline holds.
3. The real M2 blocker: at the Holtburg inn doorway, CellId ping-pongs
between 0xA9B40022 (outdoor) and 0xA9B40164 (vestibule) every few ticks
because indoor BSP push-back exits the indoor CellBSP volume → outdoor
reclassification → walls bypassed on outdoor ticks.
4. Investigate cell-id hysteresis. Retail oracle:
acclient_2013_pseudo_c.txt:308742-308783 (CObjCell::find_cell_list
Position-variant). Look for the cell-array stickiness logic that retail
uses to prevent ping-pong.
5. CLAUDE.md rules: no workarounds, retail-faithful, probe-first.
State M2 as the milestone, "cell-tracking ping-pong fix" as the phase.
```