docs(spec): Indoor walking Phase 1 — BSP cluster design

Brainstormed spec for the next indoor follow-up phase: surfacing root
causes for ISSUES.md #84 (blocked by air) + #85 (pass through walls
outside→in) + #86 (click selection penetrates walls). Diagnostic-first
single capture pass; one [indoor-bsp] probe in FindEnvCollisions, then
surgical fixes (one commit per issue). Mirrors the indoor cell rendering
Phase 1+2 pattern that landed earlier today.

#86's root cause is already pinned by code reading (WorldPicker has no
cell-BSP test) — its fix is structural and doesn't need capture data.

#78 (outdoor stabs through floor) is in the same handoff cluster but
defers to a separate phase — different code path (render visibility).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This commit is contained in:
Erik 2026-05-19 14:07:01 +02:00
parent b04ad448fa
commit 59a4e3f6cf

View file

@ -0,0 +1,313 @@
# Indoor Walking Phase 1 — BSP cluster (#84 / #85 / #86)
**Status:** Brainstormed 2026-05-19. Awaiting user spec review before plan.
**Scope:** Diagnostic-first investigation pass across the three "indoor walking is broken" bugs that share a cell-BSP / picker root-cause cluster. Surface evidence with a single probe + one capture session, then ship surgical fixes (one commit per issue).
**Predecessors:**
- Indoor cell rendering Phase 1 (`docs/superpowers/specs/2026-05-19-indoor-cell-rendering-fix-design.md`) — the five `[indoor-*]` render-side probes.
- Indoor cell rendering Phase 2 (`docs/superpowers/specs/2026-05-19-phase2-indoor-cell-rendering-fix-design.md`) — silent-failure surfacing + WB Setup-prefix guard. Made floors render.
- Handoff: `docs/research/2026-05-19-indoor-followup-handoff.md`.
The indoor cell rendering Phase 1+2 pair made floors render. The moment floors rendered, nine pre-existing indoor bugs (`docs/ISSUES.md` #78-#86) became user-observable. This phase tackles the **BSP cluster** subset: #84, #85, #86.
`#78` (outdoor stabs visible through floor) is in the same handoff cluster but a fundamentally different code path (render-side visibility / stencil), so it's deferred to a separate phase. `#79-#83` (lighting / terrain / stairs) are in different clusters.
---
## 1. What we know from the code
Pre-investigation reads (2026-05-19) of the three issue surfaces:
### `#84` (blocked by air indoors) — cell BSP IS consulted
The handoff hypothesized "cell BSP isn't being used". Code reading says otherwise:
- **Cell BSP IS cached.** `PhysicsDataCache.CacheCellStruct` ([src/AcDream.Core/Physics/PhysicsDataCache.cs:131](src/AcDream.Core/Physics/PhysicsDataCache.cs:131)) stores `BSP`, `PhysicsPolygons`, `Vertices`, `WorldTransform`, `InverseWorldTransform`, and pre-resolved polygons (planes computed at cache time).
- **Cell BSP IS consulted in collision.** `Transition.FindEnvCollisions` ([src/AcDream.Core/Physics/TransitionTypes.cs:1188-1241](src/AcDream.Core/Physics/TransitionTypes.cs:1188)) has an explicit indoor branch gated on `cellLow >= 0x0100` that:
1. Looks up `cellPhysics` via `engine.DataCache.GetCellStruct(sp.CheckCellId)`,
2. Transforms the player's sphere to cell-local space via `InverseWorldTransform`,
3. Calls `BSPQuery.FindCollisions` with the cell's pre-resolved polys,
4. Returns `cellState` if `!= OK`.
So #84's root cause is not "wiring missing". It's one of: (a) extra physics-only polys with no visible counterpart, (b) `+0.02f` Z-bump misalignment between cellTransform (applied to physics) and player Z (computed from terrain), (c) `BSPQuery` returning false positives at certain poly side-types, (d) `cellTransform` quaternion error on rotated cells. Capture data will pin which.
### `#85` (pass through walls outside→in) — likely asymmetric path
Walking outside-in keeps `CheckCellId` as the outdoor land cell (low byte `0x00xx-0x00FF`), so the indoor cell-BSP branch at TransitionTypes.cs:1192 is **gated out by design** (`cellLow >= 0x0100` is false). The only collision tested on the outside-in approach is:
- **Terrain** (always tested),
- **Outdoor stab BSPs** ([`PhysicsDataCache.GetGfxObj`](src/AcDream.Core/Physics/PhysicsDataCache.cs) for `LandBlockInfo.Objects`) — building stab is hit via `FindObjCollisions`.
L.2d slice 1+1.5 ported `CBuildingObj` collision (per CLAUDE.md), so the outer building shell SHOULD be hit. If #85 reproduces, hypotheses:
1. The outdoor stab BSP for the Inn covers floor+roof but is missing wall polys (authoring shape — retail's interior cells own the walls, outdoor shell is a partial envelope).
2. The outdoor stab BSP has wall polys but with one-sided normals; outside approach hits the back face which BSP treats as "behind plane" → no collision (`feedback_no_patching_collision` memory's faithful-port rule means we'd need to follow retail's handling).
3. The L.2g dynamic-physics-state flag work doesn't include outdoor building shells in the collision sweep for the player's CheckCellId.
4. **Retail's actual behavior** may be that outside-in BSP probing queries the EnvCell's BSP across the cell boundary — retail's `CCellStructure::find_env_collisions` may walk neighbor-cell BSPs.
### `#86` (click selection penetrates walls) — root cause definitively pinned by code reading
`WorldPicker.Pick` ([src/AcDream.Core/Selection/WorldPicker.cs:88-160](src/AcDream.Core/Selection/WorldPicker.cs:88), and the screen-rect overload at line 202) is **pure ray-sphere against entity AABBs**. There is no cell BSP test, no scenery BSP test, no terrain test. Any entity along the ray within `maxDistance` is a candidate; nothing occludes.
No probe needed for #86. Fix is structural: add a cell-BSP ray-poly occlusion test that runs once per `Pick` call and culls entities whose ray-distance exceeds the nearest wall hit.
---
## 2. The three issues
| # | Title | Code path | Fix shape |
|---|---|---|---|
| #84 | Blocked by air indoors | `Transition.FindEnvCollisions` cell branch | TBD — pinned by probe capture |
| #85 | Pass through walls outside→in | `FindObjCollisions` outdoor-stab path or cross-cell BSP probing | TBD — pinned by probe capture |
| #86 | Click selection penetrates walls | `WorldPicker.Pick` (both overloads) | Add cell-BSP ray-poly occlusion test |
---
## 3. Architecture
```
[indoor-bsp] probe
┌───────────────────┴────────────────────┐
▼ ▼
Movement path Picker path
(FindEnvCollisions cell branch) (WorldPicker.Pick)
│ │
├─→ #84: blocked by air └─→ #86: click through walls
└─→ #85: pass through walls (cause already pinned by code reading)
(cause TBD — needs capture)
```
The probe spans only the movement path. #86's diagnosis is already known; its fix is independent of the capture and can land in parallel.
---
## 4. Components
### Component 1 — `PhysicsDiagnostics.IndoorBspEnabled`
New static toggle on `AcDream.Core.Physics.PhysicsDiagnostics`. Mirrors the existing `ResolveProbeEnabled` / `CellProbeEnabled` pattern:
- Backed by `ACDREAM_PROBE_INDOOR_BSP` env var read once at startup.
- Mutable at runtime via the DebugPanel checkbox.
- Zero-cost when off — checked before any string formatting.
Also extends `PhysicsDiagnostics.IndoorAllEnabled` cascading the way Phase 1 cascaded the render-side `ACDREAM_PROBE_INDOOR_ALL`.
### Component 2 — `[indoor-bsp]` log site
One `Console.WriteLine` block in `Transition.FindEnvCollisions` ([TransitionTypes.cs:1222](src/AcDream.Core/Physics/TransitionTypes.cs:1222)), wrapping the existing `BSPQuery.FindCollisions` call. Captured fields per call:
| Field | Source | Why |
|---|---|---|
| `cellId` | `sp.CheckCellId` | Which cell's BSP was queried (hex, full 32-bit) |
| `localPos` | `localCenter` | Sphere foot center in cell-local space (3 floats) |
| `localPrevPos` | `localCurrCenter` | Sphere previous-frame foot center in cell-local space |
| `worldPos` | `footCenter` | Sphere foot center in world space (for cross-ref with user-reported spot) |
| `result` | `cellState` | `TransitionState` enum (`OK` / `Collided` / etc.) |
| `polyId` | `ci.LastHitCellPolyId` (NEW field if needed) | Which cell poly was hit, if any |
| `polyNormal` | `cellPhysics.Resolved[polyId].Plane.Normal` | Local-space normal (3 floats) — diagnoses one-sided / orientation bugs |
| `sidesType` | `cellPhysics.Resolved[polyId].SidesType` | `Front` / `Back` / `Both` — diagnoses #85 candidate |
| `walkable` | `ci.LastKnownContactPlaneValid` | Walkable surface tracking state |
Log line format (one line, pipe-separated, machine-greppable):
```
[indoor-bsp] cell=0xA9B40100 wpos=(82.45,71.23,1.04) lpos=(0.45,2.10,1.02) result=Collided poly=0x0042 n=(0.00,1.00,0.00) sides=Front walkable=true
```
If `BSPQuery.FindCollisions` doesn't already expose the hit poly id, the log fields shrink to what's available without expanding the BSPQuery API. A separate small change to surface `lastHitPolyId` from `BSPQuery` would be in-scope for this phase if needed.
### Component 3 — DebugPanel checkbox
Adds a checkbox row in the DebugPanel's Diagnostics section (already hosts the L.2a `Resolve` and `Cell-transit` toggles, plus the Phase 1 `Indoor walk/cull/upload/lookup/xform` toggles). Surface area: ~3 lines. No new file.
### Component 4 — `WorldPicker` cell-BSP occluder
Two implementation options:
**Option C1 — Inline in `WorldPicker.Pick`.** Add a `cellOccluder` callback parameter `Func<Vector3, Vector3, float>?` that returns the nearest wall-hit `t` along the ray (or `float.PositiveInfinity` if no hit). Inside `Pick`, after computing the entity hit `t`, gate by `entityHit < cellOccluder(origin, direction)`.
**Option C2 — Separate `CellBspRayOccluder` static class.** New file `src/AcDream.Core/Selection/CellBspRayOccluder.cs`. Function `NearestWallT(Vector3 origin, Vector3 direction, IEnumerable<CellPhysics> loadedCells)` — Möller-Trumbore ray-triangle against each cell's resolved polys, returns nearest `t`. WorldPicker calls it once per `Pick` invocation.
**Recommend C2.** Reasons: testable in isolation (synthetic cell + ray); two `WorldPicker.Pick` overloads share one implementation; future picker improvements (entity body refine, scenery BSP refine) get a parallel structure to copy.
The caller (`GameWindow` Use/Select handlers) must supply the loaded `CellPhysics` set. `PhysicsDataCache` already has `GetCellStruct(id)` so the caller iterates currently-loaded `LoadedCell` ids from `CellVisibility._cellLookup` (Holtburg radius 4 keeps maybe 80 cells loaded — fast Möller-Trumbore).
### Component 5 — Fix patches (TBD)
Concrete commits drafted only after capture data lands. Candidates by issue:
**#84**:
- Remove `+0.02f` Z bump from the physics-side `cellTransform` while keeping it for render's `cellMeshRef` (separate transforms). Or apply the bump symmetrically (also bump player Z by `+0.02f` when entering an indoor cell).
- Filter out physics-only polys with no visible counterpart, IF capture data shows phantom polys are the issue.
- Patch `BSPQuery.FindCollisions` side-type handling, IF capture data shows specific side-types misbehaving.
**#85**:
- Port retail's outside-in BSP cross-cell probing — query an EnvCell's BSP from an outdoor cell when the sphere overlaps the EnvCell's world AABB. Reference: PDB-named `CCellStructure::find_env_collisions` and neighbors.
- OR ensure outdoor building-shell stab BSPs include wall polys with two-sided handling.
- Path picked from capture evidence + decomp grep.
---
## 5. Data flow
### Capture session
User runs the canonical Holtburg launch (`ACDREAM_LIVE=1`, `+Acdream` char) with `ACDREAM_PROBE_INDOOR_BSP=1` + `ACDREAM_PROBE_RESOLVE=1` (latter already shipped from L.2a). Three scripted scenarios:
1. **Inside Inn walkaround (~30 s)** — walk slowly around the common room, attempt to reproduce #84. Note world-position when an invisible block happens.
2. **Outside-in approach (~30 s)** — stand 5+ m west of the Inn, sprint at the west wall. Reproduce #85.
3. **Inside-out sanity (~30 s)** — stand inside, walk into east wall from interior. This SHOULD block (per issue text); confirms inside-out path works.
Total launch: one. Captures all three.
### Offline analysis
```
grep "\[indoor-bsp\]" launch.log | head -200 # see what fired during scenario 1
grep "\[resolve\]" launch.log | grep "obj=0x" # see which objects were hit during scenario 2
grep "\[cell-transit\]" launch.log # confirm cell ids during transitions
```
Diagnosis per issue:
- **#84**: in scenario-1 lines, find `result=Collided` events where world-pos is in open space (no visible wall). Cross-ref `polyId` with the cell's `cellStruct.PhysicsPolygons` to identify what the offending poly is. Compare its local-Z with player's local-Z to test the Z-bump hypothesis.
- **#85**: in scenario-2 lines, expect zero `[indoor-bsp]` events (gated out). Check `[resolve]` lines for the moment the player crosses the wall plane — did `FindObjCollisions` fire for any building stab? If yes, what poly? If no, the outdoor stab path is missing wall geometry → fix shape is the cross-cell BSP probing.
- **#86**: no capture needed. Code reading already pinned the cause; fix is structural.
### Fix application
Per CLAUDE.md "no workarounds" rule:
- The probe data must point at one specific code site before any fix lands.
- Each fix commit cites the evidence in its message ("`[indoor-bsp] cell=0x... wpos=... poly=... n=...` — the poly at local-Z=0.0 is the floor poly; player local-Z=-0.02 from the +0.02f bump puts foot below floor → spurious floor-up push at cell boundary").
- No try/catch swallow, no early-return guard at the symptom site.
---
## 6. Commit shape
```
1. feat(physics): Cluster A — indoor BSP collision probe
- PhysicsDiagnostics.IndoorBspEnabled toggle + env var + DebugPanel checkbox
- [indoor-bsp] log site in TransitionTypes.FindEnvCollisions cell branch
- (if needed) BSPQuery.LastHitPolyId surfacing
[CAPTURE SESSION — user-driven, no commit]
2. fix(physics): Cluster A #84<root cause from probe>
- One surgical change to TransitionTypes / GameWindow / BSPQuery
- Commit message cites probe evidence line
- Closes ISSUES.md #84
3. fix(physics): Cluster A #85<root cause from probe + decomp>
- One surgical change to TransitionTypes or PhysicsDataCache
- Commit message cites probe evidence + retail decomp anchor
- Closes ISSUES.md #85
4. fix(picker): Cluster A #86 — cell-BSP ray occlusion in WorldPicker
- New CellBspRayOccluder static class (Option C2)
- WorldPicker.Pick (both overloads) consults occluder before returning hit
- Unit test covering synthetic wall-between-camera-and-entity case
- Closes ISSUES.md #86
5. docs(roadmap+issues): Cluster A shipped — close #84/#85/#86, update roadmap
- ISSUES.md moves three issues to Recently closed
- docs/plans/2026-04-11-roadmap.md shipped table updated
- CLAUDE.md "Currently in Phase L.2..." line advanced if appropriate
```
Visual verification gate sits between commits 4 and 5. User confirms each acceptance criterion in the live client before closing.
---
## 7. Files touched
**Definite:**
- `src/AcDream.Core/Physics/PhysicsDiagnostics.cs` — new `IndoorBspEnabled` toggle.
- `src/AcDream.Core/Physics/TransitionTypes.cs``[indoor-bsp]` log site at the cell branch.
- `src/AcDream.App/UI/Panels/DebugPanel.cs` (or wherever the diagnostics checkboxes live) — UI toggle.
- `src/AcDream.Core/Selection/WorldPicker.cs` — call the new occluder.
- `src/AcDream.Core/Selection/CellBspRayOccluder.cs` — new file.
- `src/AcDream.App/Rendering/GameWindow.cs` — wire `LoadedCell` set / `CellPhysics` enumeration into the Use/Select handlers' picker calls.
- `tests/AcDream.Core.Tests/Selection/WorldPickerCellOcclusionTests.cs` — new unit test for #86 fix.
- `docs/ISSUES.md` — close #84/#85/#86.
- `docs/plans/2026-04-11-roadmap.md` — shipped table entry.
**TBD (depends on capture):**
- `src/AcDream.App/Rendering/GameWindow.cs:5362` (+0.02f Z bump site).
- `src/AcDream.Core/Physics/BSPQuery.cs`.
- `src/AcDream.Core/Physics/PhysicsDataCache.cs`.
---
## 8. Error handling
- Probe always behind `PhysicsDiagnostics.IndoorBspEnabled`. Zero-cost when off.
- Probe writes to `Console.WriteLine`, captured by the launch.log `Tee-Object` pipe (matches existing probe convention).
- `CellBspRayOccluder` returns `float.PositiveInfinity` when no cells are loaded (outdoor camera). Picker behaves exactly as today in that case.
- No try/catch around fix sites. If a fix doesn't behave, the user reports the residual symptom and the probe re-fires to identify the new cause.
---
## 9. Testing
### Unit tests
- **`WorldPickerCellOcclusionTests`** (new): synthetic `CellPhysics` with one wall poly between origin and an entity at 5 m. `Pick` returns null. Remove the wall — `Pick` returns the entity. Verifies the occluder is wired and triangulates correctly.
- **`CellBspRayOccluderTests`** (new): direct unit tests for the Möller-Trumbore intersection — ray hits poly front, back, edge, miss, parallel-to-poly. Standard ray-triangle coverage.
- **Existing tests**: `dotnet test` green. `WorldPickerTests` + `WorldPickerRectOverloadTests` + all `BSPQuery` tests must remain green.
### Visual verification (user-driven)
Three checks, one per issue:
1. **#84 acceptance** — User walks the common-room loop in Holtburg Inn. No invisible blocks. Probe shows no `TransitionState != OK` events at positions away from visible walls/furniture.
2. **#85 acceptance** — User stands 5+ m west of the Inn, runs at the west wall. Player blocks at the wall plane (within ~0.05 m of the visible wall surface). User cannot enter the building except via a door portal.
3. **#86 acceptance** — Mouse over a wall pixel from outside the Inn → cursor shows no selection. Mouse over an NPC through an open door portal → cursor shows the NPC selection ring (selection still works through real apertures).
---
## 10. Acceptance criteria
- All three issues meet their respective acceptance gates above (visual confirmation by user).
- `dotnet build` green.
- `dotnet test` green (new tests + all existing).
- Roadmap "shipped" table updated.
- `docs/ISSUES.md` #84/#85/#86 moved to "Recently closed" with commit SHAs.
- A short post-phase handoff doc (`docs/research/<ship-date>-indoor-walking-phase1-shipped-handoff.md`) records the probe evidence + the three root causes, parallel to the existing Phase 1+2 docs.
---
## 11. Phase name + roadmap placement
**Proposed name:** "Indoor walking Phase 1 — BSP cluster (#84/#85/#86)".
Reasons:
- Continues the "Indoor X Phase N" naming established by Phase 1 (probes) + Phase 2 (rendering fix).
- Distinguishes from indoor RENDERING work (which is done) — the focus has shifted to indoor WALKING.
- "Phase 1" implies more phases follow (Phase 2 likely = #78 outdoor-stab visibility cluster).
**Roadmap placement:** Add to `docs/plans/2026-04-11-roadmap.md` ahead-table as the next item in the indoor track. Insert after the Indoor cell rendering Phase 2 entry. Cross-link to ISSUES.md #84/#85/#86.
**Milestone:** This is parallel to the M2 critical path (which is F.2 / F.3 / F.5a / L.1c / L.1b). M1 already landed and is frozen. Indoor walking work is a quality-of-life parallel track — the user's recent commits put it ahead of M2 work because the rendering Phase 2 ship made it actionable.
---
## 12. Out of scope
- **#78** — outdoor stabs/buildings visible through rendered floor. Different code path (visibility / stencil). Filed for Indoor walking Phase 2.
- **#79-#82** — lighting / terrain shading. Cluster B in the handoff. Separate phase.
- **#83** — walking up stairs broken. Standalone issue. May share code with this phase if the cell BSP fix touches step-up; address opportunistically only if so.
- **Refactoring `WorldPicker`** beyond adding the occluder. The existing two-overload structure stays.
- **Stage B picker refine** (Möller-Trumbore against entity body polygons) — Issue #71, deferred per existing roadmap.
---
## 13. Risks
1. **Capture is inconclusive.** If the probe fires zero unexpected events during scenario 1 (i.e., #84 cannot be reproduced live during the capture), we extend the probe to also log `BSPQuery` internals or capture a longer session. Probably one more launch.
2. **#85 fix requires significant retail-decomp port.** Cross-cell BSP probing (querying an EnvCell's BSP from an outdoor cell) is not in the current code. The retail decomp at `named-retail/acclient_2013_pseudo_c.txt` has `CCellStructure::find_env_collisions` and neighbors that handle this. If the port is non-trivial (more than ~100 lines), promote #85 to its own dedicated phase rather than including it here. Decision point: after the capture, before commit 3.
3. **`CellBspRayOccluder` performance.** Möller-Trumbore against ~80 cells × ~50 polys each = ~4K triangle tests per `Pick` call. Picker fires once per click — acceptable. If we ever move to hover-pick (every frame), this needs an acceleration structure; not in scope here.
4. **Probe gets noisy.** If `FindEnvCollisions` fires at 30 Hz × N cells, the log can grow fast. Add a per-call rate limit only if the capture log is unreadable; default to unlimited (Phase 1+2 didn't need limiting).