Knowledge-preservation pass after the issue #98 cellar-up fix shipped (`b3ce505`). Closes the saga's documentation loop and plans the next phase.

Changes:

- docs/research/2026-05-23-a6-p3-issue98-comparison-harness-findings.md
  Appended "Resolution 2026-05-24" section: v3 hypothesis falsified, actual mechanism (head-bump cottage GfxObj floor poly from below) confirmed, `b3ce505` fix shipped, known door regression flagged. Memory artifacts cross-referenced.
- docs/ISSUES.md
  #98 moved to DONE with full resolution writeup + decomp anchors.
  #99 filed: door regression at building thresholds (caused by b3ce505's indoor-primary gate). Closes via A6.P4.
  #100 filed: transparent rectangular patches around houses (terrain rendering). Bisect found commit `35b37df` introduced the hiddenTerrainCells mechanism that collapses 24m outdoor cells when buildings sit in them; cottage building only fills part of its cell so the rest of the 24m cell shows the sky-bleeding gap. Three fix-path options documented.
- docs/superpowers/specs/2026-05-24-phase-a6-p4-retail-shadow-architecture.md
  Full A6.P4 design doc. Three-slice plan: (1) query-side portal expansion to close #99 while preserving #98 fix, (2) port retail's BuildShadowCellSet at registration time so per-cell semantics match `CObjCell::find_cell_list`, (3) remove `b3ce505` stopgap entirely. Decomp anchors, file-by-file plan, risk inventory, open questions.

Memory entries written separately (out-of-tree at ~/.claude/projects/.../memory/):

- feedback_retail_per_cell_shadow_list.md
  The architectural lesson: retail uses per-cell shadow_object_list with portal-aware registration; our landblock-wide spatial registry diverges at indoor/outdoor seams.
- feedback_apparatus_for_physics_bugs.md
  The apparatus-first pattern that cracked the saga: live capture + fixture dump + replay harness. Template for future physics bugs. Quote rule: "when a physics bug is resisting and you catch yourself about to ship 'fix attempt N+1 with no new evidence,' STOP. Build the apparatus first."
- MEMORY.md index updated with both new entries.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

221 lines
19 KiB
Markdown
# Phase A6.P4 — Retail-faithful per-cell shadow_object_list — design

**Status:** Drafted 2026-05-24. Ready to start when user approves.
**Milestone:** M1.5 — "Indoor world feels right" (still active; A6.P3 partial close 2026-05-24).
**Predecessor:** Phase A6.P3 (issue #98 cellar-up). Shipped `b3ce505` as a behavioral stopgap (indoor-primary radial-sweep gate) that closes the cellar but introduces a door-collision regression at building thresholds (issue #99). A6.P4 ports retail's full shadow architecture and removes the stopgap as part of the same change.

**Related:**
- [`docs/research/2026-05-23-a6-p3-issue98-comparison-harness-findings.md`](../../research/2026-05-23-a6-p3-issue98-comparison-harness-findings.md) — A6.P3 saga + resolution
- Memory: `feedback_retail_per_cell_shadow_list.md`, `feedback_apparatus_for_physics_bugs.md`
- [`docs/ISSUES.md`](../../ISSUES.md) #98 (cellar — DONE), #99 (doors — OPEN, closes here), #97 (phantom collisions — likely closes here)

---

## 1. Goal

Port retail's per-cell `shadow_object_list` collision-query architecture, including the indoor/outdoor branch in `CObjCell::find_cell_list` and the portal-visible-neighbor recursion. Eliminate the landblock-wide spatial-radius approximation in `ShadowObjectRegistry.GetNearbyObjects` and the b3ce505 stopgap that gates it on primary cell type. After A6.P4 the engine is structurally retail-faithful for the shadow-object collision path.

**Acceptance:** cellar bug stays closed (no regression of #98); door collision works from both indoor and outdoor sides (closes #99); harness `LiveCompare_FirstCap_FixClosesCottageFloorCap` still passes; existing 11/11 `CellarUpTrajectoryReplayTests` + 19+ `ShadowObjectRegistryTests` pass; user visual verification at Holtburg cottage cellar + cottage doorway + inn doorway + indoor stones/furniture + outdoor walls.

---

## 2. Problem recap

### 2.1 The two-bug pair we live with today

- **Pre-b3ce505:** cottage GfxObj registered with `cellScope=0u` (landblock-wide) → outdoor radial sweep in `GetNearbyObjects` returns it for any sphere within ~5.5 m of the cottage. Player in cellar EnvCell at (141.6, 7.3) → cottage returned → head-bumps cottage floor poly from below → stuck (#98).
- **Post-b3ce505:** indoor-primary cells skip the outdoor radial sweep entirely. Cottage no longer returned to cellar. BUT outdoor-registered door entities at doorway thresholds are now skipped as well whenever the primary cell is indoor → walk-through (#99).

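
The two behaviors above can be reproduced in a toy model. This is a minimal Python sketch (chosen for brevity; the engine itself is C#), not the real `GetNearbyObjects`: the object positions, the cell ids, and the `nearby` helper are all hypothetical. Only the player position (141.6, 7.3), the ~5.5 m radius, and the `>= 0x0100` indoor test come from this doc.

```python
import math

INDOOR_MIN = 0x0100  # low 16 bits >= 0x0100 marks an indoor cell id (the b3ce505 gate)


def is_indoor(cell_id: int) -> bool:
    return (cell_id & 0xFFFF) >= INDOOR_MIN


# Hypothetical landblock-wide registry: (name, x, y). Positions are made up.
OUTDOOR_OBJECTS = [("cottage", 140.0, 10.0), ("door", 143.0, 8.0)]


def nearby(x, y, radius, primary_cell_id, indoor_gate):
    """Radial sweep over outdoor-registered objects, optionally gated on primary cell type."""
    if indoor_gate and is_indoor(primary_cell_id):
        return []  # post-b3ce505: an indoor primary skips the outdoor sweep entirely
    return [name for name, ox, oy in OUTDOOR_OBJECTS
            if math.hypot(ox - x, oy - y) <= radius]


cellar = 0xA9B40143  # hypothetical indoor cell id
pre = nearby(141.6, 7.3, 5.5, cellar, indoor_gate=False)   # ['cottage', 'door']
post = nearby(141.6, 7.3, 5.5, cellar, indoor_gate=True)   # []
```

With the gate off, the cellar query returns the cottage (the #98 cap). With the gate on, the same query also loses the door (the #99 walk-through).
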
### 2.2 Retail's resolution of the same tension

Retail doesn't have this problem because shadows are placed at **registration time** into specific per-cell `shadow_object_list`s, and `CObjCell::find_obj_collisions(this, ...)` iterates only `this->shadow_object_list`. The placement uses `CObjCell::find_cell_list` which branches indoor/outdoor + recurses through portal-visible neighbors.

Result:
- Cottage building (m_position resolves outdoor) → added to outdoor cells only → never in cellar's list → cellar query doesn't see it
- Door (m_position resolves to its cell, often outdoor; portal traversal adds adjacent indoor cell too) → in both cells' lists → query from either side sees it

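
A minimal Python sketch of the per-cell placement described above, under stated assumptions: the method names mirror the decomp functions, but the cell ids and the hand-written cell arrays are hypothetical (in retail the arrays would come from `find_cell_list`), and the engine itself is C#.

```python
from dataclasses import dataclass, field


@dataclass
class Cell:
    cell_id: int
    shadow_object_list: list = field(default_factory=list)

    def add_shadow_object(self, obj: str) -> None:
        self.shadow_object_list.append(obj)

    def find_obj_collisions(self) -> list:
        # Query time: only this cell's own list is consulted.
        return list(self.shadow_object_list)


def add_shadows_to_cells(obj: str, cell_array: list) -> None:
    # Registration time: the object is appended to every cell in the array.
    for cell in cell_array:
        cell.add_shadow_object(obj)


outdoor = Cell(0x0031)       # hypothetical outdoor cell
doorway_room = Cell(0x0140)  # hypothetical indoor cell behind the door
cellar = Cell(0x0143)        # hypothetical cellar cell

add_shadows_to_cells("cottage", [outdoor])            # outdoor cells only
add_shadows_to_cells("door", [outdoor, doorway_room])  # both sides of the portal

# cellar.find_obj_collisions()       -> []
# doorway_room.find_obj_collisions() -> ['door']
# outdoor.find_obj_collisions()      -> ['cottage', 'door']
```
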
**Decomp anchors** (`docs/research/named-retail/acclient_2013_pseudo_c.txt`):

| Line | Function | Role |
|---|---|---|
| 308742+ | `CObjCell::find_cell_list(Position, ...)` | Builds cell list at registration |
| 308751-308769 | (within find_cell_list) | Indoor/outdoor branch — indoor adds 1 cell; outdoor calls `add_all_outside_cells` |
| 308773-308825 | (within find_cell_list) | Visible-cells iteration — vtable call at offset 0x80, recursive portal traversal |
| 282819+ | `CPhysicsObj::add_shadows_to_cells(CELLARRAY)` | Adds the object to each cell in the array via `add_shadow_object` |
| 283322, 283369, 283389 | call sites | Build cell array via find_cell_list, then call add_shadows_to_cells |
| 308584+ | `CObjCell::add_shadow_object` | Appends to per-cell `shadow_object_list` |
| 308916 | `CObjCell::find_obj_collisions(this, ...)` | Per-cell iteration at query time |
| 309560 | `CEnvCell::find_collisions` | Calls `find_env_collisions` (own BSP) THEN `find_obj_collisions(this)` |
| 316951 | `CLandCell::find_collisions` | Calls `find_env_collisions` THEN `CSortCell::find_collisions` THEN `find_obj_collisions(this)` |

---

## 3. Design

### 3.1 Inversion of where the cell-set is computed

**Today (b3ce505):**

```
Register(obj, worldPos, cellScope=0)
  → enumerates 24-m outdoor grid cells (cx, cy) the bounding sphere overlaps
  → adds obj to each computed cellId
GetNearbyObjects(worldPos, queryRadius, primaryCellId)
  → if primary indoor: query indoorCellIds only (passed by caller)
  → else: enumerate 24-m outdoor grid cells the queryRadius overlaps, query each
```

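
The outdoor enumeration step can be sketched as follows. This is an illustrative Python approximation, not the engine's C# code: it assumes an 8 x 8 grid of 24 m cells per landblock and landblock-local coordinates, and it tests the sphere's bounding box rather than the exact circle, so it may return a corner cell the sphere does not actually touch.

```python
import math

CELL_SIZE = 24.0    # outdoor cell edge length in metres (from the text above)
CELLS_PER_SIDE = 8  # assumption: 8 x 8 outdoor cells per landblock


def overlapped_outdoor_cells(x, y, radius):
    """Return the (cx, cy) grid cells touched by the sphere's bounding box."""
    def span(center):
        # Clamp to the landblock so a sphere near the edge stays in range.
        lo = max(0, math.floor((center - radius) / CELL_SIZE))
        hi = min(CELLS_PER_SIDE - 1, math.floor((center + radius) / CELL_SIZE))
        return range(lo, hi + 1)

    return [(cx, cy) for cx in span(x) for cy in span(y)]


# A 5.5 m sphere at (141.6, 7.3) straddles the x = 144 cell boundary.
cells = overlapped_outdoor_cells(141.6, 7.3, 5.5)  # -> [(5, 0), (6, 0)]
```
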
**A6.P4:**

```
Register(obj, worldPos, m_position_cellId)
  → calls BuildShadowCellSet(obj.boundingSphere, m_position_cellId):
      if m_position_cellId is INDOOR ((cellId & 0xFFFF) >= 0x0100):
        adds m_position_cellId
        recurses VisibleCellIds (portal-visible neighbors)
        adds each portal-reachable cell (indoor OR outdoor)
      else (OUTDOOR):
        enumerates outdoor cells the bounding sphere overlaps
        (existing AddAllOutsideCells equivalent — already implemented)
  → adds obj to each cell in the computed set
GetNearbyObjects(spherePrimaryCellId, sphereOverlapCells)
  → for each cell in {primary} ∪ portal-reachable neighbors of primary:
      iterate its shadow_object_list
  → strict per-cell iteration, no spatial radius
```

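
The registration-side branch above could look roughly like this in Python (a sketch only; the real helper would be C#). The function name follows the pseudocode, but the `visible_cell_ids` dictionary shape and the cell ids are hypothetical, and the unbounded recursion is an assumption: a faithful port would presumably prune by bounding-sphere overlap, which is not modeled here.

```python
def is_indoor(cell_id: int) -> bool:
    return (cell_id & 0xFFFF) >= 0x0100


def build_shadow_cell_set(position_cell_id, visible_cell_ids, outdoor_overlap):
    """
    position_cell_id: cell the object's m_position resolves to.
    visible_cell_ids: dict of cell id -> portal-visible neighbor ids (hypothetical shape).
    outdoor_overlap:  outdoor cells the bounding sphere overlaps (computed elsewhere).
    """
    if not is_indoor(position_cell_id):
        # Outdoor branch: the AddAllOutsideCells equivalent, no portal walk.
        return set(outdoor_overlap)

    # Indoor branch: the cell itself plus everything reachable through portals.
    result = set()
    stack = [position_cell_id]
    while stack:
        cell = stack.pop()
        if cell in result:
            continue  # guards against portal cycles
        result.add(cell)
        if is_indoor(cell):
            stack.extend(visible_cell_ids.get(cell, []))
    return result


visible = {0x0143: [0x0140], 0x0140: [0x0143, 0x0031]}
indoor_set = build_shadow_cell_set(0x0143, visible, [])
outdoor_set = build_shadow_cell_set(0x0031, visible, [0x0031, 0x0032])
```

In this toy graph an object positioned in the cellar (0x0143) ends up in the cellar, the doorway room, and the outdoor cell, which illustrates the over-registration risk that slice 2 needs to check. The outdoor object stays in its two outdoor cells.
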
### 3.2 Why this fixes both bugs

- **Cellar (#98 stays fixed):** cottage's m_position is outdoor → its shadow set is outdoor cells. Cellar EnvCell's shadow list never has it. Sphere in cellar queries cellar's list → cottage not returned → no cap.
- **Doorway doors (#99 closes):** door's m_position is outdoor (door sits in doorway, position resolves to outdoor cell). Door's outdoor cell has a portal to the indoor cell adjacent to the doorway. The outdoor recursion via VisibleCellIds adds the indoor cell to the door's shadow set. Sphere on either side sees the door. (But see the caveat below.)

**Caveat (needs verification):** is the doorway portal in the OUTDOOR cell's VisibleCellIds or the INDOOR cell's? Outdoor LandCells don't typically have portals; portals live on EnvCells. So an EnvCell's portal lists the outdoor cell as its other-side cell, but the outdoor cell doesn't list the indoor portal-neighbor. If that holds, the outdoor-side recursion in the door bullet cannot find the indoor cell on its own, and **we need the recursion in both directions**: when registering an object whose position is outdoor, walk the indoor cells that LIST that outdoor cell as a portal neighbor. That's a reverse lookup.

**Two implementation choices:**

- **3.2.a (preferred):** Build a **reverse portal map** at landblock-load time. For each indoor cell, walk its portals; for each portal whose other side is an outdoor cell, record `outdoorCellId → indoorCellId`. Then when an outdoor object registers, look up its outdoor cells in the reverse map and also add the mapped indoor cells.
- **3.2.b (simpler, less retail-faithful):** At GetNearbyObjects time, if primary is indoor, also include indoor cell's portal-visible outdoor cells in the iteration set. Matches retail behaviorally at query time (the door is in the outdoor cell; the indoor query reaches into the outdoor cell's list via the indoor's VisibleCellIds). Avoids building a reverse map.

3.2.b is simpler and matches what retail's `CEnvCell::find_collisions` effectively achieves when it calls `find_obj_collisions(this)` on the indoor cell's own shadow_object_list (which would have been populated via portal-visible recursion). It's also the surgical extension of b3ce505. **Go with 3.2.b as the slice-1 implementation; option 3.2.a is the slice-2 cleanup if reverse map turns out to be needed for the strict per-cell architecture.**

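
For reference, option 3.2.a could be sketched as below. This is a hypothetical Python illustration (the engine is C#): `portals_by_indoor_cell` stands in for whatever per-cell portal data is available at landblock load, and the cell ids are made up.

```python
from collections import defaultdict


def is_indoor(cell_id: int) -> bool:
    return (cell_id & 0xFFFF) >= 0x0100


def build_reverse_portal_map(portals_by_indoor_cell):
    """Map each outdoor cell id to the indoor cells whose portals open onto it."""
    reverse = defaultdict(set)
    for indoor_id, other_side_ids in portals_by_indoor_cell.items():
        for other_id in other_side_ids:
            if not is_indoor(other_id):
                reverse[other_id].add(indoor_id)
    return dict(reverse)


def cells_for_outdoor_object(outdoor_cells, reverse_map):
    """Outdoor overlap cells plus any indoor cells that list them as a portal neighbor."""
    result = set(outdoor_cells)
    for cell in outdoor_cells:
        result |= reverse_map.get(cell, set())
    return result


# Doorway room 0x0140 has portals to outdoor cell 0x0031 and to the cellar 0x0143.
portals = {0x0140: [0x0031, 0x0143], 0x0143: [0x0140]}
reverse_map = build_reverse_portal_map(portals)
door_cells = cells_for_outdoor_object([0x0031], reverse_map)
```

Here `door_cells` contains the outdoor cell and the doorway room, but not the cellar. Note that in this sketch every outdoor object in cell 0x0031 would gain the doorway room, not only doors; a real implementation would likely also need a bounding-sphere check.
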
### 3.3 Slice plan

**Slice 1 — query-side portal expansion (1-2 days):**
- Extend `Transition.FindObjCollisions` to compute portal-reachable outdoor cells when primary is indoor. For each indoor cell in `indoorCellIds` from `CellTransit.FindCellSet`, walk its `CellPhysics.VisibleCellIds`, collect outdoor cell ids, pass them into `GetNearbyObjects`.
- `GetNearbyObjects` gains a `portalReachableOutdoorCells` parameter. When primary is indoor, the query iterates {indoorCellIds + portalReachableOutdoorCells}, with no radial sweep. When primary is outdoor, current behavior unchanged.
- Tests: new `LiveCompare_DoorThroughDoorway_*` test fixture (or extend the harness with a captured door-traversal record). Existing 11/11 `CellarUpTrajectoryReplayTests` continue passing.
- Visual: cellar + doors + indoor furniture + outdoor walls + Holtburg inn doorway.

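
The slice-1 query path might be sketched as follows, assuming `VisibleCellIds` does list the outdoor cell across a doorway (open question 1). This is illustrative Python rather than the engine's C#; the function names, the `shadows_by_cell` dictionary, and the cell ids are hypothetical.

```python
def is_indoor(cell_id: int) -> bool:
    return (cell_id & 0xFFFF) >= 0x0100


def portal_reachable_outdoor_cells(indoor_cell_ids, visible_cell_ids):
    """Collect outdoor cells listed as visible from any of the given indoor cells."""
    found = []
    for indoor_id in indoor_cell_ids:
        for neighbor in visible_cell_ids.get(indoor_id, []):
            if not is_indoor(neighbor) and neighbor not in found:
                found.append(neighbor)
    return found


def get_nearby_objects(shadows_by_cell, indoor_cell_ids, portal_outdoor_cells):
    """Indoor-primary query: strict per-cell iteration, no radial sweep."""
    result = []
    for cell_id in list(indoor_cell_ids) + list(portal_outdoor_cells):
        for obj in shadows_by_cell.get(cell_id, []):
            if obj not in result:
                result.append(obj)
    return result


visible = {0x0140: [0x0031, 0x0143], 0x0143: [0x0140]}
shadows = {0x0031: ["cottage", "door"], 0x0140: ["chair"]}

# Cellar: no outdoor cell is visible, so neither the cottage nor the door is returned.
from_cellar = get_nearby_objects(
    shadows, [0x0143], portal_reachable_outdoor_cells([0x0143], visible))

# Doorway room: the outdoor cell's list is reached, so the door is returned.
# In this toy data the cottage comes along too, since it shares that outdoor list.
from_doorway = get_nearby_objects(
    shadows, [0x0140], portal_reachable_outdoor_cells([0x0140], visible))
```
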
**Slice 2 — registration-side cell set (2-3 days):**
- `ShadowObjectRegistry.Register` gains an `m_positionCellId` parameter (not currently passed — production call sites use cellScope for indoor items and 0u for outdoor).
- New `BuildShadowCellSet` helper computes the cell set per retail's `find_cell_list` semantics:
  - Indoor m_position: that cell + VisibleCellIds (forward portal traversal)
  - Outdoor m_position: AddAllOutsideCells equivalent (current behavior for cellScope=0) — keep
- The registration call site in `GameWindow.cs:5893` (landblock-baked statics) passes the static's spatial-resolved cellId. The cellScope=0 path is replaced by an explicit cell-set computation.
- The `cellScope=ParentCellId` path (interior items, A1.5 fix) continues to work — `m_positionCellId = ParentCellId` reaches the indoor branch and adds that cell + portal neighbors. Today's behavior is "just that cell," so this is a small enrichment (might pick up a few more cells via portal recursion; need to verify no over-registration).
- After slice 2, the query side reverts to strict per-cell iteration (drop the slice-1 `portalReachableOutdoorCells` parameter; the registration side has already placed objects in the right cells).
- Tests: existing harness + ShadowObjectRegistry tests + new `Register_OutdoorPosition_RegistersInOutdoorCellsOnly` / `Register_IndoorPosition_RegistersInThatCellAndPortalNeighbors` round-trips.
- Visual: same as slice 1.

**Slice 3 — remove b3ce505 stopgap (1 day):**
- Delete the `primaryCellId` parameter on `ShadowObjectRegistry.GetNearbyObjects` and the `(primaryCellId & 0xFFFFu) >= 0x0100u` gate. The architecture no longer needs it.
- Delete the b3ce505 commit's comments referring to "indoor-primary gate" — replace with comments referencing the retail-faithful registration.
- Update `LiveCompare_FirstCap_FixClosesCottageFloorCap` test docstring to attribute the fix to the registration-side architecture instead of the query-side gate.
- Visual re-verify the cellar + doors after stopgap removal — fix must be load-bearing at the registration side, not the query side.

### 3.4 What's in scope vs out

**In scope:**
- ShadowObjectRegistry + Register + GetNearbyObjects + all production call sites in GameWindow.cs (3139, 5893, 5963, 5999, 6024, 6211)
- CellTransit.FindCellSet wiring for portal-visible expansion in slice 1
- Test harness updates
- Removal of b3ce505 stopgap in slice 3

**Out of scope (filed as follow-up if surfaced):**
- Reverse-portal-map approach (3.2.a) — only if 3.2.b reveals a case where the indoor-side query traversal misses a portal-direction asymmetry
- Refactoring the entire Register signature to take a `Position` object instead of separate cellId / worldPos / landblockId / cellScope params — cleaner but big diff
- `UpdatePosition` deep changes — it already calls `Register` after `Deregister`; new cell-set semantics flow through naturally
- Cylinder collision behavior changes — A6.P4 is about shadow-object set selection only; existing CylinderCollision math unchanged
- Transparent ground around houses (#100) — separate rendering issue, addressed in a different phase

---

## 4. Implementation breakdown

### 4.1 Files touched

| File | Slice | Change |
|---|---|---|
| `src/AcDream.Core/Physics/ShadowObjectRegistry.cs` | 1, 2, 3 | Slice 1: add `portalReachableOutdoorCells` to GetNearbyObjects. Slice 2: rewrite Register with BuildShadowCellSet. Slice 3: remove primaryCellId param + indoor-skip gate. |
| `src/AcDream.Core/Physics/TransitionTypes.cs` | 1, 3 | Slice 1: compute portal-reachable outdoor cells from indoorCellIds in FindObjCollisions; pass them. Slice 3: remove primaryCellId param. |
| `src/AcDream.App/Rendering/GameWindow.cs` | 2 | All 6 Register call sites: pass m_positionCellId (extracted from entity.Position.LandblockCellId or analog). For landblock-baked statics where m_position is outdoor, this is a new computation. |
| `src/AcDream.Core/Physics/CellPhysics.cs` (or new helper) | 1, 2 | Expose `IReadOnlyList<uint> PortalReachableCells` if not already covered by VisibleCellIds. |
| `tests/AcDream.Core.Tests/Physics/CellarUpTrajectoryReplayTests.cs` | 1, 3 | Slice 1: add Door-through-doorway test fixture or extend LiveCompare. Slice 3: update FixClosesCottageFloorCap docstring. |
| `tests/AcDream.Core.Tests/Physics/ShadowObjectRegistryTests.cs` | 2 | New tests for Register cell-set computation: outdoor m_position registers in outdoor cells only; indoor m_position registers in that cell + portal neighbors. |

### 4.2 Compatibility / deprecation

The `cellScope` parameter on `Register` should be **deprecated** during slice 2 — it's a function-shape relic from the A1.5 fix that papered over the lack of m_position-aware cell-set computation. New shape: `m_positionCellId` always passed. Old `cellScope` parameter kept (Obsolete attribute) for one slice, removed in slice 3.

### 4.3 Live capture infrastructure (reuse)

No new apparatus needed — slice 1's test can use:
- Existing `PhysicsResolveCapture` (env var `ACDREAM_CAPTURE_RESOLVE=<path>`) to capture a player walking through a doorway
- Extract a single tick where the door was the proximate obstruction
- Add it as a fixture to `tests/.../Fixtures/issue99/live-capture.jsonl`
- Write `LiveCompare_DoorThroughDoorway_FixCloses` similar to `LiveCompare_FirstCap_FixClosesCottageFloorCap`

If the live capture proves logistically hard (door positions vary per server, doors may auto-open on approach), slice 1 can rely on a synthetic harness test: register a fake door entity (Cylinder shadow) with an outdoor cellScope adjacent to a cellar-fixture indoor cell, verify GetNearbyObjects from the indoor cell returns it.

---

## 5. Risk inventory

### 5.1 Things that could go wrong

| Risk | Likelihood | Detection |
|---|---|---|
| Reverse portal direction matters (3.2.a needed) | MEDIUM | Slice 1 visual: doors at outdoor side might still pass through if the outdoor cell's "visibility" doesn't include the indoor cell. Need to test BOTH approach directions per door. |
| Over-registration in slice 2 — interior items end up in too many cells via portal recursion | LOW | Existing ShadowObjectRegistryTests for indoor items catch this. |
| `CellPhysics.VisibleCellIds` not populated correctly for all loaded cells | LOW-MEDIUM | The two-tier streaming might leave near-tier cells with full VisibleCellIds but far-tier without. Check `LoadFar` vs `LoadNear` paths in StreamingController. |
| Performance regression — slice 2's per-cell iteration is slightly different than slice 1's spatial query | LOW | Per-cell list iteration is O(shadows-in-relevant-cells) vs O(shadows-in-radius); should be similar or slightly better. |
| Pre-existing static-state test flakiness obscures slice signal | KNOWN | Run targeted tests in isolation per the issue-#98 saga pattern. |

### 5.2 What to verify visually after each slice

- **Slice 1:** Cottage cellar climb (still works), Holtburg cottage doorway from outside (door blocks), Holtburg cottage doorway from inside (door blocks), Holtburg inn doorway both directions.
- **Slice 2:** All slice 1 + interior items still block (chair, fireplace, table inside inn), outdoor walls still block from outside.
- **Slice 3:** Slice 2 list + verify no regression from stopgap removal — the fix is load-bearing at registration, not query.

---

## 6. Migration sequence (commit shape)

| Commit | Slice | Title |
|---|---|---|
| 1 | 1 | `feat(phys): A6.P4 slice 1 — portal-reachable outdoor cells in indoor shadow query` |
| 2 | 1 | `test(phys): A6.P4 slice 1 — door-through-doorway harness reproduction` (if live-capture-driven) |
| 3 | 2 | `feat(phys): A6.P4 slice 2 — BuildShadowCellSet for retail-faithful Register` |
| 4 | 2 | `refactor(phys): A6.P4 slice 2 — production call sites pass m_positionCellId` |
| 5 | 3 | `refactor(phys): A6.P4 slice 3 — remove b3ce505 indoor-primary gate (stopgap retired)` |
| 6 | 3 | `docs: A6.P4 ship — close #98 architectural shipped, close #99, file likely-closes for #97 + Finding 3 family` |

Each slice is fully buildable and visually verifiable on its own. The user can decide to stop after slice 1 if doors close cleanly and the registration-side refactor feels too aggressive for the moment; in that case the b3ce505 stopgap stays in place and #98 + #99 are both closed. But the long-term goal is slice 3's strict retail parity.

---

## 7. Open questions

1. **Does `CellPhysics.VisibleCellIds` already include the outdoor cell on the other side of a building doorway?** Need to inspect a real Holtburg cottage's loaded CellPhysics in the engine. If yes, slice 1 is straightforward. If no, slice 1 needs `Portals` walked directly (each PortalInfo has the other-side cellId).
2. **Are doors actually registered with outdoor cellScope today?** Need to verify by reading `GameWindow.cs:3139` carefully and tracing an actual door's `Position.LandblockId` at spawn time. If doors happen to register indoor (e.g., for inn doors that span a vestibule cell), the door regression diagnosis is wrong and we need different evidence.
3. **Two-tier streaming interaction:** when the cellar is in the NEAR tier but the cottage is also NEAR, both are loaded. When player is in the FAR tier far away from the cottage and walks into render distance, registration order matters — does the cottage register BEFORE the cellar cell finishes loading? If so, the cell-set computation might miss portals that haven't been loaded yet. Check `StreamingController` order of operations.
4. **Live entity (NPC, monster) movement:** UpdatePosition re-registers via Deregister + Register. The new cell-set semantics flow through, but if an NPC walks from indoor to outdoor, its cell membership changes per tick. Verify the deregister + re-register is cheap enough for the 5-10 Hz UpdatePosition rate.
5. **Plugin API:** if the plugin API exposes shadow registration (unlikely today, but planned eventually), the new signature change in slice 2 will need a corresponding plugin-API change. File as a separate plugin-versioning concern.

---

## 8. Out-of-scope kept-near reminders

- **Issue #100 (transparent ground around houses)** — separate rendering issue introduced by `35b37df`. Filed in ISSUES.md with the bisect finding. Addressed in a different phase (terrain-mesh polygon-level cutout, OR drop the hiddenTerrainCells mechanism with a building-floor render-only Z lift). Not part of A6.P4.
- **Issue #95 (dungeon portal-graph visibility blowup)** — separate rendering issue blocking M1.5 dungeon demo. Independent of A6.P4 but may share concepts (portal-graph traversal). If A6.P4 builds out portal-traversal infrastructure, #95 may benefit; do NOT scope-creep them together.
- **The b3ce505 commit's optional `primaryCellId` parameter signature** has a default of 0u for backward compatibility with non-cell-aware test callers. Slice 3 removes the param entirely. Tests that pass `primaryCellId=0` explicitly must be updated to drop the argument.