feat(physics): #32 L.5 30Hz physics tick + retail debugger toolchain (#35) + Phase 3 retail-faithful kill_velocity

Three intertwined changes from a single investigation session driven by
attaching cdb to a live retail acclient.exe (v11.4186, Sept 2013 EoR
build) and tracing what retail actually DOES on the steep-roof wedge
scenario the user reported in acdream.

═══════════════════════════════════════════════════════════
1. L.5 — physics-tick MinQuantum gate (PlayerMovementController)
═══════════════════════════════════════════════════════════

Retail's CPhysicsObj::update_object subdivides per-frame dt into 1/30 s
sized integration steps and SKIPS entirely when accumulated dt is below
MinQuantum. Live trace evidence:

  update_object        = 40,960 calls
  UpdatePhysicsInternal = 25,087 calls   (61%)

i.e., 39% of update_object calls return early via the MinQuantum gate.
Retail's effective physics tick rate is 30Hz even at 60+ Hz render.

acdream's PlayerMovementController bypassed the existing
PhysicsBody.update_object and called UpdatePhysicsInternal(dt) directly
on every render frame. That compressed bounce-energy / gravity-tangent
accumulation into half the time and amplified our steep-roof wedge
dynamics.

Fix: add a `_physicsAccum` accumulator. Integrate only when accumulated
dt ≥ MinQuantum, with the step clamped to MaxQuantum to bound
stale-frame jumps. When accumulated dt exceeds HugeQuantum (debugger
break, GC pause), the accumulated time is dropped instead of
integrated. Render still runs at full rate; only the physics step is
gated.
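
A minimal Python sketch of the gate described above, not the actual
PlayerMovementController code. MIN_QUANTUM is the 1/30 s from the text;
the MAX_QUANTUM and HUGE_QUANTUM values, the PhysicsGate name, and the
choice to discard any remainder after a step are assumptions made for
illustration only.

```python
from dataclasses import dataclass

MIN_QUANTUM = 1.0 / 30.0  # from the commit text: 1/30 s integration step
MAX_QUANTUM = 0.1         # ASSUMED value, for illustration only
HUGE_QUANTUM = 2.0        # ASSUMED value, for illustration only


@dataclass
class PhysicsGate:
    """Hypothetical stand-in for the `_physicsAccum` logic."""
    accum: float = 0.0

    def tick(self, frame_dt: float):
        """Return the dt to integrate this frame, or None to skip physics."""
        self.accum += frame_dt
        if self.accum > HUGE_QUANTUM:
            # Truly stale (debugger break, GC pause): drop the time entirely.
            self.accum = 0.0
            return None
        if self.accum < MIN_QUANTUM:
            # Not enough time accumulated yet: render only, no physics step.
            return None
        # Clamp the step so one slow frame cannot produce a huge jump.
        step = min(self.accum, MAX_QUANTUM)
        self.accum = 0.0  # ASSUMPTION: any remainder above the clamp is discarded
        return step
```

Fed 60 frames of 1/60 s each, this sketch integrates on every second
frame, i.e. 30 physics steps for 60 render frames.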

═══════════════════════════════════════════════════════════
2. Phase 3 reset: retail-faithful kill_velocity (TransitionTypes)
═══════════════════════════════════════════════════════════

Retail's reset path (acclient_2013_pseudo_c.txt:273231-273239) gates
kill_velocity on `last_known_contact_plane_valid`:

  if (last_known_valid == 0) {
      set_collision_normal(step_up_normal); return COLLIDED;
  }
  kill_velocity(this);
  last_known_valid = 0;
  return COLLIDED;

Earlier in this session I deviated to an unconditional kill_velocity as
a hypothesis-driven wedge fix. The live trace then showed that the
deviation CAUSED a different wedge: it zeroed V every frame, leaving
the body with no tangent momentum to escape (V = (0,0,0) for 169
consecutive frames while the pre-resolve and resolved positions stayed
frozen). The retail-faithful gate is restored.
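
For illustration, the gated reset can be modelled in a few lines of
Python. This is a sketch of the pseudocode above, not the ported
TransitionTypes code; ResetState and its field layout are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

Vec3 = Tuple[float, float, float]
COLLIDED = "COLLIDED"


@dataclass
class ResetState:
    """Hypothetical container for the fields the reset path touches."""
    velocity: Vec3 = (0.0, 0.0, 0.0)
    last_known_valid: bool = False
    step_up_normal: Vec3 = (0.0, 0.0, 1.0)
    collision_normal: Optional[Vec3] = None


def phase3_reset(s: ResetState) -> str:
    """Mirror of the retail gate: only kill velocity when the flag is set."""
    if not s.last_known_valid:
        # No remembered contact plane: report the collision, keep velocity.
        s.collision_normal = s.step_up_normal
        return COLLIDED
    s.velocity = (0.0, 0.0, 0.0)  # kill_velocity
    s.last_known_valid = False
    return COLLIDED
```

With the flag clear, velocity survives the reset, which is what lets an
airborne body keep its tangent momentum.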

Note: the gate rarely fires in normal airborne play because our L.2.4
proximity guard clears last_known_valid soon after the body separates
from its remembered floor. The live retail trace likewise recorded 0
kill_velocity hits over an entire play session, so acdream's
kill_velocity now matches retail as ported.

The supporting pieces (the ObjectInfo.VelocityKilled flag, the
StopVelocity wiring, and the PhysicsEngine.ResolveWithTransition
consumer that actually zeros body.Velocity when the flag is set) were a
no-op stub before this session and are now correctly wired. Retail
anchor: OBJECTINFO::kill_velocity → CPhysicsObj::set_velocity({0,0,0}, 0)
at acclient_2013_pseudo_c.txt:274467-274475.

═══════════════════════════════════════════════════════════
3. Retail debugger toolchain (#35)
═══════════════════════════════════════════════════════════

When the question is "what does retail actually DO at runtime?" — not
"what does retail's code SAY" — the decomp at docs/research/named-retail/
is invaluable but doesn't capture state interactions across frames.
This commit ships infrastructure to attach Windows' cdb.exe to a live
retail acclient.exe with full PDB symbols and capture state at any
breakpoint.

  - tools/pdb-extract/check_exe_pdb.py — reads any PE's CodeView entry
    and reports MATCH / MISMATCH against refs/acclient.pdb's GUID.
    Always run before attaching cdb. The matching v11.4186 build's
    GUID is 9e847e2f-777c-4bd9-886c-22256bb87f32.

  - tools/pdb-extract/dump_pdb_info.py — dumps a PDB's expected
    build timestamp + GUID + age. Used to figure out which acclient.exe
    build pairs with our PDB.

CLAUDE.md gets a Step -1 in the development workflow ("ATTACH cdb
TO RETAIL when behavior is the question, not code") and a full
"Retail debugger toolchain" section with the workflow, sample .cdb
script structure, and watchouts: PDB names use snake_case for some
classes and PascalCase for CPhysicsObj; `;` is cdb's command separator;
killing cdb kills the debuggee; high-hit-rate breakpoints lag the game.

memory/project_retail_debugger.md captures the workflow + key findings
so future sessions inherit the toolchain by reading project memory.

═══════════════════════════════════════════════════════════
4. BSPQuery Path 6 slide-tangent restored (b1af56e behavior)
═══════════════════════════════════════════════════════════

This session's retail-strict experiments showed that retail-faithful
Path 6 (SetCollide + Phase 3 reset chain) produces a "lands on roof in
falling animation, can't slide off" half-state in acdream, because our
port of step_up_slide / cliff_slide is incomplete for grounded-on-steep
movement. The L.4 slide-tangent deviation from commit b1af56e is
therefore restored as the pragmatic ship state.

The deviation: when an airborne sphere hits a polygon whose normal Z
is below FloorZ (≈ 0.6642, i.e. slope steeper than ~48.4°), project the
move along the steep face to remove the into-wall displacement, set
CollisionNormal + SlidingNormal, and return Slid. The body never gets a
ContactPlane on the steep poly, never enters the half-state, and slides
off the slope under gravity's tangent contribution.
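
A minimal Python sketch of the steepness test and the projection, under
stated assumptions: FLOOR_Z is the approximate value quoted above, the
face normal is unit length, and the function names are hypothetical
rather than the BSPQuery API.

```python
import math

FLOOR_Z = 0.6642  # approximate value quoted in the commit text


def slope_degrees(normal_z: float) -> float:
    """Angle between a face with this normal Z and the horizontal."""
    return math.degrees(math.acos(normal_z))


def is_steep(normal) -> bool:
    """True when the face is too steep to count as floor."""
    return normal[2] < FLOOR_Z


def slide_along(move, normal):
    """Remove the into-face component of `move`; `normal` must be unit length."""
    into = sum(m * n for m, n in zip(move, normal))
    if into >= 0.0:
        return tuple(move)  # already moving along or away from the face
    return tuple(m - into * n for m, n in zip(move, normal))
```

For a face with normal (0.8, 0, 0.6), a falling move of (-1, 0, -1)
keeps only its component along the face, so the body continues downhill
instead of stopping on the polygon.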

Retail-strict requires the deeper step_up_slide / cliff_slide audit
(filed under #32). Until that lands, slide-tangent is the right
deviation: it produces the user-acceptable "slide off the roof" behavior.

═══════════════════════════════════════════════════════════
Test status: 833/833 green.

Refs:
  acclient_2013_pseudo_c.txt:283950 (CPhysicsObj::update_object)
  acclient_2013_pseudo_c.txt:273231-273239 (Phase 3 reset path)
  acclient_2013_pseudo_c.txt:274467-274475 (OBJECTINFO::kill_velocity)
  acclient_2013_pseudo_c.txt:323783-323821 (BSPTREE::find_collisions Path 6)

Closes #35. Updates #32 with L.4/L.5 status.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Erik 2026-04-30 22:41:12 +02:00
parent b1af56eb19
commit 235de3322a
8 changed files with 624 additions and 82 deletions

CLAUDE.md

@@ -164,10 +164,22 @@ The triangle-boundary Z bug cost 5 failed fix attempts from guessing.
The animation frame-swap bug cost 4 failed attempts. Every time we
checked the decompiled code first, we got it right on the first try.
**Now we have named retail symbols too — Step 0 cuts most lookups
from 30 minutes to 5 seconds. And as of 2026-04-30, when "what does
retail actually DO at runtime?" is the question and decomp alone
isn't enough, attach cdb to a live retail client (Step -1).**
### For each new feature or bug fix:
-1. **ATTACH cdb TO RETAIL (when behavior is the question, not code).**
For "what does retail actually DO frame-by-frame?" questions —
wedges, weird animation flicker, geometry-specific bugs, anything
where the decomp is correct but it's not clear how it produces the
visible behavior — **don't guess; attach the Windows debugger to
a live retail client and trace it.** See "Retail debugger toolchain"
below for setup. We discovered the steep-roof wedge had a 30Hz
physics-tick cause this way; it would have taken weeks of guessing
without the trace.
0. **GREP NAMED FIRST.** Before any decompilation work, search
`docs/research/named-retail/acclient_2013_pseudo_c.txt` by
`class::method` name. 99.6% of functions have real names from the
@@ -249,6 +261,124 @@ Before marking any phase as done:
- [ ] Roadmap updated
- [ ] Memory updated if there's a durable lesson
## Retail debugger toolchain (live runtime trace)
**When the question is "what does retail actually DO frame-by-frame?"**
the decomp alone is often not enough — code paths interact with state
(LastKnownContactPlane, transient flags, accumulated counters) in ways
that aren't obvious from reading. As of 2026-04-30 we have a working
toolchain to attach Windows' console debugger (cdb.exe) to a live
retail acclient.exe with full PDB symbols and capture state at any
breakpoint. **Use this when guessing has failed twice in a row.**
### What we have
- **Matching binary**: `C:\Turbine\Asheron's Call\acclient.exe`
v11.4186 (linker timestamp `2013-09-06 00:17:42 UTC`,
CodeView GUID `9e847e2f-777c-4bd9-886c-22256bb87f32`). Pairs
exactly with our `refs/acclient.pdb`.
- **Debugger**: `cdb.exe` (console WinDbg) at
`C:\Program Files (x86)\Windows Kits\10\Debuggers\x86\cdb.exe`.
Install via Microsoft Store WinDbg (~50 MB). 32-bit version is
required for acclient.exe.
- **PDB**: `refs/acclient.pdb` (29 MB, Sept 2013 EoR build).
18,366 named functions + 5,371 named struct types resolve.
- **Symbol verifier**: `tools/pdb-extract/check_exe_pdb.py <exe>`
reads any acclient.exe and prints whether it pairs with our PDB
(`MATCH` / `MISMATCH (expected GUID = ...)`). Always run this on
a candidate binary BEFORE attaching.
- **PDB metadata dumper**: `tools/pdb-extract/dump_pdb_info.py refs/acclient.pdb`
prints the PDB's expected timestamp + GUID + age. Use to figure
out which build to look for if the chain ever breaks.
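
The MATCH / MISMATCH check comes down to comparing the GUID stored in
the exe's CodeView (`RSDS`) record with the one in the PDB. The sketch
below parses only that record; it is not the code in
`tools/pdb-extract/check_exe_pdb.py`, the function names are
hypothetical, and locating the record through the PE debug directory is
left out.

```python
import struct
import uuid

# GUID of the matching v11.4186 build, as quoted above.
EXPECTED_GUID = uuid.UUID("9e847e2f-777c-4bd9-886c-22256bb87f32")


def parse_rsds(record: bytes):
    """Parse a CodeView RSDS record: signature, GUID, age, PDB path."""
    if record[:4] != b"RSDS":
        raise ValueError("not an RSDS CodeView record")
    guid = uuid.UUID(bytes_le=record[4:20])           # 16-byte GUID, mixed-endian
    (age,) = struct.unpack_from("<I", record, 20)     # 4-byte little-endian age
    pdb_path = record[24:].split(b"\x00", 1)[0].decode("utf-8", "replace")
    return guid, age, pdb_path


def verdict(record: bytes) -> str:
    """Return a MATCH / MISMATCH string in the style described above."""
    guid, _age, _path = parse_rsds(record)
    if guid == EXPECTED_GUID:
        return "MATCH"
    return f"MISMATCH (expected GUID = {EXPECTED_GUID})"
```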
### Workflow
1. **Verify the binary matches the PDB:**
```bash
py tools/pdb-extract/check_exe_pdb.py "C:/Turbine/Asheron's Call/acclient.exe"
```
Expect: `=== MATCH: this exe pairs with our acclient.pdb ===`
2. **Have the user launch retail client** and connect to local ACE.
Retail must already be in-world before attaching.
3. **Write a `.cdb` script** that arms breakpoints with non-blocking
actions (count + log + `gc`). Pattern:
```
.logopen <output-path>
.sympath C:\Users\erikn\source\repos\acdream\refs
.symopt+ 0x40
.reload /f acclient.exe
r $t0 = 0
bp acclient!CTransition::transitional_insert "r $t0 = @$t0 + 1; .if (@$t0 % 5000 == 0) { .printf \"...\" }; .if (@$t0 >= 30000) { qd } .else { gc }"
bp acclient!OBJECTINFO::kill_velocity "r $t1 = @$t1 + 1; gc"
...
g
```
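`gc` = "go from conditional breakpoint": it resumes execution the
same way the target was running when the breakpoint hit, so the
action logs without stopping. Auto-detach via `qd` (quit and detach)
after a hit-count threshold to avoid manual cleanup.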
4. **Launch cdb in the background** via a PowerShell wrapper:
```powershell
& "C:\Program Files (x86)\Windows Kits\10\Debuggers\x86\cdb.exe" `
-pn acclient.exe -cf <script>.cdb *>&1 |
Tee-Object -FilePath <log>
```
5. **User reproduces the scenario** in the retail client window
(jump on roof, hit wall, etc.). Breakpoints fire, log fills.
6. **cdb auto-detaches** when the threshold breakpoint fires `qd`.
Retail keeps running unaffected. Read the log offline.
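
The script from step 3 can also be generated rather than hand-written.
The helper below is a hypothetical sketch (the `build_cdb_script` name
and its parameters are not part of the repo); it emits only the cdb
commands already shown in the pattern above, with one counter
pseudo-register per breakpoint and auto-detach driven by the first one.

```python
def build_cdb_script(log_path, sympath, symbols, detach_after):
    """Return the text of a .cdb script that counts hits on each symbol."""
    if not 1 <= len(symbols) <= 20:
        # cdb only provides the pseudo-registers $t0 .. $t19.
        raise ValueError("need between 1 and 20 symbols")
    lines = [
        f".logopen {log_path}",
        f".sympath {sympath}",
        ".symopt+ 0x40",
        ".reload /f acclient.exe",
    ]
    lines += [f"r $t{i} = 0" for i in range(len(symbols))]
    for i, sym in enumerate(symbols):
        count = f"r $t{i} = @$t{i} + 1"
        if i == 0:
            # The first breakpoint also owns the auto-detach threshold.
            action = (f"{count}; .if (@$t0 >= {detach_after}) "
                      "{ qd } .else { gc }")
        else:
            action = f"{count}; gc"
        lines.append(f'bp acclient!{sym} "{action}"')
    lines.append("g")
    return "\n".join(lines) + "\n"
```

Writing the returned string to a file gives a script that can be passed
to cdb with `-cf` as in step 4.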
### Known watchouts
- **PDB function names use snake_case for some classes**
(BSPTREE, CTransition, OBJECTINFO, COLLISIONINFO, SPHEREPATH) and
**PascalCase for others** (CPhysicsObj). The Binary Ninja decomp
shows snake_case for everything; the PDB has Turbine's actual
PascalCase for CPhysicsObj. Always look up symbols with `x` first
to find the actual name.
- **`bp acclient!Class::method`** sets a breakpoint by symbol. cdb
separates commands with `;`, so the action must be wrapped in double
quotes: inside the quotes `;` separates the action's own commands, and
any nested quote must be escaped as `\"`.
- **Symbol path: do NOT use `.sympath srv*<server>;<local>`** — the
`;` is a cdb command separator, gets split. Use `.sympath <local>`
(no symbol server, just our refs/) since we don't need Microsoft
system DLL symbols.
- **Killing cdb kills the debuggee.** Use `qd` (quit and detach) inside
a breakpoint action to detach cleanly. `Stop-Process cdb` will
take the retail client down with it.
- **High breakpoint hit rates produce game lag.** Each breakpoint hit
traps the process briefly. For frequent functions
(transitional_insert at ~10K/sec) the cumulative cost is enough to
make retail feel sluggish. Mitigate by setting a tight auto-detach
threshold (e.g., 30,000 hits) and/or moving counters to less-frequent
functions.
- **acclient.exe is 32-bit and uses thiscall.** When dumping struct
fields in breakpoint actions, `this` is in `ecx` at function entry.
Use cdb's `dt acclient!ClassName @ecx` for a full struct dump.
### When NOT to use this
- **Pure code-port questions** — the decomp at `docs/research/named-retail/`
has the answer. Don't waste time on cdb if `grep` is enough.
- **Visual / rendering bugs** — debugger doesn't help with shaders or
framebuffers; use RenderDoc or similar.
- **Network protocol questions**: `holtburger` references + ACE source
+ Wireshark are the right tools, not cdb.
This toolchain was used to settle the L.5 steep-roof investigation:
30Hz physics tick (vs our 60Hz), `kill_velocity` gating,
`set_collide` rate per minute. See commit history around 2026-04-30
for the trace data and the decisions it drove.
## Subagent policy
Subagents are the primary tool for saving parent-context and keeping one