docs(research): handoff for issue #77 — close-range auto-walk + pickup overshoot

Self-contained handoff doc for a follow-up focused session that fixes #77.
Captures: the two bugs (NPC at walking range never auto-walks; pickup at
walking range overshoots and snaps back), the trace evidence from Step 2's
verification run (cmd=0x0005 speed=-1.84 from ACE), four ranked
root-cause hypotheses (H1 missing BeginServerAutoWalk fire / H2
walk-run-threshold misclassification / H3 negative ForwardSpeed
sign-interpretation / H4 arrival predicate firing too early), the
reproduction recipe with ACDREAM_PROBE_AUTOWALK=1, acceptance criteria,
and the "don't workaround, fix root cause" guardrail.

The auto-walk diagnostic infrastructure already exists from Phase B.6
work — the next session just turns it on and reads the trace.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This commit is contained in:
Erik 2026-05-18 08:38:31 +02:00
parent f8829b39be
commit 677266d477

View file

@ -0,0 +1,284 @@
# Issue #77 — close-range auto-walk + pickup overshoot — investigation handoff
**Filed:** 2026-05-16 (in `docs/ISSUES.md` as the active issue at the top)
**Severity:** MEDIUM (M1-deferred polish; visible during normal play, doesn't block any phase)
**Component:** physics / auto-walk / `PlayerMovementController.DriveServerAutoWalk`
**Branch state when handed off:** main at `f8829b3` (post-merge of `claude/hungry-tharp-b4a27b`)
---
## What you're chasing
Two related close-range bugs in the server-driven auto-walk path. Both are
**pre-existing** — not caused by the LiveSessionController extraction
(0b25df5) — they were surfaced during that refactor's visual verification.
### Bug A — NPC at walking range never auto-walks
- User clicks an NPC (e.g. Royal Guard at `0x7A9B46AE`) when the player
is at "walking range" — far enough that retail would walk a short
distance to reach the NPC's `useRadius`, not close enough to fire
Use immediately.
- The client's `WorldPicker` returns `WithinUseRadius=false`, so
`OnInputAction.UseSelected` defers the Use and would expect ACE's
inbound `MoveToObject` motion update to drive the player to the NPC.
- **The local player does not visibly move.** Repeated clicks (the trace
below shows seq 81 → 87 → 90 → 96 → 105 → 141 → 146 → 159 → 163 →
169 → 173 → 177 against the same Royal Guard) produce the same
response every time without any movement.
### Bug B — Pickup at walking range runs/overshoots/snaps back
- User presses F on a ground item while in "walking range" of it.
- Player **runs** (not walks) toward the target.
- Overshoots the item, then **blips back** to the correct position
before the pickup actually fires.
- The pickup completes (item ends up in inventory), but the visual is
jarring.
The "blips back" almost certainly means ACE's server-side position
correction snaps the player back after the client overshot. The
client's run-not-walk choice is the proximal cause.
---
## What we know already (don't re-discover this)
### Trace evidence captured during merge
From `launch.log` of the Step 2 verification run (task `b01zkw68w`,
2026-05-16, with `ACDREAM_DEVTOOLS=1` + my temporary
`OnLiveMotionUpdated` diagnostic):
```
[B.4b] use guid=0x7A9B46AE seq=159 ← outbound Use packet sent
OnLiveMotionUpdated: guid=0x5000000A stance=61 cmd=0x speed= ← player → NonCombat
OnLiveMotionUpdated: guid=0x7A9B46AE stance=61 cmd=0x0003 speed= ← NPC turns to face
OnLiveMotionUpdated: guid=0x7A9B46AE stance=61 cmd=0x speed=
OnLiveMotionUpdated: guid=0x5000000A stance=0 cmd=0x0005 speed=-1.84 ← ACE sends MoveToObject for player
OnLiveMotionUpdated: guid=0x5000000A stance=0 cmd=0x speed=
```
The pattern repeats identically for every retry — ACE *is* sending the
auto-walk command, but the client isn't engaging it.
**The negative speed (-1.84) is suspicious.** Speed is parsed as a raw
IEEE 754 float by `UpdateMotion.cs:193`. Either retail encodes a sign
that we're misinterpreting, or this is a legitimate "move backward"
instruction (ACE sometimes positions the move-to point behind the
player). The auto-walk engagement condition may be filtering negative
speeds out without our intent.
### What was already established BEFORE this issue
`PlayerMovementController` got significant retail-faithful refactor work
in Phase B.6 (closed-issue #75 territory, commit `f035ea3`). That work
established:
- **Walk/run threshold = 1.0m** of remaining-distance-to-useRadius
(not ACE's wire-supplied 15m default — that's overridden).
- **One-shot walk/run decision** at `BeginServerAutoWalk` time, held
for the rest of the chain.
- **Direct body-velocity drive** — auto-walk does NOT synthesize
`MovementInput`. It steps `Yaw`, sets `_body.set_local_velocity`
from `runRate`, and calls `_motion.DoMotion(WalkForward, speed)`
directly.
The auto-walk diagnostic infrastructure already exists:
```
PhysicsDiagnostics.ProbeAutoWalkEnabled ← runtime-toggleable
ACDREAM_PROBE_AUTOWALK=1 ← env-var enable
[autowalk-out] on every SendUse / SendPickUp
[autowalk-mt] on every inbound UpdateMotion for the local player
[autowalk-up] on every inbound UpdatePosition for the local player
[autowalk-begin] when BeginServerAutoWalk fires
[autowalk-end] when EndServerAutoWalk fires
```
**You should turn this on first.** The `[autowalk-begin]` line will tell
you whether `BeginServerAutoWalk` is even being invoked for the
walking-range case.
### Where to start reading code
| File | Why |
|---|---|
| `src/AcDream.App/Input/PlayerMovementController.cs` | The auto-walk driver lives here. Key functions: `BeginServerAutoWalk` (line ~428), `DriveServerAutoWalk` (line ~550), `EndServerAutoWalk` (line ~478). |
| `src/AcDream.App/Rendering/GameWindow.cs` line ~3360 | The `OnLiveMotionUpdated` site that detects MoveToObject pattern and calls `BeginServerAutoWalk`. The `[autowalk-mt]` and `[autowalk-begin]` traces fire here. |
| `src/AcDream.Core.Net/Messages/UpdateMotion.cs` line ~193 | The inbound parser. `ForwardSpeed` is a raw float — investigate whether negative is legitimate or a sign-misinterpretation. |
| `docs/superpowers/specs/2026-05-14-phase-b6-design.md` | The Phase B.6 design spec. Read this first to understand the existing auto-walk contract. |
| `references/holtburger/crates/holtburger-core/src/client/simulation.rs` | The Rust client's equivalent — has `ServerControlledProjection` + `approximate_move_to_object_projection_target`. Holtburger handles this case correctly, so cross-checking is valuable. |
| `docs/research/named-retail/acclient_2013_pseudo_c.txt` | Grep for `MoveToManager::HandleMoveToPosition`, `MoveToManager::HandleAutonomyLevelChange`, `CMotionInterp::apply_interpreted_movement`. Retail's truth. |
### What was checked and ruled out during the Step 2 session
- The bugs exist on the **pre-Step-2 branch** (eda936d / 32423c2 / main).
This was confirmed by diff scope: `PlayerMovementController.cs`,
`PhysicsEngine.cs`, `UpdateMotion.cs` were not touched by Step 2.
- The Step 2 refactor (`0b25df5`) does not affect the auto-walk path.
- Subscriptions are wired correctly — `OnLiveMotionUpdated` IS firing
for every motion update (verified via `[step2-diag]` traces that have
since been stripped).
---
## Hypotheses to test, in order
### H1 (most likely) — `BeginServerAutoWalk` never fires for the walking-range MoveToObject
The walking-range MoveToObject from ACE may not match the pattern that
`OnLiveMotionUpdated` checks before calling `BeginServerAutoWalk`. The
condition probably checks for one of: `IsServerControlledMoveTo`,
non-zero ForwardSpeed magnitude, specific `MovementType`, or specific
`ForwardCommand` values. Walking-range UpdateMotion may differ from
running-range in one of those fields.
**Test:** Enable `ACDREAM_PROBE_AUTOWALK=1`, click the NPC at walking
range. Look for `[autowalk-mt]` (inbound parse) WITHOUT a following
`[autowalk-begin]`. That confirms H1 and points to GameWindow.cs:3360.
### H2 — `BeginServerAutoWalk` fires but `_autoWalkInitiallyRunning` decision misclassifies
The walk/run decision uses:
```csharp
remainingAtStart = initialDist - distanceToObject
_autoWalkInitiallyRunning = remainingAtStart >= 1.0m
```
If ACE sends a `distanceToObject` (useRadius) much smaller than the
NPC's actual useRadius — or if `initialDist` is computed against the
wrong target position — `remainingAtStart` could land just above 1m
even at user-perceived walking range, causing run-not-walk. That
matches **Bug B**'s "runs and overshoots" pattern.
**Test:** Compare `[autowalk-begin] dest=(...) minDist=... objDist=... walkRunThresh=...`
values between a walking-range click and a running-range click. The
`objDist` should be the wire-supplied useRadius. If it's wrong (too
small), retail's value disagrees and we have a parser bug elsewhere.
### H3 — Negative `ForwardSpeed` is filtered or misinterpreted
`speed=-1.84` is the literal IEEE 754 float on the wire. Retail's
`CMotionInterp::handle_action_walkforward` (or whichever code consumes
ForwardSpeed) may use it for direction relative to the auto-walk
heading; a sign-extension bug in our parse would matter.
**Test:** Grep `references/holtburger` and named-retail decomp for how
`ForwardSpeed` is consumed. If retail/holtburger interpret the sign
specially and we don't, that's the gap.
### H4 — Arrival predicate fires too early
`DriveServerAutoWalk` line ~601:
```csharp
withinArrival = dist <= arrivalThreshold
```
where `arrivalThreshold = _autoWalkDistanceToObject` (use-radius).
If `distanceToObject` is 0 or near-zero (a parser bug, see H2), the
arrival predicate fires on the first frame and `EndServerAutoWalk("arrived")`
is called immediately, so the player never visibly moves. That matches
**Bug A** exactly.
**Test:** Look for `[autowalk-end] reason=arrived` immediately after
`[autowalk-begin]` with zero or one frame between. That confirms H4.
---
## Reproduction recipe (~3 minutes)
1. **Launch with autowalk probe enabled:**
```powershell
$env:ACDREAM_DAT_DIR = "$env:USERPROFILE\Documents\Asheron's Call"
$env:ACDREAM_LIVE = "1"
$env:ACDREAM_TEST_HOST = "127.0.0.1"
$env:ACDREAM_TEST_PORT = "9000"
$env:ACDREAM_TEST_USER = "testaccount"
$env:ACDREAM_TEST_PASS = "testpassword"
$env:ACDREAM_DEVTOOLS = "1"
$env:ACDREAM_PROBE_AUTOWALK = "1"
dotnet run --project src\AcDream.App\AcDream.App.csproj --no-build -c Debug | Tee-Object -FilePath launch.log
```
2. **Reproduce Bug A:**
- Walk toward the inn area in Holtburg until you're ~3-5 meters from
an NPC (e.g. Royal Guard near the inn). Estimate by eye — the goal
is "you can see the NPC clearly but you'd need to take a few steps
to reach them."
- Double-click the NPC.
- Observe: player doesn't move. Click again — same result.
3. **Reproduce Bug B:**
- Find a ground item (Holtburg has scattered spell components — the
coloured Tapers are obvious). Stand ~3-5 meters away.
- Press F (or whatever your `SelectionPickUp` key is bound to).
- Observe: player runs, overshoots, snaps back, item picked up.
4. **Stop the client gracefully** (window close, not Stop-Process — see
CLAUDE.md "Logout-before-reconnect"). ACE clears stale sessions in
35 seconds on graceful close.
5. **Grep the log:**
```bash
tr -d '\000' < launch.log | grep -E "\[autowalk-(out|mt|begin|up|end)\]"
```
This should give you a complete frame-by-frame trace of every
auto-walk decision the client made.
---
## Acceptance criteria
When the fix lands:
- ✅ Click NPC at walking range → player **walks** (not runs) directly to NPC,
Use fires on arrival, NPC dialogue appears.
- ✅ Press F on ground item at walking range → player **walks** the short
distance, no overshoot, no blip-back, item enters inventory.
- ✅ Far-range click still **runs** to target (don't regress the working case).
- ✅ Out-of-walking-but-very-close-range case (right at the edge of useRadius)
still arrives without infinite spin or stuttering.
- ✅ All existing tests pass (8 pre-existing Core failures are baseline,
do NOT count against the fix).
- ✅ Visual verification at Holtburg, all three M1 demo targets still work
(door, NPC, pickup).
---
## What NOT to do
- **Don't add a workaround**, per CLAUDE.md's "no workarounds" rule.
No grace-period band-aid, no "if speed is negative, force walk" hack,
no "always walk when within 5m" override. Fix the root cause.
- **Don't rewrite the auto-walk path** — Phase B.6 was a heavy retail
decomp port. The fix is almost certainly a one-condition or one-formula
adjustment, not a new design.
- **Don't change `Step 2`'s extracted code**. `LiveSessionController` and
the wireup are clean — `OnLiveMotionUpdated` is wired and firing per
the previous session's verification traces.
---
## Time estimate
- 30 min: read the spec + reproduce + capture trace
- 30 min: identify root-cause hypothesis from the trace
- 30 min 2 hr: implement the fix (depends on which hypothesis lands)
- 30 min: visual verification + write commit message
- **Total:** 23 hours focused work
---
## When done
1. Commit message format: `fix(physics): close #77 — <root cause summary>`
2. Move `#77` to the "Recently closed" section of `docs/ISSUES.md`
with closed-date + commit SHA (matches the project convention).
3. If the fix uncovered a durable lesson (e.g. "ACE sends negative
ForwardSpeed for MoveToObject; we were filtering"), add a
`feedback_*.md` memory entry per `memory/MEMORY.md` conventions.
4. The next pre-M2 cleanup items in queue: root `.editorconfig` + Step 3
(`LiveEntityRuntime`). See `docs/architecture/code-structure.md` §4.