acdream/docs/research/2026-05-16-issue77-autowalk-handoff.md
Erik 677266d477 docs(research): handoff for issue #77 — close-range auto-walk + pickup overshoot
Self-contained handoff doc for a follow-up focused session that fixes #77.
Captures: the two bugs (NPC at walking range never auto-walks; pickup at
walking range overshoots and snaps back), the trace evidence from Step 2's
verification run (cmd=0x0005 speed=-1.84 from ACE), four ranked
root-cause hypotheses (H1 missing BeginServerAutoWalk fire / H2
walk-run-threshold misclassification / H3 negative ForwardSpeed
sign-interpretation / H4 arrival predicate firing too early), the
reproduction recipe with ACDREAM_PROBE_AUTOWALK=1, acceptance criteria,
and the "don't workaround, fix root cause" guardrail.

The auto-walk diagnostic infrastructure already exists from Phase B.6
work — the next session just turns it on and reads the trace.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-18 08:38:31 +02:00

284 lines
13 KiB
Markdown
Raw Blame History

This file contains ambiguous Unicode characters

This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.

# Issue #77 — close-range auto-walk + pickup overshoot — investigation handoff
**Filed:** 2026-05-16 (in `docs/ISSUES.md` as the active issue at the top)
**Severity:** MEDIUM (M1-deferred polish; visible during normal play, doesn't block any phase)
**Component:** physics / auto-walk / `PlayerMovementController.DriveServerAutoWalk`
**Branch state when handed off:** main at `f8829b3` (post-merge of `claude/hungry-tharp-b4a27b`)
---
## What you're chasing
Two related close-range bugs in the server-driven auto-walk path. Both are
**pre-existing** — not caused by the LiveSessionController extraction
(0b25df5) — they were surfaced during that refactor's visual verification.
### Bug A — NPC at walking range never auto-walks
- User clicks an NPC (e.g. Royal Guard at `0x7A9B46AE`) when the player
is at "walking range" — far enough that retail would walk a short
distance to reach the NPC's `useRadius`, not close enough to fire
Use immediately.
- The client's `WorldPicker` returns `WithinUseRadius=false`, so
`OnInputAction.UseSelected` defers the Use and would expect ACE's
inbound `MoveToObject` motion update to drive the player to the NPC.
- **The local player does not visibly move.** Repeated clicks (the trace
below shows seq 81 → 87 → 90 → 96 → 105 → 141 → 146 → 159 → 163 →
169 → 173 → 177 against the same Royal Guard) produce the same
response every time without any movement.
### Bug B — Pickup at walking range runs/overshoots/snaps back
- User presses F on a ground item while in "walking range" of it.
- Player **runs** (not walks) toward the target.
- Overshoots the item, then **blips back** to the correct position
before the pickup actually fires.
- The pickup completes (item ends up in inventory), but the visual is
jarring.
The "blips back" almost certainly means ACE's server-side position
correction snaps the player back after the client overshot. The
client's run-not-walk choice is the proximal cause.
---
## What we know already (don't re-discover this)
### Trace evidence captured during merge
From `launch.log` of the Step 2 verification run (task `b01zkw68w`,
2026-05-16, with `ACDREAM_DEVTOOLS=1` + my temporary
`OnLiveMotionUpdated` diagnostic):
```
[B.4b] use guid=0x7A9B46AE seq=159 ← outbound Use packet sent
OnLiveMotionUpdated: guid=0x5000000A stance=61 cmd=0x speed= ← player → NonCombat
OnLiveMotionUpdated: guid=0x7A9B46AE stance=61 cmd=0x0003 speed= ← NPC turns to face
OnLiveMotionUpdated: guid=0x7A9B46AE stance=61 cmd=0x speed=
OnLiveMotionUpdated: guid=0x5000000A stance=0 cmd=0x0005 speed=-1.84 ← ACE sends MoveToObject for player
OnLiveMotionUpdated: guid=0x5000000A stance=0 cmd=0x speed=
```
The pattern repeats identically for every retry — ACE *is* sending the
auto-walk command, but the client isn't engaging it.
**The negative speed (-1.84) is suspicious.** Speed is parsed as a raw
IEEE 754 float by `UpdateMotion.cs:193`. Either retail encodes a sign
that we're misinterpreting, or this is a legitimate "move backward"
instruction (ACE sometimes positions the move-to point behind the
player). The auto-walk engagement condition may be filtering negative
speeds out without our intent.
### What was already established BEFORE this issue
`PlayerMovementController` got significant retail-faithful refactor work
in Phase B.6 (closed-issue #75 territory, commit `f035ea3`). That work
established:
- **Walk/run threshold = 1.0m** of remaining-distance-to-useRadius
(not ACE's wire-supplied 15m default — that's overridden).
- **One-shot walk/run decision** at `BeginServerAutoWalk` time, held
for the rest of the chain.
- **Direct body-velocity drive** — auto-walk does NOT synthesize
`MovementInput`. It steps `Yaw`, sets `_body.set_local_velocity`
from `runRate`, and calls `_motion.DoMotion(WalkForward, speed)`
directly.
The auto-walk diagnostic infrastructure already exists:
```
PhysicsDiagnostics.ProbeAutoWalkEnabled ← runtime-toggleable
ACDREAM_PROBE_AUTOWALK=1 ← env-var enable
[autowalk-out] on every SendUse / SendPickUp
[autowalk-mt] on every inbound UpdateMotion for the local player
[autowalk-up] on every inbound UpdatePosition for the local player
[autowalk-begin] when BeginServerAutoWalk fires
[autowalk-end] when EndServerAutoWalk fires
```
**You should turn this on first.** The `[autowalk-begin]` line will tell
you whether `BeginServerAutoWalk` is even being invoked for the
walking-range case.
### Where to start reading code
| File | Why |
|---|---|
| `src/AcDream.App/Input/PlayerMovementController.cs` | The auto-walk driver lives here. Key functions: `BeginServerAutoWalk` (line ~428), `DriveServerAutoWalk` (line ~550), `EndServerAutoWalk` (line ~478). |
| `src/AcDream.App/Rendering/GameWindow.cs` line ~3360 | The `OnLiveMotionUpdated` site that detects MoveToObject pattern and calls `BeginServerAutoWalk`. The `[autowalk-mt]` and `[autowalk-begin]` traces fire here. |
| `src/AcDream.Core.Net/Messages/UpdateMotion.cs` line ~193 | The inbound parser. `ForwardSpeed` is a raw float — investigate whether negative is legitimate or a sign-misinterpretation. |
| `docs/superpowers/specs/2026-05-14-phase-b6-design.md` | The Phase B.6 design spec. Read this first to understand the existing auto-walk contract. |
| `references/holtburger/crates/holtburger-core/src/client/simulation.rs` | The Rust client's equivalent — has `ServerControlledProjection` + `approximate_move_to_object_projection_target`. Holtburger handles this case correctly, so cross-checking is valuable. |
| `docs/research/named-retail/acclient_2013_pseudo_c.txt` | Grep for `MoveToManager::HandleMoveToPosition`, `MoveToManager::HandleAutonomyLevelChange`, `CMotionInterp::apply_interpreted_movement`. Retail's truth. |
### What was checked and ruled out during the Step 2 session
- The bugs exist on the **pre-Step-2 branch** (eda936d / 32423c2 / main).
This was confirmed by diff scope: `PlayerMovementController.cs`,
`PhysicsEngine.cs`, `UpdateMotion.cs` were not touched by Step 2.
- The Step 2 refactor (`0b25df5`) does not affect the auto-walk path.
- Subscriptions are wired correctly — `OnLiveMotionUpdated` IS firing
for every motion update (verified via `[step2-diag]` traces that have
since been stripped).
---
## Hypotheses to test, in order
### H1 (most likely) — `BeginServerAutoWalk` never fires for the walking-range MoveToObject
The walking-range MoveToObject from ACE may not match the pattern that
`OnLiveMotionUpdated` checks before calling `BeginServerAutoWalk`. The
condition probably checks for one of: `IsServerControlledMoveTo`,
non-zero ForwardSpeed magnitude, specific `MovementType`, or specific
`ForwardCommand` values. Walking-range UpdateMotion may differ from
running-range in one of those fields.
**Test:** Enable `ACDREAM_PROBE_AUTOWALK=1`, click the NPC at walking
range. Look for `[autowalk-mt]` (inbound parse) WITHOUT a following
`[autowalk-begin]`. That confirms H1 and points to GameWindow.cs:3360.
### H2 — `BeginServerAutoWalk` fires but `_autoWalkInitiallyRunning` decision misclassifies
The walk/run decision uses:
```csharp
remainingAtStart = initialDist - distanceToObject
_autoWalkInitiallyRunning = remainingAtStart >= 1.0m
```
If ACE sends a `distanceToObject` (useRadius) much smaller than the
NPC's actual useRadius — or if `initialDist` is computed against the
wrong target position — `remainingAtStart` could land just above 1m
even at user-perceived walking range, causing run-not-walk. That
matches **Bug B**'s "runs and overshoots" pattern.
**Test:** Compare `[autowalk-begin] dest=(...) minDist=... objDist=... walkRunThresh=...`
values between a walking-range click and a running-range click. The
`objDist` should be the wire-supplied useRadius. If it's wrong (too
small), retail's value disagrees and we have a parser bug elsewhere.
### H3 — Negative `ForwardSpeed` is filtered or misinterpreted
`speed=-1.84` is the literal IEEE 754 float on the wire. Retail's
`CMotionInterp::handle_action_walkforward` (or whichever code consumes
ForwardSpeed) may use it for direction relative to the auto-walk
heading; a sign-extension bug in our parse would matter.
**Test:** Grep `references/holtburger` and named-retail decomp for how
`ForwardSpeed` is consumed. If retail/holtburger interpret the sign
specially and we don't, that's the gap.
### H4 — Arrival predicate fires too early
`DriveServerAutoWalk` line ~601:
```csharp
withinArrival = dist <= arrivalThreshold
```
where `arrivalThreshold = _autoWalkDistanceToObject` (use-radius).
If `distanceToObject` is 0 or near-zero (a parser bug, see H2), the
arrival predicate fires on the first frame and `EndServerAutoWalk("arrived")`
is called immediately, so the player never visibly moves. That matches
**Bug A** exactly.
**Test:** Look for `[autowalk-end] reason=arrived` immediately after
`[autowalk-begin]` with zero or one frame between. That confirms H4.
---
## Reproduction recipe (~3 minutes)
1. **Launch with autowalk probe enabled:**
```powershell
$env:ACDREAM_DAT_DIR = "$env:USERPROFILE\Documents\Asheron's Call"
$env:ACDREAM_LIVE = "1"
$env:ACDREAM_TEST_HOST = "127.0.0.1"
$env:ACDREAM_TEST_PORT = "9000"
$env:ACDREAM_TEST_USER = "testaccount"
$env:ACDREAM_TEST_PASS = "testpassword"
$env:ACDREAM_DEVTOOLS = "1"
$env:ACDREAM_PROBE_AUTOWALK = "1"
dotnet run --project src\AcDream.App\AcDream.App.csproj --no-build -c Debug | Tee-Object -FilePath launch.log
```
2. **Reproduce Bug A:**
- Walk toward the inn area in Holtburg until you're ~3-5 meters from
an NPC (e.g. Royal Guard near the inn). Estimate by eye — the goal
is "you can see the NPC clearly but you'd need to take a few steps
to reach them."
- Double-click the NPC.
- Observe: player doesn't move. Click again — same result.
3. **Reproduce Bug B:**
- Find a ground item (Holtburg has scattered spell components — the
coloured Tapers are obvious). Stand ~3-5 meters away.
- Press F (or whatever your `SelectionPickUp` key is bound to).
- Observe: player runs, overshoots, snaps back, item picked up.
4. **Stop the client gracefully** (window close, not Stop-Process — see
CLAUDE.md "Logout-before-reconnect"). ACE clears stale sessions in
35 seconds on graceful close.
5. **Grep the log:**
```bash
tr -d '\000' < launch.log | grep -E "\[autowalk-(out|mt|begin|up|end)\]"
```
This should give you a complete frame-by-frame trace of every
auto-walk decision the client made.
---
## Acceptance criteria
When the fix lands:
- ✅ Click NPC at walking range → player **walks** (not runs) directly to NPC,
Use fires on arrival, NPC dialogue appears.
- ✅ Press F on ground item at walking range → player **walks** the short
distance, no overshoot, no blip-back, item enters inventory.
- ✅ Far-range click still **runs** to target (don't regress the working case).
- ✅ Out-of-walking-but-very-close-range case (right at the edge of useRadius)
still arrives without infinite spin or stuttering.
- ✅ All existing tests pass (8 pre-existing Core failures are baseline,
do NOT count against the fix).
- ✅ Visual verification at Holtburg, all three M1 demo targets still work
(door, NPC, pickup).
---
## What NOT to do
- **Don't add a workaround**, per CLAUDE.md's "no workarounds" rule.
No grace-period band-aid, no "if speed is negative, force walk" hack,
no "always walk when within 5m" override. Fix the root cause.
- **Don't rewrite the auto-walk path** — Phase B.6 was a heavy retail
decomp port. The fix is almost certainly a one-condition or one-formula
adjustment, not a new design.
- **Don't change `Step 2`'s extracted code**. `LiveSessionController` and
the wireup are clean — `OnLiveMotionUpdated` is wired and firing per
the previous session's verification traces.
---
## Time estimate
- 30 min: read the spec + reproduce + capture trace
- 30 min: identify root-cause hypothesis from the trace
- 30 min 2 hr: implement the fix (depends on which hypothesis lands)
- 30 min: visual verification + write commit message
- **Total:** 23 hours focused work
---
## When done
1. Commit message format: `fix(physics): close #77<root cause summary>`
2. Move `#77` to the "Recently closed" section of `docs/ISSUES.md`
with closed-date + commit SHA (matches the project convention).
3. If the fix uncovered a durable lesson (e.g. "ACE sends negative
ForwardSpeed for MoveToObject; we were filtering"), add a
`feedback_*.md` memory entry per `memory/MEMORY.md` conventions.
4. The next pre-M2 cleanup items in queue: root `.editorconfig` + Step 3
(`LiveEntityRuntime`). See `docs/architecture/code-structure.md` §4.