acdream/docs/research/2026-05-03-remote-anim-cycle/investigation-prompt.md
Erik 7f1bd1809a docs(research): investigation prompt for remote-anim-cycle bug
Hand-off briefing for the remaining "observed retail char's leg cycle
doesn't visibly switch in acdream" bug. Captures everything we learned
today including:

- All 8 commits shipped today (turn-sign, observed-velocity revert,
  retail-faithful tick, Commands-list SubState skip, currNode reset)
- Confirmed wins: body translation, run-in-circles, jump landing
  position + animation, turn-left direction
- Confirmed remaining bug: walk/run/idle leg cycle on observed remotes
  + residual steady-state blippiness
- Diagnostic infrastructure (FWD_WIRE, CMD_LIST, HASCYCLE, SETCYCLE,
  SEQSTATE, TURN_WIRE, OMEGA_DIAG, VEL_DIAG) and how to reproduce
- cdb live trace findings (retail uses additive add_to_queue with no
  truncate; we have ClearCyclicTail + rebuild)
- Six concrete next-step hypotheses
- A self-contained prompt for the next research agent
- Notes on rejected approaches (link-skip, full-reset, scaling hack)

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-03 19:59:22 +02:00


# Remote-entity animation-cycle bug — investigation prompt
**Hand-off date:** 2026-05-03
**Status:** open. Multiple fixes shipped today reduced the remote-entity motion problem to one main residual symptom: the **leg cycle on observed remotes does not visibly switch between Walk / Run / Ready**, even though every signal says it should. Minor blippiness in steady-state motion also remains.
This document is a self-contained briefing for an agent (or fresh session) picking this up.
---
## What problem are we trying to solve?
When acdream observes another player driven by a parallel **retail** acclient.exe (connected to the same local ACE server), the remote character's **leg animation cycle** does not visibly change when that retail player switches between Run / Walk / Idle. The remote's **body** moves at the right speed (translation works), but the **legs keep playing whatever cycle was active before**.
User test: drive `+Acdream` (or any retail char) through `Press W (run) → release → Press shift+W (walk) → release` while observing in acdream's window. The body moves correctly but the leg cycle stays in idle pose / walk pose / whatever it was.
User-confirmed working perspectives:
- Local +Acdream's transitions in acdream **work** ✓
- +Acdream observed FROM a parallel retail client **works** ✓ (proves our outbound is fine)
So the bug is **specifically** in how acdream renders the visual cycle for an observed remote-driven character.
---
## What we shipped today (commits in chronological order)
```
0997f96 fix(motion): landing fallback + TurnLeft omega sign + vel diagnostic (L.3.2)
9960ce3 fix(motion): preserve signed TurnSpeed for remote turn animations
842dfcd fix(motion): retail-faithful per-frame remote tick (L.3.2 follow-up)
b1d8e12 research(motion): cdb live trace of retail walk-to-run transition
a45c21e fix(motion): retail-faithful remote tick — clear body.Velocity, drive via seqVel
c06b6c5 fix(motion): full queue reset on locomotion-cycle direct transitions [partly reverted]
a2ae2ae revert: AnimationSequencer locomotion-cycle full-reset and link-skip
357dcc0 fix(motion): SetCycle forces _currNode onto first newly-enqueued node;
skip SubState commands in UM Commands list iteration
```
**User-confirmed wins from the above:**
- Body translation no longer races (was 2× server pace; now matches)
- Run-in-circles smooth (rectangle-effect gone — body rotates properly between UPs)
- Jump landing position correct (no mid-air force-land)
- Jump landing animation works (Falling → Ready visible)
- Turn-left visibly turns left (was animating right with snap-back)
- Signed TurnSpeed preserved (ACE encodes TurnLeft as `TurnCommand=TurnRight, Speed=negative`)
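
A minimal sketch of what "preserving the sign" means when decoding a turn. The `TurnCommand` enum and the convention that a positive result means "turn right" are assumptions made for illustration; neither is taken from the acdream source.

```csharp
using System;

// Hypothetical stand-in for the wire-level turn command; real motion ids differ.
public enum TurnCommand { None, TurnRight, TurnLeft }

public static class TurnDecode
{
    // Returns a signed turn speed: positive = turn right, negative = turn left
    // (assumed convention). The key point is that the sign of `speed` is kept,
    // never stripped with Math.Abs.
    public static float ToSignedTurnSpeed(TurnCommand command, float speed) => command switch
    {
        // ACE-style encoding: a left turn arrives as TurnRight with speed < 0.
        TurnCommand.TurnRight => speed,
        // Assumed handling of an explicit TurnLeft carrying a positive magnitude.
        TurnCommand.TurnLeft => -speed,
        _ => 0f,
    };
}
```

With this shape, `ToSignedTurnSpeed(TurnCommand.TurnRight, -1.5f)` stays negative, so a consumer that derives omega from the result turns left rather than right.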
**User-confirmed remaining bugs:**
1. **Walk↔Run leg cycle on observed remotes does not visibly switch.** Body advances at correct new speed but legs continue playing previous cycle.
2. **Residual small "blip" corrections during steady-state motion** (run, walk, strafe). User describes this as a periodic micro-jitter — small but visible.
3. **(Possible)** ~20% steady-state walk overshoot (`maxSeqSpeed=3.120, serverSpeed≈2.6`) per VEL_DIAG measurements — not yet root-caused. May or may not be related to (2).
---
## What we proved about bug 1 (the cycle-doesn't-switch)
Per the diagnostic infrastructure built today:
| Signal | Result |
|---|---|
| `[FWD_WIRE]` — wire-arrival ForwardCommand transitions | ✅ ACE delivers `WalkForward → RunForward` (and direct walk↔run) correctly |
| `[CMD_LIST]` — Commands list at receive time | Empty for walk/run UMs; contains Ready/Action class for some others |
| `[HASCYCLE]` — does the dat have the requested cycle | ✅ True for both `0x44000007` (Run) and `0x45000005` (Walk) on style `0x8000003D` (NonCombat Humanoid) |
| `[SETCYCLE]` — animCycle picker calls into AnimationSequencer.SetCycle | ✅ Fires with correct (style, motion, speed) |
| `[SEQSTATE]` — per-tick `ae.Sequencer.CurrentMotion` for the observed remote | ✅ Holds the new motion correctly (e.g. shows `0x44000007 speed=2.939` after Run press, then `0x41000003 speed=1.000` after release) |
So:
- ACE wire data is correct.
- Our parser updates `InterpretedState` correctly.
- `OnLiveMotionUpdated` calls `SetCycle` with correct args.
- `SetCycle` updates the sequencer's `CurrentMotion` correctly.
- The cycle data the sequencer would play exists in the dat.
**But the visible leg cycle does NOT update.** Therefore the bug is **downstream of `ae.Sequencer.CurrentMotion`** — somewhere between the sequencer's internal state and the rendered MeshRefs:
- `AnimationSequencer.Advance(dt)` returning frames from the wrong node
- `BuildBlendedFrame()` reading from a stale `_currNode`
- `_currNode` advancing through stale link/head frames before reaching the new cycle
- Or how the per-part transforms returned by Advance get applied to the entity's `MeshRefs` for remote entities
We attempted a fix in `357dcc0` that forces `_currNode` onto the first newly-enqueued node in SetCycle — user reports **no visible change** after this fix.
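
To tell "Advance returns a stuck pose" apart from "Advance returns changing poses that never reach the mesh", a small tracker like the one below could be fed one part's transform every tick. This is only a sketch: `PoseChangeTracker` is a hypothetical helper, and it assumes the per-part frames expose a `Vector3` origin and a unit `Quaternion` orientation from `System.Numerics`.

```csharp
using System;
using System.Numerics;

// Hypothetical diagnostic helper; not part of acdream.
public sealed class PoseChangeTracker
{
    private Vector3 _lastOrigin;
    private Quaternion _lastOrientation = Quaternion.Identity;
    private bool _hasLast;

    // Largest tick-to-tick movement seen since construction.
    public float MaxOriginDelta { get; private set; }
    public float MaxAngleDeltaRadians { get; private set; }

    public void Observe(Vector3 origin, Quaternion orientation)
    {
        if (_hasLast)
        {
            MaxOriginDelta = MathF.Max(MaxOriginDelta, Vector3.Distance(origin, _lastOrigin));

            // Angle between two unit quaternions: 2 * acos(|dot|).
            float dot = MathF.Abs(Quaternion.Dot(
                Quaternion.Normalize(orientation),
                Quaternion.Normalize(_lastOrientation)));
            float angle = 2f * MathF.Acos(MathF.Min(1f, dot));
            MaxAngleDeltaRadians = MathF.Max(MaxAngleDeltaRadians, angle);
        }

        _lastOrigin = origin;
        _lastOrientation = orientation;
        _hasLast = true;
    }
}
```

If both maxima stay near zero across a Run press, the frames coming out of the sequencer are stuck. If they move but the legs on screen do not, the problem sits in how the frames are applied.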
---
## What's different between local (works) and remote (doesn't)
Both call **the same `AnimationSequencer.SetCycle` method** in `src/AcDream.Core/Physics/AnimationSequencer.cs:360`. So the sequencer code itself is shared.
Local +Acdream path:
- `PlayerMovementController` → `UpdatePlayerAnimation` (in `GameWindow.cs:6664`) → resolves cycle → `ae.Sequencer.SetCycle(...)`
- Fast-path early-return when cmd+speed unchanged (line 6713-6714)
- `OnLiveMotionUpdated` skips wire-echo SetCycle for the local player guid (line 2707)
Remote (observed retail char) path:
- Wire arrives → `OnLiveMotionUpdated` (`GameWindow.cs:3203`)
- "animCycle picker" at line 2842-2867 chooses the cycle based on Forward / Sidestep / Turn priority
- HasCycle fallback chain at line 2939
- `ae.Sequencer.SetCycle(fullStyle, cycleToPlay, animSpeed)` at line 2988
- Then iterates `update.MotionState.Commands` and routes each through `AnimationCommandRouter` (357dcc0 made this skip SubState class)
- Then ALSO updates `remoteMot.Motion.InterpretedState.ForwardCommand/ForwardSpeed` for body.Velocity computation
- Then ALSO calls `remoteMot.Motion.DoInterpretedMotion(...)` for sidestep/turn axes
**Hypotheses to investigate:**
A) After `SetCycle` fires, some other call in `OnLiveMotionUpdated` re-cycles the sequencer back. We've eliminated the `Commands` list (357dcc0 skip-SubState). Other candidates: `PlayAction` calls inside `RouteWireCommand`, the spawn-time SetCycle at line 2313, or something in `ApplyServerControlledVelocityCycle` (line 3238).
B) `_currNode` actually IS in the right place after SetCycle but `Advance(dt)` doesn't read from it correctly. Maybe a thread-safety issue (SetCycle on net thread, Advance on render thread, partial state visible).
C) `Advance` returns the right frames but `seqFrames` are not applied to the entity's `MeshRefs` for the remote entity specifically. Look at `GameWindow.cs:6510-6589` — the per-part transform application loop. There's no obvious local-vs-remote branch but worth tracing.
D) The MeshRefs themselves get rebuilt each frame and the rebuild reads from a different source for remotes. The `newMeshRefs` list is built per-frame at line 6567.
E) Local player's `ae.Sequencer.SetCycle` is called at a higher rate than remote's (per-input vs per-UM). Maybe the queue stays cleaner with frequent calls, and the bug is exposed only when SetCycle is sparse.
F) **Most likely**, based on what we've seen: `Advance` plays through stale link frames before reaching the cycle. Our 357dcc0 fix forces `_currNode` onto the first newly-enqueued node, but for `Ready→Run` the newly-enqueued sequence is `[Ready→Run link, Run cycle]`. `_currNode` lands on the **link**, the link plays for ~0.5 to 1 second, and only then does the run cycle start. The user perceives the link's "transition pose" as "still walking / still idle."
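
Hypothesis F can be illustrated with a toy model. Everything below is hypothetical: `ToyNode`, `ToySequencer` and the frame counts and rates are invented for the example and do not mirror the real `AnimationSequencer`. In particular, the toy `SetCycle` clears the whole queue rather than only the cyclic tail. The sketch only shows how a cursor placed on a non-looping link delays the first cycle frame.

```csharp
#nullable enable
using System;
using System.Collections.Generic;

// Toy model only; NOT the real AnimationSequencer internals.
public sealed record ToyNode(string Name, int FrameCount, double FramesPerSecond, bool IsLooping);

public sealed class ToySequencer
{
    private readonly LinkedList<ToyNode> _queue = new();
    private LinkedListNode<ToyNode>? _currNode;
    private double _framePosition;

    public string? CurrentNodeName => _currNode?.Value.Name;

    // Cursor jumps to the first newly-enqueued node, which is the link if present.
    public void SetCycle(ToyNode? link, ToyNode cycle)
    {
        _queue.Clear();
        if (link is not null) _queue.AddLast(link);
        _queue.AddLast(cycle);
        _currNode = _queue.First;
        _framePosition = 0;
    }

    public void Advance(double dt)
    {
        while (_currNode is not null && dt > 0)
        {
            ToyNode node = _currNode.Value;
            if (node.IsLooping)
            {
                // Cycles wrap around forever and consume all remaining time.
                _framePosition = (_framePosition + dt * node.FramesPerSecond) % node.FrameCount;
                return;
            }

            double timeLeft = (node.FrameCount - _framePosition) / node.FramesPerSecond;
            if (dt < timeLeft || _currNode.Next is null)
            {
                _framePosition = Math.Min(node.FrameCount, _framePosition + dt * node.FramesPerSecond);
                return;
            }

            // Link finished: carry the leftover time into the next node.
            dt -= timeLeft;
            _currNode = _currNode.Next;
            _framePosition = 0;
        }
    }
}

public static class LinkDelayDemo
{
    // Number of ticks until the cursor first sits on the cycle node, or -1.
    public static int TicksUntilCycle(ToyNode link, ToyNode cycle, double dt, int maxTicks = 10_000)
    {
        var seq = new ToySequencer();
        seq.SetCycle(link, cycle);
        for (int tick = 1; tick <= maxTicks; tick++)
        {
            seq.Advance(dt);
            if (seq.CurrentNodeName == cycle.Name) return tick;
        }
        return -1;
    }
}
```

In this model a 16-frame link at 32 fps keeps the cursor off the cycle for 0.5 s, which is 32 ticks at `dt = 1/64`. Whether the real link durations fall in that range is exactly what the per-tick `_currNode` diag is meant to confirm.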
---
## Diagnostic infrastructure available
All of the following are gated on the env var `ACDREAM_REMOTE_VEL_DIAG=1`:
| Diag | Where | What it shows |
|---|---|---|
| `[FWD_WIRE]` | `GameWindow.cs:2793-2800` | Each ForwardCommand transition received per remote |
| `[CMD_LIST]` | `GameWindow.cs:3119-3133` | Commands list contents at UM receive time |
| `[HASCYCLE]` | `GameWindow.cs:2939-2947` | HasCycle result for the requested cycle |
| `[SETCYCLE]` | `GameWindow.cs:2972-2986` | Each animCycle picker → SetCycle call |
| `[SEQSTATE]` | `GameWindow.cs:6520-6532` | Per-tick `ae.Sequencer.CurrentMotion` (1Hz throttled) |
| `[TURN_WIRE]` | `GameWindow.cs:3050-3057` | TurnCommand wire arrivals with signed speed |
| `[OMEGA_DIAG]` | `GameWindow.cs:5901-5912` | Per-tick omega being applied to body |
| `[VEL_DIAG]` | `GameWindow.cs:3327-3343` | Server-broadcast speed vs maxSeqSpeed per UP |
Also gated on `ACDREAM_INTERP_MANAGER=1` is the entire retail-faithful per-tick remote motion path. Set both env vars when reproducing.
The repo has `tools/cdb-scripts/` set up for live tracing of retail acclient.exe via cdb.exe. Two trace scripts already proven working:
- `walk_run_motion_trace.cdb` + `walk_run_motion_trace.log` — captured the exact retail walk→run sequence and proved retail uses `MotionTableManager::add_to_queue` without `truncate_animation_list`.
To launch retail tracing: have user start retail and connect, then in PowerShell:
```
& "C:\Program Files (x86)\Windows Kits\10\Debuggers\x86\cdb.exe" `
-pn acclient.exe -cf "tools\cdb-scripts\walk_run_motion_trace.cdb" *>&1 |
Tee-Object -FilePath "tools\cdb-scripts\walk_run_motion_trace.log.console"
```
The script auto-detaches after 200 breakpoint hits via `.detach`. Do NOT use `qd`: per the CLAUDE.md gotcha it is silently ignored. NEVER `Stop-Process` cdb, because that takes retail down with it.
---
## What retail actually does (from cdb live trace)
For a walk→run direct transition retail's call sequence is:
```
[79] CPhysicsObj::DoInterpretedMotion: motion=45000005 walk start (shift+W)
[82] CMotionTable::DoObjectMotion: motion=45000005
[83] MotionTableManager::add_to_queue: arg1=45000005 arg2=00000001 ← walk added looping
[89] CPhysicsObj::DoInterpretedMotion: motion=44000007 run start (release shift)
[92] CMotionTable::DoObjectMotion: motion=44000007
[93] MotionTableManager::add_to_queue: arg1=44000007 arg2=00000001 ← run added looping
[104] CMotionTable::StopObjectMotion: motion=44000007 run end (release W)
```
`MotionTableManager::truncate_animation_list` had a breakpoint set for the entire trace and **never fired**. Retail just appends new motions to the queue and lets `MotionTableManager::CheckForCompletedMotions` (`0x0051BE00`) and `MotionTableManager::remove_redundant_links` (`0x0051BF20`) handle the natural progression. We have ported neither of these.
This suggests our `AnimationSequencer.SetCycle` rebuild semantics (ClearCyclicTail + enqueue link + enqueue cycle) are fundamentally different from retail's "append-only" `MotionTableManager`. That may not matter for visual output as long as our queue manipulations land in the same end state, but it's a structural mismatch worth exploring if the tactical fixes don't pan out.
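
As a rough illustration of append-only queueing, the sketch below appends without truncating and then applies a deliberately primitive cleanup: only the newest looping node survives. `QueuedMotion` and `AppendOnlyQueue` are hypothetical names, and the cleanup rule is an assumption standing in for retail's `CheckForCompletedMotions` / `remove_redundant_links`, whose actual behaviour has not been ported or verified here.

```csharp
using System;
using System.Collections.Generic;

// Toy node: Motion is the motion id; IsLooping marks a cycle as opposed to a link.
public sealed record QueuedMotion(uint Motion, bool IsLooping);

public static class AppendOnlyQueue
{
    // Appends the incoming node, then drops every looping node that is older
    // than the newest looping node. Non-looping nodes (links) are left alone.
    // This is a stand-in cleanup, NOT a port of retail's logic.
    public static void Append(List<QueuedMotion> queue, QueuedMotion incoming)
    {
        queue.Add(incoming);

        int newestLooping = queue.FindLastIndex(n => n.IsLooping);
        if (newestLooping < 0) return;

        // Walk backwards so removals do not shift indices still to be visited.
        for (int i = newestLooping - 1; i >= 0; i--)
        {
            if (queue[i].IsLooping) queue.RemoveAt(i);
        }
    }
}
```

Replaying the traced sequence (walk `0x45000005` looping, then run `0x44000007` looping) through this model leaves a single run node in the queue, while any link queued between them is kept so it can still play.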
---
## File locations
- **`src/AcDream.Core/Physics/AnimationSequencer.cs`** — SetCycle (line 360), Advance (690), BuildBlendedFrame (1254), ClearCyclicTail (1117), AdvanceToNextAnimation (1150), EnqueueMotionData (1101), LoadAnimNode (1037)
- **`src/AcDream.App/Rendering/GameWindow.cs`** — OnLiveMotionUpdated (3203), TickAnimations (5851), animCycle picker block (2842-2988), the seqFrames-to-MeshRefs application loop (6510-6635), UpdatePlayerAnimation (6664)
- **`src/AcDream.Core/Physics/AnimationCommandRouter.cs`** — RouteWireCommand (53), Classify (29)
- **`src/AcDream.Core/Physics/MotionInterpreter.cs`** — get_state_velocity (587), GetMaxSpeed (968), apply_current_movement (653), HitGround (924)
- **`src/AcDream.Core/Physics/PositionManager.cs`** — ComputeOffset (37) (the per-tick combiner)
- **`src/AcDream.Core/Physics/InterpolationManager.cs`** — Enqueue, AdjustOffset (224), stall detection
- **Reference decomp:** `docs/research/named-retail/acclient_2013_pseudo_c.txt` (1.4M-line pseudo-C with full PDB names)
- **Symbols index:** `docs/research/named-retail/symbols.json` (greppable name → address)
- **Verbatim retail headers:** `docs/research/named-retail/acclient.h` (struct field offsets)
---
## Concrete next steps for the bug
1. **Add a per-tick diag that prints `_currNode.Anim.Id` + `_framePosition` for the observed remote.** This will conclusively answer whether `_currNode` is on the new cycle, on a stale link, or somewhere else. Implement near the existing SEQSTATE diag in `GameWindow.cs:6520`. Ask user to do the precise test sequence (W only, then shift+W only, no turns/no mouse) and read the log.
2. **Add a diag that prints `seqFrames[0].Origin` and `seqFrames[0].Orientation`** (the result of Advance) before applying to MeshRefs. If the values change meaningfully between cycles → bug is in MeshRefs application. If they're stuck → bug is in Advance/BuildBlendedFrame.
3. **Compare the call ORDER of SetCycle for local vs remote.** Maybe local's UpdatePlayerAnimation calls SetCycle then immediately also re-resolves cycle data and passes it through. Or local has frame-resolution state we lack for remotes.
4. **Try the retail-faithful additive `add_to_queue` semantics:** modify SetCycle to skip ClearCyclicTail and just append new motion data. The `MotionTableManager::CheckForCompletedMotions` cleanup, which we have not ported, might be needed, but a primitive version (when more than one queued node has `IsLooping=true`, drop all but the newest) might suffice as a starting point.
5. **Trace retail's CSequence::update / update_internal calls live** with cdb to see what frames ARE returned per tick for a remote running and transitioning. We have the cdb toolchain set up; pattern new scripts on the existing ones in `tools/cdb-scripts/`.
6. **If all else fails, dispatch a research agent** with the prompt below.
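
For step 1, the diag could be shaped roughly like the helper below. This is a sketch under assumptions: the `[CURRNODE]` tag, the `CurrNodeDiag` class and its parameters are invented here, and how the sequencer would expose `_currNode.Anim.Id` and `_framePosition` to the caller is left open.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Globalization;

// Hypothetical diag helper; tag and field names are illustrative only.
public sealed class CurrNodeDiag
{
    private readonly Dictionary<uint, double> _lastLogTime = new();
    private readonly double _minIntervalSeconds;

    public CurrNodeDiag(double minIntervalSeconds = 1.0) => _minIntervalSeconds = minIntervalSeconds;

    // Returns a log line at most once per interval per entity, otherwise null.
    public string? TryFormat(uint guid, double nowSeconds, uint animId, double framePosition, uint currentMotion)
    {
        if (_lastLogTime.TryGetValue(guid, out double last) && nowSeconds - last < _minIntervalSeconds)
            return null;

        _lastLogTime[guid] = nowSeconds;
        return string.Format(
            CultureInfo.InvariantCulture,
            "[CURRNODE] guid=0x{0:X8} anim=0x{1:X8} frame={2:F2} motion=0x{3:X8}",
            guid, animId, framePosition, currentMotion);
    }
}
```

A 1 Hz throttle like the one SEQSTATE uses could miss a link that plays for under a second, so a shorter interval (for example `new CurrNodeDiag(0.1)`) is probably the safer choice while chasing this bug.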
---
## For the next research agent — exact assignment
> Read this entire document.
>
> Read `src/AcDream.Core/Physics/AnimationSequencer.cs` end-to-end, focusing on:
> - `SetCycle` (line 360-560) — what state it mutates and in what order
> - `Advance` (line 690-784) — how it consumes the queue and what it returns
> - `BuildBlendedFrame` (line 1254-1313) — how the visible per-part transforms are computed
> - `ClearCyclicTail` (line 1117-1140) and `AdvanceToNextAnimation` (line 1150-1166) — node lifecycle in the queue
>
> Then read `src/AcDream.App/Rendering/GameWindow.cs:5851-6635` — the `TickAnimations` method including the dead-reckoning blocks, sequencer Advance call, and the seqFrames-to-MeshRefs application loop.
>
> Answer:
>
> 1. After `SetCycle` is called for `RunForward` (with `linkData != null` and `cycleData != null`), what is the precise queue state, the value of `_currNode`, and the value of `_framePosition` immediately after SetCycle returns? Trace step by step including ClearCyclicTail's effect on `_currNode`. Cite line numbers.
>
> 2. On the next render tick when `Advance(dt=0.0167)` is called, what does it do? Specifically, does it advance through the link frames first, or skip them, or play them and stop at the cycle? What pose does `BuildBlendedFrame` return at the end?
>
> 3. Is there any code path between `SetCycle` returning and the next `Advance` call that could RESET `_currNode` back to a stale node? List every SetCycle call site (there are ~12 in GameWindow.cs) and identify any that fire on the per-tick path (not just on UM receive).
>
> 4. Is there any difference in how `seqFrames` is consumed for the local player vs a remote-observed entity in the loop at lines 6566-6635? Both use `if (seqFrames is not null) { origin = seqFrames[i].Origin; ... }`. Find any conditional branch that bypasses seqFrames for remotes.
>
> Output: a concise (<800 word) report with line citations and a clear hypothesis for the root cause of the visible-cycle-doesn't-switch bug. Do NOT modify any code.
---
## Quick reproduction recipe
1. Start local ACE server (user has this running on `127.0.0.1:9000`).
2. Start a parallel **retail** acclient.exe and connect with a different character (NOT `+Acdream`).
3. Build acdream: `dotnet build src/AcDream.App/AcDream.App.csproj -c Debug`
4. Launch acdream from the main repo dir with both env vars:
```powershell
$env:ACDREAM_INTERP_MANAGER = "1"
$env:ACDREAM_REMOTE_VEL_DIAG = "1"
$env:ACDREAM_DAT_DIR = "$env:USERPROFILE\Documents\Asheron's Call"
$env:ACDREAM_LIVE = "1"
$env:ACDREAM_TEST_HOST = "127.0.0.1"
$env:ACDREAM_TEST_PORT = "9000"
$env:ACDREAM_TEST_USER = "testaccount"
$env:ACDREAM_TEST_PASS = "testpassword"
dotnet run --project src\AcDream.App\AcDream.App.csproj --no-build -c Debug 2>&1 |
Tee-Object -FilePath launch.log
```
5. From the retail client, drive the test character: stand 2s, press W (run) 4s, release, press shift+W (walk) 4s, release.
6. Observe the test character in the acdream window. Bug: leg cycle does NOT visibly switch between idle / run / walk poses.
7. Read diags from `launch.log` (UTF-16 — use `Get-Content -Encoding Unicode`).
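
When the log gets long, a small filter can pull out only the tags relevant to one test run. The helper below is a hypothetical convenience, not an existing acdream tool; it assumes the log is UTF-16 as noted in step 7.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Text;

// Hypothetical log helper; not part of the repo.
public static class DiagLog
{
    // Encoding.Unicode is UTF-16 little-endian.
    public static IEnumerable<string> ReadLines(string path) =>
        File.ReadLines(path, Encoding.Unicode);

    // Keeps only lines that carry at least one of the given diag tags,
    // e.g. "[SETCYCLE]" or "[SEQSTATE]".
    public static IEnumerable<string> WithTags(IEnumerable<string> lines, params string[] tags) =>
        lines.Where(line => tags.Any(tag => line.Contains(tag, StringComparison.Ordinal)));
}
```

For the walk/run test, `DiagLog.WithTags(DiagLog.ReadLines("launch.log"), "[FWD_WIRE]", "[SETCYCLE]", "[SEQSTATE]")` gives the wire arrival, the SetCycle call and the resulting sequencer state in log order.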
---
## Notes on what NOT to do
- **Do not pass `skipTransitionLink: true` unconditionally to SetCycle** — tried in commit `c06b6c5` (link skip), broke landing-from-jump, sit-down, and every other transition that needs its dat link to play. Reverted in `a2ae2ae`.
- **Do not full-reset the queue on every motion change** — same commit, also reverted. Side effect: removed end-animations everywhere.
- **Do not "scale body.Velocity by observed serverSpeed/predictedSpeed"** — tried during the day, user explicitly rejected as a hack. Always use predicted velocity from `get_state_velocity` (= `RunAnimSpeed × ForwardSpeed`).
- **Do not `Stop-Process cdb`** while it's attached to retail — takes retail down with it (CLAUDE.md). Use `.detach` inside bp actions for graceful exit.