Code review on T10-T12 bundle (commits 0cf86bb/00bb030/0405947 + audit
fix 76e1a64) found 3 Important issues:
1. LandblockStreamer.Start() had an idempotency race — the XML doc
claimed thread-safety but the implementation checked _worker != null
before assigning, allowing two callers to both pass the check and
spawn duplicate worker threads. Fixed via Interlocked.CompareExchange.
2. No test verified the worker emits Failed when buildMeshOrNull returns
null. Added Load_WhenBuildMeshReturnsNull_ReportsFailed.
3. StreamingControllerTests.cs:81 used MeshData: default! when
constructing a Loaded result. If a future test flows MeshData
through the apply callback, the null reference would NRE rather
than producing a meaningful assertion failure. Replaced with a real
empty LandblockMeshData instance.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Replaces the T7-temporary default! MeshData placeholder. Streamer
now takes Func<uint, LoadedLandblock?, LandblockMeshData?> at
construction; the worker calls it after _loadLandblock succeeds and
passes the pre-built mesh into LandblockStreamResult.Loaded.
GameWindow's buildMeshOrNull factory takes the already-loaded
LoadedLandblock (lb.Heightmap is the LandBlock dat object), so no
additional dat read is needed — _heightTable and _blendCtx are
read-only after init, _surfaceCache is ConcurrentDictionary (T9).
Zero dat lock needed inside the mesh-build closure.
StreamingController._applyTerrain delegate signature widened to
Action<LoadedLandblock, LandblockMeshData> so the pre-built mesh
flows render-thread-side via the Loaded result. ApplyLoadedTerrainLocked
now accepts meshData and calls _terrain.AddLandblock directly, skipping
the per-frame LandblockMesh.Build that previously ran on the render
thread (~5ms per LB at radius=12 first traversal).
StreamingControllerTests updated: all four applyTerrain lambdas
adapted to the two-arg Action signature.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Phase A.1 reverted to synchronous mode due to DatCollection thread-
safety; T10 documented the lock that makes concurrent reads safe. T11
activates the dedicated worker thread and switches enqueue methods
to non-blocking Channel.Writer.TryWrite.
EnqueueLoad now takes LandblockStreamJobKind (default: LoadNear from
all callers, matching previous full-load semantics). T13/T16 will
route by kind per TwoTierDiff.
Constructor gains optional buildMeshOrNull param (defaults to null-
returning stub); T12 wires the real LandblockMesh.Build factory.
GameWindow construction site updated: Action<uint> enqueueLoad
delegate now wraps a lambda (method group won't bind to Action<uint>
when the method has an optional second param).
LandblockStreamerTests updated: the synchronous-thread-pinning test
replaced by Load_ExecutesLoaderOnWorkerThread which asserts the
loader runs on a different thread; Load_FollowedByDrain now supplies
a stubMesh so the worker can produce Loaded (not Failed) results.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Extends the Loaded result record with a LandblockStreamTier discriminator
and a LandblockMeshData payload (default! stub — T13 wires the real
off-thread mesh build). Adds the Promoted variant for Far→Near upgrades
that only need the entity layer, not a mesh rebuild.
LandblockStreamer.HandleJob passes Tier.Near + default! MeshData at the
existing synchronous load site; StreamingControllerTests updated to
match the new positional signature.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Code review on commits 7bcabab/fb6b61e/326b698 flagged 2 Important +
4 Minor issues. Apply all fixes:
Important:
- Two-tier RecenterTo + MarkResidentFromBootstrap now throw
InvalidOperationException on misuse — calling RecenterTo before the
bootstrap silently emitted the entire window as fresh loads (no
demotes/unloads since _tierResidence was empty), a correctness hazard
that produced no exception. Calling MarkResidentFromBootstrap twice
silently dropped accumulated tier state. Both now crash loudly via
a _bootstrapped flag.
- Dropped TierResidence.None from the enum — never assigned, never
checked; absence from the dictionary already encodes "not resident."
Minor:
- Renamed test: RecenterTo_FirstTick_* → ComputeFirstTickDiff_FirstTick_*
(the test calls ComputeFirstTickDiff, not RecenterTo).
- Strengthened RecenterTo_PlayerWalks_NullToFar_* with assertions for
ToPromote.Count==3 (the x=102 column promoting Far→Near) and
ToUnload.Empty (everything within hysteresis).
- Replaced System.Math.Abs with Math.Abs in new code to match the
file's existing `using System;` convention.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds 5 tests to StreamingRegionTwoTierTests covering all tier-transition paths:
- FarToNear promote (walk 2 east from initial center)
- NullToNear teleport (loads 9 near + 40 far for a fully fresh region)
- NearToFar demote only after NearRadius+2 hysteresis threshold
- FarToNull unload only after FarRadius+2 hysteresis threshold
- oscillation no-thrash: bouncing 1 LB across a near boundary fires 0 demotes
and at most 5 promotes total (one initial settle of the x=100 near-column)
Oscillation test fix: initialise the region at the oscillation midpoint
(103,100) rather than at a distant starting center (100,100) so the
initial move into the oscillation range doesn't itself trigger legitimate
demotes, isolating the no-thrash invariant.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds TierResidence enum (None/Far/Near), _tierResidence dictionary seeded
by MarkResidentFromBootstrap, and the canonical two-tier RecenterTo overload
returning TwoTierDiff. Pass 1 walks the new far window and emits ToLoadFar /
ToLoadNear / ToPromote; Pass 2 walks prior residents and emits ToDemote /
ToUnload using Chebyshev hysteresis thresholds (NearRadius+2 / FarRadius+2).
EncodeLandblockIdForTest exposes the encoding rule to test assemblies.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds the first-tick bootstrap diff: ToLoadNear for the (2*near+1)^2 inner
window, ToLoadFar for the outer annulus up to FarRadius. Uses Chebyshev
distance, matching existing Recenter convention.
Also renames the single-tier RecenterTo → RecenterToSingleTier to free
the canonical name for the upcoming two-tier overload (T5). Updates
StreamingRegionTests and StreamingController to call the renamed method.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Code review on commit 7fd9c82 flagged that the test asserted NearRadius,
FarRadius, CenterX, CenterY but not the load-bearing alias
Radius == FarRadius. That alias is what makes the existing hysteresis
math (Radius+2 unload threshold) correctly target the far-tier boundary.
Future typos would silently break far-tier hysteresis.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Add NearRadius/FarRadius properties and a four-arg constructor
(centerX, centerY, nearRadius, farRadius). Radius is set to farRadius
so existing hysteresis math (unload threshold = Radius+2) uses the
outer ring as the bookkeeping boundary. Old three-arg constructor
becomes a thin wrapper: this(cx, cy, radius, radius) — no behaviour
change, 25 pre-existing streaming tests still pass.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Fifth and final Phase A.1 hotfix. Replaces the previous "drop on
miss" semantics in GpuWorldState.AppendLiveEntity with a per-landblock
pending bucket that survives the race where a CreateObject arrives
before its landblock has been streamed in.
Root cause:
The post-login spawn flood (40+ NPCs/items) drains in a single
WorldSession.Tick() call. The synchronous streamer enqueues all 25
visible-window landblocks in one shot but StreamingController.Tick
was capped at MaxCompletionsPerFrame=4, so only 4 landblocks landed
in GpuWorldState on the first frame. The center landblock 0xA9B4FFFF
may or may not have been in those first 4 (HashSet iteration order
is undefined). Spawns whose target landblock wasn't yet loaded were
silently dropped by AppendLiveEntity. Re-ordering the OnUpdate
(streaming first, live second) didn't fix it because the cap still
limited to 4 per frame; spawns for landblocks #5+ kept dropping
until the queue drained, by which point the spawn flood was over.
The reordering was correct but insufficient. The cap was a relic of
the original async streamer design (limit GPU upload spikes per
frame). With the synchronous streamer there's no backlog to spread,
so the cap was pure latency for no benefit. Setting it to int.MaxValue
restores "drain everything you just enqueued" semantics.
The pending-spawn list is the *correct* architecture fix that makes
the system robust against any future ordering bug, not just the cap:
- AppendLiveEntity for an unloaded landblock parks the entity in a
per-landblock pending bucket instead of dropping it.
- AddLandblock drains pending entries for its landblock and merges
them into the loaded record before storing.
- RemoveLandblock drops pending entries for the same landblock —
if the player moved away, the spawns are no longer relevant; the
server resends them via CreateObject when the player returns.
Diagnostic counter PendingLiveEntityCount exposes the bucket size
so future regressions are visible without spelunking.
7 new GpuWorldStateTests pin the contract:
- AppendLiveEntity_LandblockAlreadyLoaded_AppendsImmediately
- AppendLiveEntity_LandblockNotLoaded_ParksInPending
- AddLandblock_DrainsPendingEntriesForThatLandblock
- AddLandblock_DoesNotDrainPendingForADifferentLandblock
- RemoveLandblock_DropsPendingForThatLandblock
- RemoveLandblock_LoadedThenRemoved_DropsItsEntities
- IsLoaded_ReturnsTrueForLoaded_FalseForPendingOnly
Also removes the diagnostic Console.WriteLine I added in the previous
debugging round and the old LiveAppendsResolved/Dropped counters that
were never read by anyone.
219 tests green (212 + 7 new).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
ROOT CAUSE of the "giant ball with spikes" terrain corruption that
the previous two hotfix attempts (lock + synchronous loading) failed
to address. Threading was a red herring all along.
AC dat conventions:
0xAAAA0xFFFF — LandBlock dat (terrain heightmap)
0xAAAA0xFFFE — LandBlockInfo dat (static-object metadata)
WorldView.NeighborLandblockIds correctly uses 0xFFFF. My
StreamingRegion.EncodeLandblockId from Phase A.1 Task 1 used 0xFFFE
by mistake. Every streaming load was therefore calling
LandblockLoader.Load with the LandBlockInfo id, which makes
DatCollection ask DatBinReader to read a LandBlock from the
LandBlockInfo file. The reader's internal buffer position lands in
the middle of the wrong file's bytes, ReadBytesInternal asks for an
out-of-range slice, throws ArgumentOutOfRangeException, and the
landblocks that DON'T throw return half-populated LandBlock objects
whose Height[] arrays contain garbage. Garbage Z values render as
the spike pattern.
The kicker: my Task-1 review fix added a test
(Constructor_SmallRadius_IDsMatchEncodingRule) that asserted
Assert.Contains(0x1234FFFEu, region.Visible). The test was passing
because it pinned the wrong value. I literally codified the bug.
Fix: change EncodeLandblockId's terminator from 0xFFFEu to 0xFFFFu
and update the test to assert 0x1234FFFFu. The XML doc on Visible
now explicitly explains the 0xFFFF/0xFFFE distinction so this can't
recur.
The previous two hotfixes (_datLock in c991fb2, synchronous streamer
in 531c9f9) stay in place — _datLock is defensive belt-and-suspenders
that documents which entry points read dats, and synchronous loading
is correct-by-default until we decide whether to reintroduce
background loading (Phase A.3 may make it unnecessary anyway).
212 tests green. With this fix the streaming should actually work.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Second hotfix attempt for the "ball of spikes" terrain corruption.
The previous _datLock fix was insufficient because dat reads happen
from many render-thread code paths I didn't enumerate (animation
tick, OnLiveMotionUpdated, OnLivePositionUpdated, the live spawn
hydration, ApplyLoadedTerrain) and locking each is invasive and
fragile.
DatReaderWriter's DatCollection is fundamentally not thread-safe:
DatBinReader's internal buffer position is shared per-database, so
two concurrent .Get<T> calls corrupt each other's read state. The
ArgumentOutOfRangeException at DatBinReader.ReadBytesInternal in
the failure log is the smoking gun — one read started reading a
LandBlock, another moved the reader's position, the first one
asked for the wrong number of bytes.
Until Phase A.3 introduces a thread-safe dat wrapper (or until we
preload all dats into pure in-memory dictionaries), the streamer
runs synchronously: EnqueueLoad invokes the load delegate inline
on the calling thread and writes the result to the outbox in a
single call. The render-thread DrainCompletions loop picks it up
on the same frame.
API surface unchanged — Channel-based outbox, EnqueueLoad/Unload,
DrainCompletions, Start (now no-op), Dispose all preserved. Move
back to async loading is a single-class change once dat thread
safety lands.
Cost: visible frame hitch when crossing landblock boundaries
(rendering the new landblock is now on the render thread). For
default 5×5 the hitch is one landblock per cardinal step, ~50ms
worst case. Acceptable for the MVP — correctness over hitches.
Updated the off-thread test to assert the new synchronous contract
(loader runs on the calling thread). The other 4 tests still pass
unchanged because their spin-drain pattern works with synchronous
delivery too.
The previous _datLock from commit c991fb2 stays in place as
defensive belt-and-suspenders — it's free in synchronous mode and
keeps the contract documented at every dat-reading entry point.
212 tests green.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Called once per frame from OnUpdate. Owns a StreamingRegion and uses
delegates into LandblockStreamer + a terrain-apply callback so unit
tests can inject fakes. Handles first-tick bootstrap (whole window
loads), boundary recenter (diff against previous center), and
drain completions (up to N per frame to cap GPU upload spikes).
4 new tests, all green.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Code review follow-up to commit 0904372. Five Important fixes plus
three Minor polish items found by the reviewer before StreamingController
depends on this class under churn.
I1: Dispose is now thread-safe via Interlocked.Exchange on an int
guard. Two concurrent Dispose calls no longer double-dispose the
CancellationTokenSource.
I2: EnqueueLoad/EnqueueUnload now throw ObjectDisposedException when
called after Dispose instead of silently dropping the job. Jobs
vanishing into a completed channel was a debugging hazard.
I3: Start throws ObjectDisposedException when called after Dispose
instead of silently doing nothing (the old guard only checked
whether the thread was non-null, not whether the streamer was
still usable).
I4: New test Load_ExecutesLoaderOnBackgroundThread captures the
loader delegate's ManagedThreadId and asserts it differs from
the test thread's id, proving the whole reason this class
exists (off-thread execution) is actually happening.
I5: New LandblockStreamResult.WorkerCrashed record type for the
outer catch in WorkerLoop. Previously the crash path wrote
Failed(0, ex.ToString()) which collided with landblock (0, 0)
in the north ocean, making "worker crashed" indistinguishable
from "landblock 0 failed to load".
Minor polish:
- M1: Test spin constants (SpinTimeoutMs, SpinStepMs,
SpinMaxIterations) extracted so the 200 x 10ms pattern has one
source of truth.
- M2: DefaultDrainBatchSize public const on LandblockStreamer so
the batch cap has a name and a comment explaining why 4.
- M3: Safety-argument comment on the sync-over-async
WaitToReadAsync call explaining why it cannot deadlock (dedicated
thread, no SyncContext).
- M6: XML remarks on the class and on DrainCompletions documenting
threading contract (Enqueue = any thread, Drain = single consumer
thread).
112 Core + 96 Core.Net tests green.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Background thread pulls load/unload jobs from an inbox channel, invokes
a caller-supplied Func<uint, LoadedLandblock?> (production wraps
LandblockLoader.Load, tests inject a fake), and posts results to an
outbox channel the render thread drains. Graceful shutdown via
CancellationToken; failed loads reported rather than retried.
4 new tests, all green.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
LandblockStreamJob (Load/Unload) and LandblockStreamResult
(Loaded/Failed/Unloaded) are the channel payload types the next
task's LandblockStreamer will use. Separate file because they're
shared between the worker thread and the render thread and deserve
a focused home.
Folds in two carryover nits from the Task 1 fix review:
- Stale "radius + 1" comments in StreamingRegionTests updated to
match the real radius+2 threshold (no numeric-assertion changes).
- Single-step recenter test now asserts Visible.Count == 25 and
Resident.Count == 30, locking in the Visible/Resident semantic
split behaviorally.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Review follow-up from commit 11df793. Three fixes:
1. Visible semantics: StreamingRegion.Visible now strictly describes the
current (2r+1)×(2r+1) window, not window + hysteresis retainees.
Added a parallel Resident property exposing the actual loaded set
(window + hysteresis buffer). This matters because StreamingController
(next task) reads these to decide what to render vs what to unload;
conflating them in one set would have forced awkward post-processing
downstream.
2. Doc/code disagreement: updated the RecenterTo and RegionDiff doc
comments from "radius + 1" to "radius + 2" to match the actual
implementation (which is what the tests require). Also updated the
plan doc so future readers don't hit the same contradiction.
3. Edge-clamping test coverage: added a single-axis edge test
(cx=0, cy=50 → 15 entries) and an ID-encoding test (radius=0 at
0x12,0x34 → 0x1234FFFE) so a swapped-shift bug in EncodeLandblockId
or an asymmetric off-by-one would fail a test instead of passing
silently.
9 tests green, full suite regressions-free.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Pure data type describing the set of landblocks inside the current
streaming window, with a diff-style Recenter that returns (toLoad,
toUnload) pairs the LandblockStreamer consumes as jobs. Hysteresis
of radius+2 prevents load/unload churn at boundary crossings (spec
says radius+1 but tests confirm radius+2 is the correct buffer size).
First piece of Phase A.1 per docs/superpowers/plans/2026-04-11-foundation-a1-streaming.md.
7 new tests, all green. Total suite: 105/105.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>