Three components:
1. WbMeshAdapter wraps the PrepareMeshDataAsync task with a continuation
that surfaces faulted-task exceptions + null-result cases for EnvCell
IDs only (gated by ProbeIndoorUploadEnabled). Two new log shapes:
[indoor-upload] FAILED cellId=0x... exception=<TypeName>: <Message>
stack=[<top 3 frames>]
[indoor-upload] NULL_RESULT cellId=0x...
2. Capture procedure: re-launch at Holtburg with the probe on, grep for
FAILED/NULL_RESULT lines, get definitive per-cell cause for the 26
missing-completion cells from Phase 1's capture.
3. Targeted fix: code change matching whichever exception type / null
pattern dominates. Fix shape is data-driven — see the contingency
table in the spec.
WB's catch at ObjectMeshManager.cs:589 already calls _logger.LogError,
but WbMeshAdapter constructs the manager with NullLogger.Instance, so
the log is dropped. Our continuation surfaces the same data scoped to
EnvCells only (avoids the thousands of GfxObj/Setup log lines a real
logger would emit during landblock streaming).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Indoor Cell Rendering Fix — Phase 2 Design
Status: Brainstormed 2026-05-19. Awaiting user review.
Scope: Surface the silent failure in WB's PrepareEnvCellMeshData for 26/123 Holtburg cells, then implement the targeted fix.
Predecessor: Phase 1 (docs/superpowers/specs/2026-05-19-indoor-cell-rendering-fix-design.md) shipped the five [indoor-*] probes that confirmed hypothesis H1.
Capture evidence: docs/research/2026-05-19-indoor-cell-rendering-probe-capture.md.
1. What we know
Phase 1's ACDREAM_PROBE_INDOOR_ALL=1 capture at Holtburg 0xA9B4 proved:
- 123 EnvCells requested via WbMeshAdapter.IncrementRefCount → only 97 complete.
- 26 cells silently fail. They get [indoor-upload] requested but never [indoor-upload] completed.
- The dispatcher then tries to draw them, TryGetRenderData returns null, draw is silently skipped → user sees missing floor.
- The first interior cell 0xA9B40100 (likely the inn entry or another major building anchor) is among the 26.
The smoking gun is in WB's ObjectMeshManager.PrepareMeshData:
    catch (Exception ex) {
        _logger.LogError(ex, "Error preparing mesh data for 0x{Id:X16}", id);
        return null;
    }
WB logs the exception via its injected _logger. But WbMeshAdapter.cs:71 constructs ObjectMeshManager with NullLogger<ObjectMeshManager>.Instance — so the log goes to /dev/null. The exception type and message are lost.
2. Solution — three components
Component 1 — Exception-surfacing wrap
Capture the Task<ObjectMeshData?> returned by _meshManager.PrepareMeshDataAsync(id, isSetup: false) and attach a continuation that, for EnvCell IDs only, logs the failure cause.
Three logged outcomes:
- Task faulted → [indoor-upload] FAILED cellId=0x... exception=<TypeName>: <Message> stack=[<top 3 frames>]. Unwrap AggregateException.InnerException for cleaner output.
- Task succeeded with null result → [indoor-upload] NULL_RESULT cellId=0x.... WB's deliberate null-return path (e.g., ResolveId returned empty, type was Unknown).
- Task succeeded with non-null result → no extra log. The existing Tick() drain already emits [indoor-upload] completed.
The continuation:
- Runs on TaskScheduler.Default (ThreadPool) so it doesn't block the render thread.
- Only attached for EnvCell IDs (gated by RenderingDiagnostics.IsEnvCellId(id)) when ProbeIndoorUploadEnabled is true, so it costs nothing when the probe is off.
- Captures cellId (a ulong value) only; no instance closure leakage.
- Truncates stack trace to top 3 frames.
Concrete code shape:
    if (_metadataPopulated.Add(id))
    {
        PopulateMetadata(id);
        var prepTask = _meshManager.PrepareMeshDataAsync(id, isSetup: false);
        if (RenderingDiagnostics.IsEnvCellId(id) && RenderingDiagnostics.ProbeIndoorUploadEnabled)
        {
            _pendingEnvCellRequests.Add(id);
            Console.WriteLine($"[indoor-upload] requested cellId=0x{id:X8}");
            // Copy to a local so the lambda captures only this value, not the adapter.
            ulong cellId = id;
            _ = prepTask.ContinueWith(t =>
            {
                if (t.IsFaulted && t.Exception is not null)
                {
                    // Unwrap the AggregateException to get at the real cause.
                    var ex = t.Exception.InnerException ?? t.Exception;
                    // Keep only the top 3 stack frames.
                    var stack = (ex.StackTrace ?? "").Split('\n')
                        .Take(3).Select(s => s.Trim()).Where(s => s.Length > 0);
                    Console.WriteLine(
                        $"[indoor-upload] FAILED cellId=0x{cellId:X8} " +
                        $"exception={ex.GetType().Name}: {ex.Message} " +
                        $"stack=[{string.Join(" | ", stack)}]");
                }
                // Result is read only after confirming a clean success state.
                else if (t.IsCompletedSuccessfully && t.Result is null)
                {
                    Console.WriteLine($"[indoor-upload] NULL_RESULT cellId=0x{cellId:X8}");
                }
            }, TaskScheduler.Default);
        }
    }
using System.Linq; and using System.Threading.Tasks; may need adding (likely already present).
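The three outcomes can be exercised in isolation, without WB. The following is a minimal sketch, not the adapter code: `Classify` and `PrepareSwallowingAsync` are hypothetical helpers introduced for illustration, and `Task<object?>` stands in for `Task<ObjectMeshData?>`.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

// Hypothetical helper, not part of WbMeshAdapter: returns the line the
// continuation would log for a finished task, or null when it logs nothing.
static string? Classify(Task<object?> t, ulong cellId)
{
    if (t.IsFaulted && t.Exception is not null)
    {
        var ex = t.Exception.InnerException ?? t.Exception;
        var stack = (ex.StackTrace ?? "").Split('\n')
            .Take(3).Select(s => s.Trim()).Where(s => s.Length > 0);
        return $"[indoor-upload] FAILED cellId=0x{cellId:X8} " +
               $"exception={ex.GetType().Name}: {ex.Message} " +
               $"stack=[{string.Join(" | ", stack)}]";
    }
    if (t.IsCompletedSuccessfully && t.Result is null)
        return $"[indoor-upload] NULL_RESULT cellId=0x{cellId:X8}";
    return null;
}

// Hypothetical stand-in for a prep method that catches its own exception
// and returns null instead of letting the task fault.
static async Task<object?> PrepareSwallowingAsync()
{
    await Task.Yield();
    try { throw new InvalidOperationException("boom"); }
    catch (Exception) { return null; }
}

const ulong holtburgCell = 0xA9B40100;

var faulted = Task.FromException<object?>(new KeyNotFoundException("surface 0x1234"));
var nullResult = Task.FromResult<object?>(null);
var ok = Task.FromResult<object?>(new object());
var swallowed = PrepareSwallowingAsync();

// Same wiring as the adapter: the continuation runs on the thread pool scheduler.
string? faultedLine = faulted.ContinueWith(t => Classify(t, holtburgCell), TaskScheduler.Default).Result;
string? nullLine = nullResult.ContinueWith(t => Classify(t, holtburgCell), TaskScheduler.Default).Result;
string? okLine = ok.ContinueWith(t => Classify(t, holtburgCell), TaskScheduler.Default).Result;
string? swallowedLine = swallowed.ContinueWith(t => Classify(t, holtburgCell), TaskScheduler.Default).Result;

Console.WriteLine(faultedLine);
Console.WriteLine(nullLine);
Console.WriteLine(okLine ?? "(no extra log)");
Console.WriteLine(swallowedLine);
```

One consequence worth checking during capture: an exception that is caught inside the task body and converted to `return null`, as in the catch block quoted in section 1, reaches the continuation as a null result rather than a fault. In the sketch, `swallowed` produces a NULL_RESULT line, not a FAILED line.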
Component 2 — Capture procedure
Standard launch:
$env:ACDREAM_PROBE_INDOOR_UPLOAD = "1"
dotnet run --project src\AcDream.App\AcDream.App.csproj --no-build -c Debug 2>&1 | Tee-Object -FilePath launch.log
Walk into Holtburg Inn, walk into nearby buildings whose cells were on the missing-26 list (0xA9B40100, 0xA9B40111, etc.). Close gracefully.
Analyze:
Get-Content launch.log |
Where-Object { $_ -match '\[indoor-upload\] (FAILED|NULL_RESULT)' } |
Select-Object -Unique
Expected output: a per-cell list of distinct exception types or null-return signals. Most cells likely share 1–3 root causes.
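To see which cause dominates, the captured lines can also be grouped by exception type. A minimal sketch, assuming the two log shapes from Component 1; the sample lines, the `completed` line format, and the exception messages are made up for illustration, and a real run would read launch.log instead.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical sample lines; in practice use File.ReadLines("launch.log").
string[] log =
{
    "[indoor-upload] requested cellId=0xA9B40100",
    "[indoor-upload] FAILED cellId=0xA9B40100 exception=KeyNotFoundException: surface missing stack=[at A | at B]",
    "[indoor-upload] FAILED cellId=0xA9B40111 exception=KeyNotFoundException: surface missing stack=[at A | at B]",
    "[indoor-upload] NULL_RESULT cellId=0xA9B40120",
    "[indoor-upload] completed cellId=0xA9B40101",
};

// Matches only FAILED / NULL_RESULT lines; the exception type is optional
// because NULL_RESULT lines do not carry one.
var pattern = new Regex(
    @"\[indoor-upload\] (?<kind>FAILED|NULL_RESULT) cellId=(?<cell>0x[0-9A-Fa-f]+)(?: exception=(?<type>[^:]+):)?");

// Map each cause (exception type name, or NULL_RESULT) to its distinct cell IDs.
Dictionary<string, List<string>> byCause = log
    .Select(line => pattern.Match(line))
    .Where(m => m.Success)
    .GroupBy(
        m => m.Groups["kind"].Value == "NULL_RESULT"
            ? "NULL_RESULT"
            : (m.Groups["type"].Success ? m.Groups["type"].Value : "FAILED (unparsed)"),
        m => m.Groups["cell"].Value)
    .ToDictionary(g => g.Key, g => g.Distinct().OrderBy(c => c).ToList());

// Print the most common cause first.
foreach (var (cause, cells) in byCause.OrderByDescending(kv => kv.Value.Count))
    Console.WriteLine($"{cause}: {cells.Count} cell(s) [{string.Join(", ", cells)}]");
```

Grouping by type name alone is a deliberate simplification: two failures with the same exception type but different messages land in the same bucket, so the messages still need a look before choosing a fix.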
Component 3 — Targeted fix (shape unknown until Component 2 captures)
Once Component 2 reveals the exception type + message, the fix is one localized code change. Likely shapes:
| Captured cause | Fix shape |
|---|---|
| Texture decode exception (e.g. KeyNotFoundException on surface ID) | Guard at WbMeshAdapter.PopulateMetadata or pre-validate surfaces; possibly patch WB fork. |
| KeyNotFoundException for missing Environment / CellStruct | Log + skip cell with a sentinel render-data; report which dat is stale. |
| NullReferenceException in PrepareCellStructMeshData | Add null guard at the specific call site. |
| WB internal logic bug | Fork patch to WB. |
| NULL_RESULT (ResolveId returned empty / type was Unknown) | Investigate dat file integrity; possibly user needs a dat update. |
The fix is one or two code edits, lands as a single commit, and is followed by a re-launch verifying:
- [indoor-upload] FAILED/NULL_RESULT lines disappear for the previously-failing cells.
- [indoor-upload] completed appears for those cells.
- Visual verification: floor renders in Holtburg Inn.
3. Edge cases
| Scenario | Behavior |
|---|---|
| Probe toggled off mid-session | Continuation still emits if attached at request time. Acceptable — capturing the cause once matters more than honoring runtime toggle. |
| Continuation fires after adapter disposed | Harmless console write on dying process. No memory leak; closure captures only the ulong cellId. |
| Same cell requested twice | _metadataPopulated.Add(id) guards; continuation attaches exactly once. Re-streaming after Remove+Add keeps the sticky set. First failure is what we want. |
| Cancellation | t.IsCanceled is neither IsFaulted nor IsCompletedSuccessfully. Continuation silently skips. Acceptable — cancellation isn't a failure cause. |
| Task.Result on faulted task | Re-throws AggregateException. Our gate else if (t.IsCompletedSuccessfully && t.Result is null) ensures we never read Result without a clean success state. |
| WB's _logger.LogError for the same exception | WbMeshAdapter passes NullLogger, so WB's log goes nowhere. Our continuation is what surfaces it. Discussed below. |
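
The Cancellation and Task.Result rows rest on standard Task state semantics, which can be confirmed with a few lines. A minimal sketch using pre-built tasks; `object?` stands in for `ObjectMeshData?`.

```csharp
#nullable enable
using System;
using System.Threading;
using System.Threading.Tasks;

// Pre-built tasks in the two states discussed in the table.
var canceled = Task.FromCanceled<object?>(new CancellationToken(canceled: true));
var faulted = Task.FromException<object?>(new InvalidOperationException("boom"));

// A canceled task satisfies neither branch of the continuation,
// so the continuation logs nothing for it.
bool canceledSkipped =
    canceled.IsCanceled && !canceled.IsFaulted && !canceled.IsCompletedSuccessfully;

// Reading Result on a faulted task throws an AggregateException that wraps
// the original exception, which is why the continuation checks
// IsCompletedSuccessfully before touching Result.
bool resultThrows = false;
try { _ = faulted.Result; }
catch (AggregateException ae) when (ae.InnerException is InvalidOperationException)
{
    resultThrows = true;
}

Console.WriteLine($"canceled skipped: {canceledSkipped}, Result throws: {resultThrows}");
// prints "canceled skipped: True, Result throws: True"
```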
Why not just inject a real logger into ObjectMeshManager? We could replace NullLogger<ObjectMeshManager>.Instance with a real logger that writes to the console. Tradeoff:
- Real logger: simpler, leverages WB's existing _logger.LogError call → catches GfxObj + Setup + EnvCell failures.
- Our continuation: scoped to EnvCell IDs only → less noise.
Going with the continuation approach because:
- The probe flag is already in place.
- Phase 2 is targeted at EnvCells.
- A real logger would emit thousands of GfxObj/Setup log lines during landblock streaming, drowning the EnvCell signal.
We can revisit if a future debugging session calls for broader visibility.
4. Testing strategy
Unit tests
None for Component 1 — the continuation is straight wiring around an async API; the logic is "if faulted, log; if null result, log." Testing requires either mocking Task<ObjectMeshData?> (low value) or running a real WB instance (impractical in unit tests).
Visual verification (end-to-end)
Component 2's capture procedure is the verification mechanism:
- Build green.
- Launch with probe flag on, walk into Holtburg.
- Confirm [indoor-upload] FAILED or NULL_RESULT lines appear for ~26 cells.
- Apply Component 3's fix.
- Re-launch, re-walk Holtburg.
- Acceptance: previously-failing cells now produce [indoor-upload] completed lines AND the user can see the floor in Holtburg Inn.
5. What's NOT in this phase
- Tightening IsEnvCellId false-positives (flagged in Phase 1 capture note). Deferred; it doesn't block Phase 2 since the upload probe gates on the correct path.
- Cell collision symptoms (no wall collision when exiting, weird open-air collisions). Separate investigation phase.
- Stab-leak-through-walls (Phase 1 Task 3). Deferred.
- Broader WB logger injection for GfxObj/Setup failures. Open if we ever want broader diagnostic visibility.
6. Acceptance criteria
- WbMeshAdapter.IncrementRefCount captures the prep task and attaches a continuation for EnvCell IDs.
- Continuation logs [indoor-upload] FAILED cellId=0x... exception=<TypeName>: <Message> stack=[...] for faulted tasks.
- Continuation logs [indoor-upload] NULL_RESULT cellId=0x... for clean-null returns.
- dotnet build clean.
- dotnet test clean (no new failures; pre-existing 8 physics/input failures unchanged).
- Capture launched, FAILED/NULL_RESULT lines appear for the previously-missing cells, distinct causes identified.
- Component 3 fix designed and implemented for each distinct cause.
- Re-capture confirms [indoor-upload] completed appears for cells previously missing.
- Visual verification: floor renders in Holtburg Inn.
- Roadmap updated with Phase 2 shipped.
- Commit messages cite the captured exception types + the fix rationale.