acdream/src/AcDream.App/Rendering/Wb/WbMeshAdapter.cs
Erik b838eccb38 feat(wb): ConsoleErrorLogger + cause report — H1 swallowed-exception confirmed
Phase 2 diagnostic chain identified the EXACT cause of 26/123 Holtburg
cells silently failing in WB's PrepareEnvCellMeshData:
ArgumentOutOfRangeException thrown from Setup.Unpack inside
DatReaderWriter when WB calls TryGet<Setup>(stab.Id, ...) on a stab id
whose prefix is GfxObj (0x01xxxxxx), not Setup (0x02xxxxxx).
DatReaderWriter finds the file in Portal's tree (GfxObjs and Setups
share tree-lookups), attempts to parse GfxObj bytes as Setup format,
throws OOR. Exception bubbles to PrepareMeshData's outer try/catch
which silently swallows + returns null. Entire cell fails to upload.

This commit lands the diagnostic infrastructure that surfaced the bug:

- WbMeshAdapter: replaced NullLogger<ObjectMeshManager> with a small
  Console-backed ConsoleErrorLogger<T> private class. Filters to
  LogLevel.Error+. WB's existing _logger.LogError(ex, ...) at the
  swallow site now writes [wb-error] lines with type + message + top 5
  stack frames. Bridges WB's intentional log point to acdream's console.
- WbMeshAdapter: extended [indoor-upload] NULL_RESULT probe with
  reader-divergence diagnostic (ourCellDb.TryGet, wbResolveId.Count,
  wbSelectedType, wbDbIsPortal, wbDbTryGet<EnvCell>, hadRenderData).
  Made it possible to rule out cache-hits and reader-divergence as
  causes before identifying the real one.
- Cause report at docs/research/2026-05-19-indoor-cell-rendering-cause.md
  documents the full chain: 55 ArgumentOutOfRangeException stack traces
  captured in one launch, all from PrepareEnvCellMeshData line 1223.

The fix itself (1-line guard at WB's TryGet<Setup> call site) is applied
to references/WorldBuilder/.../ObjectMeshManager.cs — which is a git
submodule. Will be committed separately to the WB submodule after
visual verification.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-19 13:00:18 +02:00

349 lines
16 KiB
C#
Raw Blame History

This file contains ambiguous Unicode characters

This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.

using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;
using AcDream.Core.Meshing;
using AcDream.Core.Rendering;
using Chorizite.OpenGLSDLBackend;
using Chorizite.OpenGLSDLBackend.Lib;
using DatReaderWriter;
using DatReaderWriter.DBObjs;
using Microsoft.Extensions.Logging;
using Microsoft.Extensions.Logging.Abstractions;
using Silk.NET.OpenGL;
using WorldBuilder.Shared.Models;
using WorldBuilder.Shared.Services;
namespace AcDream.App.Rendering.Wb;
/// <summary>
/// Single seam between acdream and WB's render pipeline. Owns the
/// <c>ObjectMeshManager</c> instance and exposes a stable acdream-shaped API
/// so the rest of the renderer doesn't need to know about WB's types directly.
///
/// <para>
/// The adapter constructs its own <c>DefaultDatReaderWriter</c> internally; it
/// does NOT share file handles with our <c>DatCollection</c>. This duplicates
/// index-cache memory (~50100 MB) but keeps the two subsystems fully decoupled.
/// Acceptable for Phase N.4 foundation work (plan Adjustment 1).
/// </para>
/// </summary>
public sealed class WbMeshAdapter : IDisposable, IWbMeshAdapter
{
private readonly OpenGLGraphicsDevice? _graphicsDevice;
private readonly DefaultDatReaderWriter? _wbDats;
private readonly ObjectMeshManager? _meshManager;
private readonly DatCollection? _dats;
private readonly AcSurfaceMetadataTable _metadataTable = new();
private readonly HashSet<ulong> _metadataPopulated = new();
/// <summary>
/// EnvCell ids we've requested via PrepareMeshDataAsync but not yet
/// seen completion for in Tick(). Used by the [indoor-upload] probe
/// to log requested + completed pairs. Cleared per completion;
/// missing completions after a few seconds indicate WB silently
/// returned null (hypothesis H1 in the design spec).
/// </summary>
private readonly HashSet<ulong> _pendingEnvCellRequests = new();
/// <summary>
/// True when this instance was created via <see cref="CreateUninitialized"/>;
/// all public methods no-op when uninitialized.
/// </summary>
private readonly bool _isUninitialized;
private bool _disposed;
/// <summary>
/// Constructs the full WB pipeline: OpenGLGraphicsDevice → DefaultDatReaderWriter
/// → ObjectMeshManager.
/// </summary>
/// <param name="gl">Active Silk.NET GL context. Must be bound to the current
/// thread (construction runs GL queries; call from OnLoad).</param>
/// <param name="datDir">Path to the dat directory (same as the one supplied
/// to our DatCollection). DefaultDatReaderWriter opens its own file handles.</param>
/// <param name="dats">acdream's DatCollection, used to populate the surface
/// metadata side-table via <c>GfxObjMesh.Build</c>. Shares file handles with
/// the rest of the client; read-only access from the render thread.</param>
/// <param name="logger">Logger for the adapter; ObjectMeshManager uses
/// NullLogger internally.</param>
public WbMeshAdapter(GL gl, string datDir, DatCollection dats, ILogger<WbMeshAdapter> logger)
{
ArgumentNullException.ThrowIfNull(gl);
ArgumentNullException.ThrowIfNull(datDir);
ArgumentNullException.ThrowIfNull(dats);
ArgumentNullException.ThrowIfNull(logger);
_dats = dats;
_graphicsDevice = new OpenGLGraphicsDevice(gl, logger, new DebugRenderSettings());
_wbDats = new DefaultDatReaderWriter(datDir);
// Phase 2 diagnostic — replace NullLogger with a Console-backed
// logger so WB's internal catch block at ObjectMeshManager.cs:589
// (and similar) surfaces its swallowed exceptions instead of
// dropping them. ConsoleErrorLogger filters to LogLevel.Error+
// so successful operations stay quiet.
_meshManager = new ObjectMeshManager(
_graphicsDevice,
_wbDats,
new ConsoleErrorLogger<ObjectMeshManager>());
}
/// <summary>
/// Minimal Console-backed logger that fires only on
/// <see cref="LogLevel.Error"/> and above. Format:
/// <code>[wb-error] &lt;message&gt;
/// [wb-error] &lt;ExceptionType&gt;: &lt;ExceptionMessage&gt;
/// [wb-error] at &lt;frame&gt; (up to 5 frames)</code>
/// Used to surface WB's silently-caught exceptions in
/// <c>ObjectMeshManager.PrepareMeshData</c>.
/// </summary>
private sealed class ConsoleErrorLogger<T> : ILogger<T>
{
public IDisposable BeginScope<TState>(TState state) where TState : notnull => NullScope.Instance;
public bool IsEnabled(LogLevel logLevel) => logLevel >= LogLevel.Error;
public void Log<TState>(
LogLevel logLevel, EventId eventId, TState state, Exception? exception,
Func<TState, Exception?, string> formatter)
{
if (!IsEnabled(logLevel)) return;
var message = formatter(state, exception);
Console.WriteLine($"[wb-error] {message}");
if (exception is not null)
{
Console.WriteLine($"[wb-error] {exception.GetType().Name}: {exception.Message}");
var stack = (exception.StackTrace ?? "")
.Split(new[] { '\r', '\n' }, StringSplitOptions.RemoveEmptyEntries)
.Take(5);
foreach (var s in stack) Console.WriteLine($"[wb-error] {s.Trim()}");
}
}
private sealed class NullScope : IDisposable
{
public static readonly NullScope Instance = new();
public void Dispose() { }
}
}
private WbMeshAdapter()
{
// Uninitialized constructor — only for tests / flag-off cases where
// the caller wants a Dispose-safe no-op instance.
_isUninitialized = true;
}
/// <summary>Test/init helper — produces a Dispose-safe instance with no
/// underlying mesh manager. Public methods are all no-ops.</summary>
public static WbMeshAdapter CreateUninitialized() => new();
/// <summary>
/// The surface metadata side-table populated on each first
/// <see cref="IncrementRefCount"/>. Queried by the draw dispatcher
/// to determine translucency, luminosity, and fog behavior per batch.
/// </summary>
public AcSurfaceMetadataTable MetadataTable => _metadataTable;
/// <summary>
/// Returns the WB render data for <paramref name="id"/>, or null if not
/// yet uploaded or if this adapter is uninitialized. Increments WB's
/// internal usage counter — use <see cref="TryGetRenderData"/> for
/// render-loop lookups that should not affect lifecycle.
/// </summary>
public ObjectRenderData? GetRenderData(ulong id)
{
if (_isUninitialized || _meshManager is null) return null;
return _meshManager.GetRenderData(id);
}
/// <summary>
/// Returns the WB render data for <paramref name="id"/> without
/// modifying reference counts. Returns null if the mesh is not yet
/// uploaded. Safe for render-loop lookups.
/// </summary>
public ObjectRenderData? TryGetRenderData(ulong id)
{
if (_isUninitialized || _meshManager is null) return null;
return _meshManager.TryGetRenderData(id);
}
/// <inheritdoc/>
public void IncrementRefCount(ulong id)
{
if (_isUninitialized || _meshManager is null) return;
_meshManager.IncrementRefCount(id);
if (_metadataPopulated.Add(id))
{
PopulateMetadata(id);
// WB's IncrementRefCount alone only bumps a usage counter; it does
// NOT trigger mesh loading. We must explicitly call PrepareMeshDataAsync
// so the background workers actually decode the GfxObj. The result
// auto-enqueues into _stagedMeshData (ObjectMeshManager line 510),
// which Tick() drains onto the GPU. Until that completes,
// TryGetRenderData(id) returns null and the dispatcher silently
// skips the entity — standard streaming flicker.
//
// isSetup: false — acdream's MeshRefs already carry expanded
// per-part GfxObj ids (0x01XXXXXX). WB's Setup-expansion path is
// unused.
var prepTask = _meshManager.PrepareMeshDataAsync(id, isSetup: false);
// [indoor-upload] requested probe — only for EnvCell ids.
if (RenderingDiagnostics.IsEnvCellId(id) && RenderingDiagnostics.ProbeIndoorUploadEnabled)
{
bool hadRenderDataAtRequest = _meshManager.HasRenderData(id);
_pendingEnvCellRequests.Add(id);
Console.WriteLine($"[indoor-upload] requested cellId=0x{id:X8} hadRenderData={hadRenderDataAtRequest}");
// Phase 2 — surface what WB's catch block silently swallows.
// ObjectMeshManager.PrepareMeshData has a try/catch at line 589
// that calls _logger.LogError(ex, ...) — but we construct
// ObjectMeshManager with NullLogger.Instance so the log is
// dropped. This continuation captures the same data scoped to
// EnvCell ids only. Runs on ThreadPool; non-blocking. Zero cost
// when the probe is off.
ulong cellId = id;
_ = prepTask.ContinueWith(t =>
{
if (t.IsFaulted && t.Exception is not null)
{
var ex = t.Exception.InnerException ?? t.Exception;
var stack = (ex.StackTrace ?? "").Split('\n')
.Take(3).Select(s => s.Trim()).Where(s => s.Length > 0);
Console.WriteLine(
$"[indoor-upload] FAILED cellId=0x{cellId:X8} " +
$"exception={ex.GetType().Name}: {ex.Message} " +
$"stack=[{string.Join(" | ", stack)}]");
}
else if (t.IsCompletedSuccessfully && t.Result is null)
{
// Phase 2 cause-narrowing: WB's PrepareMeshData can return
// null for several reasons (ResolveId empty / TryGet<EnvCell>
// failed / type Unknown). Cross-check against acdream's own
// DatCollection — if WE find the cell but WB doesn't, the
// divergence is between dat readers, not a missing record.
bool ourCellFound = false;
try
{
ourCellFound = _dats?.Cell.TryGet<DatReaderWriter.DBObjs.EnvCell>(
(uint)cellId, out _) ?? false;
}
catch { /* swallow — this is best-effort diagnostic */ }
int wbResolveCount = -1;
string wbSelectedType = "none";
bool wbDbTryGetEnvCell = false;
bool wbDbIsPortal = false;
try
{
var wbResolutions = _wbDats?.ResolveId((uint)cellId).ToList();
wbResolveCount = wbResolutions?.Count ?? -1;
if (wbResolutions is not null && wbResolutions.Count > 0)
{
var selected = wbResolutions
.OrderByDescending(r => r.Database == _wbDats!.Portal)
.First();
wbSelectedType = selected.Type.ToString();
wbDbIsPortal = selected.Database == _wbDats!.Portal;
try { wbDbTryGetEnvCell = selected.Database.TryGet<DatReaderWriter.DBObjs.EnvCell>((uint)cellId, out _); } catch {}
}
}
catch { /* swallow — best-effort */ }
Console.WriteLine(
$"[indoor-upload] NULL_RESULT cellId=0x{cellId:X8} " +
$"ourCellDb.TryGet={ourCellFound} " +
$"wbResolveId.Count={wbResolveCount} " +
$"wbSelectedType={wbSelectedType} " +
$"wbDbIsPortal={wbDbIsPortal} " +
$"wbDbTryGet<EnvCell>={wbDbTryGetEnvCell}");
}
}, TaskScheduler.Default);
}
}
}
/// <inheritdoc/>
public void DecrementRefCount(ulong id)
{
if (_isUninitialized || _meshManager is null) return;
_meshManager.DecrementRefCount(id);
}
/// <summary>
/// Per-frame drain of the WB pipeline's main-thread work queues. MUST be
/// called once per frame from the render thread. Without this, the staged
/// mesh data queue grows unbounded (memory leak) and queued GL actions
/// never execute.
///
/// <para>
/// Order matters: <c>ProcessGLQueue</c> runs first to apply any pending GL
/// state changes (e.g., texture uploads queued by background workers
/// during mesh prep). Then we drain staged mesh data, calling
/// <c>UploadMeshData</c> on each item to materialize the actual GL VAO /
/// VBO / IBO resources. After Tick, <c>GetRenderData</c> for any id
/// previously passed to <c>IncrementRefCount</c> may return non-null.
/// </para>
///
/// <para>
/// No-op when the adapter is uninitialized (e.g., flag is off and the
/// adapter was constructed via <c>CreateUninitialized</c>).
/// </para>
/// </summary>
public void Tick()
{
if (_isUninitialized) return;
if (_disposed) return;
_graphicsDevice!.ProcessGLQueue();
while (_meshManager!.StagedMeshData.TryDequeue(out var meshData))
{
// [indoor-upload] completed probe — check BEFORE upload so we
// see what WB actually produced (vertex counts, parts) before
// any post-upload mutation.
bool isPendingEnvCell = RenderingDiagnostics.ProbeIndoorUploadEnabled
&& _pendingEnvCellRequests.Remove(meshData.ObjectId);
var renderData = _meshManager.UploadMeshData(meshData);
if (isPendingEnvCell)
{
int parts = meshData.SetupParts?.Count ?? 0;
bool hasGeom = meshData.EnvCellGeometry is not null;
int cellGeomVerts = meshData.EnvCellGeometry?.Vertices?.Length ?? 0;
bool uploadOk = renderData is not null;
Console.WriteLine(
$"[indoor-upload] completed cellId=0x{meshData.ObjectId:X8} " +
$"isSetup={meshData.IsSetup} parts={parts} " +
$"hasEnvCellGeom={hasGeom} cellGeomVerts={cellGeomVerts} " +
$"uploadOk={uploadOk}");
}
}
}
private void PopulateMetadata(ulong id)
{
if (_dats is null) return;
if (!_dats.Portal.TryGet<GfxObj>((uint)id, out var gfxObj)) return;
var subMeshes = GfxObjMesh.Build(gfxObj, _dats);
for (int i = 0; i < subMeshes.Count; i++)
{
var sm = subMeshes[i];
_metadataTable.Add(id, i, new AcSurfaceMetadata(
sm.Translucency, sm.Luminosity, sm.Diffuse,
sm.SurfOpacity, sm.NeedsUvRepeat, sm.DisableFog));
}
}
/// <inheritdoc/>
public void Dispose()
{
if (_disposed) return;
_disposed = true;
_meshManager?.Dispose();
_wbDats?.Dispose();
_graphicsDevice?.Dispose();
}
}