acdream/docs/research/2026-06-10-105-110-CLOSED-staged-texture-flush-drop.md
2026-06-10 12:17:09 +02:00

CLOSED — #105 white indoor walls × #110 near-plane correlation: one root cause

Date: 2026-06-10 (evening). Branch: claude/thirsty-goldberg-51bb9b. Commits: c787201 (#105 fix + apparatus) · d4b5c71 (#110 close + znear=0.1 re-land). Supersedes the plan in 2026-06-10-105-110-white-textures-nearplane-handoff.md. Its §3 anatomy and §4 "only credible link" both turned out exactly right; a static find short-circuited the staged plan before any stochastic repro was needed.

The root cause (#105)

TextureAtlasManager.AddTexture only stages texture content: pixel bytes go into a per-array PBO (ManagedGLTextureArray.UpdateLayerInternal) and an entry is appended to a _pendingUpdates list. The actual TexSubImage3D copies into the texture-array layers, and the mipmap regeneration, happen in ProcessDirtyUpdates(), which WB (WorldBuilder) drives once per frame from its render loop:

references/WorldBuilder/Chorizite.OpenGLSDLBackend/GameScene.cs:975 _meshManager?.GenerateMipmaps(); (immediately before the first opaque pass)

GameScene.cs is the host-loop file the N.4/O-T4 extraction replaced with our GameWindow — so the per-frame driver was silently dropped (git log -S shows GenerateMipmaps arrived with d16d8cd O-T4 and never gained a caller). The only remaining flush was the incidental one inside UpdateLayerInternal when a PBO must grow (it flushes pending updates before orphaning the buffer). Consequence: every layer staged after an array's last PBO growth kept undefined TexStorage3D content behind a perfectly valid, resident bindless sampler handle.
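The "post-last-growth tail" can be reproduced with a toy model. The sketch below is not the WorldBuilder code: the class, the slot-count capacity, and the doubling growth policy are all hypothetical stand-ins, chosen only to show how a flush that fires solely on PBO growth leaves every later layer unflushed.

```python
class StagedTextureArray:
    """Toy stand-in for a texture array with PBO-staged uploads (hypothetical model)."""

    def __init__(self, pbo_capacity):
        self.pbo_capacity = pbo_capacity  # staging slots before the PBO must grow (assumed unit)
        self.pending = []                 # layers staged but not yet copied into the texture
        self.uploaded = set()             # layers whose content actually reached the texture

    def flush(self):
        """Copy every pending layer into the texture (the ProcessDirtyUpdates role)."""
        self.uploaded.update(self.pending)
        self.pending.clear()

    def stage(self, layer):
        """Stage one layer; a full PBO flushes first, then grows (assumed doubling)."""
        if len(self.pending) == self.pbo_capacity:
            self.flush()                  # the incidental flush before the buffer is orphaned
            self.pbo_capacity *= 2
        self.pending.append(layer)


def never_flushed_tail(layer_count, pbo_capacity=4):
    """Return the layers still undefined when nothing calls flush() per frame."""
    array = StagedTextureArray(pbo_capacity)
    for layer in range(layer_count):
        array.stage(layer)
    return sorted(array.pending)


print(never_flushed_tail(10))  # layers staged after the single growth at layer 4
```

In this model the tail is never empty once anything has been staged, which matches the observation that the tail exists on every run and only its visibility varies.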

That one sentence explains every #105 observation:

| Observation | Explanation |
| --- | --- |
| Dat tripwires silent on every bad run | dat → decode → stage all delivered; the loss was the missing flush after staging |
| White/garbage surfaces, zh==0 | handle valid + resident; content undefined |
| Intermittent, per-run lottery | background decode-completion order shuffles which textures land in the post-last-growth tail |
| Persists the whole run at standstill | nothing grows a PBO at standstill ⇒ nothing flushes |
| Indoor walls only | only ObjectRenderBatch.BindlessTextureHandle consumers are affected = EnvCellRenderer cell shells; entities resolve per-frame via TextureCache (immediate TexImage2D), terrain via TerrainAtlas (immediate GenerateMipmap) |
| Struck on znear=1.0 builds too (2026-06-09 clean launch) | the tail exists on every run; visibility of it is luck |

The fix

WbMeshAdapter.Tick() now calls _meshManager.GenerateMipmaps() after the staged-upload drain. Tick() runs at the top of GameWindow OnRender, before all draw passes — the exact WB-equivalent position. One call; no retry loops, no back-patching machinery needed.
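The effect of the ordering (flush at the top of the frame, before any draw pass) can be illustrated with a minimal frame-loop model. Everything here is hypothetical: StagedAtlas and render_frames are illustration names, and the real Tick() does more than this sketch shows.

```python
class StagedAtlas:
    """Toy atlas: stage() defers the upload, flush() performs it (hypothetical model)."""

    def __init__(self):
        self.pending = []
        self.uploaded = set()

    def stage(self, layer):
        self.pending.append(layer)

    def flush(self):
        self.uploaded.update(self.pending)
        self.pending.clear()


def render_frames(frames, flush_each_frame):
    """Run a toy frame loop; return, per frame, how many sampled layers are undefined."""
    atlas = StagedAtlas()
    undefined_per_frame = []
    for staged_this_frame in frames:
        for layer in staged_this_frame:
            atlas.stage(layer)
        if flush_each_frame:
            atlas.flush()  # the per-frame tick, placed before all draw passes
        # The draw pass samples every layer staged so far; pending ones are undefined.
        undefined_per_frame.append(len(atlas.pending))
    return undefined_per_frame


frames = [[0, 1], [2], []]  # layers staged in frames 1, 2, 3 (frame 3 is a standstill)
print(render_frames(frames, flush_each_frame=False))
print(render_frames(frames, flush_each_frame=True))
```

Without the tick the undefined count only grows and then parks at standstill; with it, every frame draws with zero pending layers.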

Evidence chain (apparatus: ACDREAM_PROBE_TEXFLUSH=1, kept env-gated)

  • Pre-fix (texflush-prefix.log): pending updates climb 0→48→…→142, dip only at PBO-growth crossings (86→76, 87→68 — the incidental flush, live), then park at 126 across 34/34 atlas arrays forever (19 heartbeats at standstill). Deterministic, first run — the broken contract did not need a stochastic white-wall repro.
  • Post-fix (texflush-postfix.log): after=0 on every line — staged updates drain the same frame they are staged.
  • 0.1-arm verification (nearplane-reland-1.log, nearplane-reland-2.log — the arm that struck 2-of-3 on 2026-06-10): after=0 on all 45/39 tex-flush lines; 68,291 + 56,097 [shell] lines with zero zh>0 batches; all four dat tripwires silent; zero [wb-error].
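A check like "after=0 on every line" is easy to automate. The sketch below assumes only that each tex-flush probe line carries an after=<integer> field, as quoted above; the rest of the sample line layout is invented for illustration and may not match the real probe output.

```python
import re

# Assumption: the probe prints the post-flush pending count as "after=<integer>".
AFTER_FIELD = re.compile(r"\bafter=(\d+)")


def after_counts(lines):
    """Extract the after= value from every line that has one."""
    return [int(m.group(1)) for line in lines if (m := AFTER_FIELD.search(line))]


def flush_is_clean(lines):
    """True when at least one probe line exists and all of them report after=0."""
    counts = after_counts(lines)
    return bool(counts) and all(count == 0 for count in counts)


# Hypothetical sample lines; only the after= field is taken from the text above.
postfix_sample = ["[tex-flush] before=12 after=0", "[tex-flush] before=3 after=0"]
prefix_sample = ["[tex-flush] before=126 after=126", "unrelated log line"]

print(flush_is_clean(postfix_sample), flush_is_clean(prefix_sample))
```

Requiring at least one matching line guards against a silent pass when the probe was never enabled.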

#110 resolution

The near plane was mechanism-innocent, exactly as the handoff's "only credible link" predicted: znear=0.1 makes close-up geometry newly visible → more prepare/upload pressure indoors → a larger never-flushed tail → higher #105 strike probability. With the flush restored, retail Render::znear = 0.1 (decomp :342173, initializer :1101867) is re-landed on all four cameras (d4b5c71). This closes the §4 corner see-through: 0.1 is smaller than the 0.3 m camera-collision sphere, so a pressed wall no longer near-clips away.
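The "0.1 < 0.3 m" argument can be made slightly stricter, since the corners of the near-plane rectangle sit farther from the eye than znear itself. The sketch below is a worked check under loudly stated assumptions: 0.3 m is treated as the sphere's radius, and the 90° vertical FOV and 16:9 aspect ratio are placeholders, not values taken from the project.

```python
import math


def near_corner_distance(znear, fovy_deg, aspect):
    """Distance from the eye to a corner of the near-plane rectangle."""
    half_height = znear * math.tan(math.radians(fovy_deg) / 2)
    half_width = half_height * aspect
    return math.sqrt(znear**2 + half_height**2 + half_width**2)


def wall_can_clip(znear, collision_radius, fovy_deg=90.0, aspect=16 / 9):
    """Worst case: a wall tangent to the collision sphere, facing a near-plane corner.

    The wall can be cut by the near plane only if that corner reaches past the sphere.
    fovy_deg and aspect are assumed placeholder values.
    """
    return near_corner_distance(znear, fovy_deg, aspect) > collision_radius


for znear in (0.1, 1.0):
    corner = near_corner_distance(znear, 90.0, 16 / 9)
    print(f"znear={znear}: corner at {corner:.3f} m, can clip: {wall_can_clip(znear, 0.3)}")
```

Under these assumptions the znear=0.1 corner lies at roughly 0.227 m, inside the 0.3 m sphere, while any znear=1.0 near plane lies entirely outside it. A much wider FOV or aspect ratio would push the corner outward, so the margin is worth rechecking against the actual camera parameters.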

Pending user re-gate: (a) corner press — the wall must stay solid at the camera; (b) a distance scan for z-shimmer (none expected — retail ships 0.1 with D24); (c) general indoor texture watch over the next several launches.

Durable lesson

Memory: feedback_extraction_perframe_drivers.md — when extracting a library from a host app, the host loop's per-frame calls into the library are invisible contracts; grep the host's frame loop and re-wire every one. Staged/deferred APIs are the worst case: everything looks wired and works most of the time via incidental side-effect flushes.

Status of the old #105 exonerations (all stand)

Concurrent dat reads SAFE (hammer-verified); teardown AVs were dispose-during-read (fixed 8fadf77); probes don't cause white walls; membership/flood healthy. The four dat-side tripwires (7433b70) stay as permanent anomaly logging.

Next per the priority order: #107 indoor-login spawn wedge (ACDREAM_CAPTURE_RESOLVE apparatus ready) → #108 cellar grass-sweep + #109 far-door oscillation → #99/A6.P4 per-cell shadow architecture.