Brainstormed design for the first slice of Phase N.6 (perf polish).
Slice 1 ships two commits: (1) fix the GPU timing query double-buffering
in WbDrawDispatcher (cross-vendor ring of 3, read-before-overwrite),
(2) add an env-gated surface-format histogram dump + capture the
radius=12 perf baseline at Holtburg. Slice 2 (TextureCache cleanup +
shader migration + optional persistent-mapped buffers) is deferred
until after C.1.5 (PES emitter wiring), with the next-phase decision
to be made on the baseline numbers slice 1 produces.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>