feat(render): Phase U.3 — GPU clip-plane gate (gl_ClipDistance), no-clip default
Adds the GPU mechanism to clip drawing to a per-cell screen-space convex
region via gl_ClipDistance, consumed by the mesh + terrain vertex shaders.
This is the MECHANISM only — every instance defaults to slot 0 (no-clip /
pass-all) and terrain to count 0, so the running game renders IDENTICALLY to
pre-U.3 (verified: offline launch compiles both shaders and reaches steady
state; no GL errors). U.4 populates real clip data from portal visibility.
Binding contract (define once, both sides obey):
- mesh_modern.vert: SSBO binding=2 CellClip[] (shared per-frame regions, slot 0
reserved no-clip) + SSBO binding=3 uint[] per-instance slot, indexed by the
IDENTICAL gl_BaseInstanceARB+gl_InstanceID used for binding=0. binding=0/1
untouched.
- terrain_modern.vert: UBO binding=2 TerrainClip { int count; vec4 planes[8]; }
for the single OutsideView region (UBO namespace; SceneLighting is UBO
binding=1, so binding=2 is free and does not collide with the mesh SSBO
binding=2). count 0 = ungated.
- Both redeclare out gl_PerVertex { vec4 gl_Position; float gl_ClipDistance[8]; }
and set unused planes (i >= count) to +1.0 so they pass everything.
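
As a rough CPU-side illustration of the pass-all rule above (not the shader source, which is not shown in this commit page), the sketch below mirrors what each vertex shader is described as doing: active planes get dot(plane, gl_Position), unused planes get +1.0. The class and method names are hypothetical, and the per-vertex "inside" check is a simplification, since the GPU clips primitives against interpolated distances rather than rejecting whole vertices.

```csharp
using System;
using System.Numerics;

public static class ClipDistanceSketch
{
    public const int MaxPlanes = 8;

    // Hypothetical CPU mirror of the per-vertex loop: planes below `count`
    // produce dot(plane, clipPos); every unused plane is set to +1.0 so it
    // can never reject anything.
    public static float[] Compute(ReadOnlySpan<Vector4> planes, int count, Vector4 clipPos)
    {
        int active = Math.Min(Math.Min(count, planes.Length), MaxPlanes);
        var distances = new float[MaxPlanes];
        for (int i = 0; i < MaxPlanes; i++)
            distances[i] = i < active ? Vector4.Dot(planes[i], clipPos) : 1.0f;
        return distances;
    }

    // Simplified test: a vertex counts as inside when no distance is negative.
    public static bool IsInside(float[] distances)
    {
        foreach (float d in distances)
            if (d < 0f) return false;
        return true;
    }
}
```

With count 0 all eight distances are +1.0, which is why slot 0 and the terrain default leave rendering unchanged.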
CellClip std430 layout (144 bytes/slot): count@0, 3 pad uints@4/8/12,
planes[8]@16 (vec4 stride 16). Terrain UBO std140: count@0 (padded to 16),
planes[8]@16 → 144 bytes. Verified by ClipFrameLayoutTests (8 new tests).
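
The offsets above can be checked with a small standalone packer. This is a sketch only: it uses System.Buffers.Binary.BinaryPrimitives instead of the hand-rolled shift helpers in ClipFrame.cs, and the class name CellClipLayoutSketch is made up for illustration.

```csharp
using System;
using System.Buffers.Binary;
using System.Numerics;

public static class CellClipLayoutSketch
{
    public const int MaxPlanes = 8;
    public const int PlanesOffset = 16;                           // count + 3 pad uints
    public const int StrideBytes = PlanesOffset + MaxPlanes * 16; // 144

    // Packs one slot: count at byte 0, pad uints left zero,
    // planes[i] at byte 16 + i * 16 (little-endian floats).
    public static byte[] PackSlot(ReadOnlySpan<Vector4> planes)
    {
        int count = Math.Min(planes.Length, MaxPlanes);
        var bytes = new byte[StrideBytes];
        BinaryPrimitives.WriteUInt32LittleEndian(bytes.AsSpan(0, 4), (uint)count);
        for (int i = 0; i < count; i++)
        {
            int po = PlanesOffset + i * 16;
            BinaryPrimitives.WriteSingleLittleEndian(bytes.AsSpan(po + 0, 4), planes[i].X);
            BinaryPrimitives.WriteSingleLittleEndian(bytes.AsSpan(po + 4, 4), planes[i].Y);
            BinaryPrimitives.WriteSingleLittleEndian(bytes.AsSpan(po + 8, 4), planes[i].Z);
            BinaryPrimitives.WriteSingleLittleEndian(bytes.AsSpan(po + 12, 4), planes[i].W);
        }
        return bytes;
    }
}
```

The terrain UBO happens to use the same offsets (std140 pads the int count to 16 bytes), so the same arithmetic gives 144 bytes there as well.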
Pieces:
- ClipFrame: per-frame container + uploader for the SHARED clip data (binding=2
SSBO + terrain UBO). NoClip() = slot 0 + terrain count 0. AppendSlot /
SetTerrainClip pack std430/std140 bytes for U.4. UploadShared binds both.
- WbDrawDispatcher + EnvCellRenderer: each owns its binding=3 zero buffer
(all-zeros sized to its instance count → slot 0), re-binds binding=2 from the
shared ClipFrame id (or an internal no-clip fallback if unwired) before MDI.
gl_ClipDistance is per-vertex, so the single glMultiDrawElementsIndirect per
group is preserved — no draw splitting.
- TerrainModernRenderer: binds the terrain clip UBO (shared or no-clip fallback)
before its draw.
- GameWindow: glEnable(GL_CLIP_DISTANCE0..7) once at init (unused planes pass-all
so always-on avoids per-draw thrash); per frame builds ClipFrame.NoClip(),
UploadShared, and hands the buffer ids to the three renderers (tiny diff; U.4
swaps NoClip() for the real portal-visibility frame).
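
To make the slot indirection concrete, here is a minimal CPU-side sketch of the lookup the mesh shader is described as performing: instance index to slot (binding=3), then slot to region (binding=2). All names are hypothetical, and the region array is reduced to its plane counts for brevity.

```csharp
using System;

public static class SlotLookupSketch
{
    // U.3 default: an all-zeros slot buffer sized to the instance count,
    // so every instance points at the reserved no-clip slot 0.
    public static uint[] ZeroSlots(int instanceCount) => new uint[instanceCount];

    // Hypothetical mirror of the shader-side indirection. The index is the
    // same baseInstance + instanceId sum used for the instance buffer.
    public static uint ResolvePlaneCount(uint[] slotPerInstance, uint[] regionCounts,
                                         int baseInstance, int instanceId)
    {
        uint slot = slotPerInstance[baseInstance + instanceId];
        return regionCounts[slot];
    }
}
```

With the zero buffer and a region array holding only slot 0 (count 0), every instance resolves to zero active planes, which matches the "renders identically to pre-U.3" claim.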
Gate: dotnet build green; App suite 134/134; offline launch confirms both
shaders compile + link with no GL errors.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
parent 0b125830fe
commit bf2e559369
8 changed files with 797 additions and 1 deletions
276  src/AcDream.App/Rendering/ClipFrame.cs  Normal file
@@ -0,0 +1,276 @@
// ClipFrame.cs
//
// Phase U.3: the GPU-side container + uploader for the SHARED per-frame clip
// data consumed by mesh_modern.vert (SSBO binding=2) and terrain_modern.vert
// (UBO binding=2). This is the "shared" half of the U.3 clip mechanism; the
// per-instance slot index buffer (SSBO binding=3) is PER-RENDERER and owned by
// each renderer (WbDrawDispatcher / EnvCellRenderer), parallel to its instance
// buffer — it is NOT here.
//
// === The contract (both shader sides obey) ===================================
// binding=2 mesh SSBO holds an array of CellClip, one per "slot":
// struct CellClip { uint count; uint _p0; uint _p1; uint _p2; vec4 planes[8]; };
// std430 layout: count at byte 0, three pad uints at 4/8/12, planes[8] at 16
// (vec4 stride 16) → 144 bytes per slot. Slot 0 is RESERVED = no-clip (count 0).
// binding=2 terrain UBO holds the single OutsideView region:
// layout(std140) { int uTerrainClipCount; vec4 uTerrainClipPlanes[8]; };
// std140 layout: count at byte 0 (padded to 16), planes[8] at 16 → 144 bytes.
//
// In U.3 a ClipFrame is built via NoClip(): one slot (slot 0, count 0) and a
// terrain count of 0. Everything renders exactly as before. U.4 populates real
// slots from a PortalVisibilityFrame (one CellClip per visible cell) and sets the
// terrain OutsideView planes, then points each renderer's per-instance slot
// buffer at the right slots.
//
// Pure CPU byte-packing + a thin GL upload. NO GL types appear except in
// UploadShared. The byte layout is asserted by ClipFrameLayoutTests so a silent
// std430/std140 drift can't reach the GPU.
using System;
using System.Collections.Generic;
using System.Numerics;
using Silk.NET.OpenGL;

namespace AcDream.App.Rendering;

/// <summary>
/// Per-frame container + uploader for the SHARED clip data: the binding=2 mesh
/// SSBO (one <c>CellClip</c> per slot, slot 0 reserved no-clip) and the binding=2
/// terrain UBO (the single OutsideView region). See the file header for the exact
/// std430 / std140 byte layout. Per-instance slot buffers (binding=3) are owned by
/// each renderer, not here.
/// </summary>
public sealed class ClipFrame : IDisposable
{
    // ---- Layout constants (mirror mesh_modern.vert + terrain_modern.vert) ----

    /// <summary>Max planes per clip region — matches the shader's <c>planes[8]</c>
    /// and GL's guaranteed <c>GL_MAX_CLIP_DISTANCES >= 8</c>.</summary>
    public const int MaxPlanes = 8;

    /// <summary>std430 stride of one <c>CellClip</c>: 16 (count + 3 pad uints) +
    /// 8 × 16 (vec4 planes) = 144 bytes.</summary>
    public const int CellClipStrideBytes = 16 + MaxPlanes * 16; // 144

    /// <summary>Byte offset of <c>planes[0]</c> within a <c>CellClip</c> (after the
    /// count + 3 pad uints).</summary>
    public const int CellClipPlanesOffset = 16;

    /// <summary>std140 size of the terrain UBO block: int count padded to 16, then
    /// 8 × 16 (vec4 planes) = 144 bytes. Same number as the SSBO stride by
    /// coincidence of the 16-byte vec4 rule, but a DIFFERENT layout family.</summary>
    public const int TerrainUboBytes = 16 + MaxPlanes * 16; // 144

    /// <summary>SSBO binding index for the shared per-cell clip regions
    /// (mesh_modern.vert binding=2).</summary>
    public const uint MeshClipSsboBinding = 2;

    /// <summary>UBO binding index for the terrain OutsideView clip region
    /// (terrain_modern.vert binding=2). UBO namespace — distinct from the SSBO
    /// binding=2 above.</summary>
    public const uint TerrainClipUboBinding = 2;

    // ---- CPU-side state ------------------------------------------------------

    // Packed std430 bytes for clipRegions[]. Always holds at least slot 0.
    private byte[] _regionBytes;
    private int _slotCount;

    // Packed std140 bytes for the terrain UBO (always TerrainUboBytes long).
    private readonly byte[] _terrainBytes = new byte[TerrainUboBytes];

    // ---- GL-side state (lazily created on first UploadShared) ----------------

    private uint _regionSsbo;
    private uint _terrainUbo;
    private bool _glInitialized;
    private bool _disposed;

    private ClipFrame(byte[] regionBytes, int slotCount)
    {
        _regionBytes = regionBytes;
        _slotCount = slotCount;
        // Terrain defaults to count 0 (ungated). _terrainBytes is already all
        // zeros, which encodes count=0 + zeroed (unused) planes.
    }

    /// <summary>
    /// The U.3 default frame: exactly slot 0 (no-clip, count 0) and a terrain
    /// count of 0. The whole scene renders ungated — identical to pre-U.3. U.4
    /// replaces this with a frame built from real portal visibility.
    /// </summary>
    public static ClipFrame NoClip()
    {
        // One slot, all zeros: count=0 ⇒ shader passes every plane.
        var bytes = new byte[CellClipStrideBytes];
        return new ClipFrame(bytes, slotCount: 1);
    }

    /// <summary>Number of clip slots currently packed (always >= 1 — slot 0 is
    /// the reserved no-clip slot).</summary>
    public int SlotCount => _slotCount;

    /// <summary>The shared mesh-clip SSBO id, or 0 before the first
    /// <see cref="UploadShared"/>. Renderers may bind this directly if they don't
    /// receive it via a parameter; <see cref="UploadShared"/> already binds it to
    /// <see cref="MeshClipSsboBinding"/>.</summary>
    public uint RegionSsbo => _regionSsbo;

    /// <summary>The terrain-clip UBO id, or 0 before the first
    /// <see cref="UploadShared"/>. Handed to <see cref="TerrainModernRenderer"/>
    /// so it can re-bind binding=2 (UBO namespace) before its draw.</summary>
    public uint TerrainUbo => _terrainUbo;

    /// <summary>
    /// Append one clip region (becomes the next slot index) from a
    /// <see cref="ClipPlaneSet"/>. Only the convex-plane case is supported in
    /// U.3 — <c>Count > 0</c> packs that many planes; <c>Count == 0</c> packs a
    /// no-clip region (pass-all). The scissor / nothing-visible fallbacks that
    /// <see cref="ClipPlaneSet"/> can carry are deferred to U.4 (which will draw
    /// the AABB box or skip the cell on the CPU side, not via this slot). Returns
    /// the new slot's index.
    /// </summary>
    public int AppendSlot(ClipPlaneSet set)
    {
        int count = Math.Min(set.Count, MaxPlanes);
        if (count == 0)
            return AppendSlot(ReadOnlySpan<Vector4>.Empty);

        Span<Vector4> planes = stackalloc Vector4[count];
        for (int i = 0; i < count; i++)
            planes[i] = set.Planes[i];
        return AppendSlot(planes);
    }

    /// <summary>
    /// Append one clip region from a raw plane list. <paramref name="planes"/>
    /// length 0 packs a no-clip (pass-all) region; otherwise up to
    /// <see cref="MaxPlanes"/> planes are packed (extras ignored). Each plane is
    /// <c>(nx, ny, 0, dw)</c> in clip space; a clip-space vertex is inside iff
    /// <c>dot(plane, gl_Position) >= 0</c> for every plane. Returns the new
    /// slot index.
    /// </summary>
    public int AppendSlot(ReadOnlySpan<Vector4> planes)
    {
        int count = Math.Min(planes.Length, MaxPlanes);

        int slot = _slotCount;
        int byteOffset = slot * CellClipStrideBytes;
        EnsureRegionCapacity(byteOffset + CellClipStrideBytes);

        // count (uint) at byteOffset; the 3 pad uints stay zero.
        WriteUInt(_regionBytes, byteOffset, (uint)count);

        for (int i = 0; i < count; i++)
        {
            int po = byteOffset + CellClipPlanesOffset + i * 16;
            WriteVec4(_regionBytes, po, planes[i]);
        }

        _slotCount++;
        return slot;
    }

    /// <summary>
    /// Set the terrain OutsideView clip region (the single region the terrain
    /// shader gates against). <paramref name="planes"/> length 0 ungates terrain
    /// (count 0). U.3 callers never touch this — <see cref="NoClip"/> leaves it
    /// at count 0. U.4 calls it with the OutsideView planes.
    /// </summary>
    public void SetTerrainClip(ReadOnlySpan<Vector4> planes)
    {
        int count = Math.Min(planes.Length, MaxPlanes);
        Array.Clear(_terrainBytes);
        WriteInt(_terrainBytes, 0, count);
        // The terrain planes also start at byte 16 (std140 pads the int count to
        // 16), so CellClipPlanesOffset is reused for the UBO here.
        for (int i = 0; i < count; i++)
            WriteVec4(_terrainBytes, CellClipPlanesOffset + i * 16, planes[i]);
    }

    /// <summary>
    /// Upload the shared mesh-clip SSBO (binding=2) and the terrain-clip UBO
    /// (binding=2, UBO namespace) and bind both to their binding points. Intended
    /// to be called once per frame; each call re-uploads and re-binds both
    /// buffers. Creates the GL buffers lazily on first call.
    /// </summary>
    public unsafe void UploadShared(GL gl)
    {
        ArgumentNullException.ThrowIfNull(gl);
        ObjectDisposedException.ThrowIf(_disposed, this);

        if (!_glInitialized)
        {
            _regionSsbo = gl.GenBuffer();
            _terrainUbo = gl.GenBuffer();
            _glInitialized = true;
        }

        int regionByteCount = _slotCount * CellClipStrideBytes;
        gl.BindBuffer(BufferTargetARB.ShaderStorageBuffer, _regionSsbo);
        fixed (byte* p = _regionBytes)
        {
            gl.BufferData(BufferTargetARB.ShaderStorageBuffer,
                (nuint)regionByteCount, p, BufferUsageARB.DynamicDraw);
        }
        gl.BindBufferBase(BufferTargetARB.ShaderStorageBuffer, MeshClipSsboBinding, _regionSsbo);

        gl.BindBuffer(BufferTargetARB.UniformBuffer, _terrainUbo);
        fixed (byte* p = _terrainBytes)
        {
            gl.BufferData(BufferTargetARB.UniformBuffer,
                (nuint)TerrainUboBytes, p, BufferUsageARB.DynamicDraw);
        }
        gl.BindBufferBase(BufferTargetARB.UniformBuffer, TerrainClipUboBinding, _terrainUbo);
    }

    public void Dispose()
    {
        if (_disposed) return;
        _disposed = true;
        // GL buffers are deleted by the owner's GL context teardown; ClipFrame
        // is a per-frame transient in U.3 (NoClip() each frame). We do not hold a
        // GL handle to delete here because UploadShared may not have run. If a
        // future phase makes ClipFrame long-lived, add buffer deletion guarded by
        // _glInitialized + a captured GL reference.
    }

    // ---- byte helpers (little-endian; matches x86/x64 GPU upload) ------------

    private void EnsureRegionCapacity(int requiredBytes)
    {
        if (_regionBytes.Length >= requiredBytes) return;
        int newLen = Math.Max(requiredBytes, _regionBytes.Length * 2);
        Array.Resize(ref _regionBytes, newLen);
    }

    private static void WriteUInt(byte[] dst, int offset, uint value)
    {
        dst[offset + 0] = (byte)(value & 0xFF);
        dst[offset + 1] = (byte)((value >> 8) & 0xFF);
        dst[offset + 2] = (byte)((value >> 16) & 0xFF);
        dst[offset + 3] = (byte)((value >> 24) & 0xFF);
    }

    private static void WriteInt(byte[] dst, int offset, int value)
        => WriteUInt(dst, offset, unchecked((uint)value));

    private static void WriteVec4(byte[] dst, int offset, Vector4 v)
    {
        WriteFloat(dst, offset + 0, v.X);
        WriteFloat(dst, offset + 4, v.Y);
        WriteFloat(dst, offset + 8, v.Z);
        WriteFloat(dst, offset + 12, v.W);
    }

    private static void WriteFloat(byte[] dst, int offset, float value)
    {
        uint bits = BitConverter.SingleToUInt32Bits(value);
        WriteUInt(dst, offset, bits);
    }

    // ---- Test seams ----------------------------------------------------------

    /// <summary>Test seam: the packed std430 region bytes (slot 0..SlotCount-1).
    /// Read-only snapshot used by ClipFrameLayoutTests to assert the byte layout.</summary>
    internal ReadOnlySpan<byte> RegionBytesForTest => _regionBytes.AsSpan(0, _slotCount * CellClipStrideBytes);

    /// <summary>Test seam: the packed std140 terrain UBO bytes.</summary>
    internal ReadOnlySpan<byte> TerrainBytesForTest => _terrainBytes;
}