acdream/src/AcDream.App/Rendering/TerrainModernRenderer.cs
Erik bf2e559369 feat(render): Phase U.3 — GPU clip-plane gate (gl_ClipDistance), no-clip default
Adds the GPU mechanism to clip drawing to a per-cell screen-space convex
region via gl_ClipDistance, consumed by the mesh + terrain vertex shaders.
This is the MECHANISM only — every instance defaults to slot 0 (no-clip /
pass-all) and terrain to count 0, so the running game renders IDENTICALLY to
pre-U.3 (verified: offline launch compiles both shaders and reaches steady
state; no GL errors). U.4 populates real clip data from portal visibility.

Binding contract (define once, both sides obey):
- mesh_modern.vert: SSBO binding=2 CellClip[] (shared per-frame regions, slot 0
  reserved no-clip) + SSBO binding=3 uint[] per-instance slot, indexed by the
  IDENTICAL gl_BaseInstanceARB+gl_InstanceID used for binding=0. binding=0/1
  untouched.
- terrain_modern.vert: UBO binding=2 TerrainClip { int count; vec4 planes[8]; }
  for the single OutsideView region (UBO namespace; SceneLighting is UBO
  binding=1, so binding=2 is free and does not collide with the mesh SSBO
  binding=2). count 0 = ungated.
- Both redeclare out gl_PerVertex { vec4 gl_Position; float gl_ClipDistance[8]; }
  and set unused planes (i >= count) to +1.0 so they pass everything.
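
A CPU-side sketch of the pass-all rule in the contract above, offered as an illustration rather than the shader source. It assumes each active plane's distance is the dot product of the plane with a homogeneous position; the commit does not say which space the planes live in, so treat that as an assumption. `ClipGate` and its members are hypothetical names. Real GL clipping interpolates distances across a primitive; the per-vertex `Passes` check here is a simplification.

```csharp
using System;
using System.Numerics;

// One active plane that keeps x >= 0; the other seven slots are unused.
var planes = new Vector4[ClipGate.MaxPlanes];
planes[0] = new Vector4(1f, 0f, 0f, 0f);

float[] inside  = ClipGate.Distances(new Vector4( 2f, 0f, 0f, 1f), planes, count: 1);
float[] outside = ClipGate.Distances(new Vector4(-2f, 0f, 0f, 1f), planes, count: 1);
float[] ungated = ClipGate.Distances(new Vector4(-2f, 0f, 0f, 1f), planes, count: 0);

Console.WriteLine(ClipGate.Passes(inside));   // True
Console.WriteLine(ClipGate.Passes(outside));  // False
Console.WriteLine(ClipGate.Passes(ungated));  // True (count 0 = ungated)

// Hypothetical helper mirroring the "unused planes get +1.0" rule.
static class ClipGate
{
    public const int MaxPlanes = 8;

    public static float[] Distances(Vector4 position, ReadOnlySpan<Vector4> planes, int count)
    {
        if (count < 0 || count > MaxPlanes || count > planes.Length)
            throw new ArgumentOutOfRangeException(nameof(count));

        var distances = new float[MaxPlanes];
        for (int i = 0; i < MaxPlanes; i++)
        {
            // ASSUMPTION: active planes use dot(plane, position).
            // Unused planes (i >= count) are +1.0 so they never clip.
            distances[i] = i < count ? Vector4.Dot(planes[i], position) : 1.0f;
        }
        return distances;
    }

    // A vertex is kept only if no distance is negative.
    public static bool Passes(ReadOnlySpan<float> distances)
    {
        foreach (float d in distances)
            if (d < 0f) return false;
        return true;
    }
}
```

Because unused planes always evaluate to +1.0, leaving all eight clip distances enabled costs nothing in correctness, which is why the commit can enable them once at init.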

CellClip std430 layout (144 bytes/slot): count@0, 3 pad uints@4/8/12,
planes[8]@16 (vec4 stride 16). Terrain UBO std140: count@0 (padded to 16),
planes[8]@16 → 144 bytes. Verified by ClipFrameLayoutTests (8 new tests).
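
The 144-byte terrain block described above can be sketched as a byte packer. This is a minimal illustration, not the repository's `ClipFrame.SetTerrainClip` (its signature is not shown here); `TerrainClipLayout` and `Pack` are hypothetical names. It writes in host byte order, which is what a raw GL buffer upload expects.

```csharp
using System;
using System.Numerics;

// Usage: one plane; the remaining bytes stay zero.
byte[] block = TerrainClipLayout.Pack(new[] { new Vector4(1f, 0f, 0f, 0.5f) });
Console.WriteLine(block.Length);                    // 144
Console.WriteLine(BitConverter.ToInt32(block, 0));  // 1

// Hypothetical packer for: int count @0 (padded to 16), vec4 planes[8] @16.
static class TerrainClipLayout
{
    public const int MaxPlanes = 8;
    public const int PlaneStride = 16;                               // one vec4
    public const int PlanesOffset = 16;                              // count padded to 16 bytes
    public const int TotalBytes = PlanesOffset + MaxPlanes * PlaneStride; // 144

    public static byte[] Pack(ReadOnlySpan<Vector4> planes)
    {
        if (planes.Length > MaxPlanes)
            throw new ArgumentException($"At most {MaxPlanes} planes", nameof(planes));

        var bytes = new byte[TotalBytes];                            // zero-filled: count 0
        BitConverter.TryWriteBytes(bytes.AsSpan(0, 4), planes.Length);

        for (int i = 0; i < planes.Length; i++)
        {
            int o = PlanesOffset + i * PlaneStride;
            BitConverter.TryWriteBytes(bytes.AsSpan(o + 0, 4), planes[i].X);
            BitConverter.TryWriteBytes(bytes.AsSpan(o + 4, 4), planes[i].Y);
            BitConverter.TryWriteBytes(bytes.AsSpan(o + 8, 4), planes[i].Z);
            BitConverter.TryWriteBytes(bytes.AsSpan(o + 12, 4), planes[i].W);
        }
        return bytes;
    }
}
```

An all-zero buffer of this size is a valid no-clip block (count 0), which matches how the renderer's fallback UBO is filled. The std430 CellClip slot has the same offsets as described (count at 0, three pad uints, planes at 16), so a similar packer could plausibly serve each slot.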

Pieces:
- ClipFrame: per-frame container + uploader for the SHARED clip data (binding=2
  SSBO + terrain UBO). NoClip() = slot 0 + terrain count 0. AppendSlot /
  SetTerrainClip pack std430/std140 bytes for U.4. UploadShared binds both.
- WbDrawDispatcher + EnvCellRenderer: each owns its binding=3 zero buffer
  (all-zeros sized to its instance count → slot 0), re-binds binding=2 from the
  shared ClipFrame id (or an internal no-clip fallback if unwired) before MDI.
  gl_ClipDistance is per-vertex, so the single glMultiDrawElementsIndirect per
  group is preserved — no draw splitting.
- TerrainModernRenderer: binds the terrain clip UBO (shared or no-clip fallback)
  before its draw.
- GameWindow: glEnable(GL_CLIP_DISTANCE0..7) once at init (unused planes pass-all
  so always-on avoids per-draw thrash); per frame builds ClipFrame.NoClip(),
  UploadShared, and hands the buffer ids to the three renderers (tiny diff; U.4
  swaps NoClip() for the real portal-visibility frame).

Gate: dotnet build green; App suite 134/134; offline launch confirms both
shaders compile + link with no GL errors.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-30 17:27:30 +02:00

using System.Numerics;
using AcDream.App.Rendering.Wb;
using AcDream.Core.Terrain;
using Silk.NET.OpenGL;

namespace AcDream.App.Rendering;

/// <summary>
/// Phase N.5b modern terrain dispatcher. Single global VBO/EBO with a slot
/// allocator (one slot per landblock, 384 verts × 40 bytes = 15,360 bytes
/// per slot). Per-frame: build a DrawElementsIndirectCommand array from
/// visible slots, upload, dispatch via glMultiDrawElementsIndirect. Atlas
/// textures bound via bindless handles set per-frame as sampler uniforms.
///
/// The per-frame GL call count is small and fixed (roughly a dozen in
/// <see cref="Draw"/>), regardless of visible landblock count.
/// </summary>
public sealed unsafe class TerrainModernRenderer : IDisposable
{
    // VertsPerLandblock MUST stay divisible by 6: terrain_modern.vert uses
    // `gl_VertexID % 6` to pick the cell-corner index (BL/BR/TR/TL), and
    // because we bake `slot * VertsPerLandblock` into indices CPU-side and
    // pass BaseVertex=0 to MultiDrawElementsIndirect, gl_VertexID becomes
    // `slot * VertsPerLandblock + local_index`. The shader's modulo-6 only
    // reduces to `local_index % 6` because 384 is a multiple of 6
    // (e.g. slot 2, local_index 7: gl_VertexID = 775, and 775 % 6 == 7 % 6 == 1).
    // Changing either constant without auditing the shader will silently
    // mis-render.
    private const int VertsPerLandblock = LandblockMesh.VerticesPerLandblock; // 384 (= 64 cells * 6 verts)
    private const int IndicesPerLandblock = VertsPerLandblock;
    private const int VertexSize = 40; // sizeof(TerrainVertex)
    private const int IndexSize = sizeof(uint);
    private const float LandblockSize = LandblockMesh.LandblockSize; // 192

    private readonly GL _gl;
    private readonly BindlessSupport _bindless;
    private readonly Shader _shader;
    private readonly TerrainAtlas _atlas;

    /// <summary>A.5 T22.5: exposes the terrain atlas so callers can update
    /// anisotropic level mid-session via <see cref="TerrainAtlas.SetAnisotropic"/>.</summary>
    public TerrainAtlas Atlas => _atlas;

    private readonly TerrainSlotAllocator _alloc;

    // Per-slot live data (index by slot integer; null entries are unused slots).
    private SlotData?[] _slots;

    // Reverse map: landblockId -> slot, for RemoveLandblock and replacement.
    private readonly Dictionary<uint, int> _idToSlot = new();

    // GPU buffers.
    private uint _globalVao;
    private uint _globalVbo;
    private uint _globalEbo;
    private uint _indirectBuffer;
    private int _indirectCapacity;

    // Phase U.3: terrain clip UBO (binding=2, terrain_modern.vert TerrainClip).
    // The shared one is created + uploaded by the GameWindow-level ClipFrame and
    // handed in via SetClipUbo. When 0, we bind a lazily-created no-clip fallback
    // (count 0 = ungated) so the shader never reads an unbound UBO at binding=2.
    private uint _sharedClipUbo;
    private uint _fallbackClipUbo;

    // Cached uvec2-handle uniform locations (matrix uniforms are set by name via Shader.SetMatrix4).
    private int _uTerrainHandleLoc;
    private int _uAlphaHandleLoc;

    // Reusable per-frame buffers.
    private readonly List<int> _visibleSlots = new();
    private DrawElementsIndirectCommand[] _deicScratch = Array.Empty<DrawElementsIndirectCommand>();

    // Diag.
    public int LoadedSlots => _alloc.LoadedCount;
    public int VisibleSlots => _visibleSlots.Count;
    public int CapacitySlots => _alloc.Capacity;

    public TerrainModernRenderer(
        GL gl,
        BindlessSupport bindless,
        Shader shader,
        TerrainAtlas atlas,
        int initialSlotCapacity = 64)
    {
        _gl = gl;
        _bindless = bindless;
        _shader = shader;
        _atlas = atlas;
        _alloc = new TerrainSlotAllocator(initialSlotCapacity);
        _slots = new SlotData?[initialSlotCapacity];

        _uTerrainHandleLoc = _gl.GetUniformLocation(_shader.Program, "uTerrainHandle");
        _uAlphaHandleLoc = _gl.GetUniformLocation(_shader.Program, "uAlphaHandle");

        _globalVao = _gl.GenVertexArray();
        _globalVbo = _gl.GenBuffer();
        _globalEbo = _gl.GenBuffer();
        AllocateGpuBuffers(initialSlotCapacity);
        ConfigureVao();

        _indirectBuffer = _gl.GenBuffer();
    }

    /// <summary>
    /// Phase U.3: hand the renderer the SHARED terrain-clip UBO (binding=2)
    /// created by <see cref="ClipFrame.UploadShared"/>. The renderer binds it to
    /// binding=2 before its draw. Pass 0 to fall back to the internal no-clip UBO
    /// (count 0 = ungated terrain).
    /// </summary>
    public void SetClipUbo(uint sharedClipUbo) => _sharedClipUbo = sharedClipUbo;

    /// <summary>
    /// Two-tier streaming entry point. Accepts a prebuilt mesh from
    /// <see cref="LandblockStreamResult.Loaded.MeshData"/> built on the worker
    /// thread, together with the world-space origin computed by the caller
    /// (render-thread GameWindow derives it from landblockId + liveCenterX/Y).
    ///
    /// Delegates to <see cref="AddLandblock(uint,LandblockMeshData,Vector3)"/>
    /// so both paths share one upload path. Per Phase A.5 spec T15.
    /// </summary>
    public void AddLandblockWithMesh(uint landblockId, LandblockMeshData meshData, Vector3 worldOrigin)
        => AddLandblock(landblockId, meshData, worldOrigin);

    public void AddLandblock(uint landblockId, LandblockMeshData meshData, Vector3 worldOrigin)
    {
        ArgumentNullException.ThrowIfNull(meshData);
        if (meshData.Vertices.Length != VertsPerLandblock)
            throw new ArgumentException(
                $"Expected {VertsPerLandblock} vertices, got {meshData.Vertices.Length}",
                nameof(meshData));
        if (meshData.Indices.Length != IndicesPerLandblock)
            throw new ArgumentException(
                $"Expected {IndicesPerLandblock} indices, got {meshData.Indices.Length}",
                nameof(meshData));

        if (_idToSlot.ContainsKey(landblockId))
            RemoveLandblock(landblockId);

        int slot = _alloc.Allocate(out var needsGrow);
        if (needsGrow)
        {
            int newCap = Math.Max(_alloc.Capacity * 2, slot + 1);
            EnsureCapacity(newCap);
        }

        // Bake worldOrigin into vertex positions; capture min/max Z for AABB.
        var bakedVerts = new TerrainVertex[VertsPerLandblock];
        float zMin = float.MaxValue, zMax = float.MinValue;
        for (int i = 0; i < VertsPerLandblock; i++)
        {
            var v = meshData.Vertices[i];
            var worldPos = v.Position + worldOrigin;
            bakedVerts[i] = new TerrainVertex(worldPos, v.Normal, v.Data0, v.Data1, v.Data2, v.Data3);
            if (worldPos.Z < zMin) zMin = worldPos.Z;
            if (worldPos.Z > zMax) zMax = worldPos.Z;
        }
        if (zMin == float.MaxValue) { zMin = 0f; zMax = 0f; }

        // Bake baseVertex into indices on the CPU side (driver-portable pattern).
        uint baseVertex = (uint)(slot * VertsPerLandblock);
        var bakedIndices = new uint[IndicesPerLandblock];
        for (int i = 0; i < IndicesPerLandblock; i++)
            bakedIndices[i] = meshData.Indices[i] + baseVertex;

        // glBufferSubData into the slot's VBO + EBO regions.
        nint vboByteOffset = (nint)(slot * VertsPerLandblock * VertexSize);
        nint eboByteOffset = (nint)(slot * IndicesPerLandblock * IndexSize);

        _gl.BindBuffer(BufferTargetARB.ArrayBuffer, _globalVbo);
        fixed (TerrainVertex* p = bakedVerts)
        {
            _gl.BufferSubData(BufferTargetARB.ArrayBuffer, vboByteOffset,
                (nuint)(VertsPerLandblock * VertexSize), p);
        }
        _gl.BindBuffer(BufferTargetARB.ArrayBuffer, 0);

        _gl.BindBuffer(BufferTargetARB.ElementArrayBuffer, _globalEbo);
        fixed (uint* p = bakedIndices)
        {
            _gl.BufferSubData(BufferTargetARB.ElementArrayBuffer, eboByteOffset,
                (nuint)(IndicesPerLandblock * IndexSize), p);
        }
        _gl.BindBuffer(BufferTargetARB.ElementArrayBuffer, 0);

        _slots[slot] = new SlotData
        {
            LandblockId = landblockId,
            WorldOrigin = worldOrigin,
            FirstIndex = (uint)(slot * IndicesPerLandblock),
            IndexCount = IndicesPerLandblock,
            AabbMin = new Vector3(worldOrigin.X, worldOrigin.Y, zMin),
            AabbMax = new Vector3(worldOrigin.X + LandblockSize, worldOrigin.Y + LandblockSize, zMax),
        };
        _idToSlot[landblockId] = slot;
    }

    public void RemoveLandblock(uint landblockId)
    {
        if (!_idToSlot.TryGetValue(landblockId, out var slot))
            return;
        _idToSlot.Remove(landblockId);
        _slots[slot] = null;
        _alloc.Free(slot);
        // No GPU clear: the per-frame DEIC array won't reference this slot.
    }

    public void Draw(ICamera camera, FrustumPlanes? frustum = null, uint? neverCullLandblockId = null)
    {
        if (_alloc.LoadedCount == 0) return;

        // Build visible slot list with per-slot frustum cull.
        _visibleSlots.Clear();
        for (int slot = 0; slot < _slots.Length; slot++)
        {
            var data = _slots[slot];
            if (data is null) continue;
            if (frustum is not null && data.LandblockId != neverCullLandblockId)
            {
                if (!FrustumCuller.IsAabbVisible(frustum.Value, data.AabbMin, data.AabbMax))
                    continue;
            }
            _visibleSlots.Add(slot);
        }
        if (_visibleSlots.Count == 0) return;

        // Build DEIC array.
        if (_deicScratch.Length < _visibleSlots.Count)
            _deicScratch = new DrawElementsIndirectCommand[Math.Max(_visibleSlots.Count, 64)];
        for (int i = 0; i < _visibleSlots.Count; i++)
        {
            var data = _slots[_visibleSlots[i]]!;
            _deicScratch[i] = new DrawElementsIndirectCommand
            {
                Count = (uint)data.IndexCount,
                InstanceCount = 1u,
                FirstIndex = data.FirstIndex,
                BaseVertex = 0, // baked into indices on upload
                BaseInstance = 0,
            };
        }

        // Grow indirect buffer if needed.
        if (_visibleSlots.Count > _indirectCapacity)
        {
            _indirectCapacity = Math.Max(64, _visibleSlots.Count * 2);
            _gl.BindBuffer(GLEnum.DrawIndirectBuffer, _indirectBuffer);
            _gl.BufferData(GLEnum.DrawIndirectBuffer,
                (nuint)(_indirectCapacity * sizeof(DrawElementsIndirectCommand)),
                null, GLEnum.DynamicDraw);
        }
        else
        {
            _gl.BindBuffer(GLEnum.DrawIndirectBuffer, _indirectBuffer);
        }

        // Upload DEIC array.
        fixed (DrawElementsIndirectCommand* p = _deicScratch)
        {
            _gl.BufferSubData(GLEnum.DrawIndirectBuffer, 0,
                (nuint)(_visibleSlots.Count * sizeof(DrawElementsIndirectCommand)), p);
        }

        // Bind shader + uniforms + atlas handles.
        _shader.Use();
        _shader.SetMatrix4("uView", camera.View);
        _shader.SetMatrix4("uProjection", camera.Projection);

        var (terrainHandle, alphaHandle) = _atlas.GetBindlessHandles();
        // Pass each 64-bit handle as a uvec2 (low 32 bits, high 32 bits).
        // GLSL constructs sampler2DArray(uTerrainHandle) at the use site;
        // see terrain_modern.frag for why this is the safe pattern.
        _gl.ProgramUniform2(_shader.Program, _uTerrainHandleLoc,
            (uint)(terrainHandle & 0xFFFFFFFFu), (uint)(terrainHandle >> 32));
        _gl.ProgramUniform2(_shader.Program, _uAlphaHandleLoc,
            (uint)(alphaHandle & 0xFFFFFFFFu), (uint)(alphaHandle >> 32));

        // Phase U.3: bind the terrain clip UBO (binding=2). Shared ClipFrame UBO
        // when wired, else the no-clip fallback (count 0 = ungated terrain).
        BindClipUboBinding2();

        _gl.BindVertexArray(_globalVao);
        _gl.MemoryBarrier(MemoryBarrierMask.CommandBarrierBit);
        _gl.MultiDrawElementsIndirect(
            PrimitiveType.Triangles, DrawElementsType.UnsignedInt,
            (void*)0,
            (uint)_visibleSlots.Count,
            (uint)sizeof(DrawElementsIndirectCommand));
        _gl.BindVertexArray(0);
        _gl.BindBuffer(GLEnum.DrawIndirectBuffer, 0);
    }

    public void Dispose()
    {
        _gl.DeleteVertexArray(_globalVao);
        _gl.DeleteBuffer(_globalVbo);
        _gl.DeleteBuffer(_globalEbo);
        _gl.DeleteBuffer(_indirectBuffer);
        if (_fallbackClipUbo != 0) { _gl.DeleteBuffer(_fallbackClipUbo); _fallbackClipUbo = 0; } // Phase U.3
    }

    // ----------------------------------------------------------------
    // Private helpers
    // ----------------------------------------------------------------

    /// <summary>
    /// Phase U.3: bind the terrain clip UBO to binding=2. Prefers the shared
    /// <see cref="ClipFrame"/> UBO (<see cref="SetClipUbo"/>); otherwise lazily
    /// creates + binds a no-clip fallback (count 0 = ungated) so the shader never
    /// reads an unbound UBO. The fallback is std140-sized to
    /// <see cref="ClipFrame.TerrainUboBytes"/> and zero-filled (count 0).
    /// </summary>
    private void BindClipUboBinding2()
    {
        if (_sharedClipUbo != 0)
        {
            _gl.BindBufferBase(BufferTargetARB.UniformBuffer,
                ClipFrame.TerrainClipUboBinding, _sharedClipUbo);
            return;
        }

        if (_fallbackClipUbo == 0)
        {
            _fallbackClipUbo = _gl.GenBuffer();
            var zero = stackalloc byte[ClipFrame.TerrainUboBytes];
            for (int i = 0; i < ClipFrame.TerrainUboBytes; i++) zero[i] = 0;
            _gl.BindBuffer(BufferTargetARB.UniformBuffer, _fallbackClipUbo);
            _gl.BufferData(BufferTargetARB.UniformBuffer,
                (nuint)ClipFrame.TerrainUboBytes, zero, BufferUsageARB.DynamicDraw);
        }
        _gl.BindBufferBase(BufferTargetARB.UniformBuffer,
            ClipFrame.TerrainClipUboBinding, _fallbackClipUbo);
    }

    private void AllocateGpuBuffers(int capacitySlots)
    {
        nuint vboBytes = (nuint)(capacitySlots * VertsPerLandblock * VertexSize);
        nuint eboBytes = (nuint)(capacitySlots * IndicesPerLandblock * IndexSize);

        _gl.BindBuffer(BufferTargetARB.ArrayBuffer, _globalVbo);
        _gl.BufferData(BufferTargetARB.ArrayBuffer, vboBytes, null, BufferUsageARB.DynamicDraw);
        _gl.BindBuffer(BufferTargetARB.ArrayBuffer, 0);

        _gl.BindBuffer(BufferTargetARB.ElementArrayBuffer, _globalEbo);
        _gl.BufferData(BufferTargetARB.ElementArrayBuffer, eboBytes, null, BufferUsageARB.DynamicDraw);
        _gl.BindBuffer(BufferTargetARB.ElementArrayBuffer, 0);
    }

    private void ConfigureVao()
    {
        _gl.BindVertexArray(_globalVao);
        _gl.BindBuffer(BufferTargetARB.ArrayBuffer, _globalVbo);
        _gl.BindBuffer(BufferTargetARB.ElementArrayBuffer, _globalEbo);

        uint stride = (uint)VertexSize;

        // location 0: Position
        _gl.EnableVertexAttribArray(0);
        _gl.VertexAttribPointer(0, 3, VertexAttribPointerType.Float, false, stride, (void*)0);

        // location 1: Normal
        _gl.EnableVertexAttribArray(1);
        _gl.VertexAttribPointer(1, 3, VertexAttribPointerType.Float, false, stride, (void*)(3 * sizeof(float)));

        // locations 2-5: Data0..Data3 (uvec4 byte attributes)
        nint dataOffset = 6 * sizeof(float);
        _gl.EnableVertexAttribArray(2);
        _gl.VertexAttribIPointer(2, 4, VertexAttribIType.UnsignedByte, stride, (void*)dataOffset);
        _gl.EnableVertexAttribArray(3);
        _gl.VertexAttribIPointer(3, 4, VertexAttribIType.UnsignedByte, stride, (void*)(dataOffset + 4));
        _gl.EnableVertexAttribArray(4);
        _gl.VertexAttribIPointer(4, 4, VertexAttribIType.UnsignedByte, stride, (void*)(dataOffset + 8));
        _gl.EnableVertexAttribArray(5);
        _gl.VertexAttribIPointer(5, 4, VertexAttribIType.UnsignedByte, stride, (void*)(dataOffset + 12));

        _gl.BindVertexArray(0);
    }

    private void EnsureCapacity(int newCapacity)
    {
        if (newCapacity <= _alloc.Capacity) return;

        // Allocate new VBO + EBO at new size; copy old contents; swap; recreate VAO.
        uint newVbo = _gl.GenBuffer();
        uint newEbo = _gl.GenBuffer();
        nuint newVboBytes = (nuint)(newCapacity * VertsPerLandblock * VertexSize);
        nuint newEboBytes = (nuint)(newCapacity * IndicesPerLandblock * IndexSize);
        nuint oldVboBytes = (nuint)(_alloc.Capacity * VertsPerLandblock * VertexSize);
        nuint oldEboBytes = (nuint)(_alloc.Capacity * IndicesPerLandblock * IndexSize);

        _gl.BindBuffer(BufferTargetARB.ArrayBuffer, newVbo);
        _gl.BufferData(BufferTargetARB.ArrayBuffer, newVboBytes, null, BufferUsageARB.DynamicDraw);
        _gl.BindBuffer(BufferTargetARB.CopyReadBuffer, _globalVbo);
        _gl.BindBuffer(BufferTargetARB.CopyWriteBuffer, newVbo);
        _gl.CopyBufferSubData(CopyBufferSubDataTarget.CopyReadBuffer, CopyBufferSubDataTarget.CopyWriteBuffer,
            0, 0, oldVboBytes);
        _gl.DeleteBuffer(_globalVbo);
        _globalVbo = newVbo;

        _gl.BindBuffer(BufferTargetARB.ElementArrayBuffer, newEbo);
        _gl.BufferData(BufferTargetARB.ElementArrayBuffer, newEboBytes, null, BufferUsageARB.DynamicDraw);
        _gl.BindBuffer(BufferTargetARB.CopyReadBuffer, _globalEbo);
        _gl.BindBuffer(BufferTargetARB.CopyWriteBuffer, newEbo);
        _gl.CopyBufferSubData(CopyBufferSubDataTarget.CopyReadBuffer, CopyBufferSubDataTarget.CopyWriteBuffer,
            0, 0, oldEboBytes);
        _gl.DeleteBuffer(_globalEbo);
        _globalEbo = newEbo;

        // Recreate VAO with new buffer bindings.
        _gl.DeleteVertexArray(_globalVao);
        _globalVao = _gl.GenVertexArray();
        ConfigureVao();

        // Grow slot tracking array.
        Array.Resize(ref _slots, newCapacity);
        _alloc.GrowTo(newCapacity);
    }

    private sealed class SlotData
    {
        public uint LandblockId;
        public Vector3 WorldOrigin;
        public uint FirstIndex;
        public int IndexCount;
        public Vector3 AabbMin;
        public Vector3 AabbMax;
    }
}