acdream/src/AcDream.App/Rendering/InstancedMeshRenderer.cs
Erik 9957070cab feat(render): Phase G.1/G.2 — SceneLighting UBO + sky renderer + shader integration
Wire the existing LightManager + WorldTimeService state into visible
rendering. Every draw call (terrain, static mesh, instanced mesh, sky)
now shares one SceneLighting UBO at binding=1 carrying:
  - 8 Light slots (Directional / Point / Spot, retail hard-cutoff)
  - Ambient RGB + active light count
  - Fog start/end/mode + color + lightning flash scalar
  - Camera world position + day fraction

The CPU side (SceneLightingUbo in Core.Lighting) is a POD struct that
gets BufferSubData'd once per frame from GameWindow.OnRender. Shaders
read the block via `layout(std140, binding = 1) uniform SceneLighting`
— no per-program uniform uploads.
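
As a rough illustration of the CPU-side block described above, the sketch below lays out a std140-compatible struct in C#. The type names, field names, per-light packing, and the resulting 576-byte total are all assumptions made for this example; the actual SceneLightingUbo in Core.Lighting may pack its fields differently.

```csharp
using System.Numerics;
using System.Runtime.CompilerServices;
using System.Runtime.InteropServices;

// Hypothetical layout, not the real SceneLightingUbo. Every member is a
// 16-byte Vector4, so the C# sequential layout lines up with std140 rules
// (vec4 alignment, struct sizes rounded to multiples of 16 bytes).
[StructLayout(LayoutKind.Sequential)]
public struct LightSlotSketch
{
    public Vector4 PositionAndKind;    // xyz = world position, w = light kind
    public Vector4 DirectionAndRange;  // xyz = direction, w = hard-cutoff range
    public Vector4 ColorAndIntensity;  // rgb = color, w = intensity
    public Vector4 ConeAndPadding;     // x = cos(cone half-angle), yzw unused
}

[StructLayout(LayoutKind.Sequential)]
public struct SceneLightingSketch
{
    public const int MaxLights = 8;

    // Eight explicit slots: C# fixed buffers only allow primitive element types.
    public LightSlotSketch Light0, Light1, Light2, Light3;
    public LightSlotSketch Light4, Light5, Light6, Light7;

    public Vector4 AmbientAndCount;         // rgb = ambient, w = active light count
    public Vector4 FogStartEndModeFlash;    // x = start, y = end, z = mode, w = flash
    public Vector4 FogColor;                // rgb = fog color, w unused
    public Vector4 CameraPosAndDayFraction; // xyz = camera position, w = day fraction
}

public static class SceneLightingSketchInfo
{
    // Sizes in bytes, useful for a packing test before uploading the block.
    public static int LightSize => Unsafe.SizeOf<LightSlotSketch>();
    public static int BlockSize => Unsafe.SizeOf<SceneLightingSketch>();
}
```

Keeping every member a full Vector4 avoids the std140 pitfalls around vec3 and scalar padding, at the cost of a few unused floats.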

Shader changes:
  - mesh.frag + mesh_instanced.frag accumulate 8 dynamic lights per
    fragment using the retail no-attenuation hard-cutoff model
    (r13 §10.2 / §13.1). Sun reads slot 0; spots use hard cos-cone test.
    Additive lightning flash + linear fog layered on top. Saturate
    clamps per-channel to 1.0.
  - terrain.vert bakes AdjustPlanes sun+ambient per vertex using the
    retail MIN_FACTOR = 0.08 ambient floor (r13 §7). terrain.frag adds
    fog + flash on top of the baked vertex color.
  - mesh.vert + mesh_instanced.vert emit vWorldPos so the fragment
    stage can do per-pixel lighting against world-space positions.
  - New sky.vert / sky.frag pair — unlit, scroll-UV, camera-centered,
    with its own 0.1..1e6 far plane. Ports WorldBuilder's skybox.
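
The fragment-stage lighting described above can be mirrored on the CPU for testing. The sketch below is a hypothetical C# reference, not the shader code itself. It assumes that "hard cutoff" means a point light contributes its full Lambert term inside its range and nothing outside, that spots add a cosine cone test on top, and that linear fog blends by distance between fog start and fog end (with end greater than start). All type and method names are made up for illustration.

```csharp
using System;
using System.Numerics;

public static class HardCutoffLightingSketch
{
    // Point light with no distance attenuation: full N·L inside the range,
    // zero outside it.
    public static Vector3 PointLight(Vector3 fragPos, Vector3 normal,
        Vector3 lightPos, Vector3 lightColor, float range)
    {
        Vector3 toLight = lightPos - fragPos;
        float dist = toLight.Length();
        if (dist > range || dist <= 0f) return Vector3.Zero;
        float nDotL = MathF.Max(Vector3.Dot(normal, toLight / dist), 0f);
        return lightColor * nDotL;
    }

    // Spot light: same as the point light, gated by a hard cosine cone test.
    public static Vector3 SpotLight(Vector3 fragPos, Vector3 normal,
        Vector3 lightPos, Vector3 spotDir, float cosCone,
        Vector3 lightColor, float range)
    {
        Vector3 fromLight = fragPos - lightPos;
        float dist = fromLight.Length();
        if (dist <= 0f) return Vector3.Zero;
        float cosAngle = Vector3.Dot(fromLight / dist, Vector3.Normalize(spotDir));
        if (cosAngle < cosCone) return Vector3.Zero;
        return PointLight(fragPos, normal, lightPos, lightColor, range);
    }

    // Clamp each channel to [0, 1], like GLSL clamp(c, 0.0, 1.0).
    public static Vector3 Saturate(Vector3 c) =>
        Vector3.Clamp(c, Vector3.Zero, Vector3.One);

    // Linear fog: 0 at or before fogStart, 1 at or beyond fogEnd.
    // Assumes fogEnd > fogStart.
    public static float LinearFogFactor(float dist, float fogStart, float fogEnd) =>
        Math.Clamp((dist - fogStart) / (fogEnd - fogStart), 0f, 1f);

    // Blend the lit color toward the fog color by the fog factor.
    public static Vector3 ApplyFog(Vector3 lit, Vector3 fogColor, float fogFactor) =>
        Vector3.Lerp(lit, fogColor, fogFactor);
}
```

A reference like this makes it possible to unit-test edge cases (a fragment exactly at the range boundary, a spot cone just missed) without a GL context.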

SkyRenderer (new file in App/Rendering/Sky/) ports WorldBuilder's
SkyboxRenderManager verbatim for the C# idiom: zeroed view translation,
dedicated projection, depth mask off, iterate each visible SkyObject
in the day group, apply arc transform (Z rot for heading + Y rot for
arc sweep), feed TexVelocityX/Y as a scrolling UV offset, apply
per-keyframe SkyObjectReplace overrides (mesh swap + transparency +
luminosity) for overcast / dusk cloud variants.
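
To make the "zeroed view translation" and arc transform concrete, here is a small sketch using System.Numerics. The helper names and the composition order (heading about Z applied first, then the arc sweep about Y) are assumptions for illustration; the actual SkyRenderer may compose them differently.

```csharp
using System.Numerics;

public static class SkyTransformSketch
{
    // Keep the camera's rotation but drop its translation, so the sky
    // geometry stays centered on the viewer wherever the camera moves.
    public static Matrix4x4 CameraCenteredView(Matrix4x4 view)
    {
        view.Translation = Vector3.Zero; // clears M41, M42, M43
        return view;
    }

    // Hypothetical arc transform. System.Numerics uses row vectors, so in
    // A * B the left matrix A is applied first: heading, then arc sweep.
    public static Matrix4x4 ArcTransform(float headingRadians, float arcRadians) =>
        Matrix4x4.CreateRotationZ(headingRadians) *
        Matrix4x4.CreateRotationY(arcRadians);
}
```

Because Matrix4x4 is a value type, CameraCenteredView modifies only its local copy and leaves the caller's view matrix untouched.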

GameWindow integration:
  - OnLoad parses Region (0x13000000) into LoadedSkyDesc and hot-swaps
    WorldTime's provider to the dat-accurate keyframes. Seeds to noon
    for offline rendering. Creates the SceneLightingUboBinding and the
    SkyRenderer.
  - OnRender: set clear color from atmosphere fog, tick WeatherSystem,
    spawn/stop rain/snow camera-local emitters on kind change, feed
    sun to LightManager (zero intensity indoors — r13 §13.7), tick
    LightManager against viewer pos, build + upload the UBO, draw
    sky before terrain, draw terrain + static + instanced using the
    shared UBO.

5 new UBO packing tests (struct sizes, slot population, 8-light cap,
directional slot 0).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-19 10:39:48 +02:00


// src/AcDream.App/Rendering/InstancedMeshRenderer.cs
//
// True instanced rendering for static-object meshes.
// Groups (entity, MeshRef) pairs by GfxObjId plus a texture signature (see
// GroupKey below). All instance model matrices are written into a single
// shared instance VBO once per frame. Each sub-mesh is drawn with
// DrawElementsInstanced: one GL draw call per (group × sub-mesh) instead
// of one per entity. For a scene with N groups and M total entities
// this reduces draw calls from M*subMeshes to N*subMeshes.
//
// Matrix layout:
// System.Numerics.Matrix4x4 is row-major. Written to the float[] buffer in
// natural memory order (M11..M44). The GLSL shader reads 4 vec4 attributes
// (aInstanceRow0-3) and constructs mat4(row0, row1, row2, row3). Because
// GLSL mat4() takes column vectors, the rows of the C# matrix become the
// columns of the GLSL mat4 — which is the same transpose that UniformMatrix4
// with transpose=false produces. Visual result is identical to the old
// SetMatrix4("uModel", ...) path.
//
// Architecture note: public API matches StaticMeshRenderer so GameWindow only
// needs to update the shader and uniform setup at the call sites.
using System.Numerics;
using System.Runtime.InteropServices;
using AcDream.Core.Meshing;
using AcDream.Core.Terrain;
using AcDream.Core.World;
using Silk.NET.OpenGL;
namespace AcDream.App.Rendering;
public sealed unsafe class InstancedMeshRenderer : IDisposable
{
private readonly GL _gl;
private readonly Shader _shader;
private readonly TextureCache _textures;
// One GPU bundle per unique GfxObj id. Each GfxObj can have multiple sub-meshes.
private readonly Dictionary<uint, List<SubMeshGpu>> _gpuByGfxObj = new();
// Shared instance VBO — filled every frame with all instance model matrices.
private readonly uint _instanceVbo;
// Per-frame scratch: reused float buffer for instance matrix data.
// 16 floats per mat4. Grown on demand; never shrunk.
private float[] _instanceBuffer = new float[256 * 16]; // start at 256 instances
// ── Instance grouping scratch ─────────────────────────────────────────────
//
// Reused every frame to avoid per-frame allocation.
//
// **Group key = (GfxObjId, TextureSignature)**, where TextureSignature is
// the PaletteOverride hash XOR the per-MeshRef SurfaceOverrides hash.
//
// An earlier implementation grouped on <c>GfxObjId</c> alone and resolved
// the per-sub-mesh texture from the first instance in the group — which
// is fine for scenery where every tree shares the same palette, but
// utterly broken for NPCs: every humanoid uses the same base body
// GfxObjs and they all piled into one group, so the first NPC's palette
// was used for every NPC in the frame. Frustum culling + iteration
// order meant that "first NPC" changed as the camera turned — producing
// the "NPC clothing changes when I turn" symptom.
//
// Now we also key by the entity's PaletteOverride + per-MeshRef
// SurfaceOverrides signature so only entities that decode to the
// SAME texture for every sub-mesh can share a batch. Entities with
// unique appearance fall to single-instance groups (still correct,
// marginally slower than true instancing).
private readonly Dictionary<GroupKey, InstanceGroup> _groups = new();
private readonly record struct GroupKey(uint GfxObjId, ulong TextureSignature);
public InstancedMeshRenderer(GL gl, Shader shader, TextureCache textures)
{
_gl = gl;
_shader = shader;
_textures = textures;
_instanceVbo = _gl.GenBuffer();
}
// ── Upload ────────────────────────────────────────────────────────────────
public void EnsureUploaded(uint gfxObjId, IReadOnlyList<GfxObjSubMesh> subMeshes)
{
if (_gpuByGfxObj.ContainsKey(gfxObjId))
return;
var list = new List<SubMeshGpu>(subMeshes.Count);
foreach (var sm in subMeshes)
list.Add(UploadSubMesh(sm));
_gpuByGfxObj[gfxObjId] = list;
}
private SubMeshGpu UploadSubMesh(GfxObjSubMesh sm)
{
uint vao = _gl.GenVertexArray();
_gl.BindVertexArray(vao);
// ── Vertex buffer (positions, normals, UVs) ───────────────────────────
uint vbo = _gl.GenBuffer();
_gl.BindBuffer(BufferTargetARB.ArrayBuffer, vbo);
fixed (void* p = sm.Vertices)
_gl.BufferData(BufferTargetARB.ArrayBuffer,
(nuint)(sm.Vertices.Length * sizeof(Vertex)), p, BufferUsageARB.StaticDraw);
uint stride = (uint)sizeof(Vertex);
_gl.EnableVertexAttribArray(0);
_gl.VertexAttribPointer(0, 3, VertexAttribPointerType.Float, false, stride, (void*)0);
_gl.EnableVertexAttribArray(1);
_gl.VertexAttribPointer(1, 3, VertexAttribPointerType.Float, false, stride, (void*)(3 * sizeof(float)));
_gl.EnableVertexAttribArray(2);
_gl.VertexAttribPointer(2, 2, VertexAttribPointerType.Float, false, stride, (void*)(6 * sizeof(float)));
// Note: location 3 (uint TerrainLayer) is NOT used by mesh_instanced.vert;
// that slot is reserved for per-instance mat4 row 0 from the instance VBO.
// ── Index buffer ──────────────────────────────────────────────────────
uint ebo = _gl.GenBuffer();
_gl.BindBuffer(BufferTargetARB.ElementArrayBuffer, ebo);
fixed (void* p = sm.Indices)
_gl.BufferData(BufferTargetARB.ElementArrayBuffer,
(nuint)(sm.Indices.Length * sizeof(uint)), p, BufferUsageARB.StaticDraw);
// ── Per-instance model matrix (locations 3-6) ─────────────────────────
// Bind the shared instance VBO. The VAO captures this binding at each
// attribute location. At draw time we re-call VertexAttribPointer with
// the per-group byte offset (to address different groups in the VBO
// without DrawElementsInstancedBaseInstance).
_gl.BindBuffer(BufferTargetARB.ArrayBuffer, _instanceVbo);
// mat4 = 4 × vec4, stride = 64 bytes, divisor = 1 (advance once per instance)
for (uint row = 0; row < 4; row++)
{
uint loc = 3 + row;
_gl.EnableVertexAttribArray(loc);
_gl.VertexAttribPointer(loc, 4, VertexAttribPointerType.Float, false, 64, (void*)(row * 16));
_gl.VertexAttribDivisor(loc, 1);
}
_gl.BindVertexArray(0);
return new SubMeshGpu
{
Vao = vao,
Vbo = vbo,
Ebo = ebo,
IndexCount = sm.Indices.Length,
SurfaceId = sm.SurfaceId,
Translucency = sm.Translucency,
};
}
// ── Draw ──────────────────────────────────────────────────────────────────
public void Draw(ICamera camera,
IEnumerable<(uint LandblockId, Vector3 AabbMin, Vector3 AabbMax, IReadOnlyList<WorldEntity> Entities)> landblockEntries,
FrustumPlanes? frustum = null,
uint? neverCullLandblockId = null,
HashSet<uint>? visibleCellIds = null)
{
_shader.Use();
var vp = camera.View * camera.Projection;
_shader.SetMatrix4("uViewProjection", vp);
// Phase G: lighting + ambient + fog are owned by the
// SceneLighting UBO (binding=1) uploaded once per frame by
// GameWindow. The instanced mesh fragment shader reads it
// directly — no per-draw uniform uploads needed.
// ── Collect and group instances ───────────────────────────────────────
CollectGroups(landblockEntries, frustum, neverCullLandblockId, visibleCellIds);
// ── Build and upload the instance buffer ──────────────────────────────
// Count total instances.
int totalInstances = 0;
foreach (var grp in _groups.Values)
totalInstances += grp.Count;
// Grow the scratch buffer if needed.
int needed = totalInstances * 16;
if (_instanceBuffer.Length < needed)
_instanceBuffer = new float[needed + 256 * 16]; // extra headroom
// Write all groups contiguously. Record each group's starting offset
// (in units of instances, not bytes) so we can address them at draw time.
int instanceOffset = 0;
foreach (var grp in _groups.Values)
{
grp.BufferOffset = instanceOffset;
foreach (ref readonly var inst in CollectionsMarshal.AsSpan(grp.Entries))
WriteMatrix(_instanceBuffer, instanceOffset++ * 16, inst.Model);
}
// Upload all instance data with a single BufferData call (DynamicDraw usage).
if (totalInstances > 0)
{
_gl.BindBuffer(BufferTargetARB.ArrayBuffer, _instanceVbo);
fixed (void* p = _instanceBuffer)
_gl.BufferData(BufferTargetARB.ArrayBuffer,
(nuint)(totalInstances * 16 * sizeof(float)), p, BufferUsageARB.DynamicDraw);
}
// ── Pass 1: Opaque + ClipMap ──────────────────────────────────────────
foreach (var (key, grp) in _groups)
{
if (!_gpuByGfxObj.TryGetValue(key.GfxObjId, out var subMeshes))
continue;
bool hasOpaqueSubMesh = false;
foreach (var sub in subMeshes)
{
if (sub.Translucency == TranslucencyKind.Opaque ||
sub.Translucency == TranslucencyKind.ClipMap)
{
hasOpaqueSubMesh = true;
break;
}
}
if (!hasOpaqueSubMesh) continue;
// For this group, instance data starts at grp.BufferOffset in the VBO.
// We need to tell the VAO to read from that offset.
uint byteOffset = (uint)(grp.BufferOffset * 64); // 64 bytes per mat4
foreach (var sub in subMeshes)
{
if (sub.Translucency != TranslucencyKind.Opaque &&
sub.Translucency != TranslucencyKind.ClipMap)
continue;
_shader.SetInt("uTranslucencyKind", (int)sub.Translucency);
// Bind VAO + re-point instance attributes to the group's slice
// in the shared VBO. This updates the VAO's stored offset for
// locations 3-6 without touching the vertex or index bindings.
_gl.BindVertexArray(sub.Vao);
_gl.BindBuffer(BufferTargetARB.ArrayBuffer, _instanceVbo);
for (uint row = 0; row < 4; row++)
{
_gl.VertexAttribPointer(3 + row, 4, VertexAttribPointerType.Float,
false, 64, (void*)(byteOffset + row * 16));
}
// Resolve the texture from the first instance. The group key includes
// the palette + surface-override signature, so every instance in this
// group resolves to the same texture for this sub-mesh (barring a
// hash collision in the signature).
if (grp.Count == 0) continue;
var firstEntry = grp.Entries[0];
uint tex = ResolveTex(firstEntry.Entity, firstEntry.MeshRef, sub);
_gl.ActiveTexture(TextureUnit.Texture0);
_gl.BindTexture(TextureTarget.Texture2D, tex);
_gl.DrawElementsInstanced(PrimitiveType.Triangles,
(uint)sub.IndexCount,
DrawElementsType.UnsignedInt,
(void*)0,
(uint)grp.Count);
}
}
// ── Pass 2: Translucent (AlphaBlend, Additive, InvAlpha) ─────────────
_gl.Enable(EnableCap.Blend);
_gl.DepthMask(false);
_gl.Enable(EnableCap.CullFace);
_gl.CullFace(TriangleFace.Back);
_gl.FrontFace(FrontFaceDirection.Ccw);
foreach (var (key, grp) in _groups)
{
if (!_gpuByGfxObj.TryGetValue(key.GfxObjId, out var subMeshes))
continue;
bool hasTranslucentSubMesh = false;
foreach (var sub in subMeshes)
{
if (sub.Translucency != TranslucencyKind.Opaque &&
sub.Translucency != TranslucencyKind.ClipMap)
{
hasTranslucentSubMesh = true;
break;
}
}
if (!hasTranslucentSubMesh) continue;
uint byteOffset = (uint)(grp.BufferOffset * 64);
foreach (var sub in subMeshes)
{
if (sub.Translucency == TranslucencyKind.Opaque ||
sub.Translucency == TranslucencyKind.ClipMap)
continue;
switch (sub.Translucency)
{
case TranslucencyKind.Additive:
_gl.BlendFunc(BlendingFactor.SrcAlpha, BlendingFactor.One);
break;
case TranslucencyKind.InvAlpha:
_gl.BlendFunc(BlendingFactor.OneMinusSrcAlpha, BlendingFactor.SrcAlpha);
break;
default: // AlphaBlend
_gl.BlendFunc(BlendingFactor.SrcAlpha, BlendingFactor.OneMinusSrcAlpha);
break;
}
_shader.SetInt("uTranslucencyKind", (int)sub.Translucency);
_gl.BindVertexArray(sub.Vao);
_gl.BindBuffer(BufferTargetARB.ArrayBuffer, _instanceVbo);
for (uint row = 0; row < 4; row++)
{
_gl.VertexAttribPointer(3 + row, 4, VertexAttribPointerType.Float,
false, 64, (void*)(byteOffset + row * 16));
}
if (grp.Count == 0) continue;
var firstEntry = grp.Entries[0];
uint tex = ResolveTex(firstEntry.Entity, firstEntry.MeshRef, sub);
_gl.ActiveTexture(TextureUnit.Texture0);
_gl.BindTexture(TextureTarget.Texture2D, tex);
_gl.DrawElementsInstanced(PrimitiveType.Triangles,
(uint)sub.IndexCount,
DrawElementsType.UnsignedInt,
(void*)0,
(uint)grp.Count);
}
}
// Restore default GL state.
_gl.DepthMask(true);
_gl.Disable(EnableCap.Blend);
_gl.Disable(EnableCap.CullFace);
_gl.BindVertexArray(0);
}
// ── Grouping ──────────────────────────────────────────────────────────────
/// <summary>
/// Iterates all visible landblock entries and groups every (entity, meshRef)
/// pair by (GfxObjId, texture signature). Clears the previous frame's entries
/// before filling; the group objects themselves are kept for reuse.
/// </summary>
private void CollectGroups(
IEnumerable<(uint LandblockId, Vector3 AabbMin, Vector3 AabbMax, IReadOnlyList<WorldEntity> Entities)> landblockEntries,
FrustumPlanes? frustum,
uint? neverCullLandblockId,
HashSet<uint>? visibleCellIds)
{
foreach (var grp in _groups.Values)
grp.Entries.Clear();
foreach (var entry in landblockEntries)
{
if (frustum is not null &&
entry.LandblockId != neverCullLandblockId &&
!FrustumCuller.IsAabbVisible(frustum.Value, entry.AabbMin, entry.AabbMax))
continue;
foreach (var entity in entry.Entities)
{
if (entity.MeshRefs.Count == 0)
continue;
// Step 4: portal visibility filter. If we have a visible cell set,
// skip interior entities whose parent cell isn't visible.
// visibleCellIds == null means camera is outdoors → show all interiors.
if (entity.ParentCellId.HasValue && visibleCellIds is not null
&& !visibleCellIds.Contains(entity.ParentCellId.Value))
continue;
var entityRoot =
Matrix4x4.CreateFromQuaternion(entity.Rotation) *
Matrix4x4.CreateTranslation(entity.Position);
// Hash the entity's PaletteOverride once — shared by every
// MeshRef on this entity, so we compute it outside the loop.
ulong palHash = HashPaletteOverride(entity.PaletteOverride);
foreach (var meshRef in entity.MeshRefs)
{
if (!_gpuByGfxObj.ContainsKey(meshRef.GfxObjId))
continue;
var model = meshRef.PartTransform * entityRoot;
// Texture signature = palette hash ^ surface-overrides hash.
// Two instances can share a batch only when their ResolveTex
// would return identical handles for every sub-mesh — that
// means identical palette AND identical surface overrides.
ulong surfHash = HashSurfaceOverrides(meshRef.SurfaceOverrides);
ulong texSig = palHash ^ surfHash;
var key = new GroupKey(meshRef.GfxObjId, texSig);
if (!_groups.TryGetValue(key, out var group))
{
group = new InstanceGroup();
_groups[key] = group;
}
group.Entries.Add(new InstanceEntry(model, entity, meshRef));
}
}
}
}
private static ulong HashPaletteOverride(AcDream.Core.World.PaletteOverride? p)
{
if (p is null) return 0UL;
ulong h = 0xCBF29CE484222325UL;
const ulong prime = 0x100000001B3UL;
h = (h ^ p.BasePaletteId) * prime;
foreach (var sp in p.SubPalettes)
{
h = (h ^ sp.SubPaletteId) * prime;
h = (h ^ sp.Offset) * prime;
h = (h ^ sp.Length) * prime;
}
return h;
}
/// <summary>
/// Order-independent hash of a SurfaceOverrides dictionary. XOR of each
/// (key, value) pair keeps the result stable regardless of Dictionary
/// iteration order, so two instances whose override maps contain the
/// same pairs will hash identically.
/// </summary>
private static ulong HashSurfaceOverrides(IReadOnlyDictionary<uint, uint>? overrides)
{
if (overrides is null || overrides.Count == 0) return 0UL;
ulong acc = 0UL;
foreach (var kvp in overrides)
{
ulong pair = ((ulong)kvp.Key << 32) | kvp.Value;
acc ^= pair;
}
// Fold with a prime so the zero case doesn't collide with "empty".
return (acc ^ 0xCBF29CE484222325UL) * 0x100000001B3UL;
}
// ── Matrix write ──────────────────────────────────────────────────────────
/// <summary>
/// Writes a System.Numerics Matrix4x4 into <paramref name="buf"/> starting
/// at <paramref name="offset"/> as 16 consecutive floats in row-major order
/// (the C# natural memory layout). The GLSL shader reads each 4-float row
/// as a column of the mat4 — identical to what UniformMatrix4(transpose=false)
/// produces for the uniform path.
/// </summary>
private static void WriteMatrix(float[] buf, int offset, in Matrix4x4 m)
{
buf[offset + 0] = m.M11; buf[offset + 1] = m.M12; buf[offset + 2] = m.M13; buf[offset + 3] = m.M14;
buf[offset + 4] = m.M21; buf[offset + 5] = m.M22; buf[offset + 6] = m.M23; buf[offset + 7] = m.M24;
buf[offset + 8] = m.M31; buf[offset + 9] = m.M32; buf[offset + 10] = m.M33; buf[offset + 11] = m.M34;
buf[offset + 12] = m.M41; buf[offset + 13] = m.M42; buf[offset + 14] = m.M43; buf[offset + 15] = m.M44;
}
// ── Texture resolution ────────────────────────────────────────────────────
private uint ResolveTex(WorldEntity entity, MeshRef meshRef, SubMeshGpu sub)
{
uint overrideOrigTex = 0;
bool hasOrigTexOverride = meshRef.SurfaceOverrides is not null
&& meshRef.SurfaceOverrides.TryGetValue(sub.SurfaceId, out overrideOrigTex);
uint? origTexOverride = hasOrigTexOverride ? overrideOrigTex : (uint?)null;
if (entity.PaletteOverride is not null)
{
return _textures.GetOrUploadWithPaletteOverride(
sub.SurfaceId, origTexOverride, entity.PaletteOverride);
}
else if (hasOrigTexOverride)
{
return _textures.GetOrUploadWithOrigTextureOverride(sub.SurfaceId, overrideOrigTex);
}
else
{
return _textures.GetOrUpload(sub.SurfaceId);
}
}
// ── Disposal ──────────────────────────────────────────────────────────────
public void Dispose()
{
foreach (var subs in _gpuByGfxObj.Values)
{
foreach (var sub in subs)
{
_gl.DeleteBuffer(sub.Vbo);
_gl.DeleteBuffer(sub.Ebo);
_gl.DeleteVertexArray(sub.Vao);
}
}
_gl.DeleteBuffer(_instanceVbo);
_gpuByGfxObj.Clear();
_groups.Clear();
}
// ── Private types ─────────────────────────────────────────────────────────
private sealed class SubMeshGpu
{
public uint Vao;
public uint Vbo;
public uint Ebo;
public int IndexCount;
public uint SurfaceId;
public TranslucencyKind Translucency;
}
/// <summary>
/// All instances sharing one GroupKey (GfxObjId + texture signature) for this
/// frame, plus their starting offset in the shared instance VBO (in units of
/// instances, not bytes).
/// </summary>
private sealed class InstanceGroup
{
public readonly List<InstanceEntry> Entries = new();
public int BufferOffset;
public int Count => Entries.Count;
}
private readonly struct InstanceEntry
{
public readonly Matrix4x4 Model;
public readonly WorldEntity Entity;
public readonly MeshRef MeshRef;
public InstanceEntry(Matrix4x4 model, WorldEntity entity, MeshRef meshRef)
{
Model = model;
Entity = entity;
MeshRef = meshRef;
}
}
}
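
The surface-override signature used for grouping above can be exercised in isolation. The following sketch reimplements the order-independent XOR hash over a plain dictionary; the class name and sample IDs are hypothetical, and the constants are the FNV 64-bit offset basis and prime that the source already uses.

```csharp
using System.Collections.Generic;

public static class SurfaceHashSketch
{
    private const ulong FnvOffset = 0xCBF29CE484222325UL;
    private const ulong FnvPrime = 0x100000001B3UL;

    // XOR is commutative, so the accumulated value does not depend on the
    // dictionary's iteration order. Null and empty maps both hash to 0.
    public static ulong Hash(IReadOnlyDictionary<uint, uint>? overrides)
    {
        if (overrides is null || overrides.Count == 0) return 0UL;
        ulong acc = 0UL;
        foreach (var kvp in overrides)
            acc ^= ((ulong)kvp.Key << 32) | kvp.Value;
        // Final fold keeps a non-empty map whose pairs XOR to zero
        // distinct from the empty case.
        return unchecked((acc ^ FnvOffset) * FnvPrime);
    }

    // Two maps holding the same pairs, inserted in opposite orders.
    public static bool SameRegardlessOfOrder()
    {
        var a = new Dictionary<uint, uint> { [1] = 10, [2] = 20 };
        var b = new Dictionary<uint, uint> { [2] = 20, [1] = 10 };
        return Hash(a) == Hash(b);
    }
}
```

One trade-off of a plain XOR accumulator is that distinct sets of pairs can cancel to the same value, so a 64-bit match is strong evidence of equal overrides but not proof.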