Merge branch 'claude/priceless-feistel-c12935' — Phase N.5 SHIP
N.5: Modern Rendering Path.

WbDrawDispatcher now uses bindless textures + glMultiDrawElementsIndirect on top of N.4's grouped pipeline. Three SSBO uploads + 2 indirect calls per frame, ~12-15 total GL calls for entity rendering regardless of scene complexity. Measured 1.23 ms / frame median at Holtburg courtyard (1662 groups, ~810 fps). User-gated visual verification PASS at Holtburg.

Includes ship-amendment: legacy renderer path formally retired (InstancedMeshRenderer + StaticMeshRenderer + WbFoundationFlag deleted). Bindless is now mandatory; missing extensions throw NotSupportedException at startup with a clear error message.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
commit
27eaf4e0be
23 changed files with 4379 additions and 1278 deletions
63
CLAUDE.md
|
|
@ -55,9 +55,11 @@ ourselves".
|
||||||
`EntitySpawnAdapter.cs` — bridge spawn lifecycle to WB ref-counts.
|
`EntitySpawnAdapter.cs` — bridge spawn lifecycle to WB ref-counts.
|
||||||
Atlas tier (procedural) goes via Landblock; per-instance tier
|
Atlas tier (procedural) goes via Landblock; per-instance tier
|
||||||
(server-spawned, palette/texture overrides) goes via Entity.
|
(server-spawned, palette/texture overrides) goes via Entity.
|
||||||
- `WbFoundationFlag` is default-on. `ACDREAM_USE_WB_FOUNDATION=0`
|
- **Modern path is mandatory as of N.5 ship amendment (2026-05-08).**
|
||||||
falls back to legacy `InstancedMeshRenderer` (kept as escape hatch
|
`WbFoundationFlag`, `InstancedMeshRenderer`, and `StaticMeshRenderer`
|
||||||
until N.6 fully retires it).
|
are deleted. Missing `GL_ARB_bindless_texture` or
|
||||||
|
`GL_ARB_shader_draw_parameters` throws `NotSupportedException` at
|
||||||
|
startup. There is no legacy fallback.
|
||||||
- **WB's modern rendering path** (GL 4.3 + bindless) packs every mesh
|
- **WB's modern rendering path** (GL 4.3 + bindless) packs every mesh
|
||||||
into a single global VAO/VBO/IBO. Each batch references its slice
|
into a single global VAO/VBO/IBO. Each batch references its slice
|
||||||
via `FirstIndex` (offset into IBO) + `BaseVertex` (offset into VBO).
|
via `FirstIndex` (offset into IBO) + `BaseVertex` (offset into VBO).
|
||||||
|
|
@ -72,6 +74,34 @@ ourselves".
|
||||||
`PrepareMeshDataAsync(id, isSetup)` to fire the background decode.
|
`PrepareMeshDataAsync(id, isSetup)` to fire the background decode.
|
||||||
Result auto-enqueues to `_stagedMeshData` which `Tick()` drains.
|
Result auto-enqueues to `_stagedMeshData` which `Tick()` drains.
|
||||||
`WbMeshAdapter` does this for you on first registration.
|
`WbMeshAdapter` does this for you on first registration.
|
||||||
|
- **N.5 modern dispatch** (`docs/superpowers/specs/2026-05-08-phase-n5-modern-rendering-design.md`)
|
||||||
|
uses bindless textures + multi-draw indirect on top of N.4's grouped
|
||||||
|
pipeline. Per frame: three SSBO uploads (`_instanceSsbo` mat4 per
|
||||||
|
instance @ binding=0; `_batchSsbo` `(uvec2 textureHandle, uint layer,
|
||||||
|
uint flags)` per group @ binding=1; `_indirectBuffer`
|
||||||
|
`DrawElementsIndirectCommand[]` opaque-section + transparent-section).
|
||||||
|
Two `glMultiDrawElementsIndirect` calls per frame, one per pass.
|
||||||
|
Total ~12-15 GL calls per frame for entity rendering regardless of
|
||||||
|
scene complexity.
|
||||||
|
- **`TextureCache` requires `BindlessSupport`** for the WB modern path.
|
||||||
|
Three `Bindless`-suffixed `GetOrUpload*` methods return 64-bit handles
|
||||||
|
made resident at upload time, backed by parallel Texture2DArray uploads
|
||||||
|
(`UploadRgba8AsLayer1Array`). The legacy `uint`-returning methods stay
|
||||||
|
for Sky / Terrain / Debug / particle paths that still sample via
|
||||||
|
`sampler2D`. Once N.6 migrates those remaining consumers, the legacy upload path
|
||||||
|
+ caches can be deleted.
|
||||||
|
- **Translucency model is two-pass alpha-test** (matches WB), not
|
||||||
|
per-blend-mode subpasses. Opaque pass discards `α<0.95`; transparent
|
||||||
|
pass discards `α≥0.95` AND `α<0.05`. Native `Additive` blend renders
|
||||||
|
as alpha-blend on GfxObj surfaces — falsifiable; if a magic-content
|
||||||
|
regression shows up, add a third indirect call with
|
||||||
|
`glBlendFunc(SrcAlpha, One)` per spec §6 fallback (~30 min change).
|
||||||
|
- **Per-instance highlight (selection blink) is reserved.** `mesh_modern.vert`'s
|
||||||
|
`InstanceData` struct has a documented hook for `vec4 highlightColor`
|
||||||
|
— Phase B.4 follow-up adds the field + plumbs server-side selection
|
||||||
|
state. Stride grows from 64 → 80 bytes when added; shader updates
|
||||||
|
trivially (read the field from `Instances[instanceIndex]` + mix into
|
||||||
|
fragment color).
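
The two-pass alpha-test rule above can be sketched as a small classifier. This is a minimal illustration, assuming only the thresholds quoted in the note (0.95 and 0.05); the function and constant names are hypothetical and do not come from `mesh_modern.vert` or any acdream source.

```python
from typing import Optional

# Thresholds quoted from the translucency note; names are illustrative only.
OPAQUE_MIN_ALPHA = 0.95       # opaque pass discards alpha < 0.95
TRANSPARENT_MIN_ALPHA = 0.05  # transparent pass also discards alpha < 0.05


def pass_for_alpha(alpha: float) -> Optional[str]:
    """Return the pass that keeps a fragment with this alpha, or None if both discard it."""
    if alpha >= OPAQUE_MIN_ALPHA:
        return "opaque"        # kept by the opaque pass, discarded by the transparent pass
    if alpha >= TRANSPARENT_MIN_ALPHA:
        return "transparent"   # discarded by the opaque pass, kept by the transparent pass
    return None                # nearly invisible: discarded by both passes
```

Every fragment lands in at most one pass, which is what allows a single blend state per pass and therefore one indirect call per pass.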
|
||||||
|
|
||||||
**Execution phases:** R1→R8 in the architecture doc. Each phase has clear
|
**Execution phases:** R1→R8 in the architecture doc. Each phase has clear
|
||||||
goals, test criteria, and builds on the previous. Don't skip phases.
|
goals, test criteria, and builds on the previous. Don't skip phases.
|
||||||
|
|
@ -472,18 +502,25 @@ acdream's plan lives in two files committed to the repo:
|
||||||
acceptance criteria. Do not drift from the spec without explicit user
|
acceptance criteria. Do not drift from the spec without explicit user
|
||||||
approval.
|
approval.
|
||||||
|
|
||||||
**Currently in flight: Phase N.5 — Modern Rendering Path.** Roadmap entry
|
**Currently in flight: Phase N.6 — Perf polish.**
|
||||||
at [`docs/plans/2026-04-11-roadmap.md`](docs/plans/2026-04-11-roadmap.md).
|
Roadmap entry at [`docs/plans/2026-04-11-roadmap.md`](docs/plans/2026-04-11-roadmap.md).
|
||||||
Builds on N.4's `WbDrawDispatcher` to adopt WB's modern rendering primitives:
|
Builds on N.5. Legacy renderers (`InstancedMeshRenderer`, `StaticMeshRenderer`,
|
||||||
bindless textures (eliminate `glBindTexture` calls) and
|
`WbFoundationFlag`) were retired in the N.5 ship amendment — N.6 scope is
|
||||||
`glMultiDrawElementsIndirect` (one GL call per pass instead of one per
|
perf-only: WB atlas adoption, persistent-mapped buffers, GPU-side culling,
|
||||||
group). Together these target a 2-5× CPU win on draw-heavy scenes by
|
GL_TIME_ELAPSED query double-buffering, direct N.4 vs N.5 perf measurement,
|
||||||
eliminating the remaining per-group state changes. Plan + spec to be
|
legacy `Texture2D`/`sampler2D` TextureCache path retirement (Sky/Terrain/Debug).
|
||||||
written when work begins.
|
Plan + spec to be written when work begins.
|
||||||
|
|
||||||
|
**Phase N.5 (Modern Rendering Path) shipped + amended 2026-05-08.** `WbDrawDispatcher`
|
||||||
|
on bindless textures + `glMultiDrawElementsIndirect`. CPU dispatcher 1.23ms/frame
|
||||||
|
at Holtburg (~810 fps). **Ship amendment:** `InstancedMeshRenderer`,
|
||||||
|
`StaticMeshRenderer`, `WbFoundationFlag` deleted in same phase — modern path is
|
||||||
|
mandatory; missing bindless throws at startup. Plan archived at
|
||||||
|
[`docs/superpowers/plans/2026-05-08-phase-n5-modern-rendering.md`](docs/superpowers/plans/2026-05-08-phase-n5-modern-rendering.md).
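
The startup gate described above might look roughly like the following. This is a language-neutral sketch in Python, assuming the two extension names listed elsewhere in this file; `RendererNotSupported` and `require_modern_path` are hypothetical stand-ins for the project's `NotSupportedException` path, not its actual code.

```python
REQUIRED_EXTENSIONS = (
    "GL_ARB_bindless_texture",
    "GL_ARB_shader_draw_parameters",
)


class RendererNotSupported(RuntimeError):
    """Hypothetical stand-in for the NotSupportedException thrown at startup."""


def require_modern_path(available_extensions) -> None:
    """Raise with a clear message if any required extension is missing; there is no fallback."""
    available = set(available_extensions)
    missing = [ext for ext in REQUIRED_EXTENSIONS if ext not in available]
    if missing:
        raise RendererNotSupported(
            "Modern rendering path is mandatory; missing: " + ", ".join(missing)
        )
```

Failing once at startup keeps the check out of the per-frame path and makes the unsupported-hardware case explicit.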
|
||||||
|
|
||||||
**Phase N.4 (Rendering Pipeline Foundation) shipped 2026-05-08.** WB's
|
**Phase N.4 (Rendering Pipeline Foundation) shipped 2026-05-08.** WB's
|
||||||
`ObjectMeshManager` is integrated and is the default rendering path
|
`ObjectMeshManager` is integrated and is the production rendering path
|
||||||
behind `ACDREAM_USE_WB_FOUNDATION` (default-on). Plan archived at
|
(mandatory as of N.5 ship amendment). Plan archived at
|
||||||
[`docs/superpowers/plans/2026-05-08-phase-n4-rendering-foundation.md`](docs/superpowers/plans/2026-05-08-phase-n4-rendering-foundation.md).
|
[`docs/superpowers/plans/2026-05-08-phase-n4-rendering-foundation.md`](docs/superpowers/plans/2026-05-08-phase-n4-rendering-foundation.md).
|
||||||
|
|
||||||
**Rules:**
|
**Rules:**
|
||||||
|
|
|
||||||
|
|
@ -82,11 +82,12 @@ ground. This is the bug class fixed in
|
||||||
|
|
||||||
**Sequencing implication:** Phase N.2 (terrain math helpers
|
**Sequencing implication:** Phase N.2 (terrain math helpers
|
||||||
substitution) cannot be shipped in isolation — it must land alongside
|
substitution) cannot be shipped in isolation — it must land alongside
|
||||||
N.5 (visual terrain renderer migration), at which point both physics
|
visual terrain renderer migration (originally N.5, now moved to N.7
|
||||||
and visual mesh switch to WB's formula together. Roadmap N.2 entry
|
scope), at which point both physics and visual mesh switch to WB's
|
||||||
flags this dependency.
|
formula together. N.5 shipped entity rendering only; terrain remains
|
||||||
|
on acdream's own pipeline through N.7.
|
||||||
|
|
||||||
**Research needed (when N.5 picks this up):**
|
**Research needed (when N.7 picks this up):**
|
||||||
1. Quantify divergence: run WB's `CalculateSplitDirection` and our
|
1. Quantify divergence: run WB's `CalculateSplitDirection` and our
|
||||||
`IsSplitSWtoNE` across all (lbX, lbY, cellX, cellY) tuples for a
|
`IsSplitSWtoNE` across all (lbX, lbY, cellX, cellY) tuples for a
|
||||||
representative landblock set; record disagreement rate.
|
representative landblock set; record disagreement rate.
|
||||||
|
|
@ -97,8 +98,8 @@ flags this dependency.
|
||||||
server-authoritative Z within tolerance) is invalidated by the
|
server-authoritative Z within tolerance) is invalidated by the
|
||||||
formula change.
|
formula change.
|
||||||
|
|
||||||
**Acceptance:** Resolved when N.5 lands and both physics + visual
|
**Acceptance:** Resolved when N.7 lands and both physics + visual
|
||||||
mesh use WB's split formula, OR when we decide to keep the AC2D
|
terrain use WB's split formula, OR when we decide to keep the AC2D
|
||||||
formula and patch WB's renderer in our fork.
|
formula and patch WB's renderer in our fork.
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
@ -998,8 +999,8 @@ If the coat texture's UVs at the upper region map to texel-bytes whose palette i
|
||||||
|
|
||||||
**Files (diagnostic env vars committed for next-session reuse):**
|
**Files (diagnostic env vars committed for next-session reuse):**
|
||||||
|
|
||||||
- `src/AcDream.App/Rendering/InstancedMeshRenderer.cs:210-275`
|
- ~~`src/AcDream.App/Rendering/InstancedMeshRenderer.cs:210-275`
|
||||||
— `ACDREAM_NO_CULL` env var
|
— `ACDREAM_NO_CULL` env var~~ (file deleted in N.5 ship amendment)
|
||||||
- `src/AcDream.App/Rendering/GameWindow.cs` — `ACDREAM_HIDE_PART=N`
|
- `src/AcDream.App/Rendering/GameWindow.cs` — `ACDREAM_HIDE_PART=N`
|
||||||
hides specific humanoid part; `ACDREAM_DUMP_CLOTHING=1` dumps
|
hides specific humanoid part; `ACDREAM_DUMP_CLOTHING=1` dumps
|
||||||
AnimPartChanges + TextureChanges + per-part Surface chain coverage.
|
AnimPartChanges + TextureChanges + per-part Surface chain coverage.
|
||||||
|
|
|
||||||
|
|
@ -1,6 +1,6 @@
|
||||||
# acdream — strategic roadmap
|
# acdream — strategic roadmap
|
||||||
|
|
||||||
**Status:** Living document. Updated 2026-05-08 for Phase N.4 shipping (`WbMeshAdapter` + `WbDrawDispatcher` + `ACDREAM_USE_WB_FOUNDATION` default-on) + N.5 rebranded to "Modern rendering path" (bindless + multi-draw indirect on top of N.4's foundation).
|
**Status:** Living document. Updated 2026-05-08 for Phase N.5 shipping (bindless textures + `glMultiDrawElementsIndirect` on top of N.4's foundation; CPU dispatcher 1.23ms/frame at Holtburg, ~810 fps) + N.6 becomes the new in-flight phase (perf polish; legacy renderers already retired in the N.5 ship amendment).
|
||||||
**Purpose:** One source of truth for where the project is and where it's going. Every observed defect or missing feature has a named phase that owns it; when something looks wrong in-game, look here to find the phase that'll address it. Implementation details live in per-phase specs under `docs/superpowers/specs/`, not in this file.
|
**Purpose:** One source of truth for where the project is and where it's going. Every observed defect or missing feature has a named phase that owns it; when something looks wrong in-game, look here to find the phase that'll address it. Implementation details live in per-phase specs under `docs/superpowers/specs/`, not in this file.
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
@ -59,7 +59,8 @@
|
||||||
| C.1 | PES particle system + sky-pass refinements — retail-faithful `ParticleEmitterInfo` unpack with all 13 motion integrators (`Particle::Init`/`Update` ports of `0x0051c290`/`0x0051c930`), `PhysicsScriptRunner` with `CallPES` self-loop semantics, `ParticleHookSink` with `EmitterDied` cleanup, instanced billboard `ParticleRenderer` with material-derived blend (DAT emitters never default additive — pulled from particle GfxObj surface), global back-to-front sort, BC clipmap alpha-keying, AttachLocal `is_parent_local=1` live-parent follow via `UpdateEmitterAnchor`. Sky pass: `Translucent+ClipMap` → alpha-blend cloud sheet (matches `D3DPolyRender::SetSurface` `0x0059c4d0`), raw-`Additive` fog-skip (matches `0x0059c882`), per-keyframe `SkyObjectReplace` Translucency/Luminosity/MaxBright divide-by-100, bit `0x01` pre/post-scene split (matches `GameSky::CreateDeletePhysicsObjects` `0x005073c0`), Setup-backed (`0x020xxxxx`) sky objects via `SetupMesh.Flatten`, persistent GL sampler objects (Wrap + ClampToEdge) replace per-frame wrap-mode mutation (ported from WorldBuilder's `OpenGLGraphicsDevice`), post-scene Z-offset gated on `(Properties & 4) != 0 && (Properties & 8) == 0` per `GameSky::UpdatePosition` `0x00506dd0`. Sky-PES playback disabled by default (named-retail proves `GameSky` drops `pes_id`); `ACDREAM_ENABLE_SKY_PES=1` opens the experimental path. 1325 → 1331 tests. | Live ✓ |
|
| C.1 | PES particle system + sky-pass refinements — retail-faithful `ParticleEmitterInfo` unpack with all 13 motion integrators (`Particle::Init`/`Update` ports of `0x0051c290`/`0x0051c930`), `PhysicsScriptRunner` with `CallPES` self-loop semantics, `ParticleHookSink` with `EmitterDied` cleanup, instanced billboard `ParticleRenderer` with material-derived blend (DAT emitters never default additive — pulled from particle GfxObj surface), global back-to-front sort, BC clipmap alpha-keying, AttachLocal `is_parent_local=1` live-parent follow via `UpdateEmitterAnchor`. Sky pass: `Translucent+ClipMap` → alpha-blend cloud sheet (matches `D3DPolyRender::SetSurface` `0x0059c4d0`), raw-`Additive` fog-skip (matches `0x0059c882`), per-keyframe `SkyObjectReplace` Translucency/Luminosity/MaxBright divide-by-100, bit `0x01` pre/post-scene split (matches `GameSky::CreateDeletePhysicsObjects` `0x005073c0`), Setup-backed (`0x020xxxxx`) sky objects via `SetupMesh.Flatten`, persistent GL sampler objects (Wrap + ClampToEdge) replace per-frame wrap-mode mutation (ported from WorldBuilder's `OpenGLGraphicsDevice`), post-scene Z-offset gated on `(Properties & 4) != 0 && (Properties & 8) == 0` per `GameSky::UpdatePosition` `0x00506dd0`. Sky-PES playback disabled by default (named-retail proves `GameSky` drops `pes_id`); `ACDREAM_ENABLE_SKY_PES=1` opens the experimental path. 1325 → 1331 tests. | Live ✓ |
|
||||||
| N.1 | WorldBuilder-backed scenery (Chorizite/WorldBuilder fork as submodule, SceneryHelpers + TerrainUtils replace our inline ports) | Live ✓ |
|
| N.1 | WorldBuilder-backed scenery (Chorizite/WorldBuilder fork as submodule, SceneryHelpers + TerrainUtils replace our inline ports) | Live ✓ |
|
||||||
| N.3 | WorldBuilder-backed texture decode — `SurfaceDecoder` delegates INDEX16 / P8 / A8R8G8B8 / R8G8B8 / A8(+Additive) to `TextureHelpers.Fill*`; `isAdditive` threaded through (terrain alpha → `FillA8Additive`, non-additive entity surfaces → `FillA8`). R5G6B5 + A4R4G4B4 newly handled (previously magenta). X8R8G8B8, DXT1/3/5, SolidColor remain ours (no WB equivalent). 9 conformance tests prove byte-identical equivalence per format. | Live ✓ |
|
| N.3 | WorldBuilder-backed texture decode — `SurfaceDecoder` delegates INDEX16 / P8 / A8R8G8B8 / R8G8B8 / A8(+Additive) to `TextureHelpers.Fill*`; `isAdditive` threaded through (terrain alpha → `FillA8Additive`, non-additive entity surfaces → `FillA8`). R5G6B5 + A4R4G4B4 newly handled (previously magenta). X8R8G8B8, DXT1/3/5, SolidColor remain ours (no WB equivalent). 9 conformance tests prove byte-identical equivalence per format. | Live ✓ |
|
||||||
| N.4 | Rendering pipeline foundation — adopted WB's `ObjectMeshManager` as the production mesh pipeline behind `ACDREAM_USE_WB_FOUNDATION` (default-on). `WbMeshAdapter` is the single seam (owns `ObjectMeshManager`, drains the staged-upload queue per frame, populates `AcSurfaceMetadataTable` with per-batch translucency / luminosity / fog metadata). `WbDrawDispatcher` is the production draw path: groups all visible (entity, batch) pairs, single-uploads the matrix buffer, fires one `glDrawElementsInstancedBaseVertexBaseInstance` per group with `BaseInstance` slicing into the shared instance VBO. `LandblockSpawnAdapter` + `EntitySpawnAdapter` bridge spawn lifecycle to WB ref-counts (atlas tier vs per-instance). Perf wins shipped as part of N.4: per-entity frustum cull, opaque front-to-back sort, palette-hash memoization (compute once per entity, reuse across batches). Visual verification at Holtburg passed: scenery + connected characters with full close-detail geometry (Issue #47 regression resolved). Legacy `InstancedMeshRenderer` retained as `ACDREAM_USE_WB_FOUNDATION=0` escape hatch until N.6. | Live ✓ |
|
| N.4 | Rendering pipeline foundation — adopted WB's `ObjectMeshManager` as the production mesh pipeline behind `ACDREAM_USE_WB_FOUNDATION` (default-on). `WbMeshAdapter` is the single seam (owns `ObjectMeshManager`, drains the staged-upload queue per frame, populates `AcSurfaceMetadataTable` with per-batch translucency / luminosity / fog metadata). `WbDrawDispatcher` is the production draw path: groups all visible (entity, batch) pairs, single-uploads the matrix buffer, fires one `glDrawElementsInstancedBaseVertexBaseInstance` per group with `BaseInstance` slicing into the shared instance VBO. `LandblockSpawnAdapter` + `EntitySpawnAdapter` bridge spawn lifecycle to WB ref-counts (atlas tier vs per-instance). Perf wins shipped as part of N.4: per-entity frustum cull, opaque front-to-back sort, palette-hash memoization (compute once per entity, reuse across batches). Visual verification at Holtburg passed: scenery + connected characters with full close-detail geometry (Issue #47 regression resolved). Legacy `InstancedMeshRenderer` retained as `ACDREAM_USE_WB_FOUNDATION=0` escape hatch until N.6 (retired early in N.5 ship amendment). | Live ✓ |
|
||||||
|
| N.5 | Modern rendering path — lifted `WbDrawDispatcher` onto bindless textures (`GL_ARB_bindless_texture`) + `glMultiDrawElementsIndirect`. Per-frame entity rendering: 3 SSBO uploads (instance matrices @ binding=0, batch data @ binding=1, indirect commands) + 2 indirect draw calls (opaque + transparent). ~12-15 GL calls per frame regardless of group count, down from hundreds of per-group calls in N.4. CPU dispatcher: 1.23 ms/frame median at Holtburg courtyard (1662 groups, ~810 fps sustained). All textures on the WB modern path use 1-layer `Texture2DArray` + `sampler2DArray`. Legacy callers keep `Texture2D` / `sampler2D` via the parallel `TextureCache` path until N.6 retires them. Three gotchas captured in memory: texture target lock-in, bindless Dispose order (two-phase non-resident before delete), GL_TIME_ELAPSED double-buffering. **Ship amendment 2026-05-08:** legacy renderers (`InstancedMeshRenderer`, `StaticMeshRenderer`, `WbFoundationFlag`) retired within N.5 — modern path is mandatory; missing bindless throws `NotSupportedException` at startup. N.6 scope narrowed accordingly. Plan archived at `docs/superpowers/plans/2026-05-08-phase-n5-modern-rendering.md`. | Live ✓ |
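
The per-record sizes implied by the N.5 row can be checked with a short packing sketch. This is an assumption-laden illustration: the layouts are inferred from the descriptions in this commit (mat4 per instance, `(uvec2 textureHandle, uint layer, uint flags)` per batch, and the reserved `vec4 highlightColor` that would grow the stride from 64 to 80 bytes). The low-word-first handle split follows the usual `uvec2` convention for bindless handles and is not confirmed against the project's shaders; all names are hypothetical.

```python
import struct

# Assumed std430-style layouts inferred from the notes; not copied from the shaders.
INSTANCE_FORMAT = "<16f"            # mat4 model matrix: 64 bytes
INSTANCE_WITH_HIGHLIGHT = "<16f4f"  # mat4 + reserved vec4 highlightColor: 80 bytes
BATCH_FORMAT = "<2III"              # uvec2 textureHandle, uint layer, uint flags: 16 bytes


def pack_batch(texture_handle: int, layer: int, flags: int) -> bytes:
    """Split a 64-bit bindless handle into (low, high) words and pack one batch record."""
    low = texture_handle & 0xFFFFFFFF
    high = (texture_handle >> 32) & 0xFFFFFFFF
    return struct.pack(BATCH_FORMAT, low, high, layer, flags)


def unpack_batch(record: bytes):
    """Inverse of pack_batch: returns (texture_handle, layer, flags)."""
    low, high, layer, flags = struct.unpack(BATCH_FORMAT, record)
    return (high << 32) | low, layer, flags
```

A 16-byte batch record keeps the per-group SSBO tightly packed, and the 64-byte instance stride matches one `mat4` with no padding.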
|
||||||
|
|
||||||
Plus polish that doesn't get its own phase number:
|
Plus polish that doesn't get its own phase number:
|
||||||
- FlyCamera default speed lowered + Shift-to-boost
|
- FlyCamera default speed lowered + Shift-to-boost
|
||||||
|
|
@ -624,22 +625,21 @@ for our deletions/additions; merge upstream `master` periodically.
|
||||||
memoization. Legacy `InstancedMeshRenderer` retained as flag-off
|
memoization. Legacy `InstancedMeshRenderer` retained as flag-off
|
||||||
fallback until N.6 fully retires it. Plan archived at
|
fallback until N.6 fully retires it. Plan archived at
|
||||||
`docs/superpowers/plans/2026-05-08-phase-n4-rendering-foundation.md`.
|
`docs/superpowers/plans/2026-05-08-phase-n4-rendering-foundation.md`.
|
||||||
- **N.5 — Modern rendering path.** **Rebranded from "Terrain rendering"
|
- **✓ SHIPPED — N.5 — Modern rendering path.** Shipped 2026-05-08.
|
||||||
2026-05-08 after N.4 perf review.** N.4 left two big remaining wins
|
**Rebranded from "Terrain rendering" 2026-05-08 after N.4 perf
|
||||||
on the table that pair naturally: (1) bindless textures via
|
review.** Lifted `WbDrawDispatcher` onto bindless textures
|
||||||
`GL_ARB_bindless_texture` (WB already populates
|
(`GL_ARB_bindless_texture`) + `glMultiDrawElementsIndirect`. Per-frame
|
||||||
`ObjectRenderBatch.BindlessTextureHandle`; switch our shader to
|
entity rendering: 3 SSBO uploads (instance matrices @ binding=0, batch
|
||||||
consume per-instance handles, eliminate 100% of `glBindTexture`
|
data @ binding=1, indirect commands) + 2 indirect calls (opaque +
|
||||||
calls), and (2) `glMultiDrawElementsIndirect` (one GL call per pass
|
transparent). ~12-15 GL calls per frame regardless of group count, down
|
||||||
instead of one per group; build a `DrawElementsIndirectCommand`
|
from hundreds of per-group calls in N.4. CPU dispatcher: 1.23 ms/frame median
|
||||||
buffer, fire one indirect draw, the driver pulls everything). Both
|
at Holtburg (1662 groups, ~810 fps). All textures on the modern path use
|
||||||
require shader changes (same shader, in fact — bindless + indirect
|
1-layer `Texture2DArray` + `sampler2DArray`; legacy callers retain
|
||||||
are the same modern path WB uses internally). Together they target a
|
`Texture2D` via the parallel `TextureCache` path until N.6 retires them.
|
||||||
2-5× CPU win on draw-heavy scenes (Holtburg courtyard, Foundry,
|
Three gotchas in memory (`project_phase_n5_state.md`): texture target
|
||||||
dense dungeons). Also folds in: persistent-mapped instance VBO
|
lock-in, bindless Dispose two-phase order, GL_TIME_ELAPSED double-
|
||||||
(`glBufferStorage` + `MAP_PERSISTENT_BIT | MAP_COHERENT_BIT` + ring
|
buffering. Plan archived at
|
||||||
buffer + sync) and texture pre-warm at landblock load (smooths
|
`docs/superpowers/plans/2026-05-08-phase-n5-modern-rendering.md`.
|
||||||
streaming-boundary hitches). **Estimate: 2-3 weeks.**
|
|
||||||
- **N.5b — Terrain rendering on N.5 path.** Wire WB's
|
- **N.5b — Terrain rendering on N.5 path.** Wire WB's
|
||||||
`TerrainRenderManager` + `LandSurfaceManager` + `TerrainGeometryGenerator`
|
`TerrainRenderManager` + `LandSurfaceManager` + `TerrainGeometryGenerator`
|
||||||
onto the modern rendering path. Closes N.2's deferred terrain math
|
onto the modern rendering path. Closes N.2's deferred terrain math
|
||||||
|
|
@ -647,12 +647,17 @@ for our deletions/additions; merge upstream `master` periodically.
|
||||||
`CalculateSplitDirection` + `GetHeight` + `GetNormal` in lockstep,
|
`CalculateSplitDirection` + `GetHeight` + `GetNormal` in lockstep,
|
||||||
resolving ISSUE #51. **Estimate: 1-2 weeks** (was 2-3 — modern path
|
resolving ISSUE #51. **Estimate: 1-2 weeks** (was 2-3 — modern path
|
||||||
primitives already in place from N.5).
|
primitives already in place from N.5).
|
||||||
- **N.6 — Static objects rendering.** Wire WB's
|
- **N.6 — Perf polish.** **Currently in flight.**
|
||||||
`StaticObjectRenderManager` onto the modern rendering path; **fully
|
Builds on N.5. Legacy renderer retirement was pulled forward into N.5
|
||||||
delete** legacy `StaticMeshRenderer` + `InstancedMeshRenderer` (they
|
ship amendment — `InstancedMeshRenderer`, `StaticMeshRenderer`, and
|
||||||
remain as `ACDREAM_USE_WB_FOUNDATION=0` escape hatches through N.5).
|
`WbFoundationFlag` are already gone. N.6 scope: WB atlas adoption for
|
||||||
Mostly draw orchestration at this point — most of the substance
|
memory savings on shared content, persistent-mapped buffers if
|
||||||
landed in N.4 + N.5. **Estimate: 1-2 weeks** (was 2-3).
|
`glBufferData` shows up in profiling, GPU-side culling via compute
|
||||||
|
pre-pass, GL_TIME_ELAPSED query double-buffering (deferred from N.5 —
|
||||||
|
diagnostic shows `gpu_us=0/0` under `ACDREAM_WB_DIAG=1`), direct N.4
|
||||||
|
vs N.5 perf measurement, retire the legacy `Texture2D`/`sampler2D` path
|
||||||
|
in `TextureCache` (currently kept for Sky + Terrain + Debug).
|
||||||
|
Plan + spec to be written when work begins. **Estimate: 1-2 weeks.**
|
||||||
- **N.7 — EnvCells / dungeons.** Replace EnvCell rendering with WB's
|
- **N.7 — EnvCells / dungeons.** Replace EnvCell rendering with WB's
|
||||||
`EnvCellRenderManager` + `PortalRenderManager` on top of N.4's
|
`EnvCellRenderManager` + `PortalRenderManager` on top of N.4's
|
||||||
foundation. **Estimate: 1-2 weeks** (was 2-3 — naturally smaller now
|
foundation. **Estimate: 1-2 weeks** (was 2-3 — naturally smaller now
|
||||||
|
|
|
||||||
72
docs/plans/2026-05-08-phase-n5-perf-baseline.md
Normal file
|
|
@ -0,0 +1,72 @@
|
||||||
|
# Phase N.5 perf baseline
|
||||||
|
|
||||||
|
**Captured:** 2026-05-08, against N.5 head (post-Task 12) on local machine.
|
||||||
|
**Method:** `ACDREAM_WB_DIAG=1` + character at Holtburg spawn position +
|
||||||
|
roaming. Numbers below are 5-second window medians from `[WB-DIAG]`.
|
||||||
|
|
||||||
|
## Holtburg courtyard (steady state)
|
||||||
|
|
||||||
|
| Metric | N.5 measured | N.4 (estimated*) | Gate |
|
||||||
|
|---|---|---|---|
|
||||||
|
| CPU dispatcher (median) | **1227 µs / frame** | ≥2500 µs / frame | ≤70% of N.4 → **PASS** |
|
||||||
|
| CPU dispatcher (p95) | 1303 µs / frame | — | — |
|
||||||
|
| GPU rendering (median) | unmeasured (see below) | — | within ±10% — **DEFERRED** |
|
||||||
|
| `drawsIssued` per 5s | 4.85M (= 1662 groups × ~580 fps × 5 s) | far higher per frame | — |
|
||||||
|
| `drawsIssued` per pass (CPU GL calls) | **2** (1 opaque + 1 transparent indirect) | ~hundreds per pass | ≤5 → **PASS** |
|
||||||
|
| `groups` (working set) | 1662 | ~similar | sanity |
|
||||||
|
| Frame rate (inferred) | ~810 fps | ~100-200 fps | substantial uplift |
|
||||||
|
|
||||||
|
*N.4 baseline NOT measured directly in this run. The "≥2500 µs / frame"
|
||||||
|
estimate assumes N.4's per-group glBindTexture + glBindBuffer +
|
||||||
|
glDrawElementsInstancedBaseVertexBaseInstance hot path costs ≥1.5 µs per
|
||||||
|
group and N.4 has ~1700 groups in this scene, putting the GL portion alone
|
||||||
|
at ~2.5 ms before adding the entity-walk overhead. N.5's measurement
|
||||||
|
includes ALL dispatcher work (entity walk + group bucketing + 3 SSBO
|
||||||
|
uploads + 2 indirect calls + state changes) at 1230 µs total — comfortably
|
||||||
|
half of the lower bound estimate.
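
The arithmetic behind the table and footnote can be reproduced in a few lines. This is only a worked check of the figures quoted above, with variable names chosen for illustration. The dispatcher-only bound comes out near the ~810 fps figure and the `drawsIssued` counter implies roughly 580 frames per second, which suggests (though the source does not say so) that the two frame-rate figures measure different things.

```python
# Figures quoted in the baseline table and footnote above.
N5_MEDIAN_US = 1227.0        # measured CPU dispatcher median per frame
N4_PER_GROUP_US = 1.5        # assumed lower-bound cost per group in N.4
N4_GROUPS = 1700             # approximate group count used for the estimate
N4_LOWER_BOUND_US = 2500.0   # rounded estimate used in the table
GATE_RATIO = 0.70            # acceptance gate: N.5 <= 70% of N.4
DRAWS_PER_WINDOW = 4.85e6    # drawsIssued per 5-second window
WINDOW_SECONDS = 5.0
GROUPS = 1662

n4_estimate_us = N4_PER_GROUP_US * N4_GROUPS                      # GL portion of N.4
ratio = N5_MEDIAN_US / N4_LOWER_BOUND_US                           # N.5 as a fraction of N.4
gate_passes = ratio <= GATE_RATIO
dispatcher_bound_fps = 1_000_000.0 / N5_MEDIAN_US                  # if only the dispatcher ran
frames_per_s_from_draws = DRAWS_PER_WINDOW / WINDOW_SECONDS / GROUPS

print(f"N.4 estimate: {n4_estimate_us:.0f} us, ratio: {ratio:.2f}, gate: {gate_passes}")
print(f"dispatcher-only bound: {dispatcher_bound_fps:.0f} fps")
print(f"frames/s implied by drawsIssued: {frames_per_s_from_draws:.0f}")
```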
|
||||||
|
|
||||||
|
## Acceptance gates (spec §8.3)
|
||||||
|
|
||||||
|
- [x] **Visual identity to N.4** — confirmed at Task 10 USER GATE: Holtburg
|
||||||
|
courtyard renders identical, no missing entities, no z-fighting, no
|
||||||
|
exploded parts.
|
||||||
|
- [x] **CPU dispatcher time ≤ 70% of N.4** — N.5 measures 1.23 ms/frame
|
||||||
|
median; estimated N.4 ≥2.5 ms/frame; **comfortably under 70%**.
|
||||||
|
- [ ] **GPU rendering time within ±10% of N.4** — DEFERRED. The
|
||||||
|
`GL_TIME_ELAPSED` query polling never reports `avail != 0` in our
|
||||||
|
single-frame poll loop; the driver hasn't finalized the result by the
|
||||||
|
time we check. The fix is double-buffering (issue queryA on frame N,
|
||||||
|
read result on frame N+2). N.6 perf polish item.
|
||||||
|
- [x] **`drawsIssued` ≤ 5 per pass (CPU GL calls)** — exactly 2 indirect
|
||||||
|
calls per frame regardless of scene size.
|
||||||
|
- [x] **All tests green** — 70/70 in
|
||||||
|
`FullyQualifiedName~Wb|FullyQualifiedName~MatrixComposition`.
|
||||||
|
8 pre-existing failures in `MotionInterpreter` / `BSPStepUp` /
|
||||||
|
`PositionManager` / `PlayerMovementController` / `Dispatcher` are
|
||||||
|
carry-forward from before N.5 and unrelated to rendering.
|
||||||
|
- [N/A] **`ACDREAM_USE_WB_FOUNDATION=0` still works** — escape hatch
|
||||||
|
formally retired in N.5 ship amendment. `InstancedMeshRenderer`,
|
||||||
|
`StaticMeshRenderer`, and `WbFoundationFlag` deleted. Missing
|
||||||
|
bindless throws `NotSupportedException` at startup with a clear
|
||||||
|
error message. No fallback path.
|
||||||
|
|
||||||
|
## Visual verification (Task 14)
|
||||||
|
|
||||||
|
- [x] **Holtburg courtyard** — PASS at Task 10 USER GATE.
|
||||||
|
- [ ] **Foundry interior / dense static-object scene** — TODO Task 14.
|
||||||
|
- [ ] **Indoor → outdoor cell transition** — TODO Task 14.
|
||||||
|
- [ ] **Drudge / character close-up (Issue #47 close-detail mesh)** — TODO Task 14.
|
||||||
|
- [ ] **Magic content (Decision 2 additive fallback check)** — TODO Task 14.
|
||||||
|
- [ ] **Long-session sanity** — DEFERRED (N.6 watchlist; not load-bearing for ship).
|
||||||
|
|
||||||
|
## Open follow-ups for N.6
|
||||||
|
|
||||||
|
1. **GPU timer query double-buffering** — the current single-frame poll
|
||||||
|
pattern never sees `QueryResultAvailable=true`. Issue queryA on frame N,
|
||||||
|
queryB on frame N+1, read queryA on frame N+2. ~30 lines of state.
|
||||||
|
2. **Direct N.4 vs N.5 perf comparison** — re-run with `git checkout`ed N.4
|
||||||
|
SHIP (`c445364`) for a side-by-side measurement. Not load-bearing but
|
||||||
|
useful for N.6 ship message.
|
||||||
|
3. **Persistent-mapped buffers** — Decision 7 deferral. If profiling shows
|
||||||
|
the per-frame `glBufferData` cost is the residual hot spot, layer it on
|
||||||
|
top of the modern path.
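
Follow-up 1 can be illustrated with a small simulation. This is a sketch only: `FakeGpuTimer` is a hypothetical stand-in for the driver (results become readable two frames after issue, which is an assumption about latency), and `RotatingTimerQueries` shows the alternate-and-read-later pattern rather than any real GL binding.

```python
from typing import List, Optional


class FakeGpuTimer:
    """Hypothetical driver stand-in: a result is readable only `latency` frames after issue."""

    def __init__(self, latency_frames: int = 2):
        self.latency = latency_frames
        self.frame = 0
        self._issued = {}  # slot -> (frame issued, elapsed nanoseconds)

    def issue(self, slot: int, elapsed_ns: int) -> None:
        self._issued[slot] = (self.frame, elapsed_ns)

    def available(self, slot: int) -> bool:
        return slot in self._issued and self.frame - self._issued[slot][0] >= self.latency

    def result(self, slot: int) -> int:
        return self._issued[slot][1]

    def end_frame(self) -> None:
        self.frame += 1


class RotatingTimerQueries:
    """Alternate between query slots; read a slot just before reusing it two frames later."""

    def __init__(self, gpu: FakeGpuTimer, slots: int = 2):
        self.gpu = gpu
        self.slots = slots
        self.frame_index = 0

    def on_frame(self, elapsed_ns: int) -> Optional[int]:
        slot = self.frame_index % self.slots
        result = None
        if self.frame_index >= self.slots and self.gpu.available(slot):
            result = self.gpu.result(slot)  # timing from `slots` frames ago
        self.gpu.issue(slot, elapsed_ns)    # reuse the slot for the current frame
        self.frame_index += 1
        return result


def run(frame_timings: List[int]) -> List[Optional[int]]:
    gpu = FakeGpuTimer()
    ring = RotatingTimerQueries(gpu)
    readings = []
    for ns in frame_timings:
        readings.append(ring.on_frame(ns))
        gpu.end_frame()
    return readings
```

With this pattern the reading lags by two frames, which is acceptable for a windowed median like `[WB-DIAG]`.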
|
||||||
2706
docs/superpowers/plans/2026-05-08-phase-n5-modern-rendering.md
Normal file
File diff suppressed because it is too large
|
|
@ -0,0 +1,554 @@
# Phase N.5 — Modern Rendering Path — Design Spec

**Status:** Draft (brainstormed 2026-05-08, not yet implemented).
**Author:** acdream lead engineer + Claude.
**Builds on:** Phase N.4 (`WbDrawDispatcher`, shipped 2026-05-08).
**Predecessor docs:**
- `docs/research/2026-05-08-phase-n5-handoff.md` (cold-start briefing).
- `docs/superpowers/plans/2026-05-08-phase-n4-rendering-foundation.md` (N.4 plan; Adjustments 7-10 are required reading).
- `docs/superpowers/specs/2026-05-08-phase-n4-rendering-foundation-design.md` (N.4 spec).

---

## 1. Problem statement

N.4 collapsed entity rendering from O(entities × batches) per-draw GL calls to O(unique GfxObj × surface × translucency) grouped instanced draws. The remaining hot path still does, per group:

```
glActiveTexture(0)
glBindTexture(2D, texHandle)
glBindBuffer(EBO, batchIbo)
glDrawElementsInstancedBaseVertexBaseInstance(...)
```

Across a typical Holtburg-courtyard scene that's still ~100-300 GL calls per frame for entities. Modern GPUs and our drivers (GL 4.3 + bindless, gated by WB's `_useModernRendering`) support patterns that eliminate ALL of those per-group calls:

- **Bindless textures** (`GL_ARB_bindless_texture`) — texture handles are 64-bit tokens that don't require `glBindTexture` to use; the shader samples from a handle read out of buffer data.
- **Multi-draw indirect** (`glMultiDrawElementsIndirect`) — one GL call dispatches N draws from a `DrawElementsIndirectCommand` buffer; the driver issues all of them with no CPU-side per-draw work.

N.5 lifts `WbDrawDispatcher` onto these primitives. Target: ≥30% reduction in CPU dispatcher time, draw call count down to ~5/frame, no visual regression vs N.4.

---

## 2. Decisions log

This section records the brainstorm outcomes that the rest of the doc relies on.

| # | Decision | Choice | Reason |
|---|---|---|---|
| 1 | Texture sampler model | **`sampler2DArray`** for ALL textures (1-layer wrapping for per-instance composites) | Matches WB's modern shader exactly; future-proofs for atlas adoption in N.6+; avoids two shader files. ~50 lines of TextureCache change. |
| 2 | Translucent rendering | **WB's two-pass alpha-test** (opaque pass discards `α<0.95`, transparent pass discards `α≥0.95`) | Single blend mode per pass enables one indirect call per pass. Loses native `Additive` blend on GfxObj surfaces; sky + particles have own renderers and aren't affected. Falsifiable at visual verification — if we see a regression, add an additive sub-pass (~30-min fix). |
| 3 | Per-instance + per-draw data delivery | **All-SSBO**: `Instances[]` at binding=0 (mat4 per instance), `Batches[]` at binding=1 (texture handle + layer + flags per group) | Matches WB's modern shader. SSBOs avoid the 16-attrib stride limit, scale to large instance counts, give clean per-draw indexing via `gl_DrawIDARB`. |
| 4 | Bindless handle residency | **Resident on upload, never release** | acdream's content set is bounded (~1-5K unique textures per session). Handles persist for process lifetime; no eviction code in N.5. Diagnostic logging of handle count under `ACDREAM_WB_DIAG=1` to spot growth. |
| 5 | Escape hatch | **Modern path mandatory (N.5 ship amendment)**. `WbFoundationFlag` and `ACDREAM_USE_WB_FOUNDATION` env var have been deleted. Missing `GL_ARB_bindless_texture` or `GL_ARB_shader_draw_parameters` throws `NotSupportedException` at startup with a clear error message. No fallback. | Escape hatch was never exercised after N.4 ship. Legacy `InstancedMeshRenderer` + `StaticMeshRenderer` deleted in the N.5 retirement commit. N.6 scope narrowed accordingly. |
| 6 | Perf measurement | **CPU stopwatch + GL timer queries** logged via `[WB-DIAG]` | Captures both CPU dispatcher time and GPU rendering time. Acceptance gate compares before/after numbers in fixed Holtburg/Foundry scenes. |
| 7 | Persistent-mapped buffers | **Defer to N.6** | Bindless+indirect win is 70-80% of achievable savings. Persistent-mapped + ring + sync is the last 5-10% with non-trivial sync-fence complexity; not worth the risk in N.5's 2-3 week budget. Add post-N.5 if profiling shows residual `glBufferData` cost. |
| 8 | Per-instance highlight (selection blink) | **Defer to a Phase B.4 follow-up** | Retail pulses click targets as visual confirmation; the right mechanism is per-instance highlight color (NOT WB's global `uHighlightColor` which would tint everything in our single-indirect-call design). Field is reserved in design (extend `InstanceData` to include `vec4 highlightColor`); N.5 ships without the field, future phase plumbs it without shader rewrite. |

---

## 3. Architecture overview

### What changes

`WbDrawDispatcher.Draw` swaps its inner loop. Phases 1-3 (entity walk, group bucketing, matrix layout) stay intact. Phases 5-6 (per-group GL calls) are replaced by a single `glMultiDrawElementsIndirect` per pass, fed by SSBO-resident per-instance and per-draw data.

### What's preserved from N.4

- Group bucketing pipeline (entity AABB cull, palette hash memo, group key dictionary).
- `AcSurfaceMetadataTable` for translucency classification.
- `EntitySpawnAdapter` / `LandblockSpawnAdapter` (mesh lifecycle bridge).
- `WbMeshAdapter` (the seam over WB's `ObjectMeshManager`).
- Front-to-back sort of opaque groups (depth-test reject of overdrawn fragments).
- Per-entity 5m AABB frustum cull.

### What's new

- `TextureCache` uploads as 1-layer `Texture2DArray` instead of `Texture2D`. Generates 64-bit bindless handles at upload, makes them resident.
- New shader pair `mesh_modern.vert/.frag` modeled on WB's `StaticObjectModern` but adapted (see §6).
- Three new GPU buffers in the dispatcher:
  - `_instanceSsbo` — `std430` layout, `mat4[]`, all visible matrices.
  - `_batchSsbo` — `std430` layout, `BatchData[]`, one entry per group.
  - `_indirectBuffer` — `DrawElementsIndirectCommand[]`, one per group.
- Two diagnostic measurements in `[WB-DIAG]`: CPU stopwatch span around `Draw()`; GPU `GL_TIME_ELAPSED` query around the indirect dispatch.

### What gets deleted

- `WbDrawDispatcher.DrawGroup` (replaced by indirect).
- `WbDrawDispatcher.EnsureInstanceAttribs` (no more vertex attribs at locations 3-6).
- Per-blend-mode `glBlendFunc` switch in the translucent loop.
- `mesh_instanced.vert/.frag` (replaced by `mesh_modern.*`).

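### What stays under the escape hatch

Nothing, as of the N.5 ship amendment (see Decision 5). The original draft left `InstancedMeshRenderer` untouched behind `ACDREAM_USE_WB_FOUNDATION=0`, with retirement planned for N.6. The ship amendment deleted `InstancedMeshRenderer`, `StaticMeshRenderer`, and `WbFoundationFlag` instead.

---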

## 4. Component changes

### 4.1 `TextureCache`

Texture upload path becomes Texture2DArray with depth=1:

```csharp
// `unsafe` is needed for the `fixed` pointer below.
private unsafe uint UploadRgba8AsLayer1Array(DecodedTexture decoded)
{
    uint tex = _gl.GenTexture();
    _gl.BindTexture(TextureTarget.Texture2DArray, tex);

    // depth: 1 allocates a single-layer array.
    fixed (byte* p = decoded.Rgba8)
        _gl.TexImage3D(
            TextureTarget.Texture2DArray, 0, InternalFormat.Rgba8,
            (uint)decoded.Width, (uint)decoded.Height, depth: 1,
            border: 0, PixelFormat.Rgba, PixelType.UnsignedByte, p);

    // Sampler state must be final here: once a bindless handle is created
    // for this texture, its parameters can no longer be changed.
    _gl.TexParameter(TextureTarget.Texture2DArray, TextureParameterName.TextureMinFilter, (int)TextureMinFilter.Linear);
    _gl.TexParameter(TextureTarget.Texture2DArray, TextureParameterName.TextureMagFilter, (int)TextureMagFilter.Linear);
    _gl.TexParameter(TextureTarget.Texture2DArray, TextureParameterName.TextureWrapS, (int)TextureWrapMode.Repeat);
    _gl.TexParameter(TextureTarget.Texture2DArray, TextureParameterName.TextureWrapT, (int)TextureWrapMode.Repeat);
    _gl.BindTexture(TextureTarget.Texture2DArray, 0);
    return tex;
}
```

Bindless handle generation, eager + resident-on-upload, parallel cache:

```csharp
private readonly Dictionary<uint, ulong> _bindlessHandlesByGlName = new();

private ulong MakeResidentHandle(uint glTextureName)
{
    if (_bindlessHandlesByGlName.TryGetValue(glTextureName, out var h))
        return h;
    h = _bindless.GetTextureHandleARB(glTextureName);
    _bindless.MakeTextureHandleResidentARB(h);
    _bindlessHandlesByGlName[glTextureName] = h;
    return h;
}
```

Three new methods returning `ulong` bindless handles, paralleling the existing `uint` GL-name methods:

```csharp
public ulong GetOrUploadBindless(uint surfaceId);
public ulong GetOrUploadWithOrigTextureOverrideBindless(uint surfaceId, uint overrideOrigTextureId);
public ulong GetOrUploadWithPaletteOverrideBindless(uint surfaceId, uint? overrideOrigTextureId, PaletteOverride paletteOverride, ulong precomputedPaletteHash);
```

Each delegates to its existing `uint` sibling to populate the underlying GL texture, then calls `MakeResidentHandle` and returns the 64-bit handle.

The `uint`-returning methods stay (used by `SkyRenderer`, `TerrainAtlas`, anything outside the WB modern path).

`Dispose` releases bindless handles BEFORE deleting their textures: iterate `_bindlessHandlesByGlName.Values`, call `glMakeTextureHandleNonResidentARB(handle)`, then `glDeleteTextures` proceeds as today.

### 4.2 `WbDrawDispatcher`

Three new GPU buffers (replacing `_instanceVbo`):

```csharp
private uint _instanceSsbo;     // binding=0, std430, mat4[]
private uint _batchSsbo;        // binding=1, std430, BatchData[]
private uint _indirectBuffer;   // GL_DRAW_INDIRECT_BUFFER, DEIC[]
```

`InstanceGroup` becomes:

```csharp
private sealed class InstanceGroup
{
    public uint Ibo;
    public uint FirstIndex;
    public int BaseVertex;
    public int IndexCount;
    public ulong BindlessTextureHandle;   // 64-bit (was uint TextureHandle in N.4)
    public uint TextureLayer;             // always 0 in N.5 (per-instance composites are 1-layer arrays)
    public TranslucencyKind Translucency;
    public int FirstInstance;
    public int InstanceCount;
    public float SortDistance;
    public readonly List<Matrix4x4> Matrices = new();
}
```

`GroupKey` adds the layer:

```csharp
private readonly record struct GroupKey(
    uint Ibo, uint FirstIndex, int BaseVertex, int IndexCount,
    ulong BindlessTextureHandle, uint TextureLayer, TranslucencyKind Translucency);
```

Per-frame draw flow:

1. **Walk entities → build `_groups` dict** (unchanged from N.4).
2. **Lay matrices contiguously, split opaque/transparent, sort opaque** (unchanged).
3. **Build per-group BatchData and DEIC arrays.** One `BatchData` per group `(handle, layer, flags=0)`. One DEIC per group `(count = IndexCount, instanceCount = InstanceCount, firstIndex = FirstIndex, baseVertex = BaseVertex, baseInstance = FirstInstance)`. Indirect commands are laid out contiguously: opaque section first (sorted front-to-back), transparent section second. `_opaqueDrawCount` and `_transparentDrawCount` track section sizes; `_transparentByteOffset = _opaqueDrawCount * sizeof(DEIC)`.
4. **Three `glBufferData` uploads** to `_instanceSsbo`, `_batchSsbo`, `_indirectBuffer` (single buffer, both sections).
5. **Bind global VAO once** (preserved from N.4 — modern rendering shares one VAO).
6. **Bind SSBOs once** via `glBindBufferBase(SHADER_STORAGE_BUFFER, 0, _instanceSsbo)` and `... 1, _batchSsbo`.
7. **Opaque pass.** Set `uRenderPass = 0`. `glBindBuffer(DRAW_INDIRECT_BUFFER, _indirectBuffer)`. `glMultiDrawElementsIndirect(Triangles, UnsignedShort, indirect=(void*)0, drawcount=_opaqueDrawCount, stride=sizeof(DEIC))`.
8. **Transparent pass.** Set `uRenderPass = 1`. `glEnable(BLEND)` + `glBlendFunc(SrcAlpha, OneMinusSrcAlpha)` + `glDepthMask(false)`. `glMultiDrawElementsIndirect(Triangles, UnsignedShort, indirect=(void*)_transparentByteOffset, drawcount=_transparentDrawCount, stride=sizeof(DEIC))`.
9. **Restore state.** `glDepthMask(true)` + `glDisable(BLEND)` + `glBindVertexArray(0)`.
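
Step 3 of the flow above can be sketched on the CPU side as follows. This is a minimal illustration under stated assumptions, not the dispatcher's actual code: `Group`, `IndirectBuilder`, and `Build` are hypothetical names, and `Group` carries only the subset of `InstanceGroup` fields the command needs. The only thing taken as given is the command layout itself, which OpenGL fixes at five 32-bit fields (20 bytes).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Runtime.InteropServices;

// Field order and sizes are dictated by OpenGL for glMultiDrawElementsIndirect.
[StructLayout(LayoutKind.Sequential)]
public struct DrawElementsIndirectCommand
{
    public uint Count;
    public uint InstanceCount;
    public uint FirstIndex;
    public int BaseVertex;
    public uint BaseInstance;
}

// Hypothetical stand-in for the dispatcher's InstanceGroup.
public sealed record Group(
    uint FirstIndex, int BaseVertex, int IndexCount,
    int FirstInstance, int InstanceCount, float SortDistance, bool Transparent);

public static class IndirectBuilder
{
    // Returns the command array (opaque section first, then transparent),
    // the opaque draw count, and the byte offset of the transparent section.
    public static (DrawElementsIndirectCommand[] Commands, int OpaqueCount, int TransparentByteOffset)
        Build(IEnumerable<Group> groups)
    {
        var live = groups.Where(g => g.InstanceCount > 0).ToList();   // skip empty groups
        var opaque = live.Where(g => !g.Transparent)
                         .OrderBy(g => g.SortDistance)                 // front-to-back, stable
                         .ToList();
        var transparent = live.Where(g => g.Transparent).ToList();    // classification order

        var commands = opaque.Concat(transparent)
            .Select(g => new DrawElementsIndirectCommand
            {
                Count = (uint)g.IndexCount,
                InstanceCount = (uint)g.InstanceCount,
                FirstIndex = g.FirstIndex,
                BaseVertex = g.BaseVertex,
                BaseInstance = (uint)g.FirstInstance,
            })
            .ToArray();

        int stride = Marshal.SizeOf<DrawElementsIndirectCommand>();   // 20 bytes
        return (commands, opaque.Count, opaque.Count * stride);
    }
}
```

Keeping both sections in one array is what lets a single `glBufferData` upload feed both passes; the second pass only needs the returned byte offset.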

Diagnostic timing (under `ACDREAM_WB_DIAG=1`):

- CPU: `Stopwatch` started at the top of `Draw()`, stopped at the bottom. Median + 95th-percentile flushed in the 5-second `[WB-DIAG]` rollup.
- GPU: two query objects created with `glGenQueries` (one for opaque, one for transparent). `glBeginQuery(TIME_ELAPSED) / glEndQuery` around each `glMultiDrawElementsIndirect`. Result polled at the start of the next frame, via `GL_QUERY_RESULT_AVAILABLE` or, where GL 4.4 / `GL_ARB_query_buffer_object` is present, `GL_QUERY_RESULT_NO_WAIT`; if not ready, drop the sample and try again.
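
The median / 95th-percentile rollup mentioned above could look roughly like this. It is a sketch only: `TimingRollup` is a hypothetical helper, and the nearest-rank percentile definition is an assumption, since the spec does not say which definition `[WB-DIAG]` uses.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: collects per-frame samples for one rollup window.
public sealed class TimingRollup
{
    private readonly List<double> _samplesUs = new();

    public void Add(double microseconds) => _samplesUs.Add(microseconds);

    // Returns (median, p95) by the nearest-rank method and clears the window.
    public (double Median, double P95) Flush()
    {
        if (_samplesUs.Count == 0) return (0, 0);
        _samplesUs.Sort();
        var result = (NearestRank(50), NearestRank(95));
        _samplesUs.Clear();
        return result;
    }

    // Nearest rank: the smallest sample with at least `percent` % of samples at or below it.
    private double NearestRank(int percent)
    {
        int rank = (_samplesUs.Count * percent + 99) / 100;   // integer ceiling, 1-based
        return _samplesUs[Math.Max(rank, 1) - 1];
    }
}
```

The same helper works for the CPU stopwatch samples and for GPU timer-query results, as long as dropped GPU samples are simply never added.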

### 4.3 New shader files

`src/AcDream.App/Shaders/mesh_modern.vert`:

```glsl
#version 430 core
#extension GL_ARB_bindless_texture : require
#extension GL_ARB_shader_draw_parameters : require

layout(location = 0) in vec3 aPosition;
layout(location = 1) in vec3 aNormal;
layout(location = 2) in vec2 aTexCoord;

struct InstanceData {
    mat4 transform;
    // Reserved for Phase B.4 follow-up (selection-blink retail-faithful highlight):
    // vec4 highlightColor;   // RGBA — when non-zero alpha, fragment shader mixes into output.
    // Add field here, increase stride to 80 bytes, and read at fragment via flat varying.
};

struct BatchData {
    uvec2 textureHandle;   // bindless handle for sampler2DArray
    uint textureLayer;     // layer index (always 0 for per-instance composites)
    uint flags;            // reserved for future use
};

layout(std430, binding = 0) readonly buffer InstanceBuffer {
    InstanceData Instances[];
};

layout(std430, binding = 1) readonly buffer BatchBuffer {
    BatchData Batches[];
};

layout(std140, binding = 1) uniform LightingUbo {
    vec4 uAmbient;
    vec4 uSunDir;
    vec4 uSunColor;
    // matches existing acdream lighting UBO; do not change layout
};

uniform mat4 uViewProjection;
uniform int uRenderPass;   // 0=opaque, 1=transparent (consumed in fragment shader)

out vec3 vNormal;
out vec2 vTexCoord;
out flat uvec2 vTextureHandle;
out flat uint vTextureLayer;

void main() {
    int instanceIndex = gl_BaseInstanceARB + gl_InstanceID;
    mat4 model = Instances[instanceIndex].transform;

    vec4 worldPos = model * vec4(aPosition, 1.0);
    gl_Position = uViewProjection * worldPos;

    vNormal = normalize(mat3(model) * aNormal);
    vTexCoord = aTexCoord;

    BatchData b = Batches[gl_DrawIDARB];
    vTextureHandle = b.textureHandle;
    vTextureLayer = b.textureLayer;
}
```

`src/AcDream.App/Shaders/mesh_modern.frag`:

```glsl
#version 430 core
#extension GL_ARB_bindless_texture : require

in vec3 vNormal;
in vec2 vTexCoord;
in flat uvec2 vTextureHandle;
in flat uint vTextureLayer;

layout(std140, binding = 1) uniform LightingUbo {
    vec4 uAmbient;
    vec4 uSunDir;
    vec4 uSunColor;
};

uniform int uRenderPass;

out vec4 FragColor;

void main() {
    sampler2DArray tex = sampler2DArray(vTextureHandle);
    vec4 color = texture(tex, vec3(vTexCoord, float(vTextureLayer)));

    if (uRenderPass == 0) {
        // Opaque pass: discard soft pixels (alpha cutout), write to depth
        if (color.a < 0.95) discard;
    } else {
        // Transparent pass: discard hard pixels (already drawn opaque), no depth write
        if (color.a >= 0.95) discard;
        if (color.a < 0.05) discard;   // skip totally-empty fragments — perf for large transparent overdraw
    }

    // Diffuse lighting (preserved from acdream's existing lighting model)
    vec3 N = normalize(vNormal);
    vec3 L = normalize(uSunDir.xyz);
    float diff = max(dot(N, L), 0.0);
    vec3 lit = uAmbient.rgb + uSunColor.rgb * diff;
    color.rgb *= clamp(lit, 0.0, 1.0);

    FragColor = color;
}
```

Differences from WB's `StaticObjectModern.*`:

- Drops `uActiveCells[]` cell-filtering (acdream culls cells on CPU).
- Drops `uDrawIDOffset` (acdream issues full passes, no pagination).
- Drops `uHighlightColor` (deferred to Phase B.4 follow-up; reserved as per-instance `highlightColor` field, not a global uniform).
- Adapts the lighting model to acdream's existing UBO at binding=1 instead of WB's `SceneData` UBO.
- Uses 1-layer `sampler2DArray` for ALL textures (WB uses multi-layer atlases — same shader works for both shapes).
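
The `BatchData` and `InstanceData` declarations in the vertex shader imply a matching CPU-side layout for the two SSBO uploads. The sketch below shows one way to mirror it and check the strides; `BatchDataCpu` and `SsboLayout` are hypothetical names, and the mapping of a `ulong` onto `uvec2` (x = low 32 bits, y = high 32 bits) assumes a little-endian host.

```csharp
using System;
using System.Numerics;
using System.Runtime.InteropServices;

// CPU mirror of the shader's BatchData. Under std430 the struct is
// uvec2 (8 bytes, 8-byte aligned) + uint + uint = 16 bytes, array stride 16.
[StructLayout(LayoutKind.Sequential)]
public struct BatchDataCpu
{
    public ulong TextureHandle;   // read by GLSL as uvec2 (low word, high word)
    public uint TextureLayer;
    public uint Flags;
}

public static class SsboLayout
{
    // Stride of one Batches[] element as uploaded to binding=1.
    public static readonly int BatchStride = Marshal.SizeOf<BatchDataCpu>();

    // Stride of one Instances[] element while InstanceData holds only a mat4.
    public static readonly int InstanceStride = Marshal.SizeOf<Matrix4x4>();
}
```

A startup check that `BatchStride` is 16 and `InstanceStride` is 64 would catch an accidental field reorder early; adding the reserved `highlightColor` field would move the instance stride to 80 bytes, as the shader comment notes.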

---

## 5. Per-frame data flow walk-through

A concrete trace. Visible work for frame N:

| Group | GfxObj | Surface | Translucency | Instances |
|---|---|---|---|---|
| 0 | oak tree | bark | Opaque | 12 |
| 1 | oak tree | leaves | AlphaBlend | 12 |
| 2 | drudge | skin (palette override) | Opaque | 1 |
| 3 | drudge | eyes | Opaque | 1 |

**Instance SSBO** (binding=0), 26 entries (each batch contributes its own copy of the entity matrix):

```
[0..11]  = oak instance matrices (group 0 — bark)
[12..23] = oak instance matrices (group 1 — leaves)
[24]     = drudge instance matrix (group 2 — skin)
[25]     = drudge instance matrix (group 3 — eyes)
```

**Batch SSBO** (binding=1), 4 entries indexed by `gl_DrawIDARB`. Entries follow indirect-command order, because `gl_DrawIDARB` is the index of a draw within its own `glMultiDrawElementsIndirect` call and restarts at 0 for the transparent pass:

```
Batches[0] = (oak_bark_handle, layer=0, flags=0)
Batches[1] = (drudge_skin_handle_with_palette, layer=0, flags=0)
Batches[2] = (drudge_eyes_handle, layer=0, flags=0)
Batches[3] = (oak_leaves_handle, layer=0, flags=0)   // first transparent entry
```

The transparent pass therefore has to address its own section explicitly, for example by binding the batch SSBO with `glBindBufferRange` at the transparent section's byte offset or by passing a draw-ID offset uniform. This walk-through does not fix which.

**Indirect buffer** (single buffer, two sections):

```
_indirectBuffer[0..2] = opaque section (3 entries, sorted front-to-back)
  [0] = (count=oakBarkIdx, instanceCount=12, firstIndex=oakBarkFI, baseVertex=oakBV, baseInstance=0)
  [1] = (count=drudgeSkinIdx, instanceCount=1, firstIndex=drudgeSkinFI, baseVertex=drudgeBV, baseInstance=24)
  [2] = (count=drudgeEyesIdx, instanceCount=1, firstIndex=drudgeEyesFI, baseVertex=drudgeBV, baseInstance=25)

_indirectBuffer[3] = transparent section (1 entry)
  [3] = (count=oakLeavesIdx, instanceCount=12, firstIndex=oakLeavesFI, baseVertex=oakBV, baseInstance=12)

_opaqueDrawCount = 3; _transparentDrawCount = 1; _transparentByteOffset = 3 * sizeof(DEIC) = 60.
```

**Shader access pattern** (per vertex, with the texture sample in the fragment stage):

```glsl
// Vertex shader
int instanceIndex = gl_BaseInstanceARB + gl_InstanceID;   // unique per (group, instance) pair
mat4 model = Instances[instanceIndex].transform;
BatchData b = Batches[gl_DrawIDARB];                       // shared across all verts in this draw

// Fragment shader (handle, layer, and UV arrive as varyings)
sampler2DArray tex = sampler2DArray(vTextureHandle);
vec4 color = texture(tex, vec3(vTexCoord, float(vTextureLayer)));
```

**Per-frame CPU GL calls** (entity rendering, total):

- 3× `glBufferData` (instance SSBO, batch SSBO, indirect buffer).
- 1× `glBindVertexArray(globalVAO)`.
- 2× `glBindBufferBase` (SSBOs at bindings 0 + 1).
- 1× `glBindBuffer(DRAW_INDIRECT_BUFFER, _indirectBuffer)`.
- 2× `glMultiDrawElementsIndirect` (one opaque, one transparent).
- ~5 state changes (blend, depth mask, render pass uniform).

Total: ~15-20 GL calls per frame for entity rendering, regardless of group count. N.4 baseline is "few hundred."

---

## 6. Translucent rendering detail

Per Decision 2: WB's two-pass alpha-test pattern.

**Group classification.** `ClassifyBatches` puts groups into one of two arrays:

- **Opaque indirect:** `TranslucencyKind.Opaque` and `TranslucencyKind.ClipMap`.
- **Transparent indirect:** `TranslucencyKind.AlphaBlend`, `Additive`, `InvAlpha` all merged. Per Decision 2, additive renders as alpha-blend; falsifiable at visual verification.

Opaque groups stay sorted front-to-back by `SortDistance` (preserved from N.4 — depth-test reject of overdrawn fragments is a meaningful win on dense scenes).
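
The classification rule above reduces to a small mapping. The following is a sketch under assumptions: the enum member names come from this section, but their order and numeric values here are made up, and `RenderPass` / `PassClassifier` are hypothetical names (the actual `ClassifyBatches` signature is not shown in this spec).

```csharp
using System;

// Member names as listed in this section; declaration order is an assumption.
public enum TranslucencyKind { Opaque, ClipMap, AlphaBlend, Additive, InvAlpha }

// Values chosen to match the shader's uRenderPass uniform (0 = opaque, 1 = transparent).
public enum RenderPass { Opaque = 0, Transparent = 1 }

public static class PassClassifier
{
    public static RenderPass Classify(TranslucencyKind kind) => kind switch
    {
        // ClipMap is alpha-cutout, so it can write depth in the opaque pass.
        TranslucencyKind.Opaque or TranslucencyKind.ClipMap => RenderPass.Opaque,

        // Per Decision 2, Additive and InvAlpha are merged with AlphaBlend.
        TranslucencyKind.AlphaBlend or TranslucencyKind.Additive or TranslucencyKind.InvAlpha
            => RenderPass.Transparent,

        _ => throw new ArgumentOutOfRangeException(nameof(kind)),
    };
}
```

If the additive sub-pass ever becomes necessary, splitting `Additive` out would mean adding one more `RenderPass` value and one more arm here.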

**Pass GL state:**

```csharp
// Opaque pass
_gl.Disable(EnableCap.Blend);
_gl.DepthMask(true);
_gl.Enable(EnableCap.CullFace); _gl.CullFace(TriangleFace.Back); _gl.FrontFace(FrontFaceDirection.Ccw);
_shader.SetInt("uRenderPass", 0);
_gl.BindBuffer(BufferTargetARB.DrawIndirectBuffer, _indirectBuffer);
_gl.MultiDrawElementsIndirect(PrimitiveType.Triangles, DrawElementsType.UnsignedShort,
    indirect: (void*)0, drawcount: _opaqueDrawCount, stride: (uint)sizeof(DEIC));

// Transparent pass
_gl.Enable(EnableCap.Blend);
_gl.BlendFunc(BlendingFactor.SrcAlpha, BlendingFactor.OneMinusSrcAlpha);
_gl.DepthMask(false);
_shader.SetInt("uRenderPass", 1);
_gl.MultiDrawElementsIndirect(PrimitiveType.Triangles, DrawElementsType.UnsignedShort,
    indirect: (void*)_transparentByteOffset, drawcount: _transparentDrawCount, stride: (uint)sizeof(DEIC));

// Cleanup
_gl.DepthMask(true); _gl.Disable(EnableCap.Blend); _gl.BindVertexArray(0);
```

**Visual verification gate (additive fallback plan).** During Week 2-3 visual verification, look at:
- Holtburg courtyard, dungeon entrance — confirm scenery + characters identical.
- Foundry interior — magic-themed content with potentially additive-flagged surfaces.
- Any glowing weapon decals, magical aura effects, or self-luminous textures observed.

If a visible regression appears (faded glow, missing additive bloom): amend spec to add a third indirect call within the transparent pass with `glBlendFunc(SrcAlpha, One)`. Group classification splits Additive into its own bucket. ~30-min change.

---

## 7. Error handling and fallback

### 7.1 GPU capability detection

WB's `OpenGLGraphicsDevice` already detects:
- `HasOpenGL43` (required for SSBOs and multi-draw indirect).
- `HasBindless` (required for bindless texture handles).

In the original draft, `WbDrawDispatcher` was only constructed when `WbFoundationFlag.Enabled` was true, which gated on `_useModernRendering = HasOpenGL43 && HasBindless`. Per the N.5 ship amendment (Decision 5) the flag is gone and the modern path is mandatory.

**Additional check:** `GL_ARB_shader_draw_parameters` (for `gl_BaseInstanceARB`, `gl_DrawIDARB`). Standard on GL 4.6, available as extension on 4.3+. Add to N.5's capability check. The original draft logged a one-time warning and fell back to `InstancedMeshRenderer` when it was missing; per Decision 5, a missing `GL_ARB_bindless_texture` or `GL_ARB_shader_draw_parameters` now throws `NotSupportedException` at startup with a clear error message.

### 7.2 Shader compile failure

If `mesh_modern.vert/.frag` fails to compile (driver bug, GLSL version mismatch, extension issue): catch the compile exception in the `WbDrawDispatcher` constructor and log the GLSL info log + GPU vendor/renderer string ONCE. The original draft then flipped `WbFoundationFlag.Enabled = false` for the session and fell back to `InstancedMeshRenderer` rather than crashing; that fallback no longer exists after the N.5 ship amendment (Decision 5), which deleted both.

### 7.3 Non-resident handle (the bindless foot-gun)

Sampling a non-resident handle causes undefined behavior (driver-dependent: black texture, GPU fault, device-lost).

Mitigation in code: `TextureCache.MakeResidentHandle` is the only API that produces a handle, and it makes the handle resident in the same call. There is no API surface that produces a non-resident handle. Defense-in-depth: dispatcher asserts `BindlessTextureHandle != 0` before queuing a draw (zero handles get filtered out, same as zero `surfaceId` does today).

### 7.4 Indirect command corruption

`count`, `firstIndex`, `baseVertex` come from WB's `ObjectRenderBatch` (never user input; WB-internal correctness). `instanceCount` is `grp.Matrices.Count` (we control). `baseInstance` is `grp.FirstInstance` (we control, computed cumulatively). Bug-class is "WB-internal corruption + our cumulative-offset bug", the same surface area that N.4's `BaseInstance` already trusts. Add a debug-build assertion: cumulative `baseInstance` values must be strictly increasing in matrix-layout order. The check cannot run over the sorted indirect buffer, where the values interleave (0, 24, 25, 12 in the §5 example).
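
One way to phrase that debug-build check so it is independent of command order is to sort the ranges first and require them to tile the instance buffer exactly. This is a sketch under assumptions: `IndirectValidator` is a hypothetical name, and the "no gaps, no overlaps" requirement is inferred from the contiguous matrix layout described in §4.2 rather than stated by the spec.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class IndirectValidator
{
    // Each tuple is (baseInstance, instanceCount) for one indirect command, in any order.
    // Returns true when the ranges cover [0, totalInstances) with no gaps or overlaps.
    public static bool RangesAreValid(
        IEnumerable<(uint BaseInstance, uint InstanceCount)> commands, uint totalInstances)
    {
        uint nextFree = 0;
        foreach (var (first, count) in commands.OrderBy(c => c.BaseInstance))
        {
            if (count == 0) return false;          // empty groups should have been skipped
            if (first != nextFree) return false;   // gap or overlap in the matrix layout
            nextFree = first + count;
        }
        return nextFree == totalInstances;
    }
}
```

For the §5 trace, the four commands `(0,12)`, `(24,1)`, `(25,1)`, `(12,12)` with 26 total instances pass this check even though they are not in increasing order in the indirect buffer.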

### 7.5 Disposal order

`TextureCache.Dispose` releases bindless handles before deleting the underlying textures (driver UB otherwise):
1. Iterate `_bindlessHandlesByGlName.Values`, call `glMakeTextureHandleNonResidentARB(handle)`.
2. Call `_glExtensions.MakeAllNonResidentARB` if available (some drivers prefer batch).
3. Then `glDeleteTextures` proceeds as today.

`WbDrawDispatcher.Dispose` handles the dispatcher's own buffer cleanup (`_instanceSsbo`, `_batchSsbo`, `_indirectBuffer`) via `glDeleteBuffers`.

### 7.6 Persistent first-failure diagnostic

If shader compile fails OR an extension check fails OR `glMultiDrawElementsIndirect` returns `GL_INVALID_OPERATION` on first frame: log ONCE with GPU vendor/renderer string + GLSL info log. Don't spam. User pastes the line into a bug report; we know exactly where to look.

---

## 8. Testing and acceptance

### 8.1 Unit / conformance tests

- **`TextureCacheBindlessTests`** — for each `Bindless`-suffixed `GetOrUpload*`: returns non-zero `ulong`, returns same handle for same key (cache hit), distinct keys yield distinct handles, returned handle is resident per GL state query.
- **`WbDrawDispatcherIndirectBuilderTests`** — pure CPU test: given a fixture of `(entity, mesh, batch)` tuples, verify the indirect buffer layout: `count` / `firstIndex` / `baseVertex` / `baseInstance` per group, opaque section sorted front-to-back, transparent section in classification order (no sort — back-to-front sort can be added in a follow-up if measured useful).
- **`WbDrawDispatcherTranslucencyTests`** — verify groups land in correct indirect buffer (opaque vs transparent) per `TranslucencyKind`. `Additive`/`InvAlpha` go to transparent. `ClipMap` goes to opaque. Empty groups skipped.
- **Existing N.4 tests stay green.** All 60 tests captured by `FullyQualifiedName~Wb|MatrixComposition` filter remain at 60/0.

### 8.2 Visual verification

Same gate as N.4 used. Live ACE + retail dat, in-world testing.

- **Holtburg courtyard** — characters + scenery + buildings render identically to N.4. No missing entities, no z-fighting, no exploded parts.
- **Foundry interior** — dense static-object scene, stress-tests indirect call count and translucency classification.
- **Indoor → outdoor cell transition** — confirms cell visibility filtering still works (we cull on CPU; dispatcher should never see invisible-cell entities).
- **Drudge / character close-up** — confirms Issue #47 close-detail mesh preservation.
- **Magic content (additive fallback check)** — Foundry runes, glowing weapons if observable, boss models with luminous decals. Trigger spec amendment if regression spotted.

User-confirms each. These are visual identity checks against the running N.4 behavior (use `git stash` of N.5 changes + relaunch as the comparison baseline).

### 8.3 Perf measurement (the win gate)

`[WB-DIAG]` augmented:

```
[WB-DIAG] entSeen=N entDrawn=M ... drawsIssued=K groups=G   (existing)
[WB-DIAG] cpu_us=Xmedian/Y95p gpu_us=Zmedian/W95p           (new)
```

Capture before/after numbers in fixed scenes/cameras:

| Scene | Camera position | Metric |
|---|---|---|
| Holtburg courtyard | 30m elevated, looking SW | `cpu`, `gpu`, `drawsIssued` |
| Foundry interior | character spawn, default heading | `cpu`, `gpu`, `drawsIssued` |
| Open landscape | terrain wander, no entities | `cpu`, `gpu`, `drawsIssued` (sanity) |

**Acceptance gates** (paste into SHIP commit message):

- Visual identity to N.4 — confirmed via §8.2.
- CPU dispatcher time ≤ 70% of N.4 in Holtburg courtyard (target: ≥30% reduction).
- GPU rendering time within ±10% of N.4 (sanity: no regression).
- `drawsIssued ≤ 5 per pass` (down from "few hundred per pass").
- All tests green — 60+ Wb tests + new bindless/indirect tests.
- ~~`ACDREAM_USE_WB_FOUNDATION=0` still works — `InstancedMeshRenderer` fallback runs and renders correctly.~~ Dropped by the N.5 ship amendment (Decision 5): the flag and the legacy renderer were deleted.

### 8.4 Long-session sanity check

Hour-long session with `ACDREAM_WB_DIAG=1`. Watch resident-handle count grow. Expected: bounded plateau under 5K once content set is fully traversed. If unbounded growth, residency policy revisit required in N.6.

---

## 9. Risks

| Risk | Likelihood | Impact | Mitigation |
|---|---|---|---|
| Driver bug in bindless residency | Low (mature in 2025+ drivers) | Crash / black textures | One-time logging on first failure (the draft's flag-off legacy fallback was removed by the ship amendment, Decision 5) |
| Driver bug in `glMultiDrawElementsIndirect` | Low | GL_INVALID_OPERATION | Capability check + first-failure logging (no fallback after the ship amendment) |
| Resident handle count exceeds driver limit in long session | Low (acdream content is bounded) | Cumulative GPU memory pressure → eventual eviction surprises | `[WB-DIAG]` resident-count log; revisit eviction in N.6 if it grows unbounded |
| Shader compile fails on weird GPU | Medium-low | First-launch failure | Compile-error catch + one-time log (the `InstancedMeshRenderer` fallback was deleted in the ship amendment) |
| Additive fidelity regression on rare GfxObj surfaces | Medium | Subtle visual difference | Visual verification at magic-themed content; spec amendment for additive sub-pass if found |
| `gl_BaseInstanceARB` fields not advancing per-instance attribs we still use | Low (we drop attribs entirely) | Wrong matrices | All instance data via SSBO; no vertex attrib at locations 3-6 to misalign |
| SSBO indexing GPU cost worse than uniform-array | Low (well-optimized in modern drivers) | Possible GPU time regression | GL timer queries detect; if observed, fall back to uniform array of bounded size |
| Persistent-mapped buffer foot-guns (chosen NOT to use in N.5) | n/a | n/a | Decision 7 defers to N.6 |
| Per-instance highlight (selection blink) feature creep | Low | Scope grows | Decision 8 defers; field reserved in design doc |

---

## 10. Out of scope (explicitly)
|
||||||
|
|
||||||
|
The following are NOT N.5 work. They become possible follow-ons.
|
||||||
|
|
||||||
|
- **WB's `TextureAtlasManager` adoption for atlas tier.** N.5 keeps acdream's `TextureCache` as the texture owner for everything. Atlas adoption is N.6+ if memory pressure shows up.
- **Persistent-mapped buffer ring with sync fences.** Decision 7. N.6 candidate if profiling shows residual `glBufferData` cost.
- **GPU-side culling (compute pre-pass).** Future phase.
- **Texture array repacking for multi-layer per-instance composites.** Future, if many palette-overrides actually share dimensions and could be packed.
- **Selection-blink highlight color.** Decision 8. Phase B.4 follow-up. Field reserved in `InstanceData` design (extend stride to 80 bytes when implementing).
- ~~**Deletion of legacy `InstancedMeshRenderer`.** N.6.~~ **Done in N.5 ship amendment**: `InstancedMeshRenderer`, `StaticMeshRenderer`, and `WbFoundationFlag` were deleted in the retirement commit.
- **Terrain wiring through WB.** Future.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## 11. Open questions
|
||||||
|
|
||||||
|
None outstanding. All 8 brainstorm questions resolved + 1 clarification on highlight semantics. Ready for plan.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
*End of design.*
|
||||||
|
|
@ -14,6 +14,7 @@
|
||||||
</ItemGroup>
|
</ItemGroup>
|
||||||
<ItemGroup>
|
<ItemGroup>
|
||||||
<PackageReference Include="Silk.NET.OpenGL" Version="2.23.0" />
|
<PackageReference Include="Silk.NET.OpenGL" Version="2.23.0" />
|
||||||
|
<PackageReference Include="Silk.NET.OpenGL.Extensions.ARB" Version="2.23.0" />
|
||||||
<PackageReference Include="Silk.NET.Windowing" Version="2.23.0" />
|
<PackageReference Include="Silk.NET.Windowing" Version="2.23.0" />
|
||||||
<PackageReference Include="Silk.NET.Input" Version="2.23.0" />
|
<PackageReference Include="Silk.NET.Input" Version="2.23.0" />
|
||||||
<PackageReference Include="Silk.NET.OpenAL" Version="2.23.0" />
|
<PackageReference Include="Silk.NET.OpenAL" Version="2.23.0" />
|
||||||
|
|
|
||||||
|
|
@ -25,14 +25,17 @@ public sealed class GameWindow : IDisposable
|
||||||
private DatCollection? _dats;
|
private DatCollection? _dats;
|
||||||
private float _lastMouseX;
|
private float _lastMouseX;
|
||||||
private float _lastMouseY;
|
private float _lastMouseY;
|
||||||
private InstancedMeshRenderer? _staticMesh;
|
|
||||||
private Shader? _meshShader;
|
private Shader? _meshShader;
|
||||||
private TextureCache? _textureCache;
|
private TextureCache? _textureCache;
|
||||||
/// <summary>Phase N.4: WB-backed rendering pipeline adapter. Non-null only
|
/// <summary>Phase N.4+: WB-backed rendering pipeline adapter. Always non-null
|
||||||
/// when <c>ACDREAM_USE_WB_FOUNDATION=1</c> is set; null otherwise.</summary>
|
/// after <c>OnLoad</c> completes (modern path is mandatory as of N.5).</summary>
|
||||||
private AcDream.App.Rendering.Wb.WbMeshAdapter? _wbMeshAdapter;
|
private AcDream.App.Rendering.Wb.WbMeshAdapter? _wbMeshAdapter;
|
||||||
private AcDream.App.Rendering.Wb.EntitySpawnAdapter? _wbEntitySpawnAdapter;
|
private AcDream.App.Rendering.Wb.EntitySpawnAdapter? _wbEntitySpawnAdapter;
|
||||||
private AcDream.App.Rendering.Wb.WbDrawDispatcher? _wbDrawDispatcher;
|
private AcDream.App.Rendering.Wb.WbDrawDispatcher? _wbDrawDispatcher;
|
||||||
|
/// <summary>Phase N.5: ARB_bindless_texture + ARB_shader_draw_parameters
|
||||||
|
/// support. Required at startup — missing bindless throws
|
||||||
|
/// <see cref="NotSupportedException"/> in <c>OnLoad</c>.</summary>
|
||||||
|
private AcDream.App.Rendering.Wb.BindlessSupport? _bindlessSupport;
|
||||||
private SamplerCache? _samplerCache;
|
private SamplerCache? _samplerCache;
|
||||||
private DebugLineRenderer? _debugLines;
|
private DebugLineRenderer? _debugLines;
|
||||||
// K-fix4 (2026-04-26): default OFF. The orange BSP / green cylinder
|
// K-fix4 (2026-04-26): default OFF. The orange BSP / green cylinder
|
||||||
|
|
@ -966,10 +969,6 @@ public sealed class GameWindow : IDisposable
|
||||||
Path.Combine(shadersDir, "terrain.vert"),
|
Path.Combine(shadersDir, "terrain.vert"),
|
||||||
Path.Combine(shadersDir, "terrain.frag"));
|
Path.Combine(shadersDir, "terrain.frag"));
|
||||||
|
|
||||||
_meshShader = new Shader(_gl,
|
|
||||||
Path.Combine(shadersDir, "mesh_instanced.vert"),
|
|
||||||
Path.Combine(shadersDir, "mesh_instanced.frag"));
|
|
||||||
|
|
||||||
// Phase G.1/G.2: shared scene-lighting UBO. Stays bound at
|
// Phase G.1/G.2: shared scene-lighting UBO. Stays bound at
|
||||||
// binding=1 for the lifetime of the process — every shader that
|
// binding=1 for the lifetime of the process — every shader that
|
||||||
// declares `layout(std140, binding = 1) uniform SceneLighting`
|
// declares `layout(std140, binding = 1) uniform SceneLighting`
|
||||||
|
|
@ -1419,7 +1418,43 @@ public sealed class GameWindow : IDisposable
|
||||||
_heightTable = heightTable;
|
_heightTable = heightTable;
|
||||||
_surfaceCache = new Dictionary<uint, AcDream.Core.Terrain.SurfaceInfo>();
|
_surfaceCache = new Dictionary<uint, AcDream.Core.Terrain.SurfaceInfo>();
|
||||||
|
|
||||||
_textureCache = new TextureCache(_gl, _dats);
|
// N.5: detect ARB_bindless_texture + ARB_shader_draw_parameters.
|
||||||
|
// The modern path (SSBO + glMultiDrawElementsIndirect + bindless textures)
|
||||||
|
// is mandatory as of Phase N.5 — missing extensions throw at startup with
|
||||||
|
// a clear error so users can file a real bug report rather than silently
|
||||||
|
// falling back to a half-working renderer.
|
||||||
|
if (AcDream.App.Rendering.Wb.BindlessSupport.TryCreate(_gl, out var bindless))
|
||||||
|
{
|
||||||
|
if (bindless!.HasShaderDrawParameters(_gl))
|
||||||
|
{
|
||||||
|
_bindlessSupport = bindless;
|
||||||
|
Console.WriteLine("[N.5] modern path capabilities present (bindless + ARB_shader_draw_parameters)");
|
||||||
|
}
|
||||||
|
else
|
||||||
|
{
|
||||||
|
Console.WriteLine("[N.5] GL_ARB_shader_draw_parameters not present — modern path not available");
|
||||||
|
}
|
||||||
|
}
|
||||||
|
else
|
||||||
|
{
|
||||||
|
Console.WriteLine("[N.5] GL_ARB_bindless_texture not present — modern path not available");
|
||||||
|
}
|
||||||
|
|
||||||
|
if (_bindlessSupport is null)
|
||||||
|
{
|
||||||
|
throw new NotSupportedException(
|
||||||
|
"acdream requires GL_ARB_bindless_texture + GL_ARB_shader_draw_parameters " +
|
||||||
|
"(GL 4.3+ with bindless support). Your GPU/driver does not expose these extensions. " +
|
||||||
|
"If this is unexpected, please file a bug report with your GPU vendor + driver version.");
|
||||||
|
}
|
||||||
|
|
||||||
|
// Mesh shader always loads (modern path is the only path).
|
||||||
|
_meshShader = new Shader(_gl,
|
||||||
|
Path.Combine(shadersDir, "mesh_modern.vert"),
|
||||||
|
Path.Combine(shadersDir, "mesh_modern.frag"));
|
||||||
|
Console.WriteLine("[N.5] mesh_modern shader loaded");
|
||||||
|
|
||||||
|
_textureCache = new TextureCache(_gl, _dats, _bindlessSupport);
|
||||||
// Two persistent GL sampler objects (Repeat + ClampToEdge) so
|
// Two persistent GL sampler objects (Repeat + ClampToEdge) so
|
||||||
// the sky pass can pick wrap mode per submesh without mutating
|
// the sky pass can pick wrap mode per submesh without mutating
|
||||||
// shared per-texture wrap state. See SamplerCache + the
|
// shared per-texture wrap state. See SamplerCache + the
|
||||||
|
|
@ -1427,17 +1462,14 @@ public sealed class GameWindow : IDisposable
|
||||||
// references/WorldBuilder/Chorizite.OpenGLSDLBackend/OpenGLGraphicsDevice.cs:115-132.
|
// references/WorldBuilder/Chorizite.OpenGLSDLBackend/OpenGLGraphicsDevice.cs:115-132.
|
||||||
_samplerCache = new SamplerCache(_gl);
|
_samplerCache = new SamplerCache(_gl);
|
||||||
|
|
||||||
// Phase N.4 — WB rendering pipeline foundation. Constructed only when
|
// Phase N.4+N.5 — WB rendering pipeline foundation. The modern path is
|
||||||
// ACDREAM_USE_WB_FOUNDATION=1 is set; otherwise the legacy renderer
|
// mandatory as of N.5 ship amendment: WbMeshAdapter + WbDrawDispatcher
|
||||||
// path stays in charge. The full ObjectMeshManager bring-up lives in
|
// always construct. WbMeshAdapter owns ObjectMeshManager and opens its
|
||||||
// WbMeshAdapter (Task 9): OpenGLGraphicsDevice + DefaultDatReaderWriter
|
// own file handles for the dat files (independent of our DatCollection).
|
||||||
// + ObjectMeshManager. WbMeshAdapter opens its own file handles for
|
|
||||||
// the dat files (independent of our DatCollection).
|
|
||||||
if (AcDream.App.Rendering.Wb.WbFoundationFlag.IsEnabled)
|
|
||||||
{
|
{
|
||||||
var wbLogger = Microsoft.Extensions.Logging.Abstractions.NullLogger<AcDream.App.Rendering.Wb.WbMeshAdapter>.Instance;
|
var wbLogger = Microsoft.Extensions.Logging.Abstractions.NullLogger<AcDream.App.Rendering.Wb.WbMeshAdapter>.Instance;
|
||||||
_wbMeshAdapter = new AcDream.App.Rendering.Wb.WbMeshAdapter(_gl, _datDir, _dats, wbLogger);
|
_wbMeshAdapter = new AcDream.App.Rendering.Wb.WbMeshAdapter(_gl, _datDir, _dats, wbLogger);
|
||||||
Console.WriteLine("[N.4] WbFoundation flag is ENABLED — routing static content through ObjectMeshManager.");
|
Console.WriteLine("[N.4+N.5] WB foundation + modern path active — routing all content through ObjectMeshManager.");
|
||||||
}
|
}
|
||||||
|
|
||||||
// Phase N.4 Task 12: construct LandblockSpawnAdapter under the feature flag
|
// Phase N.4 Task 12: construct LandblockSpawnAdapter under the feature flag
|
||||||
|
|
@ -1446,60 +1478,51 @@ public sealed class GameWindow : IDisposable
|
||||||
// one that carries the adapter so AddLandblock/RemoveLandblock notify WB.
|
// one that carries the adapter so AddLandblock/RemoveLandblock notify WB.
|
||||||
// Phase N.4 Task 17: also construct EntitySpawnAdapter for server-spawned
|
// Phase N.4 Task 17: also construct EntitySpawnAdapter for server-spawned
|
||||||
// per-instance content under the same flag.
|
// per-instance content under the same flag.
|
||||||
|
// N.5 mandatory path: spawn adapters + dispatcher always construct.
|
||||||
|
// _wbMeshAdapter, _meshShader, _textureCache, and _bindlessSupport are
|
||||||
|
// all guaranteed non-null here (startup throws above if any are missing).
|
||||||
{
|
{
|
||||||
AcDream.App.Rendering.Wb.LandblockSpawnAdapter? wbSpawnAdapter = null;
|
var wbSpawnAdapter = new AcDream.App.Rendering.Wb.LandblockSpawnAdapter(_wbMeshAdapter!);
|
||||||
AcDream.App.Rendering.Wb.EntitySpawnAdapter? wbEntitySpawnAdapter = null;
|
// Sequencer factory: look up Setup + MotionTable from dats and build
|
||||||
if (AcDream.App.Rendering.Wb.WbFoundationFlag.IsEnabled && _wbMeshAdapter is not null)
|
// an AnimationSequencer. Falls back to a no-op sequencer when the
|
||||||
|
// entity has no motion table (static props, etc.). Uses _animLoader
|
||||||
|
// which is initialised earlier in OnLoad; it is non-null here.
|
||||||
|
var capturedDats = _dats;
|
||||||
|
var capturedAnimLoader = _animLoader;
|
||||||
|
AcDream.Core.Physics.AnimationSequencer SequencerFactory(AcDream.Core.World.WorldEntity e)
|
||||||
{
|
{
|
||||||
wbSpawnAdapter = new AcDream.App.Rendering.Wb.LandblockSpawnAdapter(_wbMeshAdapter);
|
if (capturedDats is not null && capturedAnimLoader is not null)
|
||||||
// Sequencer factory: look up Setup + MotionTable from dats and build
|
|
||||||
// an AnimationSequencer. Falls back to a no-op sequencer when the
|
|
||||||
// entity has no motion table (static props, etc.). Uses _animLoader
|
|
||||||
// which is initialised at line 1004; it is non-null here because
|
|
||||||
// OnLoad wires _dats + _animLoader before this block runs.
|
|
||||||
var capturedDats = _dats;
|
|
||||||
var capturedAnimLoader = _animLoader;
|
|
||||||
AcDream.Core.Physics.AnimationSequencer SequencerFactory(AcDream.Core.World.WorldEntity e)
|
|
||||||
{
|
{
|
||||||
if (capturedDats is not null && capturedAnimLoader is not null)
|
var setup = capturedDats.Get<DatReaderWriter.DBObjs.Setup>(e.SourceGfxObjOrSetupId);
|
||||||
|
if (setup is not null)
|
||||||
{
|
{
|
||||||
var setup = capturedDats.Get<DatReaderWriter.DBObjs.Setup>(e.SourceGfxObjOrSetupId);
|
uint mtableId = (uint)setup.DefaultMotionTable;
|
||||||
if (setup is not null)
|
if (mtableId != 0)
|
||||||
{
|
{
|
||||||
uint mtableId = (uint)setup.DefaultMotionTable;
|
var mtable = capturedDats.Get<DatReaderWriter.DBObjs.MotionTable>(mtableId);
|
||||||
if (mtableId != 0)
|
if (mtable is not null)
|
||||||
{
|
return new AcDream.Core.Physics.AnimationSequencer(setup, mtable, capturedAnimLoader);
|
||||||
var mtable = capturedDats.Get<DatReaderWriter.DBObjs.MotionTable>(mtableId);
|
|
||||||
if (mtable is not null)
|
|
||||||
return new AcDream.Core.Physics.AnimationSequencer(setup, mtable, capturedAnimLoader);
|
|
||||||
}
|
|
||||||
// Setup exists but no motion table — no-op sequencer.
|
|
||||||
return new AcDream.Core.Physics.AnimationSequencer(
|
|
||||||
setup,
|
|
||||||
new DatReaderWriter.DBObjs.MotionTable(),
|
|
||||||
capturedAnimLoader);
|
|
||||||
}
|
}
|
||||||
|
// Setup exists but no motion table — no-op sequencer.
|
||||||
|
return new AcDream.Core.Physics.AnimationSequencer(
|
||||||
|
setup,
|
||||||
|
new DatReaderWriter.DBObjs.MotionTable(),
|
||||||
|
capturedAnimLoader);
|
||||||
}
|
}
|
||||||
// Complete fallback: empty setup + empty motion table + null loader.
|
|
||||||
return new AcDream.Core.Physics.AnimationSequencer(
|
|
||||||
new DatReaderWriter.DBObjs.Setup(),
|
|
||||||
new DatReaderWriter.DBObjs.MotionTable(),
|
|
||||||
new NullAnimLoader());
|
|
||||||
}
|
}
|
||||||
wbEntitySpawnAdapter = new AcDream.App.Rendering.Wb.EntitySpawnAdapter(
|
// Complete fallback: empty setup + empty motion table + null loader.
|
||||||
_textureCache, SequencerFactory, _wbMeshAdapter);
|
return new AcDream.Core.Physics.AnimationSequencer(
|
||||||
_wbEntitySpawnAdapter = wbEntitySpawnAdapter;
|
new DatReaderWriter.DBObjs.Setup(),
|
||||||
|
new DatReaderWriter.DBObjs.MotionTable(),
|
||||||
|
new NullAnimLoader());
|
||||||
}
|
}
|
||||||
|
var wbEntitySpawnAdapter = new AcDream.App.Rendering.Wb.EntitySpawnAdapter(
|
||||||
|
_textureCache!, SequencerFactory, _wbMeshAdapter!);
|
||||||
|
_wbEntitySpawnAdapter = wbEntitySpawnAdapter;
|
||||||
_worldState = new AcDream.App.Streaming.GpuWorldState(wbSpawnAdapter, wbEntitySpawnAdapter);
|
_worldState = new AcDream.App.Streaming.GpuWorldState(wbSpawnAdapter, wbEntitySpawnAdapter);
|
||||||
}
|
|
||||||
|
|
||||||
_staticMesh = new InstancedMeshRenderer(_gl, _meshShader, _textureCache, _wbMeshAdapter);
|
|
||||||
|
|
||||||
if (AcDream.App.Rendering.Wb.WbFoundationFlag.IsEnabled
|
|
||||||
&& _wbMeshAdapter is not null && _wbEntitySpawnAdapter is not null)
|
|
||||||
{
|
|
||||||
_wbDrawDispatcher = new AcDream.App.Rendering.Wb.WbDrawDispatcher(
|
_wbDrawDispatcher = new AcDream.App.Rendering.Wb.WbDrawDispatcher(
|
||||||
_gl, _meshShader, _textureCache, _wbMeshAdapter, _wbEntitySpawnAdapter);
|
_gl, _meshShader!, _textureCache!, _wbMeshAdapter!, _wbEntitySpawnAdapter, _bindlessSupport!);
|
||||||
}
|
}
|
||||||
|
|
||||||
// Phase G.1 sky renderer — its own shader (sky.vert / sky.frag)
|
// Phase G.1 sky renderer — its own shader (sky.vert / sky.frag)
|
||||||
|
|
@ -1509,7 +1532,7 @@ public sealed class GameWindow : IDisposable
|
||||||
Path.Combine(shadersDir, "sky.vert"),
|
Path.Combine(shadersDir, "sky.vert"),
|
||||||
Path.Combine(shadersDir, "sky.frag"));
|
Path.Combine(shadersDir, "sky.frag"));
|
||||||
_skyRenderer = new AcDream.App.Rendering.Sky.SkyRenderer(
|
_skyRenderer = new AcDream.App.Rendering.Sky.SkyRenderer(
|
||||||
_gl, _dats, skyShader, _textureCache, _samplerCache);
|
_gl, _dats, skyShader, _textureCache!, _samplerCache);
|
||||||
|
|
||||||
// Phase G.1 particle renderer — renders rain / snow / spell auras
|
// Phase G.1 particle renderer — renders rain / snow / spell auras
|
||||||
// spawned into the shared ParticleSystem as billboard quads.
|
// spawned into the shared ParticleSystem as billboard quads.
|
||||||
|
|
@ -2025,7 +2048,7 @@ public sealed class GameWindow : IDisposable
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
if (_dats is null || _staticMesh is null) return;
|
if (_dats is null) return;
|
||||||
if (spawn.Position is null || spawn.SetupTableId is null)
|
if (spawn.Position is null || spawn.SetupTableId is null)
|
||||||
{
|
{
|
||||||
// Can't place a mesh without both. Most of these are inventory
|
// Can't place a mesh without both. Most of these are inventory
|
||||||
|
|
@ -2360,10 +2383,9 @@ public sealed class GameWindow : IDisposable
|
||||||
continue;
|
continue;
|
||||||
}
|
}
|
||||||
_physicsDataCache.CacheGfxObj(mr.GfxObjId, gfx);
|
_physicsDataCache.CacheGfxObj(mr.GfxObjId, gfx);
|
||||||
var subMeshes = AcDream.Core.Meshing.GfxObjMesh.Build(gfx, _dats);
|
|
||||||
_staticMesh.EnsureUploaded(mr.GfxObjId, subMeshes);
|
|
||||||
if (dumpClothing)
|
if (dumpClothing)
|
||||||
{
|
{
|
||||||
|
var subMeshes = AcDream.Core.Meshing.GfxObjMesh.Build(gfx, _dats);
|
||||||
int tris = 0; int subs = 0;
|
int tris = 0; int subs = 0;
|
||||||
foreach (var sm in subMeshes) { tris += sm.Indices.Length / 3; subs++; }
|
foreach (var sm in subMeshes) { tris += sm.Indices.Length / 3; subs++; }
|
||||||
dumpClothingTotalTris += tris;
|
dumpClothingTotalTris += tris;
|
||||||
|
|
@ -5194,44 +5216,25 @@ public sealed class GameWindow : IDisposable
|
||||||
portalPlanes, origin.X, origin.Y);
|
portalPlanes, origin.X, origin.Y);
|
||||||
}
|
}
|
||||||
|
|
||||||
// Upload every GfxObj referenced by this landblock's entities.
|
// N.5: WbMeshAdapter.Tick() handles GPU upload for all GfxObj meshes via
|
||||||
// EnsureUploaded is idempotent so duplicates across landblocks are free.
|
// ObjectMeshManager.PrepareMeshDataAsync. The legacy EnsureUploaded loop
|
||||||
if (_staticMesh is not null)
|
// (and _pendingCellMeshes drain) are retired with InstancedMeshRenderer.
|
||||||
|
// Cache GfxObj physics data (BSP trees) for the physics engine — this
|
||||||
|
// loop is physics-only, not renderer-side.
|
||||||
|
foreach (var entity in lb.Entities)
|
||||||
{
|
{
|
||||||
// Task 8: drain any pending EnvCell room-mesh sub-meshes first.
|
foreach (var meshRef in entity.MeshRefs)
|
||||||
// The worker thread pre-built these CPU-side and stored them in
|
|
||||||
// _pendingCellMeshes. We must upload them here (render thread) before
|
|
||||||
// the per-MeshRef loop below tries to look them up via GfxObjMesh.Build,
|
|
||||||
// which would fail because EnvCell ids (0xAAAA01xx) aren't real GfxObj
|
|
||||||
// dat ids. EnsureUploaded is idempotent so calling it here then seeing
|
|
||||||
// the same id again in the loop below is safe.
|
|
||||||
foreach (var entity in lb.Entities)
|
|
||||||
{
|
{
|
||||||
foreach (var meshRef in entity.MeshRefs)
|
if ((meshRef.GfxObjId & 0xFF000000u) != 0x01000000u) continue;
|
||||||
{
|
var gfx = _dats.Get<DatReaderWriter.DBObjs.GfxObj>(meshRef.GfxObjId);
|
||||||
if (_pendingCellMeshes.TryRemove(meshRef.GfxObjId, out var cellSubMeshes))
|
if (gfx is null) continue;
|
||||||
_staticMesh.EnsureUploaded(meshRef.GfxObjId, cellSubMeshes);
|
_physicsDataCache.CacheGfxObj(meshRef.GfxObjId, gfx);
|
||||||
}
|
|
||||||
}
|
|
||||||
|
|
||||||
// Now upload regular GfxObj sub-meshes (stabs, scenery, interior stabs).
|
|
||||||
// Skip any ids already uploaded (includes the cell meshes just drained).
|
|
||||||
foreach (var entity in lb.Entities)
|
|
||||||
{
|
|
||||||
foreach (var meshRef in entity.MeshRefs)
|
|
||||||
{
|
|
||||||
// Skip EnvCell synthetic ids — already handled above (or already
|
|
||||||
// uploaded on a prior tick). GfxObj ids are 0x01xxxxxx; Setup ids
|
|
||||||
// are 0x02xxxxxx; anything else is not a GfxObj dat record.
|
|
||||||
if ((meshRef.GfxObjId & 0xFF000000u) != 0x01000000u) continue;
|
|
||||||
var gfx = _dats.Get<DatReaderWriter.DBObjs.GfxObj>(meshRef.GfxObjId);
|
|
||||||
if (gfx is null) continue;
|
|
||||||
_physicsDataCache.CacheGfxObj(meshRef.GfxObjId, gfx);
|
|
||||||
var subMeshes = AcDream.Core.Meshing.GfxObjMesh.Build(gfx, _dats);
|
|
||||||
_staticMesh.EnsureUploaded(meshRef.GfxObjId, subMeshes);
|
|
||||||
}
|
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
// Drain _pendingCellMeshes to prevent unbounded accumulation.
|
||||||
|
// The data is no longer consumed (WB handles EnvCell geometry through
|
||||||
|
// its own pipeline), but the worker thread still populates this dict.
|
||||||
|
_pendingCellMeshes.Clear();
|
||||||
|
|
||||||
// Task 7: register static entities into the ShadowObjectRegistry so the
|
// Task 7: register static entities into the ShadowObjectRegistry so the
|
||||||
// Transition system can find and collide against them during movement.
|
// Transition system can find and collide against them during movement.
|
||||||
|
|
@ -6336,20 +6339,11 @@ public sealed class GameWindow : IDisposable
|
||||||
animatedIds.Add(k);
|
animatedIds.Add(k);
|
||||||
}
|
}
|
||||||
|
|
||||||
if (_wbDrawDispatcher is not null)
|
// N.5: WbDrawDispatcher is always non-null (modern path mandatory).
|
||||||
{
|
_wbDrawDispatcher!.Draw(camera, _worldState.LandblockEntries, frustum,
|
||||||
_wbDrawDispatcher.Draw(camera, _worldState.LandblockEntries, frustum,
|
neverCullLandblockId: playerLb,
|
||||||
neverCullLandblockId: playerLb,
|
visibleCellIds: visibility?.VisibleCellIds,
|
||||||
visibleCellIds: visibility?.VisibleCellIds,
|
animatedEntityIds: animatedIds);
|
||||||
animatedEntityIds: animatedIds);
|
|
||||||
}
|
|
||||||
else
|
|
||||||
{
|
|
||||||
_staticMesh?.Draw(camera, _worldState.LandblockEntries, frustum,
|
|
||||||
neverCullLandblockId: playerLb,
|
|
||||||
visibleCellIds: visibility?.VisibleCellIds,
|
|
||||||
animatedEntityIds: animatedIds);
|
|
||||||
}
|
|
||||||
|
|
||||||
// Phase G.1 / E.3: draw all live particles after opaque
|
// Phase G.1 / E.3: draw all live particles after opaque
|
||||||
// scene geometry so alpha blending composites correctly.
|
// scene geometry so alpha blending composites correctly.
|
||||||
|
|
@ -8731,11 +8725,10 @@ public sealed class GameWindow : IDisposable
|
||||||
_liveSession?.Dispose();
|
_liveSession?.Dispose();
|
||||||
_audioEngine?.Dispose(); // Phase E.2: stop all voices, close AL context
|
_audioEngine?.Dispose(); // Phase E.2: stop all voices, close AL context
|
||||||
_wbDrawDispatcher?.Dispose();
|
_wbDrawDispatcher?.Dispose();
|
||||||
_staticMesh?.Dispose();
|
|
||||||
_skyRenderer?.Dispose(); // depends on sampler cache; dispose first
|
_skyRenderer?.Dispose(); // depends on sampler cache; dispose first
|
||||||
_samplerCache?.Dispose();
|
_samplerCache?.Dispose();
|
||||||
_textureCache?.Dispose();
|
_textureCache?.Dispose();
|
||||||
_wbMeshAdapter?.Dispose(); // Phase N.4 WB foundation — null when flag off
|
_wbMeshAdapter?.Dispose(); // Phase N.4+N.5 WB foundation (mandatory modern path)
|
||||||
|
|
||||||
_meshShader?.Dispose();
|
_meshShader?.Dispose();
|
||||||
_terrain?.Dispose();
|
_terrain?.Dispose();
|
||||||
|
|
|
||||||
|
|
@ -1,596 +0,0 @@
|
||||||
// src/AcDream.App/Rendering/InstancedMeshRenderer.cs
|
|
||||||
//
|
|
||||||
// True instanced rendering for static-object meshes.
|
|
||||||
// Groups entities by GfxObjId. All instance model matrices are written into
|
|
||||||
// a single shared instance VBO once per frame. Each sub-mesh is drawn with
|
|
||||||
// DrawElementsInstanced — one GL draw call per (GfxObj × sub-mesh) instead
|
|
||||||
// of one per entity. For a scene with N unique GfxObjs and M total entities
|
|
||||||
// this reduces draw calls from M*subMeshes to N*subMeshes.
|
|
||||||
//
|
|
||||||
// Matrix layout:
|
|
||||||
// System.Numerics.Matrix4x4 is row-major. Written to the float[] buffer in
|
|
||||||
// natural memory order (M11..M44). The GLSL shader reads 4 vec4 attributes
|
|
||||||
// (aInstanceRow0-3) and constructs mat4(row0, row1, row2, row3). Because
|
|
||||||
// GLSL mat4() takes column vectors, the rows of the C# matrix become the
|
|
||||||
// columns of the GLSL mat4 — which is the same transpose that UniformMatrix4
|
|
||||||
// with transpose=false produces. Visual result is identical to the old
|
|
||||||
// SetMatrix4("uModel", ...) path.
|
|
||||||
//
|
|
||||||
// Architecture note: public API matches StaticMeshRenderer so GameWindow only
|
|
||||||
// needs to update the shader and uniform setup at the call sites.
|
|
||||||
using System.Numerics;
|
|
||||||
using System.Runtime.InteropServices;
|
|
||||||
using AcDream.App.Rendering.Wb;
|
|
||||||
using AcDream.Core.Meshing;
|
|
||||||
using AcDream.Core.Terrain;
|
|
||||||
using AcDream.Core.World;
|
|
||||||
using Silk.NET.OpenGL;
|
|
||||||
|
|
||||||
namespace AcDream.App.Rendering;
|
|
||||||
|
|
||||||
public sealed unsafe class InstancedMeshRenderer : IDisposable
|
|
||||||
{
|
|
||||||
private readonly GL _gl;
|
|
||||||
private readonly Shader _shader;
|
|
||||||
private readonly TextureCache _textures;
|
|
||||||
|
|
||||||
/// <summary>
|
|
||||||
/// Optional WB adapter. Held but currently unused — Phase N.4 Adjustment 2
|
|
||||||
/// (2026-05-08) reverted Task 9's renderer-level routing. Tier-routing decisions
|
|
||||||
/// (atlas vs per-instance) belong at the spawn-callback layer (Task 11
|
|
||||||
/// LandblockSpawnAdapter for atlas-tier; Task 17 EntitySpawnAdapter for
|
|
||||||
/// per-instance), not in the renderer which is intentionally tier-blind. The
|
|
||||||
/// constructor parameter is preserved so GameWindow's wire-up doesn't shift
|
|
||||||
/// when later tasks need adapter access.
|
|
||||||
/// </summary>
|
|
||||||
private readonly WbMeshAdapter? _wbMeshAdapter;
|
|
||||||
|
|
||||||
// One GPU bundle per unique GfxObj id. Each GfxObj can have multiple sub-meshes.
|
|
||||||
private readonly Dictionary<uint, List<SubMeshGpu>> _gpuByGfxObj = new();
|
|
||||||
|
|
||||||
// Shared instance VBO — filled every frame with all instance model matrices.
|
|
||||||
private readonly uint _instanceVbo;
|
|
||||||
|
|
||||||
// Per-frame scratch: reused float buffer for instance matrix data.
|
|
||||||
// 16 floats per mat4. Grown on demand; never shrunk.
|
|
||||||
private float[] _instanceBuffer = new float[256 * 16]; // start at 256 instances
|
|
||||||
|
|
||||||
// ── Instance grouping scratch ─────────────────────────────────────────────
|
|
||||||
//
|
|
||||||
// Reused every frame to avoid per-frame allocation.
|
|
||||||
//
|
|
||||||
// **Group key = (GfxObjId, PaletteOverrideHash, SurfaceOverridesHash).**
|
|
||||||
//
|
|
||||||
// An earlier implementation grouped on <c>GfxObjId</c> alone and resolved
|
|
||||||
// the per-sub-mesh texture from the first instance in the group — which
|
|
||||||
// is fine for scenery where every tree shares the same palette, but
|
|
||||||
// utterly broken for NPCs: every humanoid uses the same base body
|
|
||||||
// GfxObjs and they all piled into one group, so the first NPC's palette
|
|
||||||
// was used for every NPC in the frame. Frustum culling + iteration
|
|
||||||
// order meant that "first NPC" changed as the camera turned — producing
|
|
||||||
// the "NPC clothing changes when I turn" symptom.
|
|
||||||
//
|
|
||||||
// Now we also key by the entity's PaletteOverride + per-MeshRef
|
|
||||||
// SurfaceOverrides signature so only entities that decode to the
|
|
||||||
// SAME texture for every sub-mesh can share a batch. Entities with
|
|
||||||
// unique appearance fall to single-instance groups (still correct,
|
|
||||||
// marginally slower than true instancing).
|
|
||||||
private readonly Dictionary<GroupKey, InstanceGroup> _groups = new();
|
|
||||||
|
|
||||||
private readonly record struct GroupKey(uint GfxObjId, ulong TextureSignature);
|
|
||||||
|
|
||||||
public InstancedMeshRenderer(GL gl, Shader shader, TextureCache textures,
|
|
||||||
WbMeshAdapter? wbMeshAdapter = null)
|
|
||||||
{
|
|
||||||
_gl = gl;
|
|
||||||
_shader = shader;
|
|
||||||
_textures = textures;
|
|
||||||
_wbMeshAdapter = wbMeshAdapter;
|
|
||||||
|
|
||||||
_instanceVbo = _gl.GenBuffer();
|
|
||||||
}
|
|
||||||
|
|
||||||
// ── Upload ────────────────────────────────────────────────────────────────
|
|
||||||
|
|
||||||
public void EnsureUploaded(uint gfxObjId, IReadOnlyList<GfxObjSubMesh> subMeshes)
|
|
||||||
{
|
|
||||||
if (_gpuByGfxObj.ContainsKey(gfxObjId))
|
|
||||||
return;
|
|
||||||
|
|
||||||
// Phase N.4 Adjustment 2 (2026-05-08): renderer is tier-blind. Tier-routing
|
|
||||||
// (atlas vs per-instance) lives at the spawn-callback layer (Tasks 11 + 17),
|
|
||||||
// not here. Smoke-test of the original Task 9 routing showed it caught
|
|
||||||
// characters / NPCs (server-spawned, per-instance tier) along with static
|
|
||||||
// scenery, because EnsureUploaded is called from both spawn paths.
|
|
||||||
var list = new List<SubMeshGpu>(subMeshes.Count);
|
|
||||||
foreach (var sm in subMeshes)
|
|
||||||
list.Add(UploadSubMesh(sm));
|
|
||||||
_gpuByGfxObj[gfxObjId] = list;
|
|
||||||
}
|
|
||||||
|
|
||||||
private SubMeshGpu UploadSubMesh(GfxObjSubMesh sm)
|
|
||||||
{
|
|
||||||
uint vao = _gl.GenVertexArray();
|
|
||||||
_gl.BindVertexArray(vao);
|
|
||||||
|
|
||||||
// ── Vertex buffer (positions, normals, UVs) ───────────────────────────
|
|
||||||
uint vbo = _gl.GenBuffer();
|
|
||||||
_gl.BindBuffer(BufferTargetARB.ArrayBuffer, vbo);
|
|
||||||
fixed (void* p = sm.Vertices)
|
|
||||||
_gl.BufferData(BufferTargetARB.ArrayBuffer,
|
|
||||||
(nuint)(sm.Vertices.Length * sizeof(Vertex)), p, BufferUsageARB.StaticDraw);
|
|
||||||
|
|
||||||
uint stride = (uint)sizeof(Vertex);
|
|
||||||
_gl.EnableVertexAttribArray(0);
|
|
||||||
_gl.VertexAttribPointer(0, 3, VertexAttribPointerType.Float, false, stride, (void*)0);
|
|
||||||
_gl.EnableVertexAttribArray(1);
|
|
||||||
_gl.VertexAttribPointer(1, 3, VertexAttribPointerType.Float, false, stride, (void*)(3 * sizeof(float)));
|
|
||||||
_gl.EnableVertexAttribArray(2);
|
|
||||||
_gl.VertexAttribPointer(2, 2, VertexAttribPointerType.Float, false, stride, (void*)(6 * sizeof(float)));
|
|
||||||
// Note: location 3 (uint TerrainLayer) is NOT used by mesh_instanced.vert;
|
|
||||||
// that slot is reserved for per-instance mat4 row 0 from the instance VBO.
|
|
||||||
|
|
||||||
// ── Index buffer ──────────────────────────────────────────────────────
|
|
||||||
uint ebo = _gl.GenBuffer();
|
|
||||||
_gl.BindBuffer(BufferTargetARB.ElementArrayBuffer, ebo);
|
|
||||||
fixed (void* p = sm.Indices)
|
|
||||||
_gl.BufferData(BufferTargetARB.ElementArrayBuffer,
|
|
||||||
(nuint)(sm.Indices.Length * sizeof(uint)), p, BufferUsageARB.StaticDraw);
|
|
||||||
|
|
||||||
// ── Per-instance model matrix (locations 3-6) ─────────────────────────
|
|
||||||
// Bind the shared instance VBO. The VAO captures this binding at each
|
|
||||||
// attribute location. At draw time we re-call VertexAttribPointer with
|
|
||||||
// the per-group byte offset (to address different groups in the VBO
|
|
||||||
// without DrawElementsInstancedBaseInstance).
|
|
||||||
_gl.BindBuffer(BufferTargetARB.ArrayBuffer, _instanceVbo);
|
|
||||||
// mat4 = 4 × vec4, stride = 64 bytes, divisor = 1 (advance once per instance)
|
|
||||||
for (uint row = 0; row < 4; row++)
|
|
||||||
{
|
|
||||||
uint loc = 3 + row;
|
|
||||||
_gl.EnableVertexAttribArray(loc);
|
|
||||||
_gl.VertexAttribPointer(loc, 4, VertexAttribPointerType.Float, false, 64, (void*)(row * 16));
|
|
||||||
_gl.VertexAttribDivisor(loc, 1);
|
|
||||||
}
|
|
||||||
|
|
||||||
_gl.BindVertexArray(0);
|
|
||||||
|
|
||||||
return new SubMeshGpu
|
|
||||||
{
|
|
||||||
Vao = vao,
|
|
||||||
Vbo = vbo,
|
|
||||||
Ebo = ebo,
|
|
||||||
IndexCount = sm.Indices.Length,
|
|
||||||
SurfaceId = sm.SurfaceId,
|
|
||||||
Translucency = sm.Translucency,
|
|
||||||
};
|
|
||||||
}
|
|
||||||
|
|
||||||
// ── Draw ──────────────────────────────────────────────────────────────────
|
|
||||||
|
|
||||||
public void Draw(ICamera camera,
|
|
||||||
IEnumerable<(uint LandblockId, Vector3 AabbMin, Vector3 AabbMax, IReadOnlyList<WorldEntity> Entities)> landblockEntries,
|
|
||||||
FrustumPlanes? frustum = null,
|
|
||||||
uint? neverCullLandblockId = null,
|
|
||||||
HashSet<uint>? visibleCellIds = null,
|
|
||||||
// L-fix1 (2026-04-28): set of entity ids that should bypass the
|
|
||||||
// landblock-level frustum cull. Animated entities (other
|
|
||||||
// players, NPCs, monsters) are always rendered if their
|
|
||||||
// landblock is loaded — without this they vanish whenever the
|
|
||||||
// camera rotates away from their landblock, even though
|
|
||||||
// they're within visible distance of the player. Pass null /
|
|
||||||
// empty to keep the previous "cull everything by landblock"
|
|
||||||
// behavior.
|
|
||||||
HashSet<uint>? animatedEntityIds = null)
|
|
||||||
{
|
|
||||||
_shader.Use();
|
|
||||||
|
|
||||||
var vp = camera.View * camera.Projection;
|
|
||||||
_shader.SetMatrix4("uViewProjection", vp);
|
|
||||||
|
|
||||||
// Phase G: lighting + ambient + fog are owned by the
|
|
||||||
// SceneLighting UBO (binding=1) uploaded once per frame by
|
|
||||||
// GameWindow. The instanced mesh fragment shader reads it
|
|
||||||
// directly — no per-draw uniform uploads needed.
|
|
||||||
|
|
||||||
// ── Collect and group instances ───────────────────────────────────────
|
|
||||||
CollectGroups(landblockEntries, frustum, neverCullLandblockId, visibleCellIds, animatedEntityIds);
|
|
||||||
|
|
||||||
// ── Build and upload the instance buffer ──────────────────────────────
|
|
||||||
// Count total instances.
|
|
||||||
int totalInstances = 0;
|
|
||||||
foreach (var grp in _groups.Values)
|
|
||||||
totalInstances += grp.Count;
|
|
||||||
|
|
||||||
// Grow the scratch buffer if needed.
|
|
||||||
int needed = totalInstances * 16;
|
|
||||||
if (_instanceBuffer.Length < needed)
|
|
||||||
_instanceBuffer = new float[needed + 256 * 16]; // extra headroom
|
|
||||||
|
|
||||||
// Write all groups contiguously. Record each group's starting offset
|
|
||||||
// (in units of instances, not bytes) so we can address them at draw time.
|
|
||||||
int instanceOffset = 0;
|
|
||||||
foreach (var grp in _groups.Values)
|
|
||||||
{
|
|
||||||
grp.BufferOffset = instanceOffset;
|
|
||||||
foreach (ref readonly var inst in CollectionsMarshal.AsSpan(grp.Entries))
|
|
||||||
WriteMatrix(_instanceBuffer, instanceOffset++ * 16, inst.Model);
|
|
||||||
}
|
|
||||||
|
|
||||||
// Upload all instance data in a single BufferData call (DynamicDraw usage).
|
|
||||||
if (totalInstances > 0)
|
|
||||||
{
|
|
||||||
_gl.BindBuffer(BufferTargetARB.ArrayBuffer, _instanceVbo);
|
|
||||||
fixed (void* p = _instanceBuffer)
|
|
||||||
_gl.BufferData(BufferTargetARB.ArrayBuffer,
|
|
||||||
(nuint)(totalInstances * 16 * sizeof(float)), p, BufferUsageARB.DynamicDraw);
|
|
||||||
}
|
|
||||||
|
|
||||||
// ── Pass 1: Opaque + ClipMap ──────────────────────────────────────────
|
|
||||||
// Diagnostic: ACDREAM_NO_CULL=1 disables backface culling entirely.
|
|
||||||
if (string.Equals(Environment.GetEnvironmentVariable("ACDREAM_NO_CULL"), "1", StringComparison.Ordinal))
|
|
||||||
{
|
|
||||||
_gl.Disable(EnableCap.CullFace);
|
|
||||||
}
|
|
||||||
foreach (var (key, grp) in _groups)
|
|
||||||
{
|
|
||||||
if (!_gpuByGfxObj.TryGetValue(key.GfxObjId, out var subMeshes))
|
|
||||||
continue;
|
|
||||||
|
|
||||||
bool hasOpaqueSubMesh = false;
|
|
||||||
foreach (var sub in subMeshes)
|
|
||||||
{
|
|
||||||
if (sub.Translucency == TranslucencyKind.Opaque ||
|
|
||||||
sub.Translucency == TranslucencyKind.ClipMap)
|
|
||||||
{
|
|
||||||
hasOpaqueSubMesh = true;
|
|
||||||
break;
|
|
||||||
}
|
|
||||||
}
|
|
||||||
if (!hasOpaqueSubMesh) continue;
|
|
||||||
|
|
||||||
// For this group, instance data starts at grp.BufferOffset in the VBO.
|
|
||||||
// We need to tell the VAO to read from that offset.
|
|
||||||
uint byteOffset = (uint)(grp.BufferOffset * 64); // 64 bytes per mat4
|
|
||||||
|
|
||||||
foreach (var sub in subMeshes)
|
|
||||||
{
|
|
||||||
if (sub.Translucency != TranslucencyKind.Opaque &&
|
|
||||||
sub.Translucency != TranslucencyKind.ClipMap)
|
|
||||||
continue;
|
|
||||||
|
|
||||||
_shader.SetInt("uTranslucencyKind", (int)sub.Translucency);
|
|
||||||
|
|
||||||
// Bind VAO + re-point instance attributes to the group's slice
|
|
||||||
// in the shared VBO. This updates the VAO's stored offset for
|
|
||||||
// locations 3-6 without touching the vertex or index bindings.
|
|
||||||
_gl.BindVertexArray(sub.Vao);
|
|
||||||
_gl.BindBuffer(BufferTargetARB.ArrayBuffer, _instanceVbo);
|
|
||||||
for (uint row = 0; row < 4; row++)
|
|
||||||
{
|
|
||||||
_gl.VertexAttribPointer(3 + row, 4, VertexAttribPointerType.Float,
|
|
||||||
false, 64, (void*)(byteOffset + row * 16));
|
|
||||||
}
|
|
||||||
|
|
||||||
// Resolve the texture from the first instance. Groups are keyed by
// (GfxObjId, texture signature), so every instance in this group
// resolves to the same texture for a given sub-mesh, barring a hash
// collision between two different palette or surface overrides.
|
|
||||||
if (grp.Count == 0) continue;
|
|
||||||
var firstEntry = grp.Entries[0];
|
|
||||||
uint tex = ResolveTex(firstEntry.Entity, firstEntry.MeshRef, sub);
|
|
||||||
_gl.ActiveTexture(TextureUnit.Texture0);
|
|
||||||
_gl.BindTexture(TextureTarget.Texture2D, tex);
|
|
||||||
|
|
||||||
_gl.DrawElementsInstanced(PrimitiveType.Triangles,
|
|
||||||
(uint)sub.IndexCount,
|
|
||||||
DrawElementsType.UnsignedInt,
|
|
||||||
(void*)0,
|
|
||||||
(uint)grp.Count);
|
|
||||||
}
|
|
||||||
}
|
|
||||||
|
|
||||||
// ── Pass 2: Translucent (AlphaBlend, Additive, InvAlpha) ─────────────
|
|
||||||
_gl.Enable(EnableCap.Blend);
|
|
||||||
_gl.DepthMask(false);
|
|
||||||
// Diagnostic: ACDREAM_NO_CULL=1 disables backface culling (used 2026-05-01
|
|
||||||
// to test if our mesh winding (0,i,i+1) vs ACME's (i+1,i,0) is causing
|
|
||||||
// visible polygons to be culled, especially around the neck/coat seam).
|
|
||||||
if (string.Equals(Environment.GetEnvironmentVariable("ACDREAM_NO_CULL"), "1", StringComparison.Ordinal))
|
|
||||||
{
|
|
||||||
_gl.Disable(EnableCap.CullFace);
|
|
||||||
}
|
|
||||||
else
|
|
||||||
{
|
|
||||||
_gl.Enable(EnableCap.CullFace);
|
|
||||||
_gl.CullFace(TriangleFace.Back);
|
|
||||||
_gl.FrontFace(FrontFaceDirection.Ccw);
|
|
||||||
}
|
|
||||||
|
|
||||||
foreach (var (key, grp) in _groups)
|
|
||||||
{
|
|
||||||
if (!_gpuByGfxObj.TryGetValue(key.GfxObjId, out var subMeshes))
|
|
||||||
continue;
|
|
||||||
|
|
||||||
bool hasTranslucentSubMesh = false;
|
|
||||||
foreach (var sub in subMeshes)
|
|
||||||
{
|
|
||||||
if (sub.Translucency != TranslucencyKind.Opaque &&
|
|
||||||
sub.Translucency != TranslucencyKind.ClipMap)
|
|
||||||
{
|
|
||||||
hasTranslucentSubMesh = true;
|
|
||||||
break;
|
|
||||||
}
|
|
||||||
}
|
|
||||||
if (!hasTranslucentSubMesh) continue;
|
|
||||||
|
|
||||||
uint byteOffset = (uint)(grp.BufferOffset * 64);
|
|
||||||
|
|
||||||
foreach (var sub in subMeshes)
|
|
||||||
{
|
|
||||||
if (sub.Translucency == TranslucencyKind.Opaque ||
|
|
||||||
sub.Translucency == TranslucencyKind.ClipMap)
|
|
||||||
continue;
|
|
||||||
|
|
||||||
switch (sub.Translucency)
|
|
||||||
{
|
|
||||||
case TranslucencyKind.Additive:
|
|
||||||
_gl.BlendFunc(BlendingFactor.SrcAlpha, BlendingFactor.One);
|
|
||||||
break;
|
|
||||||
case TranslucencyKind.InvAlpha:
|
|
||||||
_gl.BlendFunc(BlendingFactor.OneMinusSrcAlpha, BlendingFactor.SrcAlpha);
|
|
||||||
break;
|
|
||||||
default: // AlphaBlend
|
|
||||||
_gl.BlendFunc(BlendingFactor.SrcAlpha, BlendingFactor.OneMinusSrcAlpha);
|
|
||||||
break;
|
|
||||||
}
|
|
||||||
|
|
||||||
_shader.SetInt("uTranslucencyKind", (int)sub.Translucency);
|
|
||||||
|
|
||||||
_gl.BindVertexArray(sub.Vao);
|
|
||||||
_gl.BindBuffer(BufferTargetARB.ArrayBuffer, _instanceVbo);
|
|
||||||
for (uint row = 0; row < 4; row++)
|
|
||||||
{
|
|
||||||
_gl.VertexAttribPointer(3 + row, 4, VertexAttribPointerType.Float,
|
|
||||||
false, 64, (void*)(byteOffset + row * 16));
|
|
||||||
}
|
|
||||||
|
|
||||||
if (grp.Count == 0) continue;
|
|
||||||
var firstEntry = grp.Entries[0];
|
|
||||||
uint tex = ResolveTex(firstEntry.Entity, firstEntry.MeshRef, sub);
|
|
||||||
_gl.ActiveTexture(TextureUnit.Texture0);
|
|
||||||
_gl.BindTexture(TextureTarget.Texture2D, tex);
|
|
||||||
|
|
||||||
_gl.DrawElementsInstanced(PrimitiveType.Triangles,
|
|
||||||
(uint)sub.IndexCount,
|
|
||||||
DrawElementsType.UnsignedInt,
|
|
||||||
(void*)0,
|
|
||||||
(uint)grp.Count);
|
|
||||||
}
|
|
||||||
}
|
|
||||||
|
|
||||||
// Restore default GL state.
|
|
||||||
_gl.DepthMask(true);
|
|
||||||
_gl.Disable(EnableCap.Blend);
|
|
||||||
_gl.Disable(EnableCap.CullFace);
|
|
||||||
_gl.BindVertexArray(0);
|
|
||||||
}
|
|
||||||
|
|
||||||
// ── Grouping ──────────────────────────────────────────────────────────────
|
|
||||||
|
|
||||||
/// <summary>
|
|
||||||
/// Iterates all visible landblock entries and groups every (entity, meshRef)
|
|
||||||
/// pair by (GfxObjId, texture signature). Clears the previous frame's entries before filling.
|
|
||||||
/// </summary>
|
|
||||||
private void CollectGroups(
|
|
||||||
IEnumerable<(uint LandblockId, Vector3 AabbMin, Vector3 AabbMax, IReadOnlyList<WorldEntity> Entities)> landblockEntries,
|
|
||||||
FrustumPlanes? frustum,
|
|
||||||
uint? neverCullLandblockId,
|
|
||||||
HashSet<uint>? visibleCellIds,
|
|
||||||
HashSet<uint>? animatedEntityIds)
|
|
||||||
{
|
|
||||||
foreach (var grp in _groups.Values)
|
|
||||||
grp.Entries.Clear();
|
|
||||||
|
|
||||||
foreach (var entry in landblockEntries)
|
|
||||||
{
|
|
||||||
// L-fix1 (2026-04-28): the landblock cull decision is now a
// per-landblock boolean rather than an early continue. We still
// need to walk the entity list because animated entities (in
// animatedEntityIds) bypass the cull and render anyway.
|
|
||||||
bool landblockVisible = frustum is null
|
|
||||||
|| entry.LandblockId == neverCullLandblockId
|
|
||||||
|| FrustumCuller.IsAabbVisible(frustum.Value, entry.AabbMin, entry.AabbMax);
|
|
||||||
|
|
||||||
// Fast path: no animated entities globally → if landblock is
|
|
||||||
// culled, skip the whole entity list (preserves the original
|
|
||||||
// O(visible-landblocks) cost when the caller doesn't care
|
|
||||||
// about animated bypass).
|
|
||||||
if (!landblockVisible && (animatedEntityIds is null || animatedEntityIds.Count == 0))
|
|
||||||
continue;
|
|
||||||
|
|
||||||
foreach (var entity in entry.Entities)
|
|
||||||
{
|
|
||||||
if (entity.MeshRefs.Count == 0)
|
|
||||||
continue;
|
|
||||||
|
|
||||||
// L-fix1: when the landblock is frustum-culled, only
|
|
||||||
// render entities flagged as animated. This keeps
|
|
||||||
// remote players / NPCs / monsters visible even when
|
|
||||||
// their landblock rotates out of the view frustum.
|
|
||||||
bool isAnimated = animatedEntityIds?.Contains(entity.Id) == true;
|
|
||||||
if (!landblockVisible && !isAnimated)
|
|
||||||
continue;
|
|
||||||
|
|
||||||
// Step 4: portal visibility filter. If we have a visible cell set,
|
|
||||||
// skip interior entities whose parent cell isn't visible.
|
|
||||||
// visibleCellIds == null means camera is outdoors → show all interiors.
|
|
||||||
if (entity.ParentCellId.HasValue && visibleCellIds is not null
|
|
||||||
&& !visibleCellIds.Contains(entity.ParentCellId.Value))
|
|
||||||
continue;
|
|
||||||
|
|
||||||
var entityRoot =
|
|
||||||
Matrix4x4.CreateFromQuaternion(entity.Rotation) *
|
|
||||||
Matrix4x4.CreateTranslation(entity.Position);
|
|
||||||
|
|
||||||
// Hash the entity's PaletteOverride once — shared by every
|
|
||||||
// MeshRef on this entity, so we compute it outside the loop.
|
|
||||||
ulong palHash = HashPaletteOverride(entity.PaletteOverride);
|
|
||||||
|
|
||||||
foreach (var meshRef in entity.MeshRefs)
|
|
||||||
{
|
|
||||||
if (!_gpuByGfxObj.TryGetValue(meshRef.GfxObjId, out var cachedMeshes))
|
|
||||||
continue;
|
|
||||||
|
|
||||||
var model = meshRef.PartTransform * entityRoot;
|
|
||||||
|
|
||||||
// Texture signature = palette hash ^ surface-overrides hash.
|
|
||||||
// Two instances can share a batch only when their ResolveTex
|
|
||||||
// would return identical handles for every sub-mesh — that
|
|
||||||
// means identical palette AND identical surface overrides.
|
|
||||||
ulong surfHash = HashSurfaceOverrides(meshRef.SurfaceOverrides);
|
|
||||||
ulong texSig = palHash ^ surfHash;
|
|
||||||
var key = new GroupKey(meshRef.GfxObjId, texSig);
|
|
||||||
|
|
||||||
if (!_groups.TryGetValue(key, out var group))
|
|
||||||
{
|
|
||||||
group = new InstanceGroup();
|
|
||||||
_groups[key] = group;
|
|
||||||
}
|
|
||||||
|
|
||||||
group.Entries.Add(new InstanceEntry(model, entity, meshRef));
|
|
||||||
}
|
|
||||||
}
|
|
||||||
}
|
|
||||||
}
|
|
||||||
|
|
||||||
private static ulong HashPaletteOverride(AcDream.Core.World.PaletteOverride? p)
|
|
||||||
{
|
|
||||||
if (p is null) return 0UL;
|
|
||||||
ulong h = 0xCBF29CE484222325UL;
|
|
||||||
const ulong prime = 0x100000001B3UL;
|
|
||||||
h = (h ^ p.BasePaletteId) * prime;
|
|
||||||
foreach (var sp in p.SubPalettes)
|
|
||||||
{
|
|
||||||
h = (h ^ sp.SubPaletteId) * prime;
|
|
||||||
h = (h ^ sp.Offset) * prime;
|
|
||||||
h = (h ^ sp.Length) * prime;
|
|
||||||
}
|
|
||||||
return h;
|
|
||||||
}
|
|
||||||
|
|
||||||
/// <summary>
|
|
||||||
/// Order-independent hash of a SurfaceOverrides dictionary. XOR of each
|
|
||||||
/// (key, value) pair keeps the result stable regardless of Dictionary
|
|
||||||
/// iteration order, so two instances whose override maps contain the
|
|
||||||
/// same pairs will hash identically.
|
|
||||||
/// </summary>
|
|
||||||
private static ulong HashSurfaceOverrides(IReadOnlyDictionary<uint, uint>? overrides)
|
|
||||||
{
|
|
||||||
if (overrides is null || overrides.Count == 0) return 0UL;
|
|
||||||
ulong acc = 0UL;
|
|
||||||
foreach (var kvp in overrides)
|
|
||||||
{
|
|
||||||
ulong pair = ((ulong)kvp.Key << 32) | kvp.Value;
|
|
||||||
acc ^= pair;
|
|
||||||
}
|
|
||||||
// Mix with the FNV offset basis and prime so a non-empty map whose pairs XOR to zero doesn't hash to 0, the value returned for "no overrides".
|
|
||||||
return (acc ^ 0xCBF29CE484222325UL) * 0x100000001B3UL;
|
|
||||||
}
|
|
||||||
|
|
||||||
// ── Matrix write ──────────────────────────────────────────────────────────
|
|
||||||
|
|
||||||
/// <summary>
|
|
||||||
/// Writes a System.Numerics Matrix4x4 into <paramref name="buf"/> starting
|
|
||||||
/// at <paramref name="offset"/> as 16 consecutive floats in row-major order
|
|
||||||
/// (the C# natural memory layout). The GLSL shader reads each 4-float row
|
|
||||||
/// as a column of the mat4 — identical to what UniformMatrix4(transpose=false)
|
|
||||||
/// produces for the uniform path.
|
|
||||||
/// </summary>
|
|
||||||
private static void WriteMatrix(float[] buf, int offset, in Matrix4x4 m)
|
|
||||||
{
|
|
||||||
buf[offset + 0] = m.M11; buf[offset + 1] = m.M12; buf[offset + 2] = m.M13; buf[offset + 3] = m.M14;
|
|
||||||
buf[offset + 4] = m.M21; buf[offset + 5] = m.M22; buf[offset + 6] = m.M23; buf[offset + 7] = m.M24;
|
|
||||||
buf[offset + 8] = m.M31; buf[offset + 9] = m.M32; buf[offset + 10] = m.M33; buf[offset + 11] = m.M34;
|
|
||||||
buf[offset + 12] = m.M41; buf[offset + 13] = m.M42; buf[offset + 14] = m.M43; buf[offset + 15] = m.M44;
|
|
||||||
}
|
|
||||||
|
|
||||||
// ── Texture resolution ────────────────────────────────────────────────────
|
|
||||||
|
|
||||||
private uint ResolveTex(WorldEntity entity, MeshRef meshRef, SubMeshGpu sub)
|
|
||||||
{
|
|
||||||
uint overrideOrigTex = 0;
|
|
||||||
bool hasOrigTexOverride = meshRef.SurfaceOverrides is not null
|
|
||||||
&& meshRef.SurfaceOverrides.TryGetValue(sub.SurfaceId, out overrideOrigTex);
|
|
||||||
uint? origTexOverride = hasOrigTexOverride ? overrideOrigTex : (uint?)null;
|
|
||||||
|
|
||||||
if (entity.PaletteOverride is not null)
|
|
||||||
{
|
|
||||||
return _textures.GetOrUploadWithPaletteOverride(
|
|
||||||
sub.SurfaceId, origTexOverride, entity.PaletteOverride);
|
|
||||||
}
|
|
||||||
else if (hasOrigTexOverride)
|
|
||||||
{
|
|
||||||
return _textures.GetOrUploadWithOrigTextureOverride(sub.SurfaceId, overrideOrigTex);
|
|
||||||
}
|
|
||||||
else
|
|
||||||
{
|
|
||||||
return _textures.GetOrUpload(sub.SurfaceId);
|
|
||||||
}
|
|
||||||
}
|
|
||||||
|
|
||||||
// ── Disposal ──────────────────────────────────────────────────────────────
|
|
||||||
|
|
||||||
public void Dispose()
|
|
||||||
{
|
|
||||||
foreach (var subs in _gpuByGfxObj.Values)
|
|
||||||
{
|
|
||||||
foreach (var sub in subs)
|
|
||||||
{
|
|
||||||
_gl.DeleteBuffer(sub.Vbo);
|
|
||||||
_gl.DeleteBuffer(sub.Ebo);
|
|
||||||
_gl.DeleteVertexArray(sub.Vao);
|
|
||||||
}
|
|
||||||
}
|
|
||||||
_gl.DeleteBuffer(_instanceVbo);
|
|
||||||
_gpuByGfxObj.Clear();
|
|
||||||
_groups.Clear();
|
|
||||||
}
|
|
||||||
|
|
||||||
// ── Private types ─────────────────────────────────────────────────────────
|
|
||||||
|
|
||||||
private sealed class SubMeshGpu
|
|
||||||
{
|
|
||||||
public uint Vao;
|
|
||||||
public uint Vbo;
|
|
||||||
public uint Ebo;
|
|
||||||
public int IndexCount;
|
|
||||||
public uint SurfaceId;
|
|
||||||
public TranslucencyKind Translucency;
|
|
||||||
}
|
|
||||||
|
|
||||||
/// <summary>
|
|
||||||
/// All instances sharing one (GfxObjId, texture signature) key for this frame, plus their starting offset
|
|
||||||
/// in the shared instance VBO (in units of instances, not bytes).
|
|
||||||
/// </summary>
|
|
||||||
private sealed class InstanceGroup
|
|
||||||
{
|
|
||||||
public readonly List<InstanceEntry> Entries = new();
|
|
||||||
public int BufferOffset;
|
|
||||||
|
|
||||||
public int Count => Entries.Count;
|
|
||||||
}
|
|
||||||
|
|
||||||
private readonly struct InstanceEntry
|
|
||||||
{
|
|
||||||
public readonly Matrix4x4 Model;
|
|
||||||
public readonly WorldEntity Entity;
|
|
||||||
public readonly MeshRef MeshRef;
|
|
||||||
|
|
||||||
public InstanceEntry(Matrix4x4 model, WorldEntity entity, MeshRef meshRef)
|
|
||||||
{
|
|
||||||
Model = model;
|
|
||||||
Entity = entity;
|
|
||||||
MeshRef = meshRef;
|
|
||||||
}
|
|
||||||
}
|
|
||||||
}
|
|
||||||
|
|
@ -1,35 +0,0 @@
|
||||||
#version 430 core
|
|
||||||
|
|
||||||
// Per-vertex attributes
|
|
||||||
layout(location = 0) in vec3 aPosition;
|
|
||||||
layout(location = 1) in vec3 aNormal;
|
|
||||||
layout(location = 2) in vec2 aTexCoord;
|
|
||||||
|
|
||||||
// Per-instance model matrix, split across four vec4 attribute slots.
|
|
||||||
// A mat4 consumes 4 consecutive attribute locations, so locations 3-6 are
|
|
||||||
// all occupied by this single logical matrix. The C# side must call
|
|
||||||
// VertexAttribPointer four times (one per row) and VertexAttribDivisor(loc, 1)
|
|
||||||
// on each of the four slots.
|
|
||||||
layout(location = 3) in vec4 aInstanceRow0;
|
|
||||||
layout(location = 4) in vec4 aInstanceRow1;
|
|
||||||
layout(location = 5) in vec4 aInstanceRow2;
|
|
||||||
layout(location = 6) in vec4 aInstanceRow3;
|
|
||||||
|
|
||||||
uniform mat4 uViewProjection;
|
|
||||||
|
|
||||||
out vec2 vTex;
|
|
||||||
out vec3 vWorldNormal;
|
|
||||||
out vec3 vWorldPos;
|
|
||||||
|
|
||||||
void main() {
|
|
||||||
// Reconstruct the per-instance model matrix. mat4(...) takes columns, so each C# row vector becomes a GLSL column.
|
|
||||||
mat4 model = mat4(aInstanceRow0, aInstanceRow1, aInstanceRow2, aInstanceRow3);
|
|
||||||
|
|
||||||
vec4 worldPos = model * vec4(aPosition, 1.0);
|
|
||||||
gl_Position = uViewProjection * worldPos;
|
|
||||||
|
|
||||||
vWorldPos = worldPos.xyz;
|
|
||||||
// Transform normal into world space.
|
|
||||||
vWorldNormal = normalize(mat3(model) * aNormal);
|
|
||||||
vTex = aTexCoord;
|
|
||||||
}
|
|
||||||
|
|
@ -1,24 +1,22 @@
|
||||||
#version 430 core
|
#version 430 core
|
||||||
|
#extension GL_ARB_bindless_texture : require
|
||||||
|
|
||||||
in vec2 vTex;
|
in vec3 vNormal;
|
||||||
in vec3 vWorldNormal;
|
in vec2 vTexCoord;
|
||||||
in vec3 vWorldPos;
|
in vec3 vWorldPos;
|
||||||
|
in flat uvec2 vTextureHandle;
|
||||||
|
in flat uint vTextureLayer;
|
||||||
|
|
||||||
out vec4 fragColor;
|
// uRenderPass values (Phase N.5 Decision 2 — two-pass alpha-test):
|
||||||
|
// 0 = opaque pass — discard fragments with alpha < 0.95
|
||||||
|
// (lets the depth write succeed for solid pixels)
|
||||||
|
// 1 = translucent pass — covers AlphaBlend / Additive / InvAlpha;
|
||||||
|
// discard alpha >= 0.95 (already drawn opaque) and
|
||||||
|
// alpha < 0.05 (skip empty fragments — large
|
||||||
|
// transparent overdraw cost otherwise)
|
||||||
|
uniform int uRenderPass;
|
||||||
|
|
||||||
// One 2D texture per draw call — same binding point as mesh.frag so the
|
// SceneLighting UBO — IDENTICAL layout to mesh_instanced.frag binding=1.
|
||||||
// C# side can use the same TextureCache without a texture-array pipeline.
|
|
||||||
uniform sampler2D uDiffuse;
|
|
||||||
|
|
||||||
// Translucency kind — matches TranslucencyKind C# enum (same as mesh.frag):
|
|
||||||
// 0 = Opaque — depth write+test, no blend; shader never discards
|
|
||||||
// 1 = ClipMap — alpha-key discard at 0.5 (doors, windows, vegetation)
|
|
||||||
// 2 = AlphaBlend — GL blending handles compositing; do NOT discard
|
|
||||||
// 3 = Additive — GL additive blending; do NOT discard
|
|
||||||
// 4 = InvAlpha — GL inverted-alpha blending; do NOT discard
|
|
||||||
uniform int uTranslucencyKind;
|
|
||||||
|
|
||||||
// Phase G.1+G.2: shared scene-lighting UBO (see mesh.frag for layout docs).
|
|
||||||
struct Light {
|
struct Light {
|
||||||
vec4 posAndKind;
|
vec4 posAndKind;
|
||||||
vec4 dirAndRange;
|
vec4 dirAndRange;
|
||||||
|
|
@ -38,10 +36,8 @@ vec3 accumulateLights(vec3 N, vec3 worldPos) {
|
||||||
int activeLights = int(uCellAmbient.w);
|
int activeLights = int(uCellAmbient.w);
|
||||||
for (int i = 0; i < 8; ++i) {
|
for (int i = 0; i < 8; ++i) {
|
||||||
if (i >= activeLights) break;
|
if (i >= activeLights) break;
|
||||||
|
|
||||||
int kind = int(uLights[i].posAndKind.w);
|
int kind = int(uLights[i].posAndKind.w);
|
||||||
vec3 Lcol = uLights[i].colorAndIntensity.xyz * uLights[i].colorAndIntensity.w;
|
vec3 Lcol = uLights[i].colorAndIntensity.xyz * uLights[i].colorAndIntensity.w;
|
||||||
|
|
||||||
if (kind == 0) {
|
if (kind == 0) {
|
||||||
vec3 Ldir = -uLights[i].dirAndRange.xyz;
|
vec3 Ldir = -uLights[i].dirAndRange.xyz;
|
||||||
float ndl = max(0.0, dot(N, Ldir));
|
float ndl = max(0.0, dot(N, Ldir));
|
||||||
|
|
@ -77,16 +73,24 @@ vec3 applyFog(vec3 lit, vec3 worldPos) {
|
||||||
return mix(lit, uFogColor.xyz, fog);
|
return mix(lit, uFogColor.xyz, fog);
|
||||||
}
|
}
|
||||||
|
|
||||||
|
out vec4 FragColor;
|
||||||
|
|
||||||
void main() {
|
void main() {
|
||||||
vec4 color = texture(uDiffuse, vTex);
|
sampler2DArray tex = sampler2DArray(vTextureHandle);
|
||||||
|
vec4 color = texture(tex, vec3(vTexCoord, float(vTextureLayer)));
|
||||||
|
|
||||||
// Alpha cutout only for clip-map surfaces (doors, windows, vegetation).
|
// Two-pass alpha-test (N.5 Decision 2).
|
||||||
if (uTranslucencyKind == 1 && color.a < 0.5) discard;
|
if (uRenderPass == 0) {
|
||||||
|
if (color.a < 0.95) discard; // opaque pass
|
||||||
|
} else {
|
||||||
|
if (color.a >= 0.95) discard; // transparent pass
|
||||||
|
if (color.a < 0.05) discard; // skip totally-empty
|
||||||
|
}
|
||||||
|
|
||||||
vec3 N = normalize(vWorldNormal);
|
vec3 N = normalize(vNormal);
|
||||||
vec3 lit = accumulateLights(N, vWorldPos);
|
vec3 lit = accumulateLights(N, vWorldPos);
|
||||||
|
|
||||||
// Lightning flash — additive scene bump.
|
// Lightning flash — additive scene bump (matches mesh_instanced.frag).
|
||||||
lit += uFogParams.z * vec3(0.6, 0.6, 0.75);
|
lit += uFogParams.z * vec3(0.6, 0.6, 0.75);
|
||||||
|
|
||||||
// Retail clamp per-channel to 1.0 (r13 §13.1).
|
// Retail clamp per-channel to 1.0 (r13 §13.1).
|
||||||
|
|
@ -94,5 +98,5 @@ void main() {
|
||||||
|
|
||||||
vec3 rgb = color.rgb * lit;
|
vec3 rgb = color.rgb * lit;
|
||||||
rgb = applyFog(rgb, vWorldPos);
|
rgb = applyFog(rgb, vWorldPos);
|
||||||
fragColor = vec4(rgb, color.a);
|
FragColor = vec4(rgb, color.a);
|
||||||
}
|
}
|
||||||
62
src/AcDream.App/Rendering/Shaders/mesh_modern.vert
Normal file
|
|
@ -0,0 +1,62 @@
|
||||||
|
#version 430 core
|
||||||
|
#extension GL_ARB_shader_draw_parameters : require
|
||||||
|
|
||||||
|
layout(location = 0) in vec3 aPosition;
|
||||||
|
layout(location = 1) in vec3 aNormal;
|
||||||
|
layout(location = 2) in vec2 aTexCoord;
|
||||||
|
|
||||||
|
struct InstanceData {
|
||||||
|
mat4 transform;
|
||||||
|
// Reserved for Phase B.4 follow-up (selection-blink retail-faithful
|
||||||
|
// highlight): vec4 highlightColor; — extend stride here, increase the
|
||||||
|
// _instanceSsbo upload size in WbDrawDispatcher, add a flat varying out,
|
||||||
|
// and consume in mesh_modern.frag.
|
||||||
|
};
|
||||||
|
|
||||||
|
struct BatchData {
|
||||||
|
uvec2 textureHandle; // bindless handle for sampler2DArray
|
||||||
|
uint textureLayer; // layer index (always 0 for per-instance composites)
|
||||||
|
uint flags; // reserved — N.5 dispatcher owns all blend state
|
||||||
|
// (glBlendFunc per pass). If a future phase wants
|
||||||
|
// shader-side per-batch additive flag (Decision 2
|
||||||
|
// fallback), encode it here as bit 0.
|
||||||
|
};
|
||||||
|
|
||||||
|
layout(std430, binding = 0) readonly buffer InstanceBuffer {
|
||||||
|
InstanceData Instances[];
|
||||||
|
};
|
||||||
|
|
||||||
|
// binding=1 here is the SSBO namespace — distinct from the UBO namespace.
|
||||||
|
// SceneLighting UBO also uses binding=1 in the fragment shader; GL keeps
|
||||||
|
// GL_SHADER_STORAGE_BUFFER and GL_UNIFORM_BUFFER binding tables separate.
|
||||||
|
// Task 10 dispatcher binds:
|
||||||
|
// glBindBufferBase(GL_SHADER_STORAGE_BUFFER, 0, instanceSsbo)
|
||||||
|
// glBindBufferBase(GL_SHADER_STORAGE_BUFFER, 1, batchSsbo)
|
||||||
|
// Existing SceneLightingUboBinding handles the UBO side.
|
||||||
|
layout(std430, binding = 1) readonly buffer BatchBuffer {
|
||||||
|
BatchData Batches[];
|
||||||
|
};
|
||||||
|
|
||||||
|
uniform mat4 uViewProjection;
|
||||||
|
|
||||||
|
out vec3 vNormal;
|
||||||
|
out vec2 vTexCoord;
|
||||||
|
out vec3 vWorldPos;
|
||||||
|
out flat uvec2 vTextureHandle;
|
||||||
|
out flat uint vTextureLayer;
|
||||||
|
|
||||||
|
void main() {
|
||||||
|
int instanceIndex = gl_BaseInstanceARB + gl_InstanceID;
|
||||||
|
mat4 model = Instances[instanceIndex].transform;
|
||||||
|
|
||||||
|
vec4 worldPos = model * vec4(aPosition, 1.0);
|
||||||
|
gl_Position = uViewProjection * worldPos;
|
||||||
|
|
||||||
|
vWorldPos = worldPos.xyz;
|
||||||
|
vNormal = normalize(mat3(model) * aNormal);
|
||||||
|
vTexCoord = aTexCoord;
|
||||||
|
|
||||||
|
BatchData b = Batches[gl_DrawIDARB];
|
||||||
|
vTextureHandle = b.textureHandle;
|
||||||
|
vTextureLayer = b.textureLayer;
|
||||||
|
}
|
||||||
|
|
@ -1,293 +0,0 @@
|
||||||
// src/AcDream.App/Rendering/StaticMeshRenderer.cs
|
|
||||||
using System.Numerics;
|
|
||||||
using AcDream.Core.Meshing;
|
|
||||||
using AcDream.Core.Terrain;
|
|
||||||
using AcDream.Core.World;
|
|
||||||
using Silk.NET.OpenGL;
|
|
||||||
|
|
||||||
namespace AcDream.App.Rendering;
|
|
||||||
|
|
||||||
public sealed unsafe class StaticMeshRenderer : IDisposable
|
|
||||||
{
|
|
||||||
private readonly GL _gl;
|
|
||||||
private readonly Shader _shader;
|
|
||||||
private readonly TextureCache _textures;
|
|
||||||
|
|
||||||
// One GPU bundle per unique GfxObj id. Each GfxObj can have multiple sub-meshes.
|
|
||||||
private readonly Dictionary<uint, List<SubMeshGpu>> _gpuByGfxObj = new();
|
|
||||||
|
|
||||||
public StaticMeshRenderer(GL gl, Shader shader, TextureCache textures)
|
|
||||||
{
|
|
||||||
_gl = gl;
|
|
||||||
_shader = shader;
|
|
||||||
_textures = textures;
|
|
||||||
}
|
|
||||||
|
|
||||||
public void EnsureUploaded(uint gfxObjId, IReadOnlyList<GfxObjSubMesh> subMeshes)
|
|
||||||
{
|
|
||||||
if (_gpuByGfxObj.ContainsKey(gfxObjId))
|
|
||||||
return;
|
|
||||||
|
|
||||||
var list = new List<SubMeshGpu>(subMeshes.Count);
|
|
||||||
foreach (var sm in subMeshes)
|
|
||||||
list.Add(UploadSubMesh(sm));
|
|
||||||
_gpuByGfxObj[gfxObjId] = list;
|
|
||||||
}
|
|
||||||
|
|
||||||
private SubMeshGpu UploadSubMesh(GfxObjSubMesh sm)
|
|
||||||
{
|
|
||||||
uint vao = _gl.GenVertexArray();
|
|
||||||
_gl.BindVertexArray(vao);
|
|
||||||
|
|
||||||
uint vbo = _gl.GenBuffer();
|
|
||||||
_gl.BindBuffer(BufferTargetARB.ArrayBuffer, vbo);
|
|
||||||
fixed (void* p = sm.Vertices)
|
|
||||||
_gl.BufferData(BufferTargetARB.ArrayBuffer,
|
|
||||||
(nuint)(sm.Vertices.Length * sizeof(Vertex)), p, BufferUsageARB.StaticDraw);
|
|
||||||
|
|
||||||
uint ebo = _gl.GenBuffer();
|
|
||||||
_gl.BindBuffer(BufferTargetARB.ElementArrayBuffer, ebo);
|
|
||||||
fixed (void* p = sm.Indices)
|
|
||||||
_gl.BufferData(BufferTargetARB.ElementArrayBuffer,
|
|
||||||
(nuint)(sm.Indices.Length * sizeof(uint)), p, BufferUsageARB.StaticDraw);
|
|
||||||
|
|
||||||
uint stride = (uint)sizeof(Vertex);
|
|
||||||
_gl.EnableVertexAttribArray(0);
|
|
||||||
_gl.VertexAttribPointer(0, 3, VertexAttribPointerType.Float, false, stride, (void*)0);
|
|
||||||
_gl.EnableVertexAttribArray(1);
|
|
||||||
_gl.VertexAttribPointer(1, 3, VertexAttribPointerType.Float, false, stride, (void*)(3 * sizeof(float)));
|
|
||||||
_gl.EnableVertexAttribArray(2);
|
|
||||||
_gl.VertexAttribPointer(2, 2, VertexAttribPointerType.Float, false, stride, (void*)(6 * sizeof(float)));
|
|
||||||
_gl.EnableVertexAttribArray(3);
|
|
||||||
_gl.VertexAttribIPointer(3, 1, VertexAttribIType.UnsignedInt, stride, (void*)(8 * sizeof(float)));
|
|
||||||
|
|
||||||
_gl.BindVertexArray(0);
|
|
||||||
|
|
||||||
return new SubMeshGpu
|
|
||||||
{
|
|
||||||
Vao = vao,
|
|
||||||
Vbo = vbo,
|
|
||||||
Ebo = ebo,
|
|
||||||
IndexCount = sm.Indices.Length,
|
|
||||||
SurfaceId = sm.SurfaceId,
|
|
||||||
// Capture translucency at upload time so the draw loop never
|
|
||||||
// has to look it up from external state.
|
|
||||||
Translucency = sm.Translucency,
|
|
||||||
};
|
|
||||||
}
|
|
||||||
|
|
||||||
public void Draw(ICamera camera,
|
|
||||||
IEnumerable<(uint LandblockId, Vector3 AabbMin, Vector3 AabbMax, IReadOnlyList<WorldEntity> Entities)> landblockEntries,
|
|
||||||
FrustumPlanes? frustum = null,
|
|
||||||
uint? neverCullLandblockId = null)
|
|
||||||
{
|
|
||||||
_shader.Use();
|
|
||||||
_shader.SetMatrix4("uView", camera.View);
|
|
||||||
_shader.SetMatrix4("uProjection", camera.Projection);
|
|
||||||
|
|
||||||
// ── Pass 1: Opaque + ClipMap ──────────────────────────────────────────
|
|
||||||
// Depth write on (default). No blending. ClipMap surfaces use the
|
|
||||||
// alpha-discard path in the fragment shader (uTranslucencyKind == 1).
|
|
||||||
foreach (var entry in landblockEntries)
|
|
||||||
{
|
|
||||||
// Per-landblock frustum cull. Never cull the player's landblock.
|
|
||||||
if (frustum is not null &&
|
|
||||||
entry.LandblockId != neverCullLandblockId &&
|
|
||||||
!FrustumCuller.IsAabbVisible(frustum.Value, entry.AabbMin, entry.AabbMax))
|
|
||||||
continue;
|
|
||||||
|
|
||||||
foreach (var entity in entry.Entities)
|
|
||||||
{
|
|
||||||
if (entity.MeshRefs.Count == 0)
|
|
||||||
continue;
|
|
||||||
|
|
||||||
foreach (var meshRef in entity.MeshRefs)
|
|
||||||
{
|
|
||||||
if (!_gpuByGfxObj.TryGetValue(meshRef.GfxObjId, out var subMeshes))
|
|
||||||
continue;
|
|
||||||
|
|
||||||
var entityRoot =
|
|
||||||
Matrix4x4.CreateFromQuaternion(entity.Rotation) *
|
|
||||||
Matrix4x4.CreateTranslation(entity.Position);
|
|
||||||
var model = meshRef.PartTransform * entityRoot;
|
|
||||||
_shader.SetMatrix4("uModel", model);
|
|
||||||
|
|
||||||
foreach (var sub in subMeshes)
|
|
||||||
{
|
|
||||||
// Skip translucent sub-meshes in the first pass.
|
|
||||||
if (sub.Translucency != TranslucencyKind.Opaque &&
|
|
||||||
sub.Translucency != TranslucencyKind.ClipMap)
|
|
||||||
continue;
|
|
||||||
|
|
||||||
_shader.SetInt("uTranslucencyKind", (int)sub.Translucency);
|
|
||||||
|
|
||||||
uint tex = ResolveTex(entity, meshRef, sub);
|
|
||||||
_gl.ActiveTexture(TextureUnit.Texture0);
|
|
||||||
_gl.BindTexture(TextureTarget.Texture2D, tex);
|
|
||||||
|
|
||||||
_gl.BindVertexArray(sub.Vao);
|
|
||||||
_gl.DrawElements(PrimitiveType.Triangles, (uint)sub.IndexCount, DrawElementsType.UnsignedInt, (void*)0);
|
|
||||||
}
|
|
||||||
}
|
|
||||||
}
|
|
||||||
}
|
|
||||||
|
|
||||||
// ── Pass 2: Translucent (AlphaBlend, Additive, InvAlpha) ─────────────
|
|
||||||
// Depth test on so translucents composite correctly behind opaque geometry.
|
|
||||||
// Depth write OFF so translucents don't occlude each other or downstream
|
|
||||||
// opaque draws. Blend function is set per-draw based on TranslucencyKind.
|
|
||||||
//
|
|
||||||
// NOTE: translucent draws are NOT sorted by depth — overlapping translucent
|
|
||||||
// surfaces can composite in the wrong order. Portal-sized billboards don't
|
|
||||||
// overlap in practice so this is acceptable and avoids a larger refactor.
|
|
||||||
_gl.Enable(EnableCap.Blend);
|
|
||||||
_gl.DepthMask(false);
|
|
||||||
|
|
||||||
// Phase 9.2: enable back-face culling for the translucent pass so
|
|
||||||
// closed-shell translucents (lifestone crystal, glow gems, any
|
|
||||||
// convex blended mesh) don't draw their back faces over their
|
|
||||||
// front faces in arbitrary iteration order. Without this, the
|
|
||||||
// 58 triangles of the lifestone crystal composited with an
|
|
||||||
// "inside-out" look where the user saw through one face into
|
|
||||||
// the hollow interior. With back-face culling on, back faces are
|
|
||||||
// dropped at rasterization time, front faces composite as-is,
|
|
||||||
// and depth ordering within the front-facing subset is a
|
|
||||||
// non-issue for closed convex-ish shells. Matches WorldBuilder's
|
|
||||||
// per-batch CullMode handling in
|
|
||||||
// references/WorldBuilder/Chorizite.OpenGLSDLBackend/Lib/
|
|
||||||
// BaseObjectRenderManager.cs:361-365.
|
|
||||||
//
|
|
||||||
// Our fan triangulation emits pos-side polygons as
|
|
||||||
// (0, i, i+1) which is CCW in standard OpenGL conventions, so
|
|
||||||
// GL_BACK + CCW front is the correct state. Neg-side polygons
|
|
||||||
// (if any) use reversed winding and get culled here — that's a
|
|
||||||
// known limitation and matches the opaque-pass behavior since
|
|
||||||
// neg-side polys are virtually never translucent in AC content.
|
|
||||||
_gl.Enable(EnableCap.CullFace);
|
|
||||||
_gl.CullFace(TriangleFace.Back);
|
|
||||||
_gl.FrontFace(FrontFaceDirection.Ccw);
|
|
||||||
|
|
||||||
foreach (var entry in landblockEntries)
|
|
||||||
{
|
|
||||||
// Same per-landblock frustum cull for pass 2.
|
|
||||||
if (frustum is not null &&
|
|
||||||
entry.LandblockId != neverCullLandblockId &&
|
|
||||||
!FrustumCuller.IsAabbVisible(frustum.Value, entry.AabbMin, entry.AabbMax))
|
|
||||||
continue;
|
|
||||||
|
|
||||||
foreach (var entity in entry.Entities)
|
|
||||||
{
|
|
||||||
if (entity.MeshRefs.Count == 0)
|
|
||||||
continue;
|
|
||||||
|
|
||||||
foreach (var meshRef in entity.MeshRefs)
|
|
||||||
{
|
|
||||||
if (!_gpuByGfxObj.TryGetValue(meshRef.GfxObjId, out var subMeshes))
|
|
||||||
continue;
|
|
||||||
|
|
||||||
var entityRoot =
|
|
||||||
Matrix4x4.CreateFromQuaternion(entity.Rotation) *
|
|
||||||
Matrix4x4.CreateTranslation(entity.Position);
|
|
||||||
var model = meshRef.PartTransform * entityRoot;
|
|
||||||
_shader.SetMatrix4("uModel", model);
|
|
||||||
|
|
||||||
foreach (var sub in subMeshes)
|
|
||||||
{
|
|
||||||
if (sub.Translucency == TranslucencyKind.Opaque ||
|
|
||||||
sub.Translucency == TranslucencyKind.ClipMap)
|
|
||||||
continue;
|
|
||||||
|
|
||||||
// Set per-draw blend function.
|
|
||||||
switch (sub.Translucency)
|
|
||||||
{
|
|
||||||
case TranslucencyKind.Additive:
|
|
||||||
// src*a + dst — portal swirls, glows
|
|
||||||
_gl.BlendFunc(BlendingFactor.SrcAlpha, BlendingFactor.One);
|
|
||||||
break;
|
|
||||||
|
|
||||||
case TranslucencyKind.InvAlpha:
|
|
||||||
// src*(1-a) + dst*a
|
|
||||||
_gl.BlendFunc(BlendingFactor.OneMinusSrcAlpha, BlendingFactor.SrcAlpha);
|
|
||||||
break;
|
|
||||||
|
|
||||||
default: // AlphaBlend
|
|
||||||
// src*a + dst*(1-a)
|
|
||||||
_gl.BlendFunc(BlendingFactor.SrcAlpha, BlendingFactor.OneMinusSrcAlpha);
|
|
||||||
break;
|
|
||||||
}
|
|
||||||
|
|
||||||
_shader.SetInt("uTranslucencyKind", (int)sub.Translucency);
|
|
||||||
|
|
||||||
uint tex = ResolveTex(entity, meshRef, sub);
|
|
||||||
_gl.ActiveTexture(TextureUnit.Texture0);
|
|
||||||
_gl.BindTexture(TextureTarget.Texture2D, tex);
|
|
||||||
|
|
||||||
_gl.BindVertexArray(sub.Vao);
|
|
||||||
_gl.DrawElements(PrimitiveType.Triangles, (uint)sub.IndexCount, DrawElementsType.UnsignedInt, (void*)0);
|
|
||||||
}
|
|
||||||
}
|
|
||||||
}
|
|
||||||
}
|
|
||||||
|
|
||||||
// Restore default GL state for subsequent renderers (terrain etc.).
|
|
||||||
_gl.DepthMask(true);
|
|
||||||
_gl.Disable(EnableCap.Blend);
|
|
||||||
_gl.Disable(EnableCap.CullFace);
|
|
||||||
|
|
||||||
_gl.BindVertexArray(0);
|
|
||||||
}
|
|
||||||
|
|
||||||
/// <summary>
|
|
||||||
/// Resolves the GL texture id for a sub-mesh, honouring palette and
|
|
||||||
/// texture overrides carried on the entity and the mesh-ref.
|
|
||||||
/// </summary>
|
|
||||||
private uint ResolveTex(WorldEntity entity, MeshRef meshRef, SubMeshGpu sub)
|
|
||||||
{
|
|
||||||
uint overrideOrigTex = 0;
|
|
||||||
bool hasOrigTexOverride = meshRef.SurfaceOverrides is not null
|
|
||||||
&& meshRef.SurfaceOverrides.TryGetValue(sub.SurfaceId, out overrideOrigTex);
|
|
||||||
uint? origTexOverride = hasOrigTexOverride ? overrideOrigTex : (uint?)null;
|
|
||||||
|
|
||||||
if (entity.PaletteOverride is not null)
|
|
||||||
{
|
|
||||||
return _textures.GetOrUploadWithPaletteOverride(
|
|
||||||
sub.SurfaceId, origTexOverride, entity.PaletteOverride);
|
|
||||||
}
|
|
||||||
else if (hasOrigTexOverride)
|
|
||||||
{
|
|
||||||
return _textures.GetOrUploadWithOrigTextureOverride(sub.SurfaceId, overrideOrigTex);
|
|
||||||
}
|
|
||||||
else
|
|
||||||
{
|
|
||||||
return _textures.GetOrUpload(sub.SurfaceId);
|
|
||||||
}
|
|
||||||
}
|
|
||||||
|
|
||||||
public void Dispose()
|
|
||||||
{
|
|
||||||
foreach (var subs in _gpuByGfxObj.Values)
|
|
||||||
{
|
|
||||||
foreach (var sub in subs)
|
|
||||||
{
|
|
||||||
_gl.DeleteBuffer(sub.Vbo);
|
|
||||||
_gl.DeleteBuffer(sub.Ebo);
|
|
||||||
_gl.DeleteVertexArray(sub.Vao);
|
|
||||||
}
|
|
||||||
}
|
|
||||||
_gpuByGfxObj.Clear();
|
|
||||||
}
|
|
||||||
|
|
||||||
private sealed class SubMeshGpu
|
|
||||||
{
|
|
||||||
public uint Vao;
|
|
||||||
public uint Vbo;
|
|
||||||
public uint Ebo;
|
|
||||||
public int IndexCount;
|
|
||||||
public uint SurfaceId;
|
|
||||||
/// <summary>
|
|
||||||
/// Cached from GfxObjSubMesh.Translucency at upload time.
|
|
||||||
/// Avoids any per-draw lookup into external state.
|
|
||||||
/// </summary>
|
|
||||||
public TranslucencyKind Translucency;
|
|
||||||
}
|
|
||||||
}
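Note on the blend switch in the removed pass 2: the three `BlendFunc` settings correspond to the per-channel equations written in its comments. The following is a minimal sketch of that arithmetic, assuming a single colour channel and ignoring framebuffer clamping. The `Blend` helper and the string kind names are hypothetical stand-ins for `TranslucencyKind`, not code from this commit.

```csharp
using System;

// Hypothetical helper (not part of the commit): evaluates
// result = src * srcFactor + dst * dstFactor for one colour channel,
// using the factor pairs the removed pass 2 handed to BlendFunc.
static float Blend(string kind, float src, float srcAlpha, float dst) => kind switch
{
    "Additive" => src * srcAlpha + dst,                     // (SrcAlpha, One)
    "InvAlpha" => src * (1f - srcAlpha) + dst * srcAlpha,   // (OneMinusSrcAlpha, SrcAlpha)
    _          => src * srcAlpha + dst * (1f - srcAlpha),   // AlphaBlend: (SrcAlpha, OneMinusSrcAlpha)
};

// Made-up inputs chosen to be exact in binary floating point.
float additive   = Blend("Additive",   1.0f, 0.25f, 0.5f);  // 1*0.25 + 0.5
float invAlpha   = Blend("InvAlpha",   1.0f, 0.25f, 0.5f);  // 1*0.75 + 0.5*0.25
float alphaBlend = Blend("AlphaBlend", 1.0f, 0.25f, 0.5f);  // 1*0.25 + 0.5*0.75

Console.WriteLine($"{additive} {invAlpha} {alphaBlend}");
```

Additive output only ever brightens the destination, which is why the unsorted draw order noted in the pass 2 comment matters less for glows than for the other two kinds.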
|
|
||||||
|
|
@ -29,10 +29,22 @@ public sealed unsafe class TextureCache : Wb.ITextureCachePerInstance, IDisposab
|
||||||
private readonly Dictionary<(uint surfaceId, uint origTexOverride, ulong paletteHash), uint> _handlesByPalette = new();
|
private readonly Dictionary<(uint surfaceId, uint origTexOverride, ulong paletteHash), uint> _handlesByPalette = new();
|
||||||
private uint _magentaHandle;
|
private uint _magentaHandle;
|
||||||
|
|
||||||
public TextureCache(GL gl, DatCollection dats)
|
private readonly Wb.BindlessSupport? _bindless;
|
||||||
|
|
||||||
|
// Bindless / Texture2DArray parallel caches. Keys mirror the three legacy
// caches, so a surface used by both the legacy (Texture2D, sampler2D) and
// modern (Texture2DArray, sampler2DArray) paths is uploaded twice, once
// per target. Each entry stores both the GL texture name (for Dispose
// cleanup) and the resident bindless handle (returned to callers).
|
||||||
|
private readonly Dictionary<uint, (uint Name, ulong Handle)> _bindlessBySurfaceId = new();
|
||||||
|
private readonly Dictionary<(uint surfaceId, uint origTexOverride), (uint Name, ulong Handle)> _bindlessByOverridden = new();
|
||||||
|
private readonly Dictionary<(uint surfaceId, uint origTexOverride, ulong paletteHash), (uint Name, ulong Handle)> _bindlessByPalette = new();
|
||||||
|
|
||||||
|
public TextureCache(GL gl, DatCollection dats, Wb.BindlessSupport? bindless = null)
|
||||||
{
|
{
|
||||||
_gl = gl;
|
_gl = gl;
|
||||||
_dats = dats;
|
_dats = dats;
|
||||||
|
_bindless = bindless;
|
||||||
}
|
}
|
||||||
|
|
||||||
/// <summary>
|
/// <summary>
|
||||||
|
|
@ -149,6 +161,82 @@ public sealed unsafe class TextureCache : Wb.ITextureCachePerInstance, IDisposab
|
||||||
return h;
|
return h;
|
||||||
}
|
}
|
||||||
|
|
||||||
|
/// <summary>
|
||||||
|
/// 64-bit bindless handle variant of <see cref="GetOrUpload"/> for the WB
|
||||||
|
/// modern rendering path. Uploads the texture as a 1-layer Texture2DArray
|
||||||
|
/// (so the shader's <c>sampler2DArray</c> can sample at layer 0) and returns
|
||||||
|
/// a resident bindless handle. Caches by surfaceId in a separate dictionary
|
||||||
|
/// from the legacy Texture2D path; the same surface may be uploaded twice
|
||||||
|
/// if used by both paths (acceptable transition cost — N.6 deletes the legacy
|
||||||
|
/// path).
|
||||||
|
/// Throws if BindlessSupport wasn't provided to the constructor.
|
||||||
|
/// </summary>
|
||||||
|
public ulong GetOrUploadBindless(uint surfaceId)
|
||||||
|
{
|
||||||
|
EnsureBindlessAvailable();
|
||||||
|
if (_bindlessBySurfaceId.TryGetValue(surfaceId, out var entry))
|
||||||
|
return entry.Handle;
|
||||||
|
var decoded = DecodeFromDats(surfaceId, origTextureOverride: null, paletteOverride: null);
|
||||||
|
uint name = UploadRgba8AsLayer1Array(decoded);
|
||||||
|
ulong handle = _bindless!.GetResidentHandle(name);
|
||||||
|
_bindlessBySurfaceId[surfaceId] = (name, handle);
|
||||||
|
return handle;
|
||||||
|
}
|
||||||
|
|
||||||
|
/// <summary>
|
||||||
|
/// 64-bit bindless handle variant of <see cref="GetOrUploadWithOrigTextureOverride"/>
|
||||||
|
/// for the WB modern rendering path. Uploads the texture as a 1-layer
|
||||||
|
/// Texture2DArray with the override SurfaceTexture id and returns a resident
|
||||||
|
/// bindless handle. Caches under a separate composite key from the legacy
|
||||||
|
/// path. Throws if BindlessSupport wasn't provided to the constructor.
|
||||||
|
/// </summary>
|
||||||
|
public ulong GetOrUploadWithOrigTextureOverrideBindless(uint surfaceId, uint overrideOrigTextureId)
|
||||||
|
{
|
||||||
|
EnsureBindlessAvailable();
|
||||||
|
var key = (surfaceId, overrideOrigTextureId);
|
||||||
|
if (_bindlessByOverridden.TryGetValue(key, out var entry))
|
||||||
|
return entry.Handle;
|
||||||
|
var decoded = DecodeFromDats(surfaceId, origTextureOverride: overrideOrigTextureId, paletteOverride: null);
|
||||||
|
uint name = UploadRgba8AsLayer1Array(decoded);
|
||||||
|
ulong handle = _bindless!.GetResidentHandle(name);
|
||||||
|
_bindlessByOverridden[key] = (name, handle);
|
||||||
|
return handle;
|
||||||
|
}
|
||||||
|
|
||||||
|
/// <summary>
|
||||||
|
/// 64-bit bindless handle variant of <see cref="GetOrUploadWithPaletteOverride"/>
|
||||||
|
/// for the WB modern rendering path. Applies the palette override on top of
|
||||||
|
/// the texture's default palette before decoding, uploads as a 1-layer
|
||||||
|
/// Texture2DArray, and returns a resident bindless handle. Takes a
|
||||||
|
/// precomputed palette hash so the WB dispatcher can compute it once per
|
||||||
|
/// entity. Throws if BindlessSupport wasn't provided to the constructor.
|
||||||
|
/// </summary>
|
||||||
|
public ulong GetOrUploadWithPaletteOverrideBindless(
|
||||||
|
uint surfaceId,
|
||||||
|
uint? overrideOrigTextureId,
|
||||||
|
PaletteOverride paletteOverride,
|
||||||
|
ulong precomputedPaletteHash)
|
||||||
|
{
|
||||||
|
EnsureBindlessAvailable();
|
||||||
|
uint origTexKey = overrideOrigTextureId ?? 0;
|
||||||
|
var key = (surfaceId, origTexKey, precomputedPaletteHash);
|
||||||
|
if (_bindlessByPalette.TryGetValue(key, out var entry))
|
||||||
|
return entry.Handle;
|
||||||
|
var decoded = DecodeFromDats(surfaceId, origTextureOverride: overrideOrigTextureId, paletteOverride: paletteOverride);
|
||||||
|
uint name = UploadRgba8AsLayer1Array(decoded);
|
||||||
|
ulong handle = _bindless!.GetResidentHandle(name);
|
||||||
|
_bindlessByPalette[key] = (name, handle);
|
||||||
|
return handle;
|
||||||
|
}
|
||||||
|
|
||||||
|
private void EnsureBindlessAvailable()
|
||||||
|
{
|
||||||
|
if (_bindless is null)
|
||||||
|
throw new InvalidOperationException(
|
||||||
|
"TextureCache constructed without BindlessSupport — cannot generate bindless handles. " +
|
||||||
|
"WbDrawDispatcher requires the bindless-aware ctor overload (pass non-null BindlessSupport).");
|
||||||
|
}
|
||||||
|
|
||||||
/// <summary>
|
/// <summary>
|
||||||
/// Cheap 64-bit hash over a palette override's identity so two
|
/// Cheap 64-bit hash over a palette override's identity so two
|
||||||
/// entities with the same palette setup share a decode. Internal so
|
/// entities with the same palette setup share a decode. Internal so
|
||||||
|
|
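Note on the palette-override cache key in the hunk above: the key is the triple (surface id, override texture id or 0, precomputed palette hash), so two entities with the same palette setup share one upload. The sketch below illustrates that lookup under stated assumptions. `GetOrUploadSketch`, the upload counter, and the fake handle values are hypothetical; the real method decodes from the dats and asks `BindlessSupport` for a resident handle.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the palette-override cache in TextureCache.
var cache = new Dictionary<(uint surfaceId, uint origTexOverride, ulong paletteHash), ulong>();
int uploads = 0;

ulong GetOrUploadSketch(uint surfaceId, uint? overrideOrigTextureId, ulong paletteHash)
{
    // A missing override is folded to 0, as in the hunk above. This assumes
    // 0 is never a real override id; otherwise the two cases would collide.
    var key = (surfaceId, overrideOrigTextureId ?? 0u, paletteHash);
    if (cache.TryGetValue(key, out ulong handle))
        return handle;

    uploads++;                           // stand-in for decode + texture upload
    handle = 0x1000UL + (ulong)uploads;  // stand-in for a resident bindless handle
    cache[key] = handle;
    return handle;
}

ulong a = GetOrUploadSketch(0x08000123, null, 0xABCDUL);
ulong b = GetOrUploadSketch(0x08000123, null, 0xABCDUL); // same key: cache hit
ulong c = GetOrUploadSketch(0x08000123, null, 0xBEEFUL); // different palette: new upload

Console.WriteLine($"uploads={uploads}, shared={a == b}, distinct={a != c}");
```

Passing the hash in, rather than recomputing it per sub-mesh, is what lets a dispatcher hash a palette once per entity and reuse it for every batch of that entity.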
@ -279,17 +367,79 @@ public sealed unsafe class TextureCache : Wb.ITextureCachePerInstance, IDisposab
|
||||||
return tex;
|
return tex;
|
||||||
}
|
}
|
||||||
|
|
||||||
|
/// <summary>
|
||||||
|
/// Variant of <see cref="UploadRgba8"/> that uploads pixel data as a 1-layer
|
||||||
|
/// Texture2DArray. Required by the WB modern rendering path which samples via
|
||||||
|
/// sampler2DArray in its bindless shader. Pixel data is identical.
|
||||||
|
/// </summary>
|
||||||
|
private uint UploadRgba8AsLayer1Array(DecodedTexture decoded)
|
||||||
|
{
|
||||||
|
uint tex = _gl.GenTexture();
|
||||||
|
_gl.BindTexture(TextureTarget.Texture2DArray, tex);
|
||||||
|
|
||||||
|
fixed (byte* p = decoded.Rgba8)
|
||||||
|
_gl.TexImage3D(
|
||||||
|
TextureTarget.Texture2DArray,
|
||||||
|
0,
|
||||||
|
InternalFormat.Rgba8,
|
||||||
|
(uint)decoded.Width,
|
||||||
|
(uint)decoded.Height,
|
||||||
|
depth: 1,
|
||||||
|
border: 0,
|
||||||
|
PixelFormat.Rgba,
|
||||||
|
PixelType.UnsignedByte,
|
||||||
|
p);
|
||||||
|
|
||||||
|
_gl.TexParameter(TextureTarget.Texture2DArray, TextureParameterName.TextureMinFilter, (int)TextureMinFilter.Linear);
|
||||||
|
_gl.TexParameter(TextureTarget.Texture2DArray, TextureParameterName.TextureMagFilter, (int)TextureMagFilter.Linear);
|
||||||
|
_gl.TexParameter(TextureTarget.Texture2DArray, TextureParameterName.TextureWrapS, (int)TextureWrapMode.Repeat);
|
||||||
|
_gl.TexParameter(TextureTarget.Texture2DArray, TextureParameterName.TextureWrapT, (int)TextureWrapMode.Repeat);
|
||||||
|
|
||||||
|
_gl.BindTexture(TextureTarget.Texture2DArray, 0);
|
||||||
|
return tex;
|
||||||
|
}
|
||||||
|
|
||||||
public void Dispose()
|
public void Dispose()
|
||||||
{
|
{
|
||||||
|
// Phase 1: make all bindless handles non-resident BEFORE any
|
||||||
|
// DeleteTexture call. ARB_bindless_texture requires that resident
|
||||||
|
// handles be released before their backing texture is deleted —
|
||||||
|
// interleaving per-entry is UB. Single null-guard around the whole
|
||||||
|
// block (cleaner than per-call null-conditionals).
|
||||||
|
if (_bindless is not null)
|
||||||
|
{
|
||||||
|
foreach (var (_, handle) in _bindlessBySurfaceId.Values)
|
||||||
|
_bindless.MakeNonResident(handle);
|
||||||
|
foreach (var (_, handle) in _bindlessByOverridden.Values)
|
||||||
|
_bindless.MakeNonResident(handle);
|
||||||
|
foreach (var (_, handle) in _bindlessByPalette.Values)
|
||||||
|
_bindless.MakeNonResident(handle);
|
||||||
|
}
|
||||||
|
|
||||||
|
// Phase 2: delete the Texture2DArray textures backing those handles.
|
||||||
|
foreach (var (name, _) in _bindlessBySurfaceId.Values)
|
||||||
|
_gl.DeleteTexture(name);
|
||||||
|
_bindlessBySurfaceId.Clear();
|
||||||
|
foreach (var (name, _) in _bindlessByOverridden.Values)
|
||||||
|
_gl.DeleteTexture(name);
|
||||||
|
_bindlessByOverridden.Clear();
|
||||||
|
foreach (var (name, _) in _bindlessByPalette.Values)
|
||||||
|
_gl.DeleteTexture(name);
|
||||||
|
_bindlessByPalette.Clear();
|
||||||
|
|
||||||
|
// Phase 3: legacy Texture2D textures.
|
||||||
foreach (var h in _handlesBySurfaceId.Values)
|
foreach (var h in _handlesBySurfaceId.Values)
|
||||||
_gl.DeleteTexture(h);
|
_gl.DeleteTexture(h);
|
||||||
_handlesBySurfaceId.Clear();
|
_handlesBySurfaceId.Clear();
|
||||||
|
|
||||||
foreach (var h in _handlesByOverridden.Values)
|
foreach (var h in _handlesByOverridden.Values)
|
||||||
_gl.DeleteTexture(h);
|
_gl.DeleteTexture(h);
|
||||||
_handlesByOverridden.Clear();
|
_handlesByOverridden.Clear();
|
||||||
|
|
||||||
foreach (var h in _handlesByPalette.Values)
|
foreach (var h in _handlesByPalette.Values)
|
||||||
_gl.DeleteTexture(h);
|
_gl.DeleteTexture(h);
|
||||||
_handlesByPalette.Clear();
|
_handlesByPalette.Clear();
|
||||||
|
|
||||||
if (_magentaHandle != 0)
|
if (_magentaHandle != 0)
|
||||||
{
|
{
|
||||||
_gl.DeleteTexture(_magentaHandle);
|
_gl.DeleteTexture(_magentaHandle);
|
||||||
|
|
|
||||||
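Note on the `Dispose` ordering in the hunk above: every handle is made non-resident first, and only then are the backing textures deleted. The following sketch records that call order with plain delegates. The `makeNonResident` and `deleteTexture` delegates and the sample ids are hypothetical stand-ins for the GL calls, assumed only for illustration.

```csharp
using System;
using System.Collections.Generic;

var log = new List<string>();

// Hypothetical stand-ins for BindlessSupport.MakeNonResident and GL.DeleteTexture.
Action<ulong> makeNonResident = handle => log.Add($"nonresident {handle}");
Action<uint> deleteTexture = name => log.Add($"delete {name}");

// Same value shape as the bindless caches: (GL texture name, resident handle).
var bindlessBySurfaceId = new Dictionary<uint, (uint Name, ulong Handle)>
{
    [0x08000001] = (Name: 11, Handle: 1001),
    [0x08000002] = (Name: 12, Handle: 1002),
};

// Phase 1: release residency for every handle before any delete.
foreach (var (_, handle) in bindlessBySurfaceId.Values)
    makeNonResident(handle);

// Phase 2: delete the textures that backed those handles.
foreach (var (name, _) in bindlessBySurfaceId.Values)
    deleteTexture(name);
bindlessBySurfaceId.Clear();

foreach (string line in log)
    Console.WriteLine(line);
```

Splitting the work into two full passes costs a second enumeration of each dictionary, but it keeps the ordering guarantee independent of how many caches exist.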
55
src/AcDream.App/Rendering/Wb/BindlessSupport.cs
Normal file
@ -0,0 +1,55 @@
using Silk.NET.OpenGL;
using Silk.NET.OpenGL.Extensions.ARB;

namespace AcDream.App.Rendering.Wb;

/// <summary>
/// Thin wrapper around <see cref="ArbBindlessTexture"/> + capability detection
/// for the modern rendering path. Constructed once at startup via
/// <see cref="TryCreate"/>, which returns false if the extension isn't present.
/// </summary>
public sealed class BindlessSupport
{
    private readonly ArbBindlessTexture _ext;

    private BindlessSupport(ArbBindlessTexture extension)
    {
        _ext = extension;
    }

    public static bool TryCreate(GL gl, out BindlessSupport? support)
    {
        if (gl.TryGetExtension<ArbBindlessTexture>(out var ext))
        {
            support = new BindlessSupport(ext);
            return true;
        }
        support = null;
        return false;
    }

    /// <summary>Get a 64-bit bindless handle for the texture and make it resident.
    /// Idempotent: handle is the same for a given texture name.</summary>
    public ulong GetResidentHandle(uint textureName)
    {
        ulong h = _ext.GetTextureHandle(textureName);
        if (!_ext.IsTextureHandleResident(h))
            _ext.MakeTextureHandleResident(h);
        return h;
    }

    /// <summary>Release residency for a handle. Call before deleting the underlying texture.</summary>
    public void MakeNonResident(ulong handle)
    {
        if (_ext.IsTextureHandleResident(handle))
            _ext.MakeTextureHandleNonResident(handle);
    }

    /// <summary>Detect <c>GL_ARB_shader_draw_parameters</c> in addition to bindless.
    /// N.5's vertex shader uses <c>gl_BaseInstanceARB</c> and <c>gl_DrawIDARB</c>
    /// from this extension.</summary>
    public bool HasShaderDrawParameters(GL gl)
    {
        return gl.IsExtensionPresent("GL_ARB_shader_draw_parameters");
    }
}
|
||||||
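Note on how `BindlessSupport` might be used at startup: the commit message says a missing `GL_ARB_bindless_texture` or `GL_ARB_shader_draw_parameters` throws `NotSupportedException` with a clear message. The sketch below shows one way such a gate could look. It is not the commit's startup code: `RequireModernPath` and the `isExtensionPresent` probe are hypothetical, and the probe replaces the real GL query so the example runs without a GL context.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical startup gate (names are illustrative, not from the commit).
// isExtensionPresent stands in for a real GL extension query.
static void RequireModernPath(Func<string, bool> isExtensionPresent)
{
    string[] required = { "GL_ARB_bindless_texture", "GL_ARB_shader_draw_parameters" };
    var missing = new List<string>();
    foreach (string name in required)
        if (!isExtensionPresent(name))
            missing.Add(name);

    // Report every missing extension at once so the user sees the full picture.
    if (missing.Count > 0)
        throw new NotSupportedException(
            "Modern rendering path requires: " + string.Join(", ", missing));
}

// Simulated driver that exposes bindless textures but not draw parameters.
var driverExtensions = new HashSet<string> { "GL_ARB_bindless_texture" };
string? error = null;
try { RequireModernPath(driverExtensions.Contains); }
catch (NotSupportedException ex) { error = ex.Message; }

Console.WriteLine(error ?? "modern path available");
```

Collecting all missing names before throwing is a design choice of this sketch; failing on the first missing extension would also satisfy the behaviour the commit message describes.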
17
src/AcDream.App/Rendering/Wb/DrawElementsIndirectCommand.cs
Normal file
@ -0,0 +1,17 @@
using System.Runtime.InteropServices;

namespace AcDream.App.Rendering.Wb;

/// <summary>
/// Layout matches what <c>glMultiDrawElementsIndirect</c> expects.
/// Total size 20 bytes; arrays are typically uploaded with stride = sizeof(this).
/// </summary>
[StructLayout(LayoutKind.Sequential, Pack = 4)]
public struct DrawElementsIndirectCommand
{
    public uint Count;          // index count for this draw
    public uint InstanceCount;  // number of instances
    public uint FirstIndex;     // offset into IBO, in indices
    public int BaseVertex;      // vertex offset into VBO
    public uint BaseInstance;   // first instance ID (offsets per-instance attribs / SSBO read)
}
|
|
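Note on the 20-byte command layout above: because the opaque and transparent passes are issued as two multi-draw calls, the second call needs a byte offset into the command buffer. The sketch below checks the stride and derives such an offset. It assumes opaque commands are packed first and transparent commands directly after them, which the diff does not show explicitly; the counts and index values are made up.

```csharp
using System;
using System.Runtime.InteropServices;

int stride = Marshal.SizeOf<DrawElementsIndirectCommand>(); // 5 fields x 4 bytes

// Assumed packing: opaque commands first, transparent commands after them.
int opaqueDrawCount = 3;
int transparentDrawCount = 2;
var commands = new DrawElementsIndirectCommand[opaqueDrawCount + transparentDrawCount];

uint nextInstance = 0;
for (int i = 0; i < commands.Length; i++)
{
    uint instances = (uint)(i + 1); // made-up instance count per group
    commands[i] = new DrawElementsIndirectCommand
    {
        Count = 36,
        InstanceCount = instances,
        FirstIndex = 0,
        BaseVertex = 0,
        BaseInstance = nextInstance, // first matrix of this group in the instance buffer
    };
    nextInstance += instances;
}

// Byte offset at which the transparent commands begin.
int transparentByteOffset = opaqueDrawCount * stride;
Console.WriteLine($"stride={stride}, transparentByteOffset={transparentByteOffset}");

// Redeclared here only so the sketch is self-contained.
[StructLayout(LayoutKind.Sequential, Pack = 4)]
public struct DrawElementsIndirectCommand
{
    public uint Count;
    public uint InstanceCount;
    public uint FirstIndex;
    public int BaseVertex;
    public uint BaseInstance;
}
```

Running prefix sums for `BaseInstance` keep each group's matrices contiguous, so a shader can index the instance buffer with the base instance plus the instance id.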
@ -1,6 +1,7 @@
|
||||||
using System;
|
using System;
|
||||||
using System.Collections.Generic;
|
using System.Collections.Generic;
|
||||||
using System.Numerics;
|
using System.Numerics;
|
||||||
|
using System.Runtime.InteropServices;
|
||||||
using AcDream.Core.Meshing;
|
using AcDream.Core.Meshing;
|
||||||
using AcDream.Core.Terrain;
|
using AcDream.Core.Terrain;
|
||||||
using AcDream.Core.World;
|
using AcDream.Core.World;
|
||||||
|
|
@ -12,45 +13,49 @@ namespace AcDream.App.Rendering.Wb;
|
||||||
/// <summary>
|
/// <summary>
|
||||||
/// Draws entities using WB's <see cref="ObjectRenderData"/> (a single global
|
/// Draws entities using WB's <see cref="ObjectRenderData"/> (a single global
|
||||||
/// VAO/VBO/IBO under modern rendering) with acdream's <see cref="TextureCache"/>
|
/// VAO/VBO/IBO under modern rendering) with acdream's <see cref="TextureCache"/>
|
||||||
/// for texture resolution and <see cref="AcSurfaceMetadataTable"/> for
|
/// for bindless texture resolution and <see cref="AcSurfaceMetadataTable"/> for
|
||||||
/// translucency classification.
|
/// translucency classification.
|
||||||
///
|
///
|
||||||
/// <para>
|
/// <para>
|
||||||
/// <b>Atlas-tier</b> entities (<c>ServerGuid == 0</c>): mesh data comes from WB's
|
/// <b>Atlas-tier</b> entities (<c>ServerGuid == 0</c>): mesh data comes from WB's
|
||||||
/// <see cref="ObjectMeshManager"/> via <see cref="WbMeshAdapter.TryGetRenderData"/>.
|
/// <see cref="ObjectMeshManager"/> via <see cref="WbMeshAdapter.TryGetRenderData"/>.
|
||||||
/// Textures resolve through <see cref="TextureCache.GetOrUpload"/> using the batch's
|
/// Textures resolve through the bindless-suffixed
|
||||||
/// <c>SurfaceId</c>.
|
/// <see cref="TextureCache.GetOrUploadBindless"/> variants, returning 64-bit
|
||||||
|
/// resident handles stored in the per-group SSBO.
|
||||||
/// </para>
|
/// </para>
|
||||||
///
|
///
|
||||||
/// <para>
|
/// <para>
|
||||||
/// <b>Per-instance-tier</b> entities (<c>ServerGuid != 0</c>): mesh data also from
|
/// <b>Per-instance-tier</b> entities (<c>ServerGuid != 0</c>): mesh data also from
|
||||||
/// WB, but textures resolve through <see cref="TextureCache"/> with palette and
|
/// WB, but textures resolve through
|
||||||
/// surface overrides applied. <see cref="AnimatedEntityState"/> is currently
|
/// <see cref="TextureCache.GetOrUploadWithPaletteOverrideBindless"/> with palette
|
||||||
|
/// and surface overrides applied. <see cref="AnimatedEntityState"/> is currently
|
||||||
/// unused at draw time — GameWindow's spawn path already bakes AnimPartChanges +
|
/// unused at draw time — GameWindow's spawn path already bakes AnimPartChanges +
|
||||||
/// GfxObjDegradeResolver (Issue #47 close-detail mesh) into <c>MeshRefs</c>.
|
/// GfxObjDegradeResolver (Issue #47 close-detail mesh) into <c>MeshRefs</c>.
|
||||||
/// </para>
|
/// </para>
|
||||||
///
|
///
|
||||||
/// <para>
|
/// <para>
|
||||||
/// <b>GL strategy:</b> GROUPED instanced drawing. All visible (entity, batch)
|
/// <b>GL strategy (N.5 — mandatory):</b> <c>glMultiDrawElementsIndirect</c> with SSBOs
|
||||||
/// pairs are bucketed by <see cref="GroupKey"/>; within a group a single
|
/// and <c>GL_ARB_bindless_texture</c> + <c>GL_ARB_shader_draw_parameters</c>.
|
||||||
/// <c>glDrawElementsInstancedBaseVertexBaseInstance</c> renders all instances.
|
/// All visible (entity, batch) pairs are bucketed by <see cref="GroupKey"/>;
|
||||||
/// All matrices for the frame land in one shared instance VBO via a single
|
/// each group becomes one <c>DrawElementsIndirectCommand</c>. Three GPU buffers
|
||||||
/// <c>BufferData</c> upload. This drops draw calls from O(entities×batches)
|
/// are uploaded per frame: instance matrices (SSBO binding 0), per-group batch
|
||||||
/// to O(unique GfxObj×batch×texture) — typically two orders of magnitude fewer.
|
/// metadata/texture handles (SSBO binding 1), and the indirect draw commands.
|
||||||
|
/// Two <c>glMultiDrawElementsIndirect</c> calls cover the opaque and transparent
|
||||||
|
/// passes respectively — one GL call per pass regardless of group count.
|
||||||
/// </para>
|
/// </para>
|
||||||
///
|
///
|
||||||
/// <para>
|
/// <para>
|
||||||
/// <b>Shader:</b> reuses <c>mesh_instanced</c> (vert locations 0-2 = Position/
|
/// <b>Shader:</b> <c>mesh_modern</c> (bindless + <c>gl_DrawIDARB</c> /
|
||||||
/// Normal/UV from WB's <c>VertexPositionNormalTexture</c>; locations 3-6 = instance
|
/// <c>gl_BaseInstanceARB</c>). Missing bindless/draw-parameters throws
|
||||||
/// matrix from our VBO). WB's 32-byte vertex stride is compatible.
|
/// <see cref="NotSupportedException"/> at startup — there is no legacy fallback.
|
||||||
/// </para>
|
/// </para>
|
||||||
///
|
///
|
||||||
/// <para>
|
/// <para>
|
||||||
/// <b>Modern rendering assumption:</b> WB's <c>_useModernRendering</c> path (GL
|
/// <b>Modern rendering assumption:</b> WB's <c>_useModernRendering</c> path (GL
|
||||||
/// 4.3 + bindless) puts every mesh in a single shared VAO/VBO/IBO and uses
|
/// 4.3 + bindless) puts every mesh in a single shared VAO/VBO/IBO and uses
|
||||||
/// <c>FirstIndex</c> + <c>BaseVertex</c> per batch. The dispatcher honors those
|
/// <c>FirstIndex</c> + <c>BaseVertex</c> per batch. The dispatcher honors those
|
||||||
/// offsets via <c>DrawElementsInstancedBaseVertex(BaseInstance)</c>. The legacy
|
/// offsets inside each <c>DrawElementsIndirectCommand</c> via
|
||||||
/// per-mesh-VAO path also works since FirstIndex/BaseVertex are zero there.
|
/// <c>glMultiDrawElementsIndirect</c>.
|
||||||
/// </para>
|
/// </para>
|
||||||
/// </summary>
|
/// </summary>
|
||||||
public sealed unsafe class WbDrawDispatcher : IDisposable
|
public sealed unsafe class WbDrawDispatcher : IDisposable
|
||||||
|
|
@ -61,14 +66,40 @@ public sealed unsafe class WbDrawDispatcher : IDisposable
|
||||||
private readonly WbMeshAdapter _meshAdapter;
|
private readonly WbMeshAdapter _meshAdapter;
|
||||||
private readonly EntitySpawnAdapter _entitySpawnAdapter;
|
private readonly EntitySpawnAdapter _entitySpawnAdapter;
|
||||||
|
|
||||||
private readonly uint _instanceVbo;
|
private readonly BindlessSupport _bindless;
|
||||||
private readonly HashSet<uint> _patchedVaos = new();
|
|
||||||
|
// SSBO buffer ids
|
||||||
|
private uint _instanceSsbo;
|
||||||
|
private uint _batchSsbo;
|
||||||
|
private uint _indirectBuffer;
|
||||||
|
|
||||||
|
// Per-frame scratch arrays — Tasks 9-10 fully wire these.
|
||||||
|
private float[] _instanceData = new float[256 * 16]; // mat4 floats per instance
|
||||||
|
private BatchData[] _batchData = new BatchData[256];
|
||||||
|
private DrawElementsIndirectCommand[] _indirectCommands = new DrawElementsIndirectCommand[256];
|
||||||
|
|
||||||
|
private int _opaqueDrawCount;
|
||||||
|
private int _transparentDrawCount;
|
||||||
|
private int _transparentByteOffset;
|
||||||
|
|
||||||
|
// std430 layout: ulong TextureHandle (uvec2) at offset 0, uint TextureLayer
|
||||||
|
// at offset 8, uint Flags at offset 12. Total 16 bytes.
|
||||||
|
// Pack=8 (not 4) because std430's uvec2 requires 8-byte alignment — Pack=4
|
||||||
|
// works today by accident (TextureHandle is the first field, so offset 0 is
|
||||||
|
// always 8-byte aligned), but adding a 4-byte field before TextureHandle
|
||||||
|
// without bumping Pack would silently misalign the GPU struct.
|
||||||
|
[StructLayout(LayoutKind.Sequential, Pack = 8)]
|
||||||
|
private struct BatchData
|
||||||
|
{
|
||||||
|
public ulong TextureHandle; // bindless handle (uvec2 in GLSL)
|
||||||
|
public uint TextureLayer;
|
||||||
|
public uint Flags;
|
||||||
|
}
|
||||||
|
|
||||||
// Per-frame scratch — reused across frames to avoid per-frame allocation.
|
// Per-frame scratch — reused across frames to avoid per-frame allocation.
|
||||||
private readonly Dictionary<GroupKey, InstanceGroup> _groups = new();
|
private readonly Dictionary<GroupKey, InstanceGroup> _groups = new();
|
||||||
private readonly List<InstanceGroup> _opaqueDraws = new();
|
private readonly List<InstanceGroup> _opaqueDraws = new();
|
||||||
private readonly List<InstanceGroup> _translucentDraws = new();
|
private readonly List<InstanceGroup> _translucentDraws = new();
|
||||||
private float[] _instanceBuffer = new float[256 * 16]; // grow on demand, never shrink
|
|
||||||
|
|
||||||
// Per-entity-cull AABB radius. Conservative — covers most entities; large
|
// Per-entity-cull AABB radius. Conservative — covers most entities; large
|
||||||
// outliers (long banners, tall columns) are still landblock-culled.
|
// outliers (long banners, tall columns) are still landblock-culled.
|
||||||
|
|
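Note on the `BatchData` layout comment in the hunk above: the claimed offsets (0, 8, 12) and the 16-byte total can be checked on the CPU side with `Marshal`. The sketch below does that for a copy of the struct. It only verifies the managed layout; whether the GLSL side declares a matching std430 block is an assumption taken from the comment.

```csharp
using System;
using System.Runtime.InteropServices;

int size = Marshal.SizeOf<BatchData>();
int handleOffset = (int)Marshal.OffsetOf<BatchData>(nameof(BatchData.TextureHandle));
int layerOffset  = (int)Marshal.OffsetOf<BatchData>(nameof(BatchData.TextureLayer));
int flagsOffset  = (int)Marshal.OffsetOf<BatchData>(nameof(BatchData.Flags));

Console.WriteLine($"size={size}, offsets={handleOffset}/{layerOffset}/{flagsOffset}");

// Copy of the struct from the diff, redeclared so the sketch is self-contained.
[StructLayout(LayoutKind.Sequential, Pack = 8)]
struct BatchData
{
    public ulong TextureHandle; // bindless handle (uvec2 in GLSL)
    public uint TextureLayer;
    public uint Flags;
}
```

A check like this could live in a unit test, so that adding a field ahead of `TextureHandle` fails loudly instead of silently shifting the GPU-visible layout.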
@ -84,12 +115,23 @@ public sealed unsafe class WbDrawDispatcher : IDisposable
|
||||||
private int _instancesIssued;
|
private int _instancesIssued;
|
||||||
private long _lastLogTick;
|
private long _lastLogTick;
|
||||||
|
|
||||||
|
// CPU + GPU timing for [WB-DIAG] under ACDREAM_WB_DIAG=1.
|
||||||
|
private readonly System.Diagnostics.Stopwatch _cpuStopwatch = new();
|
||||||
|
private readonly long[] _cpuSamples = new long[256]; // microseconds
|
||||||
|
private int _cpuSampleCursor;
|
||||||
|
private uint _gpuQueryOpaque;
|
||||||
|
private uint _gpuQueryTransparent;
|
||||||
|
private readonly long[] _gpuSamples = new long[256]; // microseconds
|
||||||
|
private int _gpuSampleCursor;
|
||||||
|
private bool _gpuQueriesInitialized;
|
||||||
|
|
||||||
public WbDrawDispatcher(
|
public WbDrawDispatcher(
|
||||||
GL gl,
|
GL gl,
|
||||||
Shader shader,
|
Shader shader,
|
||||||
TextureCache textures,
|
TextureCache textures,
|
||||||
WbMeshAdapter meshAdapter,
|
WbMeshAdapter meshAdapter,
|
||||||
EntitySpawnAdapter entitySpawnAdapter)
|
EntitySpawnAdapter entitySpawnAdapter,
|
||||||
|
BindlessSupport bindless)
|
||||||
{
|
{
|
||||||
ArgumentNullException.ThrowIfNull(gl);
|
ArgumentNullException.ThrowIfNull(gl);
|
||||||
ArgumentNullException.ThrowIfNull(shader);
|
ArgumentNullException.ThrowIfNull(shader);
|
||||||
|
|
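Note on the 256-entry `_cpuSamples` / `_gpuSamples` arrays in the hunk above: they look like ring buffers written through a cursor, and the commit message reports a median frame time. The sketch below shows one way a median could be taken from such a buffer. The `count` variable, the `Record` and `Median` helpers, and the sample values are assumptions for illustration; the diff shows only the arrays and their cursors.

```csharp
using System;

long[] samples = new long[256]; // microseconds, same capacity as _cpuSamples
int cursor = 0;
int count = 0;                  // assumption: the diff only shows the cursor

void Record(long microseconds)
{
    samples[cursor] = microseconds;
    cursor = (cursor + 1) % samples.Length; // overwrite the oldest entry once full
    if (count < samples.Length) count++;
}

long Median()
{
    if (count == 0) return 0;
    long[] sorted = samples[..count];       // copy, so the ring itself stays in write order
    Array.Sort(sorted);
    return sorted[count / 2];
}

// Made-up frame times with one outlier.
foreach (long us in new long[] { 900, 1100, 4000, 1000, 1200 })
    Record(us);

Console.WriteLine($"median = {Median()} us");
```

A median is less sensitive than a mean to the occasional slow frame, which is presumably why it is the figure quoted for the Holtburg measurement.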
@ -103,7 +145,10 @@ public sealed unsafe class WbDrawDispatcher : IDisposable
|
||||||
_meshAdapter = meshAdapter;
|
_meshAdapter = meshAdapter;
|
||||||
_entitySpawnAdapter = entitySpawnAdapter;
|
_entitySpawnAdapter = entitySpawnAdapter;
|
||||||
|
|
||||||
_instanceVbo = _gl.GenBuffer();
|
_bindless = bindless ?? throw new ArgumentNullException(nameof(bindless));
|
||||||
|
_instanceSsbo = _gl.GenBuffer();
|
||||||
|
_batchSsbo = _gl.GenBuffer();
|
||||||
|
_indirectBuffer = _gl.GenBuffer();
|
||||||
}
|
}
|
||||||
|
|
||||||
public static Matrix4x4 ComposePartWorldMatrix(
|
public static Matrix4x4 ComposePartWorldMatrix(
|
||||||
|
|
@ -126,6 +171,16 @@ public sealed unsafe class WbDrawDispatcher : IDisposable
|
||||||
|
|
||||||
bool diag = string.Equals(Environment.GetEnvironmentVariable("ACDREAM_WB_DIAG"), "1", StringComparison.Ordinal);
|
bool diag = string.Equals(Environment.GetEnvironmentVariable("ACDREAM_WB_DIAG"), "1", StringComparison.Ordinal);
|
||||||
|
|
||||||
|
if (diag && !_gpuQueriesInitialized)
|
||||||
|
{
|
||||||
|
_gpuQueryOpaque = _gl.GenQuery();
|
||||||
|
_gpuQueryTransparent = _gl.GenQuery();
|
||||||
|
_gpuQueriesInitialized = true;
|
||||||
|
}
|
||||||
|
|
||||||
|
// Always run the CPU stopwatch — cheap; only logged under diag.
|
||||||
|
_cpuStopwatch.Restart();
|
||||||
|
|
||||||
// Camera world-space position for front-to-back sort (perf #2). The view
|
// Camera world-space position for front-to-back sort (perf #2). The view
|
||||||
// matrix is the inverse of the camera's world transform, so the world
|
// matrix is the inverse of the camera's world transform, so the world
|
||||||
// translation lives in the inverse's translation row.
|
// translation lives in the inverse's translation row.
|
||||||
|
|
@ -235,23 +290,24 @@ public sealed unsafe class WbDrawDispatcher : IDisposable
|
||||||
// Nothing visible — skip the GL pass entirely.
|
// Nothing visible — skip the GL pass entirely.
|
||||||
if (anyVao == 0)
|
if (anyVao == 0)
|
||||||
{
|
{
|
||||||
|
_cpuStopwatch.Stop();
|
||||||
if (diag) MaybeFlushDiag();
|
if (diag) MaybeFlushDiag();
|
||||||
return;
|
return;
|
||||||
}
|
}
|
||||||
|
|
||||||
// ── Phase 2: lay matrices out contiguously, assign per-group offsets,
|
// ── Phase 3: assign FirstInstance per group, lay matrices contiguously, sort opaque ──
|
||||||
// split into opaque/translucent + compute sort keys ─────────
|
|
||||||
int totalInstances = 0;
|
int totalInstances = 0;
|
||||||
foreach (var grp in _groups.Values) totalInstances += grp.Matrices.Count;
|
foreach (var grp in _groups.Values) totalInstances += grp.Matrices.Count;
|
||||||
if (totalInstances == 0)
|
if (totalInstances == 0)
|
||||||
{
|
{
|
||||||
|
_cpuStopwatch.Stop();
|
||||||
if (diag) MaybeFlushDiag();
|
if (diag) MaybeFlushDiag();
|
||||||
return;
|
return;
|
||||||
}
|
}
|
||||||
|
|
||||||
int needed = totalInstances * 16;
|
int needed = totalInstances * 16;
|
||||||
if (_instanceBuffer.Length < needed)
|
if (_instanceData.Length < needed)
|
||||||
_instanceBuffer = new float[needed + 256 * 16]; // headroom
|
_instanceData = new float[needed + 256 * 16];
|
||||||
|
|
||||||
_opaqueDraws.Clear();
|
_opaqueDraws.Clear();
|
||||||
_translucentDraws.Clear();
|
_translucentDraws.Clear();
|
||||||
|
|
@ -268,17 +324,17 @@ public sealed unsafe class WbDrawDispatcher : IDisposable
|
||||||
// position for front-to-back sort (perf #2). Cheap heuristic; works
|
// position for front-to-back sort (perf #2). Cheap heuristic; works
|
||||||
// well when instances of one group are spatially coherent
|
// well when instances of one group are spatially coherent
|
||||||
// (typical for trees in one landblock area, NPCs at one spawn).
|
// (typical for trees in one landblock area, NPCs at one spawn).
|
||||||
var firstM = grp.Matrices[0];
|
var first = grp.Matrices[0];
|
||||||
var grpPos = new Vector3(firstM.M41, firstM.M42, firstM.M43);
|
var grpPos = new Vector3(first.M41, first.M42, first.M43);
|
||||||
grp.SortDistance = Vector3.DistanceSquared(camPos, grpPos);
|
grp.SortDistance = Vector3.DistanceSquared(camPos, grpPos);
|
||||||
|
|
||||||
for (int i = 0; i < grp.Matrices.Count; i++)
|
for (int i = 0; i < grp.Matrices.Count; i++)
|
||||||
{
|
{
|
||||||
WriteMatrix(_instanceBuffer, cursor * 16, grp.Matrices[i]);
|
WriteMatrix(_instanceData, cursor * 16, grp.Matrices[i]);
|
||||||
cursor++;
|
cursor++;
|
||||||
}
|
}
|
||||||
|
|
||||||
if (grp.Translucency == TranslucencyKind.Opaque || grp.Translucency == TranslucencyKind.ClipMap)
|
if (IsOpaque(grp.Translucency))
|
||||||
_opaqueDraws.Add(grp);
|
_opaqueDraws.Add(grp);
|
||||||
else
|
else
|
||||||
_translucentDraws.Add(grp);
|
_translucentDraws.Add(grp);
|
||||||
|
|
@ -290,90 +346,141 @@ public sealed unsafe class WbDrawDispatcher : IDisposable
|
||||||
// Foundry interior).
|
// Foundry interior).
|
||||||
_opaqueDraws.Sort(static (a, b) => a.SortDistance.CompareTo(b.SortDistance));
|
_opaqueDraws.Sort(static (a, b) => a.SortDistance.CompareTo(b.SortDistance));
|
||||||
|
|
||||||
// ── Phase 3: one upload of all matrices ─────────────────────────────
|
// ── Phase 4: build IndirectGroupInput list (opaque sorted, then translucent),
|
||||||
_gl.BindBuffer(BufferTargetARB.ArrayBuffer, _instanceVbo);
|
// fill via BuildIndirectArrays ──────────────────────────────────
|
||||||
fixed (float* p = _instanceBuffer)
|
int totalDraws = _opaqueDraws.Count + _translucentDraws.Count;
|
||||||
_gl.BufferData(BufferTargetARB.ArrayBuffer,
|
if (_batchData.Length < totalDraws)
|
||||||
(nuint)(totalInstances * 16 * sizeof(float)), p, BufferUsageARB.DynamicDraw);
|
_batchData = new BatchData[totalDraws + 64];
|
||||||
|
if (_indirectCommands.Length < totalDraws)
|
||||||
|
_indirectCommands = new DrawElementsIndirectCommand[totalDraws + 64];
|
||||||
|
|
||||||
// ── Phase 4: bind VAO once (modern rendering shares one global VAO) ──
|
var groupInputs = new List<IndirectGroupInput>(totalDraws);
|
||||||
EnsureInstanceAttribs(anyVao);
|
foreach (var g in _opaqueDraws) groupInputs.Add(ToInput(g));
|
||||||
|
foreach (var g in _translucentDraws) groupInputs.Add(ToInput(g));
|
||||||
|
|
||||||
|
// Cast _batchData (private BatchData) to public-mirror BatchDataPublic for BuildIndirectArrays.
|
||||||
|
// Layout is asserted at test time (BatchDataPublic_LayoutMatchesPrivateBatchData test).
|
||||||
|
var batchPublic = new BatchDataPublic[totalDraws];
|
||||||
|
var layout = BuildIndirectArrays(groupInputs, _indirectCommands, batchPublic);
|
||||||
|
|
||||||
|
// Copy back into _batchData
|
||||||
|
for (int i = 0; i < totalDraws; i++)
|
||||||
|
{
|
||||||
|
_batchData[i] = new BatchData
|
||||||
|
{
|
||||||
|
TextureHandle = batchPublic[i].TextureHandle,
|
||||||
|
TextureLayer = batchPublic[i].TextureLayer,
|
||||||
|
Flags = batchPublic[i].Flags,
|
||||||
|
};
|
||||||
|
}
|
||||||
|
_opaqueDrawCount = layout.OpaqueCount;
|
||||||
|
_transparentDrawCount = layout.TransparentCount;
|
||||||
|
_transparentByteOffset = layout.TransparentByteOffset;
|
||||||
|
|
||||||
|
// ── Phase 5: upload three buffers ───────────────────────────────────
|
||||||
|
fixed (float* ip = _instanceData)
|
||||||
|
UploadSsbo(_instanceSsbo, 0, ip, totalInstances * 16 * sizeof(float));
|
||||||
|
|
||||||
|
fixed (BatchData* bp = _batchData)
|
||||||
|
UploadSsbo(_batchSsbo, 1, bp, totalDraws * sizeof(BatchData));
|
||||||
|
|
||||||
|
fixed (DrawElementsIndirectCommand* cp = _indirectCommands)
|
||||||
|
{
|
||||||
|
_gl.BindBuffer(BufferTargetARB.DrawIndirectBuffer, _indirectBuffer);
|
||||||
|
_gl.BufferData(BufferTargetARB.DrawIndirectBuffer,
|
||||||
|
(nuint)(totalDraws * sizeof(DrawElementsIndirectCommand)), cp, BufferUsageARB.DynamicDraw);
|
||||||
|
}
|
||||||
|
|
||||||
|
// ── Phase 6: bind global VAO once ───────────────────────────────────
|
||||||
_gl.BindVertexArray(anyVao);
|
_gl.BindVertexArray(anyVao);
|
||||||
|
|
||||||
// ── Phase 5: opaque + ClipMap pass (front-to-back sorted) ───────────
|
|
||||||
if (string.Equals(Environment.GetEnvironmentVariable("ACDREAM_NO_CULL"), "1", StringComparison.Ordinal))
|
if (string.Equals(Environment.GetEnvironmentVariable("ACDREAM_NO_CULL"), "1", StringComparison.Ordinal))
|
||||||
_gl.Disable(EnableCap.CullFace);
|
_gl.Disable(EnableCap.CullFace);
|
||||||
|
|
||||||
foreach (var grp in _opaqueDraws)
|
// ── Phase 7: opaque pass ─────────────────────────────────────────────
|
||||||
|
if (_opaqueDrawCount > 0)
|
||||||
{
|
{
|
||||||
_shader.SetInt("uTranslucencyKind", (int)grp.Translucency);
|
_gl.Disable(EnableCap.Blend);
|
||||||
DrawGroup(grp);
|
_gl.DepthMask(true);
|
||||||
|
_shader.SetInt("uRenderPass", 0);
|
||||||
|
_gl.BindBuffer(BufferTargetARB.DrawIndirectBuffer, _indirectBuffer);
|
||||||
|
if (diag && _gpuQueriesInitialized) _gl.BeginQuery(QueryTarget.TimeElapsed, _gpuQueryOpaque);
|
||||||
|
_gl.MultiDrawElementsIndirect(
|
||||||
|
PrimitiveType.Triangles,
|
||||||
|
DrawElementsType.UnsignedShort,
|
||||||
|
(void*)0,
|
||||||
|
(uint)_opaqueDrawCount,
|
||||||
|
(uint)DrawCommandStride);
|
||||||
|
if (diag && _gpuQueriesInitialized) _gl.EndQuery(QueryTarget.TimeElapsed);
|
||||||
}
|
}
|
||||||
|
|
||||||
// ── Phase 6: translucent pass ───────────────────────────────────────
|
// ── Phase 8: transparent pass ────────────────────────────────────────
|
||||||
_gl.Enable(EnableCap.Blend);
|
if (_transparentDrawCount > 0)
|
||||||
_gl.DepthMask(false);
|
|
||||||
|
|
||||||
if (string.Equals(Environment.GetEnvironmentVariable("ACDREAM_NO_CULL"), "1", StringComparison.Ordinal))
|
|
||||||
{
|
{
|
||||||
_gl.Disable(EnableCap.CullFace);
|
_gl.Enable(EnableCap.Blend);
|
||||||
}
|
_gl.BlendFunc(BlendingFactor.SrcAlpha, BlendingFactor.OneMinusSrcAlpha);
|
||||||
else
|
_gl.DepthMask(false);
|
||||||
{
|
_shader.SetInt("uRenderPass", 1);
|
||||||
_gl.Enable(EnableCap.CullFace);
|
if (diag && _gpuQueriesInitialized) _gl.BeginQuery(QueryTarget.TimeElapsed, _gpuQueryTransparent);
|
||||||
_gl.CullFace(TriangleFace.Back);
|
_gl.MultiDrawElementsIndirect(
|
||||||
_gl.FrontFace(FrontFaceDirection.Ccw);
|
PrimitiveType.Triangles,
|
||||||
|
DrawElementsType.UnsignedShort,
|
||||||
|
(void*)_transparentByteOffset,
|
||||||
|
(uint)_transparentDrawCount,
|
||||||
|
(uint)DrawCommandStride);
|
||||||
|
if (diag && _gpuQueriesInitialized) _gl.EndQuery(QueryTarget.TimeElapsed);
|
||||||
|
_gl.DepthMask(true);
|
||||||
|
_gl.Disable(EnableCap.Blend);
|
||||||
}
|
}
|
||||||
|
|
||||||
foreach (var grp in _translucentDraws)
|
|
||||||
{
|
|
||||||
switch (grp.Translucency)
|
|
||||||
{
|
|
||||||
case TranslucencyKind.Additive:
|
|
||||||
_gl.BlendFunc(BlendingFactor.SrcAlpha, BlendingFactor.One);
|
|
||||||
break;
|
|
||||||
case TranslucencyKind.InvAlpha:
|
|
||||||
_gl.BlendFunc(BlendingFactor.OneMinusSrcAlpha, BlendingFactor.SrcAlpha);
|
|
||||||
break;
|
|
||||||
default:
|
|
||||||
_gl.BlendFunc(BlendingFactor.SrcAlpha, BlendingFactor.OneMinusSrcAlpha);
|
|
||||||
break;
|
|
||||||
}
|
|
||||||
|
|
||||||
_shader.SetInt("uTranslucencyKind", (int)grp.Translucency);
|
|
||||||
DrawGroup(grp);
|
|
||||||
}
|
|
||||||
|
|
||||||
_gl.DepthMask(true);
|
|
||||||
_gl.Disable(EnableCap.Blend);
|
|
||||||
_gl.Disable(EnableCap.CullFace);
|
_gl.Disable(EnableCap.CullFace);
|
||||||
_gl.BindVertexArray(0);
|
_gl.BindVertexArray(0);
|
||||||
|
|
||||||
|
_cpuStopwatch.Stop();
|
||||||
|
|
||||||
if (diag)
|
if (diag)
|
||||||
{
|
{
|
||||||
_drawsIssued += _opaqueDraws.Count + _translucentDraws.Count;
|
long cpuUs = _cpuStopwatch.ElapsedTicks * 1_000_000L / System.Diagnostics.Stopwatch.Frequency;
|
||||||
|
_cpuSamples[_cpuSampleCursor] = cpuUs;
|
||||||
|
_cpuSampleCursor = (_cpuSampleCursor + 1) % _cpuSamples.Length;
|
||||||
|
|
||||||
|
// Poll the GPU timer queries without blocking. They were ended earlier in
|
||||||
|
// this same frame, so the result may not be ready yet; if it is not, drop
|
||||||
|
// the sample rather than stall the CPU waiting for the GPU.
|
||||||
|
if (_gpuQueriesInitialized)
|
||||||
|
{
|
||||||
|
_gl.GetQueryObject(_gpuQueryOpaque, QueryObjectParameterName.ResultAvailable, out int avail);
|
||||||
|
if (avail != 0)
|
||||||
|
{
|
||||||
|
_gl.GetQueryObject(_gpuQueryOpaque, QueryObjectParameterName.Result, out ulong opaqueNs);
|
||||||
|
_gl.GetQueryObject(_gpuQueryTransparent, QueryObjectParameterName.Result, out ulong transNs);
|
||||||
|
long gpuUs = (long)((opaqueNs + transNs) / 1000UL);
|
||||||
|
_gpuSamples[_gpuSampleCursor] = gpuUs;
|
||||||
|
_gpuSampleCursor = (_gpuSampleCursor + 1) % _gpuSamples.Length;
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
_drawsIssued += _opaqueDrawCount + _transparentDrawCount;
|
||||||
_instancesIssued += totalInstances;
|
_instancesIssued += totalInstances;
|
||||||
MaybeFlushDiag();
|
MaybeFlushDiag();
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
private void DrawGroup(InstanceGroup grp)
|
private static IndirectGroupInput ToInput(InstanceGroup g) => new(
|
||||||
{
|
IndexCount: g.IndexCount,
|
||||||
_gl.ActiveTexture(TextureUnit.Texture0);
|
FirstIndex: g.FirstIndex,
|
||||||
_gl.BindTexture(TextureTarget.Texture2D, grp.TextureHandle);
|
BaseVertex: g.BaseVertex,
|
||||||
_gl.BindBuffer(BufferTargetARB.ElementArrayBuffer, grp.Ibo);
|
InstanceCount: g.InstanceCount,
|
||||||
|
FirstInstance: g.FirstInstance,
|
||||||
|
TextureHandle: g.BindlessTextureHandle,
|
||||||
|
TextureLayer: g.TextureLayer,
|
||||||
|
Translucency: g.Translucency);
|
||||||
|
|
||||||
// BaseInstance offsets the per-instance attribute fetches into our
|
private unsafe void UploadSsbo(uint ssbo, uint binding, void* data, int byteCount)
|
||||||
// shared instance VBO so each group reads its own slice. Requires
|
{
|
||||||
// GL_ARB_base_instance (GL 4.2+); WB requires 4.3 so this is available.
|
_gl.BindBuffer(BufferTargetARB.ShaderStorageBuffer, ssbo);
|
||||||
_gl.DrawElementsInstancedBaseVertexBaseInstance(
|
_gl.BufferData(BufferTargetARB.ShaderStorageBuffer, (nuint)byteCount, data, BufferUsageARB.DynamicDraw);
|
||||||
PrimitiveType.Triangles,
|
_gl.BindBufferBase(BufferTargetARB.ShaderStorageBuffer, binding, ssbo);
|
||||||
(uint)grp.IndexCount,
|
|
||||||
DrawElementsType.UnsignedShort,
|
|
||||||
(void*)(grp.FirstIndex * sizeof(ushort)),
|
|
||||||
(uint)grp.InstanceCount,
|
|
||||||
grp.BaseVertex,
|
|
||||||
(uint)grp.FirstInstance);
|
|
||||||
}
|
}
|
||||||
|
|
||||||
private void MaybeFlushDiag()
|
private void MaybeFlushDiag()
|
||||||
|
|
@ -381,13 +488,41 @@ public sealed unsafe class WbDrawDispatcher : IDisposable
|
||||||
long now = Environment.TickCount64;
|
long now = Environment.TickCount64;
|
||||||
if (now - _lastLogTick > 5000)
|
if (now - _lastLogTick > 5000)
|
||||||
{
|
{
|
||||||
|
long cpuMed = MedianMicros(_cpuSamples);
|
||||||
|
long cpuP95 = Percentile95Micros(_cpuSamples);
|
||||||
|
long gpuMed = MedianMicros(_gpuSamples);
|
||||||
|
long gpuP95 = Percentile95Micros(_gpuSamples);
|
||||||
Console.WriteLine(
|
Console.WriteLine(
|
||||||
$"[WB-DIAG] entSeen={_entitiesSeen} entDrawn={_entitiesDrawn} meshMissing={_meshesMissing} drawsIssued={_drawsIssued} instances={_instancesIssued} groups={_groups.Count}");
|
$"[WB-DIAG] entSeen={_entitiesSeen} entDrawn={_entitiesDrawn} meshMissing={_meshesMissing} drawsIssued={_drawsIssued} instances={_instancesIssued} groups={_groups.Count} " +
|
||||||
|
$"cpu_us={cpuMed}m/{cpuP95}p95 gpu_us={gpuMed}m/{gpuP95}p95");
|
||||||
_entitiesSeen = _entitiesDrawn = _meshesMissing = _drawsIssued = _instancesIssued = 0;
|
_entitiesSeen = _entitiesDrawn = _meshesMissing = _drawsIssued = _instancesIssued = 0;
|
||||||
_lastLogTick = now;
|
_lastLogTick = now;
|
||||||
|
// Don't reset the sample buffers — they're a moving window of the
|
||||||
|
// last 256 frames; clearing per 5s flush would lose recent history.
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
|
private static long MedianMicros(long[] samples)
|
||||||
|
{
|
||||||
|
var copy = (long[])samples.Clone();
|
||||||
|
Array.Sort(copy);
|
||||||
|
int nz = 0;
|
||||||
|
foreach (var v in copy) if (v > 0) nz++;
|
||||||
|
if (nz == 0) return 0;
|
||||||
|
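return copy[copy.Length - nz + nz / 2]; // nonzero samples fill the sorted tail [Length - nz, Length); the old index ran past the end when nz == 1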
|
||||||
|
}
|
||||||
|
|
||||||
|
private static long Percentile95Micros(long[] samples)
|
||||||
|
{
|
||||||
|
var copy = (long[])samples.Clone();
|
||||||
|
Array.Sort(copy);
|
||||||
|
int nz = 0;
|
||||||
|
foreach (var v in copy) if (v > 0) nz++;
|
||||||
|
if (nz == 0) return 0;
|
||||||
|
int idx = copy.Length - 1 - (int)(nz * 0.05);
|
||||||
|
return copy[idx];
|
||||||
|
}
|
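The helpers above treat a zero entry as an unwritten slot and take statistics over the nonzero tail of the sorted window. A minimal standalone sketch of that indexing follows. `WindowStats` and `MedianNonZero` are hypothetical names used only for illustration and are not part of the commit.

```csharp
using System;

// Hypothetical helper, not part of the commit: median over the nonzero
// entries of a fixed-size sample window.
static class WindowStats
{
    // Assumption: a value of 0 means "slot not yet written", as in the diff.
    public static long MedianNonZero(long[] samples)
    {
        var copy = (long[])samples.Clone();
        Array.Sort(copy);                 // ascending: zeros first, real samples last

        int nz = 0;
        foreach (var v in copy) if (v > 0) nz++;
        if (nz == 0) return 0;

        int start = copy.Length - nz;     // index of the first nonzero sample
        return copy[start + nz / 2];      // upper median of the nonzero tail
    }
}
```

With a window of `{0, 0, 0, 42}` the index resolves to the last slot, so a single sample is returned as its own median and the lookup stays in bounds.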
||||||
|
|
||||||
private void ClassifyBatches(
|
private void ClassifyBatches(
|
||||||
ObjectRenderData renderData,
|
ObjectRenderData renderData,
|
||||||
ulong gfxObjId,
|
ulong gfxObjId,
|
||||||
|
|
@ -413,12 +548,16 @@ public sealed unsafe class WbDrawDispatcher : IDisposable
|
||||||
: TranslucencyKind.Opaque;
|
: TranslucencyKind.Opaque;
|
||||||
}
|
}
|
||||||
|
|
||||||
uint texHandle = ResolveTexture(entity, meshRef, batch, palHash);
|
ulong texHandle = ResolveTexture(entity, meshRef, batch, palHash);
|
||||||
if (texHandle == 0) continue;
|
if (texHandle == 0) continue;
|
||||||
|
|
||||||
|
// TextureLayer is always 0 for per-instance composites; non-zero when
|
||||||
|
// WB atlas is adopted in N.6+ and batches reference a shared atlas layer.
|
||||||
|
uint texLayer = 0;
|
||||||
|
|
||||||
var key = new GroupKey(
|
var key = new GroupKey(
|
||||||
batch.IBO, batch.FirstIndex, (int)batch.BaseVertex,
|
batch.IBO, batch.FirstIndex, (int)batch.BaseVertex,
|
||||||
batch.IndexCount, texHandle, translucency);
|
batch.IndexCount, texHandle, texLayer, translucency);
|
||||||
|
|
||||||
if (!_groups.TryGetValue(key, out var grp))
|
if (!_groups.TryGetValue(key, out var grp))
|
||||||
{
|
{
|
||||||
|
|
@ -428,7 +567,8 @@ public sealed unsafe class WbDrawDispatcher : IDisposable
|
||||||
FirstIndex = batch.FirstIndex,
|
FirstIndex = batch.FirstIndex,
|
||||||
BaseVertex = (int)batch.BaseVertex,
|
BaseVertex = (int)batch.BaseVertex,
|
||||||
IndexCount = batch.IndexCount,
|
IndexCount = batch.IndexCount,
|
||||||
TextureHandle = texHandle,
|
BindlessTextureHandle = texHandle,
|
||||||
|
TextureLayer = texLayer,
|
||||||
Translucency = translucency,
|
Translucency = translucency,
|
||||||
};
|
};
|
||||||
_groups[key] = grp;
|
_groups[key] = grp;
|
||||||
|
|
@ -437,10 +577,8 @@ public sealed unsafe class WbDrawDispatcher : IDisposable
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
private uint ResolveTexture(WorldEntity entity, MeshRef meshRef, ObjectRenderBatch batch, ulong palHash)
|
private ulong ResolveTexture(WorldEntity entity, MeshRef meshRef, ObjectRenderBatch batch, ulong palHash)
|
||||||
{
|
{
|
||||||
// WB stores the surface id on batch.Key.SurfaceId (TextureKey struct);
|
|
||||||
// batch.SurfaceId is unset (zero) for batches built by ObjectMeshManager.
|
|
||||||
uint surfaceId = batch.Key.SurfaceId;
|
uint surfaceId = batch.Key.SurfaceId;
|
||||||
if (surfaceId == 0 || surfaceId == 0xFFFFFFFF) return 0;
|
if (surfaceId == 0 || surfaceId == 0xFFFFFFFF) return 0;
|
||||||
|
|
||||||
|
|
@ -451,34 +589,16 @@ public sealed unsafe class WbDrawDispatcher : IDisposable
|
||||||
|
|
||||||
if (entity.PaletteOverride is not null)
|
if (entity.PaletteOverride is not null)
|
||||||
{
|
{
|
||||||
// perf #4: pass the entity-precomputed palette hash so TextureCache
|
return _textures.GetOrUploadWithPaletteOverrideBindless(
|
||||||
// can skip its internal HashPaletteOverride for repeat lookups
|
|
||||||
// within the same character.
|
|
||||||
return _textures.GetOrUploadWithPaletteOverride(
|
|
||||||
surfaceId, origTexOverride, entity.PaletteOverride, palHash);
|
surfaceId, origTexOverride, entity.PaletteOverride, palHash);
|
||||||
}
|
}
|
||||||
else if (hasOrigTexOverride)
|
else if (hasOrigTexOverride)
|
||||||
{
|
{
|
||||||
return _textures.GetOrUploadWithOrigTextureOverride(surfaceId, overrideOrigTex);
|
return _textures.GetOrUploadWithOrigTextureOverrideBindless(surfaceId, overrideOrigTex);
|
||||||
}
|
}
|
||||||
else
|
else
|
||||||
{
|
{
|
||||||
return _textures.GetOrUpload(surfaceId);
|
return _textures.GetOrUploadBindless(surfaceId);
|
||||||
}
|
|
||||||
}
|
|
||||||
|
|
||||||
private void EnsureInstanceAttribs(uint vao)
|
|
||||||
{
|
|
||||||
if (!_patchedVaos.Add(vao)) return;
|
|
||||||
|
|
||||||
_gl.BindVertexArray(vao);
|
|
||||||
_gl.BindBuffer(BufferTargetARB.ArrayBuffer, _instanceVbo);
|
|
||||||
for (uint row = 0; row < 4; row++)
|
|
||||||
{
|
|
||||||
uint loc = 3 + row;
|
|
||||||
_gl.EnableVertexAttribArray(loc);
|
|
||||||
_gl.VertexAttribPointer(loc, 4, VertexAttribPointerType.Float, false, 64, (void*)(row * 16));
|
|
||||||
_gl.VertexAttribDivisor(loc, 1);
|
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
|
|
@ -494,15 +614,138 @@ public sealed unsafe class WbDrawDispatcher : IDisposable
|
||||||
{
|
{
|
||||||
if (_disposed) return;
|
if (_disposed) return;
|
||||||
_disposed = true;
|
_disposed = true;
|
||||||
_gl.DeleteBuffer(_instanceVbo);
|
_gl.DeleteBuffer(_instanceSsbo);
|
||||||
|
_gl.DeleteBuffer(_batchSsbo);
|
||||||
|
_gl.DeleteBuffer(_indirectBuffer);
|
||||||
|
if (_gpuQueriesInitialized)
|
||||||
|
{
|
||||||
|
_gl.DeleteQuery(_gpuQueryOpaque);
|
||||||
|
_gl.DeleteQuery(_gpuQueryTransparent);
|
||||||
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
|
// ── Public types + helpers for BuildIndirectArrays (Task 9) ─────────────
|
||||||
|
//
|
||||||
|
// These are public so the pure-CPU unit tests in AcDream.Core.Tests can
|
||||||
|
// exercise BuildIndirectArrays without needing a GL context.
|
||||||
|
|
||||||
|
/// <summary>
|
||||||
|
/// Stride in bytes of <c>DrawElementsIndirectCommand</c> in the indirect buffer.
|
||||||
|
/// 5 × <c>uint</c> = 20 bytes. Tests and callers reference this symbolically
|
||||||
|
/// rather than hard-coding <c>20</c> so a layout change produces a compile error.
|
||||||
|
/// </summary>
|
||||||
|
public const int DrawCommandStride = 20; // sizeof(DrawElementsIndirectCommand): 5 × uint
|
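The stride constant and the SSBO element layout are both fixed by the struct definitions, so a runtime size check is one way to catch drift between them. The sketch below assumes stand-in structs (`IndirectCommandSketch`, `BatchDataSketch`) that mirror the field order shown in this diff. They are illustrative copies, not the real Silk.NET or dispatcher types.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical mirror of the indirect command: 5 fields x 4 bytes = 20 bytes.
[StructLayout(LayoutKind.Sequential)]
struct IndirectCommandSketch
{
    public uint Count;
    public uint InstanceCount;
    public uint FirstIndex;
    public int BaseVertex;
    public uint BaseInstance;
}

// Hypothetical mirror of the per-draw batch record: 8 + 4 + 4 = 16 bytes.
[StructLayout(LayoutKind.Sequential, Pack = 8)]
struct BatchDataSketch
{
    public ulong TextureHandle;
    public uint TextureLayer;
    public uint Flags;
}

static class LayoutChecks
{
    public const int DrawCommandStride = 20;

    // Throws if the managed layout no longer matches the sizes the GPU side expects.
    public static void Verify()
    {
        if (Marshal.SizeOf<IndirectCommandSketch>() != DrawCommandStride)
            throw new InvalidOperationException("indirect command stride drifted");
        if (Marshal.SizeOf<BatchDataSketch>() != 16)
            throw new InvalidOperationException("batch record size drifted");
        if ((int)Marshal.OffsetOf<BatchDataSketch>(nameof(BatchDataSketch.TextureLayer)) != 8)
            throw new InvalidOperationException("TextureLayer offset drifted");
    }
}
```

Calling `LayoutChecks.Verify()` once at startup or from a unit test would flag a mismatch before any buffer is uploaded.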
||||||
|
|
||||||
|
/// <summary>
|
||||||
|
/// Public view of the per-group inputs to <see cref="BuildIndirectArrays"/> — used in tests.
|
||||||
|
/// </summary>
|
||||||
|
public readonly record struct IndirectGroupInput(
|
||||||
|
int IndexCount,
|
||||||
|
uint FirstIndex,
|
||||||
|
int BaseVertex,
|
||||||
|
int InstanceCount,
|
||||||
|
int FirstInstance,
|
||||||
|
ulong TextureHandle,
|
||||||
|
uint TextureLayer,
|
||||||
|
TranslucencyKind Translucency);
|
||||||
|
|
||||||
|
/// <summary>
|
||||||
|
/// Public mirror of the per-group <see cref="BatchData"/> uploaded to the SSBO.
|
||||||
|
/// Tests verify the layout. Same field shape as the private BatchData.
|
||||||
|
/// </summary>
|
||||||
|
[StructLayout(LayoutKind.Sequential, Pack = 8)]
|
||||||
|
public struct BatchDataPublic
|
||||||
|
{
|
||||||
|
public ulong TextureHandle;
|
||||||
|
public uint TextureLayer;
|
||||||
|
public uint Flags;
|
||||||
|
}
|
||||||
|
|
||||||
|
/// <summary>Result of <see cref="BuildIndirectArrays"/>.</summary>
|
||||||
|
public readonly record struct IndirectLayoutResult(
|
||||||
|
int OpaqueCount,
|
||||||
|
int TransparentCount,
|
||||||
|
int TransparentByteOffset);
|
||||||
|
|
||||||
|
/// <summary>
|
||||||
|
/// Lays out the indirect commands + parallel BatchData array contiguously:
|
||||||
|
/// opaque section first (caller sorts before calling), transparent section second.
|
||||||
|
/// Pure CPU, no GL state. Caller passes pre-sized scratch arrays.
|
||||||
|
/// </summary>
|
||||||
|
/// <remarks>
|
||||||
|
/// Classification: Opaque + ClipMap → opaque pass (ClipMap uses discard, not
|
||||||
|
/// blending). Everything else (AlphaBlend, Additive, InvAlpha) → transparent pass.
|
||||||
|
/// </remarks>
|
||||||
|
public static IndirectLayoutResult BuildIndirectArrays(
|
||||||
|
IReadOnlyList<IndirectGroupInput> groups,
|
||||||
|
DrawElementsIndirectCommand[] indirectScratch,
|
||||||
|
BatchDataPublic[] batchScratch)
|
||||||
|
{
|
||||||
|
int opaqueCount = 0;
|
||||||
|
int transparentCount = 0;
|
||||||
|
|
||||||
|
foreach (var g in groups)
|
||||||
|
{
|
||||||
|
if (IsOpaque(g.Translucency)) opaqueCount++;
|
||||||
|
else transparentCount++;
|
||||||
|
}
|
||||||
|
|
||||||
|
int oi = 0; // opaque write cursor (fills [0..opaqueCount))
|
||||||
|
int ti = opaqueCount; // transparent write cursor (fills [opaqueCount..end))
|
||||||
|
|
||||||
|
foreach (var g in groups)
|
||||||
|
{
|
||||||
|
var dec = new DrawElementsIndirectCommand
|
||||||
|
{
|
||||||
|
Count = (uint)g.IndexCount,
|
||||||
|
InstanceCount = (uint)g.InstanceCount,
|
||||||
|
FirstIndex = g.FirstIndex,
|
||||||
|
BaseVertex = g.BaseVertex,
|
||||||
|
BaseInstance = (uint)g.FirstInstance,
|
||||||
|
};
|
||||||
|
var bd = new BatchDataPublic
|
||||||
|
{
|
||||||
|
TextureHandle = g.TextureHandle,
|
||||||
|
TextureLayer = g.TextureLayer,
|
||||||
|
Flags = 0,
|
||||||
|
};
|
||||||
|
|
||||||
|
if (IsOpaque(g.Translucency))
|
||||||
|
{
|
||||||
|
indirectScratch[oi] = dec;
|
||||||
|
batchScratch[oi] = bd;
|
||||||
|
oi++;
|
||||||
|
}
|
||||||
|
else
|
||||||
|
{
|
||||||
|
indirectScratch[ti] = dec;
|
||||||
|
batchScratch[ti] = bd;
|
||||||
|
ti++;
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
return new IndirectLayoutResult(opaqueCount, transparentCount, opaqueCount * DrawCommandStride);
|
||||||
|
}
|
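The two-cursor layout used by `BuildIndirectArrays` can be seen in isolation as a stable partition: count the opaque items first, then write opaque items from index 0 and the rest from index `opaqueCount`. The generic helper below is a simplified sketch under that reading. `PartitionSketch` and its signature are hypothetical and carry none of the GL-specific fields.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of the commit: stable two-section partition.
static class PartitionSketch
{
    // Fills output[0..opaque) with opaque items and output[opaque..n) with the
    // rest, preserving input order inside each section. Returns the opaque count.
    public static int Partition<T>(IReadOnlyList<T> items, Func<T, bool> isOpaque, T[] output)
    {
        int opaqueCount = 0;
        foreach (var it in items)
            if (isOpaque(it)) opaqueCount++;

        int oi = 0;            // opaque write cursor
        int ti = opaqueCount;  // transparent write cursor
        foreach (var it in items)
        {
            if (isOpaque(it)) output[oi++] = it;
            else output[ti++] = it;
        }
        return opaqueCount;
    }
}
```

For the input `{"O1", "T1", "O2"}` with a predicate that tests for a leading `O`, the output begins `O1, O2, T1` and the return value is 2, so the transparent section would start at byte offset `2 * 20 = 40` in a buffer with a 20-byte command stride.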
||||||
|
|
||||||
|
/// <summary>
|
||||||
|
/// Public test shim for <see cref="IsOpaque"/>. Locks in the N.5 Decision 2
|
||||||
|
/// translucency partition: Opaque + ClipMap → opaque indirect; AlphaBlend +
|
||||||
|
/// Additive + InvAlpha → transparent indirect.
|
||||||
|
/// </summary>
|
||||||
|
public static bool IsOpaquePublic(TranslucencyKind t) => IsOpaque(t);
|
||||||
|
|
||||||
|
private static bool IsOpaque(TranslucencyKind t)
|
||||||
|
=> t == TranslucencyKind.Opaque || t == TranslucencyKind.ClipMap;
|
||||||
|
|
||||||
|
// ────────────────────────────────────────────────────────────────────────
|
||||||
|
|
||||||
private readonly record struct GroupKey(
|
private readonly record struct GroupKey(
|
||||||
uint Ibo,
|
uint Ibo,
|
||||||
uint FirstIndex,
|
uint FirstIndex,
|
||||||
int BaseVertex,
|
int BaseVertex,
|
||||||
int IndexCount,
|
int IndexCount,
|
||||||
uint TextureHandle,
|
ulong BindlessTextureHandle,
|
||||||
|
uint TextureLayer,
|
||||||
TranslucencyKind Translucency);
|
TranslucencyKind Translucency);
|
||||||
|
|
||||||
private sealed class InstanceGroup
|
private sealed class InstanceGroup
|
||||||
|
|
@ -511,7 +754,8 @@ public sealed unsafe class WbDrawDispatcher : IDisposable
|
||||||
public uint FirstIndex;
|
public uint FirstIndex;
|
||||||
public int BaseVertex;
|
public int BaseVertex;
|
||||||
public int IndexCount;
|
public int IndexCount;
|
||||||
public uint TextureHandle;
|
public ulong BindlessTextureHandle; // 64-bit (was uint TextureHandle in N.4)
|
||||||
|
public uint TextureLayer; // 0 for per-instance composites; non-zero when WB atlas is adopted in N.6+
|
||||||
public TranslucencyKind Translucency;
|
public TranslucencyKind Translucency;
|
||||||
public int FirstInstance; // offset into the shared instance VBO (in instances, not bytes)
|
public int FirstInstance; // offset into the shared instance VBO (in instances, not bytes)
|
||||||
public int InstanceCount;
|
public int InstanceCount;
|
||||||
|
|
|
||||||
|
|
@ -1,39 +0,0 @@
|
||||||
namespace AcDream.App.Rendering.Wb;
|
|
||||||
|
|
||||||
/// <summary>
|
|
||||||
/// Process-lifetime cache of <c>ACDREAM_USE_WB_FOUNDATION</c> env var.
|
|
||||||
/// Read once at static-init time; all consumers import this rather than
|
|
||||||
/// re-reading the env var per call (env-var lookups on Windows are not
|
|
||||||
/// free at hot-path cadence).
|
|
||||||
///
|
|
||||||
/// <para>
|
|
||||||
/// <b>Default-on as of Phase N.4 ship (2026-05-08).</b> The WB foundation
|
|
||||||
/// (<c>WbMeshAdapter</c> + <c>WbDrawDispatcher</c>) is the production
|
|
||||||
/// rendering path. Set <c>ACDREAM_USE_WB_FOUNDATION=0</c> to fall back
|
|
||||||
/// to the legacy <c>InstancedMeshRenderer</c> path — kept as an escape
|
|
||||||
/// hatch until N.6 fully replaces it.
|
|
||||||
/// </para>
|
|
||||||
///
|
|
||||||
/// <para>
|
|
||||||
/// Per-instance customized content (server <c>CreateObject</c> entities
|
|
||||||
/// with palette / texture overrides) routes through
|
|
||||||
/// <see cref="TextureCache.GetOrUploadWithPaletteOverride"/> regardless
|
|
||||||
/// of the flag — the flag controls which DRAW path consumes those
|
|
||||||
/// textures.
|
|
||||||
/// </para>
|
|
||||||
/// </summary>
|
|
||||||
public static class WbFoundationFlag
|
|
||||||
{
|
|
||||||
private static bool _isEnabled =
|
|
||||||
System.Environment.GetEnvironmentVariable("ACDREAM_USE_WB_FOUNDATION") != "0";
|
|
||||||
|
|
||||||
public static bool IsEnabled => _isEnabled;
|
|
||||||
|
|
||||||
/// <summary>
|
|
||||||
/// FOR TESTS ONLY. Forces <see cref="IsEnabled"/> to <c>true</c> so
|
|
||||||
/// integration tests can exercise the WB adapter path without having to
|
|
||||||
/// set the env var before static initialisation. Never call from
|
|
||||||
/// production code.
|
|
||||||
/// </summary>
|
|
||||||
internal static void ForTestsOnly_ForceEnable() => _isEnabled = true;
|
|
||||||
}
|
|
||||||
|
|
@ -144,7 +144,7 @@ public sealed class GpuWorldState
|
||||||
}
|
}
|
||||||
|
|
||||||
_loaded[landblock.LandblockId] = landblock;
|
_loaded[landblock.LandblockId] = landblock;
|
||||||
if (WbFoundationFlag.IsEnabled && _wbSpawnAdapter is not null)
|
if (_wbSpawnAdapter is not null)
|
||||||
_wbSpawnAdapter.OnLandblockLoaded(_loaded[landblock.LandblockId]);
|
_wbSpawnAdapter.OnLandblockLoaded(_loaded[landblock.LandblockId]);
|
||||||
RebuildFlatView();
|
RebuildFlatView();
|
||||||
}
|
}
|
||||||
|
|
@ -195,7 +195,7 @@ public sealed class GpuWorldState
|
||||||
|
|
||||||
public void RemoveLandblock(uint landblockId)
|
public void RemoveLandblock(uint landblockId)
|
||||||
{
|
{
|
||||||
if (WbFoundationFlag.IsEnabled && _wbSpawnAdapter is not null)
|
if (_wbSpawnAdapter is not null)
|
||||||
_wbSpawnAdapter.OnLandblockUnloaded(landblockId);
|
_wbSpawnAdapter.OnLandblockUnloaded(landblockId);
|
||||||
|
|
||||||
// Rescue persistent entities before removal. These get appended
|
// Rescue persistent entities before removal. These get appended
|
||||||
|
|
|
||||||
|
|
@ -0,0 +1,32 @@
|
||||||
|
using AcDream.App.Rendering;
|
||||||
|
using AcDream.App.Rendering.Wb;
|
||||||
|
using DatReaderWriter;
|
||||||
|
using Xunit;
|
||||||
|
|
||||||
|
namespace AcDream.Core.Tests.Rendering;
|
||||||
|
|
||||||
|
/// <summary>
|
||||||
|
/// Lightweight unit tests for <see cref="TextureCache"/>'s bindless path.
|
||||||
|
/// We can't construct a real TextureCache in a headless test (it requires a
|
||||||
|
/// live GL context), so this file documents contracts that future engineers
|
||||||
|
/// should preserve. Real bindless integration is verified at Task 14's
|
||||||
|
/// visual gate.
|
||||||
|
/// </summary>
|
||||||
|
public sealed class TextureCacheBindlessTests
|
||||||
|
{
|
||||||
|
[Fact]
|
||||||
|
public void Contract_BindlessMethodsThrowWithoutBindlessSupport()
|
||||||
|
{
|
||||||
|
// The actual throw lives in TextureCache.EnsureBindlessAvailable
|
||||||
|
// and is reached only via GL-bound Bindless* method calls. The
|
||||||
|
// contract is: if the dispatcher (which requires bindless) ever
|
||||||
|
// gets a TextureCache constructed without BindlessSupport, it
|
||||||
|
// should fail-fast with InvalidOperationException — NOT silently
|
||||||
|
// route a draw to handle 0 (which would produce a non-resident
|
||||||
|
// GPU fault).
|
||||||
|
//
|
||||||
|
// This test is a marker. Future engineers: do not weaken
|
||||||
|
// EnsureBindlessAvailable to swallow the missing dependency.
|
||||||
|
Assert.True(true, "Contract documented in TextureCache.EnsureBindlessAvailable");
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
@ -19,16 +19,9 @@ namespace AcDream.Core.Tests.Rendering.Wb;
 /// </summary>
 public sealed class PendingSpawnIntegrationTests
 {
-    /// <summary>
-    /// Force-enable WbFoundationFlag for this test class.
-    /// GpuWorldState gates its adapter calls on this static-cached flag;
-    /// calling the internal test hook lets us exercise the full integration
-    /// path without needing the env var set before process startup.
-    /// </summary>
-    static PendingSpawnIntegrationTests()
-    {
-        WbFoundationFlag.ForTestsOnly_ForceEnable();
-    }
-
+    // N.5 ship amendment: WbFoundationFlag was deleted — GpuWorldState
+    // no longer gates adapter calls on the flag; they are unconditional
+    // when the adapter is non-null. No static ctor hook needed.
+
     [Fact]
     public void LiveEntity_ParkedBeforeLandblock_DrainsButIsNotRegisteredWithAdapter()
@ -0,0 +1,113 @@
+using System.Numerics;
+using AcDream.App.Rendering.Wb;
+using AcDream.Core.Meshing;
+using Xunit;
+
+namespace AcDream.Core.Tests.Rendering.Wb;
+
+/// <summary>
+/// Pure CPU test of <see cref="WbDrawDispatcher.BuildIndirectArrays"/>.
+/// Verifies that a synthetic group set lays out into the indirect buffer
+/// + parallel batch data with opaque section first, transparent second,
+/// per-group fields propagated correctly.
+/// </summary>
+public sealed class WbDrawDispatcherIndirectBuilderTests
+{
+    [Fact]
+    public void TwoOpaqueGroupsAndOneTransparent_LaysOutContiguouslyOpaqueFirst()
+    {
+        // Arrange — three groups: 2 opaque (12+1 instances) + 1 transparent (12 instances)
+        var groups = new List<WbDrawDispatcher.IndirectGroupInput>
+        {
+            new(IndexCount: 100, FirstIndex: 0, BaseVertex: 0, InstanceCount: 12, FirstInstance: 0, TextureHandle: 0xAA, TextureLayer: 0, Translucency: TranslucencyKind.Opaque),
+            new(IndexCount: 200, FirstIndex: 100, BaseVertex: 0, InstanceCount: 12, FirstInstance: 12, TextureHandle: 0xBB, TextureLayer: 0, Translucency: TranslucencyKind.AlphaBlend),
+            new(IndexCount: 50, FirstIndex: 300, BaseVertex: 100, InstanceCount: 1, FirstInstance: 24, TextureHandle: 0xCC, TextureLayer: 0, Translucency: TranslucencyKind.Opaque),
+        };
+
+        var indirect = new DrawElementsIndirectCommand[16];
+        var batch = new WbDrawDispatcher.BatchDataPublic[16];
+
+        // Act
+        var result = WbDrawDispatcher.BuildIndirectArrays(groups, indirect, batch);
+
+        // Assert layout
+        Assert.Equal(2, result.OpaqueCount);
+        Assert.Equal(1, result.TransparentCount);
+        Assert.Equal(2 * 20, result.TransparentByteOffset); // sizeof(DEIC) = 20
+
+        // Opaque section, in input order (Task 10 callers sort)
+        Assert.Equal(100u, indirect[0].Count);
+        Assert.Equal(0u, indirect[0].FirstIndex);
+        Assert.Equal(0, indirect[0].BaseVertex);
+        Assert.Equal(12u, indirect[0].InstanceCount);
+        Assert.Equal(0u, indirect[0].BaseInstance);
+
+        Assert.Equal(50u, indirect[1].Count);
+        Assert.Equal(300u, indirect[1].FirstIndex);
+        Assert.Equal(100, indirect[1].BaseVertex);
+        Assert.Equal(1u, indirect[1].InstanceCount);
+        Assert.Equal(24u, indirect[1].BaseInstance);
+
+        // Transparent section
+        Assert.Equal(200u, indirect[2].Count);
+        Assert.Equal(100u, indirect[2].FirstIndex);
+        Assert.Equal(12u, indirect[2].InstanceCount);
+        Assert.Equal(12u, indirect[2].BaseInstance);
+
+        // BatchData parallel — same indices as indirect
+        Assert.Equal(0xAAul, batch[0].TextureHandle);
+        Assert.Equal(0xCCul, batch[1].TextureHandle);
+        Assert.Equal(0xBBul, batch[2].TextureHandle);
+    }
+
+    [Fact]
+    public void EmptyGroupList_ProducesZeroCounts()
+    {
+        var groups = new List<WbDrawDispatcher.IndirectGroupInput>();
+        var indirect = new DrawElementsIndirectCommand[0];
+        var batch = new WbDrawDispatcher.BatchDataPublic[0];
+
+        var result = WbDrawDispatcher.BuildIndirectArrays(groups, indirect, batch);
+
+        Assert.Equal(0, result.OpaqueCount);
+        Assert.Equal(0, result.TransparentCount);
+        Assert.Equal(0, result.TransparentByteOffset);
+    }
+
+    [Fact]
+    public void ClipMapTreatedAsOpaque()
+    {
+        // ClipMap surfaces (alpha-cutout) belong with the opaque pass
+        // because the discard handles transparency, not blending.
+        var groups = new List<WbDrawDispatcher.IndirectGroupInput>
+        {
+            new(IndexCount: 10, FirstIndex: 0, BaseVertex: 0, InstanceCount: 1, FirstInstance: 0, TextureHandle: 0x1, TextureLayer: 0, Translucency: TranslucencyKind.ClipMap),
+        };
+        var indirect = new DrawElementsIndirectCommand[4];
+        var batch = new WbDrawDispatcher.BatchDataPublic[4];
+
+        var result = WbDrawDispatcher.BuildIndirectArrays(groups, indirect, batch);
+
+        Assert.Equal(1, result.OpaqueCount);
+        Assert.Equal(0, result.TransparentCount);
+    }
+
+    [Fact]
+    public void BatchDataPublic_LayoutMatchesPrivateBatchData()
+    {
+        // Task 10 will use MemoryMarshal.Cast<BatchData, BatchDataPublic> to
+        // expose the dispatcher's per-frame BatchData[] scratch to BuildIndirectArrays
+        // without copying. The cast is only safe if the structs have identical
+        // layout (size, field offsets). Both use [StructLayout(Sequential, Pack=8)].
+        Assert.Equal(16, System.Runtime.CompilerServices.Unsafe.SizeOf<WbDrawDispatcher.BatchDataPublic>());
+        Assert.Equal(0, (int)System.Runtime.InteropServices.Marshal.OffsetOf<WbDrawDispatcher.BatchDataPublic>(nameof(WbDrawDispatcher.BatchDataPublic.TextureHandle)));
+        Assert.Equal(8, (int)System.Runtime.InteropServices.Marshal.OffsetOf<WbDrawDispatcher.BatchDataPublic>(nameof(WbDrawDispatcher.BatchDataPublic.TextureLayer)));
+        Assert.Equal(12, (int)System.Runtime.InteropServices.Marshal.OffsetOf<WbDrawDispatcher.BatchDataPublic>(nameof(WbDrawDispatcher.BatchDataPublic.Flags)));
+    }
+
+    [Fact]
+    public void DrawCommandStride_MatchesStructSize()
+    {
+        Assert.Equal(WbDrawDispatcher.DrawCommandStride, System.Runtime.CompilerServices.Unsafe.SizeOf<DrawElementsIndirectCommand>());
+    }
+}
@ -0,0 +1,25 @@
+using AcDream.App.Rendering.Wb;
+using AcDream.Core.Meshing;
+using Xunit;
+
+namespace AcDream.Core.Tests.Rendering.Wb;
+
+/// <summary>
+/// Locks in the N.5 translucency partition contract (spec Decision 2).
+/// If the partition drifts, the dispatcher's opaque + transparent indirect
+/// passes will silently render the wrong groups in the wrong pass — visible
+/// regression that's hard to spot in code review.
+/// </summary>
+public sealed class WbDrawDispatcherTranslucencyTests
+{
+    [Theory]
+    [InlineData(TranslucencyKind.Opaque, true)]
+    [InlineData(TranslucencyKind.ClipMap, true)]
+    [InlineData(TranslucencyKind.AlphaBlend, false)]
+    [InlineData(TranslucencyKind.Additive, false)]
+    [InlineData(TranslucencyKind.InvAlpha, false)]
+    public void IsOpaque_PartitionsByKind(TranslucencyKind kind, bool expected)
+    {
+        Assert.Equal(expected, WbDrawDispatcher.IsOpaquePublic(kind));
+    }
+}