Merge branch 'claude/priceless-feistel-c12935' — Phase N.5 SHIP
N.5: Modern Rendering Path.

WbDrawDispatcher now uses bindless textures + glMultiDrawElementsIndirect on top of N.4's grouped pipeline. Three SSBO uploads + 2 indirect calls per frame, ~12-15 total GL calls for entity rendering regardless of scene complexity. Measured 1.23 ms / frame median at Holtburg courtyard (1662 groups, ~810 fps). User-gated visual verification PASS at Holtburg.

Includes ship-amendment: legacy renderer path formally retired (InstancedMeshRenderer + StaticMeshRenderer + WbFoundationFlag deleted). Bindless is now mandatory; missing extensions throw NotSupportedException at startup with a clear error message.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
commit
27eaf4e0be
23 changed files with 4379 additions and 1278 deletions
63
CLAUDE.md
|
|
@@ -55,9 +55,11 @@ ourselves".
|
|||
`EntitySpawnAdapter.cs` — bridge spawn lifecycle to WB ref-counts.
|
||||
Atlas tier (procedural) goes via Landblock; per-instance tier
|
||||
(server-spawned, palette/texture overrides) goes via Entity.
|
||||
- `WbFoundationFlag` is default-on. `ACDREAM_USE_WB_FOUNDATION=0`
|
||||
falls back to legacy `InstancedMeshRenderer` (kept as escape hatch
|
||||
until N.6 fully retires it).
|
||||
- **Modern path is mandatory as of N.5 ship amendment (2026-05-08).**
|
||||
`WbFoundationFlag`, `InstancedMeshRenderer`, and `StaticMeshRenderer`
|
||||
are deleted. Missing `GL_ARB_bindless_texture` or
|
||||
`GL_ARB_shader_draw_parameters` throws `NotSupportedException` at
|
||||
startup. There is no legacy fallback.
|
||||
- **WB's modern rendering path** (GL 4.3 + bindless) packs every mesh
|
||||
into a single global VAO/VBO/IBO. Each batch references its slice
|
||||
via `FirstIndex` (offset into IBO) + `BaseVertex` (offset into VBO).
|
||||
|
|
@@ -72,6 +74,34 @@ ourselves".
|
|||
`PrepareMeshDataAsync(id, isSetup)` to fire the background decode.
|
||||
Result auto-enqueues to `_stagedMeshData` which `Tick()` drains.
|
||||
`WbMeshAdapter` does this for you on first registration.
|
||||
- **N.5 modern dispatch** (`docs/superpowers/specs/2026-05-08-phase-n5-modern-rendering-design.md`)
|
||||
uses bindless textures + multi-draw indirect on top of N.4's grouped
|
||||
pipeline. Per frame: three SSBO uploads (`_instanceSsbo` mat4 per
|
||||
instance @ binding=0; `_batchSsbo` `(uvec2 textureHandle, uint layer,
|
||||
uint flags)` per group @ binding=1; `_indirectBuffer`
|
||||
`DrawElementsIndirectCommand[]` opaque-section + transparent-section).
|
||||
Two `glMultiDrawElementsIndirect` calls per frame, one per pass.
|
||||
Total ~12-15 GL calls per frame for entity rendering regardless of
|
||||
scene complexity.
|
||||
- **`TextureCache` requires `BindlessSupport`** for the WB modern path.
|
||||
Three `Bindless`-suffixed `GetOrUpload*` methods return 64-bit handles
|
||||
made resident at upload time, backed by parallel Texture2DArray uploads
|
||||
(`UploadRgba8AsLayer1Array`). The legacy `uint`-returning methods stay
|
||||
for Sky / Terrain / Debug / particle paths that still sample via
|
||||
`sampler2D`. After N.6 retires legacy renderers, the legacy upload path
|
||||
+ caches can be deleted.
|
||||
- **Translucency model is two-pass alpha-test** (matches WB), not
|
||||
per-blend-mode subpasses. Opaque pass discards `α<0.95`; transparent
|
||||
pass discards `α≥0.95` AND `α<0.05`. Native `Additive` blend renders
|
||||
as alpha-blend on GfxObj surfaces — falsifiable; if a magic-content
|
||||
regression shows up, add a third indirect call with
|
||||
`glBlendFunc(SrcAlpha, One)` per spec §6 fallback (~30 min change).
|
||||
- **Per-instance highlight (selection blink) is reserved.** `mesh_modern.vert`'s
|
||||
`InstanceData` struct has a documented hook for `vec4 highlightColor`
|
||||
— Phase B.4 follow-up adds the field + plumbs server-side selection
|
||||
state. Stride grows from 64 → 80 bytes when added; shader updates
|
||||
trivially (read the field from `Instances[instanceIndex]` + mix into
|
||||
fragment color).
|
||||
|
||||
**Execution phases:** R1→R8 in the architecture doc. Each phase has clear
|
||||
goals, test criteria, and builds on the previous. Don't skip phases.
|
||||
|
|
@@ -472,18 +502,25 @@ acdream's plan lives in two files committed to the repo:
|
|||
acceptance criteria. Do not drift from the spec without explicit user
|
||||
approval.
|
||||
|
||||
**Currently in flight: Phase N.5 — Modern Rendering Path.** Roadmap entry
|
||||
at [`docs/plans/2026-04-11-roadmap.md`](docs/plans/2026-04-11-roadmap.md).
|
||||
Builds on N.4's `WbDrawDispatcher` to adopt WB's modern rendering primitives:
|
||||
bindless textures (eliminate `glBindTexture` calls) and
|
||||
`glMultiDrawElementsIndirect` (one GL call per pass instead of one per
|
||||
group). Together these target a 2-5× CPU win on draw-heavy scenes by
|
||||
eliminating the remaining per-group state changes. Plan + spec to be
|
||||
written when work begins.
|
||||
**Currently in flight: Phase N.6 — Perf polish.**
|
||||
Roadmap entry at [`docs/plans/2026-04-11-roadmap.md`](docs/plans/2026-04-11-roadmap.md).
|
||||
Builds on N.5. Legacy renderers (`InstancedMeshRenderer`, `StaticMeshRenderer`,
|
||||
`WbFoundationFlag`) were retired in the N.5 ship amendment — N.6 scope is
|
||||
perf-only: WB atlas adoption, persistent-mapped buffers, GPU-side culling,
|
||||
GL_TIME_ELAPSED query double-buffering, direct N.4 vs N.5 perf measurement,
|
||||
legacy `Texture2D`/`sampler2D` TextureCache path retirement (Sky/Terrain/Debug).
|
||||
Plan + spec written when work begins.
|
||||
|
||||
**Phase N.5 (Modern Rendering Path) shipped + amended 2026-05-08.** `WbDrawDispatcher`
|
||||
on bindless textures + `glMultiDrawElementsIndirect`. CPU dispatcher 1.23ms/frame
|
||||
at Holtburg (~810 fps). **Ship amendment:** `InstancedMeshRenderer`,
|
||||
`StaticMeshRenderer`, `WbFoundationFlag` deleted in same phase — modern path is
|
||||
mandatory; missing bindless throws at startup. Plan archived at
|
||||
[`docs/superpowers/plans/2026-05-08-phase-n5-modern-rendering.md`](docs/superpowers/plans/2026-05-08-phase-n5-modern-rendering.md).
|
||||
|
||||
**Phase N.4 (Rendering Pipeline Foundation) shipped 2026-05-08.** WB's
|
||||
`ObjectMeshManager` is integrated and is the default rendering path
|
||||
behind `ACDREAM_USE_WB_FOUNDATION` (default-on). Plan archived at
|
||||
`ObjectMeshManager` is integrated and is the production rendering path
|
||||
(mandatory as of N.5 ship amendment). Plan archived at
|
||||
[`docs/superpowers/plans/2026-05-08-phase-n4-rendering-foundation.md`](docs/superpowers/plans/2026-05-08-phase-n4-rendering-foundation.md).
|
||||
|
||||
**Rules:**
|
||||
|
|
|
|||
|
|
@@ -82,11 +82,12 @@ ground. This is the bug class fixed in
|
|||
|
||||
**Sequencing implication:** Phase N.2 (terrain math helpers
|
||||
substitution) cannot be shipped in isolation — it must land alongside
|
||||
N.5 (visual terrain renderer migration), at which point both physics
|
||||
and visual mesh switch to WB's formula together. Roadmap N.2 entry
|
||||
flags this dependency.
|
||||
visual terrain renderer migration (originally N.5, now moved to N.7
|
||||
scope), at which point both physics and visual mesh switch to WB's
|
||||
formula together. N.5 shipped entity rendering only; terrain remains
|
||||
on acdream's own pipeline through N.7.
|
||||
|
||||
**Research needed (when N.5 picks this up):**
|
||||
**Research needed (when N.7 picks this up):**
|
||||
1. Quantify divergence: run WB's `CalculateSplitDirection` and our
|
||||
`IsSplitSWtoNE` across all (lbX, lbY, cellX, cellY) tuples for a
|
||||
representative landblock set; record disagreement rate.
|
||||
|
|
@@ -97,8 +98,8 @@ flags this dependency.
|
|||
server-authoritative Z within tolerance) is invalidated by the
|
||||
formula change.
|
||||
|
||||
**Acceptance:** Resolved when N.5 lands and both physics + visual
|
||||
mesh use WB's split formula, OR when we decide to keep the AC2D
|
||||
**Acceptance:** Resolved when N.7 lands and both physics + visual
|
||||
terrain use WB's split formula, OR when we decide to keep the AC2D
|
||||
formula and patch WB's renderer in our fork.
|
||||
|
||||
---
|
||||
|
|
@@ -998,8 +999,8 @@ If the coat texture's UVs at the upper region map to texel-bytes whose palette i
|
|||
|
||||
**Files (diagnostic env vars committed for next-session reuse):**
|
||||
|
||||
- `src/AcDream.App/Rendering/InstancedMeshRenderer.cs:210-275`
|
||||
— `ACDREAM_NO_CULL` env var
|
||||
- ~~`src/AcDream.App/Rendering/InstancedMeshRenderer.cs:210-275`
|
||||
— `ACDREAM_NO_CULL` env var~~ (file deleted in N.5 ship amendment)
|
||||
- `src/AcDream.App/Rendering/GameWindow.cs` — `ACDREAM_HIDE_PART=N`
|
||||
hides specific humanoid part; `ACDREAM_DUMP_CLOTHING=1` dumps
|
||||
AnimPartChanges + TextureChanges + per-part Surface chain coverage.
|
||||
|
|
|
|||
|
|
@@ -1,6 +1,6 @@
|
|||
# acdream — strategic roadmap
|
||||
|
||||
**Status:** Living document. Updated 2026-05-08 for Phase N.4 shipping (`WbMeshAdapter` + `WbDrawDispatcher` + `ACDREAM_USE_WB_FOUNDATION` default-on) + N.5 rebranded to "Modern rendering path" (bindless + multi-draw indirect on top of N.4's foundation).
|
||||
**Status:** Living document. Updated 2026-05-08 for Phase N.5 shipping (bindless textures + `glMultiDrawElementsIndirect` on top of N.4's foundation; CPU dispatcher 1.23ms/frame at Holtburg, ~810 fps) + N.6 becomes the new in-flight phase (perf polish; legacy renderer retirement was pulled forward into the N.5 ship amendment).
|
||||
**Purpose:** One source of truth for where the project is and where it's going. Every observed defect or missing feature has a named phase that owns it; when something looks wrong in-game, look here to find the phase that'll address it. Implementation details live in per-phase specs under `docs/superpowers/specs/`, not in this file.
|
||||
|
||||
---
|
||||
|
|
@@ -59,7 +59,8 @@
|
|||
| C.1 | PES particle system + sky-pass refinements — retail-faithful `ParticleEmitterInfo` unpack with all 13 motion integrators (`Particle::Init`/`Update` ports of `0x0051c290`/`0x0051c930`), `PhysicsScriptRunner` with `CallPES` self-loop semantics, `ParticleHookSink` with `EmitterDied` cleanup, instanced billboard `ParticleRenderer` with material-derived blend (DAT emitters never default additive — pulled from particle GfxObj surface), global back-to-front sort, BC clipmap alpha-keying, AttachLocal `is_parent_local=1` live-parent follow via `UpdateEmitterAnchor`. Sky pass: `Translucent+ClipMap` → alpha-blend cloud sheet (matches `D3DPolyRender::SetSurface` `0x0059c4d0`), raw-`Additive` fog-skip (matches `0x0059c882`), per-keyframe `SkyObjectReplace` Translucency/Luminosity/MaxBright divide-by-100, bit `0x01` pre/post-scene split (matches `GameSky::CreateDeletePhysicsObjects` `0x005073c0`), Setup-backed (`0x020xxxxx`) sky objects via `SetupMesh.Flatten`, persistent GL sampler objects (Wrap + ClampToEdge) replace per-frame wrap-mode mutation (ported from WorldBuilder's `OpenGLGraphicsDevice`), post-scene Z-offset gated on `(Properties & 4) != 0 && (Properties & 8) == 0` per `GameSky::UpdatePosition` `0x00506dd0`. Sky-PES playback disabled by default (named-retail proves `GameSky` drops `pes_id`); `ACDREAM_ENABLE_SKY_PES=1` opens the experimental path. 1325 → 1331 tests. | Live ✓ |
|
||||
| N.1 | WorldBuilder-backed scenery (Chorizite/WorldBuilder fork as submodule, SceneryHelpers + TerrainUtils replace our inline ports) | Live ✓ |
|
||||
| N.3 | WorldBuilder-backed texture decode — `SurfaceDecoder` delegates INDEX16 / P8 / A8R8G8B8 / R8G8B8 / A8(+Additive) to `TextureHelpers.Fill*`; `isAdditive` threaded through (terrain alpha → `FillA8Additive`, non-additive entity surfaces → `FillA8`). R5G6B5 + A4R4G4B4 newly handled (previously magenta). X8R8G8B8, DXT1/3/5, SolidColor remain ours (no WB equivalent). 9 conformance tests prove byte-identical equivalence per format. | Live ✓ |
|
||||
| N.4 | Rendering pipeline foundation — adopted WB's `ObjectMeshManager` as the production mesh pipeline behind `ACDREAM_USE_WB_FOUNDATION` (default-on). `WbMeshAdapter` is the single seam (owns `ObjectMeshManager`, drains the staged-upload queue per frame, populates `AcSurfaceMetadataTable` with per-batch translucency / luminosity / fog metadata). `WbDrawDispatcher` is the production draw path: groups all visible (entity, batch) pairs, single-uploads the matrix buffer, fires one `glDrawElementsInstancedBaseVertexBaseInstance` per group with `BaseInstance` slicing into the shared instance VBO. `LandblockSpawnAdapter` + `EntitySpawnAdapter` bridge spawn lifecycle to WB ref-counts (atlas tier vs per-instance). Perf wins shipped as part of N.4: per-entity frustum cull, opaque front-to-back sort, palette-hash memoization (compute once per entity, reuse across batches). Visual verification at Holtburg passed: scenery + connected characters with full close-detail geometry (Issue #47 regression resolved). Legacy `InstancedMeshRenderer` retained as `ACDREAM_USE_WB_FOUNDATION=0` escape hatch until N.6. | Live ✓ |
|
||||
| N.4 | Rendering pipeline foundation — adopted WB's `ObjectMeshManager` as the production mesh pipeline behind `ACDREAM_USE_WB_FOUNDATION` (default-on). `WbMeshAdapter` is the single seam (owns `ObjectMeshManager`, drains the staged-upload queue per frame, populates `AcSurfaceMetadataTable` with per-batch translucency / luminosity / fog metadata). `WbDrawDispatcher` is the production draw path: groups all visible (entity, batch) pairs, single-uploads the matrix buffer, fires one `glDrawElementsInstancedBaseVertexBaseInstance` per group with `BaseInstance` slicing into the shared instance VBO. `LandblockSpawnAdapter` + `EntitySpawnAdapter` bridge spawn lifecycle to WB ref-counts (atlas tier vs per-instance). Perf wins shipped as part of N.4: per-entity frustum cull, opaque front-to-back sort, palette-hash memoization (compute once per entity, reuse across batches). Visual verification at Holtburg passed: scenery + connected characters with full close-detail geometry (Issue #47 regression resolved). Legacy `InstancedMeshRenderer` retained as `ACDREAM_USE_WB_FOUNDATION=0` escape hatch until N.6 (retired early in N.5 ship amendment). | Live ✓ |
|
||||
| N.5 | Modern rendering path — lifted `WbDrawDispatcher` onto bindless textures (`GL_ARB_bindless_texture`) + `glMultiDrawElementsIndirect`. Per-frame entity rendering: 3 SSBO uploads (instance matrices @ binding=0, batch data @ binding=1, indirect commands) + 2 indirect draw calls (opaque + transparent). ~12-15 GL calls per frame regardless of group count, down from hundreds of per-group calls in N.4. CPU dispatcher: 1.23 ms/frame median at Holtburg courtyard (1662 groups, ~810 fps sustained). All textures on the WB modern path use 1-layer `Texture2DArray` + `sampler2DArray`. Legacy callers keep `Texture2D` / `sampler2D` via the parallel `TextureCache` path until N.6 retires them. Three gotchas captured in memory: texture target lock-in, bindless Dispose order (two-phase non-resident before delete), GL_TIME_ELAPSED double-buffering. **Ship amendment 2026-05-08:** legacy renderers (`InstancedMeshRenderer`, `StaticMeshRenderer`, `WbFoundationFlag`) retired within N.5 — modern path is mandatory; missing bindless throws `NotSupportedException` at startup. N.6 scope narrowed accordingly. Plan archived at `docs/superpowers/plans/2026-05-08-phase-n5-modern-rendering.md`. | Live ✓ |
|
||||
|
||||
Plus polish that doesn't get its own phase number:
|
||||
- FlyCamera default speed lowered + Shift-to-boost
|
||||
|
|
@@ -624,22 +625,21 @@ for our deletions/additions; merge upstream `master` periodically.
|
|||
memoization. Legacy `InstancedMeshRenderer` retained as flag-off
|
||||
fallback until N.6 fully retires it. Plan archived at
|
||||
`docs/superpowers/plans/2026-05-08-phase-n4-rendering-foundation.md`.
|
||||
- **N.5 — Modern rendering path.** **Rebranded from "Terrain rendering"
|
||||
2026-05-08 after N.4 perf review.** N.4 left two big remaining wins
|
||||
on the table that pair naturally: (1) bindless textures via
|
||||
`GL_ARB_bindless_texture` (WB already populates
|
||||
`ObjectRenderBatch.BindlessTextureHandle`; switch our shader to
|
||||
consume per-instance handles, eliminate 100% of `glBindTexture`
|
||||
calls), and (2) `glMultiDrawElementsIndirect` (one GL call per pass
|
||||
instead of one per group; build a `DrawElementsIndirectCommand`
|
||||
buffer, fire one indirect draw, the driver pulls everything). Both
|
||||
require shader changes (same shader, in fact — bindless + indirect
|
||||
are the same modern path WB uses internally). Together they target a
|
||||
2-5× CPU win on draw-heavy scenes (Holtburg courtyard, Foundry,
|
||||
dense dungeons). Also folds in: persistent-mapped instance VBO
|
||||
(`glBufferStorage` + `MAP_PERSISTENT_BIT | MAP_COHERENT_BIT` + ring
|
||||
buffer + sync) and texture pre-warm at landblock load (smooths
|
||||
streaming-boundary hitches). **Estimate: 2-3 weeks.**
|
||||
- **✓ SHIPPED — N.5 — Modern rendering path.** Shipped 2026-05-08.
|
||||
**Rebranded from "Terrain rendering" 2026-05-08 after N.4 perf
|
||||
review.** Lifted `WbDrawDispatcher` onto bindless textures
|
||||
(`GL_ARB_bindless_texture`) + `glMultiDrawElementsIndirect`. Per-frame
|
||||
entity rendering: 3 SSBO uploads (instance matrices @ binding=0, batch
|
||||
data @ binding=1, indirect commands) + 2 indirect calls (opaque +
|
||||
transparent). ~12-15 GL calls per frame regardless of group count, down
|
||||
from hundreds of per-group calls in N.4. CPU dispatcher: 1.23 ms/frame median
|
||||
at Holtburg (1662 groups, ~810 fps). All textures on the modern path use
|
||||
1-layer `Texture2DArray` + `sampler2DArray`; legacy callers retain
|
||||
`Texture2D` via the parallel `TextureCache` path until N.6 retires them.
|
||||
Three gotchas in memory (`project_phase_n5_state.md`): texture target
|
||||
lock-in, bindless Dispose two-phase order, GL_TIME_ELAPSED double-
|
||||
buffering. Plan archived at
|
||||
`docs/superpowers/plans/2026-05-08-phase-n5-modern-rendering.md`.
|
||||
- **N.5b — Terrain rendering on N.5 path.** Wire WB's
|
||||
`TerrainRenderManager` + `LandSurfaceManager` + `TerrainGeometryGenerator`
|
||||
onto the modern rendering path. Closes N.2's deferred terrain math
|
||||
|
|
@@ -647,12 +647,17 @@ for our deletions/additions; merge upstream `master` periodically.
|
|||
`CalculateSplitDirection` + `GetHeight` + `GetNormal` in lockstep,
|
||||
resolving ISSUE #51. **Estimate: 1-2 weeks** (was 2-3 — modern path
|
||||
primitives already in place from N.5).
|
||||
- **N.6 — Static objects rendering.** Wire WB's
|
||||
`StaticObjectRenderManager` onto the modern rendering path; **fully
|
||||
delete** legacy `StaticMeshRenderer` + `InstancedMeshRenderer` (they
|
||||
remain as `ACDREAM_USE_WB_FOUNDATION=0` escape hatches through N.5).
|
||||
Mostly draw orchestration at this point — most of the substance
|
||||
landed in N.4 + N.5. **Estimate: 1-2 weeks** (was 2-3).
|
||||
- **N.6 — Perf polish.** **Currently in flight.**
|
||||
Builds on N.5. Legacy renderer retirement was pulled forward into N.5
|
||||
ship amendment — `InstancedMeshRenderer`, `StaticMeshRenderer`, and
|
||||
`WbFoundationFlag` are already gone. N.6 scope: WB atlas adoption for
|
||||
memory savings on shared content, persistent-mapped buffers if
|
||||
`glBufferData` shows up in profiling, GPU-side culling via compute
|
||||
pre-pass, GL_TIME_ELAPSED query double-buffering (deferred from N.5 —
|
||||
diagnostic shows `gpu_us=0/0` under `ACDREAM_WB_DIAG=1`), direct N.4
|
||||
vs N.5 perf measurement, retire the legacy `Texture2D`/`sampler2D` path
|
||||
in `TextureCache` (currently kept for Sky + Terrain + Debug).
|
||||
Plan + spec written when work begins. **Estimate: 1-2 weeks.**
|
||||
- **N.7 — EnvCells / dungeons.** Replace EnvCell rendering with WB's
|
||||
`EnvCellRenderManager` + `PortalRenderManager` on top of N.4's
|
||||
foundation. **Estimate: 1-2 weeks** (was 2-3 — naturally smaller now
|
||||
|
|
|
|||
72
docs/plans/2026-05-08-phase-n5-perf-baseline.md
Normal file
|
|
@@ -0,0 +1,72 @@
|
|||
# Phase N.5 perf baseline
|
||||
|
||||
**Captured:** 2026-05-08, against N.5 head (post-Task 12) on local machine.
|
||||
**Method:** `ACDREAM_WB_DIAG=1` + character at Holtburg spawn position +
|
||||
roaming. Numbers below are 5-second window medians from `[WB-DIAG]`.
|
||||
|
||||
## Holtburg courtyard (steady state)
|
||||
|
||||
| Metric | N.5 measured | N.4 (estimated*) | Gate |
|---|---|---|---|
| CPU dispatcher (median) | **1227 µs / frame** | ≥2500 µs / frame | ≤70% of N.4 → **PASS** |
| CPU dispatcher (p95) | 1303 µs / frame | — | — |
| GPU rendering (median) | unmeasured (see below) | — | within ±10% → **DEFERRED** |
| `drawsIssued` per 5s | 4.85M (= 1662 groups × ~580 fps × 5 s) | far higher per frame | — |
| `drawsIssued` per pass (CPU GL calls) | **2** (1 opaque + 1 transparent indirect) | ~hundreds per pass | ≤5 → **PASS** |
| `groups` (working set) | 1662 | ~similar | sanity |
| Frame rate (inferred) | ~810 fps | ~100-200 fps | substantial uplift |
|
||||
|
||||
*N.4 baseline NOT measured directly in this run. The "≥2500 µs / frame"
|
||||
estimate assumes N.4's per-group glBindTexture + glBindBuffer +
|
||||
glDrawElementsInstancedBaseVertexBaseInstance hot path costs ≥1.5 µs per
|
||||
group and N.4 has ~1700 groups in this scene, putting the GL portion alone
|
||||
at ~2.5 ms before adding the entity-walk overhead. N.5's measurement
|
||||
includes ALL dispatcher work (entity walk + group bucketing + 3 SSBO
|
||||
uploads + 2 indirect calls + state changes) at ~1230 µs total, roughly
half of that lower-bound estimate.
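The arithmetic behind the footnote can be checked directly. The sketch below is illustrative only: it plugs in the figures quoted above (≥1.5 µs per group, ~1700 groups, 1227 µs measured, 2500 µs lower bound), and `PerfGateSketch` and its methods are hypothetical names, not part of acdream.

```csharp
using System;

// Hypothetical helper, not part of acdream: reproduces the footnote's arithmetic.
public static class PerfGateSketch
{
    // Lower-bound cost of N.4's per-group GL calls, in microseconds.
    public static double EstimateN4GlMicros(int groups, double microsPerGroup)
        => groups * microsPerGroup;

    // Measured N.5 dispatcher time as a fraction of the N.4 estimate.
    public static double Ratio(double n5Micros, double n4Micros)
        => n5Micros / n4Micros;

    public static void Main()
    {
        double n4 = EstimateN4GlMicros(groups: 1700, microsPerGroup: 1.5); // 2550 µs
        double ratio = Ratio(n5Micros: 1227, n4Micros: 2500);              // about 0.49
        bool gatePasses = ratio <= 0.70;                                   // spec gate: <= 70% of N.4
        Console.WriteLine($"N.4 GL lower bound: {n4} us, N.5/N.4: {ratio:F2}, gate passes: {gatePasses}");
    }
}
```

Because the N.4 figure is an estimate rather than a measurement, the ratio is only as good as the 1.5 µs-per-group assumption.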
|
||||
|
||||
## Acceptance gates (spec §8.3)
|
||||
|
||||
- [x] **Visual identity to N.4** — confirmed at Task 10 USER GATE: Holtburg
|
||||
courtyard renders identical, no missing entities, no z-fighting, no
|
||||
exploded parts.
|
||||
- [x] **CPU dispatcher time ≤ 70% of N.4** — N.5 measures 1.23 ms/frame
|
||||
median; estimated N.4 ≥2.5 ms/frame; **comfortably under 70%**.
|
||||
- [ ] **GPU rendering time within ±10% of N.4** — DEFERRED. The
|
||||
`GL_TIME_ELAPSED` query polling never reports `avail != 0` in our
|
||||
single-frame poll loop; the driver hasn't finalized the result by the
|
||||
time we check. The fix is double-buffering (issue queryA on frame N,
|
||||
read result on frame N+2). N.6 perf polish item.
|
||||
- [x] **`drawsIssued` ≤ 5 per pass (CPU GL calls)** — exactly 2 indirect
|
||||
calls per frame regardless of scene size.
|
||||
- [x] **All tests green** — 70/70 in
|
||||
`FullyQualifiedName~Wb|FullyQualifiedName~MatrixComposition`.
|
||||
8 pre-existing failures in `MotionInterpreter` / `BSPStepUp` /
|
||||
`PositionManager` / `PlayerMovementController` / `Dispatcher` are
|
||||
carry-forward from before N.5 and unrelated to rendering.
|
||||
- [N/A] **`ACDREAM_USE_WB_FOUNDATION=0` still works** — escape hatch
|
||||
formally retired in N.5 ship amendment. `InstancedMeshRenderer`,
|
||||
`StaticMeshRenderer`, and `WbFoundationFlag` deleted. Missing
|
||||
bindless throws `NotSupportedException` at startup with a clear
|
||||
error message. No fallback path.
|
||||
|
||||
## Visual verification (Task 14)
|
||||
|
||||
- [x] **Holtburg courtyard** — PASS at Task 10 USER GATE.
|
||||
- [ ] **Foundry interior / dense static-object scene** — TODO Task 14.
|
||||
- [ ] **Indoor → outdoor cell transition** — TODO Task 14.
|
||||
- [ ] **Drudge / character close-up (Issue #47 close-detail mesh)** — TODO Task 14.
|
||||
- [ ] **Magic content (Decision 2 additive fallback check)** — TODO Task 14.
|
||||
- [ ] **Long-session sanity** — DEFERRED (N.6 watchlist; not load-bearing for ship).
|
||||
|
||||
## Open follow-ups for N.6
|
||||
|
||||
1. **GPU timer query double-buffering** — the current single-frame poll
|
||||
pattern never sees `QueryResultAvailable=true`. Issue queryA on frame N,
|
||||
queryB on frame N+1, read queryA on frame N+2. ~30 lines of state.
|
||||
2. **Direct N.4 vs N.5 perf comparison** — re-run with `git checkout`ed N.4
|
||||
SHIP (`c445364`) for a side-by-side measurement. Not load-bearing but
|
||||
useful for N.6 ship message.
|
||||
3. **Persistent-mapped buffers** — Decision 7 deferral. If profiling shows
|
||||
the per-frame `glBufferData` cost is the residual hot spot, layer it on
|
||||
top of the modern path.
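Follow-up 1 describes a two-slot query ring. A minimal sketch of that state machine is given here, assuming a hypothetical `ITimerQueryBackend` seam so the logic can run without a GL context; in a real renderer the backend would wrap `GL_TIME_ELAPSED` begin/end and the result-available poll. None of these type names come from acdream.

```csharp
using System;

// Hypothetical seam over GL timer queries (assumption, not acdream's API).
public interface ITimerQueryBackend
{
    void Begin(int slot);
    void End(int slot);
    bool TryGetResultNanos(int slot, out ulong nanos); // false while the driver is still busy
}

public sealed class GpuTimerRing
{
    private const int Slots = 2; // queryA and queryB
    private readonly ITimerQueryBackend _backend;
    private long _frame;

    public GpuTimerRing(ITimerQueryBackend backend) => _backend = backend;

    // Most recent GPU time that was successfully read back (two frames old).
    public ulong LastGpuNanos { get; private set; }

    public void BeginFrame()
    {
        int slot = (int)(_frame % Slots);
        // This slot was issued two frames ago; harvest it before reusing it.
        // If the result is still unavailable it is simply dropped.
        if (_frame >= Slots && _backend.TryGetResultNanos(slot, out ulong nanos))
            LastGpuNanos = nanos;
        _backend.Begin(slot);
    }

    public void EndFrame()
    {
        _backend.End((int)(_frame % Slots));
        _frame++;
    }
}

// Fake backend for demonstration: each End() records a rising fake duration.
public sealed class FakeBackend : ITimerQueryBackend
{
    private readonly ulong[] _results = new ulong[2];
    private ulong _next = 100;
    public void Begin(int slot) { }
    public void End(int slot) { _results[slot] = _next; _next += 100; }
    public bool TryGetResultNanos(int slot, out ulong nanos) { nanos = _results[slot]; return true; }
}

public static class GpuTimerRingDemo
{
    public static void Main()
    {
        var ring = new GpuTimerRing(new FakeBackend());
        for (int i = 0; i < 4; i++)
        {
            ring.BeginFrame();
            Console.WriteLine($"frame {i}: last GPU ns = {ring.LastGpuNanos}");
            ring.EndFrame();
        }
    }
}
```

The design choice is to read a slot only immediately before reissuing it, which gives the driver two full frames to finalize each result.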
|
||||
2706
docs/superpowers/plans/2026-05-08-phase-n5-modern-rendering.md
Normal file
File diff suppressed because it is too large
|
|
@@ -0,0 +1,554 @@
|
|||
# Phase N.5 — Modern Rendering Path — Design Spec
|
||||
|
||||
**Status:** Draft (brainstormed 2026-05-08, not yet implemented).
|
||||
**Author:** acdream lead engineer + Claude.
|
||||
**Builds on:** Phase N.4 (`WbDrawDispatcher`, shipped 2026-05-08).
|
||||
**Predecessor docs:**
|
||||
- `docs/research/2026-05-08-phase-n5-handoff.md` (cold-start briefing).
|
||||
- `docs/superpowers/plans/2026-05-08-phase-n4-rendering-foundation.md` (N.4 plan; Adjustments 7-10 are required reading).
|
||||
- `docs/superpowers/specs/2026-05-08-phase-n4-rendering-foundation-design.md` (N.4 spec).
|
||||
|
||||
---
|
||||
|
||||
## 1. Problem statement
|
||||
|
||||
N.4 collapsed entity rendering from O(entities × batches) per-draw GL calls to O(unique GfxObj × surface × translucency) grouped instanced draws. The remaining hot path still does, per group:
|
||||
|
||||
```
glActiveTexture(0)
glBindTexture(2D, texHandle)
glBindBuffer(EBO, batchIbo)
glDrawElementsInstancedBaseVertexBaseInstance(...)
```
|
||||
|
||||
Across a typical Holtburg-courtyard scene that's still ~100-300 GL calls per frame for entities. Modern GPUs and our drivers (GL 4.3 + bindless, gated by WB's `_useModernRendering`) support patterns that eliminate ALL of those per-group calls:
|
||||
|
||||
- **Bindless textures** (`GL_ARB_bindless_texture`) — texture handles are 64-bit tokens that don't require `glBindTexture` to use; the shader samples from a handle read out of buffer data.
|
||||
- **Multi-draw indirect** (`glMultiDrawElementsIndirect`) — one GL call dispatches N draws from a `DrawElementsIndirectCommand` buffer; the driver issues all of them with no CPU-side per-draw work.
|
||||
|
||||
N.5 lifts `WbDrawDispatcher` onto these primitives. Target: ≥30% reduction in CPU dispatcher time, draw call count down to ~5/frame, no visual regression vs N.4.
|
||||
|
||||
---
|
||||
|
||||
## 2. Decisions log
|
||||
|
||||
This section records the brainstorm outcomes that the rest of the doc relies on.
|
||||
|
||||
| # | Decision | Choice | Reason |
|---|---|---|---|
| 1 | Texture sampler model | **`sampler2DArray`** for ALL textures (1-layer wrapping for per-instance composites) | Matches WB's modern shader exactly; future-proofs for atlas adoption in N.6+; avoids two shader files. ~50 lines of TextureCache change. |
| 2 | Translucent rendering | **WB's two-pass alpha-test** (opaque pass discards `α<0.95`, transparent pass discards `α≥0.95`) | Single blend mode per pass enables one indirect call per pass. Loses native `Additive` blend on GfxObj surfaces; sky + particles have own renderers and aren't affected. Falsifiable at visual verification — if we see a regression, add an additive sub-pass (~30-min fix). |
| 3 | Per-instance + per-draw data delivery | **All-SSBO**: `Instances[]` at binding=0 (mat4 per instance), `Batches[]` at binding=1 (texture handle + layer + flags per group) | Matches WB's modern shader. SSBOs avoid the 16-attrib stride limit, scale to large instance counts, give clean per-draw indexing via `gl_DrawIDARB`. |
| 4 | Bindless handle residency | **Resident on upload, never release** | acdream's content set is bounded (~1-5K unique textures per session). Handles persist for process lifetime; no eviction code in N.5. Diagnostic logging of handle count under `ACDREAM_WB_DIAG=1` to spot growth. |
| 5 | Escape hatch | **Modern path mandatory (N.5 ship amendment)**. `WbFoundationFlag` and `ACDREAM_USE_WB_FOUNDATION` env var have been deleted. Missing `GL_ARB_bindless_texture` or `GL_ARB_shader_draw_parameters` throws `NotSupportedException` at startup with a clear error message. No fallback. | Escape hatch was never exercised after N.4 ship. Legacy `InstancedMeshRenderer` + `StaticMeshRenderer` deleted in the N.5 retirement commit. N.6 scope narrowed accordingly. |
| 6 | Perf measurement | **CPU stopwatch + GL timer queries** logged via `[WB-DIAG]` | Captures both CPU dispatcher time and GPU rendering time. Acceptance gate compares before/after numbers in fixed Holtburg/Foundry scenes. |
| 7 | Persistent-mapped buffers | **Defer to N.6** | Bindless+indirect win is 70-80% of achievable savings. Persistent-mapped + ring + sync is the last 5-10% with non-trivial sync-fence complexity; not worth the risk in N.5's 2-3 week budget. Add post-N.5 if profiling shows residual `glBufferData` cost. |
| 8 | Per-instance highlight (selection blink) | **Defer to a Phase B.4 follow-up** | Retail pulses click targets as visual confirmation; the right mechanism is per-instance highlight color (NOT WB's global `uHighlightColor` which would tint everything in our single-indirect-call design). Field is reserved in design (extend `InstanceData` to include `vec4 highlightColor`); N.5 ships without the field, future phase plumbs it without shader rewrite. |
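Decision 2's pass split can be mirrored on the CPU for reasoning or unit tests. The sketch below is an assumption-laden illustration, not the shader: the real test is a `discard` in the fragment shader, the 0.05 lower cut-off is taken from the `CLAUDE.md` note in this same commit rather than from the table above, and `AlphaTestSketch` is a hypothetical name.

```csharp
using System;

public enum RenderPass { Opaque, Transparent }

// Hypothetical CPU-side mirror of the two-pass alpha test (not acdream code).
public static class AlphaTestSketch
{
    public const float OpaqueThreshold = 0.95f;    // opaque pass discards alpha below this
    public const float InvisibleThreshold = 0.05f; // transparent pass also discards below this

    // Returns true when a fragment with this alpha is kept (not discarded) in the pass.
    public static bool Survives(RenderPass pass, float alpha) => pass switch
    {
        RenderPass.Opaque => alpha >= OpaqueThreshold,
        RenderPass.Transparent => alpha < OpaqueThreshold && alpha >= InvisibleThreshold,
        _ => false,
    };

    public static void Main()
    {
        foreach (float a in new[] { 1.0f, 0.5f, 0.01f })
            Console.WriteLine($"alpha {a}: opaque={Survives(RenderPass.Opaque, a)}, transparent={Survives(RenderPass.Transparent, a)}");
    }
}
```

Every alpha value is kept by at most one pass, which is what lets each pass run under a single blend state and a single indirect call.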
|
||||
|
||||
---
|
||||
|
||||
## 3. Architecture overview
|
||||
|
||||
### What changes
|
||||
|
||||
`WbDrawDispatcher.Draw` swaps its inner loop. Phases 1-3 (entity walk, group bucketing, matrix layout) stay intact. Phases 5-6 (per-group GL calls) are replaced by a single `glMultiDrawElementsIndirect` per pass, fed by SSBO-resident per-instance and per-draw data.
|
||||
|
||||
### What's preserved from N.4
|
||||
|
||||
- Group bucketing pipeline (entity AABB cull, palette hash memo, group key dictionary).
|
||||
- `AcSurfaceMetadataTable` for translucency classification.
|
||||
- `EntitySpawnAdapter` / `LandblockSpawnAdapter` (mesh lifecycle bridge).
|
||||
- `WbMeshAdapter` (the seam over WB's `ObjectMeshManager`).
|
||||
- Front-to-back sort of opaque groups (depth-test reject of overdrawn fragments).
|
||||
- Per-entity 5m AABB frustum cull.
|
||||
|
||||
### What's new
|
||||
|
||||
- `TextureCache` uploads as 1-layer `Texture2DArray` instead of `Texture2D`. Generates 64-bit bindless handles at upload, makes them resident.
|
||||
- New shader pair `mesh_modern.vert/.frag` modeled on WB's `StaticObjectModern` but adapted (see §6).
|
||||
- Three new GPU buffers in the dispatcher:
|
||||
- `_instanceSsbo` — `std430` layout, `mat4[]`, all visible matrices.
|
||||
- `_batchSsbo` — `std430` layout, `BatchData[]`, one entry per group.
|
||||
- `_indirectBuffer` — `DrawElementsIndirectCommand[]`, one per group.
|
||||
- Two diagnostic measurements in `[WB-DIAG]`: CPU stopwatch span around `Draw()`; GPU `GL_TIME_ELAPSED` query around the indirect dispatch.
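The buffers listed above can be pictured as plain CPU-side structs. In the sketch below, the field order of `DrawElementsIndirectCommand` follows the OpenGL definition and `BatchData` mirrors the `(uvec2 textureHandle, uint layer, uint flags)` entry described in this commit; `GroupSketch` and `IndirectBuilderSketch` are hypothetical, and using `BaseInstance` to slice into `Instances[]` is an assumption carried over from N.4's description.

```csharp
using System;
using System.Collections.Generic;
using System.Runtime.InteropServices;

// Field order is fixed by OpenGL for glMultiDrawElementsIndirect (20 bytes).
[StructLayout(LayoutKind.Sequential)]
public struct DrawElementsIndirectCommand
{
    public uint Count;         // number of indices in this draw
    public uint InstanceCount; // instances in this group
    public uint FirstIndex;    // offset into the global IBO, in indices
    public int BaseVertex;     // offset into the global VBO
    public uint BaseInstance;  // assumed: first Instances[] entry for this group
}

// One entry per group; 16 bytes, matching a std430 (uvec2, uint, uint) struct.
[StructLayout(LayoutKind.Sequential)]
public struct BatchData
{
    public ulong TextureHandle; // 64-bit bindless handle
    public uint Layer;          // 0 for 1-layer Texture2DArray
    public uint Flags;
}

// Hypothetical stand-in for a dispatcher group (not acdream's real type).
public struct GroupSketch
{
    public uint IndexCount, FirstIndex, InstanceCount;
    public int BaseVertex;
    public ulong TextureHandle;
}

public static class IndirectBuilderSketch
{
    public static (DrawElementsIndirectCommand[] Commands, BatchData[] Batches) Build(IReadOnlyList<GroupSketch> groups)
    {
        var cmds = new DrawElementsIndirectCommand[groups.Count];
        var batches = new BatchData[groups.Count];
        uint baseInstance = 0;
        for (int i = 0; i < groups.Count; i++)
        {
            GroupSketch g = groups[i];
            cmds[i] = new DrawElementsIndirectCommand
            {
                Count = g.IndexCount, InstanceCount = g.InstanceCount,
                FirstIndex = g.FirstIndex, BaseVertex = g.BaseVertex,
                BaseInstance = baseInstance,
            };
            batches[i] = new BatchData { TextureHandle = g.TextureHandle, Layer = 0, Flags = 0 };
            baseInstance += g.InstanceCount; // groups occupy consecutive instance ranges
        }
        return (cmds, batches);
    }

    public static void Main()
    {
        Console.WriteLine($"command stride: {Marshal.SizeOf<DrawElementsIndirectCommand>()} bytes");
        Console.WriteLine($"batch stride: {Marshal.SizeOf<BatchData>()} bytes");
    }
}
```

Command `i` and batch `i` share an index, so a shader that reads `Batches[gl_DrawIDARB]` would see the data for the draw it is executing.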
|
||||
|
||||
### What gets deleted
|
||||
|
||||
- `WbDrawDispatcher.DrawGroup` (replaced by indirect).
|
||||
- `WbDrawDispatcher.EnsureInstanceAttribs` (no more vertex attribs at locations 3-6).
|
||||
- Per-blend-mode `glBlendFunc` switch in the translucent loop.
|
||||
- `mesh_instanced.vert/.frag` (replaced by `mesh_modern.*`).
|
||||
|
||||
### What stays under the escape hatch
|
||||
|
||||
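Nothing, as of the N.5 ship amendment (Decision 5): `InstancedMeshRenderer` is deleted and `ACDREAM_USE_WB_FOUNDATION=0` no longer has a legacy path to route to. The pre-amendment draft left `InstancedMeshRenderer` untouched behind that flag, with N.6 retiring it.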
|
||||
|
||||
---
|
||||
|
||||
## 4. Component changes
|
||||
|
||||
### 4.1 `TextureCache`
|
||||
|
||||
Texture upload path becomes Texture2DArray with depth=1:
|
||||
|
||||
```csharp
// Uploads decoded RGBA8 pixels as a 1-layer Texture2DArray and returns the GL texture name.
// `unsafe` is needed for the `fixed` pointer below.
private unsafe uint UploadRgba8AsLayer1Array(DecodedTexture decoded)
{
    uint tex = _gl.GenTexture();
    _gl.BindTexture(TextureTarget.Texture2DArray, tex);

    // depth: 1 makes this a single-layer array, sampled via sampler2DArray at layer 0.
    fixed (byte* p = decoded.Rgba8)
        _gl.TexImage3D(
            TextureTarget.Texture2DArray, 0, InternalFormat.Rgba8,
            (uint)decoded.Width, (uint)decoded.Height, depth: 1,
            border: 0, PixelFormat.Rgba, PixelType.UnsignedByte, p);

    _gl.TexParameter(TextureTarget.Texture2DArray, TextureParameterName.TextureMinFilter, (int)TextureMinFilter.Linear);
    _gl.TexParameter(TextureTarget.Texture2DArray, TextureParameterName.TextureMagFilter, (int)TextureMagFilter.Linear);
    _gl.TexParameter(TextureTarget.Texture2DArray, TextureParameterName.TextureWrapS, (int)TextureWrapMode.Repeat);
    _gl.TexParameter(TextureTarget.Texture2DArray, TextureParameterName.TextureWrapT, (int)TextureWrapMode.Repeat);
    _gl.BindTexture(TextureTarget.Texture2DArray, 0);
    return tex;
}
```
|
||||
|
||||
Bindless handle generation, eager + resident-on-upload, parallel cache:
|
||||
|
||||
```csharp
private readonly Dictionary<uint, ulong> _bindlessHandlesByGlName = new();

private ulong MakeResidentHandle(uint glTextureName)
{
    if (_bindlessHandlesByGlName.TryGetValue(glTextureName, out var h))
        return h;
    h = _bindless.GetTextureHandleARB(glTextureName);
    _bindless.MakeTextureHandleResidentARB(h);
    _bindlessHandlesByGlName[glTextureName] = h;
    return h;
}
```
|
||||
|
||||
Three new methods returning `ulong` bindless handles, paralleling the existing `uint` GL-name methods:
|
||||
|
||||
```csharp
public ulong GetOrUploadBindless(uint surfaceId);
public ulong GetOrUploadWithOrigTextureOverrideBindless(uint surfaceId, uint overrideOrigTextureId);
public ulong GetOrUploadWithPaletteOverrideBindless(uint surfaceId, uint? overrideOrigTextureId, PaletteOverride paletteOverride, ulong precomputedPaletteHash);
```
|
||||
|
||||
Each delegates to its existing `uint` sibling to populate the underlying GL texture, then calls `MakeResidentHandle` and returns the 64-bit handle.
|
||||
|
||||
The `uint`-returning methods stay (used by `SkyRenderer`, `TerrainAtlas`, anything outside the WB modern path).
|
||||
|
||||
`Dispose` releases bindless handles BEFORE deleting their textures: iterate `_bindlessHandlesByGlName.Values`, call `glMakeTextureHandleNonResidentARB(handle)`, then `glDeleteTextures` proceeds as today.
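
The cache-then-teardown ordering described above can be exercised without a GL context. The following is a minimal sketch, not the actual `TextureCache`: the `BindlessHandleCache` class is hypothetical, and its four delegates are stand-ins for `glGetTextureHandleARB`, `glMakeTextureHandleResidentARB`, `glMakeTextureHandleNonResidentARB`, and `glDeleteTextures`.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the bindless part of TextureCache. The delegates
// replace the real GL entry points so the ordering logic runs without a context.
public sealed class BindlessHandleCache : IDisposable
{
    private readonly Dictionary<uint, ulong> _handlesByGlName = new();
    private readonly Func<uint, ulong> _getHandle;
    private readonly Action<ulong> _makeResident;
    private readonly Action<ulong> _makeNonResident;
    private readonly Action<uint> _deleteTexture;

    public BindlessHandleCache(
        Func<uint, ulong> getHandle, Action<ulong> makeResident,
        Action<ulong> makeNonResident, Action<uint> deleteTexture)
    {
        _getHandle = getHandle;
        _makeResident = makeResident;
        _makeNonResident = makeNonResident;
        _deleteTexture = deleteTexture;
    }

    public int ResidentCount => _handlesByGlName.Count;

    // Creates the handle on first use and makes it resident in the same call,
    // so callers never receive a non-resident handle.
    public ulong MakeResidentHandle(uint glTextureName)
    {
        if (_handlesByGlName.TryGetValue(glTextureName, out ulong h))
            return h;
        h = _getHandle(glTextureName);
        _makeResident(h);
        _handlesByGlName[glTextureName] = h;
        return h;
    }

    // Pass 1 releases residency for every handle; pass 2 deletes the textures.
    public void Dispose()
    {
        foreach (ulong h in _handlesByGlName.Values)
            _makeNonResident(h);
        foreach (uint name in _handlesByGlName.Keys)
            _deleteTexture(name);
        _handlesByGlName.Clear();
    }
}
```

`ResidentCount` is an illustrative addition; a counter like it is one way to feed the resident-handle figure that the long-session check watches.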
|
||||
|
||||
### 4.2 `WbDrawDispatcher`
|
||||
|
||||
Three new GPU buffers (replacing `_instanceVbo`):
|
||||
|
||||
```csharp
private uint _instanceSsbo;     // binding=0, std430, mat4[]
private uint _batchSsbo;        // binding=1, std430, BatchData[]
private uint _indirectBuffer;   // GL_DRAW_INDIRECT_BUFFER, DEIC[]
```
|
||||
|
||||
`InstanceGroup` becomes:
|
||||
|
||||
```csharp
private sealed class InstanceGroup
{
    public uint Ibo;
    public uint FirstIndex;
    public int BaseVertex;
    public int IndexCount;
    public ulong BindlessTextureHandle;   // 64-bit (was uint TextureHandle in N.4)
    public uint TextureLayer;             // always 0 in N.5 (per-instance composites are 1-layer arrays)
    public TranslucencyKind Translucency;
    public int FirstInstance;
    public int InstanceCount;
    public float SortDistance;
    public readonly List<Matrix4x4> Matrices = new();
}
```
|
||||
|
||||
`GroupKey` adds the layer:
|
||||
|
||||
```csharp
private readonly record struct GroupKey(
    uint Ibo, uint FirstIndex, int BaseVertex, int IndexCount,
    ulong BindlessTextureHandle, uint TextureLayer, TranslucencyKind Translucency);
```
|
||||
|
||||
Per-frame draw flow:
|
||||
|
||||
1. **Walk entities → build `_groups` dict** (unchanged from N.4).
|
||||
2. **Lay matrices contiguously, split opaque/transparent, sort opaque** (unchanged).
|
||||
3. **Build per-group BatchData and DEIC arrays.** One `BatchData` per group `(handle, layer, flags=0)`. One DEIC per group `(count = IndexCount, instanceCount = InstanceCount, firstIndex = FirstIndex, baseVertex = BaseVertex, baseInstance = FirstInstance)`. Indirect commands are laid out contiguously: opaque section first (sorted front-to-back), transparent section second. `_opaqueDrawCount` and `_transparentDrawCount` track section sizes; `_transparentByteOffset = _opaqueDrawCount * sizeof(DEIC)`.
|
||||
4. **Three `glBufferData` uploads** to `_instanceSsbo`, `_batchSsbo`, `_indirectBuffer` (single buffer, both sections).
|
||||
5. **Bind global VAO once** (preserved from N.4 — modern rendering shares one VAO).
|
||||
6. **Bind SSBOs once** via `glBindBufferBase(SHADER_STORAGE_BUFFER, 0, _instanceSsbo)` and `... 1, _batchSsbo`.
|
||||
7. **Opaque pass.** Set `uRenderPass = 0`. `glBindBuffer(DRAW_INDIRECT_BUFFER, _indirectBuffer)`. `glMultiDrawElementsIndirect(Triangles, UnsignedShort, indirect=(void*)0, drawcount=_opaqueDrawCount, stride=sizeof(DEIC))`.
|
||||
8. **Transparent pass.** Set `uRenderPass = 1`. `glEnable(BLEND)` + `glBlendFunc(SrcAlpha, OneMinusSrcAlpha)` + `glDepthMask(false)`. `glMultiDrawElementsIndirect(Triangles, UnsignedShort, indirect=(void*)_transparentByteOffset, drawcount=_transparentDrawCount, stride=sizeof(DEIC))`.
|
||||
9. **Restore state.** `glDepthMask(true)` + `glDisable(BLEND)` + `glBindVertexArray(0)`.
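
Step 3 of the flow above (one indirect buffer, opaque section first, transparent section second) can be sketched on the CPU alone. This is an illustration under stated assumptions, not the dispatcher's code: `GroupInfo` and `IndirectBuilder` are hypothetical names, and only the `DrawElementsIndirectCommand` field order is taken from the GL definition.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Runtime.InteropServices;

// Mirrors GL's DrawElementsIndirectCommand: five 32-bit fields, 20 bytes.
[StructLayout(LayoutKind.Sequential)]
public struct DrawElementsIndirectCommand
{
    public uint Count;
    public uint InstanceCount;
    public uint FirstIndex;
    public int BaseVertex;
    public uint BaseInstance;
}

// Hypothetical, trimmed-down view of a group: only what the builder needs.
public sealed record GroupInfo(
    bool IsTransparent, float SortDistance,
    uint IndexCount, uint FirstIndex, int BaseVertex,
    uint FirstInstance, uint InstanceCount);

public static class IndirectBuilder
{
    public const int CommandSize = 20; // 5 fields x 4 bytes

    // Opaque commands come first, sorted front-to-back; transparent commands
    // follow in input order. Empty groups are skipped.
    public static (DrawElementsIndirectCommand[] Commands, int OpaqueCount,
                   int TransparentCount, int TransparentByteOffset)
        Build(IReadOnlyList<GroupInfo> groups)
    {
        var opaque = groups.Where(g => !g.IsTransparent && g.InstanceCount > 0)
                           .OrderBy(g => g.SortDistance)
                           .ToList();
        var transparent = groups.Where(g => g.IsTransparent && g.InstanceCount > 0)
                                .ToList();

        var commands = opaque.Concat(transparent)
            .Select(g => new DrawElementsIndirectCommand
            {
                Count = g.IndexCount,
                InstanceCount = g.InstanceCount,
                FirstIndex = g.FirstIndex,
                BaseVertex = g.BaseVertex,
                BaseInstance = g.FirstInstance,
            })
            .ToArray();

        // The transparent pass starts reading right after the opaque section.
        return (commands, opaque.Count, transparent.Count, opaque.Count * CommandSize);
    }
}
```

With three opaque groups and one transparent group, `TransparentByteOffset` comes out as 3 × 20 = 60 bytes, the value passed as the `indirect` pointer in step 8.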
|
||||
|
||||
Diagnostic timing (under `ACDREAM_WB_DIAG=1`):
|
||||
|
||||
- CPU: `Stopwatch` started at the top of `Draw()`, stopped at the bottom. Median + 95th-percentile flushed in the 5-second `[WB-DIAG]` rollup.
|
||||
- GPU: `glGenQueries` two query objects (one for opaque, one for transparent). `glBeginQuery(TIME_ELAPSED) / glEndQuery` around each `glMultiDrawElementsIndirect`. Result polled with `GL_QUERY_RESULT_NO_WAIT` on the next frame's start; if not ready, drop the sample and try again.
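
The median and 95th-percentile figures in the rollup can be computed as below. This is a sketch only: the `DiagRollup` name is hypothetical, and the nearest-rank percentile definition is an assumption, since the spec does not say which definition the rollup uses.

```csharp
using System;
using System.Collections.Generic;

public static class DiagRollup
{
    // Nearest-rank percentile over a sorted copy of the samples; p in (0, 100].
    public static double Percentile(IReadOnlyList<double> samples, double p)
    {
        if (samples.Count == 0)
            throw new ArgumentException("no samples", nameof(samples));

        var sorted = new List<double>(samples);
        sorted.Sort();

        // 1-based rank: the smallest rank covering at least p percent of samples.
        int rank = (int)Math.Ceiling(p * sorted.Count / 100.0);
        return sorted[Math.Clamp(rank, 1, sorted.Count) - 1];
    }

    // Returns the two figures reported per 5-second window.
    public static (double Median, double P95) Summarize(IReadOnlyList<double> samplesUs)
        => (Percentile(samplesUs, 50), Percentile(samplesUs, 95));
}
```

Dropped GPU samples (query result not ready) simply never enter the list, so the GPU window may hold fewer samples than the CPU window.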
|
||||
|
||||
### 4.3 New shader files
|
||||
|
||||
`src/AcDream.App/Shaders/mesh_modern.vert`:
|
||||
|
||||
```glsl
#version 430 core
#extension GL_ARB_bindless_texture : require
#extension GL_ARB_shader_draw_parameters : require

layout(location = 0) in vec3 aPosition;
layout(location = 1) in vec3 aNormal;
layout(location = 2) in vec2 aTexCoord;

struct InstanceData {
    mat4 transform;
    // Reserved for Phase B.4 follow-up (selection-blink retail-faithful highlight):
    // vec4 highlightColor; // RGBA — when non-zero alpha, fragment shader mixes into output.
    // Add field here, increase stride to 80 bytes, and read at fragment via flat varying.
};

struct BatchData {
    uvec2 textureHandle;  // bindless handle for sampler2DArray
    uint textureLayer;    // layer index (always 0 for per-instance composites)
    uint flags;           // reserved for future use
};

layout(std430, binding = 0) readonly buffer InstanceBuffer {
    InstanceData Instances[];
};

layout(std430, binding = 1) readonly buffer BatchBuffer {
    BatchData Batches[];
};

layout(std140, binding = 1) uniform LightingUbo {
    vec4 uAmbient;
    vec4 uSunDir;
    vec4 uSunColor;
    // matches existing acdream lighting UBO; do not change layout
};

uniform mat4 uViewProjection;
uniform int uRenderPass;  // 0=opaque, 1=transparent (consumed in fragment shader)

out vec3 vNormal;
out vec2 vTexCoord;
out flat uvec2 vTextureHandle;
out flat uint vTextureLayer;

void main() {
    int instanceIndex = gl_BaseInstanceARB + gl_InstanceID;
    mat4 model = Instances[instanceIndex].transform;

    vec4 worldPos = model * vec4(aPosition, 1.0);
    gl_Position = uViewProjection * worldPos;

    vNormal = normalize(mat3(model) * aNormal);
    vTexCoord = aTexCoord;

    BatchData b = Batches[gl_DrawIDARB];
    vTextureHandle = b.textureHandle;
    vTextureLayer = b.textureLayer;
}
```
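
On the CPU side, the `BatchData` struct above needs a matching 16-byte layout: under `std430` a `uvec2` is 8 bytes with 8-byte alignment, followed by two `uint`s, giving an array stride of 16. A minimal sketch follows. The `BatchDataCpu` name is hypothetical, and the low-word-in-`x` split reflects how `GL_ARB_bindless_texture` packs a 64-bit handle into a `uvec2`.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical CPU mirror of the GLSL BatchData struct (std430, 16 bytes).
[StructLayout(LayoutKind.Sequential)]
public struct BatchDataCpu
{
    public uint HandleLo;      // textureHandle.x: low 32 bits of the handle
    public uint HandleHi;      // textureHandle.y: high 32 bits of the handle
    public uint TextureLayer;  // always 0 in N.5
    public uint Flags;         // reserved

    public static BatchDataCpu From(ulong handle, uint layer)
    {
        return new BatchDataCpu
        {
            HandleLo = (uint)(handle & 0xFFFFFFFFUL),
            HandleHi = (uint)(handle >> 32),
            TextureLayer = layer,
            Flags = 0,
        };
    }

    // Reassembles the 64-bit handle; useful for round-trip checks in tests.
    public ulong Handle => ((ulong)HandleHi << 32) | HandleLo;
}
```

Storing a single `ulong` field instead would produce the same bytes on a little-endian host; the two-`uint` form just makes the `uvec2` correspondence explicit.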
|
||||
|
||||
`src/AcDream.App/Shaders/mesh_modern.frag`:
|
||||
|
||||
```glsl
#version 430 core
#extension GL_ARB_bindless_texture : require

in vec3 vNormal;
in vec2 vTexCoord;
in flat uvec2 vTextureHandle;
in flat uint vTextureLayer;

layout(std140, binding = 1) uniform LightingUbo {
    vec4 uAmbient;
    vec4 uSunDir;
    vec4 uSunColor;
};

uniform int uRenderPass;

out vec4 FragColor;

void main() {
    sampler2DArray tex = sampler2DArray(vTextureHandle);
    vec4 color = texture(tex, vec3(vTexCoord, float(vTextureLayer)));

    if (uRenderPass == 0) {
        // Opaque pass: discard soft pixels (alpha cutout), write to depth
        if (color.a < 0.95) discard;
    } else {
        // Transparent pass: discard hard pixels (already drawn opaque), no depth write
        if (color.a >= 0.95) discard;
        if (color.a < 0.05) discard;  // skip totally-empty fragments — perf for large transparent overdraw
    }

    // Diffuse lighting (preserved from acdream's existing lighting model)
    vec3 N = normalize(vNormal);
    vec3 L = normalize(uSunDir.xyz);
    float diff = max(dot(N, L), 0.0);
    vec3 lit = uAmbient.rgb + uSunColor.rgb * diff;
    color.rgb *= clamp(lit, 0.0, 1.0);

    FragColor = color;
}
```
|
||||
|
||||
Differences from WB's `StaticObjectModern.*`:
|
||||
|
||||
- Drops `uActiveCells[]` cell-filtering (acdream culls cells on CPU).
|
||||
- Drops `uDrawIDOffset` (acdream issues full passes, no pagination).
|
||||
- Drops `uHighlightColor` (deferred to Phase B.4 follow-up; reserved as per-instance `highlightColor` field, not a global uniform).
|
||||
- Adapts the lighting model to acdream's existing UBO at binding=1 instead of WB's `SceneData` UBO.
|
||||
- Uses 1-layer `sampler2DArray` for ALL textures (WB uses multi-layer atlases — same shader works for both shapes).
|
||||
|
||||
---
|
||||
|
||||
## 5. Per-frame data flow walk-through
|
||||
|
||||
A concrete trace. Visible work for frame N:
|
||||
|
||||
| Group | GfxObj | Surface | Translucency | Instances |
|---|---|---|---|---|
| 0 | oak tree | bark | Opaque | 12 |
| 1 | oak tree | leaves | AlphaBlend | 12 |
| 2 | drudge | skin (palette override) | Opaque | 1 |
| 3 | drudge | eyes | Opaque | 1 |
|
||||
|
||||
**Instance SSBO** (binding=0), 26 entries (each batch contributes its own copy of the entity matrix):
|
||||
```
|
||||
[0..11] = oak instance matrices (group 0 — bark)
|
||||
[12..23] = oak instance matrices (group 1 — leaves)
|
||||
[24] = drudge instance matrix (group 2 — skin)
|
||||
[25] = drudge instance matrix (group 3 — eyes)
|
||||
```
|
||||
|
||||
**Batch SSBO** (binding=1), 4 entries indexed by `gl_DrawIDARB`:
|
||||
```
|
||||
Batches[0] = (oak_bark_handle, layer=0, flags=0)
|
||||
Batches[1] = (oak_leaves_handle, layer=0, flags=0)
|
||||
Batches[2] = (drudge_skin_handle_with_palette, layer=0, flags=0)
|
||||
Batches[3] = (drudge_eyes_handle, layer=0, flags=0)
|
||||
```
|
||||
|
||||
**Indirect buffer** (single buffer, two sections):
|
||||
```
_indirectBuffer[0..2] = opaque section (3 entries, sorted front-to-back)
    [0] = (count=oakBarkIdx, instanceCount=12, firstIndex=oakBarkFI, baseVertex=oakBV, baseInstance=0)
    [1] = (count=drudgeSkinIdx, instanceCount=1, firstIndex=drudgeSkinFI, baseVertex=drudgeBV, baseInstance=24)
    [2] = (count=drudgeEyesIdx, instanceCount=1, firstIndex=drudgeEyesFI, baseVertex=drudgeBV, baseInstance=25)

_indirectBuffer[3] = transparent section (1 entry)
    [3] = (count=oakLeavesIdx, instanceCount=12, firstIndex=oakLeavesFI, baseVertex=oakBV, baseInstance=12)

_opaqueDrawCount = 3; _transparentDrawCount = 1; _transparentByteOffset = 3 * sizeof(DEIC) = 60.
```
|
||||
|
||||
**Shader access pattern** (per vertex):
|
||||
```glsl
|
||||
int instanceIndex = gl_BaseInstanceARB + gl_InstanceID; // unique per (group, instance) pair
|
||||
mat4 model = Instances[instanceIndex].transform;
|
||||
BatchData b = Batches[gl_DrawIDARB]; // shared across all verts in this draw
|
||||
sampler2DArray tex = sampler2DArray(b.textureHandle);
|
||||
vec4 color = texture(tex, vec3(aTexCoord, float(b.textureLayer)));
|
||||
```
|
||||
|
||||
**Per-frame CPU GL calls** (entity rendering, total):
|
||||
- 3× `glBufferData` (instance SSBO, batch SSBO, indirect buffer).
|
||||
- 1× `glBindVertexArray(globalVAO)`.
|
||||
- 2× `glBindBufferBase` (SSBOs at bindings 0 + 1).
|
||||
- 1× `glBindBuffer(DRAW_INDIRECT_BUFFER, _indirectBuffer)`.
|
||||
- 2× `glMultiDrawElementsIndirect` (one opaque, one transparent).
|
||||
- ~5 state changes (blend, depth mask, render pass uniform).
|
||||
|
||||
Total: ~15 GL calls per frame for entity rendering (the tally above comes to about 14), regardless of group count. The N.4 baseline is a few hundred.
|
||||
|
||||
---
|
||||
|
||||
## 6. Translucent rendering detail
|
||||
|
||||
Per Decision 2: WB's two-pass alpha-test pattern.
|
||||
|
||||
**Group classification.** `ClassifyBatches` puts groups into one of two arrays:
|
||||
|
||||
- **Opaque indirect:** `TranslucencyKind.Opaque` and `TranslucencyKind.ClipMap`.
|
||||
- **Transparent indirect:** `TranslucencyKind.AlphaBlend`, `Additive`, `InvAlpha` all merged. Per Decision 2, additive renders as alpha-blend; falsifiable at visual verification.
|
||||
|
||||
Opaque groups stay sorted front-to-back by `SortDistance` (preserved from N.4 — depth-test reject of overdrawn fragments is a meaningful win on dense scenes).
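
The classification rule reduces to a small mapping. The sketch below uses the `TranslucencyKind` member names listed above; the enum's member order, the `RenderPass` enum, and the `PassClassifier` class are assumptions made for illustration, not the signature of the real `ClassifyBatches`.

```csharp
using System;

// Member names come from the spec; their order here is an assumption.
public enum TranslucencyKind { Opaque, ClipMap, AlphaBlend, Additive, InvAlpha }

// Values match the uRenderPass uniform: 0 = opaque, 1 = transparent.
public enum RenderPass { Opaque = 0, Transparent = 1 }

public static class PassClassifier
{
    public static RenderPass Classify(TranslucencyKind kind) => kind switch
    {
        // Cutout surfaces are alpha-tested in the opaque pass.
        TranslucencyKind.Opaque or TranslucencyKind.ClipMap => RenderPass.Opaque,

        // Per Decision 2, additive and inverse-alpha merge into alpha-blend.
        TranslucencyKind.AlphaBlend or TranslucencyKind.Additive
            or TranslucencyKind.InvAlpha => RenderPass.Transparent,

        _ => throw new ArgumentOutOfRangeException(nameof(kind), kind, "unknown kind"),
    };
}
```

If the additive fallback is ever needed, this mapping is the single place where `Additive` would split off into its own bucket.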
|
||||
|
||||
**Pass GL state:**
|
||||
|
||||
```csharp
// Opaque pass
_gl.Disable(EnableCap.Blend);
_gl.DepthMask(true);
_gl.Enable(EnableCap.CullFace); _gl.CullFace(TriangleFace.Back); _gl.FrontFace(FrontFaceDirection.Ccw);
_shader.SetInt("uRenderPass", 0);
_gl.BindBuffer(BufferTargetARB.DrawIndirectBuffer, _indirectBuffer);
_gl.MultiDrawElementsIndirect(PrimitiveType.Triangles, DrawElementsType.UnsignedShort,
    indirect: (void*)0, drawcount: _opaqueDrawCount, stride: (uint)sizeof(DEIC));

// Transparent pass
_gl.Enable(EnableCap.Blend);
_gl.BlendFunc(BlendingFactor.SrcAlpha, BlendingFactor.OneMinusSrcAlpha);
_gl.DepthMask(false);
_shader.SetInt("uRenderPass", 1);
_gl.MultiDrawElementsIndirect(PrimitiveType.Triangles, DrawElementsType.UnsignedShort,
    indirect: (void*)_transparentByteOffset, drawcount: _transparentDrawCount, stride: (uint)sizeof(DEIC));

// Cleanup
_gl.DepthMask(true); _gl.Disable(EnableCap.Blend); _gl.BindVertexArray(0);
```
|
||||
|
||||
**Visual verification gate (additive fallback plan).** During Week 2-3 visual verification, look at:
|
||||
- Holtburg courtyard, dungeon entrance — confirm scenery + characters identical.
|
||||
- Foundry interior — magic-themed content with potentially additive-flagged surfaces.
|
||||
- Any glowing weapon decals, magical aura effects, or self-luminous textures observed.
|
||||
|
||||
If a visible regression appears (faded glow, missing additive bloom): amend spec to add a third indirect call within the transparent pass with `glBlendFunc(SrcAlpha, One)`. Group classification splits Additive into its own bucket. ~30-min change.
|
||||
|
||||
---
|
||||
|
||||
## 7. Error handling and fallback
|
||||
|
||||
### 7.1 GPU capability detection
|
||||
|
||||
WB's `OpenGLGraphicsDevice` already detects:
|
||||
- `HasOpenGL43` (required for SSBOs and multi-draw indirect).
|
||||
- `HasBindless` (required for bindless texture handles).
|
||||
|
||||
`WbDrawDispatcher` is only constructed when `WbFoundationFlag.Enabled` is true, which gates on `_useModernRendering = HasOpenGL43 && HasBindless`. We inherit WB's gating.
|
||||
|
||||
**Additional check:** `GL_ARB_shader_draw_parameters` (for `gl_BaseInstanceARB`, `gl_DrawIDARB`). Standard on GL 4.6, available as extension on 4.3+. Add to N.5's capability check; if missing, `WbDrawDispatcher` constructor logs a one-time warning and the foundation flag flips off (falls back to `InstancedMeshRenderer`).
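
The extension check itself is a set lookup. A minimal sketch, assuming the driver's extension names have already been collected into a list (in a live context that would come from enumerating `GL_EXTENSIONS`); `ModernPathCapabilities` is a hypothetical name, and what the caller does with a non-empty result is left open here.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ModernPathCapabilities
{
    // The two extensions N.5 needs on top of GL 4.3.
    public static readonly string[] RequiredExtensions =
    {
        "GL_ARB_bindless_texture",
        "GL_ARB_shader_draw_parameters",
    };

    // Returns the required extensions that are absent from the reported list.
    public static IReadOnlyList<string> Missing(IEnumerable<string> reported)
    {
        var have = new HashSet<string>(reported, StringComparer.Ordinal);
        return RequiredExtensions.Where(e => !have.Contains(e)).ToList();
    }
}
```

Returning the missing names rather than a bare boolean lets the one-time diagnostic say exactly which extension the driver lacks.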
|
||||
|
||||
### 7.2 Shader compile failure
|
||||
|
||||
If `mesh_modern.vert/.frag` fails to compile (driver bug, GLSL version mismatch, extension issue): catch the compile exception in `WbDrawDispatcher` constructor, log the GLSL info log + GPU vendor/renderer string ONCE, flip `WbFoundationFlag.Enabled = false` for the session, fall back to `InstancedMeshRenderer`. Do not crash.
|
||||
|
||||
### 7.3 Non-resident handle (the bindless foot-gun)
|
||||
|
||||
Sampling a non-resident handle causes undefined behavior (driver-dependent: black texture, GPU fault, device-lost).
|
||||
|
||||
Mitigation in code: `TextureCache.MakeResidentHandle` is the only API that produces a handle, and it makes the handle resident in the same call. There is no API surface that produces a non-resident handle. Defense-in-depth: dispatcher asserts `BindlessTextureHandle != 0` before queuing a draw (zero handles get filtered out, same as zero `surfaceId` does today).
|
||||
|
||||
### 7.4 Indirect command corruption
|
||||
|
||||
`count`, `firstIndex`, `baseVertex` come from WB's `ObjectRenderBatch` (never user input; WB-internal correctness). `instanceCount` is `grp.Matrices.Count` (we control). `baseInstance` is `grp.FirstInstance` (we control, computed cumulatively). Bug-class is "WB-internal corruption + our cumulative-offset bug", the same surface area N.4's `BaseInstance` handling already trusts. Add a debug-build assertion: cumulative `baseInstance` values must be strictly increasing in matrix-layout order, that is, as `FirstInstance` is assigned. The check does not apply in indirect-command order, because the opaque/transparent split reorders commands (the §5 trace ends up with 0, 24, 25, 12).
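
A sketch of that debug assertion, assuming the check runs over non-empty groups in the order their matrices are laid out. `InstanceLayout` and both method names are hypothetical.

```csharp
using System;
using System.Collections.Generic;

public static class InstanceLayout
{
    // Assigns each non-empty group a contiguous slice of the instance buffer
    // and returns the first-instance offsets in layout order.
    public static uint[] AssignFirstInstances(IReadOnlyList<int> instanceCounts)
    {
        var offsets = new List<uint>();
        uint cursor = 0;
        foreach (int count in instanceCounts)
        {
            if (count <= 0) continue;  // empty groups are skipped
            offsets.Add(cursor);
            cursor += (uint)count;
        }
        return offsets.ToArray();
    }

    // True when every value is greater than its predecessor.
    public static bool IsStrictlyIncreasing(IReadOnlyList<uint> values)
    {
        for (int i = 1; i < values.Count; i++)
            if (values[i] <= values[i - 1])
                return false;
        return true;
    }
}
```

In a debug build the dispatcher could wrap the second call in `Debug.Assert`; a failure would point at the cumulative-offset arithmetic rather than at WB's batch data.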
|
||||
|
||||
### 7.5 Disposal order
|
||||
|
||||
Bindless handles must be made non-resident before their underlying textures are deleted (driver UB otherwise). `TextureCache.Dispose` handles this:
|
||||
1. Iterate `_bindlessHandlesByGlName.Values`, call `glMakeTextureHandleNonResidentARB(handle)`.
|
||||
2. Call `_glExtensions.MakeAllNonResidentARB` if available (some drivers prefer batch).
|
||||
3. Then `glDeleteTextures` proceeds as today.
|
||||
|
||||
Dispatcher's own buffer cleanup (`_instanceSsbo`, `_batchSsbo`, `_indirectBuffer`) via `glDeleteBuffers`.
|
||||
|
||||
### 7.6 Persistent first-failure diagnostic
|
||||
|
||||
If shader compile fails OR an extension check fails OR `glMultiDrawElementsIndirect` raises `GL_INVALID_OPERATION` (as reported by `glGetError`) on the first frame: log ONCE with GPU vendor/renderer string + GLSL info log. Don't spam. User pastes the line into a bug report; we know exactly where to look.
|
||||
|
||||
---
|
||||
|
||||
## 8. Testing and acceptance
|
||||
|
||||
### 8.1 Unit / conformance tests
|
||||
|
||||
- **`TextureCacheBindlessTests`** — for each `Bindless`-suffixed `GetOrUpload*`: returns non-zero `ulong`, returns same handle for same key (cache hit), distinct keys yield distinct handles, returned handle is resident per GL state query.
|
||||
- **`WbDrawDispatcherIndirectBuilderTests`** — pure CPU test: given a fixture of `(entity, mesh, batch)` tuples, verify the indirect buffer layout: `count` / `firstIndex` / `baseVertex` / `baseInstance` per group, opaque section sorted front-to-back, transparent section in classification order (no sort — back-to-front sort can be added in a follow-up if measured useful).
|
||||
- **`WbDrawDispatcherTranslucencyTests`** — verify groups land in correct indirect buffer (opaque vs transparent) per `TranslucencyKind`. `Additive`/`InvAlpha` go to transparent. `ClipMap` goes to opaque. Empty groups skipped.
|
||||
- **Existing N.4 tests stay green.** All 60 tests captured by `FullyQualifiedName~Wb|MatrixComposition` filter remain at 60/0.
|
||||
|
||||
### 8.2 Visual verification
|
||||
|
||||
Same gate as N.4 used. Live ACE + retail dat, in-world testing.
|
||||
|
||||
- **Holtburg courtyard** — characters + scenery + buildings render identically to N.4. No missing entities, no z-fighting, no exploded parts.
|
||||
- **Foundry interior** — dense static-object scene, stress-tests indirect call count and translucency classification.
|
||||
- **Indoor → outdoor cell transition** — confirms cell visibility filtering still works (we cull on CPU; dispatcher should never see invisible-cell entities).
|
||||
- **Drudge / character close-up** — confirms Issue #47 close-detail mesh preservation.
|
||||
- **Magic content (additive fallback check)** — Foundry runes, glowing weapons if observable, boss models with luminous decals. Trigger spec amendment if regression spotted.
|
||||
|
||||
User-confirms each. These are visual identity checks against the running N.4 behavior (use `git stash` of N.5 changes + relaunch as the comparison baseline).
|
||||
|
||||
### 8.3 Perf measurement (the win gate)
|
||||
|
||||
`[WB-DIAG]` augmented:
|
||||
|
||||
```
|
||||
[WB-DIAG] entSeen=N entDrawn=M ... drawsIssued=K groups=G (existing)
|
||||
[WB-DIAG] cpu_us=Xmedian/Y95p gpu_us=Zmedian/W95p (new)
|
||||
```
|
||||
|
||||
Capture before/after numbers in fixed scenes/cameras:
|
||||
|
||||
| Scene | Camera position | Metric |
|---|---|---|
| Holtburg courtyard | 30m elevated, looking SW | `cpu`, `gpu`, `drawsIssued` |
| Foundry interior | character spawn, default heading | `cpu`, `gpu`, `drawsIssued` |
| Open landscape | terrain wander, no entities | `cpu`, `gpu`, `drawsIssued` (sanity) |
|
||||
|
||||
**Acceptance gates** (paste into SHIP commit message):
|
||||
|
||||
- Visual identity to N.4 — confirmed via §8.2.
|
||||
- CPU dispatcher time ≤ 70% of N.4 in Holtburg courtyard (target: ≥30% reduction).
|
||||
- GPU rendering time within ±10% of N.4 (sanity: no regression).
|
||||
- `drawsIssued ≤ 5 per pass` (down from "few hundred per pass").
|
||||
- All tests green — 60+ Wb tests + new bindless/indirect tests.
|
||||
- `ACDREAM_USE_WB_FOUNDATION=0` still works — `InstancedMeshRenderer` fallback runs and renders correctly.
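
The three numeric gates are simple comparisons against the N.4 baseline. A sketch with hypothetical names; the thresholds (70%, ±10%, 5 draws per pass) come from the list above, and the timing arguments are whatever the `[WB-DIAG]` medians report.

```csharp
using System;

public static class AcceptanceGates
{
    // CPU dispatcher time must be at most 70% of the N.4 figure.
    public static bool CpuGate(double n4CpuUs, double n5CpuUs)
        => n5CpuUs <= 0.70 * n4CpuUs;

    // GPU time must stay within plus or minus 10% of the N.4 figure.
    public static bool GpuGate(double n4GpuUs, double n5GpuUs)
        => Math.Abs(n5GpuUs - n4GpuUs) <= 0.10 * n4GpuUs;

    // At most five draw calls issued per pass.
    public static bool DrawGate(int drawsIssuedPerPass)
        => drawsIssuedPerPass <= 5;
}
```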
|
||||
|
||||
### 8.4 Long-session sanity check
|
||||
|
||||
Hour-long session with `ACDREAM_WB_DIAG=1`. Watch resident-handle count grow. Expected: bounded plateau under 5K once content set is fully traversed. If unbounded growth, residency policy revisit required in N.6.
|
||||
|
||||
---
|
||||
|
||||
## 9. Risks
|
||||
|
||||
| Risk | Likelihood | Impact | Mitigation |
|---|---|---|---|
| Driver bug in bindless residency | Low (mature in 2025+ drivers) | Crash / black textures | One-time logging on first failure; legacy fallback under flag-off |
| Driver bug in `glMultiDrawElementsIndirect` | Low | GL_INVALID_OPERATION | Capability check + first-failure logging + fallback |
| Resident handle count exceeds driver limit in long session | Low (acdream content is bounded) | Cumulative GPU memory pressure → eventual eviction surprises | `[WB-DIAG]` resident-count log; revisit eviction in N.6 if it grows unbounded |
| Shader compile fails on weird GPU | Medium-low | First-launch failure | Compile-error catch + fallback to `InstancedMeshRenderer` |
| Additive fidelity regression on rare GfxObj surfaces | Medium | Subtle visual difference | Visual verification at magic-themed content; spec amendment for additive sub-pass if found |
| `gl_BaseInstanceARB` fields not advancing per-instance attribs we still use | Low (we drop attribs entirely) | Wrong matrices | All instance data via SSBO; no vertex attrib at locations 3-6 to misalign |
| SSBO indexing GPU cost worse than uniform-array | Low (well-optimized in modern drivers) | Possible GPU time regression | GL timer queries detect; if observed, fall back to uniform array of bounded size |
| Persistent-mapped buffer foot-guns (chosen NOT to use in N.5) | n/a | n/a | Decision 7 defers to N.6 |
| Per-instance highlight (selection blink) feature creep | Low | Scope grows | Decision 8 defers; field reserved in design doc |
|
||||
|
||||
---
|
||||
|
||||
## 10. Out of scope (explicitly)
|
||||
|
||||
The following are NOT N.5 work. They become possible follow-ons.
|
||||
|
||||
- **WB's `TextureAtlasManager` adoption for atlas tier.** N.5 keeps acdream's `TextureCache` as the texture owner for everything. Atlas adoption is N.6+ if memory pressure shows up.
|
||||
- **Persistent-mapped buffer ring with sync fences.** Decision 7. N.6 candidate if profiling shows residual `glBufferData` cost.
|
||||
- **GPU-side culling (compute pre-pass).** Future phase.
|
||||
- **Texture array repacking for multi-layer per-instance composites.** Future, if many palette-overrides actually share dimensions and could be packed.
|
||||
- **Selection-blink highlight color.** Decision 8. Phase B.4 follow-up. Field reserved in `InstanceData` design (extend stride to 80 bytes when implementing).
|
||||
- ~~**Deletion of legacy `InstancedMeshRenderer`.** N.6.~~ **Done in N.5 ship amendment** — `InstancedMeshRenderer`, `StaticMeshRenderer`, and `WbFoundationFlag` were deleted in the retirement commit.
|
||||
- **Terrain wiring through WB.** Future.
|
||||
|
||||
---
|
||||
|
||||
## 11. Open questions
|
||||
|
||||
None outstanding. All 8 brainstorm questions resolved + 1 clarification on highlight semantics. Ready for plan.
|
||||
|
||||
---
|
||||
|
||||
*End of design.*
|
||||
|
|
@ -14,6 +14,7 @@
|
|||
</ItemGroup>
|
||||
<ItemGroup>
|
||||
<PackageReference Include="Silk.NET.OpenGL" Version="2.23.0" />
|
||||
<PackageReference Include="Silk.NET.OpenGL.Extensions.ARB" Version="2.23.0" />
|
||||
<PackageReference Include="Silk.NET.Windowing" Version="2.23.0" />
|
||||
<PackageReference Include="Silk.NET.Input" Version="2.23.0" />
|
||||
<PackageReference Include="Silk.NET.OpenAL" Version="2.23.0" />
|
||||
|
|
|
|||
|
|
@ -25,14 +25,17 @@ public sealed class GameWindow : IDisposable
|
|||
private DatCollection? _dats;
|
||||
private float _lastMouseX;
|
||||
private float _lastMouseY;
|
||||
private InstancedMeshRenderer? _staticMesh;
|
||||
private Shader? _meshShader;
|
||||
private TextureCache? _textureCache;
|
||||
/// <summary>Phase N.4: WB-backed rendering pipeline adapter. Non-null only
|
||||
/// when <c>ACDREAM_USE_WB_FOUNDATION=1</c> is set; null otherwise.</summary>
|
||||
/// <summary>Phase N.4+: WB-backed rendering pipeline adapter. Always non-null
|
||||
/// after <c>OnLoad</c> completes (modern path is mandatory as of N.5).</summary>
|
||||
private AcDream.App.Rendering.Wb.WbMeshAdapter? _wbMeshAdapter;
|
||||
private AcDream.App.Rendering.Wb.EntitySpawnAdapter? _wbEntitySpawnAdapter;
|
||||
private AcDream.App.Rendering.Wb.WbDrawDispatcher? _wbDrawDispatcher;
|
||||
/// <summary>Phase N.5: ARB_bindless_texture + ARB_shader_draw_parameters
|
||||
/// support. Required at startup — missing bindless throws
|
||||
/// <see cref="NotSupportedException"/> in <c>OnLoad</c>.</summary>
|
||||
private AcDream.App.Rendering.Wb.BindlessSupport? _bindlessSupport;
|
||||
private SamplerCache? _samplerCache;
|
||||
private DebugLineRenderer? _debugLines;
|
||||
// K-fix4 (2026-04-26): default OFF. The orange BSP / green cylinder
|
||||
|
|
@ -966,10 +969,6 @@ public sealed class GameWindow : IDisposable
|
|||
Path.Combine(shadersDir, "terrain.vert"),
|
||||
Path.Combine(shadersDir, "terrain.frag"));
|
||||
|
||||
_meshShader = new Shader(_gl,
|
||||
Path.Combine(shadersDir, "mesh_instanced.vert"),
|
||||
Path.Combine(shadersDir, "mesh_instanced.frag"));
|
||||
|
||||
// Phase G.1/G.2: shared scene-lighting UBO. Stays bound at
|
||||
// binding=1 for the lifetime of the process — every shader that
|
||||
// declares `layout(std140, binding = 1) uniform SceneLighting`
|
||||
|
|
@ -1419,7 +1418,43 @@ public sealed class GameWindow : IDisposable
|
|||
_heightTable = heightTable;
|
||||
_surfaceCache = new Dictionary<uint, AcDream.Core.Terrain.SurfaceInfo>();
|
||||
|
||||
_textureCache = new TextureCache(_gl, _dats);
|
||||
// N.5: detect ARB_bindless_texture + ARB_shader_draw_parameters.
|
||||
// The modern path (SSBO + glMultiDrawElementsIndirect + bindless textures)
|
||||
// is mandatory as of Phase N.5 — missing extensions throw at startup with
|
||||
// a clear error so users can file a real bug report rather than silently
|
||||
// falling back to a half-working renderer.
|
||||
if (AcDream.App.Rendering.Wb.BindlessSupport.TryCreate(_gl, out var bindless))
|
||||
{
|
||||
if (bindless!.HasShaderDrawParameters(_gl))
|
||||
{
|
||||
_bindlessSupport = bindless;
|
||||
Console.WriteLine("[N.5] modern path capabilities present (bindless + ARB_shader_draw_parameters)");
|
||||
}
|
||||
else
|
||||
{
|
||||
Console.WriteLine("[N.5] GL_ARB_shader_draw_parameters not present — modern path not available");
|
||||
}
|
||||
}
|
||||
else
|
||||
{
|
||||
Console.WriteLine("[N.5] GL_ARB_bindless_texture not present — modern path not available");
|
||||
}
|
||||
|
||||
if (_bindlessSupport is null)
|
||||
{
|
||||
throw new NotSupportedException(
|
||||
"acdream requires GL_ARB_bindless_texture + GL_ARB_shader_draw_parameters " +
|
||||
"(GL 4.3+ with bindless support). Your GPU/driver does not expose these extensions. " +
|
||||
"If this is unexpected, please file a bug report with your GPU vendor + driver version.");
|
||||
}
|
||||
|
||||
// Mesh shader always loads (modern path is the only path).
|
||||
_meshShader = new Shader(_gl,
|
||||
Path.Combine(shadersDir, "mesh_modern.vert"),
|
||||
Path.Combine(shadersDir, "mesh_modern.frag"));
|
||||
Console.WriteLine("[N.5] mesh_modern shader loaded");
|
||||
|
||||
_textureCache = new TextureCache(_gl, _dats, _bindlessSupport);
|
||||
// Two persistent GL sampler objects (Repeat + ClampToEdge) so
|
||||
// the sky pass can pick wrap mode per submesh without mutating
|
||||
// shared per-texture wrap state. See SamplerCache + the
|
||||
|
|
@ -1427,17 +1462,14 @@ public sealed class GameWindow : IDisposable
|
|||
// references/WorldBuilder/Chorizite.OpenGLSDLBackend/OpenGLGraphicsDevice.cs:115-132.
|
||||
_samplerCache = new SamplerCache(_gl);
|
||||
|
||||
// Phase N.4 — WB rendering pipeline foundation. Constructed only when
|
||||
// ACDREAM_USE_WB_FOUNDATION=1 is set; otherwise the legacy renderer
|
||||
// path stays in charge. The full ObjectMeshManager bring-up lives in
|
||||
// WbMeshAdapter (Task 9): OpenGLGraphicsDevice + DefaultDatReaderWriter
|
||||
// + ObjectMeshManager. WbMeshAdapter opens its own file handles for
|
||||
// the dat files (independent of our DatCollection).
|
||||
if (AcDream.App.Rendering.Wb.WbFoundationFlag.IsEnabled)
|
||||
// Phase N.4+N.5 — WB rendering pipeline foundation. The modern path is
|
||||
// mandatory as of N.5 ship amendment: WbMeshAdapter + WbDrawDispatcher
|
||||
// always construct. WbMeshAdapter owns ObjectMeshManager and opens its
|
||||
// own file handles for the dat files (independent of our DatCollection).
|
||||
{
|
||||
var wbLogger = Microsoft.Extensions.Logging.Abstractions.NullLogger<AcDream.App.Rendering.Wb.WbMeshAdapter>.Instance;
|
||||
_wbMeshAdapter = new AcDream.App.Rendering.Wb.WbMeshAdapter(_gl, _datDir, _dats, wbLogger);
|
||||
Console.WriteLine("[N.4] WbFoundation flag is ENABLED — routing static content through ObjectMeshManager.");
|
||||
Console.WriteLine("[N.4+N.5] WB foundation + modern path active — routing all content through ObjectMeshManager.");
|
||||
}
|
||||
|
||||
// Phase N.4 Task 12: construct LandblockSpawnAdapter under the feature flag
|
||||
|
|
@ -1446,60 +1478,51 @@ public sealed class GameWindow : IDisposable
|
|||
// one that carries the adapter so AddLandblock/RemoveLandblock notify WB.
|
||||
// Phase N.4 Task 17: also construct EntitySpawnAdapter for server-spawned
|
||||
// per-instance content under the same flag.
|
||||
// N.5 mandatory path: spawn adapters + dispatcher always construct.
|
||||
// _wbMeshAdapter, _meshShader, _textureCache, and _bindlessSupport are
|
||||
// all guaranteed non-null here (startup throws above if any are missing).
|
||||
{
|
||||
AcDream.App.Rendering.Wb.LandblockSpawnAdapter? wbSpawnAdapter = null;
|
||||
AcDream.App.Rendering.Wb.EntitySpawnAdapter? wbEntitySpawnAdapter = null;
|
||||
if (AcDream.App.Rendering.Wb.WbFoundationFlag.IsEnabled && _wbMeshAdapter is not null)
|
||||
var wbSpawnAdapter = new AcDream.App.Rendering.Wb.LandblockSpawnAdapter(_wbMeshAdapter!);
|
||||
// Sequencer factory: look up Setup + MotionTable from dats and build
|
||||
// an AnimationSequencer. Falls back to a no-op sequencer when the
|
||||
// entity has no motion table (static props, etc.). Uses _animLoader
|
||||
// which is initialised earlier in OnLoad; it is non-null here.
|
||||
var capturedDats = _dats;
|
||||
var capturedAnimLoader = _animLoader;
|
||||
AcDream.Core.Physics.AnimationSequencer SequencerFactory(AcDream.Core.World.WorldEntity e)
|
||||
{
|
||||
wbSpawnAdapter = new AcDream.App.Rendering.Wb.LandblockSpawnAdapter(_wbMeshAdapter);
|
||||
// Sequencer factory: look up Setup + MotionTable from dats and build
|
||||
// an AnimationSequencer. Falls back to a no-op sequencer when the
|
||||
// entity has no motion table (static props, etc.). Uses _animLoader
|
||||
// which is initialised at line 1004; it is non-null here because
|
||||
// OnLoad wires _dats + _animLoader before this block runs.
|
||||
var capturedDats = _dats;
|
||||
var capturedAnimLoader = _animLoader;
|
||||
AcDream.Core.Physics.AnimationSequencer SequencerFactory(AcDream.Core.World.WorldEntity e)
|
||||
if (capturedDats is not null && capturedAnimLoader is not null)
|
||||
{
|
||||
if (capturedDats is not null && capturedAnimLoader is not null)
|
||||
var setup = capturedDats.Get<DatReaderWriter.DBObjs.Setup>(e.SourceGfxObjOrSetupId);
|
||||
if (setup is not null)
|
||||
{
|
||||
var setup = capturedDats.Get<DatReaderWriter.DBObjs.Setup>(e.SourceGfxObjOrSetupId);
|
||||
if (setup is not null)
|
||||
uint mtableId = (uint)setup.DefaultMotionTable;
|
||||
if (mtableId != 0)
|
||||
{
|
||||
uint mtableId = (uint)setup.DefaultMotionTable;
|
||||
if (mtableId != 0)
|
||||
{
|
||||
var mtable = capturedDats.Get<DatReaderWriter.DBObjs.MotionTable>(mtableId);
|
||||
if (mtable is not null)
|
||||
return new AcDream.Core.Physics.AnimationSequencer(setup, mtable, capturedAnimLoader);
|
||||
}
|
||||
// Setup exists but no motion table — no-op sequencer.
|
||||
return new AcDream.Core.Physics.AnimationSequencer(
|
||||
setup,
|
||||
new DatReaderWriter.DBObjs.MotionTable(),
|
||||
capturedAnimLoader);
|
||||
var mtable = capturedDats.Get<DatReaderWriter.DBObjs.MotionTable>(mtableId);
|
||||
if (mtable is not null)
|
||||
return new AcDream.Core.Physics.AnimationSequencer(setup, mtable, capturedAnimLoader);
|
||||
}
|
||||
// Setup exists but no motion table — no-op sequencer.
|
||||
return new AcDream.Core.Physics.AnimationSequencer(
|
||||
setup,
|
||||
new DatReaderWriter.DBObjs.MotionTable(),
|
||||
capturedAnimLoader);
|
||||
}
|
||||
// Complete fallback: empty setup + empty motion table + null loader.
|
||||
return new AcDream.Core.Physics.AnimationSequencer(
|
||||
new DatReaderWriter.DBObjs.Setup(),
|
||||
new DatReaderWriter.DBObjs.MotionTable(),
|
||||
new NullAnimLoader());
|
||||
}
|
||||
wbEntitySpawnAdapter = new AcDream.App.Rendering.Wb.EntitySpawnAdapter(
|
||||
_textureCache, SequencerFactory, _wbMeshAdapter);
|
||||
_wbEntitySpawnAdapter = wbEntitySpawnAdapter;
|
||||
// Complete fallback: empty setup + empty motion table + null loader.
|
||||
return new AcDream.Core.Physics.AnimationSequencer(
|
||||
new DatReaderWriter.DBObjs.Setup(),
|
||||
new DatReaderWriter.DBObjs.MotionTable(),
|
||||
new NullAnimLoader());
|
||||
}
|
||||
var wbEntitySpawnAdapter = new AcDream.App.Rendering.Wb.EntitySpawnAdapter(
|
||||
_textureCache!, SequencerFactory, _wbMeshAdapter!);
|
||||
_wbEntitySpawnAdapter = wbEntitySpawnAdapter;
|
||||
_worldState = new AcDream.App.Streaming.GpuWorldState(wbSpawnAdapter, wbEntitySpawnAdapter);
|
||||
}
|
||||
|
||||
_staticMesh = new InstancedMeshRenderer(_gl, _meshShader, _textureCache, _wbMeshAdapter);
|
||||
|
||||
if (AcDream.App.Rendering.Wb.WbFoundationFlag.IsEnabled
|
||||
&& _wbMeshAdapter is not null && _wbEntitySpawnAdapter is not null)
|
||||
{
|
||||
_wbDrawDispatcher = new AcDream.App.Rendering.Wb.WbDrawDispatcher(
|
||||
_gl, _meshShader, _textureCache, _wbMeshAdapter, _wbEntitySpawnAdapter);
|
||||
_gl, _meshShader!, _textureCache!, _wbMeshAdapter!, _wbEntitySpawnAdapter, _bindlessSupport!);
|
||||
}
|
||||
|
||||
// Phase G.1 sky renderer — its own shader (sky.vert / sky.frag)
|
||||
|
|
@@ -1509,7 +1532,7 @@ public sealed class GameWindow : IDisposable
|
|||
Path.Combine(shadersDir, "sky.vert"),
|
||||
Path.Combine(shadersDir, "sky.frag"));
|
||||
_skyRenderer = new AcDream.App.Rendering.Sky.SkyRenderer(
|
||||
_gl, _dats, skyShader, _textureCache, _samplerCache);
|
||||
_gl, _dats, skyShader, _textureCache!, _samplerCache);
|
||||
|
||||
// Phase G.1 particle renderer — renders rain / snow / spell auras
|
||||
// spawned into the shared ParticleSystem as billboard quads.
|
||||
|
|
@@ -2025,7 +2048,7 @@ public sealed class GameWindow : IDisposable
|
|||
}
|
||||
}
|
||||
|
||||
if (_dats is null || _staticMesh is null) return;
|
||||
if (_dats is null) return;
|
||||
if (spawn.Position is null || spawn.SetupTableId is null)
|
||||
{
|
||||
// Can't place a mesh without both. Most of these are inventory
|
||||
|
|
@@ -2360,10 +2383,9 @@ public sealed class GameWindow : IDisposable
|
|||
continue;
|
||||
}
|
||||
_physicsDataCache.CacheGfxObj(mr.GfxObjId, gfx);
|
||||
var subMeshes = AcDream.Core.Meshing.GfxObjMesh.Build(gfx, _dats);
|
||||
_staticMesh.EnsureUploaded(mr.GfxObjId, subMeshes);
|
||||
if (dumpClothing)
|
||||
{
|
||||
var subMeshes = AcDream.Core.Meshing.GfxObjMesh.Build(gfx, _dats);
|
||||
int tris = 0; int subs = 0;
|
||||
foreach (var sm in subMeshes) { tris += sm.Indices.Length / 3; subs++; }
|
||||
dumpClothingTotalTris += tris;
|
||||
|
|
@@ -5194,44 +5216,25 @@ public sealed class GameWindow : IDisposable
|
|||
portalPlanes, origin.X, origin.Y);
|
||||
}
|
||||
|
||||
// Upload every GfxObj referenced by this landblock's entities.
|
||||
// EnsureUploaded is idempotent so duplicates across landblocks are free.
|
||||
if (_staticMesh is not null)
|
||||
// N.5: WbMeshAdapter.Tick() handles GPU upload for all GfxObj meshes via
|
||||
// ObjectMeshManager.PrepareMeshDataAsync. The legacy EnsureUploaded loop
|
||||
// (and _pendingCellMeshes drain) are retired with InstancedMeshRenderer.
|
||||
// Cache GfxObj physics data (BSP trees) for the physics engine — this
|
||||
// loop is physics-only, not renderer-side.
|
||||
foreach (var entity in lb.Entities)
|
||||
{
|
||||
// Task 8: drain any pending EnvCell room-mesh sub-meshes first.
|
||||
// The worker thread pre-built these CPU-side and stored them in
|
||||
// _pendingCellMeshes. We must upload them here (render thread) before
|
||||
// the per-MeshRef loop below tries to look them up via GfxObjMesh.Build,
|
||||
// which would fail because EnvCell ids (0xAAAA01xx) aren't real GfxObj
|
||||
// dat ids. EnsureUploaded is idempotent so calling it here then seeing
|
||||
// the same id again in the loop below is safe.
|
||||
foreach (var entity in lb.Entities)
|
||||
foreach (var meshRef in entity.MeshRefs)
|
||||
{
|
||||
foreach (var meshRef in entity.MeshRefs)
|
||||
{
|
||||
if (_pendingCellMeshes.TryRemove(meshRef.GfxObjId, out var cellSubMeshes))
|
||||
_staticMesh.EnsureUploaded(meshRef.GfxObjId, cellSubMeshes);
|
||||
}
|
||||
}
|
||||
|
||||
// Now upload regular GfxObj sub-meshes (stabs, scenery, interior stabs).
|
||||
// Skip any ids already uploaded (includes the cell meshes just drained).
|
||||
foreach (var entity in lb.Entities)
|
||||
{
|
||||
foreach (var meshRef in entity.MeshRefs)
|
||||
{
|
||||
// Skip EnvCell synthetic ids — already handled above (or already
|
||||
// uploaded on a prior tick). GfxObj ids are 0x01xxxxxx; Setup ids
|
||||
// are 0x02xxxxxx; anything else is not a GfxObj dat record.
|
||||
if ((meshRef.GfxObjId & 0xFF000000u) != 0x01000000u) continue;
|
||||
var gfx = _dats.Get<DatReaderWriter.DBObjs.GfxObj>(meshRef.GfxObjId);
|
||||
if (gfx is null) continue;
|
||||
_physicsDataCache.CacheGfxObj(meshRef.GfxObjId, gfx);
|
||||
var subMeshes = AcDream.Core.Meshing.GfxObjMesh.Build(gfx, _dats);
|
||||
_staticMesh.EnsureUploaded(meshRef.GfxObjId, subMeshes);
|
||||
}
|
||||
if ((meshRef.GfxObjId & 0xFF000000u) != 0x01000000u) continue;
|
||||
var gfx = _dats.Get<DatReaderWriter.DBObjs.GfxObj>(meshRef.GfxObjId);
|
||||
if (gfx is null) continue;
|
||||
_physicsDataCache.CacheGfxObj(meshRef.GfxObjId, gfx);
|
||||
}
|
||||
}
|
||||
// Drain _pendingCellMeshes to prevent unbounded accumulation.
|
||||
// The data is no longer consumed (WB handles EnvCell geometry through
|
||||
// its own pipeline), but the worker thread still populates this dict.
|
||||
_pendingCellMeshes.Clear();
|
||||
|
||||
// Task 7: register static entities into the ShadowObjectRegistry so the
|
||||
// Transition system can find and collide against them during movement.
|
||||
|
|
@@ -6336,20 +6339,11 @@ public sealed class GameWindow : IDisposable
|
|||
animatedIds.Add(k);
|
||||
}
|
||||
|
||||
if (_wbDrawDispatcher is not null)
|
||||
{
|
||||
_wbDrawDispatcher.Draw(camera, _worldState.LandblockEntries, frustum,
|
||||
neverCullLandblockId: playerLb,
|
||||
visibleCellIds: visibility?.VisibleCellIds,
|
||||
animatedEntityIds: animatedIds);
|
||||
}
|
||||
else
|
||||
{
|
||||
_staticMesh?.Draw(camera, _worldState.LandblockEntries, frustum,
|
||||
neverCullLandblockId: playerLb,
|
||||
visibleCellIds: visibility?.VisibleCellIds,
|
||||
animatedEntityIds: animatedIds);
|
||||
}
|
||||
// N.5: WbDrawDispatcher is always non-null (modern path mandatory).
|
||||
_wbDrawDispatcher!.Draw(camera, _worldState.LandblockEntries, frustum,
|
||||
neverCullLandblockId: playerLb,
|
||||
visibleCellIds: visibility?.VisibleCellIds,
|
||||
animatedEntityIds: animatedIds);
|
||||
|
||||
// Phase G.1 / E.3: draw all live particles after opaque
|
||||
// scene geometry so alpha blending composites correctly.
|
||||
|
|
@@ -8731,11 +8725,10 @@ public sealed class GameWindow : IDisposable
|
|||
_liveSession?.Dispose();
|
||||
_audioEngine?.Dispose(); // Phase E.2: stop all voices, close AL context
|
||||
_wbDrawDispatcher?.Dispose();
|
||||
_staticMesh?.Dispose();
|
||||
_skyRenderer?.Dispose(); // depends on sampler cache; dispose first
|
||||
_samplerCache?.Dispose();
|
||||
_textureCache?.Dispose();
|
||||
_wbMeshAdapter?.Dispose(); // Phase N.4 WB foundation — null when flag off
|
||||
_wbMeshAdapter?.Dispose(); // Phase N.4+N.5 WB foundation (mandatory modern path)
|
||||
|
||||
_meshShader?.Dispose();
|
||||
_terrain?.Dispose();
|
||||
|
|
|
|||
|
|
@@ -1,596 +0,0 @@
|
|||
// src/AcDream.App/Rendering/InstancedMeshRenderer.cs
|
||||
//
|
||||
// True instanced rendering for static-object meshes.
|
||||
// Groups entities by GfxObjId. All instance model matrices are written into
|
||||
// a single shared instance VBO once per frame. Each sub-mesh is drawn with
|
||||
// DrawElementsInstanced — one GL draw call per (GfxObj × sub-mesh) instead
|
||||
// of one per entity. For a scene with N unique GfxObjs and M total entities
|
||||
// this reduces draw calls from M*subMeshes to N*subMeshes.
|
||||
//
|
||||
// Matrix layout:
|
||||
// System.Numerics.Matrix4x4 is row-major. Written to the float[] buffer in
|
||||
// natural memory order (M11..M44). The GLSL shader reads 4 vec4 attributes
|
||||
// (aInstanceRow0-3) and constructs mat4(row0, row1, row2, row3). Because
|
||||
// GLSL mat4() takes column vectors, the rows of the C# matrix become the
|
||||
// columns of the GLSL mat4 — which is the same transpose that UniformMatrix4
|
||||
// with transpose=false produces. Visual result is identical to the old
|
||||
// SetMatrix4("uModel", ...) path.
|
||||
//
|
||||
// Architecture note: public API matches StaticMeshRenderer so GameWindow only
|
||||
// needs to update the shader and uniform setup at the call sites.
|
||||
using System.Numerics;
|
||||
using System.Runtime.InteropServices;
|
||||
using AcDream.App.Rendering.Wb;
|
||||
using AcDream.Core.Meshing;
|
||||
using AcDream.Core.Terrain;
|
||||
using AcDream.Core.World;
|
||||
using Silk.NET.OpenGL;
|
||||
|
||||
namespace AcDream.App.Rendering;
|
||||
|
||||
public sealed unsafe class InstancedMeshRenderer : IDisposable
|
||||
{
|
||||
private readonly GL _gl;
|
||||
private readonly Shader _shader;
|
||||
private readonly TextureCache _textures;
|
||||
|
||||
/// <summary>
|
||||
/// Optional WB adapter. Held but currently unused — Phase N.4 Adjustment 2
|
||||
/// (2026-05-08) reverted Task 9's renderer-level routing. Tier-routing decisions
|
||||
/// (atlas vs per-instance) belong at the spawn-callback layer (Task 11
|
||||
/// LandblockSpawnAdapter for atlas-tier; Task 17 EntitySpawnAdapter for
|
||||
/// per-instance), not in the renderer which is intentionally tier-blind. The
|
||||
/// constructor parameter is preserved so GameWindow's wire-up doesn't shift
|
||||
/// when later tasks need adapter access.
|
||||
/// </summary>
|
||||
private readonly WbMeshAdapter? _wbMeshAdapter;
|
||||
|
||||
// One GPU bundle per unique GfxObj id. Each GfxObj can have multiple sub-meshes.
|
||||
private readonly Dictionary<uint, List<SubMeshGpu>> _gpuByGfxObj = new();
|
||||
|
||||
// Shared instance VBO — filled every frame with all instance model matrices.
|
||||
private readonly uint _instanceVbo;
|
||||
|
||||
// Per-frame scratch: reused float buffer for instance matrix data.
|
||||
// 16 floats per mat4. Grown on demand; never shrunk.
|
||||
private float[] _instanceBuffer = new float[256 * 16]; // start at 256 instances
|
||||
|
||||
// ── Instance grouping scratch ─────────────────────────────────────────────
|
||||
//
|
||||
// Reused every frame to avoid per-frame allocation.
|
||||
//
|
||||
// **Group key = (GfxObjId, PaletteOverrideHash, SurfaceOverridesHash).**
|
||||
//
|
||||
// An earlier implementation grouped on <c>GfxObjId</c> alone and resolved
|
||||
// the per-sub-mesh texture from the first instance in the group — which
|
||||
// is fine for scenery where every tree shares the same palette, but
|
||||
// utterly broken for NPCs: every humanoid uses the same base body
|
||||
// GfxObjs and they all piled into one group, so the first NPC's palette
|
||||
// was used for every NPC in the frame. Frustum culling + iteration
|
||||
// order meant that "first NPC" changed as the camera turned — producing
|
||||
// the "NPC clothing changes when I turn" symptom.
|
||||
//
|
||||
// Now we also key by the entity's PaletteOverride + per-MeshRef
|
||||
// SurfaceOverrides signature so only entities that decode to the
|
||||
// SAME texture for every sub-mesh can share a batch. Entities with
|
||||
// unique appearance fall to single-instance groups (still correct,
|
||||
// marginally slower than true instancing).
|
||||
private readonly Dictionary<GroupKey, InstanceGroup> _groups = new();
|
||||
|
||||
private readonly record struct GroupKey(uint GfxObjId, ulong TextureSignature);
|
||||
|
||||
public InstancedMeshRenderer(GL gl, Shader shader, TextureCache textures,
|
||||
WbMeshAdapter? wbMeshAdapter = null)
|
||||
{
|
||||
_gl = gl;
|
||||
_shader = shader;
|
||||
_textures = textures;
|
||||
_wbMeshAdapter = wbMeshAdapter;
|
||||
|
||||
_instanceVbo = _gl.GenBuffer();
|
||||
}
|
||||
|
||||
// ── Upload ────────────────────────────────────────────────────────────────
|
||||
|
||||
public void EnsureUploaded(uint gfxObjId, IReadOnlyList<GfxObjSubMesh> subMeshes)
|
||||
{
|
||||
if (_gpuByGfxObj.ContainsKey(gfxObjId))
|
||||
return;
|
||||
|
||||
// Phase N.4 Adjustment 2 (2026-05-08): renderer is tier-blind. Tier-routing
|
||||
// (atlas vs per-instance) lives at the spawn-callback layer (Tasks 11 + 17),
|
||||
// not here. Smoke-test of the original Task 9 routing showed it caught
|
||||
// characters / NPCs (server-spawned, per-instance tier) along with static
|
||||
// scenery, because EnsureUploaded is called from both spawn paths.
|
||||
var list = new List<SubMeshGpu>(subMeshes.Count);
|
||||
foreach (var sm in subMeshes)
|
||||
list.Add(UploadSubMesh(sm));
|
||||
_gpuByGfxObj[gfxObjId] = list;
|
||||
}
|
||||
|
||||
private SubMeshGpu UploadSubMesh(GfxObjSubMesh sm)
|
||||
{
|
||||
uint vao = _gl.GenVertexArray();
|
||||
_gl.BindVertexArray(vao);
|
||||
|
||||
// ── Vertex buffer (positions, normals, UVs) ───────────────────────────
|
||||
uint vbo = _gl.GenBuffer();
|
||||
_gl.BindBuffer(BufferTargetARB.ArrayBuffer, vbo);
|
||||
fixed (void* p = sm.Vertices)
|
||||
_gl.BufferData(BufferTargetARB.ArrayBuffer,
|
||||
(nuint)(sm.Vertices.Length * sizeof(Vertex)), p, BufferUsageARB.StaticDraw);
|
||||
|
||||
uint stride = (uint)sizeof(Vertex);
|
||||
_gl.EnableVertexAttribArray(0);
|
||||
_gl.VertexAttribPointer(0, 3, VertexAttribPointerType.Float, false, stride, (void*)0);
|
||||
_gl.EnableVertexAttribArray(1);
|
||||
_gl.VertexAttribPointer(1, 3, VertexAttribPointerType.Float, false, stride, (void*)(3 * sizeof(float)));
|
||||
_gl.EnableVertexAttribArray(2);
|
||||
_gl.VertexAttribPointer(2, 2, VertexAttribPointerType.Float, false, stride, (void*)(6 * sizeof(float)));
|
||||
// Note: location 3 (uint TerrainLayer) is NOT used by mesh_instanced.vert;
|
||||
// that slot is reserved for per-instance mat4 row 0 from the instance VBO.
|
||||
|
||||
// ── Index buffer ──────────────────────────────────────────────────────
|
||||
uint ebo = _gl.GenBuffer();
|
||||
_gl.BindBuffer(BufferTargetARB.ElementArrayBuffer, ebo);
|
||||
fixed (void* p = sm.Indices)
|
||||
_gl.BufferData(BufferTargetARB.ElementArrayBuffer,
|
||||
(nuint)(sm.Indices.Length * sizeof(uint)), p, BufferUsageARB.StaticDraw);
|
||||
|
||||
// ── Per-instance model matrix (locations 3-6) ─────────────────────────
|
||||
// Bind the shared instance VBO. The VAO captures this binding at each
|
||||
// attribute location. At draw time we re-call VertexAttribPointer with
|
||||
// the per-group byte offset (to address different groups in the VBO
|
||||
// without DrawElementsInstancedBaseInstance).
|
||||
_gl.BindBuffer(BufferTargetARB.ArrayBuffer, _instanceVbo);
|
||||
// mat4 = 4 × vec4, stride = 64 bytes, divisor = 1 (advance once per instance)
|
||||
for (uint row = 0; row < 4; row++)
|
||||
{
|
||||
uint loc = 3 + row;
|
||||
_gl.EnableVertexAttribArray(loc);
|
||||
_gl.VertexAttribPointer(loc, 4, VertexAttribPointerType.Float, false, 64, (void*)(row * 16));
|
||||
_gl.VertexAttribDivisor(loc, 1);
|
||||
}
|
||||
|
||||
_gl.BindVertexArray(0);
|
||||
|
||||
return new SubMeshGpu
|
||||
{
|
||||
Vao = vao,
|
||||
Vbo = vbo,
|
||||
Ebo = ebo,
|
||||
IndexCount = sm.Indices.Length,
|
||||
SurfaceId = sm.SurfaceId,
|
||||
Translucency = sm.Translucency,
|
||||
};
|
||||
}
|
||||
|
||||
// ── Draw ──────────────────────────────────────────────────────────────────
|
||||
|
||||
public void Draw(ICamera camera,
|
||||
IEnumerable<(uint LandblockId, Vector3 AabbMin, Vector3 AabbMax, IReadOnlyList<WorldEntity> Entities)> landblockEntries,
|
||||
FrustumPlanes? frustum = null,
|
||||
uint? neverCullLandblockId = null,
|
||||
HashSet<uint>? visibleCellIds = null,
|
||||
// L-fix1 (2026-04-28): set of entity ids that should bypass the
|
||||
// landblock-level frustum cull. Animated entities (other
|
||||
// players, NPCs, monsters) are always rendered if their
|
||||
// landblock is loaded — without this they vanish whenever the
|
||||
// camera rotates away from their landblock, even though
|
||||
// they're within visible distance of the player. Pass null /
|
||||
// empty to keep the previous "cull everything by landblock"
|
||||
// behavior.
|
||||
HashSet<uint>? animatedEntityIds = null)
|
||||
{
|
||||
_shader.Use();
|
||||
|
||||
var vp = camera.View * camera.Projection;
|
||||
_shader.SetMatrix4("uViewProjection", vp);
|
||||
|
||||
// Phase G: lighting + ambient + fog are owned by the
|
||||
// SceneLighting UBO (binding=1) uploaded once per frame by
|
||||
// GameWindow. The instanced mesh fragment shader reads it
|
||||
// directly — no per-draw uniform uploads needed.
|
||||
|
||||
// ── Collect and group instances ───────────────────────────────────────
|
||||
CollectGroups(landblockEntries, frustum, neverCullLandblockId, visibleCellIds, animatedEntityIds);
|
||||
|
||||
// ── Build and upload the instance buffer ──────────────────────────────
|
||||
// Count total instances.
|
||||
int totalInstances = 0;
|
||||
foreach (var grp in _groups.Values)
|
||||
totalInstances += grp.Count;
|
||||
|
||||
// Grow the scratch buffer if needed.
|
||||
int needed = totalInstances * 16;
|
||||
if (_instanceBuffer.Length < needed)
|
||||
_instanceBuffer = new float[needed + 256 * 16]; // extra headroom
|
||||
|
||||
// Write all groups contiguously. Record each group's starting offset
|
||||
// (in units of instances, not bytes) so we can address them at draw time.
|
||||
int instanceOffset = 0;
|
||||
foreach (var grp in _groups.Values)
|
||||
{
|
||||
grp.BufferOffset = instanceOffset;
|
||||
foreach (ref readonly var inst in CollectionsMarshal.AsSpan(grp.Entries))
|
||||
WriteMatrix(_instanceBuffer, instanceOffset++ * 16, inst.Model);
|
||||
}
|
||||
|
||||
// Upload all instance data in a single DynamicDraw call.
|
||||
if (totalInstances > 0)
|
||||
{
|
||||
_gl.BindBuffer(BufferTargetARB.ArrayBuffer, _instanceVbo);
|
||||
fixed (void* p = _instanceBuffer)
|
||||
_gl.BufferData(BufferTargetARB.ArrayBuffer,
|
||||
(nuint)(totalInstances * 16 * sizeof(float)), p, BufferUsageARB.DynamicDraw);
|
||||
}
|
||||
|
||||
// ── Pass 1: Opaque + ClipMap ──────────────────────────────────────────
|
||||
// Diagnostic: ACDREAM_NO_CULL=1 disables backface culling entirely.
|
||||
if (string.Equals(Environment.GetEnvironmentVariable("ACDREAM_NO_CULL"), "1", StringComparison.Ordinal))
|
||||
{
|
||||
_gl.Disable(EnableCap.CullFace);
|
||||
}
|
||||
foreach (var (key, grp) in _groups)
|
||||
{
|
||||
if (!_gpuByGfxObj.TryGetValue(key.GfxObjId, out var subMeshes))
|
||||
continue;
|
||||
|
||||
bool hasOpaqueSubMesh = false;
|
||||
foreach (var sub in subMeshes)
|
||||
{
|
||||
if (sub.Translucency == TranslucencyKind.Opaque ||
|
||||
sub.Translucency == TranslucencyKind.ClipMap)
|
||||
{
|
||||
hasOpaqueSubMesh = true;
|
||||
break;
|
||||
}
|
||||
}
|
||||
if (!hasOpaqueSubMesh) continue;
|
||||
|
||||
// For this group, instance data starts at grp.BufferOffset in the VBO.
|
||||
// We need to tell the VAO to read from that offset.
|
||||
uint byteOffset = (uint)(grp.BufferOffset * 64); // 64 bytes per mat4
|
||||
|
||||
foreach (var sub in subMeshes)
|
||||
{
|
||||
if (sub.Translucency != TranslucencyKind.Opaque &&
|
||||
sub.Translucency != TranslucencyKind.ClipMap)
|
||||
continue;
|
||||
|
||||
_shader.SetInt("uTranslucencyKind", (int)sub.Translucency);
|
||||
|
||||
// Bind VAO + re-point instance attributes to the group's slice
|
||||
// in the shared VBO. This updates the VAO's stored offset for
|
||||
// locations 3-6 without touching the vertex or index bindings.
|
||||
_gl.BindVertexArray(sub.Vao);
|
||||
_gl.BindBuffer(BufferTargetARB.ArrayBuffer, _instanceVbo);
|
||||
for (uint row = 0; row < 4; row++)
|
||||
{
|
||||
_gl.VertexAttribPointer(3 + row, 4, VertexAttribPointerType.Float,
|
||||
false, 64, (void*)(byteOffset + row * 16));
|
||||
}
|
||||
|
||||
// Resolve texture from the first instance (all instances in this
|
||||
// group share the same GfxObj so they have compatible overrides
|
||||
// only in the degenerate case of mixed-palette entities using the
|
||||
// same GfxObj — rare enough to accept the approximation here).
|
||||
if (grp.Count == 0) continue;
|
||||
var firstEntry = grp.Entries[0];
|
||||
uint tex = ResolveTex(firstEntry.Entity, firstEntry.MeshRef, sub);
|
||||
_gl.ActiveTexture(TextureUnit.Texture0);
|
||||
_gl.BindTexture(TextureTarget.Texture2D, tex);
|
||||
|
||||
_gl.DrawElementsInstanced(PrimitiveType.Triangles,
|
||||
(uint)sub.IndexCount,
|
||||
DrawElementsType.UnsignedInt,
|
||||
(void*)0,
|
||||
(uint)grp.Count);
|
||||
}
|
||||
}
|
||||
|
||||
// ── Pass 2: Translucent (AlphaBlend, Additive, InvAlpha) ─────────────
|
||||
_gl.Enable(EnableCap.Blend);
|
||||
_gl.DepthMask(false);
|
||||
// Diagnostic: ACDREAM_NO_CULL=1 disables backface culling (used 2026-05-01
|
||||
// to test if our mesh winding (0,i,i+1) vs ACME's (i+1,i,0) is causing
|
||||
// visible polygons to be culled, especially around the neck/coat seam).
|
||||
if (string.Equals(Environment.GetEnvironmentVariable("ACDREAM_NO_CULL"), "1", StringComparison.Ordinal))
|
||||
{
|
||||
_gl.Disable(EnableCap.CullFace);
|
||||
}
|
||||
else
|
||||
{
|
||||
_gl.Enable(EnableCap.CullFace);
|
||||
_gl.CullFace(TriangleFace.Back);
|
||||
_gl.FrontFace(FrontFaceDirection.Ccw);
|
||||
}
|
||||
|
||||
foreach (var (key, grp) in _groups)
|
||||
{
|
||||
if (!_gpuByGfxObj.TryGetValue(key.GfxObjId, out var subMeshes))
|
||||
continue;
|
||||
|
||||
bool hasTranslucentSubMesh = false;
|
||||
foreach (var sub in subMeshes)
|
||||
{
|
||||
if (sub.Translucency != TranslucencyKind.Opaque &&
|
||||
sub.Translucency != TranslucencyKind.ClipMap)
|
||||
{
|
||||
hasTranslucentSubMesh = true;
|
||||
break;
|
||||
}
|
||||
}
|
||||
if (!hasTranslucentSubMesh) continue;
|
||||
|
||||
uint byteOffset = (uint)(grp.BufferOffset * 64);
|
||||
|
||||
foreach (var sub in subMeshes)
|
||||
{
|
||||
if (sub.Translucency == TranslucencyKind.Opaque ||
|
||||
sub.Translucency == TranslucencyKind.ClipMap)
|
||||
continue;
|
||||
|
||||
switch (sub.Translucency)
|
||||
{
|
||||
case TranslucencyKind.Additive:
|
||||
_gl.BlendFunc(BlendingFactor.SrcAlpha, BlendingFactor.One);
|
||||
break;
|
||||
case TranslucencyKind.InvAlpha:
|
||||
_gl.BlendFunc(BlendingFactor.OneMinusSrcAlpha, BlendingFactor.SrcAlpha);
|
||||
break;
|
||||
default: // AlphaBlend
|
||||
_gl.BlendFunc(BlendingFactor.SrcAlpha, BlendingFactor.OneMinusSrcAlpha);
|
||||
break;
|
||||
}
|
||||
|
||||
_shader.SetInt("uTranslucencyKind", (int)sub.Translucency);
|
||||
|
||||
_gl.BindVertexArray(sub.Vao);
|
||||
_gl.BindBuffer(BufferTargetARB.ArrayBuffer, _instanceVbo);
|
||||
for (uint row = 0; row < 4; row++)
|
||||
{
|
||||
_gl.VertexAttribPointer(3 + row, 4, VertexAttribPointerType.Float,
|
||||
false, 64, (void*)(byteOffset + row * 16));
|
||||
}
|
||||
|
||||
if (grp.Count == 0) continue;
|
||||
var firstEntry = grp.Entries[0];
|
||||
uint tex = ResolveTex(firstEntry.Entity, firstEntry.MeshRef, sub);
|
||||
_gl.ActiveTexture(TextureUnit.Texture0);
|
||||
_gl.BindTexture(TextureTarget.Texture2D, tex);
|
||||
|
||||
_gl.DrawElementsInstanced(PrimitiveType.Triangles,
|
||||
(uint)sub.IndexCount,
|
||||
DrawElementsType.UnsignedInt,
|
||||
(void*)0,
|
||||
(uint)grp.Count);
|
||||
}
|
||||
}
|
||||
|
||||
// Restore default GL state.
|
||||
_gl.DepthMask(true);
|
||||
_gl.Disable(EnableCap.Blend);
|
||||
_gl.Disable(EnableCap.CullFace);
|
||||
_gl.BindVertexArray(0);
|
||||
}
|
||||
|
||||
// ── Grouping ──────────────────────────────────────────────────────────────
|
||||
|
||||
/// <summary>
|
||||
/// Iterates all visible landblock entries and groups every (entity, meshRef)
|
||||
/// pair by GfxObjId. Clears previous frame's groups before filling.
|
||||
/// </summary>
|
||||
private void CollectGroups(
|
||||
IEnumerable<(uint LandblockId, Vector3 AabbMin, Vector3 AabbMax, IReadOnlyList<WorldEntity> Entities)> landblockEntries,
|
||||
FrustumPlanes? frustum,
|
||||
uint? neverCullLandblockId,
|
||||
HashSet<uint>? visibleCellIds,
|
||||
HashSet<uint>? animatedEntityIds)
|
||||
{
|
||||
foreach (var grp in _groups.Values)
|
||||
grp.Entries.Clear();
|
||||
|
||||
foreach (var entry in landblockEntries)
|
||||
{
|
||||
// L-fix1 (2026-04-28): the landblock cull decision is now
|
||||
// PER-LANDBLOCK boolean, not a continue. We still need to
|
||||
// walk the entity list because animated entities (in
|
||||
// animatedEntityIds) bypass the cull and render anyway.
|
||||
bool landblockVisible = frustum is null
|
||||
|| entry.LandblockId == neverCullLandblockId
|
||||
|| FrustumCuller.IsAabbVisible(frustum.Value, entry.AabbMin, entry.AabbMax);
|
||||
|
||||
// Fast path: no animated entities globally → if landblock is
|
||||
// culled, skip the whole entity list (preserves the original
|
||||
// O(visible-landblocks) cost when the caller doesn't care
|
||||
// about animated bypass).
|
||||
if (!landblockVisible && (animatedEntityIds is null || animatedEntityIds.Count == 0))
|
||||
continue;
|
||||
|
||||
foreach (var entity in entry.Entities)
|
||||
{
|
||||
if (entity.MeshRefs.Count == 0)
|
||||
continue;
|
||||
|
||||
// L-fix1: when the landblock is frustum-culled, only
|
||||
// render entities flagged as animated. This keeps
|
||||
// remote players / NPCs / monsters visible even when
|
||||
// their landblock rotates out of the view frustum.
|
||||
bool isAnimated = animatedEntityIds?.Contains(entity.Id) == true;
|
||||
if (!landblockVisible && !isAnimated)
|
||||
continue;
|
||||
|
||||
// Step 4: portal visibility filter. If we have a visible cell set,
|
||||
// skip interior entities whose parent cell isn't visible.
|
||||
// visibleCellIds == null means camera is outdoors → show all interiors.
|
||||
if (entity.ParentCellId.HasValue && visibleCellIds is not null
|
||||
&& !visibleCellIds.Contains(entity.ParentCellId.Value))
|
||||
continue;
|
||||
|
||||
var entityRoot =
|
||||
Matrix4x4.CreateFromQuaternion(entity.Rotation) *
|
||||
Matrix4x4.CreateTranslation(entity.Position);
|
||||
|
||||
// Hash the entity's PaletteOverride once — shared by every
|
||||
// MeshRef on this entity, so we compute it outside the loop.
|
||||
ulong palHash = HashPaletteOverride(entity.PaletteOverride);
|
||||
|
||||
foreach (var meshRef in entity.MeshRefs)
|
||||
{
|
||||
if (!_gpuByGfxObj.TryGetValue(meshRef.GfxObjId, out var cachedMeshes))
|
||||
continue;
|
||||
|
||||
var model = meshRef.PartTransform * entityRoot;
|
||||
|
||||
// Texture signature = palette hash ^ surface-overrides hash.
|
||||
// Two instances can share a batch only when their ResolveTex
|
||||
// would return identical handles for every sub-mesh — that
|
||||
// means identical palette AND identical surface overrides.
|
||||
ulong surfHash = HashSurfaceOverrides(meshRef.SurfaceOverrides);
|
||||
ulong texSig = palHash ^ surfHash;
|
||||
var key = new GroupKey(meshRef.GfxObjId, texSig);
|
||||
|
||||
if (!_groups.TryGetValue(key, out var group))
|
||||
{
|
||||
group = new InstanceGroup();
|
||||
_groups[key] = group;
|
||||
}
|
||||
|
||||
group.Entries.Add(new InstanceEntry(model, entity, meshRef));
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
private static ulong HashPaletteOverride(AcDream.Core.World.PaletteOverride? p)
|
||||
{
|
||||
if (p is null) return 0UL;
|
||||
ulong h = 0xCBF29CE484222325UL;
|
||||
const ulong prime = 0x100000001B3UL;
|
||||
h = (h ^ p.BasePaletteId) * prime;
|
||||
foreach (var sp in p.SubPalettes)
|
||||
{
|
||||
h = (h ^ sp.SubPaletteId) * prime;
|
||||
h = (h ^ sp.Offset) * prime;
|
||||
h = (h ^ sp.Length) * prime;
|
||||
}
|
||||
return h;
|
||||
}
|
||||
|
||||
/// <summary>
|
||||
/// Order-independent hash of a SurfaceOverrides dictionary. XOR of each
|
||||
/// (key, value) pair keeps the result stable regardless of Dictionary
|
||||
/// iteration order, so two instances whose override maps contain the
|
||||
/// same pairs will hash identically.
|
||||
/// </summary>
|
||||
private static ulong HashSurfaceOverrides(IReadOnlyDictionary<uint, uint>? overrides)
|
||||
{
|
||||
if (overrides is null || overrides.Count == 0) return 0UL;
|
||||
ulong acc = 0UL;
|
||||
foreach (var kvp in overrides)
|
||||
{
|
||||
ulong pair = ((ulong)kvp.Key << 32) | kvp.Value;
|
||||
acc ^= pair;
|
||||
}
|
||||
// Fold through the FNV offset basis and prime so a non-empty map whose pairs XOR to zero does not collide with the empty-map result (0).
|
||||
return (acc ^ 0xCBF29CE484222325UL) * 0x100000001B3UL;
|
||||
}
|
||||
|
||||
// ── Matrix write ──────────────────────────────────────────────────────────
|
||||
|
||||
/// <summary>
|
||||
/// Writes a System.Numerics Matrix4x4 into <paramref name="buf"/> starting
|
||||
/// at <paramref name="offset"/> as 16 consecutive floats in row-major order
|
||||
/// (the C# natural memory layout). The GLSL shader reads each 4-float row
|
||||
/// as a column of the mat4 — identical to what UniformMatrix4(transpose=false)
|
||||
/// produces for the uniform path.
|
||||
/// </summary>
|
||||
private static void WriteMatrix(float[] buf, int offset, in Matrix4x4 m)
|
||||
{
|
||||
buf[offset + 0] = m.M11; buf[offset + 1] = m.M12; buf[offset + 2] = m.M13; buf[offset + 3] = m.M14;
|
||||
buf[offset + 4] = m.M21; buf[offset + 5] = m.M22; buf[offset + 6] = m.M23; buf[offset + 7] = m.M24;
|
||||
buf[offset + 8] = m.M31; buf[offset + 9] = m.M32; buf[offset + 10] = m.M33; buf[offset + 11] = m.M34;
|
||||
buf[offset + 12] = m.M41; buf[offset + 13] = m.M42; buf[offset + 14] = m.M43; buf[offset + 15] = m.M44;
|
||||
}
|
||||
|
||||
// ── Texture resolution ────────────────────────────────────────────────────
|
||||
|
||||
private uint ResolveTex(WorldEntity entity, MeshRef meshRef, SubMeshGpu sub)
|
||||
{
|
||||
uint overrideOrigTex = 0;
|
||||
bool hasOrigTexOverride = meshRef.SurfaceOverrides is not null
|
||||
&& meshRef.SurfaceOverrides.TryGetValue(sub.SurfaceId, out overrideOrigTex);
|
||||
uint? origTexOverride = hasOrigTexOverride ? overrideOrigTex : (uint?)null;
|
||||
|
||||
if (entity.PaletteOverride is not null)
|
||||
{
|
||||
return _textures.GetOrUploadWithPaletteOverride(
|
||||
sub.SurfaceId, origTexOverride, entity.PaletteOverride);
|
||||
}
|
||||
else if (hasOrigTexOverride)
|
||||
{
|
||||
return _textures.GetOrUploadWithOrigTextureOverride(sub.SurfaceId, overrideOrigTex);
|
||||
}
|
||||
else
|
||||
{
|
||||
return _textures.GetOrUpload(sub.SurfaceId);
|
||||
}
|
||||
}
|
||||
|
||||
// ── Disposal ──────────────────────────────────────────────────────────────
|
||||
|
||||
public void Dispose()
|
||||
{
|
||||
foreach (var subs in _gpuByGfxObj.Values)
|
||||
{
|
||||
foreach (var sub in subs)
|
||||
{
|
||||
_gl.DeleteBuffer(sub.Vbo);
|
||||
_gl.DeleteBuffer(sub.Ebo);
|
||||
_gl.DeleteVertexArray(sub.Vao);
|
||||
}
|
||||
}
|
||||
_gl.DeleteBuffer(_instanceVbo);
|
||||
_gpuByGfxObj.Clear();
|
||||
_groups.Clear();
|
||||
}
|
||||
|
||||
// ── Private types ─────────────────────────────────────────────────────────
|
||||
|
||||
private sealed class SubMeshGpu
|
||||
{
|
||||
public uint Vao;
|
||||
public uint Vbo;
|
||||
public uint Ebo;
|
||||
public int IndexCount;
|
||||
public uint SurfaceId;
|
||||
public TranslucencyKind Translucency;
|
||||
}
|
||||
|
||||
/// <summary>
|
||||
/// All instances sharing one GroupKey (GfxObj id plus texture signature) for
/// this frame, plus their starting offset in the shared instance VBO (in
/// units of instances, not bytes).
|
||||
/// </summary>
|
||||
private sealed class InstanceGroup
|
||||
{
|
||||
public readonly List<InstanceEntry> Entries = new();
|
||||
public int BufferOffset;
|
||||
|
||||
public int Count => Entries.Count;
|
||||
}
|
||||
|
||||
private readonly struct InstanceEntry
|
||||
{
|
||||
public readonly Matrix4x4 Model;
|
||||
public readonly WorldEntity Entity;
|
||||
public readonly MeshRef MeshRef;
|
||||
|
||||
public InstanceEntry(Matrix4x4 model, WorldEntity entity, MeshRef meshRef)
|
||||
{
|
||||
Model = model;
|
||||
Entity = entity;
|
||||
MeshRef = meshRef;
|
||||
}
|
||||
}
|
||||
}
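Reviewer sketch (not part of the commit): the XOR-of-pairs scheme documented on HashSurfaceOverrides above is order-independent, but it also gives the same result when two keys swap their values, because the key halves and value halves XOR separately. The snippet below reproduces the scheme as `XorPairs` and compares it with a hypothetical `MixedPairs` variant that scrambles each pair with the SplitMix64 finalizer before summing. The class name, both method names, and the choice of mixer are assumptions for illustration; whether swapped override maps ever occur in real content is not something the diff shows.

```csharp
using System;
using System.Collections.Generic;

static class SurfaceHashSketch
{
    // Same scheme as HashSurfaceOverrides in the diff: XOR the packed
    // (key, value) pairs, then fold through the FNV basis and prime.
    public static ulong XorPairs(IReadOnlyDictionary<uint, uint>? overrides)
    {
        if (overrides is null || overrides.Count == 0) return 0UL;
        ulong acc = 0UL;
        foreach (var kvp in overrides)
            acc ^= ((ulong)kvp.Key << 32) | kvp.Value;
        unchecked { return (acc ^ 0xCBF29CE484222325UL) * 0x100000001B3UL; }
    }

    // Hypothetical variant: scramble each pair before combining, so the key
    // and value halves no longer cancel independently. Addition is
    // commutative, so the result is still independent of iteration order.
    public static ulong MixedPairs(IReadOnlyDictionary<uint, uint>? overrides)
    {
        if (overrides is null || overrides.Count == 0) return 0UL;
        ulong acc = 0UL;
        unchecked
        {
            foreach (var kvp in overrides)
                acc += Mix(((ulong)kvp.Key << 32) | kvp.Value);
            return (acc ^ 0xCBF29CE484222325UL) * 0x100000001B3UL;
        }
    }

    // SplitMix64 finalizer: a bijective 64-bit bit mixer.
    private static ulong Mix(ulong x)
    {
        unchecked
        {
            x ^= x >> 30; x *= 0xBF58476D1CE4E5B9UL;
            x ^= x >> 27; x *= 0x94D049BB133111EBUL;
            return x ^ (x >> 31);
        }
    }
}
```

With `{1→2, 3→4}` and `{1→4, 3→2}`, `XorPairs` returns the same value for both maps, so both instances would land in one group even though ResolveTex would pick different textures for them. Mixing each pair first makes that specific collision vanishingly unlikely while keeping the order independence the doc comment asks for.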
|
||||
|
|
@ -1,35 +0,0 @@
|
|||
#version 430 core
|
||||
|
||||
// Per-vertex attributes
|
||||
layout(location = 0) in vec3 aPosition;
|
||||
layout(location = 1) in vec3 aNormal;
|
||||
layout(location = 2) in vec2 aTexCoord;
|
||||
|
||||
// Per-instance model matrix, split across four vec4 attribute slots.
|
||||
// A mat4 consumes 4 consecutive attribute locations, so locations 3-6 are
|
||||
// all occupied by this single logical matrix. The C# side must call
|
||||
// VertexAttribPointer four times (one per row) and VertexAttribDivisor(loc, 1)
|
||||
// on each of the four slots.
|
||||
layout(location = 3) in vec4 aInstanceRow0;
|
||||
layout(location = 4) in vec4 aInstanceRow1;
|
||||
layout(location = 5) in vec4 aInstanceRow2;
|
||||
layout(location = 6) in vec4 aInstanceRow3;
|
||||
|
||||
uniform mat4 uViewProjection;
|
||||
|
||||
out vec2 vTex;
|
||||
out vec3 vWorldNormal;
|
||||
out vec3 vWorldPos;
|
||||
|
||||
void main() {
|
||||
// Reconstruct the per-instance model matrix from its four row vectors.
|
||||
mat4 model = mat4(aInstanceRow0, aInstanceRow1, aInstanceRow2, aInstanceRow3);
|
||||
|
||||
vec4 worldPos = model * vec4(aPosition, 1.0);
|
||||
gl_Position = uViewProjection * worldPos;
|
||||
|
||||
vWorldPos = worldPos.xyz;
|
||||
// Transform normal into world space.
|
||||
vWorldNormal = normalize(mat3(model) * aNormal);
|
||||
vTex = aTexCoord;
|
||||
}
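Reviewer sketch (not part of the commit): the WriteMatrix doc comment and the `mat4(aInstanceRow0, ...)` constructor above both rely on the same layout fact. System.Numerics stores a Matrix4x4 row by row and multiplies row vectors (`v * M`), while GLSL fills a mat4 column by column and multiplies column vectors (`M * v`). Copying the 16 floats unchanged therefore gives the shader the transposed matrix, which is exactly what the column-vector convention needs. The snippet emulates the GLSL multiply on the CPU to check this; `MatrixLayoutSketch`, `Flatten`, and `GlslMul` are hypothetical names.

```csharp
using System;
using System.Numerics;

static class MatrixLayoutSketch
{
    // Same element order as WriteMatrix in the diff: M11..M14, M21..M24, ...
    public static float[] Flatten(in Matrix4x4 m) => new float[]
    {
        m.M11, m.M12, m.M13, m.M14,
        m.M21, m.M22, m.M23, m.M24,
        m.M31, m.M32, m.M33, m.M34,
        m.M41, m.M42, m.M43, m.M44,
    };

    // Emulates GLSL `mat4 * vec4` when the 16 floats are read column-major:
    // floats 0..3 are column 0, 4..7 are column 1, and so on.
    public static Vector4 GlslMul(float[] c, Vector4 v) => new Vector4(
        c[0] * v.X + c[4] * v.Y + c[8]  * v.Z + c[12] * v.W,
        c[1] * v.X + c[5] * v.Y + c[9]  * v.Z + c[13] * v.W,
        c[2] * v.X + c[6] * v.Y + c[10] * v.Z + c[14] * v.W,
        c[3] * v.X + c[7] * v.Y + c[11] * v.Z + c[15] * v.W);
}
```

For a translation matrix, the translation sits in M41..M43, which land at floats 12..14, the fourth column as GLSL sees it. `Vector4.Transform(p, m)` on the CPU and `GlslMul(Flatten(m), p)` should therefore agree for any point `p`.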
|
||||
|
|
@ -1,24 +1,22 @@
|
|||
#version 430 core
|
||||
#extension GL_ARB_bindless_texture : require
|
||||
|
||||
in vec2 vTex;
|
||||
in vec3 vWorldNormal;
|
||||
in vec3 vNormal;
|
||||
in vec2 vTexCoord;
|
||||
in vec3 vWorldPos;
|
||||
in flat uvec2 vTextureHandle;
|
||||
in flat uint vTextureLayer;
|
||||
|
||||
out vec4 fragColor;
|
||||
// uRenderPass values (Phase N.5 Decision 2 — two-pass alpha-test):
|
||||
// 0 = opaque pass — discard fragments with alpha < 0.95
|
||||
// (lets the depth write succeed for solid pixels)
|
||||
// 1 = translucent pass — covers AlphaBlend / Additive / InvAlpha;
|
||||
// discard alpha >= 0.95 (already drawn opaque) and
|
||||
// alpha < 0.05 (skip empty fragments — large
|
||||
// transparent overdraw cost otherwise)
|
||||
uniform int uRenderPass;
|
||||
|
||||
// One 2D texture per draw call — same binding point as mesh.frag so the
|
||||
// C# side can use the same TextureCache without a texture-array pipeline.
|
||||
uniform sampler2D uDiffuse;
|
||||
|
||||
// Translucency kind — matches TranslucencyKind C# enum (same as mesh.frag):
|
||||
// 0 = Opaque — depth write+test, no blend; shader never discards
|
||||
// 1 = ClipMap — alpha-key discard at 0.5 (doors, windows, vegetation)
|
||||
// 2 = AlphaBlend — GL blending handles compositing; do NOT discard
|
||||
// 3 = Additive — GL additive blending; do NOT discard
|
||||
// 4 = InvAlpha — GL inverted-alpha blending; do NOT discard
|
||||
uniform int uTranslucencyKind;
|
||||
|
||||
// Phase G.1+G.2: shared scene-lighting UBO (see mesh.frag for layout docs).
|
||||
// SceneLighting UBO — IDENTICAL layout to mesh_instanced.frag binding=1.
|
||||
struct Light {
|
||||
vec4 posAndKind;
|
||||
vec4 dirAndRange;
|
||||
|
|
@ -38,10 +36,8 @@ vec3 accumulateLights(vec3 N, vec3 worldPos) {
|
|||
int activeLights = int(uCellAmbient.w);
|
||||
for (int i = 0; i < 8; ++i) {
|
||||
if (i >= activeLights) break;
|
||||
|
||||
int kind = int(uLights[i].posAndKind.w);
|
||||
vec3 Lcol = uLights[i].colorAndIntensity.xyz * uLights[i].colorAndIntensity.w;
|
||||
|
||||
if (kind == 0) {
|
||||
vec3 Ldir = -uLights[i].dirAndRange.xyz;
|
||||
float ndl = max(0.0, dot(N, Ldir));
|
||||
|
|
@ -77,16 +73,24 @@ vec3 applyFog(vec3 lit, vec3 worldPos) {
|
|||
return mix(lit, uFogColor.xyz, fog);
|
||||
}
|
||||
|
||||
out vec4 FragColor;
|
||||
|
||||
void main() {
|
||||
vec4 color = texture(uDiffuse, vTex);
|
||||
sampler2DArray tex = sampler2DArray(vTextureHandle);
|
||||
vec4 color = texture(tex, vec3(vTexCoord, float(vTextureLayer)));
|
||||
|
||||
// Alpha cutout only for clip-map surfaces (doors, windows, vegetation).
|
||||
if (uTranslucencyKind == 1 && color.a < 0.5) discard;
|
||||
// Two-pass alpha-test (N.5 Decision 2).
|
||||
if (uRenderPass == 0) {
|
||||
if (color.a < 0.95) discard; // opaque pass
|
||||
} else {
|
||||
if (color.a >= 0.95) discard; // transparent pass
|
||||
if (color.a < 0.05) discard; // skip totally-empty
|
||||
}
|
||||
|
||||
vec3 N = normalize(vWorldNormal);
|
||||
vec3 N = normalize(vNormal);
|
||||
vec3 lit = accumulateLights(N, vWorldPos);
|
||||
|
||||
// Lightning flash — additive scene bump.
|
||||
// Lightning flash — additive scene bump (matches mesh_instanced.frag).
|
||||
lit += uFogParams.z * vec3(0.6, 0.6, 0.75);
|
||||
|
||||
// Retail clamp per-channel to 1.0 (r13 §13.1).
|
||||
|
|
@ -94,5 +98,5 @@ void main() {
|
|||
|
||||
vec3 rgb = color.rgb * lit;
|
||||
rgb = applyFog(rgb, vWorldPos);
|
||||
fragColor = vec4(rgb, color.a);
|
||||
FragColor = vec4(rgb, color.a);
|
||||
}
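Reviewer sketch (not part of the commit): the two `uRenderPass` branches above partition fragments by alpha into three bands. The helper below mirrors the discard thresholds on the CPU, which can be handy when reasoning about which pass ends up shading a given texel. `PassSketch` and `AlphaPassSketch.Classify` are hypothetical names, and the snippet assumes both passes see the same sampled alpha.

```csharp
using System;

enum PassSketch { None, Opaque, Translucent }

static class AlphaPassSketch
{
    // Mirrors the shader's discards:
    //   pass 0 keeps alpha >= 0.95
    //   pass 1 keeps 0.05 <= alpha < 0.95
    //   anything below 0.05 is discarded in both passes
    public static PassSketch Classify(float alpha)
    {
        if (alpha >= 0.95f) return PassSketch.Opaque;
        if (alpha >= 0.05f) return PassSketch.Translucent;
        return PassSketch.None;
    }
}
```

Because the bands do not overlap, a texel is shaded by at most one pass, so a surface with mixed alpha can be submitted to both passes without double-drawing any pixel.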
|
||||
62
src/AcDream.App/Rendering/Shaders/mesh_modern.vert
Normal file
|
|
@ -0,0 +1,62 @@
#version 430 core
#extension GL_ARB_shader_draw_parameters : require

layout(location = 0) in vec3 aPosition;
layout(location = 1) in vec3 aNormal;
layout(location = 2) in vec2 aTexCoord;

struct InstanceData {
    mat4 transform;
    // Reserved for Phase B.4 follow-up (selection-blink retail-faithful
    // highlight): vec4 highlightColor. To add it, extend the stride here,
    // increase the _instanceSsbo upload size in WbDrawDispatcher, add a flat
    // varying out, and consume it in mesh_modern.frag.
};

struct BatchData {
    uvec2 textureHandle; // bindless handle for sampler2DArray
    uint  textureLayer;  // layer index (always 0 for per-instance composites)
    uint  flags;         // reserved; the N.5 dispatcher owns all blend state
                         // (glBlendFunc per pass). If a future phase wants a
                         // shader-side per-batch additive flag (Decision 2
                         // fallback), encode it here as bit 0.
};

layout(std430, binding = 0) readonly buffer InstanceBuffer {
    InstanceData Instances[];
};

// binding=1 here is in the SSBO namespace, distinct from the UBO namespace.
// SceneLighting UBO also uses binding=1 in the fragment shader; GL keeps
// GL_SHADER_STORAGE_BUFFER and GL_UNIFORM_BUFFER binding tables separate.
// Task 10 dispatcher binds:
//   glBindBufferBase(GL_SHADER_STORAGE_BUFFER, 0, instanceSsbo)
//   glBindBufferBase(GL_SHADER_STORAGE_BUFFER, 1, batchSsbo)
// Existing SceneLightingUboBinding handles the UBO side.
layout(std430, binding = 1) readonly buffer BatchBuffer {
    BatchData Batches[];
};

uniform mat4 uViewProjection;

out vec3 vNormal;
out vec2 vTexCoord;
out vec3 vWorldPos;
out flat uvec2 vTextureHandle;
out flat uint vTextureLayer;

void main() {
    int instanceIndex = gl_BaseInstanceARB + gl_InstanceID;
    mat4 model = Instances[instanceIndex].transform;

    vec4 worldPos = model * vec4(aPosition, 1.0);
    gl_Position = uViewProjection * worldPos;

    vWorldPos = worldPos.xyz;
    vNormal = normalize(mat3(model) * aNormal);
    vTexCoord = aTexCoord;

    BatchData b = Batches[gl_DrawIDARB];
    vTextureHandle = b.textureHandle;
    vTextureLayer = b.textureLayer;
}
|
||||
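Reviewer sketch (not part of the commit): the vertex shader above indexes `Instances[gl_BaseInstanceARB + gl_InstanceID]` and `Batches[gl_DrawIDARB]`, so the CPU side has to emit one indirect command per batch whose `BaseInstance` is the running instance offset. The snippet shows one way that could be built. `IndirectCmdSketch` follows the five-field layout the OpenGL specification defines for glMultiDrawElementsIndirect; `GroupSlice` and `IndirectBuildSketch.Build` are hypothetical names, and the real WbDrawDispatcher may organize this differently.

```csharp
using System;
using System.Collections.Generic;
using System.Runtime.InteropServices;

// Five tightly packed 32-bit fields (20 bytes), per the OpenGL specification.
[StructLayout(LayoutKind.Sequential, Pack = 4)]
struct IndirectCmdSketch
{
    public uint Count;         // index count for this draw
    public uint InstanceCount; // number of instances
    public uint FirstIndex;    // offset into the shared IBO, in indices
    public int  BaseVertex;    // offset into the shared VBO, in vertices
    public uint BaseInstance;  // surfaces in the shader as gl_BaseInstanceARB
}

// Hypothetical per-batch input: where the mesh slice lives in the global
// buffers and how many instances of it are visible this frame.
readonly record struct GroupSlice(uint IndexCount, uint FirstIndex, int BaseVertex, uint InstanceCount);

static class IndirectBuildSketch
{
    // Emits one command per slice. Instances are assumed to be packed into
    // the instance SSBO in the same order, so BaseInstance is a prefix sum.
    public static IndirectCmdSketch[] Build(IReadOnlyList<GroupSlice> slices)
    {
        var cmds = new IndirectCmdSketch[slices.Count];
        uint cursor = 0;
        for (int i = 0; i < slices.Count; i++)
        {
            var s = slices[i];
            cmds[i] = new IndirectCmdSketch
            {
                Count = s.IndexCount,
                InstanceCount = s.InstanceCount,
                FirstIndex = s.FirstIndex,
                BaseVertex = s.BaseVertex,
                BaseInstance = cursor,
            };
            cursor += s.InstanceCount;
        }
        return cmds;
    }
}
```

Command `i` in the returned array pairs with `Batches[i]` in the shader, since `gl_DrawIDARB` is the index of the command within the multi-draw call. Keeping the command array and the batch SSBO in the same order is what lets a single call cover every group.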
|
|
@ -1,293 +0,0 @@
|
|||
// src/AcDream.App/Rendering/StaticMeshRenderer.cs
|
||||
using System.Numerics;
|
||||
using AcDream.Core.Meshing;
|
||||
using AcDream.Core.Terrain;
|
||||
using AcDream.Core.World;
|
||||
using Silk.NET.OpenGL;
|
||||
|
||||
namespace AcDream.App.Rendering;
|
||||
|
||||
public sealed unsafe class StaticMeshRenderer : IDisposable
|
||||
{
|
||||
private readonly GL _gl;
|
||||
private readonly Shader _shader;
|
||||
private readonly TextureCache _textures;
|
||||
|
||||
// One GPU bundle per unique GfxObj id. Each GfxObj can have multiple sub-meshes.
|
||||
private readonly Dictionary<uint, List<SubMeshGpu>> _gpuByGfxObj = new();
|
||||
|
||||
public StaticMeshRenderer(GL gl, Shader shader, TextureCache textures)
|
||||
{
|
||||
_gl = gl;
|
||||
_shader = shader;
|
||||
_textures = textures;
|
||||
}
|
||||
|
||||
public void EnsureUploaded(uint gfxObjId, IReadOnlyList<GfxObjSubMesh> subMeshes)
|
||||
{
|
||||
if (_gpuByGfxObj.ContainsKey(gfxObjId))
|
||||
return;
|
||||
|
||||
var list = new List<SubMeshGpu>(subMeshes.Count);
|
||||
foreach (var sm in subMeshes)
|
||||
list.Add(UploadSubMesh(sm));
|
||||
_gpuByGfxObj[gfxObjId] = list;
|
||||
}
|
||||
|
||||
private SubMeshGpu UploadSubMesh(GfxObjSubMesh sm)
|
||||
{
|
||||
uint vao = _gl.GenVertexArray();
|
||||
_gl.BindVertexArray(vao);
|
||||
|
||||
uint vbo = _gl.GenBuffer();
|
||||
_gl.BindBuffer(BufferTargetARB.ArrayBuffer, vbo);
|
||||
fixed (void* p = sm.Vertices)
|
||||
_gl.BufferData(BufferTargetARB.ArrayBuffer,
|
||||
(nuint)(sm.Vertices.Length * sizeof(Vertex)), p, BufferUsageARB.StaticDraw);
|
||||
|
||||
uint ebo = _gl.GenBuffer();
|
||||
_gl.BindBuffer(BufferTargetARB.ElementArrayBuffer, ebo);
|
||||
fixed (void* p = sm.Indices)
|
||||
_gl.BufferData(BufferTargetARB.ElementArrayBuffer,
|
||||
(nuint)(sm.Indices.Length * sizeof(uint)), p, BufferUsageARB.StaticDraw);
|
||||
|
||||
uint stride = (uint)sizeof(Vertex);
|
||||
_gl.EnableVertexAttribArray(0);
|
||||
_gl.VertexAttribPointer(0, 3, VertexAttribPointerType.Float, false, stride, (void*)0);
|
||||
_gl.EnableVertexAttribArray(1);
|
||||
_gl.VertexAttribPointer(1, 3, VertexAttribPointerType.Float, false, stride, (void*)(3 * sizeof(float)));
|
||||
_gl.EnableVertexAttribArray(2);
|
||||
_gl.VertexAttribPointer(2, 2, VertexAttribPointerType.Float, false, stride, (void*)(6 * sizeof(float)));
|
||||
_gl.EnableVertexAttribArray(3);
|
||||
_gl.VertexAttribIPointer(3, 1, VertexAttribIType.UnsignedInt, stride, (void*)(8 * sizeof(float)));
|
||||
|
||||
_gl.BindVertexArray(0);
|
||||
|
||||
return new SubMeshGpu
|
||||
{
|
||||
Vao = vao,
|
||||
Vbo = vbo,
|
||||
Ebo = ebo,
|
||||
IndexCount = sm.Indices.Length,
|
||||
SurfaceId = sm.SurfaceId,
|
||||
// Capture translucency at upload time so the draw loop never
|
||||
// has to look it up from external state.
|
||||
Translucency = sm.Translucency,
|
||||
};
|
||||
}
|
||||
|
||||
public void Draw(ICamera camera,
|
||||
IEnumerable<(uint LandblockId, Vector3 AabbMin, Vector3 AabbMax, IReadOnlyList<WorldEntity> Entities)> landblockEntries,
|
||||
FrustumPlanes? frustum = null,
|
||||
uint? neverCullLandblockId = null)
|
||||
{
|
||||
_shader.Use();
|
||||
_shader.SetMatrix4("uView", camera.View);
|
||||
_shader.SetMatrix4("uProjection", camera.Projection);
|
||||
|
||||
// ── Pass 1: Opaque + ClipMap ──────────────────────────────────────────
|
||||
// Depth write on (default). No blending. ClipMap surfaces use the
|
||||
// alpha-discard path in the fragment shader (uTranslucencyKind == 1).
|
||||
foreach (var entry in landblockEntries)
|
||||
{
|
||||
// Per-landblock frustum cull. Never cull the player's landblock.
|
||||
if (frustum is not null &&
|
||||
entry.LandblockId != neverCullLandblockId &&
|
||||
!FrustumCuller.IsAabbVisible(frustum.Value, entry.AabbMin, entry.AabbMax))
|
||||
continue;
|
||||
|
||||
foreach (var entity in entry.Entities)
|
||||
{
|
||||
if (entity.MeshRefs.Count == 0)
|
||||
continue;
|
||||
|
||||
foreach (var meshRef in entity.MeshRefs)
|
||||
{
|
||||
if (!_gpuByGfxObj.TryGetValue(meshRef.GfxObjId, out var subMeshes))
|
||||
continue;
|
||||
|
||||
var entityRoot =
|
||||
Matrix4x4.CreateFromQuaternion(entity.Rotation) *
|
||||
Matrix4x4.CreateTranslation(entity.Position);
|
||||
var model = meshRef.PartTransform * entityRoot;
|
||||
_shader.SetMatrix4("uModel", model);
|
||||
|
||||
foreach (var sub in subMeshes)
|
||||
{
|
||||
// Skip translucent sub-meshes in the first pass.
|
||||
if (sub.Translucency != TranslucencyKind.Opaque &&
|
||||
sub.Translucency != TranslucencyKind.ClipMap)
|
||||
continue;
|
||||
|
||||
_shader.SetInt("uTranslucencyKind", (int)sub.Translucency);
|
||||
|
||||
uint tex = ResolveTex(entity, meshRef, sub);
|
||||
_gl.ActiveTexture(TextureUnit.Texture0);
|
||||
_gl.BindTexture(TextureTarget.Texture2D, tex);
|
||||
|
||||
_gl.BindVertexArray(sub.Vao);
|
||||
_gl.DrawElements(PrimitiveType.Triangles, (uint)sub.IndexCount, DrawElementsType.UnsignedInt, (void*)0);
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
// ── Pass 2: Translucent (AlphaBlend, Additive, InvAlpha) ─────────────
|
||||
// Depth test on so translucents composite correctly behind opaque geometry.
|
||||
// Depth write OFF so translucents don't occlude each other or downstream
|
||||
// opaque draws. Blend function is set per-draw based on TranslucencyKind.
|
||||
//
|
||||
// NOTE: translucent draws are NOT sorted by depth — overlapping translucent
|
||||
// surfaces can composite in the wrong order. Portal-sized billboards don't
|
||||
// overlap in practice so this is acceptable and avoids a larger refactor.
|
||||
_gl.Enable(EnableCap.Blend);
|
||||
_gl.DepthMask(false);
|
||||
|
||||
// Phase 9.2: enable back-face culling for the translucent pass so
|
||||
// closed-shell translucents (lifestone crystal, glow gems, any
|
||||
// convex blended mesh) don't draw their back faces over their
|
||||
// front faces in arbitrary iteration order. Without this, the
|
||||
// 58 triangles of the lifestone crystal composited with an
|
||||
// "inside-out" look where the user saw through one face into
|
||||
// the hollow interior. With back-face culling on, back faces are
|
||||
// dropped at rasterization time, front faces composite as-is,
|
||||
// and depth ordering within the front-facing subset is a
|
||||
// non-issue for closed convex-ish shells. Matches WorldBuilder's
|
||||
// per-batch CullMode handling in
|
||||
// references/WorldBuilder/Chorizite.OpenGLSDLBackend/Lib/
|
||||
// BaseObjectRenderManager.cs:361-365.
|
||||
//
|
||||
// Our fan triangulation emits pos-side polygons as
|
||||
// (0, i, i+1) which is CCW in standard OpenGL conventions, so
|
||||
// GL_BACK + CCW front is the correct state. Neg-side polygons
|
||||
// (if any) use reversed winding and get culled here — that's a
|
||||
// known limitation and matches the opaque-pass behavior since
|
||||
// neg-side polys are virtually never translucent in AC content.
|
||||
_gl.Enable(EnableCap.CullFace);
|
||||
_gl.CullFace(TriangleFace.Back);
|
||||
_gl.FrontFace(FrontFaceDirection.Ccw);
|
||||
|
||||
foreach (var entry in landblockEntries)
|
||||
{
|
||||
// Same per-landblock frustum cull for pass 2.
|
||||
if (frustum is not null &&
|
||||
entry.LandblockId != neverCullLandblockId &&
|
||||
!FrustumCuller.IsAabbVisible(frustum.Value, entry.AabbMin, entry.AabbMax))
|
||||
continue;
|
||||
|
||||
foreach (var entity in entry.Entities)
|
||||
{
|
||||
if (entity.MeshRefs.Count == 0)
|
||||
continue;
|
||||
|
||||
foreach (var meshRef in entity.MeshRefs)
|
||||
{
|
||||
if (!_gpuByGfxObj.TryGetValue(meshRef.GfxObjId, out var subMeshes))
|
||||
continue;
|
||||
|
||||
var entityRoot =
|
||||
Matrix4x4.CreateFromQuaternion(entity.Rotation) *
|
||||
Matrix4x4.CreateTranslation(entity.Position);
|
||||
var model = meshRef.PartTransform * entityRoot;
|
||||
_shader.SetMatrix4("uModel", model);
|
||||
|
||||
foreach (var sub in subMeshes)
|
||||
{
|
||||
if (sub.Translucency == TranslucencyKind.Opaque ||
|
||||
sub.Translucency == TranslucencyKind.ClipMap)
|
||||
continue;
|
||||
|
||||
// Set per-draw blend function.
|
||||
switch (sub.Translucency)
|
||||
{
|
||||
case TranslucencyKind.Additive:
|
||||
// src*a + dst — portal swirls, glows
|
||||
_gl.BlendFunc(BlendingFactor.SrcAlpha, BlendingFactor.One);
|
||||
break;
|
||||
|
||||
case TranslucencyKind.InvAlpha:
|
||||
// src*(1-a) + dst*a
|
||||
_gl.BlendFunc(BlendingFactor.OneMinusSrcAlpha, BlendingFactor.SrcAlpha);
|
||||
break;
|
||||
|
||||
default: // AlphaBlend
|
||||
// src*a + dst*(1-a)
|
||||
_gl.BlendFunc(BlendingFactor.SrcAlpha, BlendingFactor.OneMinusSrcAlpha);
|
||||
break;
|
||||
}
|
||||
|
||||
_shader.SetInt("uTranslucencyKind", (int)sub.Translucency);
|
||||
|
||||
uint tex = ResolveTex(entity, meshRef, sub);
|
||||
_gl.ActiveTexture(TextureUnit.Texture0);
|
||||
_gl.BindTexture(TextureTarget.Texture2D, tex);
|
||||
|
||||
_gl.BindVertexArray(sub.Vao);
|
||||
_gl.DrawElements(PrimitiveType.Triangles, (uint)sub.IndexCount, DrawElementsType.UnsignedInt, (void*)0);
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
// Restore default GL state for subsequent renderers (terrain etc.).
|
||||
_gl.DepthMask(true);
|
||||
_gl.Disable(EnableCap.Blend);
|
||||
_gl.Disable(EnableCap.CullFace);
|
||||
|
||||
_gl.BindVertexArray(0);
|
||||
}
|
||||
|
||||
/// <summary>
|
||||
/// Resolves the GL texture id for a sub-mesh, honouring palette and
|
||||
/// texture overrides carried on the entity and the mesh-ref.
|
||||
/// </summary>
|
||||
private uint ResolveTex(WorldEntity entity, MeshRef meshRef, SubMeshGpu sub)
|
||||
{
|
||||
uint overrideOrigTex = 0;
|
||||
bool hasOrigTexOverride = meshRef.SurfaceOverrides is not null
|
||||
&& meshRef.SurfaceOverrides.TryGetValue(sub.SurfaceId, out overrideOrigTex);
|
||||
uint? origTexOverride = hasOrigTexOverride ? overrideOrigTex : (uint?)null;
|
||||
|
||||
if (entity.PaletteOverride is not null)
|
||||
{
|
||||
return _textures.GetOrUploadWithPaletteOverride(
|
||||
sub.SurfaceId, origTexOverride, entity.PaletteOverride);
|
||||
}
|
||||
else if (hasOrigTexOverride)
|
||||
{
|
||||
return _textures.GetOrUploadWithOrigTextureOverride(sub.SurfaceId, overrideOrigTex);
|
||||
}
|
||||
else
|
||||
{
|
||||
return _textures.GetOrUpload(sub.SurfaceId);
|
||||
}
|
||||
}
|
||||
|
||||
public void Dispose()
|
||||
{
|
||||
foreach (var subs in _gpuByGfxObj.Values)
|
||||
{
|
||||
foreach (var sub in subs)
|
||||
{
|
||||
_gl.DeleteBuffer(sub.Vbo);
|
||||
_gl.DeleteBuffer(sub.Ebo);
|
||||
_gl.DeleteVertexArray(sub.Vao);
|
||||
}
|
||||
}
|
||||
_gpuByGfxObj.Clear();
|
||||
}
|
||||
|
||||
private sealed class SubMeshGpu
|
||||
{
|
||||
public uint Vao;
|
||||
public uint Vbo;
|
||||
public uint Ebo;
|
||||
public int IndexCount;
|
||||
public uint SurfaceId;
|
||||
/// <summary>
|
||||
/// Cached from GfxObjSubMesh.Translucency at upload time.
|
||||
/// Avoids any per-draw lookup into external state.
|
||||
/// </summary>
|
||||
public TranslucencyKind Translucency;
|
||||
}
|
||||
}
|
||||
|
|
@ -29,10 +29,22 @@ public sealed unsafe class TextureCache : Wb.ITextureCachePerInstance, IDisposab
|
|||
private readonly Dictionary<(uint surfaceId, uint origTexOverride, ulong paletteHash), uint> _handlesByPalette = new();
|
||||
private uint _magentaHandle;
|
||||
|
||||
public TextureCache(GL gl, DatCollection dats)
|
||||
private readonly Wb.BindlessSupport? _bindless;
|
||||
|
||||
// Bindless / Texture2DArray parallel caches. Keys mirror the legacy three
|
||||
// caches so a surface used by both the legacy (Texture2D, sampler2D) and
|
||||
// modern (Texture2DArray, sampler2DArray) paths is uploaded twice — once
|
||||
// per target. Each entry stores both the GL texture name (for Dispose
|
||||
// cleanup) and the resident bindless handle (returned to callers).
|
||||
private readonly Dictionary<uint, (uint Name, ulong Handle)> _bindlessBySurfaceId = new();
|
||||
private readonly Dictionary<(uint surfaceId, uint origTexOverride), (uint Name, ulong Handle)> _bindlessByOverridden = new();
|
||||
private readonly Dictionary<(uint surfaceId, uint origTexOverride, ulong paletteHash), (uint Name, ulong Handle)> _bindlessByPalette = new();
|
||||
|
||||
public TextureCache(GL gl, DatCollection dats, Wb.BindlessSupport? bindless = null)
|
||||
{
|
||||
_gl = gl;
|
||||
_dats = dats;
|
||||
_bindless = bindless;
|
||||
}
|
||||
|
||||
/// <summary>
|
||||
|
|
@ -149,6 +161,82 @@ public sealed unsafe class TextureCache : Wb.ITextureCachePerInstance, IDisposab
|
|||
return h;
|
||||
}
|
||||
|
||||
/// <summary>
|
||||
/// 64-bit bindless handle variant of <see cref="GetOrUpload"/> for the WB
|
||||
/// modern rendering path. Uploads the texture as a 1-layer Texture2DArray
|
||||
/// (so the shader's <c>sampler2DArray</c> can sample at layer 0) and returns
|
||||
/// a resident bindless handle. Caches by surfaceId in a separate dictionary
|
||||
/// from the legacy Texture2D path; the same surface may be uploaded twice
|
||||
/// if used by both paths (acceptable transition cost — N.6 deletes the legacy
|
||||
/// path).
|
||||
/// Throws if BindlessSupport wasn't provided to the constructor.
|
||||
/// </summary>
|
||||
public ulong GetOrUploadBindless(uint surfaceId)
|
||||
{
|
||||
EnsureBindlessAvailable();
|
||||
if (_bindlessBySurfaceId.TryGetValue(surfaceId, out var entry))
|
||||
return entry.Handle;
|
||||
var decoded = DecodeFromDats(surfaceId, origTextureOverride: null, paletteOverride: null);
|
||||
uint name = UploadRgba8AsLayer1Array(decoded);
|
||||
ulong handle = _bindless!.GetResidentHandle(name);
|
||||
_bindlessBySurfaceId[surfaceId] = (name, handle);
|
||||
return handle;
|
||||
}
|
||||
|
||||
/// <summary>
|
||||
/// 64-bit bindless handle variant of <see cref="GetOrUploadWithOrigTextureOverride"/>
|
||||
/// for the WB modern rendering path. Uploads the texture as a 1-layer
|
||||
/// Texture2DArray with the override SurfaceTexture id and returns a resident
|
||||
/// bindless handle. Caches under a separate composite key from the legacy
|
||||
/// path. Throws if BindlessSupport wasn't provided to the constructor.
|
||||
/// </summary>
|
||||
public ulong GetOrUploadWithOrigTextureOverrideBindless(uint surfaceId, uint overrideOrigTextureId)
|
||||
{
|
||||
EnsureBindlessAvailable();
|
||||
var key = (surfaceId, overrideOrigTextureId);
|
||||
if (_bindlessByOverridden.TryGetValue(key, out var entry))
|
||||
return entry.Handle;
|
||||
var decoded = DecodeFromDats(surfaceId, origTextureOverride: overrideOrigTextureId, paletteOverride: null);
|
||||
uint name = UploadRgba8AsLayer1Array(decoded);
|
||||
ulong handle = _bindless!.GetResidentHandle(name);
|
||||
_bindlessByOverridden[key] = (name, handle);
|
||||
return handle;
|
||||
}
|
||||
|
||||
/// <summary>
|
||||
/// 64-bit bindless handle variant of <see cref="GetOrUploadWithPaletteOverride"/>
|
||||
/// for the WB modern rendering path. Applies the palette override on top of
|
||||
/// the texture's default palette before decoding, uploads as a 1-layer
|
||||
/// Texture2DArray, and returns a resident bindless handle. Takes a
|
||||
/// precomputed palette hash so the WB dispatcher can compute it once per
|
||||
/// entity. Throws if BindlessSupport wasn't provided to the constructor.
|
||||
/// </summary>
|
||||
public ulong GetOrUploadWithPaletteOverrideBindless(
|
||||
uint surfaceId,
|
||||
uint? overrideOrigTextureId,
|
||||
PaletteOverride paletteOverride,
|
||||
ulong precomputedPaletteHash)
|
||||
{
|
||||
EnsureBindlessAvailable();
|
||||
uint origTexKey = overrideOrigTextureId ?? 0;
|
||||
var key = (surfaceId, origTexKey, precomputedPaletteHash);
|
||||
if (_bindlessByPalette.TryGetValue(key, out var entry))
|
||||
return entry.Handle;
|
||||
var decoded = DecodeFromDats(surfaceId, origTextureOverride: overrideOrigTextureId, paletteOverride: paletteOverride);
|
||||
uint name = UploadRgba8AsLayer1Array(decoded);
|
||||
ulong handle = _bindless!.GetResidentHandle(name);
|
||||
_bindlessByPalette[key] = (name, handle);
|
||||
return handle;
|
||||
}
|
||||
|
||||
private void EnsureBindlessAvailable()
|
||||
{
|
||||
if (_bindless is null)
|
||||
throw new InvalidOperationException(
|
||||
"TextureCache constructed without BindlessSupport — cannot generate bindless handles. " +
|
||||
"WbDrawDispatcher requires the bindless-aware ctor overload (pass non-null BindlessSupport).");
|
||||
}
|
||||
|
||||
/// <summary>
|
||||
/// Cheap 64-bit hash over a palette override's identity so two
|
||||
/// entities with the same palette setup share a decode. Internal so
|
||||
|
|
@ -279,17 +367,79 @@ public sealed unsafe class TextureCache : Wb.ITextureCachePerInstance, IDisposab
|
|||
return tex;
|
||||
}
|
||||
|
||||
/// <summary>
|
||||
/// Variant of <see cref="UploadRgba8"/> that uploads pixel data as a 1-layer
|
||||
/// Texture2DArray. Required by the WB modern rendering path which samples via
|
||||
/// sampler2DArray in its bindless shader. Pixel data is identical.
|
||||
/// </summary>
|
||||
private uint UploadRgba8AsLayer1Array(DecodedTexture decoded)
|
||||
{
|
||||
uint tex = _gl.GenTexture();
|
||||
_gl.BindTexture(TextureTarget.Texture2DArray, tex);
|
||||
|
||||
fixed (byte* p = decoded.Rgba8)
|
||||
_gl.TexImage3D(
|
||||
TextureTarget.Texture2DArray,
|
||||
0,
|
||||
InternalFormat.Rgba8,
|
||||
(uint)decoded.Width,
|
||||
(uint)decoded.Height,
|
||||
depth: 1,
|
||||
border: 0,
|
||||
PixelFormat.Rgba,
|
||||
PixelType.UnsignedByte,
|
||||
p);
|
||||
|
||||
_gl.TexParameter(TextureTarget.Texture2DArray, TextureParameterName.TextureMinFilter, (int)TextureMinFilter.Linear);
|
||||
_gl.TexParameter(TextureTarget.Texture2DArray, TextureParameterName.TextureMagFilter, (int)TextureMagFilter.Linear);
|
||||
_gl.TexParameter(TextureTarget.Texture2DArray, TextureParameterName.TextureWrapS, (int)TextureWrapMode.Repeat);
|
||||
_gl.TexParameter(TextureTarget.Texture2DArray, TextureParameterName.TextureWrapT, (int)TextureWrapMode.Repeat);
|
||||
|
||||
_gl.BindTexture(TextureTarget.Texture2DArray, 0);
|
||||
return tex;
|
||||
}
|
||||
|
||||
public void Dispose()
|
||||
{
|
||||
// Phase 1: make all bindless handles non-resident BEFORE any
|
||||
// DeleteTexture call. ARB_bindless_texture requires that resident
|
||||
// handles be released before their backing texture is deleted —
|
||||
// interleaving per-entry is UB. Single null-guard around the whole
|
||||
// block (cleaner than per-call null-conditionals).
|
||||
if (_bindless is not null)
|
||||
{
|
||||
foreach (var (_, handle) in _bindlessBySurfaceId.Values)
|
||||
_bindless.MakeNonResident(handle);
|
||||
foreach (var (_, handle) in _bindlessByOverridden.Values)
|
||||
_bindless.MakeNonResident(handle);
|
||||
foreach (var (_, handle) in _bindlessByPalette.Values)
|
||||
_bindless.MakeNonResident(handle);
|
||||
}
|
||||
|
||||
// Phase 2: delete the Texture2DArray textures backing those handles.
|
||||
foreach (var (name, _) in _bindlessBySurfaceId.Values)
|
||||
_gl.DeleteTexture(name);
|
||||
_bindlessBySurfaceId.Clear();
|
||||
foreach (var (name, _) in _bindlessByOverridden.Values)
|
||||
_gl.DeleteTexture(name);
|
||||
_bindlessByOverridden.Clear();
|
||||
foreach (var (name, _) in _bindlessByPalette.Values)
|
||||
_gl.DeleteTexture(name);
|
||||
_bindlessByPalette.Clear();
|
||||
|
||||
// Phase 3: legacy Texture2D textures.
|
||||
foreach (var h in _handlesBySurfaceId.Values)
|
||||
_gl.DeleteTexture(h);
|
||||
_handlesBySurfaceId.Clear();
|
||||
|
||||
foreach (var h in _handlesByOverridden.Values)
|
||||
_gl.DeleteTexture(h);
|
||||
_handlesByOverridden.Clear();
|
||||
|
||||
foreach (var h in _handlesByPalette.Values)
|
||||
_gl.DeleteTexture(h);
|
||||
_handlesByPalette.Clear();
|
||||
|
||||
if (_magentaHandle != 0)
|
||||
{
|
||||
_gl.DeleteTexture(_magentaHandle);
|
||||
|
|
|
|||
55
src/AcDream.App/Rendering/Wb/BindlessSupport.cs
Normal file
|
|
@ -0,0 +1,55 @@
using Silk.NET.OpenGL;
using Silk.NET.OpenGL.Extensions.ARB;

namespace AcDream.App.Rendering.Wb;

/// <summary>
/// Thin wrapper around <see cref="ArbBindlessTexture"/> + capability detection
/// for the modern rendering path. Constructed once at startup via
/// <see cref="TryCreate"/>, which returns false if the extension isn't present.
/// </summary>
public sealed class BindlessSupport
{
    private readonly ArbBindlessTexture _ext;

    private BindlessSupport(ArbBindlessTexture extension)
    {
        _ext = extension;
    }

    public static bool TryCreate(GL gl, out BindlessSupport? support)
    {
        if (gl.TryGetExtension<ArbBindlessTexture>(out var ext))
        {
            support = new BindlessSupport(ext);
            return true;
        }
        support = null;
        return false;
    }

    /// <summary>Get a 64-bit bindless handle for the texture and make it resident.
    /// Idempotent: handle is the same for a given texture name.</summary>
    public ulong GetResidentHandle(uint textureName)
    {
        ulong h = _ext.GetTextureHandle(textureName);
        if (!_ext.IsTextureHandleResident(h))
            _ext.MakeTextureHandleResident(h);
        return h;
    }

    /// <summary>Release residency for a handle. Call before deleting the underlying texture.</summary>
    public void MakeNonResident(ulong handle)
    {
        if (_ext.IsTextureHandleResident(handle))
            _ext.MakeTextureHandleNonResident(handle);
    }

    /// <summary>Detect <c>GL_ARB_shader_draw_parameters</c> in addition to bindless.
    /// N.5's vertex shader uses <c>gl_BaseInstanceARB</c> and <c>gl_DrawIDARB</c>
    /// from this extension.</summary>
    public bool HasShaderDrawParameters(GL gl)
    {
        return gl.IsExtensionPresent("GL_ARB_shader_draw_parameters");
    }
}
|
||||
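Review note: the commit message says missing extensions throw NotSupportedException at startup with a clear error message. The startup call site is not part of this diff, so the following is only a minimal sketch of such a gate. `ModernPathGate`, `FindMissing` and `EnsureSupported` are hypothetical names, and the advertised extensions are passed in as plain strings instead of being queried from a GL context.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of the commit. The real code asks the GL
// context; here the advertised extension names are supplied by the caller.
public static class ModernPathGate
{
    private static readonly string[] Required =
    {
        "GL_ARB_bindless_texture",
        "GL_ARB_shader_draw_parameters",
    };

    // Returns the required extensions that are absent from 'present'.
    public static List<string> FindMissing(ICollection<string> present)
    {
        var missing = new List<string>();
        foreach (var name in Required)
            if (!present.Contains(name))
                missing.Add(name);
        return missing;
    }

    // Throws NotSupportedException naming every missing extension.
    public static void EnsureSupported(ICollection<string> present)
    {
        var missing = FindMissing(present);
        if (missing.Count > 0)
            throw new NotSupportedException(
                "Modern rendering path requires: " + string.Join(", ", missing));
    }
}
```

Listing every missing extension in one message, instead of failing on the first, lets a user see the whole gap from a single failed launch.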
17
src/AcDream.App/Rendering/Wb/DrawElementsIndirectCommand.cs
Normal file
@@ -0,0 +1,17 @@
using System.Runtime.InteropServices;

namespace AcDream.App.Rendering.Wb;

/// <summary>
/// Layout matches what <c>glMultiDrawElementsIndirect</c> expects.
/// Total size 20 bytes; arrays are typically uploaded with stride = sizeof(this).
/// </summary>
[StructLayout(LayoutKind.Sequential, Pack = 4)]
public struct DrawElementsIndirectCommand
{
    public uint Count;          // index count for this draw
    public uint InstanceCount;  // number of instances
    public uint FirstIndex;     // offset into IBO, in indices
    public int  BaseVertex;     // vertex offset into VBO
    public uint BaseInstance;   // first instance ID (offsets per-instance attribs / SSBO read)
}
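Review note: the 20-byte claim in the summary can be checked without a GL context. The sketch below re-declares the struct locally so it compiles on its own. `IndirectCommandLayout` and its members are illustrative names, not something this commit adds.

```csharp
using System;
using System.Runtime.InteropServices;

// Local copy of the struct from the diff so the sketch is self-contained.
[StructLayout(LayoutKind.Sequential, Pack = 4)]
public struct DrawElementsIndirectCommand
{
    public uint Count;
    public uint InstanceCount;
    public uint FirstIndex;
    public int BaseVertex;
    public uint BaseInstance;
}

// Hypothetical layout check; names are assumptions for illustration.
public static class IndirectCommandLayout
{
    // Byte offset of a field as the marshaller lays it out.
    public static int OffsetOf(string field) =>
        (int)Marshal.OffsetOf<DrawElementsIndirectCommand>(field);

    // True when the managed layout is five tightly packed 32-bit fields
    // in the order count, instanceCount, firstIndex, baseVertex, baseInstance.
    public static bool MatchesGlLayout() =>
        Marshal.SizeOf<DrawElementsIndirectCommand>() == 20
        && OffsetOf(nameof(DrawElementsIndirectCommand.Count)) == 0
        && OffsetOf(nameof(DrawElementsIndirectCommand.InstanceCount)) == 4
        && OffsetOf(nameof(DrawElementsIndirectCommand.FirstIndex)) == 8
        && OffsetOf(nameof(DrawElementsIndirectCommand.BaseVertex)) == 12
        && OffsetOf(nameof(DrawElementsIndirectCommand.BaseInstance)) == 16;
}
```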
|
|
@@ -1,6 +1,7 @@
|
|||
using System;
|
||||
using System.Collections.Generic;
|
||||
using System.Numerics;
|
||||
using System.Runtime.InteropServices;
|
||||
using AcDream.Core.Meshing;
|
||||
using AcDream.Core.Terrain;
|
||||
using AcDream.Core.World;
|
||||
|
|
@@ -12,45 +13,49 @@ namespace AcDream.App.Rendering.Wb;
|
|||
/// <summary>
|
||||
/// Draws entities using WB's <see cref="ObjectRenderData"/> (a single global
|
||||
/// VAO/VBO/IBO under modern rendering) with acdream's <see cref="TextureCache"/>
|
||||
/// for texture resolution and <see cref="AcSurfaceMetadataTable"/> for
|
||||
/// for bindless texture resolution and <see cref="AcSurfaceMetadataTable"/> for
|
||||
/// translucency classification.
|
||||
///
|
||||
/// <para>
|
||||
/// <b>Atlas-tier</b> entities (<c>ServerGuid == 0</c>): mesh data comes from WB's
|
||||
/// <see cref="ObjectMeshManager"/> via <see cref="WbMeshAdapter.TryGetRenderData"/>.
|
||||
/// Textures resolve through <see cref="TextureCache.GetOrUpload"/> using the batch's
|
||||
/// <c>SurfaceId</c>.
|
||||
/// Textures resolve through the bindless-suffixed
|
||||
/// <see cref="TextureCache.GetOrUploadBindless"/> variants, returning 64-bit
|
||||
/// resident handles stored in the per-group SSBO.
|
||||
/// </para>
|
||||
///
|
||||
/// <para>
|
||||
/// <b>Per-instance-tier</b> entities (<c>ServerGuid != 0</c>): mesh data also from
|
||||
/// WB, but textures resolve through <see cref="TextureCache"/> with palette and
|
||||
/// surface overrides applied. <see cref="AnimatedEntityState"/> is currently
|
||||
/// WB, but textures resolve through
|
||||
/// <see cref="TextureCache.GetOrUploadWithPaletteOverrideBindless"/> with palette
|
||||
/// and surface overrides applied. <see cref="AnimatedEntityState"/> is currently
|
||||
/// unused at draw time — GameWindow's spawn path already bakes AnimPartChanges +
|
||||
/// GfxObjDegradeResolver (Issue #47 close-detail mesh) into <c>MeshRefs</c>.
|
||||
/// </para>
|
||||
///
|
||||
/// <para>
|
||||
/// <b>GL strategy:</b> GROUPED instanced drawing. All visible (entity, batch)
|
||||
/// pairs are bucketed by <see cref="GroupKey"/>; within a group a single
|
||||
/// <c>glDrawElementsInstancedBaseVertexBaseInstance</c> renders all instances.
|
||||
/// All matrices for the frame land in one shared instance VBO via a single
|
||||
/// <c>BufferData</c> upload. This drops draw calls from O(entities×batches)
|
||||
/// to O(unique GfxObj×batch×texture) — typically two orders of magnitude fewer.
|
||||
/// <b>GL strategy (N.5 — mandatory):</b> <c>glMultiDrawElementsIndirect</c> with SSBOs
|
||||
/// and <c>GL_ARB_bindless_texture</c> + <c>GL_ARB_shader_draw_parameters</c>.
|
||||
/// All visible (entity, batch) pairs are bucketed by <see cref="GroupKey"/>;
|
||||
/// each group becomes one <c>DrawElementsIndirectCommand</c>. Three GPU buffers
|
||||
/// are uploaded per frame: instance matrices (SSBO binding 0), per-group batch
|
||||
/// metadata/texture handles (SSBO binding 1), and the indirect draw commands.
|
||||
/// Two <c>glMultiDrawElementsIndirect</c> calls cover the opaque and transparent
|
||||
/// passes respectively — one GL call per pass regardless of group count.
|
||||
/// </para>
|
||||
///
|
||||
/// <para>
|
||||
/// <b>Shader:</b> reuses <c>mesh_instanced</c> (vert locations 0-2 = Position/
|
||||
/// Normal/UV from WB's <c>VertexPositionNormalTexture</c>; locations 3-6 = instance
|
||||
/// matrix from our VBO). WB's 32-byte vertex stride is compatible.
|
||||
/// <b>Shader:</b> <c>mesh_modern</c> (bindless + <c>gl_DrawIDARB</c> /
|
||||
/// <c>gl_BaseInstanceARB</c>). Missing bindless/draw-parameters throws
|
||||
/// <see cref="NotSupportedException"/> at startup — there is no legacy fallback.
|
||||
/// </para>
|
||||
///
|
||||
/// <para>
|
||||
/// <b>Modern rendering assumption:</b> WB's <c>_useModernRendering</c> path (GL
|
||||
/// 4.3 + bindless) puts every mesh in a single shared VAO/VBO/IBO and uses
|
||||
/// <c>FirstIndex</c> + <c>BaseVertex</c> per batch. The dispatcher honors those
|
||||
/// offsets via <c>DrawElementsInstancedBaseVertex(BaseInstance)</c>. The legacy
|
||||
/// per-mesh-VAO path also works since FirstIndex/BaseVertex are zero there.
|
||||
/// offsets inside each <c>DrawElementsIndirectCommand</c> via
|
||||
/// <c>glMultiDrawElementsIndirect</c>.
|
||||
/// </para>
|
||||
/// </summary>
|
||||
public sealed unsafe class WbDrawDispatcher : IDisposable
|
||||
|
|
@@ -61,14 +66,40 @@ public sealed unsafe class WbDrawDispatcher : IDisposable
|
|||
private readonly WbMeshAdapter _meshAdapter;
|
||||
private readonly EntitySpawnAdapter _entitySpawnAdapter;
|
||||
|
||||
private readonly uint _instanceVbo;
|
||||
private readonly HashSet<uint> _patchedVaos = new();
|
||||
private readonly BindlessSupport _bindless;
|
||||
|
||||
// SSBO buffer ids
|
||||
private uint _instanceSsbo;
|
||||
private uint _batchSsbo;
|
||||
private uint _indirectBuffer;
|
||||
|
||||
// Per-frame scratch arrays — Tasks 9-10 fully wire these.
|
||||
private float[] _instanceData = new float[256 * 16]; // mat4 floats per instance
|
||||
private BatchData[] _batchData = new BatchData[256];
|
||||
private DrawElementsIndirectCommand[] _indirectCommands = new DrawElementsIndirectCommand[256];
|
||||
|
||||
private int _opaqueDrawCount;
|
||||
private int _transparentDrawCount;
|
||||
private int _transparentByteOffset;
|
||||
|
||||
// std430 layout: ulong TextureHandle (uvec2) at offset 0, uint TextureLayer
|
||||
// at offset 8, uint Flags at offset 12. Total 16 bytes.
|
||||
// Pack=8 (not 4) because std430's uvec2 requires 8-byte alignment — Pack=4
|
||||
// works today by accident (TextureHandle is the first field, so offset 0 is
|
||||
// always 8-byte aligned), but adding a 4-byte field before TextureHandle
|
||||
// without bumping Pack would silently misalign the GPU struct.
|
||||
[StructLayout(LayoutKind.Sequential, Pack = 8)]
|
||||
private struct BatchData
|
||||
{
|
||||
public ulong TextureHandle; // bindless handle (uvec2 in GLSL)
|
||||
public uint TextureLayer;
|
||||
public uint Flags;
|
||||
}
|
||||
|
||||
// Per-frame scratch — reused across frames to avoid per-frame allocation.
|
||||
private readonly Dictionary<GroupKey, InstanceGroup> _groups = new();
|
||||
private readonly List<InstanceGroup> _opaqueDraws = new();
|
||||
private readonly List<InstanceGroup> _translucentDraws = new();
|
||||
private float[] _instanceBuffer = new float[256 * 16]; // grow on demand, never shrink
|
||||
|
||||
// Per-entity-cull AABB radius. Conservative — covers most entities; large
|
||||
// outliers (long banners, tall columns) are still landblock-culled.
|
||||
|
|
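Review note: the `BatchData` comment in the hunk above describes a 16-byte std430 record whose 64-bit handle is read as a `uvec2` in GLSL. The sketch below is an assumption-labelled illustration of that packing on a little-endian host, where the first 32-bit word is the low half of the handle. `BatchDataSketch` and `BindlessHandlePacking` are hypothetical names and do not appear in the commit.

```csharp
using System;
using System.Runtime.InteropServices;

// Same field shape as the private BatchData in the diff, re-declared here
// so the sketch compiles on its own.
[StructLayout(LayoutKind.Sequential, Pack = 8)]
public struct BatchDataSketch
{
    public ulong TextureHandle; // offset 0, read as uvec2 on the GPU
    public uint TextureLayer;   // offset 8
    public uint Flags;          // offset 12
}

// Hypothetical helper showing how one ulong maps onto two 32-bit words.
public static class BindlessHandlePacking
{
    // Lo is the word stored first in memory on a little-endian host.
    public static (uint Lo, uint Hi) Split(ulong handle) =>
        ((uint)(handle & 0xFFFFFFFFUL), (uint)(handle >> 32));

    // Inverse of Split.
    public static ulong Join(uint lo, uint hi) => ((ulong)hi << 32) | lo;
}
```

Checking `Marshal.SizeOf<BatchDataSketch>() == 16` in a unit test would catch an accidental field reorder before it reaches the GPU.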
@@ -84,12 +115,23 @@ public sealed unsafe class WbDrawDispatcher : IDisposable
|
|||
private int _instancesIssued;
|
||||
private long _lastLogTick;
|
||||
|
||||
// CPU + GPU timing for [WB-DIAG] under ACDREAM_WB_DIAG=1.
|
||||
private readonly System.Diagnostics.Stopwatch _cpuStopwatch = new();
|
||||
private readonly long[] _cpuSamples = new long[256]; // microseconds
|
||||
private int _cpuSampleCursor;
|
||||
private uint _gpuQueryOpaque;
|
||||
private uint _gpuQueryTransparent;
|
||||
private readonly long[] _gpuSamples = new long[256]; // microseconds
|
||||
private int _gpuSampleCursor;
|
||||
private bool _gpuQueriesInitialized;
|
||||
|
||||
public WbDrawDispatcher(
|
||||
GL gl,
|
||||
Shader shader,
|
||||
TextureCache textures,
|
||||
WbMeshAdapter meshAdapter,
|
||||
EntitySpawnAdapter entitySpawnAdapter)
|
||||
EntitySpawnAdapter entitySpawnAdapter,
|
||||
BindlessSupport bindless)
|
||||
{
|
||||
ArgumentNullException.ThrowIfNull(gl);
|
||||
ArgumentNullException.ThrowIfNull(shader);
|
||||
|
|
@@ -103,7 +145,10 @@ public sealed unsafe class WbDrawDispatcher : IDisposable
|
|||
_meshAdapter = meshAdapter;
|
||||
_entitySpawnAdapter = entitySpawnAdapter;
|
||||
|
||||
_instanceVbo = _gl.GenBuffer();
|
||||
_bindless = bindless ?? throw new ArgumentNullException(nameof(bindless));
|
||||
_instanceSsbo = _gl.GenBuffer();
|
||||
_batchSsbo = _gl.GenBuffer();
|
||||
_indirectBuffer = _gl.GenBuffer();
|
||||
}
|
||||
|
||||
public static Matrix4x4 ComposePartWorldMatrix(
|
||||
|
|
@@ -126,6 +171,16 @@ public sealed unsafe class WbDrawDispatcher : IDisposable
|
|||
|
||||
bool diag = string.Equals(Environment.GetEnvironmentVariable("ACDREAM_WB_DIAG"), "1", StringComparison.Ordinal);
|
||||
|
||||
if (diag && !_gpuQueriesInitialized)
|
||||
{
|
||||
_gpuQueryOpaque = _gl.GenQuery();
|
||||
_gpuQueryTransparent = _gl.GenQuery();
|
||||
_gpuQueriesInitialized = true;
|
||||
}
|
||||
|
||||
// Always run the CPU stopwatch — cheap; only logged under diag.
|
||||
_cpuStopwatch.Restart();
|
||||
|
||||
// Camera world-space position for front-to-back sort (perf #2). The view
|
||||
// matrix is the inverse of the camera's world transform, so the world
|
||||
// translation lives in the inverse's translation row.
|
||||
|
|
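Review note: the comment closing the hunk above says the camera's world position is the translation row of the inverted view matrix. A minimal sketch of that step with `System.Numerics` follows. `CameraMath.TryGetCameraPosition` is a hypothetical helper name, and the dispatcher's actual code for this step is not shown in the diff.

```csharp
using System;
using System.Numerics;

// Hypothetical helper illustrating the inverse-view trick.
public static class CameraMath
{
    // Returns false when the view matrix cannot be inverted.
    public static bool TryGetCameraPosition(Matrix4x4 view, out Vector3 position)
    {
        if (Matrix4x4.Invert(view, out var world))
        {
            // System.Numerics uses row vectors, so the translation
            // lives in the fourth row: (M41, M42, M43).
            position = world.Translation;
            return true;
        }
        position = Vector3.Zero;
        return false;
    }
}
```

For a view matrix built with `Matrix4x4.CreateLookAt(eye, target, up)`, the recovered position should equal `eye` up to floating-point error.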
@@ -235,23 +290,24 @@ public sealed unsafe class WbDrawDispatcher : IDisposable
|
|||
// Nothing visible — skip the GL pass entirely.
|
||||
if (anyVao == 0)
|
||||
{
|
||||
_cpuStopwatch.Stop();
|
||||
if (diag) MaybeFlushDiag();
|
||||
return;
|
||||
}
|
||||
|
||||
// ── Phase 2: lay matrices out contiguously, assign per-group offsets,
|
||||
// split into opaque/translucent + compute sort keys ─────────
|
||||
// ── Phase 3: assign FirstInstance per group, lay matrices contiguously, sort opaque ──
|
||||
int totalInstances = 0;
|
||||
foreach (var grp in _groups.Values) totalInstances += grp.Matrices.Count;
|
||||
if (totalInstances == 0)
|
||||
{
|
||||
_cpuStopwatch.Stop();
|
||||
if (diag) MaybeFlushDiag();
|
||||
return;
|
||||
}
|
||||
|
||||
int needed = totalInstances * 16;
|
||||
if (_instanceBuffer.Length < needed)
|
||||
_instanceBuffer = new float[needed + 256 * 16]; // headroom
|
||||
if (_instanceData.Length < needed)
|
||||
_instanceData = new float[needed + 256 * 16];
|
||||
|
||||
_opaqueDraws.Clear();
|
||||
_translucentDraws.Clear();
|
||||
|
|
@@ -268,17 +324,17 @@ public sealed unsafe class WbDrawDispatcher : IDisposable
|
|||
// position for front-to-back sort (perf #2). Cheap heuristic; works
|
||||
// well when instances of one group are spatially coherent
|
||||
// (typical for trees in one landblock area, NPCs at one spawn).
|
||||
var firstM = grp.Matrices[0];
|
||||
var grpPos = new Vector3(firstM.M41, firstM.M42, firstM.M43);
|
||||
var first = grp.Matrices[0];
|
||||
var grpPos = new Vector3(first.M41, first.M42, first.M43);
|
||||
grp.SortDistance = Vector3.DistanceSquared(camPos, grpPos);
|
||||
|
||||
for (int i = 0; i < grp.Matrices.Count; i++)
|
||||
{
|
||||
WriteMatrix(_instanceBuffer, cursor * 16, grp.Matrices[i]);
|
||||
WriteMatrix(_instanceData, cursor * 16, grp.Matrices[i]);
|
||||
cursor++;
|
||||
}
|
||||
|
||||
if (grp.Translucency == TranslucencyKind.Opaque || grp.Translucency == TranslucencyKind.ClipMap)
|
||||
if (IsOpaque(grp.Translucency))
|
||||
_opaqueDraws.Add(grp);
|
||||
else
|
||||
_translucentDraws.Add(grp);
|
||||
|
|
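Review note: the hunk above keys each group's sort distance on its first instance's translation and then sorts the opaque list front to back. The sketch below isolates that heuristic with simplified types. `GroupSketch` and `FrontToBack` are hypothetical stand-ins for the private `InstanceGroup` and the inline code in `Draw`.

```csharp
using System;
using System.Collections.Generic;
using System.Numerics;

// Simplified stand-in for the dispatcher's private InstanceGroup.
public sealed class GroupSketch
{
    public string Name = "";
    public List<Matrix4x4> Matrices = new();
    public float SortDistance;
}

public static class FrontToBack
{
    // Sorts nearest-first using only the first instance of each group.
    // Groups are assumed to be non-empty.
    public static void Sort(List<GroupSketch> groups, Vector3 camPos)
    {
        foreach (var g in groups)
        {
            // Squared distance is enough for ordering and avoids a sqrt.
            g.SortDistance = Vector3.DistanceSquared(camPos, g.Matrices[0].Translation);
        }
        groups.Sort(static (a, b) => a.SortDistance.CompareTo(b.SortDistance));
    }
}
```

The heuristic is cheap because it touches one matrix per group, at the cost of mis-ordering groups whose instances are spread far apart.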
@@ -290,90 +346,141 @@ public sealed unsafe class WbDrawDispatcher : IDisposable
|
|||
// Foundry interior).
|
||||
_opaqueDraws.Sort(static (a, b) => a.SortDistance.CompareTo(b.SortDistance));
|
||||
|
||||
// ── Phase 3: one upload of all matrices ─────────────────────────────
|
||||
_gl.BindBuffer(BufferTargetARB.ArrayBuffer, _instanceVbo);
|
||||
fixed (float* p = _instanceBuffer)
|
||||
_gl.BufferData(BufferTargetARB.ArrayBuffer,
|
||||
(nuint)(totalInstances * 16 * sizeof(float)), p, BufferUsageARB.DynamicDraw);
|
||||
// ── Phase 4: build IndirectGroupInput list (opaque sorted, then translucent),
|
||||
// fill via BuildIndirectArrays ──────────────────────────────────
|
||||
int totalDraws = _opaqueDraws.Count + _translucentDraws.Count;
|
||||
if (_batchData.Length < totalDraws)
|
||||
_batchData = new BatchData[totalDraws + 64];
|
||||
if (_indirectCommands.Length < totalDraws)
|
||||
_indirectCommands = new DrawElementsIndirectCommand[totalDraws + 64];
|
||||
|
||||
// ── Phase 4: bind VAO once (modern rendering shares one global VAO) ──
|
||||
EnsureInstanceAttribs(anyVao);
|
||||
var groupInputs = new List<IndirectGroupInput>(totalDraws);
|
||||
foreach (var g in _opaqueDraws) groupInputs.Add(ToInput(g));
|
||||
foreach (var g in _translucentDraws) groupInputs.Add(ToInput(g));
|
||||
|
||||
// Cast _batchData (private BatchData) to public-mirror BatchDataPublic for BuildIndirectArrays.
|
||||
// Layout is asserted at test time (BatchDataPublic_LayoutMatchesPrivateBatchData test).
|
||||
var batchPublic = new BatchDataPublic[totalDraws];
|
||||
var layout = BuildIndirectArrays(groupInputs, _indirectCommands, batchPublic);
|
||||
|
||||
// Copy back into _batchData
|
||||
for (int i = 0; i < totalDraws; i++)
|
||||
{
|
||||
_batchData[i] = new BatchData
|
||||
{
|
||||
TextureHandle = batchPublic[i].TextureHandle,
|
||||
TextureLayer = batchPublic[i].TextureLayer,
|
||||
Flags = batchPublic[i].Flags,
|
||||
};
|
||||
}
|
||||
_opaqueDrawCount = layout.OpaqueCount;
|
||||
_transparentDrawCount = layout.TransparentCount;
|
||||
_transparentByteOffset = layout.TransparentByteOffset;
|
||||
|
||||
// ── Phase 5: upload three buffers ───────────────────────────────────
|
||||
fixed (float* ip = _instanceData)
|
||||
UploadSsbo(_instanceSsbo, 0, ip, totalInstances * 16 * sizeof(float));
|
||||
|
||||
fixed (BatchData* bp = _batchData)
|
||||
UploadSsbo(_batchSsbo, 1, bp, totalDraws * sizeof(BatchData));
|
||||
|
||||
fixed (DrawElementsIndirectCommand* cp = _indirectCommands)
|
||||
{
|
||||
_gl.BindBuffer(BufferTargetARB.DrawIndirectBuffer, _indirectBuffer);
|
||||
_gl.BufferData(BufferTargetARB.DrawIndirectBuffer,
|
||||
(nuint)(totalDraws * sizeof(DrawElementsIndirectCommand)), cp, BufferUsageARB.DynamicDraw);
|
||||
}
|
||||
|
||||
// ── Phase 6: bind global VAO once ───────────────────────────────────
|
||||
_gl.BindVertexArray(anyVao);
|
||||
|
||||
// ── Phase 5: opaque + ClipMap pass (front-to-back sorted) ───────────
|
||||
if (string.Equals(Environment.GetEnvironmentVariable("ACDREAM_NO_CULL"), "1", StringComparison.Ordinal))
|
||||
_gl.Disable(EnableCap.CullFace);
|
||||
|
||||
foreach (var grp in _opaqueDraws)
|
||||
// ── Phase 7: opaque pass ─────────────────────────────────────────────
|
||||
if (_opaqueDrawCount > 0)
|
||||
{
|
||||
_shader.SetInt("uTranslucencyKind", (int)grp.Translucency);
|
||||
DrawGroup(grp);
|
||||
_gl.Disable(EnableCap.Blend);
|
||||
_gl.DepthMask(true);
|
||||
_shader.SetInt("uRenderPass", 0);
|
||||
_gl.BindBuffer(BufferTargetARB.DrawIndirectBuffer, _indirectBuffer);
|
||||
if (diag && _gpuQueriesInitialized) _gl.BeginQuery(QueryTarget.TimeElapsed, _gpuQueryOpaque);
|
||||
_gl.MultiDrawElementsIndirect(
|
||||
PrimitiveType.Triangles,
|
||||
DrawElementsType.UnsignedShort,
|
||||
(void*)0,
|
||||
(uint)_opaqueDrawCount,
|
||||
(uint)DrawCommandStride);
|
||||
if (diag && _gpuQueriesInitialized) _gl.EndQuery(QueryTarget.TimeElapsed);
|
||||
}
|
||||
|
||||
// ── Phase 6: translucent pass ───────────────────────────────────────
|
||||
_gl.Enable(EnableCap.Blend);
|
||||
_gl.DepthMask(false);
|
||||
|
||||
if (string.Equals(Environment.GetEnvironmentVariable("ACDREAM_NO_CULL"), "1", StringComparison.Ordinal))
|
||||
// ── Phase 8: transparent pass ────────────────────────────────────────
|
||||
if (_transparentDrawCount > 0)
|
||||
{
|
||||
_gl.Disable(EnableCap.CullFace);
|
||||
}
|
||||
else
|
||||
{
|
||||
_gl.Enable(EnableCap.CullFace);
|
||||
_gl.CullFace(TriangleFace.Back);
|
||||
_gl.FrontFace(FrontFaceDirection.Ccw);
|
||||
_gl.Enable(EnableCap.Blend);
|
||||
_gl.BlendFunc(BlendingFactor.SrcAlpha, BlendingFactor.OneMinusSrcAlpha);
|
||||
_gl.DepthMask(false);
|
||||
_shader.SetInt("uRenderPass", 1);
|
||||
if (diag && _gpuQueriesInitialized) _gl.BeginQuery(QueryTarget.TimeElapsed, _gpuQueryTransparent);
|
||||
_gl.MultiDrawElementsIndirect(
|
||||
PrimitiveType.Triangles,
|
||||
DrawElementsType.UnsignedShort,
|
||||
(void*)_transparentByteOffset,
|
||||
(uint)_transparentDrawCount,
|
||||
(uint)DrawCommandStride);
|
||||
if (diag && _gpuQueriesInitialized) _gl.EndQuery(QueryTarget.TimeElapsed);
|
||||
_gl.DepthMask(true);
|
||||
_gl.Disable(EnableCap.Blend);
|
||||
}
|
||||
|
||||
foreach (var grp in _translucentDraws)
|
||||
{
|
||||
switch (grp.Translucency)
|
||||
{
|
||||
case TranslucencyKind.Additive:
|
||||
_gl.BlendFunc(BlendingFactor.SrcAlpha, BlendingFactor.One);
|
||||
break;
|
||||
case TranslucencyKind.InvAlpha:
|
||||
_gl.BlendFunc(BlendingFactor.OneMinusSrcAlpha, BlendingFactor.SrcAlpha);
|
||||
break;
|
||||
default:
|
||||
_gl.BlendFunc(BlendingFactor.SrcAlpha, BlendingFactor.OneMinusSrcAlpha);
|
||||
break;
|
||||
}
|
||||
|
||||
_shader.SetInt("uTranslucencyKind", (int)grp.Translucency);
|
||||
DrawGroup(grp);
|
||||
}
|
||||
|
||||
_gl.DepthMask(true);
|
||||
_gl.Disable(EnableCap.Blend);
|
||||
_gl.Disable(EnableCap.CullFace);
|
||||
_gl.BindVertexArray(0);
|
||||
|
||||
_cpuStopwatch.Stop();
|
||||
|
||||
if (diag)
|
||||
{
|
||||
_drawsIssued += _opaqueDraws.Count + _translucentDraws.Count;
|
||||
long cpuUs = _cpuStopwatch.ElapsedTicks * 1_000_000L / System.Diagnostics.Stopwatch.Frequency;
|
||||
_cpuSamples[_cpuSampleCursor] = cpuUs;
|
||||
_cpuSampleCursor = (_cpuSampleCursor + 1) % _cpuSamples.Length;
|
||||
|
||||
// Poll the GPU timers without blocking. These queries were issued earlier
// in this same Draw call, so the result is often not available yet; in that
// case drop the sample rather than stall the CPU waiting for the GPU.
|
||||
if (_gpuQueriesInitialized)
|
||||
{
|
||||
_gl.GetQueryObject(_gpuQueryOpaque, QueryObjectParameterName.ResultAvailable, out int avail);
|
||||
if (avail != 0)
|
||||
{
|
||||
_gl.GetQueryObject(_gpuQueryOpaque, QueryObjectParameterName.Result, out ulong opaqueNs);
|
||||
_gl.GetQueryObject(_gpuQueryTransparent, QueryObjectParameterName.Result, out ulong transNs);
|
||||
long gpuUs = (long)((opaqueNs + transNs) / 1000UL);
|
||||
_gpuSamples[_gpuSampleCursor] = gpuUs;
|
||||
_gpuSampleCursor = (_gpuSampleCursor + 1) % _gpuSamples.Length;
|
||||
}
|
||||
}
|
||||
|
||||
_drawsIssued += _opaqueDrawCount + _transparentDrawCount;
|
||||
_instancesIssued += totalInstances;
|
||||
MaybeFlushDiag();
|
||||
}
|
||||
}
|
||||
|
||||
private void DrawGroup(InstanceGroup grp)
|
||||
{
|
||||
_gl.ActiveTexture(TextureUnit.Texture0);
|
||||
_gl.BindTexture(TextureTarget.Texture2D, grp.TextureHandle);
|
||||
_gl.BindBuffer(BufferTargetARB.ElementArrayBuffer, grp.Ibo);
|
||||
private static IndirectGroupInput ToInput(InstanceGroup g) => new(
|
||||
IndexCount: g.IndexCount,
|
||||
FirstIndex: g.FirstIndex,
|
||||
BaseVertex: g.BaseVertex,
|
||||
InstanceCount: g.InstanceCount,
|
||||
FirstInstance: g.FirstInstance,
|
||||
TextureHandle: g.BindlessTextureHandle,
|
||||
TextureLayer: g.TextureLayer,
|
||||
Translucency: g.Translucency);
|
||||
|
||||
// BaseInstance offsets the per-instance attribute fetches into our
|
||||
// shared instance VBO so each group reads its own slice. Requires
|
||||
// GL_ARB_base_instance (GL 4.2+); WB requires 4.3 so this is available.
|
||||
_gl.DrawElementsInstancedBaseVertexBaseInstance(
|
||||
PrimitiveType.Triangles,
|
||||
(uint)grp.IndexCount,
|
||||
DrawElementsType.UnsignedShort,
|
||||
(void*)(grp.FirstIndex * sizeof(ushort)),
|
||||
(uint)grp.InstanceCount,
|
||||
grp.BaseVertex,
|
||||
(uint)grp.FirstInstance);
|
||||
private unsafe void UploadSsbo(uint ssbo, uint binding, void* data, int byteCount)
|
||||
{
|
||||
_gl.BindBuffer(BufferTargetARB.ShaderStorageBuffer, ssbo);
|
||||
_gl.BufferData(BufferTargetARB.ShaderStorageBuffer, (nuint)byteCount, data, BufferUsageARB.DynamicDraw);
|
||||
_gl.BindBufferBase(BufferTargetARB.ShaderStorageBuffer, binding, ssbo);
|
||||
}
|
||||
|
||||
private void MaybeFlushDiag()
|
||||
|
|
@@ -381,13 +488,41 @@ public sealed unsafe class WbDrawDispatcher : IDisposable
|
|||
long now = Environment.TickCount64;
|
||||
if (now - _lastLogTick > 5000)
|
||||
{
|
||||
long cpuMed = MedianMicros(_cpuSamples);
|
||||
long cpuP95 = Percentile95Micros(_cpuSamples);
|
||||
long gpuMed = MedianMicros(_gpuSamples);
|
||||
long gpuP95 = Percentile95Micros(_gpuSamples);
|
||||
Console.WriteLine(
|
||||
$"[WB-DIAG] entSeen={_entitiesSeen} entDrawn={_entitiesDrawn} meshMissing={_meshesMissing} drawsIssued={_drawsIssued} instances={_instancesIssued} groups={_groups.Count}");
|
||||
$"[WB-DIAG] entSeen={_entitiesSeen} entDrawn={_entitiesDrawn} meshMissing={_meshesMissing} drawsIssued={_drawsIssued} instances={_instancesIssued} groups={_groups.Count} " +
|
||||
$"cpu_us={cpuMed}m/{cpuP95}p95 gpu_us={gpuMed}m/{gpuP95}p95");
|
||||
_entitiesSeen = _entitiesDrawn = _meshesMissing = _drawsIssued = _instancesIssued = 0;
|
||||
_lastLogTick = now;
|
||||
// Don't reset the sample buffers — they're a moving window of the
|
||||
// last 256 frames; clearing per 5s flush would lose recent history.
|
||||
}
|
||||
}
|
||||
|
||||
private static long MedianMicros(long[] samples)
|
||||
{
|
||||
var copy = (long[])samples.Clone();
|
||||
Array.Sort(copy);
|
||||
int nz = 0;
|
||||
foreach (var v in copy) if (v > 0) nz++;
|
||||
if (nz == 0) return 0;
|
||||
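return copy[copy.Length - nz + nz / 2]; // non-zero samples fill the last nz sorted slots; the old index overran the array when nz == 1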
|
||||
}
|
||||
|
||||
private static long Percentile95Micros(long[] samples)
|
||||
{
|
||||
var copy = (long[])samples.Clone();
|
||||
Array.Sort(copy);
|
||||
int nz = 0;
|
||||
foreach (var v in copy) if (v > 0) nz++;
|
||||
if (nz == 0) return 0;
|
||||
int idx = copy.Length - 1 - (int)(nz * 0.05);
|
||||
return copy[idx];
|
||||
}
|
||||
|
||||
private void ClassifyBatches(
|
||||
ObjectRenderData renderData,
|
||||
ulong gfxObjId,
|
||||
|
|
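Review note: `MedianMicros` in the hunk above treats zero entries in the 256-slot ring as "not yet written" and takes the median of the rest. After an ascending sort the non-zero samples sit in the last `nz` slots, so the index has to be measured from the start of that tail. The sketch below shows that arithmetic in isolation. `RingStats.MedianNonZero` is a hypothetical name, and samples are assumed to be non-negative.

```csharp
using System;

// Hypothetical standalone version of the median-over-non-zero-samples idea.
public static class RingStats
{
    // Median of the strictly positive entries; 0 when there are none.
    // With an even count this returns the upper of the two middle values.
    public static long MedianNonZero(long[] samples)
    {
        var copy = (long[])samples.Clone();
        Array.Sort(copy);

        int nz = 0;
        foreach (var v in copy) if (v > 0) nz++;
        if (nz == 0) return 0;

        int tailStart = copy.Length - nz; // first non-zero slot after sorting
        return copy[tailStart + nz / 2];
    }
}
```

With a single recorded sample the index resolves to the last slot, which matters on the first diagnostic flush, when the ring buffer is almost empty.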
@@ -413,12 +548,16 @@ public sealed unsafe class WbDrawDispatcher : IDisposable
|
|||
: TranslucencyKind.Opaque;
|
||||
}
|
||||
|
||||
uint texHandle = ResolveTexture(entity, meshRef, batch, palHash);
|
||||
ulong texHandle = ResolveTexture(entity, meshRef, batch, palHash);
|
||||
if (texHandle == 0) continue;
|
||||
|
||||
// TextureLayer is always 0 for per-instance composites; non-zero when
|
||||
// WB atlas is adopted in N.6+ and batches reference a shared atlas layer.
|
||||
uint texLayer = 0;
|
||||
|
||||
var key = new GroupKey(
|
||||
batch.IBO, batch.FirstIndex, (int)batch.BaseVertex,
|
||||
batch.IndexCount, texHandle, translucency);
|
||||
batch.IndexCount, texHandle, texLayer, translucency);
|
||||
|
||||
if (!_groups.TryGetValue(key, out var grp))
|
||||
{
|
||||
|
|
@@ -428,7 +567,8 @@ public sealed unsafe class WbDrawDispatcher : IDisposable
|
|||
FirstIndex = batch.FirstIndex,
|
||||
BaseVertex = (int)batch.BaseVertex,
|
||||
IndexCount = batch.IndexCount,
|
||||
TextureHandle = texHandle,
|
||||
BindlessTextureHandle = texHandle,
|
||||
TextureLayer = texLayer,
|
||||
Translucency = translucency,
|
||||
};
|
||||
_groups[key] = grp;
|
||||
|
|
@@ -437,10 +577,8 @@ public sealed unsafe class WbDrawDispatcher : IDisposable
|
|||
}
|
||||
}
|
||||
|
||||
private uint ResolveTexture(WorldEntity entity, MeshRef meshRef, ObjectRenderBatch batch, ulong palHash)
|
||||
private ulong ResolveTexture(WorldEntity entity, MeshRef meshRef, ObjectRenderBatch batch, ulong palHash)
|
||||
{
|
||||
// WB stores the surface id on batch.Key.SurfaceId (TextureKey struct);
|
||||
// batch.SurfaceId is unset (zero) for batches built by ObjectMeshManager.
|
||||
uint surfaceId = batch.Key.SurfaceId;
|
||||
if (surfaceId == 0 || surfaceId == 0xFFFFFFFF) return 0;
|
||||
|
||||
|
|
@@ -451,34 +589,16 @@ public sealed unsafe class WbDrawDispatcher : IDisposable
|
|||
|
||||
if (entity.PaletteOverride is not null)
|
||||
{
|
||||
// perf #4: pass the entity-precomputed palette hash so TextureCache
|
||||
// can skip its internal HashPaletteOverride for repeat lookups
|
||||
// within the same character.
|
||||
return _textures.GetOrUploadWithPaletteOverride(
|
||||
return _textures.GetOrUploadWithPaletteOverrideBindless(
|
||||
surfaceId, origTexOverride, entity.PaletteOverride, palHash);
|
||||
}
|
||||
else if (hasOrigTexOverride)
|
||||
{
|
||||
return _textures.GetOrUploadWithOrigTextureOverride(surfaceId, overrideOrigTex);
|
||||
return _textures.GetOrUploadWithOrigTextureOverrideBindless(surfaceId, overrideOrigTex);
|
||||
}
|
||||
else
|
||||
{
|
||||
return _textures.GetOrUpload(surfaceId);
|
||||
}
|
||||
}
|
||||
|
||||
private void EnsureInstanceAttribs(uint vao)
|
||||
{
|
||||
if (!_patchedVaos.Add(vao)) return;
|
||||
|
||||
_gl.BindVertexArray(vao);
|
||||
_gl.BindBuffer(BufferTargetARB.ArrayBuffer, _instanceVbo);
|
||||
for (uint row = 0; row < 4; row++)
|
||||
{
|
||||
uint loc = 3 + row;
|
||||
_gl.EnableVertexAttribArray(loc);
|
||||
_gl.VertexAttribPointer(loc, 4, VertexAttribPointerType.Float, false, 64, (void*)(row * 16));
|
||||
_gl.VertexAttribDivisor(loc, 1);
|
||||
return _textures.GetOrUploadBindless(surfaceId);
|
||||
}
|
||||
}
|
||||
|
||||
|
|
@@ -494,15 +614,138 @@ public sealed unsafe class WbDrawDispatcher : IDisposable
|
|||
{
|
||||
if (_disposed) return;
|
||||
_disposed = true;
|
||||
_gl.DeleteBuffer(_instanceVbo);
|
||||
_gl.DeleteBuffer(_instanceSsbo);
|
||||
_gl.DeleteBuffer(_batchSsbo);
|
||||
_gl.DeleteBuffer(_indirectBuffer);
|
||||
if (_gpuQueriesInitialized)
|
||||
{
|
||||
_gl.DeleteQuery(_gpuQueryOpaque);
|
||||
_gl.DeleteQuery(_gpuQueryTransparent);
|
||||
}
|
||||
}
|
||||
|
||||
    // ── Public types + helpers for BuildIndirectArrays (Task 9) ─────────────
    //
    // These are public so the pure-CPU unit tests in AcDream.Core.Tests can
    // exercise BuildIndirectArrays without needing a GL context.

    /// <summary>
    /// Stride in bytes of <c>DrawElementsIndirectCommand</c> in the indirect buffer.
    /// 5 × 4-byte fields = 20 bytes. Tests and callers reference this symbolically
    /// rather than hard-coding <c>20</c>, so a layout change only has to be updated here.
    /// </summary>
    public const int DrawCommandStride = 20; // sizeof(DrawElementsIndirectCommand): 5 × uint

    /// <summary>
    /// Public view of the per-group inputs to <see cref="BuildIndirectArrays"/> (used in tests).
    /// </summary>
    public readonly record struct IndirectGroupInput(
        int IndexCount,
        uint FirstIndex,
        int BaseVertex,
        int InstanceCount,
        int FirstInstance,
        ulong TextureHandle,
        uint TextureLayer,
        TranslucencyKind Translucency);

    /// <summary>
    /// Public mirror of the per-group <see cref="BatchData"/> uploaded to the SSBO.
    /// Tests verify the layout. Same field shape as the private BatchData.
    /// </summary>
    [StructLayout(LayoutKind.Sequential, Pack = 8)]
    public struct BatchDataPublic
    {
        public ulong TextureHandle;
        public uint TextureLayer;
        public uint Flags;
    }

    /// <summary>Result of <see cref="BuildIndirectArrays"/>.</summary>
    public readonly record struct IndirectLayoutResult(
        int OpaqueCount,
        int TransparentCount,
        int TransparentByteOffset);

    /// <summary>
    /// Lays out the indirect commands + parallel BatchData array contiguously:
    /// opaque section first (caller sorts before calling), transparent section second.
    /// Pure CPU, no GL state. Caller passes pre-sized scratch arrays.
    /// </summary>
    /// <remarks>
    /// Classification: Opaque + ClipMap → opaque pass (ClipMap uses discard, not
    /// blending). Everything else (AlphaBlend, Additive, InvAlpha) → transparent pass.
    /// </remarks>
    public static IndirectLayoutResult BuildIndirectArrays(
        IReadOnlyList<IndirectGroupInput> groups,
        DrawElementsIndirectCommand[] indirectScratch,
        BatchDataPublic[] batchScratch)
    {
        int opaqueCount = 0;
        int transparentCount = 0;

        foreach (var g in groups)
        {
            if (IsOpaque(g.Translucency)) opaqueCount++;
            else transparentCount++;
        }

        int oi = 0;           // opaque write cursor (fills [0..opaqueCount))
        int ti = opaqueCount; // transparent write cursor (fills [opaqueCount..end))

        foreach (var g in groups)
        {
            var dec = new DrawElementsIndirectCommand
            {
                Count = (uint)g.IndexCount,
                InstanceCount = (uint)g.InstanceCount,
                FirstIndex = g.FirstIndex,
                BaseVertex = g.BaseVertex,
                BaseInstance = (uint)g.FirstInstance,
            };
            var bd = new BatchDataPublic
            {
                TextureHandle = g.TextureHandle,
                TextureLayer = g.TextureLayer,
                Flags = 0,
            };

            if (IsOpaque(g.Translucency))
            {
                indirectScratch[oi] = dec;
                batchScratch[oi] = bd;
                oi++;
            }
            else
            {
                indirectScratch[ti] = dec;
                batchScratch[ti] = bd;
                ti++;
            }
        }

        return new IndirectLayoutResult(opaqueCount, transparentCount, opaqueCount * DrawCommandStride);
    }

    /// <summary>
    /// Public test shim for <see cref="IsOpaque"/>. Locks in the N.5 Decision 2
    /// translucency partition: Opaque + ClipMap → opaque indirect; AlphaBlend +
    /// Additive + InvAlpha → transparent indirect.
    /// </summary>
    public static bool IsOpaquePublic(TranslucencyKind t) => IsOpaque(t);

    private static bool IsOpaque(TranslucencyKind t)
        => t == TranslucencyKind.Opaque || t == TranslucencyKind.ClipMap;

    // ────────────────────────────────────────────────────────────────────────
|
||||
private readonly record struct GroupKey(
|
||||
uint Ibo,
|
||||
uint FirstIndex,
|
||||
int BaseVertex,
|
||||
int IndexCount,
|
||||
uint TextureHandle,
|
||||
ulong BindlessTextureHandle,
|
||||
uint TextureLayer,
|
||||
TranslucencyKind Translucency);
|
||||
|
||||
private sealed class InstanceGroup
|
||||
|
|
@@ -511,7 +754,8 @@ public sealed unsafe class WbDrawDispatcher : IDisposable
         public uint FirstIndex;
         public int BaseVertex;
         public int IndexCount;
-        public uint TextureHandle;
+        public ulong BindlessTextureHandle; // 64-bit (was uint TextureHandle in N.4)
+        public uint TextureLayer; // 0 for per-instance composites; non-zero when WB atlas is adopted in N.6+
         public TranslucencyKind Translucency;
         public int FirstInstance; // offset into the shared instance VBO (in instances, not bytes)
         public int InstanceCount;
@@ -1,39 +0,0 @@
namespace AcDream.App.Rendering.Wb;

/// <summary>
/// Process-lifetime cache of <c>ACDREAM_USE_WB_FOUNDATION</c> env var.
/// Read once at static-init time; all consumers import this rather than
/// re-reading the env var per call (env-var lookups on Windows are not
/// free at hot-path cadence).
///
/// <para>
/// <b>Default-on as of Phase N.4 ship (2026-05-08).</b> The WB foundation
/// (<c>WbMeshAdapter</c> + <c>WbDrawDispatcher</c>) is the production
/// rendering path. Set <c>ACDREAM_USE_WB_FOUNDATION=0</c> to fall back
/// to the legacy <c>InstancedMeshRenderer</c> path — kept as an escape
/// hatch until N.6 fully replaces it.
/// </para>
///
/// <para>
/// Per-instance customized content (server <c>CreateObject</c> entities
/// with palette / texture overrides) routes through
/// <see cref="TextureCache.GetOrUploadWithPaletteOverride"/> regardless
/// of the flag — the flag controls which DRAW path consumes those
/// textures.
/// </para>
/// </summary>
public static class WbFoundationFlag
{
    private static bool _isEnabled =
        System.Environment.GetEnvironmentVariable("ACDREAM_USE_WB_FOUNDATION") != "0";

    public static bool IsEnabled => _isEnabled;

    /// <summary>
    /// FOR TESTS ONLY. Forces <see cref="IsEnabled"/> to <c>true</c> so
    /// integration tests can exercise the WB adapter path without having to
    /// set the env var before static initialisation. Never call from
    /// production code.
    /// </summary>
    internal static void ForTestsOnly_ForceEnable() => _isEnabled = true;
}
@@ -144,7 +144,7 @@ public sealed class GpuWorldState
         }

         _loaded[landblock.LandblockId] = landblock;
-        if (WbFoundationFlag.IsEnabled && _wbSpawnAdapter is not null)
+        if (_wbSpawnAdapter is not null)
             _wbSpawnAdapter.OnLandblockLoaded(_loaded[landblock.LandblockId]);
         RebuildFlatView();
     }
@@ -195,7 +195,7 @@ public sealed class GpuWorldState

     public void RemoveLandblock(uint landblockId)
     {
-        if (WbFoundationFlag.IsEnabled && _wbSpawnAdapter is not null)
+        if (_wbSpawnAdapter is not null)
             _wbSpawnAdapter.OnLandblockUnloaded(landblockId);

         // Rescue persistent entities before removal. These get appended
@@ -0,0 +1,32 @@
using AcDream.App.Rendering;
using AcDream.App.Rendering.Wb;
using DatReaderWriter;
using Xunit;

namespace AcDream.Core.Tests.Rendering;

/// <summary>
/// Lightweight unit tests for <see cref="TextureCache"/>'s bindless path.
/// We can't construct a real TextureCache in a headless test (it requires a
/// live GL context), so this file documents contracts that future engineers
/// should preserve. Real bindless integration is verified at Task 14's
/// visual gate.
/// </summary>
public sealed class TextureCacheBindlessTests
{
    [Fact]
    public void Contract_BindlessMethodsThrowWithoutBindlessSupport()
    {
        // The actual throw lives in TextureCache.EnsureBindlessAvailable
        // and is reached only via GL-bound Bindless* method calls. The
        // contract is: if the dispatcher (which requires bindless) ever
        // gets a TextureCache constructed without BindlessSupport, it
        // should fail-fast with InvalidOperationException — NOT silently
        // route a draw to handle 0 (which would produce a non-resident
        // GPU fault).
        //
        // This test is a marker. Future engineers: do not weaken
        // EnsureBindlessAvailable to swallow the missing dependency.
        Assert.True(true, "Contract documented in TextureCache.EnsureBindlessAvailable");
    }
}
@@ -19,16 +19,9 @@ namespace AcDream.Core.Tests.Rendering.Wb;
 /// </summary>
 public sealed class PendingSpawnIntegrationTests
 {
-    /// <summary>
-    /// Force-enable WbFoundationFlag for this test class.
-    /// GpuWorldState gates its adapter calls on this static-cached flag;
-    /// calling the internal test hook lets us exercise the full integration
-    /// path without needing the env var set before process startup.
-    /// </summary>
-    static PendingSpawnIntegrationTests()
-    {
-        WbFoundationFlag.ForTestsOnly_ForceEnable();
-    }
+    // N.5 ship amendment: WbFoundationFlag was deleted — GpuWorldState
+    // no longer gates adapter calls on the flag; they are unconditional
+    // when the adapter is non-null. No static ctor hook needed.

     [Fact]
     public void LiveEntity_ParkedBeforeLandblock_DrainsButIsNotRegisteredWithAdapter()
@@ -0,0 +1,113 @@
using System.Numerics;
using AcDream.App.Rendering.Wb;
using AcDream.Core.Meshing;
using Xunit;

namespace AcDream.Core.Tests.Rendering.Wb;

/// <summary>
/// Pure CPU test of <see cref="WbDrawDispatcher.BuildIndirectArrays"/>.
/// Verifies that a synthetic group set lays out into the indirect buffer
/// + parallel batch data with opaque section first, transparent second,
/// per-group fields propagated correctly.
/// </summary>
public sealed class WbDrawDispatcherIndirectBuilderTests
{
    [Fact]
    public void TwoOpaqueGroupsAndOneTransparent_LaysOutContiguouslyOpaqueFirst()
    {
        // Arrange — three groups: 2 opaque (12+1 instances) + 1 transparent (12 instances)
        var groups = new List<WbDrawDispatcher.IndirectGroupInput>
        {
            new(IndexCount: 100, FirstIndex: 0, BaseVertex: 0, InstanceCount: 12, FirstInstance: 0, TextureHandle: 0xAA, TextureLayer: 0, Translucency: TranslucencyKind.Opaque),
            new(IndexCount: 200, FirstIndex: 100, BaseVertex: 0, InstanceCount: 12, FirstInstance: 12, TextureHandle: 0xBB, TextureLayer: 0, Translucency: TranslucencyKind.AlphaBlend),
            new(IndexCount: 50, FirstIndex: 300, BaseVertex: 100, InstanceCount: 1, FirstInstance: 24, TextureHandle: 0xCC, TextureLayer: 0, Translucency: TranslucencyKind.Opaque),
        };

        var indirect = new DrawElementsIndirectCommand[16];
        var batch = new WbDrawDispatcher.BatchDataPublic[16];

        // Act
        var result = WbDrawDispatcher.BuildIndirectArrays(groups, indirect, batch);

        // Assert layout
        Assert.Equal(2, result.OpaqueCount);
        Assert.Equal(1, result.TransparentCount);
        Assert.Equal(2 * 20, result.TransparentByteOffset); // sizeof(DEIC) = 20

        // Opaque section, in input order (Task 10 callers sort)
        Assert.Equal(100u, indirect[0].Count);
        Assert.Equal(0u, indirect[0].FirstIndex);
        Assert.Equal(0, indirect[0].BaseVertex);
        Assert.Equal(12u, indirect[0].InstanceCount);
        Assert.Equal(0u, indirect[0].BaseInstance);

        Assert.Equal(50u, indirect[1].Count);
        Assert.Equal(300u, indirect[1].FirstIndex);
        Assert.Equal(100, indirect[1].BaseVertex);
        Assert.Equal(1u, indirect[1].InstanceCount);
        Assert.Equal(24u, indirect[1].BaseInstance);

        // Transparent section
        Assert.Equal(200u, indirect[2].Count);
        Assert.Equal(100u, indirect[2].FirstIndex);
        Assert.Equal(12u, indirect[2].InstanceCount);
        Assert.Equal(12u, indirect[2].BaseInstance);

        // BatchData parallel — same indices as indirect
        Assert.Equal(0xAAul, batch[0].TextureHandle);
        Assert.Equal(0xCCul, batch[1].TextureHandle);
        Assert.Equal(0xBBul, batch[2].TextureHandle);
    }

    [Fact]
    public void EmptyGroupList_ProducesZeroCounts()
    {
        var groups = new List<WbDrawDispatcher.IndirectGroupInput>();
        var indirect = new DrawElementsIndirectCommand[0];
        var batch = new WbDrawDispatcher.BatchDataPublic[0];

        var result = WbDrawDispatcher.BuildIndirectArrays(groups, indirect, batch);

        Assert.Equal(0, result.OpaqueCount);
        Assert.Equal(0, result.TransparentCount);
        Assert.Equal(0, result.TransparentByteOffset);
    }

    [Fact]
    public void ClipMapTreatedAsOpaque()
    {
        // ClipMap surfaces (alpha-cutout) belong with the opaque pass
        // because the discard handles transparency, not blending.
        var groups = new List<WbDrawDispatcher.IndirectGroupInput>
        {
            new(IndexCount: 10, FirstIndex: 0, BaseVertex: 0, InstanceCount: 1, FirstInstance: 0, TextureHandle: 0x1, TextureLayer: 0, Translucency: TranslucencyKind.ClipMap),
        };
        var indirect = new DrawElementsIndirectCommand[4];
        var batch = new WbDrawDispatcher.BatchDataPublic[4];

        var result = WbDrawDispatcher.BuildIndirectArrays(groups, indirect, batch);

        Assert.Equal(1, result.OpaqueCount);
        Assert.Equal(0, result.TransparentCount);
    }

    [Fact]
    public void BatchDataPublic_LayoutMatchesPrivateBatchData()
    {
        // Task 10 will use MemoryMarshal.Cast<BatchData, BatchDataPublic> to
        // expose the dispatcher's per-frame BatchData[] scratch to BuildIndirectArrays
        // without copying. The cast is only safe if the structs have identical
        // layout (size, field offsets). Both use [StructLayout(Sequential, Pack=8)].
        Assert.Equal(16, System.Runtime.CompilerServices.Unsafe.SizeOf<WbDrawDispatcher.BatchDataPublic>());
        Assert.Equal(0, (int)System.Runtime.InteropServices.Marshal.OffsetOf<WbDrawDispatcher.BatchDataPublic>(nameof(WbDrawDispatcher.BatchDataPublic.TextureHandle)));
        Assert.Equal(8, (int)System.Runtime.InteropServices.Marshal.OffsetOf<WbDrawDispatcher.BatchDataPublic>(nameof(WbDrawDispatcher.BatchDataPublic.TextureLayer)));
        Assert.Equal(12, (int)System.Runtime.InteropServices.Marshal.OffsetOf<WbDrawDispatcher.BatchDataPublic>(nameof(WbDrawDispatcher.BatchDataPublic.Flags)));
    }

    [Fact]
    public void DrawCommandStride_MatchesStructSize()
    {
        Assert.Equal(WbDrawDispatcher.DrawCommandStride, System.Runtime.CompilerServices.Unsafe.SizeOf<DrawElementsIndirectCommand>());
    }
}
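The layout assertions in the test file above can be tried in isolation. The following is a minimal standalone sketch, not code from this commit: `BatchSketch`, `IndirectCommandSketch`, and `LayoutSketch` are hypothetical stand-ins that only assume the field order the tests imply (a 64-bit handle followed by two 32-bit fields, and the five 32-bit fields of an OpenGL indirect draw command).

```csharp
using System;
using System.Runtime.CompilerServices;
using System.Runtime.InteropServices;

// Hypothetical mirror of a 16-byte per-batch record (assumption, not the real type).
[StructLayout(LayoutKind.Sequential, Pack = 8)]
struct BatchSketch
{
    public ulong TextureHandle; // expected at offset 0
    public uint TextureLayer;   // expected at offset 8
    public uint Flags;          // expected at offset 12
}

// Hypothetical mirror of the five-field indirect draw command (assumption).
[StructLayout(LayoutKind.Sequential)]
struct IndirectCommandSketch
{
    public uint Count;
    public uint InstanceCount;
    public uint FirstIndex;
    public int BaseVertex;
    public uint BaseInstance;
}

static class LayoutSketch
{
    // Marshal.OffsetOf returns an IntPtr; convert it for printing.
    static int OffsetOf<T>(string field) => (int)Marshal.OffsetOf<T>(field);

    static void Main()
    {
        Console.WriteLine(Unsafe.SizeOf<BatchSketch>());                          // 16
        Console.WriteLine(OffsetOf<BatchSketch>(nameof(BatchSketch.TextureLayer))); // 8
        Console.WriteLine(OffsetOf<BatchSketch>(nameof(BatchSketch.Flags)));        // 12
        Console.WriteLine(Unsafe.SizeOf<IndirectCommandSketch>());                // 20
    }
}
```

If a field were reordered or widened in either mirror, the printed sizes and offsets would change, which is the kind of drift the real tests appear intended to catch before a reinterpreting cast is used.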
@@ -0,0 +1,25 @@
using AcDream.App.Rendering.Wb;
using AcDream.Core.Meshing;
using Xunit;

namespace AcDream.Core.Tests.Rendering.Wb;

/// <summary>
/// Locks in the N.5 translucency partition contract (spec Decision 2).
/// If the partition drifts, the dispatcher's opaque + transparent indirect
/// passes will silently render the wrong groups in the wrong pass — visible
/// regression that's hard to spot in code review.
/// </summary>
public sealed class WbDrawDispatcherTranslucencyTests
{
    [Theory]
    [InlineData(TranslucencyKind.Opaque, true)]
    [InlineData(TranslucencyKind.ClipMap, true)]
    [InlineData(TranslucencyKind.AlphaBlend, false)]
    [InlineData(TranslucencyKind.Additive, false)]
    [InlineData(TranslucencyKind.InvAlpha, false)]
    public void IsOpaque_PartitionsByKind(TranslucencyKind kind, bool expected)
    {
        Assert.Equal(expected, WbDrawDispatcher.IsOpaquePublic(kind));
    }
}
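The partition contract tested above pairs with a two-pass layout: count the opaque groups first, then scatter every group into one array using two write cursors. Below is a minimal standalone sketch of that idea, not code from this commit. `Kind`, `Group`, and `PartitionSketch` are hypothetical names, and the scratch-size check is an addition made for this sketch.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-ins for the real types; only the partition logic matters here.
enum Kind { Opaque, ClipMap, AlphaBlend, Additive, InvAlpha }

readonly record struct Group(string Name, Kind Translucency);

static class PartitionSketch
{
    static bool IsOpaque(Kind k) => k == Kind.Opaque || k == Kind.ClipMap;

    // Pass 1 counts opaque groups; pass 2 writes each group into its section.
    // Input order is preserved inside each section.
    public static (int OpaqueCount, int TransparentCount) Partition(
        IReadOnlyList<Group> groups, Group[] scratch)
    {
        if (scratch.Length < groups.Count)
            throw new ArgumentException("scratch too small", nameof(scratch));

        int opaqueCount = 0;
        foreach (var g in groups)
            if (IsOpaque(g.Translucency)) opaqueCount++;

        int oi = 0;           // opaque cursor, fills [0..opaqueCount)
        int ti = opaqueCount; // transparent cursor, fills [opaqueCount..groups.Count)
        foreach (var g in groups)
        {
            if (IsOpaque(g.Translucency)) scratch[oi++] = g;
            else scratch[ti++] = g;
        }
        return (opaqueCount, groups.Count - opaqueCount);
    }

    static void Main()
    {
        var groups = new List<Group>
        {
            new("A", Kind.Opaque),
            new("B", Kind.AlphaBlend),
            new("C", Kind.ClipMap),
        };
        var scratch = new Group[groups.Count];
        var (opaque, transparent) = Partition(groups, scratch);

        var names = new string[groups.Count];
        for (int i = 0; i < names.Length; i++) names[i] = scratch[i].Name;

        Console.WriteLine(string.Join(", ", names));                     // A, C, B
        Console.WriteLine($"opaque={opaque} transparent={transparent}"); // opaque=2 transparent=1
    }
}
```

Because both sections live in one array, a consumer only needs the opaque count to find where the transparent section starts, which is consistent with the byte offset the builder tests assert (opaque count times the command stride).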