acdream/docs/research/named-retail
Erik 69d884a3d6 tools(pdb-extract): #8 PDB -> symbols.json + types.json sidecar
Pure-Python MSF 7.00 PDB extractor (no deps, stdlib only). Reads
refs/acclient.pdb directly:
  - DBI stream (3) -> symbol record stream index + section header
    stream index
  - Section headers stream (9) -> per-segment image VA bases
  - Symbol record stream (8) -> S_PUB32 records with image VAs
  - TPI stream (2) -> LF_CLASS / LF_STRUCTURE named records (not
    forward-declared), with size leaf + name

Includes a best-effort MSVC C++ demangler so symbols.json is
grep-friendly:
  ?EnchantAttribute@CEnchantmentRegistry@@QBEHKAAK@Z
  -> CEnchantmentRegistry::EnchantAttribute

Both demangled `name` + raw `mangled` emitted per entry so callers
can choose. Operator overloads, vtables, and other special forms
where a partial demangle would be misleading are kept mangled.

Outputs committed to docs/research/named-retail/:
  - symbols.json (2.9 MB) — 18,366 named public function symbols
  - types.json (506 KB) — 5,371 unique named class/struct records

Spot check (matches discovery agent's earlier finding):
  CEnchantmentRegistry::EnchantAttribute -> 0x00594570 ✓

Updated docs/research/acclient_function_map.md header preamble to
direct readers at the new symbols.json as the authoritative name
source; the hand-curated table stays as the cross-port (ACE/ACME)
index. Several addresses there are wrong vs the PDB and will be
swept in the issue #9 close (Phase E).

Closes #8 (filed in Phase D's commit). Foundation for the address
sweep + name-driven workflows from here on.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-25 17:31:52 +02:00
acclient.c docs(research): commit named retail decomp + spells.csv (foundation) 2026-04-25 17:27:19 +02:00
acclient.h docs(research): commit named retail decomp + spells.csv (foundation) 2026-04-25 17:27:19 +02:00
acclient_2013_pseudo_c.txt docs(research): commit named retail decomp + spells.csv (foundation) 2026-04-25 17:27:19 +02:00
README.md docs(research): commit named retail decomp + spells.csv (foundation) 2026-04-25 17:27:19 +02:00
symbols.json tools(pdb-extract): #8 PDB -> symbols.json + types.json sidecar 2026-04-25 17:31:52 +02:00
types.json tools(pdb-extract): #8 PDB -> symbols.json + types.json sidecar 2026-04-25 17:31:52 +02:00

Named-retail decompilation reference

This is the primary reference for any AC-specific algorithm, formula, constant, wire format, or coordinate convention. Every retail symbol question goes here first — before touching docs/research/decompiled/ (the older Ghidra FUN_xxx chunks, which remain a fallback for chunk-by-chunk address-range navigation).

Contents

acclient_2013_pseudo_c.txt
  Source: Binary Ninja pseudo-C export of the Sept 2013 EoR acclient.exe build. 1,437,645 lines. 99.6% function-name recovery (54,873 named, 232 still sub_*). Class names + method names + many struct field names recovered.
  Use for: Primary symbol lookup. Grep by class::method to find function bodies. Address-prefixed lines: 00<address> <return-type> __<conv> Class::Method(args).

acclient.h
  Source: IDA-decompiled retail headers. 70,719 lines / 1.7 MB. Full struct + class definitions for the entire AC client object model: Attribute, SecondaryAttribute, AttributeCache, Attribute2ndTable, SkillFormula, Enchantment, CEnchantmentRegistry (with _mult_list / _add_list / _vitae), CSpellBook, MotionState, RawMotionState, MoveToStatePack, CACQualities, CPhysicsObj.
  Use for: Struct field names + offsets. When you need to know what a field is actually called, grep this file.

acclient.c
  Source: Ghidra (or IDA) full-binary decomp export. 1,327,522 lines / 46 MB. Mixed naming: ~5,100 named methods + ~8,553 still FUN_xxx. Has named struct types like _max_health, _add_list that the chunked Ghidra export under decompiled/ lacks.
  Use for: Secondary cross-reference. Useful when pseudo-C body is corrupt / packed and you need a different decompiler's view.

symbols.json
  Source: Generated by tools/pdb-extract/ from refs/acclient.pdb. 18,366 entries: {"address", "name", "obj_module"}.
  Use for: Programmatic symbol lookup. `jq '.[]

types.json
  Source: Generated by tools/pdb-extract/. 3,172 named struct/class type records with field offsets + sizes.
  Use for: Programmatic type-layout queries.
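
The address-prefixed line format noted for acclient_2013_pseudo_c.txt can also be parsed programmatically. The following is a minimal sketch, not part of the repository: the regular expression, the helper name parse_signature, and the sample line are assumptions derived only from the format string 00<address> <return-type> __<conv> Class::Method(args). The real export may contain variants (templates with spaces, operators, lines without a calling convention) that this does not cover.

```python
import re

# Assumed shape: 8 hex digits starting with "00", a return type,
# a "__"-prefixed calling convention, then a (possibly qualified) name.
SIG_RE = re.compile(
    r"^(?P<address>00[0-9a-fA-F]{6})\s+"
    r"(?P<return_type>.+?)\s+"
    r"__(?P<conv>\w+)\s+"
    r"(?P<name>[\w:~<>]+)\("
)


def parse_signature(line):
    """Return the parsed fields of a signature line, or None if it does not match."""
    match = SIG_RE.match(line)
    if match is None:
        return None
    fields = match.groupdict()
    fields["address"] = int(fields["address"], 16)  # hex text -> integer
    return fields


# Hypothetical sample line; the argument list is invented for illustration.
SAMPLE = (
    "00594570  int32_t __thiscall "
    "CEnchantmentRegistry::EnchantAttribute(void* this, uint32_t arg2)"
)
```

Running parse_signature over every line of the export and keeping the non-None results would give a name-to-address index comparable to symbols.json, which may be useful for cross-checking the two sources.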

Workflow — grep first, decompile second

The CLAUDE.md "Development workflow" mandates Step 0: GREP NAMED FIRST before any decompilation work. Concretely:

# Find a function by class::method:
grep -n "CEnchantmentRegistry::EnchantAttribute" docs/research/named-retail/acclient_2013_pseudo_c.txt

# Find a struct definition:
grep -n "^struct.*CEnchantmentRegistry" docs/research/named-retail/acclient.h

# Find by raw address (PDB and pseudo-C addresses match):
grep -n "^00594570" docs/research/named-retail/acclient_2013_pseudo_c.txt

# Programmatic symbol lookup:
jq '.[] | select(.name == "CEnchantmentRegistry::EnchantAttribute")' docs/research/named-retail/symbols.json
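
For scripts that cannot shell out to jq, the same lookup can be sketched in Python. This assumes symbols.json is a top-level JSON array of objects with a "name" key, as the jq filter above implies; the helper names load_symbols, find_symbols, and methods_of are hypothetical and do not exist in tools/pdb-extract/.

```python
import json
from pathlib import Path

SYMBOLS_PATH = Path("docs/research/named-retail/symbols.json")


def load_symbols(path=SYMBOLS_PATH):
    """Load the symbol list (assumed: a JSON array of objects)."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def find_symbols(entries, name):
    """Equivalent of: jq '.[] | select(.name == NAME)'."""
    return [entry for entry in entries if entry.get("name") == name]


def methods_of(entries, class_name):
    """All entries whose demangled name starts with 'Class::'."""
    prefix = class_name + "::"
    return [entry for entry in entries if entry.get("name", "").startswith(prefix)]


# Usage, from the repository root:
#   entries = load_symbols()
#   print(find_symbols(entries, "CEnchantmentRegistry::EnchantAttribute"))
```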

Fall back to docs/research/decompiled/chunk_*.c (Ghidra FUN_xxx chunks) only when the named pseudo-C lacks a function. This is rare and limited to the obfuscated/packed minority.

Origin

  • PDB: refs/acclient.pdb — Sept 2013 End-of-Retail (EoR) build, MSF 7.00 program database, 29 MB. Build root: d:\ac1_sep13\.
  • pseudo-C: Binary Ninja export of the acclient_2013-2024-09-11.bndb database (also in refs/). 99.6% naming via PDB-overlay analysis.
  • acclient.h: IDA-decompiled headers from a parallel RE effort.
  • acclient.c: full-binary IDA/Ghidra decomp export (different from our Ghidra chunks under decompiled/).

The refs/ directory is gitignored (per-developer download cache); these extracts are committed so subagents and post-compaction sessions inherit them automatically.

Address mapping caveat

The PDB is from a slightly different build run than the binary that produced our Ghidra chunks (~0xC00 byte delta on some functions). When correcting addresses in docs/research/acclient_function_map.md, match by name, not by raw address.
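
A name-keyed correction pass could look like the sketch below. Everything about its shape is an assumption: the function map is modelled as a plain name-to-address dict (the real acclient_function_map.md is a hand-curated table that would need parsing first), addresses are accepted either as integers or as hex strings, and corrections_by_name is a hypothetical helper. Names that are missing from the PDB symbols, or that map to more than one address (for example overloads that share a demangled name), are skipped for manual review rather than guessed.

```python
def to_int(address):
    """Accept an int or a hex string such as '0x00594570' or '00594570'."""
    return address if isinstance(address, int) else int(address, 16)


def corrections_by_name(function_map, symbols):
    """Return {name: (old_address, pdb_address)} for entries that disagree."""
    pdb_addresses = {}
    for entry in symbols:
        pdb_addresses.setdefault(entry["name"], set()).add(to_int(entry["address"]))

    fixes = {}
    for name, old in function_map.items():
        candidates = pdb_addresses.get(name)
        if not candidates or len(candidates) != 1:
            continue  # unknown or ambiguous name: leave for manual review
        (new,) = candidates
        if new != to_int(old):
            fixes[name] = (to_int(old), new)
    return fixes
```

Matching on the name sidesteps the per-function delta entirely, so no assumption about its size or uniformity is needed.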

Regenerating symbols.json / types.json

py tools\pdb-extract\pdb_extract.py refs\acclient.pdb

Outputs land in docs/research/named-retail/symbols.json and docs/research/named-retail/types.json. See tools/pdb-extract/README.md.
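
Because refs/ is a per-developer download, it can be worth confirming that the file really is an MSF 7.00 container before regenerating. The sketch below checks the 32-byte MSF 7.00 magic and decodes the six little-endian 32-bit superblock fields that follow it; the field names follow the commonly published description of the MSF format. It is an illustration only, and whether pdb_extract.py performs an equivalent check is not stated here.

```python
import struct

# 32-byte signature at the start of every MSF 7.00 (PDB) file.
MSF7_MAGIC = b"Microsoft C/C++ MSF 7.00\r\n\x1aDS\x00\x00\x00"

SUPERBLOCK_FIELDS = (
    "block_size",
    "free_block_map_block",
    "num_blocks",
    "num_directory_bytes",
    "unknown",
    "block_map_addr",
)


def read_msf_superblock(data):
    """Decode the MSF 7.00 superblock from the first bytes of a PDB file."""
    needed = len(MSF7_MAGIC) + 4 * len(SUPERBLOCK_FIELDS)
    if len(data) < needed or not data.startswith(MSF7_MAGIC):
        raise ValueError("not an MSF 7.00 file")
    values = struct.unpack_from("<6I", data, len(MSF7_MAGIC))
    return dict(zip(SUPERBLOCK_FIELDS, values))


# Usage, from the repository root:
#   with open("refs/acclient.pdb", "rb") as handle:
#       print(read_msf_superblock(handle.read(56)))
```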