Initial commit — leak-hunt project complete
Five bugs identified and patched in retail Asheron's Call client:

- v3b: palette refcount over-increment (3-byte NOP at two sites)
- v5: RenderSurface PurgeResource no-op stub (vtable slot 2 thunk)
- v11: two dangling-pointer crash guards (NULL-check + reorder)
- v14: CEnvCell::Destroy ClipPlaneList leak (18-byte JMP to cleanup thunk)
- v22: unpacker stale-pointer SEH guard (whole-function __try/__except)

All five ship in leakfix.dll (117 KB, SHA d282f23c…) which is loaded by acclient.exe at process start via PE import table patching by tools/install_leakfix.py.

Controlled 15-client fleet soak: unpatched control died at 26h with palette exhaustion; all 14 patched clients survived past that point and reached ≥5-day uptime.

Residual ~15 MB/h growth traced to d3d9.dll's internal slab allocator (260KB surface backing buffers retained after Release). See REPORT.md §10 for the full investigation; conclusion is that it's unfixable from outside d3d9.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

commit 57b5e43d0e
199 changed files with 1648333 additions and 0 deletions

dll/DESIGN.md (new file, 319 lines)

# leakfix.dll — Standalone Native Patch DLL

## Goal

Consolidate all runtime patches (v3b, v5, v11, v12, v14) **plus** add a
periodic CObjCell/LongHash cleanup sweep that's impossible at the
byte-patching level. Ship as a single native 32-bit DLL + tiny launcher
EXE. No Decal dependency.

## Why now

- Per-client byte patching works but doesn't scale to the residual
  ~7–8 MB/hr CPhysicsObj-family leak (requires real cleanup loops, not
  inline thunks).
- The Python patchers re-apply on every restart via the monitor —
  brittle. A DLL loads with the process.
- Native code = clean crash dumps at real fault sites (no CLR wrapping
  like UB's `System.AccessViolationException` issue).

## Tech stack

- **Language:** C++17, MSVC `cl.exe` (verified working: `MSVC 14.44.35207`).
- **Target:** 32-bit x86, selected by building from the `vcvars32`
  environment. No `/arch` switch is needed (`/arch:IA32` is not the x86
  default; since VS 2012 the default is `/arch:SSE2`).
- **Runtime:** static link (`/MT`) → no extra runtime DLL dependency.
- **Hooking:** MinHook (small BSD-2-Clause C library, vendored as source
  under `src/minhook/`) for frame-tick detour.
- **AC struct mirrors:** subset of `references/acclient.h`.

## Project layout

```
dll/
├── DESIGN.md                 # this file
├── leakfix/
│   ├── build.bat             # one-shot build via vcvars32
│   ├── src/
│   │   ├── dllmain.cpp       # DllMain, patch application, hook install
│   │   ├── patches.cpp       # v3b, v5, v11, v12, v14 application
│   │   ├── thunks.cpp        # inline-asm thunks (v14 ClipPlaneList, v5 purge)
│   │   ├── sweep.cpp         # periodic CObjCell/LongHash cleanup
│   │   ├── hook.cpp          # MinHook wiring for frame-tick detour
│   │   ├── logging.cpp       # rolling log file
│   │   ├── ac_addrs.h        # EoR address constants
│   │   ├── ac_types.h        # struct mirrors
│   │   └── minhook/          # vendored MinHook source
│   └── injector/
│       └── inject.cpp        # CreateProcess(suspended) + LoadLibraryA inject
└── test/                     # hello.dll already verified
```

## Patch porting plan

Each existing Python patcher becomes a few lines of C++ that runs in
`DllMain` on `DLL_PROCESS_ATTACH`.

### v3b — palette NOP (trivial port)

```cpp
WriteCode(0x0053EFFE, "\x90\x90\x90", 3);
WriteCode(0x0053F19C, "\x90\x90\x90", 3);
```

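`WriteCode` is not defined in this document. One possible shape for its core is a verify-then-write step that refuses to patch unless the site still holds the bytes the Python patcher expected. The sketch below works on a plain buffer so it can be exercised outside the client; the name `write_code_checked` and its parameters are assumptions, and the expected original bytes for the two v3b sites would have to be taken from the existing patcher.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper: overwrite `len` bytes at `site` with `patch`, but only
// if the site still holds the `expected` original bytes. Returns false and
// leaves memory untouched on a mismatch (different client build, or the
// patch was already applied).
// In-process, the caller would first make the page writable with
// VirtualProtect and call FlushInstructionCache after the write.
bool write_code_checked(std::uint8_t* site,
                        const std::uint8_t* expected,
                        const std::uint8_t* patch,
                        std::size_t len) {
    if (std::memcmp(site, expected, len) != 0) {
        return false;
    }
    std::memcpy(site, patch, len);
    return true;
}
```

Failing closed on a mismatch keeps the DLL from writing NOPs into an unknown build, which fits the "same bytes as the Python patcher" check in the implementation order.
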
### v5 — RenderSurface PurgeResource vtable override

The current 10-byte thunk becomes a real function:

```cpp
typedef void (__thiscall *DestroyFn)(void* self);
// Integer-to-pointer casts are not constant expressions, so these are
// plain const rather than constexpr.
static const auto RENDERSURFACE_DESTROY = (DestroyFn)0x00444540;
static const auto RENDERTEXTURE_DESTROY = (DestroyFn)0x0044C4F0;

// MSVC only accepts __thiscall on member functions (error C3865).
// __fastcall with an unused second parameter also receives `this` in ECX
// and, with no stack arguments, is call-compatible with the vtable slot.
int __fastcall purge_rendersurface(void* self, void* /*edx, unused*/) {
    RENDERSURFACE_DESTROY(self);
    return 1;
}
int __fastcall purge_rendertexture(void* self, void* /*edx, unused*/) {
    RENDERTEXTURE_DESTROY(self);
    return 1;
}

void apply_v5() {
    WriteVtableSlot(0x0079A684, (void*)&purge_rendersurface);
    WriteVtableSlot(0x0079C1A0, (void*)&purge_rendertexture);
}
```

Replaces VirtualAllocEx + 10-byte thunk with proper function pointers
inside our DLL's .text.

### v11 — NULL-check NOPs

Two byte-level rewrites identical to Python patcher.

### v12 — unpacker validator + dispatch redirect

- Patcher allocates a 29-byte validator thunk + rewrites a dispatch
  table entry.
- C++ version: validator becomes a `__declspec(naked)` function;
  dispatch table entry becomes a function pointer.

### v14 — CEnvCell ClipPlaneList fix

Replace 18 bytes at `0x0052E661` with a 5-byte JMP into a naked
function:

```cpp
__declspec(naked) void clipplane_cleanup_thunk() {
    __asm {
        pushad                      ; preserve every register for the resumed code
        mov edi, [esi + 0xDC]       ; edi = outer pointer stored at +0xDC
        test edi, edi
        jz done
        mov ecx, [edi]              ; ecx = inner ClipPlaneList pointer
        test ecx, ecx
        jz free_outer
        push ecx                    ; keep the inner pointer across the call
        mov eax, 0x0053C760         ; ClipPlaneList::~ClipPlaneList
        call eax
        pop ecx
        push ecx
        mov eax, 0x005DF15E         ; operator delete
        call eax
        add esp, 4
    free_outer:
        push edi
        mov eax, 0x005DF164         ; operator delete[]
        call eax
        add esp, 4
        mov [esi + 0xDC], ebx       ; clears the field, assuming ebx == 0 at the patch site
    done:
        popad
        push 0x0052E673             ; resume (0x0052E661 + 18)
        ret
    }
}
```

Then install a 5-byte `E9 rel32` from `0x0052E661` to `clipplane_cleanup_thunk`,
followed by 13 NOPs.

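The `rel32` operand is measured from the end of the 5-byte JMP, not from its first byte, which is an easy off-by-five. A minimal sketch of building the 18-byte replacement follows; `make_jmp_patch` is a hypothetical name, and addresses are passed as 32-bit integers to match the x86 target.

```cpp
#include <array>
#include <cstdint>

// Build the 18-byte replacement for the patch site: E9 rel32, then 13 NOPs.
// rel32 is relative to the address of the instruction after the JMP, i.e.
// site + 5. Unsigned subtraction wraps, so backward jumps encode correctly.
std::array<std::uint8_t, 18> make_jmp_patch(std::uint32_t site, std::uint32_t target) {
    std::array<std::uint8_t, 18> bytes;
    bytes.fill(0x90);                                  // NOP padding
    const std::uint32_t rel = target - (site + 5);
    bytes[0] = 0xE9;                                   // JMP rel32
    bytes[1] = static_cast<std::uint8_t>(rel);         // little-endian operand
    bytes[2] = static_cast<std::uint8_t>(rel >> 8);
    bytes[3] = static_cast<std::uint8_t>(rel >> 16);
    bytes[4] = static_cast<std::uint8_t>(rel >> 24);
    return bytes;
}
```

In the DLL, `site` would be `0x0052E661` and `target` the runtime address of `clipplane_cleanup_thunk`, which is only known after the DLL is loaded.
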
## NEW: CObjCell/LongHash cleanup sweep

This is the actual reason for going to a DLL. Byte patches can't
express the logic.

### What we know

- Top owner vtable holding leaked CPhysicsObjs: `0x0079BF64` (= `LongHash<CPhysicsObj>::Node`, 21,553 hits).
- Secondary: `0x007ED3B0` (CObjCell-family containers, `object_list` DArrays) and `0x007CA4DC` (another LongHash family).
- All `CPhysicsObj::Destroy` teardown code is correct when called — the bug is it's never called for these objects.

### Sweep design

```cpp
#include <cstdint>

struct CPhysicsObj;                         // mirrored in ac_types.h
void CPhysicsObj_Destroy(CPhysicsObj* po);  // assumed wrapper around the client function
void operator_delete(void* p);              // assumed wrapper around the client's operator delete
bool is_safe_to_destroy(CPhysicsObj* po, uint32_t cutoff_ts);

struct LongHashNode {
    LongHashNode* next;
    uint32_t key;
    void* value;  // CPhysicsObj*
};

struct LongHashTable {
    void* vtable;
    LongHashNode** buckets;
    uint32_t bucket_count;
    uint32_t entry_count;
    // ... mirror layout from acclient.h
};

void sweep_physobj_table(LongHashTable* table, uint32_t cutoff_ts) {
    for (uint32_t b = 0; b < table->bucket_count; ++b) {
        // prev always points at the link that currently points to node
        LongHashNode** prev = &table->buckets[b];
        LongHashNode* node = *prev;
        while (node) {
            LongHashNode* next = node->next;  // read before node can be freed
            CPhysicsObj* po = (CPhysicsObj*)node->value;

            if (is_safe_to_destroy(po, cutoff_ts)) {
                *prev = next;             // unlink from the bucket chain
                CPhysicsObj_Destroy(po);  // 0x005145D0
                operator_delete(node);
                --table->entry_count;
            } else {
                prev = &node->next;
            }
            node = next;
        }
    }
}
```

### Safety predicates (critical — these prevent v13-class crashes)

A CPhysicsObj is "safe to destroy" only if:

1. `po->parent == NULL` (not currently attached to anything live)
2. `po->object_state` indicates dead/destroyed (need to find flag)
3. `po->last_used_timestamp` is older than some threshold (e.g., 60s)
4. `po->cell == NULL` (not in any cell's object list)
5. `po` is NOT referenced from any other table we know about (best-effort scan)

If any predicate is uncertain, leave it. **Conservative wins.**

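The five predicates combine as a chain of early returns, so any doubt falls through to "not safe". The sketch below uses a stand-in struct rather than the real `CPhysicsObj` layout: `PhysObjView`, `kAssumedDeadFlag` and `passes_safety_predicates` are hypothetical names, the flag value is a placeholder because the real dead-state bit has not been found yet, and the cross-table scan (predicate 5) is assumed to be done by the caller.

```cpp
#include <cstdint>

// Hypothetical mirror: field names follow the predicate list; the real
// offsets and types have to come from acclient.h.
struct PhysObjView {
    const void*   parent;               // predicate 1
    std::uint32_t object_state;         // predicate 2
    std::uint32_t last_used_timestamp;  // predicate 3
    const void*   cell;                 // predicate 4
};

// Placeholder bit, NOT the real flag value.
constexpr std::uint32_t kAssumedDeadFlag = 0x1;

// Returns true only when every predicate positively holds. `cutoff_ts` is the
// newest timestamp still considered stale; `referenced_elsewhere` is the
// caller's best-effort cross-table scan result (predicate 5).
bool passes_safety_predicates(const PhysObjView* po,
                              std::uint32_t cutoff_ts,
                              bool referenced_elsewhere) {
    if (po == nullptr) return false;
    if (po->parent != nullptr) return false;                        // still attached
    if ((po->object_state & kAssumedDeadFlag) == 0) return false;   // not marked dead
    if (po->last_used_timestamp > cutoff_ts) return false;          // used too recently
    if (po->cell != nullptr) return false;                          // still in a cell
    if (referenced_elsewhere) return false;                         // another table owns it
    return true;
}
```

Writing each check as a separate rejection makes it straightforward to log which predicate blocked a given object during soak runs.
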
### Tick hook

Need to find a function AC calls every frame, hook it via MinHook,
and trigger sweep every N frames (e.g., every 300 frames ≈ 5s at 60fps).

Candidate hook targets to investigate:
- `Render::Render` or main game loop entry
- `Input::ProcessFrame`
- `cm_GameLoop::Tick` (if it exists)

This needs another small investigation. Once found, hook:

```cpp
typedef void (__cdecl *TickFn)();
TickFn original_tick;

void __cdecl hooked_tick() {
    original_tick();
    static int counter = 0;
    if (++counter >= 300) {
        counter = 0;
        sweep_all_physobj_tables();
    }
}
```

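The 300-frame counter only equals 5 seconds at 60fps; a minimized or throttled client would sweep far less often. One alternative, not part of the original design, is to gate on elapsed time instead. The sketch below is a hypothetical `SweepThrottle` built on `std::chrono::steady_clock`; the current time is passed in so the logic can be tested without waiting.

```cpp
#include <chrono>

// Hypothetical alternative to the frame counter: reports "due" at most once
// per `interval` of monotonic time, independent of frame rate.
class SweepThrottle {
public:
    using Clock = std::chrono::steady_clock;

    explicit SweepThrottle(Clock::duration interval) : interval_(interval) {}

    // Call once per frame. The first call only starts the timer, so no sweep
    // runs during the first interval after startup.
    bool due(Clock::time_point now) {
        if (!started_) {
            started_ = true;
            last_ = now;
            return false;
        }
        if (now - last_ < interval_) {
            return false;
        }
        last_ = now;
        return true;
    }

private:
    Clock::duration   interval_;
    Clock::time_point last_{};
    bool              started_ = false;
};
```

Inside `hooked_tick` this would replace the counter with a function-local `static SweepThrottle throttle(std::chrono::seconds(5));` and a check of `throttle.due(SweepThrottle::Clock::now())` before calling `sweep_all_physobj_tables()`.
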
## Injection mechanism

### Phase 1 — launcher EXE (development & testing)

```cpp
#include <windows.h>

int main(int argc, char** argv) {
    STARTUPINFOA si = { sizeof(si) };
    PROCESS_INFORMATION pi = {};
    if (!CreateProcessA("acclient.exe", build_cmdline(argc, argv),
                        NULL, NULL, FALSE, CREATE_SUSPENDED,
                        NULL, NULL, &si, &pi))
        return 1;

    // Inject DLL: copy the path (including its NUL) into the target, then
    // run LoadLibraryA on it from a remote thread. Size the copy from the
    // string itself; reading MAX_PATH bytes from a literal over-reads it.
    const char dll_path[] = "C:\\path\\to\\leakfix.dll";
    void* mem = VirtualAllocEx(pi.hProcess, NULL, sizeof(dll_path),
                               MEM_COMMIT | MEM_RESERVE, PAGE_READWRITE);
    HANDLE thr = NULL;
    if (mem && WriteProcessMemory(pi.hProcess, mem, dll_path, sizeof(dll_path), NULL))
        thr = CreateRemoteThread(pi.hProcess, NULL, 0,
            (LPTHREAD_START_ROUTINE)GetProcAddress(GetModuleHandleA("kernel32"), "LoadLibraryA"),
            mem, 0, NULL);
    if (!thr) {  // injection failed: don't leave a suspended client behind
        TerminateProcess(pi.hProcess, 1);
        return 1;
    }
    WaitForSingleObject(thr, INFINITE);
    CloseHandle(thr);
    ResumeThread(pi.hThread);
    CloseHandle(pi.hThread);
    CloseHandle(pi.hProcess);
    return 0;
}
```

Usage: `leakfix_launch.exe -h server -p port -u user -...` → drops in
as substitute for direct `acclient.exe`. The launcher must itself be built
32-bit so that its `LoadLibraryA` address matches the one in the client.

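`build_cmdline` is referenced by the launcher but not defined. A minimal sketch is shown here, assuming it should forward the launcher's own arguments unchanged; `quote_arg` and `build_cmdline_sketch` are hypothetical names. The quoting follows the backslash and double-quote rules used by the MSVC runtime's argument parser, so arguments containing spaces survive the round trip.

```cpp
#include <cstddef>
#include <string>

// Quote one argument so the MSVC CRT / CommandLineToArgvW parse it back
// unchanged. Arguments without whitespace or quotes are passed through.
std::string quote_arg(const std::string& arg) {
    if (!arg.empty() && arg.find_first_of(" \t\"") == std::string::npos) {
        return arg;
    }
    std::string out = "\"";
    std::size_t backslashes = 0;
    for (char c : arg) {
        if (c == '\\') {
            ++backslashes;  // defer: meaning depends on what follows
            continue;
        }
        if (c == '"') {
            // Backslashes before a quote are doubled, then the quote is escaped.
            out.append(backslashes * 2 + 1, '\\');
        } else {
            out.append(backslashes, '\\');
        }
        out.push_back(c);
        backslashes = 0;
    }
    // Backslashes before the closing quote must be doubled as well.
    out.append(backslashes * 2, '\\');
    out.push_back('"');
    return out;
}

// Hypothetical build_cmdline: "acclient.exe" followed by argv[1..].
std::string build_cmdline_sketch(int argc, char** argv) {
    std::string cmd = "acclient.exe";
    for (int i = 1; i < argc; ++i) {
        cmd.push_back(' ');
        cmd += quote_arg(argv[i]);
    }
    return cmd;
}
```

`CreateProcessA` takes a non-const `LPSTR` for the command line, so the caller would keep the returned string alive and pass `cmd.data()` (non-const in C++17).
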
### Phase 2 — PE import table modification (production)

Patch `acclient.exe`'s PE header to add `leakfix.dll` to its imports.
Then the OS loader pulls our DLL in automatically before AC's
`WinMain` runs. User just runs acclient as normal.

Tool: small Python or C++ utility that does:
- Open PE
- Find IMPORT_DIRECTORY
- Add new IMAGE_IMPORT_DESCRIPTOR pointing at `leakfix.dll`
- Stuff in a fake IAT with a single function (`leakfix_init` exported from our DLL)
- Resave executable

(There are existing tools like `LoadDll`, `PE Bear`, or
`peimporter` we can crib from.)

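The "add new IMAGE_IMPORT_DESCRIPTOR" step amounts to rebuilding a null-terminated array of 20-byte records with one extra entry. The sketch below shows only that array manipulation: `ImportDescriptor` mirrors the field layout of `IMAGE_IMPORT_DESCRIPTOR`, while `with_extra_import` and the RVA parameters are hypothetical. Where the DLL name string, lookup table and IAT live in the file is left to the caller.

```cpp
#include <cstdint>
#include <vector>

// Same field layout as IMAGE_IMPORT_DESCRIPTOR in winnt.h (all fields are RVAs
// or plain 32-bit values).
struct ImportDescriptor {
    std::uint32_t original_first_thunk;  // RVA of the import lookup table
    std::uint32_t time_date_stamp;
    std::uint32_t forwarder_chain;
    std::uint32_t name;                  // RVA of the NUL-terminated DLL name
    std::uint32_t first_thunk;           // RVA of the import address table
};
static_assert(sizeof(ImportDescriptor) == 20, "must match the on-disk layout");

// The descriptor array ends with an all-zero entry.
bool is_terminator(const ImportDescriptor& d) {
    return d.original_first_thunk == 0 && d.time_date_stamp == 0 &&
           d.forwarder_chain == 0 && d.name == 0 && d.first_thunk == 0;
}

// Copy the existing descriptors up to the terminator, append one for the new
// DLL, then re-terminate. The RVAs are assumed to point at data the caller
// has already placed in the image.
std::vector<ImportDescriptor> with_extra_import(
        const std::vector<ImportDescriptor>& existing,
        std::uint32_t name_rva, std::uint32_t ilt_rva, std::uint32_t iat_rva) {
    std::vector<ImportDescriptor> out;
    for (const ImportDescriptor& d : existing) {
        if (is_terminator(d)) break;
        out.push_back(d);
    }
    out.push_back(ImportDescriptor{ilt_rva, 0, 0, name_rva, iat_rva});
    out.push_back(ImportDescriptor{0, 0, 0, 0, 0});
    return out;
}
```

Because the grown array usually no longer fits where the original sat, a patcher typically writes it to fresh space (for example a new section) and repoints the import data directory entry at it.
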
## Build setup

```batch
@echo off
call "C:\Program Files (x86)\Microsoft Visual Studio\2022\BuildTools\VC\Auxiliary\Build\vcvars32.bat"
cl /LD /nologo /O2 /MT /EHsc /std:c++17 /W3 ^
   /D_CRT_SECURE_NO_WARNINGS /D_WIN32_WINNT=0x0601 ^
   /Fe:leakfix.dll ^
   src\dllmain.cpp src\patches.cpp src\thunks.cpp src\sweep.cpp ^
   src\hook.cpp src\logging.cpp src\minhook\*.c ^
   /link kernel32.lib user32.lib
```

`/MT` avoids needing `vcruntime*.dll` alongside.

## Implementation order

1. ✅ Verify toolchain builds 32-bit DLL (hello.dll)
2. Write `dllmain.cpp` + `patches.cpp` with v3b only — verify same bytes as Python patcher produces, manually inject into a test PID
3. Add v11 (similar simple byte writes)
4. Add v5 (real `__thiscall` purge functions in our DLL .text)
5. Add v12 (more complex but pattern same as v5)
6. Add v14 (inline-asm naked function)
7. Build injector EXE, test full apply-on-attach flow
8. Find frame-tick hook target via Ghidra (separate task)
9. Wire MinHook + sweep skeleton
10. Implement sweep predicates iteratively, very long soak windows per iteration
11. Optional: PE import table patcher for one-launcher-binary UX

## Risk management

- Each patch porting step is verified against the Python patcher's
  byte output before merging. No new bytes = no new risk.
- Sweep is the only NEW logic and follows v13 lessons: long soaks,
  conservative predicates, refuse-to-destroy-if-uncertain rule.
- Crash dumps land cleanly because we're not crossing managed/unmanaged
  boundary.

## What it replaces

- `tools/patch_palette_v3b.py` — runtime-applied at DLL load
- `tools/patch_purge_v5_test.py` — runtime-applied at DLL load
- `tools/patch_v11_test.py` — runtime-applied at DLL load
- `tools/patch_v12_test.py` — runtime-applied at DLL load
- `tools/patch_v14_cenvcell_clipplane.py` — runtime-applied at DLL load
- `tools/fleet_monitor.sh` auto-patching cascade — no longer needed (DLL
  applies all on every restart automatically)

Snapshot/HB monitoring stays in place — that's separate from patching.