# leakfix.dll: Standalone Native Patch DLL

## Goal

Consolidate all runtime patches (v3b, v5, v11, v12, v14) **plus** add a periodic CObjCell/LongHash cleanup sweep that cannot be expressed at the byte-patching level. Ship as a single native 32-bit DLL plus a tiny launcher EXE. No Decal dependency.

## Why now

- Per-client byte patching works but doesn't scale to the residual ~7–8 MB/hr CPhysicsObj-family leak (requires real cleanup loops, not inline thunks).
- The Python patchers re-apply on every restart via the monitor, which is brittle. A DLL loads with the process.
- Native code = clean crash dumps at real fault sites (no CLR wrapping like UB's `System.AccessViolationException` issue).

## Tech stack

- **Language:** C++17, MSVC `cl.exe` (verified working: `MSVC 14.44.35207`).
- **Target:** 32-bit x86, using the x86 toolchain that `vcvars32` selects. No `/arch` switch is needed; the x86 compiler defaults to `/arch:SSE2`, not `/arch:IA32`.
- **Runtime:** static link (`/MT`) → no extra runtime DLL dependency.
- **Hooking:** MinHook (small BSD-licensed C library, vendored as source under `src/minhook/`) for the frame-tick detour.
- **AC struct mirrors:** subset of `references/acclient.h`.

## Project layout

```
dll/
├── DESIGN.md                  # this file
├── leakfix/
│   ├── build.bat              # one-shot build via vcvars32
│   ├── src/
│   │   ├── dllmain.cpp        # DllMain, patch application, hook install
│   │   ├── patches.cpp        # v3b, v5, v11, v12, v14 application
│   │   ├── thunks.cpp         # inline-asm thunks (v14 ClipPlaneList, v5 purge)
│   │   ├── sweep.cpp          # periodic CObjCell/LongHash cleanup
│   │   ├── hook.cpp           # MinHook wiring for frame-tick detour
│   │   ├── logging.cpp        # rolling log file
│   │   ├── ac_addrs.h         # EoR address constants
│   │   ├── ac_types.h         # struct mirrors
│   │   └── minhook/           # vendored MinHook source
│   └── injector/
│       └── inject.cpp         # CreateProcess(suspended) + LoadLibraryA inject
└── test/                      # hello.dll already verified
```

## Patch porting plan

Each existing Python patcher becomes a few lines of C++ that run in `DllMain` on `DLL_PROCESS_ATTACH`.
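
The patch snippets in this plan call `WriteCode` and `WriteVtableSlot` without defining them. The following is a minimal sketch of what those helpers could look like; the signatures and return values are assumptions, not something this document specifies. On Windows it temporarily makes the page writable with `VirtualProtect` and flushes the instruction cache afterwards. The non-Windows branch is a plain copy that exists only so the helper can be exercised off-target.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#ifdef _WIN32
#include <windows.h>
#endif

// Hypothetical helper: overwrite `len` bytes of code at `addr`.
// Returns false if the page protection could not be changed.
bool WriteCode(std::uintptr_t addr, const void* bytes, std::size_t len) {
    void* dst = reinterpret_cast<void*>(addr);
#ifdef _WIN32
    DWORD old_protect = 0;
    if (!VirtualProtect(dst, len, PAGE_EXECUTE_READWRITE, &old_protect))
        return false;
    std::memcpy(dst, bytes, len);
    DWORD ignored = 0;
    VirtualProtect(dst, len, old_protect, &ignored);   // restore original protection
    FlushInstructionCache(GetCurrentProcess(), dst, len);
#else
    std::memcpy(dst, bytes, len);  // off-target build: plain copy, for tests only
#endif
    return true;
}

// Hypothetical helper: replace one pointer-sized vtable slot with `fn`.
bool WriteVtableSlot(std::uintptr_t slot_addr, void* fn) {
    return WriteCode(slot_addr, &fn, sizeof(fn));
}
```

Returning `bool` lets `DllMain` log a failed patch and skip it rather than continue with a half-applied change.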
### v3b: palette NOP (trivial port)

```cpp
WriteCode(0x0053EFFE, "\x90\x90\x90", 3);
WriteCode(0x0053F19C, "\x90\x90\x90", 3);
```

### v5: RenderSurface PurgeResource vtable override

The current 10-byte thunk becomes a real function:

```cpp
// The original destructors are __thiscall: `this` arrives in ECX.
typedef void (__thiscall *DestroyFn)(void* self);
static const DestroyFn RENDERSURFACE_DESTROY = (DestroyFn)0x00444540;
static const DestroyFn RENDERTEXTURE_DESTROY = (DestroyFn)0x0044C4F0;

// MSVC does not accept __thiscall on a non-member function definition, so the
// replacements use __fastcall (ECX = self, EDX = unused). With no stack
// arguments this is call-compatible with a __thiscall method.
int __fastcall purge_rendersurface(void* self, void* /*edx*/) {
    RENDERSURFACE_DESTROY(self);
    return 1;
}
int __fastcall purge_rendertexture(void* self, void* /*edx*/) {
    RENDERTEXTURE_DESTROY(self);
    return 1;
}

void apply_v5() {
    WriteVtableSlot(0x0079A684, (void*)&purge_rendersurface);
    WriteVtableSlot(0x0079C1A0, (void*)&purge_rendertexture);
}
```

This replaces the `VirtualAllocEx` allocation and 10-byte thunk with ordinary function pointers into our DLL's `.text`.

### v11: NULL-check NOPs

Two byte-level rewrites, identical to the Python patcher.

### v12: unpacker validator + dispatch redirect

- The Python patcher allocates a 29-byte validator thunk and rewrites a dispatch table entry.
- C++ version: the validator becomes a `__declspec(naked)` function; the dispatch table entry becomes a pointer to it.

### v14: CEnvCell ClipPlaneList fix

Replace 18 bytes at `0x0052E661` with a 5-byte JMP into a naked function:

```cpp
__declspec(naked) void clipplane_cleanup_thunk() {
    __asm {
        pushad
        mov  edi, [esi + 0xDC]     ; outer pointer held by the cell
        test edi, edi
        jz   done
        mov  ecx, [edi]            ; inner ClipPlaneList*
        test ecx, ecx
        jz   free_outer
        push ecx                   ; save across the destructor call
        mov  eax, 0x0053C760       ; ClipPlaneList::~ClipPlaneList (this in ECX)
        call eax
        pop  ecx
        push ecx
        mov  eax, 0x005DF15E       ; operator delete
        call eax
        add  esp, 4
    free_outer:
        push edi
        mov  eax, 0x005DF164       ; operator delete[]
        call eax
        add  esp, 4
        mov  [esi + 0xDC], ebx     ; store EBX into the field (assumed to be 0 here; verify at the patch site)
    done:
        popad
        push 0x0052E673            ; resume address = 0x0052E661 + 18
        ret
    }
}
```

Then install a 5-byte `E9 rel32` jump from `0x0052E661` to `clipplane_cleanup_thunk`, followed by 13 NOPs (18 replaced bytes minus the 5-byte jump).

## NEW: CObjCell/LongHash cleanup sweep

This is the actual reason for going to a DLL. Byte patches can't express the logic.
### What we know

- Top owner vtable holding leaked CPhysicsObjs: `0x0079BF64` (= `LongHash::Node`, 21,553 hits).
- Secondary: `0x007ED3B0` (CObjCell-family containers, `object_list` DArrays) and `0x007CA4DC` (another LongHash family).
- All `CPhysicsObj::Destroy` teardown code is correct when called. The bug is that it is never called for these objects.

### Sweep design

```cpp
#include <cstdint>

struct CPhysicsObj;                                             // mirrored in ac_types.h
bool is_safe_to_destroy(CPhysicsObj* po, uint32_t cutoff_ts);   // see "Safety predicates"
void CPhysicsObj_Destroy(CPhysicsObj* po);                      // 0x005145D0
void operator_delete(void* p);                                  // AC's own operator delete

// Placeholder layouts: confirm field order against acclient.h before use.
struct LongHashNode {
    LongHashNode* next;
    uint32_t      key;
    void*         value;        // CPhysicsObj*
};

struct LongHashTable {
    void*          vtable;
    LongHashNode** buckets;
    uint32_t       bucket_count;
    uint32_t       entry_count;
    // ... mirror layout from acclient.h
};

void sweep_physobj_table(LongHashTable* table, uint32_t cutoff_ts) {
    for (uint32_t b = 0; b < table->bucket_count; ++b) {
        LongHashNode** prev = &table->buckets[b];   // link that points at `node`
        LongHashNode*  node = *prev;
        while (node) {
            LongHashNode* next = node->next;        // read before node may be freed
            CPhysicsObj*  po   = (CPhysicsObj*)node->value;
            if (is_safe_to_destroy(po, cutoff_ts)) {
                *prev = next;                       // unlink; prev stays where it is
                CPhysicsObj_Destroy(po);
                operator_delete(node);
                --table->entry_count;
            } else {
                prev = &node->next;                 // keep node, advance the link
            }
            node = next;
        }
    }
}
```

### Safety predicates (critical: these prevent v13-class crashes)

A CPhysicsObj is "safe to destroy" only if all of the following hold:

1. `po->parent == NULL` (not currently attached to anything live)
2. `po->object_state` indicates dead/destroyed (the flag still needs to be identified)
3. `po->last_used_timestamp` is older than some threshold (e.g., 60s)
4. `po->cell == NULL` (not in any cell's object list)
5. `po` is NOT referenced from any other table we know about (best-effort scan)

If any predicate is uncertain, leave the object alone. **Conservative wins.**

### Tick hook

We need to find a function AC calls every frame, hook it via MinHook, and trigger the sweep every N frames (e.g., every 300 frames ≈ 5s at 60fps).

Candidate hook targets to investigate:

- `Render::Render` or main game loop entry
- `Input::ProcessFrame`
- `cm_GameLoop::Tick` (if it exists)

This needs another small investigation.
Once found, hook:

```cpp
typedef void (__cdecl *TickFn)();
TickFn original_tick = nullptr;   // trampoline to the original, set when the hook is created

void __cdecl hooked_tick() {
    original_tick();
    static int counter = 0;
    if (++counter >= 300) {       // roughly every 5 s at 60 fps
        counter = 0;
        sweep_all_physobj_tables();
    }
}
```

## Injection mechanism

### Phase 1: launcher EXE (development & testing)

```cpp
#include <windows.h>

char* build_cmdline(int argc, char** argv);   // defined elsewhere; returns a writable command line

int main(int argc, char** argv) {
    static const char dll_path[] = "C:\\path\\to\\leakfix.dll";

    STARTUPINFOA si = { sizeof(si) };
    PROCESS_INFORMATION pi = {};
    if (!CreateProcessA("acclient.exe", build_cmdline(argc, argv), NULL, NULL, FALSE,
                        CREATE_SUSPENDED, NULL, NULL, &si, &pi))
        return 1;

    // Copy exactly the path plus its terminating NUL into the target process.
    void* mem = VirtualAllocEx(pi.hProcess, NULL, sizeof(dll_path),
                               MEM_COMMIT | MEM_RESERVE, PAGE_READWRITE);
    HANDLE thr = NULL;
    if (mem && WriteProcessMemory(pi.hProcess, mem, dll_path, sizeof(dll_path), NULL)) {
        thr = CreateRemoteThread(pi.hProcess, NULL, 0,
            (LPTHREAD_START_ROUTINE)GetProcAddress(GetModuleHandleA("kernel32"), "LoadLibraryA"),
            mem, 0, NULL);
    }
    if (thr) { WaitForSingleObject(thr, INFINITE); CloseHandle(thr); }

    ResumeThread(pi.hThread);     // let the client run even if injection failed
    CloseHandle(pi.hThread);
    CloseHandle(pi.hProcess);
    return thr ? 0 : 2;
}
```

The launcher must itself be built as a 32-bit EXE so that the `LoadLibraryA` address it resolves matches the one in the 32-bit target.

Usage: `leakfix_launch.exe -h server -p port -u user -...` → drops in as a substitute for launching `acclient.exe` directly.

### Phase 2: PE import table modification (production)

Patch `acclient.exe`'s PE headers to add `leakfix.dll` to its imports. The OS loader then pulls our DLL in automatically before AC's `WinMain` runs, and the user just runs acclient as normal.

Tool: a small Python or C++ utility that does the following:

- Open the PE file
- Find the import directory
- Add a new `IMAGE_IMPORT_DESCRIPTOR` pointing at `leakfix.dll`
- Add an IAT for it with a single function (`leakfix_init`, exported from our DLL)
- Save the modified executable

(There are existing tools like `LoadDll`, `PE Bear`, or `peimporter` we can crib from.)
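
As a starting point for the Phase 2 tool, the "find the import directory" step can be sketched as below. This is a minimal, read-only sketch under stated assumptions: the whole file is already loaded into a byte vector, the host is little-endian, and only 32-bit (PE32) images are accepted. `DataDirectory`, `read_le`, and `find_import_directory` are hypothetical names. The offsets used (`e_lfanew` at `0x3C`, the 20-byte COFF header, data directories at offset 96 of the PE32 optional header, imports at index 1) come from the PE format.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

struct DataDirectory {
    std::uint32_t rva;    // relative virtual address of the import descriptors
    std::uint32_t size;   // size of the descriptor array in bytes
};

// Bounds-checked read of a fixed-size integer (assumes a little-endian host).
template <typename T>
static bool read_le(const std::vector<std::uint8_t>& img, std::size_t off, T& out) {
    if (off > img.size() || img.size() - off < sizeof(T)) return false;
    std::memcpy(&out, img.data() + off, sizeof(T));
    return true;
}

// Fills `out` with the import data directory of a PE32 image.
// Returns false if the buffer is not a well-formed 32-bit PE file.
bool find_import_directory(const std::vector<std::uint8_t>& img, DataDirectory& out) {
    std::uint16_t mz = 0, magic = 0;
    std::uint32_t e_lfanew = 0, sig = 0, dir_count = 0;

    if (!read_le(img, 0, mz) || mz != 0x5A4D) return false;                 // "MZ"
    if (!read_le(img, 0x3C, e_lfanew)) return false;
    if (!read_le(img, e_lfanew, sig) || sig != 0x00004550) return false;    // "PE\0\0"

    const std::size_t opt = std::size_t(e_lfanew) + 4 + 20;   // skip signature + COFF header
    if (!read_le(img, opt, magic) || magic != 0x010B) return false;         // PE32 only
    if (!read_le(img, opt + 92, dir_count) || dir_count < 2) return false;  // NumberOfRvaAndSizes

    // Each directory entry is 8 bytes; index 1 is the import table.
    return read_le(img, opt + 96 + 8, out.rva) &&
           read_le(img, opt + 96 + 12, out.size);
}
```

Mapping the returned RVA to a file offset through the section table, and finding room for one more descriptor, are the remaining steps and are not covered here.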
## Build setup

```batch
@echo off
call "C:\Program Files (x86)\Microsoft Visual Studio\2022\BuildTools\VC\Auxiliary\Build\vcvars32.bat"
cl /LD /nologo /O2 /MT /EHsc /std:c++17 /W3 ^
   /D_CRT_SECURE_NO_WARNINGS /D_WIN32_WINNT=0x0601 ^
   /Fe:leakfix.dll ^
   src\dllmain.cpp src\patches.cpp src\thunks.cpp src\sweep.cpp ^
   src\hook.cpp src\logging.cpp src\minhook\*.c ^
   /link kernel32.lib user32.lib
```

`/MT` avoids needing `vcruntime*.dll` alongside.

## Implementation order

1. ✅ Verify toolchain builds 32-bit DLL (hello.dll)
2. Write `dllmain.cpp` + `patches.cpp` with v3b only; verify it produces the same bytes as the Python patcher, and manually inject into a test PID
3. Add v11 (similar simple byte writes)
4. Add v5 (real purge functions in our DLL's `.text`)
5. Add v12 (more complex, but same pattern as v5)
6. Add v14 (inline-asm naked function)
7. Build injector EXE, test full apply-on-attach flow
8. Find frame-tick hook target via Ghidra (separate task)
9. Wire MinHook + sweep skeleton
10. Implement sweep predicates iteratively, with very long soak windows per iteration
11. Optional: PE import table patcher for one-launcher-binary UX

## Risk management

- Each patch porting step is verified against the Python patcher's byte output before merging. No new bytes = no new risk.
- The sweep is the only NEW logic and follows the v13 lessons: long soaks, conservative predicates, and a refuse-to-destroy-if-uncertain rule.
- Crash dumps land cleanly because we're not crossing a managed/unmanaged boundary.
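
The byte-for-byte check in the first bullet can be sketched for the v14 site, where the expected result is an `E9 rel32` jump plus NOP padding over 18 bytes. This is an illustrative sketch only: `build_jmp_patch` and `first_mismatch` are hypothetical names, and the thunk address used in the usage note is made up, since the real one depends on where the DLL loads.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

constexpr std::size_t kV14PatchLen = 18;   // bytes replaced at the v14 site

// Build the bytes expected at `site` once the detour is installed:
// E9 <rel32> followed by NOPs. rel32 is relative to the end of the
// 5-byte jump, i.e. target - (site + 5), computed modulo 2^32.
std::array<std::uint8_t, kV14PatchLen> build_jmp_patch(std::uint32_t site,
                                                       std::uint32_t target) {
    std::array<std::uint8_t, kV14PatchLen> out;
    out.fill(0x90);                                   // NOP padding
    out[0] = 0xE9;                                    // JMP rel32 opcode
    const std::uint32_t rel = target - (site + 5u);
    for (int i = 0; i < 4; ++i)
        out[1 + i] = static_cast<std::uint8_t>(rel >> (8 * i));   // little-endian
    return out;
}

// Index of the first differing byte, or -1 if the two ranges match.
long first_mismatch(const std::uint8_t* actual, const std::uint8_t* expected,
                    std::size_t n) {
    for (std::size_t i = 0; i < n; ++i)
        if (actual[i] != expected[i]) return static_cast<long>(i);
    return -1;
}
```

For example, `build_jmp_patch(0x0052E661, thunk_addr)` gives the 18 bytes to compare against a dump of the patched client, and `first_mismatch` reports where the two diverge.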
## What it replaces

- `tools/patch_palette_v3b.py`: runtime-applied at DLL load
- `tools/patch_purge_v5_test.py`: runtime-applied at DLL load
- `tools/patch_v11_test.py`: runtime-applied at DLL load
- `tools/patch_v12_test.py`: runtime-applied at DLL load
- `tools/patch_v14_cenvcell_clipplane.py`: runtime-applied at DLL load
- `tools/fleet_monitor.sh` auto-patching cascade: no longer needed (the DLL applies all patches automatically on every restart)

Snapshot/HB monitoring stays in place; that is separate from patching.