leakhunt/tools/probe_rtd3d_lost.py
acbot 57b5e43d0e Initial commit — leak-hunt project complete
Five bugs identified and patched in retail Asheron's Call client:
- v3b: palette refcount over-increment (3-byte NOP at two sites)
- v5: RenderSurface PurgeResource no-op stub (vtable slot 2 thunk)
- v11: two dangling-pointer crash guards (NULL-check + reorder)
- v14: CEnvCell::Destroy ClipPlaneList leak (18-byte JMP to cleanup thunk)
- v22: unpacker stale-pointer SEH guard (whole-function __try/__except)

All five ship in leakfix.dll (117 KB, SHA d282f23c…), which acclient.exe
loads at process start. The load is arranged by tools/install_leakfix.py,
which patches the PE import table.

Controlled 15-client fleet soak: unpatched control died at 26h with
palette exhaustion; all 14 patched clients survived past that point
and reached ≥5-day uptime.

Residual ~15 MB/h growth traced to d3d9.dll's internal slab allocator
(260KB surface backing buffers retained after Release). See REPORT.md
§10 for the full investigation; conclusion is that it's unfixable from
outside d3d9.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-23 21:07:58 +02:00


"""probe_rtd3d_lost.py <pid>

Walk s_Resources, find RenderTextureD3D entries (GR-view vfptr=0x00801A18),
and characterize buffer state on both LOST and LIVE entries.

The entry stored in s_Resources is the GR-view = primary + 0x30.

Buffers / fields tracked (offsets from PRIMARY; the entry-relative offset,
i.e. primary offset - 0x30, is in parens):
  primary+0x98  -> m_p2DTextureD3D                                (entry+0x68)
  primary+0x9c  -> m_pCubeTextureD3D                              (entry+0x6c)
  primary+0xa0  -> m_D3DSurfaces.m_data (SmartArray data ptr)     (entry+0x70)
  primary+0xa4  -> m_D3DSurfaces size+flag (high bit = own-buffer) (entry+0x74)
  primary+0x114 -> RenderSurface m_pSurfaceBits                   (entry+0xE4)
  primary+0x64  -> RenderSurface sourceData inner ptr             (entry+0x34)
  primary+0x7c  -> RenderTexture m_DBLevelInfos data ptr          (entry+0x4c)
  primary+0x84  -> RenderTexture m_DBLevelInfos num               (entry+0x54)

The "would benefit from a v5/v20-style fix" question:
  - If lost entries have m_p2DTextureD3D non-NULL -> PurgeResource never ran (engine state)
  - If buf-fields non-NULL but D3D is NULL -> CPU shells with CPU buffers (v5-style WIN)
  - If everything NULL -> inert shells, same shape as RSD3D (need deleting dtor, v19-class)
"""
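import ctypes
import ctypes.wintypes as wt
import struct
import sys

PROCESS_VM_READ = 0x10
PROCESS_QUERY_INFORMATION = 0x400

# ctypes.get_last_error() is only updated for libraries loaded with
# use_last_error=True, so load kernel32 that way instead of via ctypes.windll.
k = ctypes.WinDLL("kernel32", use_last_error=True)
k.OpenProcess.argtypes = [wt.DWORD, wt.BOOL, wt.DWORD]
k.OpenProcess.restype = wt.HANDLE
k.ReadProcessMemory.argtypes = [wt.HANDLE, wt.LPCVOID, wt.LPVOID,
                                ctypes.c_size_t, ctypes.POINTER(ctypes.c_size_t)]
k.ReadProcessMemory.restype = wt.BOOL

S_RESOURCES_M_DATA = 0x008398C4  # s_Resources m_data (pointer to entry array)
S_RESOURCES_M_NUM = 0x008398CC   # s_Resources m_num (entry count)
RTD3D_VTABLE = 0x00801A18        # RenderTextureD3D GR-view vfptr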
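def rd(h, va, n):
    """Read up to n bytes at va in the target process; None if the read fails."""
    buf = (ctypes.c_ubyte * n)()
    sz = ctypes.c_size_t(0)
    if not k.ReadProcessMemory(h, va, buf, n, ctypes.byref(sz)):
        return None
    return bytes(buf[:sz.value])


def rd_u32(h, va):
    """Read a little-endian u32 at va; None on a failed or short read."""
    b = rd(h, va, 4)
    return struct.unpack('<I', b)[0] if b and len(b) == 4 else None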
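if len(sys.argv) != 2:
    print("usage: probe_rtd3d_lost.py <pid>")
    sys.exit(1)
pid = int(sys.argv[1])
h = k.OpenProcess(PROCESS_VM_READ | PROCESS_QUERY_INFORMATION, False, pid)
if not h:
    print(f"OpenProcess err={ctypes.get_last_error()}")
    sys.exit(2)

m_data = rd_u32(h, S_RESOURCES_M_DATA)
m_num = rd_u32(h, S_RESOURCES_M_NUM)
if m_data is None or m_num is None:
    # Without this check the format string below raises TypeError on None.
    print(f"pid {pid}: could not read s_Resources m_data/m_num")
    sys.exit(3)
print(f"pid {pid}: s_Resources m_data=0x{m_data:08x} m_num={m_num}")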
def cnt():
    return dict(d3d2d=0, d3dCube=0, surfdata=0, surfBits=0, sourceInner=0,
                dbLevel=0, dbLevelNum=0, total=0, none=0)


stats = {'lost': cnt(), 'live': cnt()}
samples = []
total_inscope = 0
lost_total = 0
live_total = 0
for i in range(min(m_num, 200000)):
    entry = rd_u32(h, m_data + i * 4)
    if not entry:
        continue
    vt = rd_u32(h, entry)
    if vt != RTD3D_VTABLE:
        continue
    total_inscope += 1
    bIsLost_word = rd_u32(h, entry + 8)
    d3d2d = rd_u32(h, entry + 0x68)
    d3dCube = rd_u32(h, entry + 0x6c)
    surf = rd_u32(h, entry + 0x70)
    surfNum = rd_u32(h, entry + 0x74)
    sBits = rd_u32(h, entry + 0xE4)
    sInner = rd_u32(h, entry + 0x34)
    dbLvl = rd_u32(h, entry + 0x4c)
    dbLvlN = rd_u32(h, entry + 0x54)
    fields = (bIsLost_word, d3d2d, d3dCube, surf, surfNum, sBits, sInner, dbLvl, dbLvlN)
    if any(f is None for f in fields):
        continue  # skip entries that could not be read in full
    is_lost = (bIsLost_word & 0xFF) != 0  # only the low byte is the flag
    bucket = stats['lost'] if is_lost else stats['live']
    if is_lost:
        lost_total += 1
    else:
        live_total += 1
    bucket['total'] += 1
    # Pointer fields: any non-NULL one means the entry still holds something.
    pointers = (('d3d2d', d3d2d), ('d3dCube', d3dCube), ('surfdata', surf),
                ('surfBits', sBits), ('sourceInner', sInner), ('dbLevel', dbLvl))
    any_nn = False
    for name, value in pointers:
        if value:
            bucket[name] += 1
            any_nn = True
    # dbLevelNum is a count, not a pointer, so it does not set any_nn.
    if dbLvlN:
        bucket['dbLevelNum'] += 1
    if not any_nn:
        bucket['none'] += 1
    if is_lost and any_nn and len(samples) < 6:
        samples.append((entry, d3d2d, d3dCube, surf, surfNum, sBits, sInner, dbLvl, dbLvlN))
print(f"RTD3D entries: in_sresources={total_inscope} lost={lost_total} live={live_total}")


def show(label, b, denom):
    print(f" {label} (n={b['total']}):")
    for k_ in ('d3d2d', 'd3dCube', 'surfdata', 'surfBits', 'sourceInner',
               'dbLevel', 'dbLevelNum', 'none'):
        v = b[k_]
        pct = 100 * v // max(denom, 1)
        print(f" {k_:14s}: {v:5d} ({pct}%)")


show("LOST", stats['lost'], lost_total)
show("LIVE", stats['live'], live_total)
if samples:
    print("Sample LOST entries with any non-NULL field:")
    for s in samples:
        e, d2, dc, sd, sn, sb, si, dl, dn = s
        # High bit of the size+flag word marks an own-buffer SmartArray.
        own = (sn & 0x80000000) >> 31 if sn else 0
        print(f" entry=0x{e:08x} 2dTex=0x{d2:08x} cube=0x{dc:08x} "
              f"surfData=0x{sd:08x} num&flag=0x{sn:08x}(own={own}) "
              f"surfBits=0x{sb:08x} srcInner=0x{si:08x} "
              f"dbLvl=0x{dl:08x} dbLvlN={dn}")
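
The three-way triage described in the module docstring can be applied to each LOST entry mechanically. The following is a minimal sketch of that decision, not part of the probe itself: the function name `classify_lost_entry` and the three label strings are hypothetical, and treating a non-NULL cube texture pointer the same as a non-NULL 2D texture pointer is an assumption (the docstring names only m_p2DTextureD3D).

```python
def classify_lost_entry(d3d2d, d3d_cube, cpu_buffers):
    """Return a triage label for one LOST RenderTextureD3D entry.

    d3d2d, d3d_cube: pointer values read from entry+0x68 / entry+0x6c.
    cpu_buffers: iterable of the CPU-side pointer values (surface data,
                 surface bits, source inner ptr, DB level infos).
    The label strings are illustrative names, not engine terminology.
    """
    # Assumption: a live cube texture counts as "D3D object still held",
    # the same as a live 2D texture.
    if d3d2d or d3d_cube:
        return "purge-never-ran"   # D3D object still held: PurgeResource never ran
    if any(cpu_buffers):
        return "cpu-shell"         # D3D gone, CPU buffers remain (v5-style win)
    return "inert-shell"           # everything NULL: only the shell remains


if __name__ == "__main__":
    # Made-up pointer values, purely to exercise the three branches.
    print(classify_lost_entry(0x0A1B2C3D, 0, [0, 0, 0, 0]))
    print(classify_lost_entry(0, 0, [0, 0x0BADF00D, 0, 0]))
    print(classify_lost_entry(0, 0, [0, 0, 0, 0]))
```

Fed with the tuples collected in `samples`, a tally of these labels would show directly which of the three outcomes dominates among lost entries.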