type: technique | confidence: high | created: 2026-05-30 | updated: 2026-05-30 | tags: obfuscation, pe, evasion, malware-family, compiler, anti-vm

Semantic Jargon Export Obfuscation

What It Does

A PE export table is populated with hundreds of plausible-sounding function names drawn from unrelated technical domains (machine learning, networking, game engines, DevOps). The names are syntactically valid and semantically coherent within their domains, but all resolve to a handful of tiny ret-only stubs. This creates a veneer of a large, legitimate software project, drowning real functionality in noise and frustrating signature-based detection.

Detection / Fingerprint

Export count > 400 with unique RVA count < 25 (high name-to-body ratio)
Names are grammatically consistent English compound words drawn from 2–4 technical domain vocabularies mixed together (e.g., BackoffExtrapolate, CorruptTurbulence, TokenizeDrag, CrossEntropyRevokePinch)
No meaningful cross-references from the stub bodies — each stub is push rbp; mov rbp,rsp; ret or shorter
High-entropy .data section suggests actual payload is separate from the export façade
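The count-based part of this fingerprint can be sketched in a few lines of Python. This is a minimal illustration, not the rule used in the original analysis: the function name `export_fingerprint` and the `(name, rva)` input shape are assumptions, and the thresholds are simply the ones listed above.

```python
from collections import defaultdict

# Thresholds copied from the fingerprint above; tune them for your own corpus.
MIN_EXPORTS = 400
MAX_UNIQUE_RVAS = 25


def export_fingerprint(exports):
    """Summarise an export table given as (name, rva) pairs.

    Assumption: with pefile the pairs could be built from
    pe.DIRECTORY_ENTRY_EXPORT.symbols (each symbol has .name and .address).
    pefile is deliberately not imported so the sketch stays self-contained.
    """
    names_by_rva = defaultdict(list)
    for name, rva in exports:
        names_by_rva[rva].append(name)

    export_count = sum(len(group) for group in names_by_rva.values())
    unique_rvas = len(names_by_rva)
    return {
        "export_count": export_count,
        "unique_rvas": unique_rvas,
        # Average number of exported names sharing one code body.
        "names_per_body": export_count / unique_rvas if unique_rvas else 0.0,
        "flooded": export_count > MIN_EXPORTS and unique_rvas < MAX_UNIQUE_RVAS,
    }


if __name__ == "__main__":
    # Synthetic table: 503 placeholder names spread over 21 stub addresses.
    synthetic = [(f"Name{i}", 0x1000 + 0x10 * (i % 21)) for i in range(503)]
    print(export_fingerprint(synthetic))
```

The stub-body and section-entropy checks are left out on purpose; they need disassembly and raw section bytes, which this sketch does not model.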

Implementation Patterns Observed

In the sunwukong sample:

503 exported names mapped to ~21 unique RVAs in the .text section ^[sample fa16b64a/pefile.txt:338-500]
Names generated by combining a vocabulary of ~100 base terms (e.g., Backoff, CrossEntropy, Tokenize, Drag, Perplexity, Turbulence) with random pairing
The export directory is large (0x3475 bytes) and sits in .rdata, consuming a notable portion of initialized data
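One way to probe the small-vocabulary observation is to split each export name into CamelCase tokens and measure how often tokens recur. The sketch below is a heuristic under stated assumptions: `vocabulary_profile` is a hypothetical helper, and the tokenizer splits multi-word base terms such as CrossEntropy into Cross and Entropy, so its token count will not exactly equal the generator's base-term count.

```python
import re
from collections import Counter

# One CamelCase token: an uppercase letter followed by lowercase letters or digits.
TOKEN_RE = re.compile(r"[A-Z][a-z0-9]+")


def vocabulary_profile(export_names):
    """Split names into CamelCase tokens and measure how heavily tokens are reused."""
    counts = Counter()
    for name in export_names:
        counts.update(TOKEN_RE.findall(name))
    total_tokens = sum(counts.values())
    return {
        "names": len(export_names),
        "distinct_tokens": len(counts),
        # Average number of times each distinct token appears across all names.
        "reuse": total_tokens / len(counts) if counts else 0.0,
        "top": counts.most_common(5),
    }


if __name__ == "__main__":
    sample = ["BackoffExtrapolate", "CorruptTurbulence", "TokenizeDrag",
              "CrossEntropyRevokePinch", "BackoffDrag", "TokenizeTurbulence"]
    print(vocabulary_profile(sample))
```

Run against a full export list, a reuse figure well above 1 combined with a distinct-token count far below the name count would be consistent with names assembled from a small term pool; treat any cut-off as something to calibrate against known-good DLLs.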

Reproduce on Your Own VMs

Toolchain: Python 3 + Visual Studio 2022 (MSVC 14.50) or mingw-w64.

Generate a vocabulary file (vocab.txt) with ~100 technical terms from ML, networking, game dev, and misc domains.

Use a Python script to produce 500+ unique compound names:

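import itertools, random
# Small illustrative vocabulary; in practice load the ~100 terms from vocab.txt.
first = ["Backoff","Perplexity","CrossEntropy","Tokenize","Corrupt","Gradient","Bandwidth","PacketLoss"]
second = ["Extrapolate","Turbulence","Drag","RevokePinch","Shrink","Activate","Bounce","Reshard"]
pairs = [a + b for a, b in itertools.product(first, second)]  # 8 x 8 = 64 names
triples = [a + b + c for a, b, c in itertools.product(first, second, second) if b != c]  # 8 x 8 x 7 = 448 more
names = random.sample(pairs + triples, 503)  # 512 unique candidates; keep 503 as in the sample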

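Create a minimal DLL project in C with a module-definition (.def) file that maps every name to a single Stub function:
```
/* Empty body compiles to a minimal return stub; MSVC does not support inline __asm on x64. */
void Stub(void) { }
```
Build as a DLL and link with the .def file, for example cl /LD stub.c /link /DEF:exports.def /SUBSYSTEM:WINDOWS (file names are placeholders). Inspect with dumpbin /exports.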
Observe the export table with pefile or rabin2 -E; it should show 500+ names and very few unique RVAs.
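The .def file for the step above can be produced by a short script instead of by hand. This is a sketch only: `render_def` and the library name `jargon_test` are hypothetical, and `target` must match whatever your stub function is actually called in the C source.

```python
def render_def(library, names, target="Stub"):
    """Return module-definition text that aliases every export name to one function."""
    lines = [f"LIBRARY {library}", "EXPORTS"]
    # "entryname=internalname" makes each exported name resolve to the same body.
    lines += [f"    {name}={target}" for name in sorted(set(names))]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    demo = ["BackoffExtrapolate", "CorruptTurbulence", "TokenizeDrag"]
    print(render_def("jargon_test", demo), end="")
```

Aliasing every name to one function yields a single unique RVA; to approximate the roughly 21 bodies seen in the sample, you would need several stub functions and a way to spread names across them.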

Verification step: Scan your reproducer with yara using the rule from the sunwukong analysis. It should match on both export count and name-set overlap.

Defensive Countermeasures

Sigma / EDR: Alert on PE images with export count > 300 and unique-RVA-to-name ratio < 0.05.
Hunt query (pseudocode): from PEExports | where ExportCount > 300 and len(set(RVA)) < 30
YARA: Combine pe.number_of_exports with string sets of known semantic-jargon vocabulary hits.
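For offline triage, the ratio rule and the vocabulary idea can be combined in one check. The following is a hedged sketch rather than a production detector: `should_alert` is a hypothetical helper, `min_vocab_hits=50` is an illustrative assumption that does not come from the analysis, and the vocabulary list has to be supplied by the analyst.

```python
def should_alert(exports, vocabulary, min_exports=300, max_ratio=0.05, min_vocab_hits=50):
    """Apply the export-count and RVA-ratio rule, then require jargon vocabulary hits.

    exports: iterable of (name, rva) pairs; vocabulary: known jargon terms.
    min_vocab_hits is an illustrative assumption, not a value from the analysis.
    """
    exports = list(exports)
    if len(exports) <= min_exports:
        return False
    ratio = len({rva for _, rva in exports}) / len(exports)
    if ratio >= max_ratio:
        return False
    # A name is a hit when it contains at least two distinct vocabulary terms.
    # Plain substring matching is crude: "Drag" would also match "Dragon".
    hits = sum(1 for name, _ in exports
               if sum(term in name for term in vocabulary) >= 2)
    return hits >= min_vocab_hits


if __name__ == "__main__":
    vocab = ["Backoff", "Drag", "Tokenize", "Turbulence"]
    flooded = [(f"Backoff{i}Drag", 0x1000 + (i % 10)) for i in range(400)]
    print(should_alert(flooded, vocab))
```

Requiring both conditions is a deliberate design choice: the ratio alone may flag legitimate forwarder or stub libraries, while vocabulary hits alone may flag genuine ML or networking DLLs.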

Pages Where Observed

sunwukong — malware family employing this technique
hippamsascom — Littel LLC / "wireless sensor" sibling (9a3c18be) displaying identical export flooding pattern
/intel/analyses/fa16b64ae95d6492be2074e65a0d6eae3ddb8adb9706f41f1fb0ad71c50aa7ce.html — primary analysis
