type: technique · confidence: high · created: 2026-05-30 · updated: 2026-05-30
obfuscation, pe, evasion, malware-family, compiler, anti-vm

Semantic Jargon Export Obfuscation

What It Does

A PE export table is populated with hundreds of plausible-sounding function names drawn from unrelated technical domains (machine learning, networking, game engines, DevOps). The names are syntactically valid and semantically coherent within their domains, but all resolve to a handful of tiny ret-only stubs. This creates a veneer of a large, legitimate software project, drowning real functionality in noise and frustrating signature-based detection.

Detection / Fingerprint

  • Export count > 400 with unique RVA count < 25 (high name-to-body ratio)
  • Names are grammatically consistent English compound words drawn from 2–4 technical domain vocabularies mixed together (e.g., BackoffExtrapolate, CorruptTurbulence, TokenizeDrag, CrossEntropyRevokePinch)
  • No meaningful cross-references from the stub bodies — each stub is push rbp; mov rbp,rsp; ret or shorter
  • High-entropy .data section suggests actual payload is separate from the export façade
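
The first fingerprint (many names, few bodies) can be checked with a few lines of Python. The following is a minimal sketch, not the tooling used in the original analysis: the function names export_stats, looks_flooded, and load_exports are hypothetical, the thresholds are the ones from the list above, and load_exports assumes the third-party pefile package is installed.

```python
def export_stats(exports):
    """Summarize one image's export table.

    exports: iterable of (name, rva) pairs; name may be None for
    ordinal-only exports.
    """
    exports = list(exports)
    return {
        "export_count": len(exports),
        "unique_rvas": len({rva for _name, rva in exports}),
    }


def looks_flooded(stats, min_exports=400, max_unique_rvas=25):
    """Apply the fingerprint: export count > 400 and unique RVA count < 25."""
    return (stats["export_count"] > min_exports
            and stats["unique_rvas"] < max_unique_rvas)


def load_exports(path):
    """Read (name, rva) pairs from a PE file using pefile (assumed installed)."""
    import pefile  # imported lazily so the pure functions above need no extra packages

    pe = pefile.PE(path)
    directory = getattr(pe, "DIRECTORY_ENTRY_EXPORT", None)
    if directory is None:  # image has no export directory
        return []
    return [
        (sym.name.decode(errors="replace") if sym.name else None, sym.address)
        for sym in directory.symbols
    ]
```

Usage would look like looks_flooded(export_stats(load_exports("sample.dll"))), where sample.dll is a placeholder path. The stub-body and entropy checks from the list are not covered by this sketch.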

Implementation Patterns Observed

In the sunwukong sample:

  • 503 exported names mapped to ~21 unique RVAs in the .text section ^[sample fa16b64a/pefile.txt:338-500]
  • Names generated by combining a vocabulary of ~100 base terms (e.g., Backoff, CrossEntropy, Tokenize, Drag, Perplexity, Turbulence) with random pairing
  • The export directory is large (0x3475 bytes) and sits in .rdata, consuming a notable portion of initialized data
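
Because the names are built by pairing terms from a fixed vocabulary, a name that splits cleanly into known terms is a useful signal. The sketch below is one possible way to test that, assuming a vocabulary list has already been recovered; the segment and vocab_coverage helpers are hypothetical, and the vocabulary shown contains only terms quoted on this page rather than the full ~100-term set.

```python
def segment(name, vocab):
    """Split name into vocabulary terms that cover it exactly.

    Returns a list of terms, or None if no exact cover exists.
    Uses dynamic programming over prefix lengths.
    """
    best = {0: []}  # prefix length -> one segmentation of that prefix
    for end in range(1, len(name) + 1):
        for term in vocab:
            start = end - len(term)
            if start in best and name[start:end] == term:
                best[end] = best[start] + [term]
                break
    return best.get(len(name))


def vocab_coverage(names, vocab):
    """Fraction of export names that decompose fully into vocabulary terms."""
    names = list(names)
    if not names:
        return 0.0
    hits = sum(1 for n in names if segment(n, vocab) is not None)
    return hits / len(names)


# Illustrative vocabulary: only terms that appear on this page.
VOCAB = ["Backoff", "CrossEntropy", "Tokenize", "Drag", "Perplexity",
         "Turbulence", "Corrupt", "Extrapolate", "RevokePinch"]
```

A coverage value near 1.0 across hundreds of exports would be consistent with vocabulary-driven generation; ordinary libraries are expected to score far lower against the same list, though that expectation is an assumption and not a measured result.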

Reproduce on Your Own VMs

Toolchain: Python 3 + Visual Studio 2022 (MSVC 14.50) or mingw-w64.

  1. Generate a vocabulary file (vocab.txt) with ~100 technical terms from ML, networking, game dev, and misc domains.
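  2. Use a Python script to produce compound names. The sample lists below yield only 8 × 8 = 64 names; extend each list to 23 or more terms from vocab.txt to exceed 500:
    import itertools
    # Sample first- and second-position terms; extend both lists from vocab.txt.
    first = ["Backoff","Perplexity","CrossEntropy","Tokenize","Corrupt","Gradient","Bandwidth","PacketLoss"]
    second = ["Extrapolate","Turbulence","Drag","RevokePinch","Shrink","Activate","Bounce","Reshard"]
    # Every first/second combination, concatenated into one export name.
    names = [f"{a}{b}" for a, b in itertools.product(first, second)]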
    
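  3. Create a minimal DLL project in C with a .def file whose EXPORTS section maps every generated name to a single _Stub function (one Name=_Stub entry per name):
    /* Empty body: MSVC does not support __asm blocks when targeting x64. */
    __declspec(dllexport) void __stdcall _Stub(void) { }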
    
  4. Build as a DLL (/DLL, /SUBSYSTEM:WINDOWS) and pass the .def file to the linker with /DEF. Inspect the result with dumpbin /exports.
  5. Check the export table with pefile or rabin2 -iE; it should show 500+ names and very few unique RVAs.

Verification step: Scan your reproducer with yara using the rule from the sunwukong analysis. It should match on export count and name-set overlap.

Defensive Countermeasures

  • Sigma / EDR: Alert on PE images with export count > 300 and unique-RVA-to-name ratio < 0.05.
  • Hunt query: | from PEExports | where ExportCount > 300 and len(set(RVA)) < 30
  • YARA: Combine pe.number_of_exports with string sets of known semantic-jargon vocabulary hits.
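
The hunt query above is written in pseudo-syntax. As a rough illustration of the same logic over a flat table of export records, the sketch below groups rows by image and evaluates both the hunt condition (more than 300 exports, fewer than 30 unique RVAs) and the ratio-based alert condition (ratio below 0.05). The row layout (image, export_name, rva) and the hunt_flooded_images helper are assumptions made for illustration, not the schema of any particular EDR product.

```python
from collections import defaultdict


def hunt_flooded_images(rows, min_exports=300, max_unique_rvas=30, max_ratio=0.05):
    """Flag images whose export table looks flooded.

    rows: iterable of (image, export_name, rva) records (assumed layout).
    Returns a sorted list of dicts, one per flagged image.
    """
    rvas_by_image = defaultdict(list)
    for image, _export_name, rva in rows:
        rvas_by_image[image].append(rva)

    hits = []
    for image, rvas in rvas_by_image.items():
        count = len(rvas)
        unique = len(set(rvas))
        if count <= min_exports:
            continue  # both conditions require a large export count
        ratio = unique / count
        few_bodies = unique < max_unique_rvas  # hunt-query condition
        low_ratio = ratio < max_ratio          # ratio-based alert condition
        if few_bodies or low_ratio:
            hits.append({
                "image": image,
                "export_count": count,
                "unique_rvas": unique,
                "few_bodies": few_bodies,
                "low_ratio": low_ratio,
            })
    return sorted(hits, key=lambda h: h["image"])
```

The two conditions are reported separately because neither implies the other: for an image with 1,000 exports, the 0.05 ratio admits up to 49 unique RVAs, while the absolute cutoff of 30 is stricter.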

Pages Where Observed

  • sunwukong — malware family employing this technique
  • hippamsascom — Littel LLC / "wireless sensor" sibling (9a3c18be) displaying identical export flooding pattern
  • /intel/analyses/fa16b64ae95d6492be2074e65a0d6eae3ddb8adb9706f41f1fb0ad71c50aa7ce.html — primary analysis