Module 04 - Static Analysis - Capabilities

Type 6 · Reconstruct: recover a binary's behaviour without running it by interpreting capa's MITRE ATT&CK and MBC output. The deliverable is a YARA rule targeting one identified capability, plus a note on what static analysis can and cannot see (for example, a packed sample gives capa almost nothing to match). (Secondary: Tool-Build, because the YARA authoring is a mini tool build.)

Last reviewed: 2026-06

Malware Analysis — knowing the import list is a vocabulary lesson; capabilities analysis tells you what sentences the binary can form.

Difficulty: Intermediate · Estimated time: ~5–7 hrs (study + lab) · Prerequisites: Foundations

In 60 seconds

An import list is a vocabulary lesson; a capability view tells you what sentences the binary can form. capa turns "calls VirtualAllocEx and WriteProcessMemory" into "performs process injection (T1055)" — labelled behaviours mapped to ATT&CK, the language the rest of the team already speaks. The catch: capa reasons statically, so a packed sample produces almost no output — which is itself a signal. YARA is the other half: capa applies curated behavioural rules, YARA applies yours.

Why this matters

Module 03 gave you the raw API list and the strings. This module answers the harder question: what behaviour does this binary exhibit, at the level a threat analyst actually speaks? "Calls VirtualAllocEx and WriteProcessMemory" is technical; "performs process injection (T1055)" is actionable. The gap between those two statements is what capability detection tools close.

TrickBot is a clean illustration of why a capability view beats an import view. It is a modular C++ trojan whose plugins each implement a named behaviour: pwgrab harvests browser credentials (T1555.003) along with email credentials, the core injects into svchost.exe/wermgr.exe (T1055), and shareDll enumerates the network. A capa run over a TrickBot module returns those behaviours as labels mapped to ATT&CK IDs, which is the language the rest of the team already uses, rather than a raw list of APIs an analyst then has to interpret. (The TrickBot entry in MITRE ATT&CK, S0266, catalogues the modules and their technique mappings.)

Objective

Run capa against a PE binary, interpret its MITRE ATT&CK and MBC (Malware Behavior Catalog) output, and write a YARA rule that targets a specific capability. Understand the difference between what capa can detect statically and what requires dynamic analysis.

The core idea

The mental model

capa is a static capability detector from the Mandiant FLARE team. Each rule describes a pattern of API calls, string constants, byte sequences, or code structures that, in combination, implement a recognisable behaviour. Where pefile gives you raw ingredients (imports, strings), capa gives you labelled behaviours — process injection, keylogging, credential harvesting — each mapped to a MITRE ATT&CK technique ID. That mapping connects your finding directly to the threat-model language the rest of the team already uses.
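To make "features combined by logic" concrete, here is a minimal sketch of the idea in Python. It is not capa's engine or its rule format: `ToyRule`, `evaluate`, and `label_capabilities` are hypothetical names, the rule considers API names only, and the choice of thread-creation APIs in the `or` branch is an assumption made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToyRule:
    name: str
    attack_id: str
    expr: tuple  # nested ("and" | "or", ...) nodes with ("api", name) leaves


def evaluate(expr, apis):
    """Return True if the feature expression holds for the given API names."""
    op, *args = expr
    if op == "api":
        return args[0] in apis
    if op == "and":
        return all(evaluate(arg, apis) for arg in args)
    if op == "or":
        return any(evaluate(arg, apis) for arg in args)
    raise ValueError(f"unknown operator: {op!r}")


# Hypothetical rule: allocate remotely, write remotely, then start a thread.
TOY_RULES = [
    ToyRule(
        name="process injection (toy rule)",
        attack_id="T1055",
        expr=("and",
              ("api", "VirtualAllocEx"),
              ("api", "WriteProcessMemory"),
              ("or",
               ("api", "CreateRemoteThread"),
               ("api", "NtCreateThreadEx"))),
    ),
]


def label_capabilities(apis):
    """Map a raw import list to (behaviour label, ATT&CK ID) pairs."""
    seen = set(apis)
    return [(r.name, r.attack_id) for r in TOY_RULES if evaluate(r.expr, seen)]


imports = ["VirtualAllocEx", "WriteProcessMemory", "CreateRemoteThread"]
print(label_capabilities(imports))
# -> [('process injection (toy rule)', 'T1055')]
```

The point of the sketch is the shape of the output: the analyst reads a labelled behaviour with a technique ID instead of interpreting three API names.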

The gotcha

capa reasons from static structure, so anything obfuscated or resolved at runtime — dynamic imports via GetProcAddress, encrypted strings, packer stubs — is invisible to it. A packed binary produces almost no capa output, and that silence is a result, not a clean bill of health: cross-check it against the Module 02 entropy read. Empty capa + high entropy means "packed — unpack or detonate first," not "no capabilities."
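The cross-check above can be sketched as a small triage helper. This is an illustration under stated assumptions: the thresholds (`min_matches=3`, `packed_entropy=7.2`) are hypothetical placeholders rather than values from capa or from Module 02, and `capa_match_count` is simply the number of rules capa reported, however you obtained it.

```python
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Shannon entropy in bits per byte (0.0 for empty input, max 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def triage(capa_match_count: int, entropy: float,
           min_matches: int = 3, packed_entropy: float = 7.2) -> str:
    """Combine the capa result with the entropy read (thresholds are assumed)."""
    if capa_match_count < min_matches and entropy >= packed_entropy:
        return "likely packed: unpack or detonate first"
    if capa_match_count < min_matches:
        return "few capabilities: check for runtime-resolved imports or encrypted strings"
    return "capabilities found: proceed to interpretation"


# Uniformly distributed bytes stand in for a compressed or encrypted section.
sample = bytes(range(256)) * 16
print(triage(capa_match_count=0, entropy=shannon_entropy(sample)))
```

Keeping the two inputs separate makes the reasoning explicit: silence from capa only becomes "packed" once the entropy read agrees.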

YARA is the other side of capability detection. Where capa applies curated behavioural rules, YARA applies your rules: bytes, strings, or regular expressions you write based on specific indicators. A YARA rule written after seeing a particular mutex string or a specific XOR key in a malware family becomes a detection that scales across millions of files. It is the practitioner's way of encoding "I've seen this before; here is the pattern that identifies it." Public rule repositories already carry YARA signatures for behaviours that fall under ATT&CK techniques such as Indicator Removal (T1070) and Obfuscated Files or Information (T1027).
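As a rough illustration of encoding an observed indicator as a rule, the sketch below renders YARA rule text from a few strings. The helper `render_yara_rule` and every indicator value (the mutex name, the registry path, the rule name) are hypothetical; the output follows the standard `meta` / `strings` / `condition` layout, but it still has to be compiled and tested with YARA itself.

```python
def render_yara_rule(name, strings, min_hits, meta=None):
    """Render YARA rule text.

    `strings` maps an identifier to (literal, [modifiers]).
    The condition is "<min_hits> of them", i.e. any N of the listed strings.
    """
    lines = [f"rule {name}", "{"]
    if meta:
        lines.append("    meta:")
        for key, value in meta.items():
            lines.append(f'        {key} = "{value}"')
    lines.append("    strings:")
    for ident, (literal, modifiers) in strings.items():
        # Backslashes and double quotes must be escaped inside YARA text strings.
        escaped = literal.replace("\\", "\\\\").replace('"', '\\"')
        suffix = (" " + " ".join(modifiers)) if modifiers else ""
        lines.append(f'        ${ident} = "{escaped}"{suffix}')
    lines.append("    condition:")
    lines.append(f"        {min_hits} of them")
    lines.append("}")
    return "\n".join(lines)


# Hypothetical indicators, invented for this example only.
rule_text = render_yara_rule(
    name="Example_Family_Mutex",
    strings={
        "mutex": ("Global\\ExampleMutex_7f3a", ["ascii", "wide"]),
        "regkey": ("Software\\ExampleVendor\\Run", ["nocase"]),
    },
    min_hits=1,
    meta={"description": "hypothetical indicators from one analysed sample"},
)
print(rule_text)
```

Generating the text programmatically is one option among several; for a single rule, writing it by hand is just as reasonable. The generator mainly helps when the same indicator set feeds several rules.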

Go deeper: the workflow order

The practical workflow is: capa first (what does this binary claim it can do?), YARA second (does it match any known family?), then static strings and imports to fill in gaps. None of these steps replaces the others; they are views of the same artifact from different angles.

Learn (~4 hrs)

capa
- FLARE: Automatically Identify Malware Capabilities with capa (blog). The original announcement post with worked examples. Read the "How capa Works" and "Example Output" sections carefully. (~30 min.)
- capa GitHub: Rules README. Understand the rule structure (YAML format, and/or logic, feature types). Read the "Rule Format" section.

YARA
- YARA Official Documentation: Writing Rules. The canonical reference; read through "String Modifiers" and "Conditions." (~1 hr.)

Malware Behavior Catalog
- MBC Overview (MITRE CTID). The companion catalog to ATT&CK specifically for malware; understand how MBC Objective and Behavior map to ATT&CK techniques. Read the README and browse 3–4 behaviour entries.

A real modular family whose capabilities are documented (~15 min)
- TrickBot, MITRE ATT&CK S0266. Read the technique list and map each TrickBot module (pwgrab, shareDll, the injector core) to its behaviour and ATT&CK ID. This is what well-formed capa output should look like for a real sample, before you run the tool.

Key concepts

capa rule structure: features (api, string, bytes, characteristic) + logic (and/or/not)
capa output fields: ATT&CK technique, MBC objective/behavior, namespace, rule name
What capa cannot detect: dynamic imports, runtime-decrypted strings, packed code
YARA rule anatomy: rule, strings, condition blocks
YARA modifiers: nocase, wide, ascii, fullword
Threat-model alignment: capability → ATT&CK technique ID
MITRE ATT&CK T1055 (Process Injection), T1547 (Boot/Logon Autostart Execution)
Real worked family: TrickBot (modular C++ trojan) — its named modules (pwgrab credential theft T1555.003, the svchost.exe injector T1055) are capabilities-as-labels, exactly what capa is built to surface
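The YARA modifiers listed above are easier to remember once you see which bytes each one accepts. The sketch below approximates them with a Python bytes regex. It is an emulation for intuition, not YARA's matcher: `modifier_pattern` is a hypothetical helper, `wide=True` here means wide only (in YARA you would write `ascii wide` to match both encodings), and the `fullword` handling is a simplification of YARA's delimiter check.

```python
import re


def modifier_pattern(literal, wide=False, nocase=False, fullword=False):
    """Approximate a YARA text string plus modifiers as a compiled bytes regex."""
    # wide: each character is followed by a zero byte (UTF-16LE for ASCII text).
    raw = literal.encode("utf-16-le") if wide else literal.encode("ascii")
    body = re.escape(raw)
    if fullword:
        # fullword: the match must not touch an alphanumeric character on either side.
        alnum = rb"[A-Za-z0-9]\x00" if wide else rb"[A-Za-z0-9]"
        body = rb"(?<!" + alnum + rb")" + body + rb"(?!" + alnum + rb")"
    # nocase: compare ASCII letters case-insensitively.
    return re.compile(body, re.IGNORECASE if nocase else 0)


wide_hit = modifier_pattern("mutex", wide=True).search(b"\x01m\x00u\x00t\x00e\x00x\x00\x02")
ascii_miss = modifier_pattern("mutex", wide=True).search(b"mutex")
print(bool(wide_hit), bool(ascii_miss))
```

A common reason a rule "matches nothing" is a string stored as UTF-16 in the binary while the rule only looks for the ASCII form, which is exactly the case the `wide` example shows.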

AI acceleration

AI can generate a YARA rule skeleton quickly — give it the mutex string, a registry key, and the PE section name and ask for a rule that matches any two of the three. Then validate it: run the rule against a file that should match (true positive) and one that should not (true negative). A YARA rule that compiles but matches nothing is the most common AI output failure.

AI caveat

A model writes a YARA skeleton fast, but its most common failure is a rule that compiles and matches nothing. The compiler being happy is not validation — run it against a true positive and a true negative yourself.
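The true-positive / true-negative check can be sketched as a tiny harness. To keep it self-contained, the matcher is a plain Python callable: `two_of_three` is a stand-in that mimics a "2 of them" condition, and the indicator values are hypothetical. With the yara-python package installed, you could pass `lambda data: bool(rules.match(data=data))` instead, where `rules = yara.compile(source=rule_text)`.

```python
from typing import Callable, Iterable


def validate_rule(matcher: Callable[[bytes], bool],
                  true_positives: Iterable[bytes],
                  true_negatives: Iterable[bytes]) -> dict:
    """Check that a rule hits every known-bad sample and no known-good one."""
    positives = list(true_positives)
    negatives = list(true_negatives)
    if not positives or not negatives:
        # A rule tested against nothing has not been validated.
        raise ValueError("need at least one true positive and one true negative")
    missed = sum(1 for sample in positives if not matcher(sample))
    false_hits = sum(1 for sample in negatives if matcher(sample))
    return {"missed": missed, "false_hits": false_hits,
            "ok": missed == 0 and false_hits == 0}


# Hypothetical indicators: a mutex, a registry key, and a PE section name.
INDICATORS = [b"Global\\ExampleMutex_7f3a", b"Software\\ExampleVendor\\Run", b".xyz0"]


def two_of_three(data: bytes) -> bool:
    """Stand-in matcher: true when at least two indicators appear in the data."""
    return sum(indicator in data for indicator in INDICATORS) >= 2


positives = [b"MZ..." + INDICATORS[0] + b"...." + INDICATORS[2]]
negatives = [b"MZ... benign file that happens to contain .xyz0"]

print(validate_rule(two_of_three, positives, negatives))
# A rule that compiles but never matches fails the same check:
print(validate_rule(lambda data: False, positives, negatives))
```

The second call is the failure mode described above: nothing errors, yet `missed` is non-zero, so the rule is not ready to ship.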

Check yourself

What does capa give you that a raw pefile import dump does not?
capa returns almost nothing on a sample. Why is that a finding rather than a clean result?
capa vs. YARA: which applies curated rules and which applies yours, and when do you reach for each?
