Lab 12 — Malware Artifacts in IR: CAPA and YARA
Hands-on lab
Setup
This is a reference lab — it ships a one-command environment in the companion
plaintext-labs repo:
git clone https://github.com/plaintext-security/plaintext-labs
cd plaintext-labs/forensics/12-malware-artifacts-ir
make up # builds container with capa and yara; ships benign PE samples
make fetch-data # OPTIONAL: pull a REAL Latrodectus sample from MalwareBazaar (needs Auth-Key)
make demo # runs capa over samples/ and fires yara rules/ against them
make shell # interactive shell for investigation
make down # stop when done
The lab ships data/samples/ — three benign PE binaries compiled from simple C programs.
No malware is committed; the lab practices the tooling and output interpretation against safe
samples. The YARA rules targeting the loader family are in rules/latrodectus_loader.yar.
Optional real target (advanced). make fetch-data pulls a genuine Latrodectus loader
sample from MalwareBazaar (abuse.ch) — the same loader family from
the DFIR Report "Lunar Spider" case this track is anchored to — so you can tune the YARA rule
against the real thing. The sample is live malware: make fetch-data downloads the
password-protected zip (password infected) but does not unzip it. Only detonate or even
unpack it inside an isolated analysis VM you own — never on your host. A free abuse.ch Auth-Key
is required (see PROVENANCE.md).
Do not commit real malware to this repository, and do not run YARA/CAPA against binaries on systems you don't own or aren't authorised to analyse. The committed samples directory contains only benign binaries. Any real malware (the MalwareBazaar sample) is handled exclusively in an isolated analysis environment.
Scenario
The IR team has recovered the loader binary dropped on BEACHHEAD-WS01 — in the anchored
DFIR Report "Lunar Spider" case, this is the Latrodectus loader delivered via
Form_W-9*.js → update.msi. The binary has been safely uploaded to an isolated analysis
environment (not this lab). Your task is to practice the triage workflow: understand CAPA's
capability output format, write a YARA rule that matches the loader based on its known
characteristics (C2 string, network imports), and confirm the benign samples in the lab do not
match your rule (false-positive check). For the advanced path, make fetch-data pulls a real
Latrodectus sample so you can validate the rule against the genuine family.
Do not test YARA rules or CAPA against binaries on systems you don't own or aren't authorised to analyse. In production, all sample analysis is conducted in an isolated environment.
Do
Part 1 — CAPA capability profiling
- [ ] Run make demo and read the CAPA output for each sample. For each binary, note:
    - The top-level capabilities listed (e.g., "link function at runtime," "create process").
    - The ATT&CK technique mapped to each capability (shown in the [ATTCK] column).
    - How many capabilities does the most "feature-rich" benign binary have?
- [ ] Run CAPA manually on the largest sample inside make shell. Find a capability that maps to ATT&CK T1059 or T1547. Does the sample actually use that capability maliciously? What additional evidence would you need to conclude "yes"?
- [ ] Interpret a hypothetical CAPA output. The data/hypothetical_capa_output.txt file shows what CAPA would output for a loader with network C2, persistence via scheduled task, and process injection capability. Read it and write three sentences: what this binary can do, what the IR team should check next to confirm it did do those things, and which other modules' artifacts would provide that confirmation.
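The per-binary tally in Part 1 (how many capabilities, which ATT&CK techniques) can also be scripted. The following is a minimal sketch, assuming you saved CAPA's results as JSON and that each entry under "rules" carries a meta object with a name, optional lib and is_subscope_rule flags, and an attack list whose items have an id field. That layout is an assumption based on recent capa releases, so check it against the version in the lab container. The function names and the sample document are hypothetical.

```python
from collections import defaultdict


def summarize_capa(doc):
    """Return (capability_count, {attack_id: [capability names]}).

    Assumes a capa-style result document: doc["rules"][name]["meta"].
    """
    by_technique = defaultdict(list)
    count = 0
    for rule_name, rule in doc.get("rules", {}).items():
        meta = rule.get("meta", {})
        # Assumption: helper rules are flagged and not shown as capabilities.
        if meta.get("lib") or meta.get("is_subscope_rule"):
            continue
        count += 1
        for entry in meta.get("attack", []):
            by_technique[entry.get("id", "unknown")].append(meta.get("name", rule_name))
    return count, dict(by_technique)


def techniques_matching(by_technique, prefixes=("T1059", "T1547")):
    """Keep only techniques (and sub-techniques) under the given IDs."""
    return {tid: names for tid, names in by_technique.items() if tid.startswith(prefixes)}


if __name__ == "__main__":
    # Hand-written illustration, NOT real capa output.
    sample = {
        "rules": {
            "example capability": {
                "meta": {"name": "example capability", "attack": [{"id": "T1059.003"}]}
            },
            "example helper": {"meta": {"name": "example helper", "lib": True}},
        }
    }
    count, techniques = summarize_capa(sample)
    print(count, techniques_matching(techniques))
```

A capability appearing in this summary only says the code path exists in the binary. Whether it ran still has to come from host artifacts.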
Part 2 — YARA rule writing
- [ ] Examine rules/latrodectus_loader.yar, the pre-written YARA rules for the Latrodectus loader. Read every condition. Why is each rule scoped to PE files? What would happen if you removed the uint16(0) == 0x5A4D check?
- [ ] Run the rules against the benign samples. Do any benign samples match? If yes, identify which condition caused the match and explain how you'd tighten the rule.
- [ ] Write your own YARA rule. Open a new rule file (rules/custom.yar) and write a rule that matches any PE file containing both the string workspacin.cloud (the real Latrodectus C2 from the case) and an import of WSAConnect or connect. Test it against the benign samples to confirm no false positives.
- [ ] Add a meta section to your rule with: author, description, date, reference (cite a MITRE ATT&CK technique), and hash (the SHA-256 of the sample). If you ran make fetch-data, use the real Latrodectus sample's SHA-256; otherwise use the placeholder in data/hypothetical_capa_output.txt. This is production rule hygiene.
- [ ] (Advanced) Validate against the real sample. If you fetched the Latrodectus sample with make fetch-data, unzip it (password infected) inside an isolated analysis VM only, and run rules/latrodectus_loader.yar and your custom.yar against it. Does the curated rule fire? Record the sample's SHA-256 in your meta and note which strings matched a real binary versus a benign one.
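To see what the conditions in Part 2 actually test, it can help to emulate them in plain Python. The sketch below is a rough illustration, not a substitute for YARA: is_pe mirrors the uint16(0) == 0x5A4D check, and would_match approximates the custom rule with naive byte searches. The function names are hypothetical, and the import check is a deliberate simplification (a real rule would more likely use the pe module's import test than a raw string).

```python
import struct

C2_STRING = b"workspacin.cloud"
NETWORK_IMPORTS = (b"WSAConnect", b"connect")


def is_pe(data: bytes) -> bool:
    """Equivalent of the YARA condition uint16(0) == 0x5A4D."""
    if len(data) < 2:
        return False
    # uint16(0) reads two bytes at offset 0 as a little-endian integer,
    # so the file magic "MZ" (bytes 0x4D 0x5A) reads as 0x5A4D.
    return struct.unpack_from("<H", data, 0)[0] == 0x5A4D


def would_match(data: bytes) -> bool:
    """Naive stand-in for: PE header AND C2 string AND a network import name."""
    has_c2 = C2_STRING in data
    # Byte search only: "connect" also occurs inside unrelated strings,
    # which is exactly the kind of false-positive source to look for.
    has_import = any(name in data for name in NETWORK_IMPORTS)
    return is_pe(data) and has_c2 and has_import


if __name__ == "__main__":
    fake_pe = b"MZ" + b"\x00" * 16 + C2_STRING + b"\x00WSAConnect\x00"
    text_note = b"IOC list: workspacin.cloud, WSAConnect"
    print(would_match(fake_pe), would_match(text_note))
```

The second input contains both strings but no MZ header, which is the case the PE scoping is there to exclude.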
Success criteria: you're done when
- [ ] You have documented the capability profile of each benign sample (number of capabilities and the mapped ATT&CK techniques, noting the one you judge most concerning).
- [ ] You have written three sentences interpreting the hypothetical loader's CAPA output.
- [ ] rules/latrodectus_loader.yar runs cleanly against the benign sample directory with zero false positives.
- [ ] rules/custom.yar (your rule) also produces zero false positives on the benign set.
- [ ] Both rules have complete meta sections.
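The "complete meta section" criterion is easy to check mechanically. Below is a minimal sketch, assuming a single-rule file laid out one meta entry per line. It uses a naive regular expression rather than a real YARA parser, so treat a clean result as a hint and not as validation. The helper names are hypothetical. sha256_of computes the value for the hash field.

```python
import hashlib
import re

REQUIRED_META = {"author", "description", "date", "reference", "hash"}


def meta_keys(rule_text: str) -> set:
    """Collect identifiers assigned inside meta: sections of YARA source."""
    keys = set()
    # Naive: a meta section runs from "meta:" to the next "strings:" or "condition:".
    for block in re.findall(r"meta\s*:(.*?)(?=strings\s*:|condition\s*:)", rule_text, re.S):
        keys.update(re.findall(r"^\s*([A-Za-z_]\w*)\s*=", block, re.M))
    return keys


def missing_meta(rule_text: str) -> set:
    """Return the required meta fields that the rule does not define."""
    return REQUIRED_META - meta_keys(rule_text)


def sha256_of(path: str) -> str:
    """SHA-256 of a file, read in chunks so large samples are fine."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

For example, missing_meta(open("rules/custom.yar").read()) should return an empty set once all five fields are present.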
Deliverables
Commit to your portfolio repo:
- rules/custom.yar: your YARA rule with a complete meta section.
- capability-analysis.md: the three-sentence loader interpretation plus your false-positive analysis.
Do not commit PE binaries — reference sample filenames in your analysis.
Automate & own it
Required. Write a Python script triage_samples.py that:
1. Accepts a directory of binary files.
2. Runs yara (via subprocess) against each file using all .yar files in a specified rules directory.
3. For each match, prints: filename, matched rule name, matched strings.
4. Outputs a summary: total files scanned, total matches, list of matched files.
Have a model draft the script; test it against the lab's benign sample set and confirm the
output is correct before using it on anything else. Commit triage_samples.py.
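As a starting point for comparison with whatever draft you get, here is one possible shape for triage_samples.py. It is a sketch under stated assumptions: the yara binary is on PATH, its -s option prints each matched string on a line beginning with a hex offset, and each rule hit is printed as "rule_name file_path". The argument order and function names are choices made for this illustration, not something the lab prescribes.

```python
import subprocess
import sys
from pathlib import Path


def parse_yara_output(text):
    """Parse `yara -s` output into a list of {'rule', 'file', 'strings'} dicts."""
    matches = []
    for line in text.splitlines():
        if not line.strip():
            continue
        if line.startswith("0x") and matches:
            # Matched-string line, e.g. 0x1f40:$c2: workspacin.cloud
            matches[-1]["strings"].append(line)
        else:
            # Rule-hit line: "<rule name> <file path>"
            rule, _, path = line.partition(" ")
            matches.append({"rule": rule, "file": path, "strings": []})
    return matches


def scan(rules_dir, samples_dir):
    """Run every .yar file against every file; return (files_scanned, matches)."""
    files = sorted(p for p in Path(samples_dir).iterdir() if p.is_file())
    results = []
    for rule_file in sorted(Path(rules_dir).glob("*.yar")):
        for sample in files:
            proc = subprocess.run(
                ["yara", "-s", str(rule_file), str(sample)],
                capture_output=True, text=True, errors="replace", check=False,
            )
            results.extend(parse_yara_output(proc.stdout))
    return len(files), results


def main(rules_dir, samples_dir):
    total, results = scan(rules_dir, samples_dir)
    for m in results:
        print(f"{m['file']}: {m['rule']}")
        for s in m["strings"]:
            print(f"    {s}")
    matched = sorted({m["file"] for m in results})
    print(f"files scanned: {total}, matches: {len(results)}, matched files: {matched}")


if __name__ == "__main__":
    if len(sys.argv) == 3:
        main(sys.argv[1], sys.argv[2])
    else:
        print("usage: triage_samples.py RULES_DIR SAMPLES_DIR")
```

Keeping the output parser separate from the subprocess call means you can test the parsing against saved yara output without running a scan.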
AI acceleration
Describe the Latrodectus loader's characteristics to a model ("PE file, imports
WSAConnect and CreateRemoteThread, contains string 'workspacin.cloud', uses UPX packing with
section name .upx0") and ask it to draft a YARA rule. Compare the model's draft to rules/latrodectus_loader.yar
in the lab. Where does the model's rule differ? Is it more or less specific? Run both against the
benign samples — does the model's version produce false positives that the curated rule avoids?
The comparison teaches rule quality faster than reading about it.
Connects forward
The YARA rule you write becomes a retroactive hunt artifact: in a real engagement, it goes to the threat intel team and to the EDR for a fleet-wide search. Module 13 (IR Process) documents this handoff as part of the NIST "Containment, Eradication, and Recovery" phase. Track 04 (Malware Analysis) covers the deep reverse-engineering that follows triage.
Marketable proof
"I triage suspicious binaries using CAPA capability profiling, write YARA rules from sample characteristics, validate them against benign baselines, and hand off operationalisable IOCs to threat intel — without running the malware."
Stretch
- Add a CAPA rule to rules/ (in CAPA's YAML format) that matches a binary which resolves a network function at runtime (e.g., via GetProcAddress). Test it against the benign samples. Observe how the CAPA rule grammar differs from YARA's.
- Use YARA's pe module to write a rule that matches any PE binary whose import hash (imphash) matches a specific value. The imphash ties samples from the same build environment together across campaigns. Document why imphash-based rules degrade over time (hint: the imphash is computed from the imported DLL and function names in order, so a different compiler, linker setting, or packer changes it even when the source code is identical).
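To make the imphash hint concrete, here is a small sketch of the commonly described recipe: lowercase each "library.function" pair, strip the library's file extension, join the pairs with commas in import-table order, and take the MD5. This is a simplified illustration that takes the import list as input and does not parse a PE file or resolve ordinal imports, so its output is not guaranteed to equal what a PE library reports for a real sample. The function name and sample imports are hypothetical.

```python
import hashlib


def imphash(imports):
    """imports: ordered list of (dll_name, function_name) pairs."""
    parts = []
    for dll, func in imports:
        lib = dll.lower()
        for ext in (".dll", ".ocx", ".sys"):
            if lib.endswith(ext):
                lib = lib[: -len(ext)]
                break
        parts.append(f"{lib}.{func.lower()}")
    # Order matters: the pairs are hashed in the order they were listed.
    return hashlib.md5(",".join(parts).encode()).hexdigest()


if __name__ == "__main__":
    build_a = [("KERNEL32.dll", "GetProcAddress"), ("WS2_32.dll", "WSAConnect")]
    build_b = list(reversed(build_a))  # same imports, different link order
    print(imphash(build_a) == imphash(build_b))
```

The same two imports in a different order produce a different hash, which is why a toolchain change can break an imphash rule without any change to the loader's behaviour.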