Lab 06 - Read the Wire: the Handshake, the Lookup, and the Beacon

Variant D · skill-first, breach as stakes.

Setup

This is a reference lab: the companion plaintext-labs repo ships a one-command environment. It spins up a tiny HTTP server and a tool container (tcpdump, dig, curl) on an isolated Docker network, so the lab traffic you generate never leaves your machine: no real network, no cloud.

git clone https://github.com/plaintext-security/plaintext-labs.git
cd plaintext-labs/foundations/06-networking
make up         # start the HTTP server + tool container
make demo       # generate live DNS + HTTP traffic, then walk an annotated capture
make fetch-data # (optional, recommended) download a REAL malware C2 pcap to hunt in
make shell      # drop into the tool container (tcpdump, dig, curl)
make down       # stop when done

make fetch-data downloads a real RAT C2 capture from Malware-Traffic-Analysis.net (the "You dirty rat!" exercise). It ships as a password-protected zip on purpose — the password is on the site's about page. You unzip it yourself inside the lab container; that friction is the correct way to handle real malicious traffic. See data/PROVENANCE.md. (The bundled beacon capture below is the safe, zero-download default if you'd rather not handle live malware traffic yet.)

Scenario

You capture one ordinary HTTP request end to end (the DNS lookup, then the TCP handshake, then the request itself) and read it yourself, packet by packet. Then you hunt a C2 beacon. You have two targets, depending on the stakes you want: a bundled capture with one DNS-based C2 beacon mixed in (safe, synthetic, modeled on SUNBURST's avsvmcloud.com lookups) and, once you've done that, the real RAT C2 capture from Malware-Traffic-Analysis.net (make fetch-data), where the callbacks are genuine. Same skill, real stakes: read the wire, then read it adversarially.

Capture only on systems/networks you own, or inside this throwaway container. The bundled beacon capture is a safe, synthetic stand-in. The fetched MTA.net pcap is real malicious traffic — open it only inside the throwaway lab container, never execute anything it references, and defang any live domain/IP before writing it down.
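
Defanging is easy to get inconsistent by hand, so it can help to script it. The sketch below assumes the common convention of hxxp for http and [.] for dots; defang is a hypothetical helper name, not something the lab repo ships, and the sample indicators are reserved documentation values.

```python
import re

def defang(indicator: str) -> str:
    """Make a URL, domain, or IPv4 address non-clickable for notes and reports."""
    # Neutralize the scheme first: http -> hxxp, https -> hxxps.
    out = re.sub(r"^http", "hxxp", indicator, flags=re.IGNORECASE)
    # Break the scheme separator so nothing auto-links it.
    out = out.replace("://", "[://]")
    # Bracket every dot so the name or address cannot resolve if pasted.
    return out.replace(".", "[.]")

if __name__ == "__main__":
    print(defang("http://bad.example.com/path"))
    print(defang("203.0.113.7"))
```

Run every indicator from the real capture through something like this before it goes into networking.md.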

Do

Work the commands out from the Learn resources and man tcpdump — that derivation is the lab. Each step feeds the next.

[ ] Capture a live exchange. Start a packet capture on all interfaces, writing to a file, in the background. (Which flag writes to a file instead of printing? How do you background a command?) Why all interfaces, not the default? In a container, name resolution goes to Docker's embedded DNS at 127.0.0.11 over the loopback interface, so a capture bound to eth0 alone records the ARP exchange and the TCP handshake but misses the DNS query and answer entirely. Capturing every interface is what makes the lookup visible.
[ ] Generate traffic. In another shell, produce exactly one DNS lookup and one HTTP request to the lab server, then stop the capture cleanly.
[ ] Find the lookup. From the saved file, isolate just the DNS traffic. Point to the query and the answer — the A record and the IP it returned. (Which port carries DNS?)
[ ] Find the handshake. Isolate the SYN, SYN-ACK, and ACK that open the TCP connection. State the client port, the server port, and the order of the first ~6 packets. (Hint: tcpdump can filter on TCP flags.)
[ ] Read the whole stream. Open the capture in Wireshark and "Follow TCP Stream" to see the request and response as one conversation — the layers, reassembled.
[ ] Now hunt the beacon. Open the provided synthetic beacon capture. Walk its DNS lookups the way you just walked yours, and find the one that's wrong. Check your README prediction: what actually gives it away — a subdomain no human would type, an unusually long/random name, a repeating interval? Name the packet and say, in one sentence, why it's the beacon.
[ ] Do it for real (recommended). make fetch-data, unzip the MTA.net capture with the password from the about page, and open the real .pcap in the lab container. Apply the same workflow — isolate the DNS, then hunt the HTTP/DNS callbacks — but expect real C2 to look nothing like the synthetic model: it hides in normal-looking domains amid a firehose of legitimate Windows/Microsoft telemetry, and it may carry no "random subdomain" tell at all. You won't recognize the malicious names by sight — and that's expected. The skill is method, not recall: (1) pin the one suspect host, (2) read its traffic in sequence (delivery → callback → repeat), and (3) enrich the domains you don't recognize — look them up (VirusTotal, urlscan.io, or a plain web search) rather than hoping one looks obviously wrong. A legitimate service used at the wrong time is itself a tell (e.g. why would a workstation call a geolocation API?). Treat "I found a suspicious callback and can justify it" as the win — full host/user attribution and naming the malware family belong to the Active Directory and Forensics tracks, not this one. Check yourself against the published analysis. (Read-only: never run anything it references.)
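
Finding the handshake comes down to reading one byte: the TCP flags. The sketch below shows that logic in Python on a made-up packet list. The flag bit values come from the TCP header definition; the tuple layout, the function names, and the ports and lengths in SAMPLE are all hypothetical, so your own capture remains the source of truth.

```python
# TCP flag bits from the TCP header (RFC 9293).
FIN, SYN, RST, PSH, ACK = 0x01, 0x02, 0x04, 0x08, 0x10

# Hypothetical packets: (source port, destination port, flags byte, payload bytes).
SAMPLE = [
    (51000, 80, SYN, 0),
    (80, 51000, SYN | ACK, 0),
    (51000, 80, ACK, 0),
    (51000, 80, PSH | ACK, 78),
    (80, 51000, ACK, 0),
    (80, 51000, PSH | ACK, 512),
]

def flag_names(flags):
    """Spell out the set bits, e.g. 0x12 -> 'SYN-ACK'."""
    table = ((SYN, "SYN"), (ACK, "ACK"), (PSH, "PSH"), (FIN, "FIN"), (RST, "RST"))
    return "-".join(name for bit, name in table if flags & bit) or "none"

def find_handshake(packets):
    """Return the indexes of the SYN, SYN-ACK, and ACK of the first handshake, or None."""
    for i, (sport, dport, flags, _) in enumerate(packets):
        if flags & SYN and not flags & ACK:          # client opens with a bare SYN
            syn_ack = None
            for j in range(i + 1, len(packets)):
                s, d, f, _ = packets[j]
                if syn_ack is None:
                    # The reply travels the other way and sets both SYN and ACK.
                    if (s, d) == (dport, sport) and f & SYN and f & ACK:
                        syn_ack = j
                elif (s, d) == (sport, dport) and f & ACK and not f & SYN:
                    return i, syn_ack, j             # client's final ACK
    return None

def packets_before_data(packets):
    """Count the packets seen before the first one that carries payload."""
    for i, (_, _, _, length) in enumerate(packets):
        if length > 0:
            return i
    return len(packets)

if __name__ == "__main__":
    for number, (sport, dport, flags, length) in enumerate(SAMPLE, start=1):
        print(f"{number}: {sport} -> {dport} [{flag_names(flags)}] len={length}")
    print("handshake at indexes:", find_handshake(SAMPLE))
```

SAMPLE holds a single TCP stream. A real capture on all interfaces also contains ARP and DNS packets ahead of the SYN, so count against your own file rather than this list.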

Success criteria: you're done when

[ ] You can point to the DNS query and the A record it returned in your own capture.
[ ] You can identify the SYN, SYN-ACK, and ACK, and state the client and server ports.
[ ] You can name the order of the first ~6 packets and how many flowed before any data did.
[ ] You've identified the beacon lookup in the second capture and can explain in one sentence what made it stand out from the legitimate DNS around it.

Deliverables

A short networking.md: the resolved IP, the annotated SYN/SYN-ACK/ACK packets, how many packets were exchanged before data flowed, and a one-paragraph verdict on the beacon (which packet, and the tell). Reference cap.pcap — do not commit it (see .gitignore).

Automate & own it

Required. Turn the manual read into a small reusable tool: a Python script that takes a capture file and prints (a) the DNS queries and their answers, (b) the SYN/SYN-ACK/ACK of each handshake, and (c) any DNS lookup that looks like a beacon by a rule you choose and can defend — e.g. an unusually long subdomain, a high-entropy/random-looking name, or a name not on an allowlist. AI drafts the parser; you review every line, confirm it flags the real beacon for the right reason, and run it against both captures to prove it. Next layer (the Step 7 reflex in code): have it enrich each flagged domain — look it up against a reputation source (e.g. the VirusTotal or urlscan.io API) and print the verdict — so the tool investigates the names you don't recognize instead of relying on a structural guess. Commit the script alongside networking.md. (A pcap library like scapy or dpkt is the usual path; reading tcpdump -r text output is a fine beginner alternative.)
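
One way to express "a rule you choose and can defend" is a function that returns the reasons a name was flagged, so every hit explains itself. The sketch below combines the label-length and entropy tells with an allowlist. The thresholds (20 characters, 3.5 bits per character), the function names, and the sample domains are assumptions for illustration, not values from the lab; tune them against both captures.

```python
import math
from collections import Counter

def shannon_entropy(text: str) -> float:
    """Bits of entropy per character; higher means more random-looking."""
    if not text:
        return 0.0
    total = len(text)
    return -sum((n / total) * math.log2(n / total) for n in Counter(text).values())

def beacon_reasons(qname, allowlist=frozenset(), max_label_len=20, max_entropy=3.5):
    """Return why a DNS name looks beacon-like; an empty list means no flag."""
    name = qname.rstrip(".").lower()
    if name in allowlist:
        return []
    # Score the longest label: encoded data usually lands in one subdomain label.
    longest = max(name.split("."), key=len)
    reasons = []
    if len(longest) > max_label_len:
        reasons.append(f"long label ({len(longest)} chars)")
    entropy = shannon_entropy(longest)
    if entropy > max_entropy:
        reasons.append(f"high entropy ({entropy:.2f} bits/char)")
    return reasons

if __name__ == "__main__":
    # Hypothetical names, not taken from either capture.
    for qname in ("www.example.com", "q7x2m9k4p1z8w3v6t5r0b2n8.updates.example.net"):
        print(qname, "->", beacon_reasons(qname) or "no flag")
```

Structural rules like these produce false positives on some legitimate hostnames and can miss real C2 that uses ordinary-looking domains, which is why the enrichment layer matters.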

AI acceleration

Paste any tcpdump line you don't understand to a model for a plain-English read of the flags, then confirm it against your capture and man tcpdump. For the beacon, ask the model why a given lookup is suspicious before you decide — it'll propose tells (length, randomness, frequency); your job is to check each against the actual packets and keep only the ones that hold. The verdict is yours.
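
The frequency tell is the easiest one to check with numbers rather than by eye. A minimal sketch, assuming you have already pulled the timestamps (in seconds) of the lookups for one name out of the capture: it measures how regular the gaps are. The function name and the sample timestamps are hypothetical.

```python
from statistics import mean, pstdev

def interval_jitter(timestamps):
    """Coefficient of variation of the gaps between lookups (0 = perfectly regular)."""
    ordered = sorted(timestamps)
    gaps = [later - earlier for earlier, later in zip(ordered, ordered[1:])]
    if len(gaps) < 2 or mean(gaps) == 0:
        return None  # too few lookups to say anything about regularity
    return pstdev(gaps) / mean(gaps)

if __name__ == "__main__":
    # Made-up timestamps: one name queried about every 60 s, one queried irregularly.
    print(interval_jitter([0, 60, 120.5, 180, 240.2]))
    print(interval_jitter([0, 3, 50, 51, 300]))
```

A value near zero means the lookups are close to evenly spaced. Treat it as one signal to confirm against the packets, not a verdict: implants often add deliberate jitter, and plenty of legitimate software polls on a timer.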

Connects forward

Reading the wire underpins Offensive recon and scanning, Defensive network monitoring (Zeek/Suricata), and Forensic network reconstruction. The DNS-beacon hunt you just did is the seed of cloud detection: SUNBURST's C2 hid in DNS precisely because cloud and enterprise networks let DNS traffic out, which is the exact gap the Cloud track's logging-and-detection modules close.

Marketable proof

"I can take a packet capture cold and walk the DNS lookup and the TCP handshake — and I can spot a DNS-based C2 beacon hiding in ordinary traffic and script the triage that flags it."

Stretch

Capture an HTTPS request (curl https://example.com): you'll see the TLS ClientHello and the SNI, but not the payload. Explain why the body is opaque — and what a defender can still learn from the metadata (the destination, the SNI, the timing) even without decrypting it. This is exactly the metadata a DNS/TLS beacon hunt leans on.
The resolver lies about its port — find out how. Capture DNS on the loopback interface (-i lo) while you curl http://server and look closely: the answer comes from 127.0.0.11:53, but the query you sent to :53 shows up with a high destination port (tcpdump won't even dissect it as DNS) — your app never asked for that port. Trace the machinery: the container reaches the embedded resolver via 127.0.0.11 (confirm in /etc/resolv.conf), and the container's own netfilter rules rewrite the port in flight. Dump them with iptables -t nat -S (the lab container has NET_ADMIN) and find the DNAT that redirects :53 to the resolver's real high port and the SNAT that rewrites the answer's source back to :53. Answer two questions: what connection state ties the two halves together so the application never notices, and why does tcpdump -i lo port 53 catch the answer but miss the question (what filter would catch both)? This DNAT-to-a-local-port trick is the same one behind transparent proxies and a lot of container "magic" addresses — spot it here and you'll recognize it everywhere.
