Module 13 — Incident Response Process

Type 1 · Concept Autopsy: dissect the incident response against the four phases of NIST SP 800-61, naming the gaps and successes, and produce a structured post-incident analysis a security manager or auditor could act on. (Secondary: Decision/ADR, framing the containment-timing call as a recorded decision.)

Last reviewed: 2026-06

Digital Forensics & IR · The investigation without a process is just a fire drill; the process is what turns technical findings into organisational decisions.

Difficulty: Intermediate  ·  Estimated time: ~3–5 hrs (study + lab)  ·  Prerequisites: Foundations

In 60 seconds

Technical forensic skill without a process produces findings that go nowhere. NIST SP 800-61's four phases — Preparation, Detection & Analysis, Containment/Eradication/Recovery, Post-Incident — are the skeleton most regulated IR programs run on, and they're cyclical, not linear. The phase teams botch most is Containment (too early loses scope, too late lets the attacker run) — and it's a business decision informed by technical findings, not a technical one. Eradication that misses the exfiltrated key is cleanup theater; post-incident review must find why, not just what.

Why this matters

Technical forensic skill without a process framework produces findings that go nowhere. A responder who can identify timestomping, reconstruct a CloudTrail attack chain, and profile a dropper with CAPA — but who doesn't know when to escalate, who to notify, how to document decisions, or what "eradication" means in their environment — is a technician, not a responder. NIST SP 800-61 is the process skeleton that most IR programs in regulated industries are built on; knowing it means knowing how to operate in any of them.

To see the framework against a real incident rather than an abstraction, read a published case from The DFIR Report — a freely available library of 85+ detailed real-intrusion writeups, each reconstructed end-to-end and mapped to MITRE ATT&CK. Take their IcedID-to-Quantum-ransomware case: the ~78-hour window from a malicious-ISO click to domain-wide encryption is a forced study in the NIST phases — where Detection could have fired earlier, where Containment didn't happen, and what a complete Eradication would have had to rotate. Mapping a real case like this onto Preparation / Detection & Analysis / Containment-Eradication-Recovery / Post-Incident is exactly the lab exercise, and these reports are the realistic raw material for it.

Objective

Map the incident to the four phases of NIST SP 800-61, identify gaps and successes in the simulated response, and produce a structured post-incident analysis that a security manager or auditor could use.

The core idea

NIST SP 800-61 Rev. 2 structures incident response into four phases: Preparation, Detection & Analysis, Containment/Eradication/Recovery, and Post-Incident Activity. These are not sequential in practice: a real response cycles between Detection and Containment many times as new hosts are scoped in. As an analytical frame, though, they give you a consistent way to audit any incident response: what did the team know and when, what decisions were made, what was contained and when, and what was learned.

The mental model

The four NIST phases aren't a checklist you walk once — they're an analytical frame and a cycle you re-enter as new hosts are scoped in. Use them to audit a response: what did the team know and when, what was decided, what was contained, what was learned.

stateDiagram-v2
    [*] --> Preparation
    Preparation --> Detection: incident reported
    Detection --> Containment: scoped
    Containment --> Detection: new host found
    Containment --> Eradication
    Eradication --> Recovery
    Recovery --> PostIncident
    PostIncident --> Preparation: lessons learned
    Detection: Detection & Analysis
    Containment: Containment
    Eradication: Eradication
    Recovery: Recovery
    PostIncident: Post-Incident Activity
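
The transitions in the diagram can also be written down as data, which makes it easy to check whether a logged sequence of phase changes is one the model allows. This is a minimal Python sketch; the names `TRANSITIONS` and `is_valid_path` are illustrative and not part of NIST SP 800-61.

```python
# Allowed phase transitions, copied from the state diagram above.
TRANSITIONS = {
    "Preparation": {"Detection"},
    "Detection": {"Containment"},
    "Containment": {"Detection", "Eradication"},  # loop back when a new host is found
    "Eradication": {"Recovery"},
    "Recovery": {"PostIncident"},
    "PostIncident": {"Preparation"},  # lessons learned feed the next cycle
}


def is_valid_path(phases):
    """Return True if every consecutive pair of phases is an allowed transition."""
    return all(nxt in TRANSITIONS.get(cur, set()) for cur, nxt in zip(phases, phases[1:]))


# A hypothetical response that re-scopes once before eradicating.
observed = ["Preparation", "Detection", "Containment", "Detection",
            "Containment", "Eradication", "Recovery", "PostIncident"]
print(is_valid_path(observed))
```

A sequence that jumps from Detection straight to Eradication would fail the check, which is the audit question the frame is meant to raise: what was skipped, and was that a decision or an omission?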

The phase that responders most consistently execute poorly is Containment, and the failure mode is almost always one of two things: containing too early (before scoping is complete, so the attacker pivots to a host you haven't identified yet) or containing too late (waiting for forensic certainty before acting, while the attacker continues to operate). The decision is never "do we have enough evidence?" — it's "does the evidence we have change the cost-benefit of waiting?" A live attacker with persistence on two hosts is a different calculation than a months-old compromised account with no evidence of recent activity. Containment is a business decision informed by technical findings, not a technical decision.
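
To make the wait-or-act question concrete, the inputs can be laid out as rough estimates and compared. The sketch below is an assumption-heavy illustration, not a method from NIST SP 800-61: the cost model, the field names, and every number are hypothetical, and the output is a prompt for the business conversation rather than a substitute for it.

```python
from dataclasses import dataclass


@dataclass
class ContainmentEstimate:
    """Rough inputs for the wait-or-act call. Every field is a hypothetical estimate."""
    p_attacker_active: float       # probability the attacker is still operating (0..1)
    damage_per_hour: float         # expected loss per hour while an active attacker is uncontained
    p_unscoped_foothold: float     # probability a foothold exists that has not been identified
    recompromise_cost: float       # expected cost if the attacker pivots after early containment
    hours_to_finish_scoping: float


def cost_of_waiting(e):
    """Expected loss from letting the attacker run until scoping is complete."""
    return e.p_attacker_active * e.damage_per_hour * e.hours_to_finish_scoping


def cost_of_acting_now(e):
    """Expected loss from containing before scoping is complete."""
    return e.p_unscoped_foothold * e.recompromise_cost


def recommend(e):
    return "contain now" if cost_of_acting_now(e) < cost_of_waiting(e) else "keep scoping"


# Same environment, two situations: a live attacker vs. a stale compromised account.
live = ContainmentEstimate(0.9, 20_000, 0.3, 150_000, 8)
stale = ContainmentEstimate(0.05, 20_000, 0.3, 150_000, 8)
print(recommend(live), "|", recommend(stale))
```

With these made-up numbers the live-attacker case comes out as "contain now" and the stale-account case as "keep scoping". The only input that changed is the estimate of whether the attacker is still active, which is the sense in which new evidence changes the cost-benefit of waiting.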

The gotcha

Containment fails in both directions: too early and the attacker pivots to a host you hadn't scoped; too late and they keep operating while you chase certainty. The question is never "do we have enough evidence?" — it's "does the evidence we have change the cost-benefit of waiting?" That makes containment a business call informed by technical findings, not a technical one.

Eradication is the phase most organisations underestimate in complexity. Identifying and removing the backdoor is the obvious step; equally important are rotating all credentials that touched the compromised host or were used by the compromised account, revoking and reissuing any secrets or certificates the host had access to, auditing all systems the compromised account could reach, and confirming that the initial access vector is closed. An eradication that misses the AWS access key that was exfiltrated and used to create a backdoor IAM user, as in this case, is not eradication. It's cleanup theater.

Post-Incident Activity is where institutional learning happens, and most teams do it badly or not at all. A post-incident review that is only a timeline ("what happened") instead of a causal analysis ("why did this happen and what would have stopped it") produces a report but not an improvement. The two questions worth forcing: "what was the first detection that could have fired, and why didn't it?" and "what single control, if implemented, would have had the highest probability of stopping or detecting this earlier?" Those two answers drive the remediation roadmap more than any checklist.
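
The first forcing question ("what was the first detection that could have fired, and why didn't it?") can be started mechanically from a merged timeline. The sketch below assumes each event has already been annotated by the reviewer with two judgments: whether existing telemetry could have detected it, and whether an alert actually fired. The timeline entries are hypothetical and are not drawn from any real case.

```python
from datetime import datetime

# Hypothetical merged timeline:
# (timestamp, event, detectable with existing telemetry?, alert actually fired?)
timeline = [
    (datetime(2024, 1, 10, 9, 0), "User opens malicious attachment", True, False),
    (datetime(2024, 1, 10, 9, 5), "Implant beacons to external host", True, False),
    (datetime(2024, 1, 11, 14, 0), "Credential dumping on workstation", True, False),
    (datetime(2024, 1, 12, 8, 30), "Lateral movement to file server", True, True),
]


def detection_gap(events):
    """Return (first detectable event, first alerted event, hours between them)."""
    ordered = sorted(events, key=lambda e: e[0])
    first_possible = next(e for e in ordered if e[2])
    first_actual = next(e for e in ordered if e[3])
    hours = (first_actual[0] - first_possible[0]).total_seconds() / 3600
    return first_possible, first_actual, hours


possible, actual, gap_hours = detection_gap(timeline)
print(f"First detectable: {possible[1]}")
print(f"First alert:      {actual[1]}")
print(f"Missed window:    {gap_hours:.1f} hours")
```

The code only finds the gap; the causal half of the question (why the earlier detection did not fire) still has to be answered by a person looking at the specific control. Note that `next` raises `StopIteration` if no event is marked detectable or alerted, so a real script would need to handle that case.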

Go deeper: why eradication is harder than it looks

Removing the backdoor is the obvious step and the easy half. The complete job: rotate every credential that touched the compromised host or account, revoke and reissue any secrets or certs it could reach, audit all systems the account could access, and confirm the initial access vector is closed. An eradication that removes the implant but misses the exfiltrated AWS key used to create a backdoor IAM user is not eradication — it's cleanup theater, and the attacker walks back in.
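
One way to keep the "complete job" visible during a response is to track it as an explicit checklist and report what is still open. This is a minimal sketch; the step keys are illustrative labels for the steps named above, and the status set is hypothetical.

```python
# Eradication steps named in the text. The keys are illustrative labels.
REQUIRED_STEPS = {
    "implant_removed": "Backdoor or implant removed",
    "credentials_rotated": "Credentials that touched the compromised host or account rotated",
    "secrets_reissued": "Secrets and certificates the host could reach revoked and reissued",
    "reachable_systems_audited": "Systems the compromised account could access audited",
    "initial_vector_closed": "Initial access vector confirmed closed",
}


def eradication_gaps(completed):
    """Return descriptions of required steps that are not in `completed`."""
    return [desc for key, desc in REQUIRED_STEPS.items() if key not in completed]


# Hypothetical status: the implant is gone and the vector is closed, nothing else is done.
done = {"implant_removed", "initial_vector_closed"}
for gap in eradication_gaps(done):
    print("OPEN:", gap)
```

In this example the response looks finished from the host's point of view, yet three items remain open, including the credential rotation that would have caught an exfiltrated access key.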

AI caveat

The synthesis in post-incident review — read a long timeline, name causal factors, draft remediation — suits AI well. Feed it the merged timeline and ask "earliest point this was detectable? what control would have stopped it?" But a model answers from generic knowledge: it recommends controls you may already have and misses gaps specific to your configuration. The review judgment stays yours.

Learn (~2 hrs)

NIST SP 800-61 (~1 hr)

  • NIST SP 800-61 Rev. 2, Computer Security Incident Handling Guide (PDF): the canonical reference. Read Sections 3 (Handling an Incident) and 4 (Coordination and Information Sharing), about 40 pages but dense with practical guidance. This is the specification the lab exercise maps to.
  • CISA, Incident Response Playbook (PDF): CISA's operationalised playbook that implements NIST 800-61 for federal agencies; readable as a concrete example of what the framework looks like in practice.

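IR in practice (~0.5 hrs)

  • SANS, The Incident Handler's Handbook (PDF): a free practitioner walkthrough with real-world colour on where responses go wrong. It uses the SANS six-step model (Preparation, Identification, Containment, Eradication, Recovery, Lessons Learned), which maps closely onto the four NIST phases. Read sections 2 and 3 for the triage and containment decision frameworks.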

Post-incident analysis (~0.5 hrs)

  • The DFIR Report, "Malicious ISO File Leads to Domain Wide Ransomware": a complete real intrusion (initial access through domain-wide ransomware in ~78 hrs). Read it once, then re-read it, mapping each section to a NIST 800-61 phase and asking "where could detection or containment have broken this chain?" That is the post-incident analysis the lab asks you to produce.

Key concepts

  • NIST SP 800-61 phases: Preparation → Detection & Analysis → Containment/Eradication/Recovery → Post-Incident
  • Containment timing: a business decision (cost of waiting vs. cost of acting) informed by technical scope
  • Eradication completeness: backdoors, credentials, secrets, access vectors — all must be addressed
  • Post-incident review quality: causal analysis ("why did controls fail?") beats timeline ("what happened?")
  • Evidence tracking: chain of custody, decision log, and notification record are non-optional in regulated industries
  • The "lessons learned" artifact is an organisational deliverable, not an optional appendix
  • The DFIR Report's public case library is realistic raw material for mapping a real incident onto the NIST 800-61 phases
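
The decision log mentioned above can be as simple as a structured record written at the moment the call is made. The sketch below shows one possible shape for a containment-timing entry; the class name, the field names, and the example values are all hypothetical, and a regulated environment will have its own required fields.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class DecisionRecord:
    """One entry in an incident decision log. Field names are illustrative."""
    decision: str
    evidence_known: list
    alternatives_rejected: list
    decided_by: str
    # Stamped in UTC at creation so entries from different responders sort consistently.
    recorded_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))

    def render(self):
        lines = [
            f"[{self.recorded_at.isoformat(timespec='seconds')}] DECISION: {self.decision}",
            f"  Decided by: {self.decided_by}",
            "  Evidence known at the time:",
            *[f"    - {item}" for item in self.evidence_known],
            "  Alternatives rejected:",
            *[f"    - {item}" for item in self.alternatives_rejected],
        ]
        return "\n".join(lines)


record = DecisionRecord(
    decision="Isolate both hosts with confirmed persistence now",
    evidence_known=["Persistence confirmed on two hosts",
                    "Scoping of remaining hosts incomplete"],
    alternatives_rejected=["Wait for full scoping before containing"],
    decided_by="Incident commander (hypothetical role)",
)
print(record.render())
```

Recording the evidence known at the time, rather than what was learned later, is what lets a post-incident review judge the decision fairly.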

AI acceleration

The synthesis task in post-incident review — reading a long timeline, identifying causal factors, and drafting remediation recommendations — is well-suited to AI assistance. Feed the merged incident timeline (from module 10) to a model and ask: "What was the earliest point this incident could have been detected? What controls, if present, would have stopped it?" Use the model's output as a starting point; the output must be reviewed against your actual environment and organisational context. A model answering from a generic knowledge base will recommend controls you may already have and miss gaps specific to your configuration. The review judgment is yours.

Check yourself

  • Why is containment timing a business decision rather than a technical one — what's the actual question being weighed?
  • An eradication removed the backdoor but the incident recurred a week later. Name a likely step that was skipped.
  • What distinguishes a post-incident review that drives improvement from one that just produces a report?
