Module 05: Browser & Application Artifacts

Type 6 · Reconstruct: reconstruct what a user actually did from a Chrome/Chromium SQLite history database with hindsight and direct SQL, recovering history, searches, and downloads as artifact-cited findings. (Secondary: Misconception Reveal, showing which artifacts survive incognito and disproving "clearing history erases it" via WAL recovery.)

Last reviewed: 2026-06

Digital Forensics & IR — the browser is the most detailed diary of what a user actually did — and most people forget it exists.

Difficulty: Intermediate · Estimated time: ~4–6 hrs (study + lab) · Prerequisites: Foundations

In 60 seconds

The browser is the densest diary of what a user actually did — Chrome stores history, searches, and downloads in SQLite databases in the user profile, going back months. hindsight parses those profiles into a structured timeline, and crucially reads the Write-Ahead Log, which often still holds rows from before the user cleared history. The artifacts are redundant (history, DNS cache, downloads, network flows), and the recurring trap is Chrome's timestamp epoch: microseconds since 1601, not Unix — get it wrong and every time shifts 369 years.

Why this matters

Web browsers and desktop applications are the primary interface for most modern work, and for most modern threats. A user targeted by phishing clicked something in a browser. An insider who exfiltrated data probably navigated to a cloud storage service or a webmail provider. A compromised machine is often first reached through a malicious payload downloaded by a browser. Understanding what the browser recorded, and that it recorded far more than the visible history, is essential for reconstructing user activity during an incident.

The classic worked example is the Nitroba University Harassment scenario, a public Digital Corpora dataset built around a single investigative question: which of several students sharing a dorm IP sent a harassing webmail message? The case is solved by reconstructing web activity — the anonymous-mailer page the suspect visited, the webmail logins, the navigation chain that places one person at the keyboard. (Nitroba ships as a PCAP, so you read the activity off the wire rather than out of a SQLite history file — but the artifacts are the same URLs, search terms, and webmail sessions you'll pull from a browser profile here, which is why it's the canonical "tie web activity to a human" teaching case.) It is the same reasoning you apply when an insider's browser history is the evidence that they reached a cloud-storage exfiltration page.

Objective

Extract browsing history, search terms, and download records from a Chrome/Chromium SQLite History database using hindsight and direct SQL queries; locate saved credentials in the profile's separate Login Data database; explain which browser artifacts survive private/incognito mode and which don't.

The core idea

Every major browser stores its state in a collection of SQLite databases and JSON files on the local filesystem. Chrome stores history in ~/.config/google-chrome/Default/History on Linux and %LOCALAPPDATA%\Google\Chrome\User Data\Default\History on Windows; its derivatives (Edge, Brave, Arc) use the same file format under their own vendor-specific profile directories. This SQLite file contains tables for urls (one row per distinct URL, with visit count and last visit time), visits (each individual visit with transition type and referrer chain), downloads (target path, byte counts, state, with the source URLs held in the companion downloads_url_chains table), and keyword_search_terms (search queries typed in the address bar). This is not a redacted log: it is a dense, timestamped audit trail of everything the browser navigated to, typically going back months (Chrome ages out older visits on its own, by default after roughly 90 days) unless manually cleared.
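The table relationships described above can be sketched in a few lines of Python. This is a minimal illustration, not hindsight's method: the in-memory database is a toy stand-in with a deliberately simplified subset of the real columns, and the URLs, the `build_toy_history` helper, and the sample rows are all hypothetical. The transition names follow Chromium's core page-transition values, which occupy the low byte of `visits.transition`.

```python
import sqlite3
from datetime import datetime, timedelta, timezone

CHROME_EPOCH = datetime(1601, 1, 1, tzinfo=timezone.utc)

# Core transition types sit in the low byte; the upper bits are qualifiers.
CORE_TRANSITIONS = {
    0: "link", 1: "typed", 2: "auto_bookmark", 3: "auto_subframe",
    4: "manual_subframe", 5: "generated", 6: "auto_toplevel",
    7: "form_submit", 8: "reload", 9: "keyword", 10: "keyword_generated",
}

def build_toy_history():
    """Hypothetical stand-in for a History file (simplified columns, fake data)."""
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE urls (id INTEGER PRIMARY KEY, url TEXT, title TEXT,
                           visit_count INTEGER, last_visit_time INTEGER);
        CREATE TABLE visits (id INTEGER PRIMARY KEY, url INTEGER,
                             visit_time INTEGER, from_visit INTEGER,
                             transition INTEGER);
        CREATE TABLE keyword_search_terms (keyword_id INTEGER, url_id INTEGER,
                                           term TEXT);
        INSERT INTO urls VALUES
          (1, 'https://search.example/?q=anonymous+mailer', 'Search', 1,
           13348540800000000),
          (2, 'https://mailer.example/send', 'Send mail', 1, 13348540860000000);
        INSERT INTO visits VALUES
          (1, 1, 13348540800000000, 0, 805306373),
          (2, 2, 13348540860000000, 1, 805306368);
        INSERT INTO keyword_search_terms VALUES (2, 1, 'anonymous mailer');
    """)
    return db

# Each visit joins to its URL, and from_visit points at the referring visit.
TIMELINE_SQL = """
    SELECT v.id, v.visit_time, u.url, v.transition, ref.url
    FROM visits AS v
    JOIN urls AS u ON u.id = v.url
    LEFT JOIN visits AS pv ON pv.id = v.from_visit
    LEFT JOIN urls AS ref ON ref.id = pv.url
    ORDER BY v.visit_time
"""

def timeline(db):
    """Return one dict per visit, oldest first, with a UTC timestamp."""
    rows = []
    for visit_id, visit_time, url, transition, referrer in db.execute(TIMELINE_SQL):
        rows.append({
            "visit_id": visit_id,
            "when_utc": CHROME_EPOCH + timedelta(microseconds=visit_time),
            "url": url,
            "transition": CORE_TRANSITIONS.get(transition & 0xFF, "other"),
            "came_from": referrer,  # None when the visit has no referrer
        })
    return rows

def search_terms(db):
    """Return (term, url) pairs for address-bar searches."""
    sql = """SELECT k.term, u.url
             FROM keyword_search_terms AS k
             JOIN urls AS u ON u.id = k.url_id"""
    return db.execute(sql).fetchall()

if __name__ == "__main__":
    for row in timeline(build_toy_history()):
        print(row)
```

Against real evidence the same queries would run on a working copy of the History file rather than the toy database, and any column the real schema adds should be checked against the Chromium source for the browser version in question.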

The mental model

A browser profile is not a redacted log — it's a dense, timestamped audit trail of everything the browser navigated to, and it's redundant by design. The same activity is recorded in the history DB, the WAL, the DNS cache, the download directory, and network flows. Knock out one and the others still testify.

What makes browser artifacts forensically rich is that they are redundant. Clearing browser history removes the urls and visits table rows, but it does not clear the browser's DNS cache, the OS's DNS cache, the network flow data, or the download directory. More usefully for the forensic investigator: depending on the journaling mode, the database is accompanied by a Write-Ahead Log (History-wal) or a rollback journal (History-journal), and these sidecar files often contain rows from before a clear was issued. SQLite does not erase WAL frames when a transaction commits: frames stay in the file until a checkpoint copies them into the main database, and even then the file is reused from the start rather than wiped, so older frames can linger until they are overwritten. hindsight, a forensic tool purpose-built for Chrome-family browsers, understands this and reads both the main database and the WAL.
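To make the WAL point concrete, here is a minimal sketch that inventories the frames in a WAL file using only the layout in the SQLite file-format documentation: a 32-byte header followed by frames, each a 24-byte frame header plus one page. It is not how hindsight works internally, and the `inventory_wal` name and its output fields are hypothetical. Frames whose salts differ from the header's belong to an earlier WAL generation; SQLite ignores them, but their page content is still on disk and is where pre-clear rows may survive.

```python
import struct
from pathlib import Path

WAL_MAGICS = (0x377F0682, 0x377F0683)  # the two valid WAL magic numbers
WAL_HEADER = struct.Struct(">8I")       # 32-byte big-endian header
FRAME_HEADER = struct.Struct(">6I")     # 24-byte big-endian frame header

def inventory_wal(wal_path):
    """List every frame slot in a WAL file without touching the main database.

    Returns None for a file too short to hold a header.
    """
    data = Path(wal_path).read_bytes()
    if len(data) < WAL_HEADER.size:
        return None
    magic, _version, page_size, ckpt_seq, salt1, salt2, _c1, _c2 = (
        WAL_HEADER.unpack_from(data, 0)
    )
    if magic not in WAL_MAGICS:
        raise ValueError("not a SQLite WAL file")

    frames = []
    frame_len = FRAME_HEADER.size + page_size
    offset = WAL_HEADER.size
    while offset + frame_len <= len(data):
        page_no, db_size, f_salt1, f_salt2, _fc1, _fc2 = (
            FRAME_HEADER.unpack_from(data, offset)
        )
        frames.append({
            "offset": offset,
            "page": page_no,
            # A non-zero "database size" field marks a commit frame.
            "is_commit": db_size != 0,
            # Matching salts mean the frame belongs to the live WAL generation.
            "current_generation": (f_salt1, f_salt2) == (salt1, salt2),
        })
        offset += frame_len
    return {"page_size": page_size, "checkpoint_seq": ckpt_seq, "frames": frames}
```

A call such as `inventory_wal("History-wal")` on a working copy gives the offsets of stale frames worth carving. Work on copies only: opening the live History file with an ordinary SQLite connection can trigger a checkpoint and change the very WAL you want to examine.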

Application-specific databases follow the same pattern: Slack stores its workspace message history in a LevelDB database. Teams stores logs and media in %APPDATA%\Microsoft\Teams. Outlook stores email, contacts, and calendar in .ost (offline store) files. All of these are structured data formats that forensic tools can parse without a running application. The investigator's job is to know that "the user says they didn't download the file" is a testable claim, and that the SQLite downloads table, the $MFT, the prefetch, and the browser cache collectively either corroborate or contradict it.
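As a sketch of how the "didn't download the file" claim gets tested against the browser's own records, the example below queries a toy database shaped like Chrome's downloads and downloads_url_chains tables. The schema is a simplified subset, and the path, URLs, and helper names are hypothetical. The raw state value is reported as stored rather than interpreted, since its meaning should be confirmed for the Chrome version under examination.

```python
import sqlite3

def build_toy_downloads():
    """Hypothetical stand-in for the download tables (simplified, fake data)."""
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE downloads (id INTEGER PRIMARY KEY, target_path TEXT,
                                start_time INTEGER, received_bytes INTEGER,
                                total_bytes INTEGER, state INTEGER);
        CREATE TABLE downloads_url_chains (id INTEGER, chain_index INTEGER,
                                           url TEXT);
        INSERT INTO downloads VALUES
          (1, '/home/demo/Downloads/report.zip', 13348540900000000,
           2048, 2048, 1);
        INSERT INTO downloads_url_chains VALUES
          (1, 0, 'https://short.example/abc'),
          (1, 1, 'https://files.example/report.zip');
    """)
    return db

# One row per hop in the redirect chain; the conversion is the Chrome-epoch
# formula, applied inside SQLite.
DOWNLOADS_SQL = """
    SELECT d.id, d.target_path, d.received_bytes, d.total_bytes, d.state,
           datetime(d.start_time / 1000000 - 11644473600, 'unixepoch'),
           c.url
    FROM downloads AS d
    JOIN downloads_url_chains AS c ON c.id = d.id
    ORDER BY d.id, c.chain_index
"""

def download_findings(db):
    """Group the redirect chain under each download record."""
    findings = {}
    for dl_id, path, received, total, state, started, url in db.execute(DOWNLOADS_SQL):
        item = findings.setdefault(dl_id, {
            "download_id": dl_id,
            "target_path": path,
            "started_utc": started,
            "raw_state": state,
            "bytes_complete": total > 0 and received == total,
            "url_chain": [],
        })
        item["url_chain"].append(url)
    return list(findings.values())

if __name__ == "__main__":
    for finding in download_findings(build_toy_downloads()):
        print(finding)
```

A record like this is one leg of the corroboration: the target path can then be checked against the $MFT and the download directory, and the start time against prefetch and network flows.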

Hindsight is an open-source, Python-based forensic tool written by Ryan Benson (obsidianforensics on GitHub). It parses Chrome-family browser profile directories and produces structured output (XLSX, JSONL, or SQLite) of everything it finds: visited URLs sorted by timestamp, download records with source URLs, search terms, cookies, form autofill data, and extension state. For the forensic investigator, it provides a first-pass timeline of browser activity that can be fed directly into the super-timeline built in Module 07.

The gotcha

Chrome's native timestamp is microseconds since 1601-01-01 UTC, not the Unix epoch. It shares its epoch with Windows FILETIME but not its unit: FILETIME counts 100-nanosecond ticks. Hindsight converts automatically; raw SQL needs datetime(visit_time / 1000000 - 11644473600, 'unixepoch'). Get the epoch wrong and every timestamp shifts 369 years, a mistake that has shipped in real reports. Always sanity-check one known timestamp before trusting the conversion.
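The conversion and its failure mode can be checked in a few lines. This is a minimal sketch; the function names are hypothetical, and the sample value 13348540800000000 is simply 2024-01-01 00:00:00 UTC expressed in Chrome microseconds (Unix 1704067200 plus the 11644473600-second offset).

```python
import sqlite3
from datetime import datetime, timedelta, timezone

CHROME_EPOCH = datetime(1601, 1, 1, tzinfo=timezone.utc)
UNIX_EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

def chrome_to_datetime(chrome_us):
    """Microseconds since 1601-01-01 UTC -> timezone-aware UTC datetime."""
    return CHROME_EPOCH + timedelta(microseconds=chrome_us)

def datetime_to_chrome(moment):
    """Timezone-aware datetime -> Chrome microsecond timestamp."""
    return (moment - CHROME_EPOCH) // timedelta(microseconds=1)

def naive_unix_misread(chrome_us):
    """The classic mistake: treating the value as microseconds since 1970."""
    return UNIX_EPOCH + timedelta(microseconds=chrome_us)

def sql_conversion(chrome_us):
    """Cross-check using the raw SQL expression, run inside SQLite itself."""
    db = sqlite3.connect(":memory:")
    sql = "SELECT datetime(? / 1000000 - 11644473600, 'unixepoch')"
    return db.execute(sql, (chrome_us,)).fetchone()[0]

if __name__ == "__main__":
    sample = 13348540800000000
    print(chrome_to_datetime(sample))   # correct reading
    print(naive_unix_misread(sample))   # lands centuries in the future
    print(sql_conversion(sample))       # should agree with the first line
```

Running both the Python and SQL paths on one timestamp whose true time is known from another source (a logged event, a file's creation time) is a cheap way to do the sanity check recommended above.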

Go deeper: application artifacts beyond the browser

The same pattern recurs across desktop apps: Slack stores workspace history in a LevelDB database, Teams keeps logs and media in %APPDATA%\Microsoft\Teams, and Outlook stores mail, contacts, and calendar in .ost files. All are parseable offline without the app running. "The user says they didn't download the file" is a testable claim: the SQLite downloads table, $MFT, prefetch, and browser cache collectively corroborate or contradict it.

AI caveat

A model is handy for converting a raw Chrome microsecond value or drafting SQL against the browser schema — but it guesses at schema details and sometimes confuses the Chrome epoch with Unix. Always verify a timestamp conversion against a known event before trusting AI-generated SQL.

Learn (~3.5 hrs)

Browser forensics foundations (~1.5 hrs)
- hindsight, GitHub documentation: the tool's README and output field reference; read the Usage and Output sections before the lab.
- Chrome SQLite Schema, Chromium source: the authoritative schema; understand the urls, visits, and downloads tables.

SQLite forensics (~1 hr)
- SQLite WAL mode, official documentation: explains why WAL files contain rows that history-clearing misses; short and essential for understanding forensic recovery.
- DB Browser for SQLite: the GUI tool for exploring SQLite files interactively; useful for understanding schema before scripting queries.

A real web-activity case (~0.5 hr)
- Digital Corpora, Nitroba University Harassment scenario: the canonical teaching case for attributing web activity to a person from a shared IP. Read the scenario brief and slides (~20 min) to see how visited URLs, webmail sessions, and search terms combine into an attribution, the same artifact chain you'll pull from a browser profile in the lab.

Application artifacts (~0.5 hr)
- SANS Forensic Resources, Application Artifacts: curated guides on where major applications (Slack, Teams, Office) store forensically relevant data; use as a reference during analysis.

Key concepts

Chrome stores history, downloads, and searches in SQLite databases in the user profile directory.
Chrome's timestamp epoch is microseconds since 1601-01-01 UTC (not Unix epoch — conversion required).
hindsight reads Chrome profile directories and outputs structured forensic reports including WAL data.
WAL files often contain rows cleared from the main database — forensic gold.
Browser artifacts are redundant: history, DNS cache, downloads, network flows each tell part of the story.
Private/incognito mode does not write to the urls/visits tables — but DNS cache and network flows persist.
Application artifact locations vary by OS; know where Slack, Teams, and Outlook store their data.
The Nitroba scenario is the canonical case for attributing web activity (mailer pages, webmail logins, searches) to a specific person.

AI acceleration

AI is useful for translating Chrome's timestamp format (feed it a raw microsecond value, get a human-readable datetime) and for drafting SQL queries against the browser schema. Where it's not reliable: it will guess at SQLite schema details and sometimes confuse the Chrome epoch with the Unix epoch. Always verify a timestamp conversion against a known event before trusting AI-generated SQL output.

Check yourself

A suspect cleared their browser history. Name two places the visited URLs may still survive, and why.
Your SQL converts a Chrome visit_time to a date about 369 years in the future (somewhere in the 2390s). What did you forget, and what's the fix?
Incognito mode left no rows in urls/visits. What artifacts can still place the user on a given site?
