Module 7 · Lesson 1

The Closed Loop: Pentest Findings as Detection Fuel

Every gap a red team exposes is a rule a blue team can write — if they know how to listen.

How do organizations systematically convert offensive findings into durable defensive improvements?

In late 2009, Google's security team discovered that attackers had spent weeks inside its network before any detection fired. Post-incident forensics revealed that the intrusion techniques (spear-phishing, lateral movement via Windows shares, exfiltration over encrypted HTTP) were all known attacker behaviors. No detection rule existed for any of them. The campaign, later named Operation Aurora and reported to have targeted at least 34 companies including Adobe and Intel, became the clearest early demonstration that detection engineering had to be systematically coupled to offensive knowledge, not just reactive to past breaches.

Google's longer-term response included the founding of Project Zero in 2014, a team tasked with finding vulnerabilities and disclosing them so they can be fixed before attackers exploit them. The underlying principle: offensive research must feed back into defensive posture, continuously and methodically.

Why Pentest Findings Decay Without a Feedback Loop

A penetration test produces a report. In most organizations, that report enters a ticketing system, gets triaged by severity, and drives patch cycles. This is valuable — but it misses the detection dimension entirely. The attacker's technique is documented; the defensive rule that would catch it is never written.

Detection engineering feedback is the disciplined process of taking each exploited technique from a pentest and asking: what observable evidence did this leave, and how do we write a detection for it? Without this step, the next attacker who uses the same technique will succeed as thoroughly as the pentesters did.

The MITRE ATT&CK framework, first publicly released in 2015, was built directly to solve this problem. ATT&CK is a living registry of adversary techniques observed in real intrusions, organized so that both red teams and detection engineers share a common vocabulary. When a pentest documents "T1059.001 — PowerShell execution," a detection engineer can immediately query: what SIEM rules cover this technique, and what did the pentest find that bypassed them?

Real Case — NSA TAO Tools Leak, 2016–2017

When the Shadow Brokers published NSA Equation Group tools in 2016–2017, defenders who had conducted internal pentests using similar SMB exploitation techniques were dramatically better positioned. Organizations with mature feedback loops had already written detection rules for the kind of SMB-based lateral movement that EternalBlue enabled. Those without them scrambled for weeks. The gap between the two populations was not technical sophistication. It was whether pentest findings had been systematically converted into detection logic.

The Anatomy of a Detection Engineering Feedback Cycle

A mature feedback loop has four components. First: technique capture. Each pentest finding is tagged to an ATT&CK technique ID. Not just "lateral movement" but specifically T1021.002 (SMB/Windows Admin Shares) or T1021.006 (Windows Remote Management). Precision at the sub-technique level is what enables precise detection.

Second: evidence mapping. For each technique that succeeded, the pentester documents what artifacts were generated — event log IDs, network flow signatures, process tree anomalies, registry keys touched. This is the raw material from which detection rules are constructed. AI tools now assist this step significantly by querying large corpora of documented technique evidence and surfacing relevant telemetry fields.

Third: coverage gap analysis. The detection team audits whether existing rules would have fired on the documented artifacts. If a password spray was used and Windows Event ID 4625 (failed logon) spiked with no alert, that is a documented gap. If Mimikatz ran and no LSASS access alert fired, that is a gap. The list of gaps becomes a prioritized work queue for the detection team.

Fourth: rule authoring and validation. Detection rules are written, tested against the pentest evidence (or in a replay environment), and deployed. Critically, the rules are validated by the red team in a subsequent exercise — confirming they actually fire, and that the attacker cannot trivially bypass them.
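The first three components can be sketched as a small data model. This is a minimal Python illustration under stated assumptions, not a reference implementation: the Finding class, the rule_inventory mapping, and the rule name are hypothetical, and a real program would pull both from a reporting tool and the SIEM.

```python
from dataclasses import dataclass, field


@dataclass
class Finding:
    """One exploited technique from a pentest report (hypothetical schema)."""
    technique_id: str  # ATT&CK ID at sub-technique level, e.g. "T1021.002"
    description: str
    artifacts: list[str] = field(default_factory=list)  # evidence mapping output


def coverage_gaps(findings, rule_inventory):
    """Return findings whose technique has no deployed rule.

    rule_inventory maps technique ID -> list of rule names (assumed format).
    """
    return [f for f in findings if not rule_inventory.get(f.technique_id)]


findings = [
    Finding("T1021.002", "Lateral movement over admin shares",
            ["Security 4624 logon type 3", "Security 5140 share access"]),
    Finding("T1003.001", "LSASS memory read by credential dumper",
            ["Sysmon 10 ProcessAccess targeting lsass.exe"]),
]
rule_inventory = {"T1021.002": ["smb_admin_share_lateral_movement"]}

gaps = coverage_gaps(findings, rule_inventory)
for gap in gaps:
    print(f"GAP {gap.technique_id}: {gap.description}")
```

The returned list is the work queue described in the third component; the artifacts attached to each finding are what the rule author starts from in the fourth.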

AI Augmentation Point

Large language models accelerate evidence mapping substantially. Given a technique ID and a target environment (Windows Server 2022, Splunk SIEM), an AI assistant can enumerate likely telemetry sources, draft Sigma rule skeletons, and flag known bypass patterns that might blind a naive first-draft rule. The human engineer provides context and judgment; the AI compresses research time from hours to minutes.

Key Terms

Detection Engineering: The discipline of designing, building, validating, and maintaining detection rules and logic within security monitoring systems.

ATT&CK Technique ID: MITRE ATT&CK's unique identifier for a specific adversary technique (e.g., T1059.001), providing a shared vocabulary for red and blue teams.

Coverage Gap: A technique or evidence artifact for which no current detection rule exists or would fire under real attacker conditions.

Sigma Rule: A generic, vendor-agnostic detection rule format that can be converted to queries for Splunk, Elastic, Chronicle, and other SIEMs.

Lesson 1 Quiz

Operation Aurora demonstrated which critical gap in enterprise security programs?

Correct. Aurora attackers used spear-phishing, SMB lateral movement, and encrypted HTTP exfiltration — all known technique categories. The problem was absent detection coverage, not unknown techniques. This drove Google's shift toward systematic offensive-defensive feedback.

Not quite. The Aurora techniques were documented attacker behaviors. The failure was the absence of detection rules for those behaviors, not novelty of the attack.

In the four-component detection engineering feedback cycle, what is the purpose of "evidence mapping"?

Correct. Evidence mapping is the step where pentester-documented artifacts become the raw material for detection rule authoring. Without precise artifact documentation, detection engineers are writing rules against hypothetical signals rather than observed reality.

That describes other reporting functions. Evidence mapping specifically catalogs the observable artifacts each technique produced — the inputs a detection rule would need to fire.

What advantage did organizations with mature pentest feedback loops have when the Shadow Brokers published NSA tools in 2016–2017?

Correct. The differentiator was pre-existing detection logic derived from previous red team exercises that used SMB exploitation. The feedback loop had already produced the rules needed; organizations without it had to write those rules under active threat.

The advantage was specifically in detection readiness (existing detection rules for the technique), not patching or intelligence.

Lab 1 — Detection Coverage Gap Analysis

Use AI to map pentest findings to ATT&CK techniques and identify detection gaps

Scenario

Your red team has completed an engagement and documented three successful attack paths: (1) credential spray against VPN using usernames from LinkedIn, (2) lateral movement via PsExec to three servers, (3) data staged to a cloud storage bucket via rclone. Your SIEM is Splunk with Windows event forwarding and Zeek network logs.

Use the AI assistant to identify ATT&CK technique IDs for each path, enumerate the specific evidence artifacts each technique should generate, and draft coverage gap questions to bring to your detection team.

Start by describing the first attack path and asking the assistant to identify the ATT&CK technique ID and list the telemetry sources and event IDs that would capture evidence of it in a Splunk + Zeek environment.

Module 7 · Lesson 2

Writing Detection Rules with AI Assistance

From pentest artifact to Sigma rule in minutes — and the judgment calls that keep it there.

What does effective AI-assisted Sigma rule generation actually look like, and what can go wrong?

FIN7, a group whose tooling and membership overlap with the Carbanak operation, is associated with the theft of over $1 billion from financial institutions across more than 40 countries between 2014 and 2018. Their tradecraft was methodical: spear-phishing with weaponized Office documents, PowerShell for post-exploitation, BITS jobs for persistence, and slow-and-low exfiltration over months. Every one of these techniques had documented detection paths. The U.S. Department of Justice's 2018 indictments described FIN7's phishing and malware tradecraft, and techniques such as PowerShell invocation and scheduled task creation leave event log traces that a properly tuned SIEM can catch.

The gap was not detection capability — most victim organizations had Splunk or comparable tools. The gap was that no one had written the rules. Detection engineering was treated as a project rather than a continuous discipline. FIN7 exploited the absence of systematic rule development for four years.

What Sigma Is and Why It Matters

Sigma is an open-source detection rule format maintained by the SigmaHQ project. A Sigma rule is YAML-structured, vendor-agnostic, and can be compiled into native queries for Splunk SPL, Elastic DSL, Microsoft Sentinel KQL, Chronicle YARA-L, and a dozen other platforms. Write once, deploy everywhere — with platform-specific tuning.

A Sigma rule has four essential components: title and description (human-readable context), logsource (the category of log the rule applies to — e.g., Windows process creation, network connection), detection logic (the field-value conditions that trigger the rule), and falsepositives (known benign scenarios that match the rule). The last component is critical and frequently underdeveloped, leading to alert fatigue that causes SOC analysts to disable rules entirely.
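As a rough illustration of those components, the sketch below mirrors a Sigma rule's top-level keys in a Python dict (real Sigma rules are YAML; a dict is used here only to avoid a third-party parser). The rule content and the underdeveloped helper are illustrative assumptions, not part of the Sigma tooling.

```python
# A Sigma-style rule expressed as a dict with the same top-level keys.
rule = {
    "title": "Encoded PowerShell Spawned by cmd.exe",
    "description": "Illustrative rule built from a pentest artifact.",
    "logsource": {"product": "windows", "category": "process_creation"},
    "detection": {
        "selection": {
            "Image|endswith": "\\powershell.exe",
            "ParentImage|endswith": "\\cmd.exe",
            "CommandLine|contains": "-EncodedCommand",
        },
        "condition": "selection",
    },
    "falsepositives": [],  # deliberately empty to show the check below
}

COMPONENTS = ("title", "description", "logsource", "detection", "falsepositives")


def underdeveloped(rule):
    """List the components named in the lesson that are missing or empty."""
    return [name for name in COMPONENTS if not rule.get(name)]


print(underdeveloped(rule))
```

A check like this could run in review before a rule is merged, so that an empty falsepositives section is caught before it reaches analysts.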

Real Case — Elastic Security Labs, Ongoing

Elastic Security Labs publishes detection rules derived directly from malware analysis and incident response. Their process mirrors what AI-assisted detection engineering enables: given observed process behavior (e.g., a parent process spawning cmd.exe with encoded arguments), analysts enumerate relevant fields (CommandLine, ParentImage, EncodedCommand), draft detection logic, and enumerate false-positive scenarios. Since 2021, Elastic has open-sourced over 1,400 rules via GitHub, all following this structure. The documented false-positive reasoning in these rules — often two or three sentences — is where the most engineering judgment lives.

The AI-Assisted Rule Writing Workflow

Given a pentest artifact — say, a PowerShell invocation observed during the engagement with the exact command line documented — an AI assistant can draft a Sigma rule skeleton in under a minute. The workflow proceeds as follows.

Input: the artifact. Provide the AI with the exact command string, process tree, or network flow observed. "cmd.exe /c powershell.exe -EncodedCommand [base64 string] -ExecutionPolicy Bypass -NonInteractive" is sufficient context for a useful first draft.

AI output: rule skeleton. The model will propose a logsource (Windows process_creation), detection fields (CommandLine contains '-EncodedCommand', Image endswith 'powershell.exe', ParentImage endswith 'cmd.exe'), and a condition. It will often suggest relevant Sysmon Event IDs (Event ID 1 for process creation) and Windows Security Event IDs (4688 if process audit logging is enabled).

Human step: false positive analysis. This is where the engineer's environment knowledge is irreplaceable. Does your organization use SCCM, which invokes PowerShell with encoded commands for software deployment? Does your monitoring stack itself use PowerShell tasks? The AI can enumerate common false positive scenarios from public knowledge, but only the engineer knows whether those scenarios occur in this environment.

Human step: threshold and tuning. A rule that fires on every encoded PowerShell invocation will generate thousands of alerts daily in a large enterprise. The engineer adds conditions: parent process is not SCCM's ccmexec.exe, execution time is outside maintenance windows, or the invocation includes additional suspicious flags like -WindowStyle Hidden.
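The workflow above can be sketched as a tiny matcher: a selection drawn from the artifact plus an exclusion added during tuning. This is a simplified stand-in for what a SIEM does after Sigma conversion, assuming Sysmon-style field names and case-insensitive matching; the SCCM exclusion is an environment-specific assumption, and the helper names are hypothetical.

```python
def field_matches(value, op, pattern):
    """Case-insensitive contains/endswith, the two operators used in this lesson."""
    value, pattern = value.lower(), pattern.lower()
    if op == "contains":
        return pattern in value
    if op == "endswith":
        return value.endswith(pattern)
    raise ValueError(f"unsupported operator: {op}")


def all_match(event, conditions):
    """True when every (field, operator, pattern) condition matches the event."""
    return all(field_matches(event.get(f, ""), op, pat) for f, op, pat in conditions)


SELECTION = [
    ("Image", "endswith", "\\powershell.exe"),
    ("CommandLine", "contains", "-EncodedCommand"),
]
# Tuning step: exclude the SCCM client as a parent process.
FILTER_SCCM = [("ParentImage", "endswith", "\\ccmexec.exe")]


def rule_fires(event):
    """Equivalent of the Sigma condition 'selection and not filter_sccm'."""
    return all_match(event, SELECTION) and not all_match(event, FILTER_SCCM)


pentest_event = {
    "Image": "C:\\Windows\\System32\\WindowsPowerShell\\v1.0\\powershell.exe",
    "ParentImage": "C:\\Windows\\System32\\cmd.exe",
    "CommandLine": "powershell.exe -EncodedCommand SQBFAFgA -ExecutionPolicy Bypass",
}
sccm_event = dict(pentest_event, ParentImage="C:\\Windows\\CCM\\CcmExec.exe")

print(rule_fires(pentest_event), rule_fires(sccm_event))
```

Keeping the exclusion as a named filter, rather than folding it into the selection, makes each tuning decision visible and easy to revisit when the environment changes.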

Practical Constraint

AI-generated Sigma rules consistently require human review of three things: (1) field name accuracy for the specific SIEM platform — Splunk's field names differ from Elastic's; (2) false positive reasoning specific to the target environment; and (3) rule condition logic for edge cases (case sensitivity, substring vs. exact match). Treating AI output as a first draft, not a finished product, is the operational norm among mature detection teams.

Validating Rules Against Pentest Evidence

A detection rule is hypothetical until it fires on real evidence. The gold standard for validation is replay testing: executing the exact pentest technique in a staging environment and confirming the rule fires. Teams that lack a replay environment can validate against the pentest's collected log data if that data was preserved and ingested into the SIEM during or after the engagement.

Atomic Red Team, an open-source project maintained by Red Canary, provides scripted, individual technique executions mapped to ATT&CK IDs. After a pentest identifies T1059.001 as an exploited technique, a detection engineer can run the corresponding Atomic tests (for example, the Mimikatz test or an encoded-command test) and observe whether the drafted Sigma rule fires. This closes the loop: pentest finding → artifact documentation → rule draft → validation test → deployed rule.
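A replay check can be as simple as running the drafted rule over events captured during the test. The sketch below assumes the capture is available as JSON Lines with Sysmon-style fields; the log content, draft_rule, and validate are hypothetical stand-ins for the team's own export and rule.

```python
import json

# Hypothetical replay capture: one JSON event per line. A raw string keeps
# the JSON backslash escapes unchanged.
REPLAY_LOG = r"""
{"EventID": 1, "Image": "C:\\Windows\\System32\\whoami.exe", "CommandLine": "whoami /all"}
{"EventID": 1, "Image": "C:\\Windows\\System32\\WindowsPowerShell\\v1.0\\powershell.exe", "CommandLine": "powershell.exe -EncodedCommand SQBFAFgA"}
"""


def draft_rule(event):
    """Stand-in for the drafted rule: encoded PowerShell process creation."""
    return (event.get("EventID") == 1
            and event.get("Image", "").lower().endswith("\\powershell.exe")
            and "-encodedcommand" in event.get("CommandLine", "").lower())


def validate(rule, log_text):
    """Return the replayed events on which the rule fired."""
    events = [json.loads(line) for line in log_text.strip().splitlines()]
    return [e for e in events if rule(e)]


hits = validate(draft_rule, REPLAY_LOG)
print("VALIDATED" if hits else "RULE DID NOT FIRE", len(hits))
```

An empty result is the important case: it means the rule is still hypothetical and should not be counted as coverage.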

Key Terms

Sigma: A vendor-agnostic YAML-based detection rule format convertible to native queries for Splunk, Elastic, Sentinel, Chronicle, and other SIEMs.

Logsource: The Sigma field that specifies which category of log data a rule applies to (e.g., Windows process_creation, network connection, authentication).

Atomic Red Team: MITRE-aligned open-source library of small, single-technique test scripts for validating detection coverage against specific ATT&CK technique IDs.

False Positive Rate: The frequency with which a detection rule fires on benign activity; excessive false positives cause SOC fatigue and rule suppression.

Lesson 2 Quiz

The FIN7 / Carbanak case demonstrates that the primary detection failure was:

Correct. Most FIN7 victims had Splunk or equivalent. Every technique described in the 2018 Department of Justice indictments leaves observable traces. The failure was treating detection engineering as a project rather than a continuous discipline: rules for known techniques simply were not written.

Technology availability was not the issue. Victims had capable SIEMs. The gap was absent detection rules for techniques whose artifacts were well-documented.

Which component of a Sigma rule is most frequently underdeveloped and most likely to cause SOC analysts to disable the rule entirely?

Correct. Underdeveloped false positive documentation leads to high false positive rates, which overwhelm analysts and create pressure to suppress or disable the rule. The Elastic Security Labs rules are notable precisely because their false positive reasoning is specific and actionable.

While all components matter, the falsepositives section is the most commonly neglected and the most directly responsible for alert fatigue that causes rules to be disabled.

When an AI assistant drafts a Sigma rule from a pentest artifact, which step requires irreplaceable human environment knowledge?

Correct. The AI can enumerate common false positive patterns from public knowledge. Only the engineer knows whether SCCM, custom monitoring scripts, or other legitimate tools in this specific environment generate the same artifacts the rule targets.

ATT&CK mapping, Event ID knowledge, and YAML formatting are all areas where AI assistance is effective. The irreplaceable human contribution is environment-specific false positive reasoning.

Lab 2 — AI-Assisted Sigma Rule Drafting

Draft, critique, and refine Sigma rules from pentest artifacts

Scenario

During a recent engagement, your team observed: a PowerShell process spawning from WScript.exe executing an encoded command, followed by a network connection to a non-standard port (8443) on an external IP. Your SIEM is Microsoft Sentinel with Sysmon event forwarding. Process creation data includes CommandLine, ParentImage, and ParentCommandLine fields.

Use the AI assistant to draft a Sigma rule for the PowerShell spawning behavior, then work through false positive analysis, threshold tuning, and a second rule for the anomalous outbound connection.

Ask the assistant to draft a Sigma rule for PowerShell spawned from WScript.exe with an encoded command argument. Then push back on its false positive section and ask it to enumerate environment-specific questions you should answer before deploying.

Module 7 · Lesson 3

Purple Teaming: Closing the Validation Loop

Detection rules that have never been tested against a live attacker have never been tested at all.

How do structured purple team exercises ensure that defensive improvements from pentests actually hold under adversary conditions?

The SolarWinds Orion compromise, carried out by the group FireEye tracked as UNC2452 (later attributed to APT29 / Cozy Bear) and discovered in December 2020, delivered a trojanized update to approximately 18,000 organizations, including multiple US federal agencies; a smaller subset was actively exploited. The attackers maintained access for nine months, and in some environments over a year, before detection. FireEye, which discovered the intrusion, had conducted internal red team exercises and had detection logic for many common techniques. What SolarWinds exposed was the gap in supply-chain-specific detection: legitimate software update mechanisms as delivery vehicles, SAML token forgery for cloud identity abuse, and living-off-the-land binaries that blended with expected behavior.

The subsequent response included CISA's Emergency Directive 21-01 and a wave of purple team exercises specifically targeting supply chain and identity attack paths. Organizations that had conducted structured validation exercises, confirming their SIEM rules actually fired on forged SAML tokens (the Golden SAML technique) and anomalous service principal activity, detected the compromise artifacts faster when the forensic data was reviewed retroactively. Those without validated detection rules found nothing in their logs, even in retrospect.

What Purple Teaming Is and Is Not

A purple team exercise is not a red team engagement where the blue team watches. It is a structured, collaborative session where red team operators execute specific techniques while detection engineers observe, in real time, whether their detection logic fires. The red team's job in a purple team is to execute techniques with full transparency — sharing exact command strings, execution timing, and expected artifacts — so that detection validation is unambiguous.

The contrast with a traditional red team engagement is significant. In a red team, the measure of success is whether the blue team detects the attackers. In a purple team, the measure of success is whether the detection rules work. These are different objectives. Purple teaming is fundamentally a detection engineering validation mechanism, not an assessment of blue team capability.

Real Case — CISA SILENTSHIELD Assessments

CISA's SILENTSHIELD program, operational since 2022, provides no-cost red team assessments to federal civilian executive branch (FCEB) organizations. A distinguishing feature is the post-engagement purple team phase: CISA operators work directly with agency detection teams to replay each successful technique and confirm whether detection rules would have fired. CISA's 2024 red team assessment report documented that in one assessed organization, 15 of 17 successful techniques generated no SIEM alerts, not because the logs were absent but because no rules existed for them. The purple team phase resulted in 15 new detection rules written and validated within two weeks of the engagement.

AI in Purple Team Exercises

AI assistance has changed three aspects of purple team execution. First: technique variant generation. When a detection rule successfully catches a baseline technique execution, the red team needs to test variants — encoded commands, alternate execution paths, LOLBin substitutions. AI tools can rapidly enumerate known evasion variants for a given technique, providing the red team with a systematic test battery rather than ad-hoc improvisation.

Second: real-time rule refinement. When a rule fails to fire, the detection engineer needs to understand why. Was the field name wrong? Did the attacker's command use a substring the rule's condition missed? AI assistants can analyze the gap between the rule's condition logic and the actual observed artifact, proposing targeted amendments in real time. This compresses the iterate-and-retest cycle from hours to minutes.

Third: documentation automation. Purple team exercises produce significant documentation: which techniques were tested, which rules fired, which were tuned, which new rules were written. AI tools can generate structured exercise reports from session notes, mapping each technique to ATT&CK, documenting rule changes, and producing the before/after coverage assessment that justifies the exercise investment.
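The rule refinement step described above often begins with a per-condition diagnosis of why a rule stayed silent. The following sketch assumes rules expressed as (field, operator, pattern) triples with case-insensitive matching; the diagnose helper and the renamed process_path field are hypothetical.

```python
def diagnose(event, conditions):
    """Explain, per condition, why a rule did or did not match an event.

    conditions: list of (field, operator, pattern); operators are
    'contains' and 'endswith', both case-insensitive.
    """
    report = []
    for field_name, op, pattern in conditions:
        if field_name not in event:
            report.append((field_name, "field missing from event"))
            continue
        value = str(event[field_name]).lower()
        if op == "contains":
            ok = pattern.lower() in value
        else:
            ok = value.endswith(pattern.lower())
        report.append((field_name, "matched" if ok else f"no {op} match for {pattern!r}"))
    return report


conditions = [
    ("Image", "endswith", "\\powershell.exe"),
    ("CommandLine", "contains", "-EncodedCommand"),
]
# Observed artifact: the operator used the abbreviated -enc flag, and the log
# pipeline delivered the process path under a different field name.
observed = {
    "process_path": "C:\\Windows\\System32\\WindowsPowerShell\\v1.0\\powershell.exe",
    "CommandLine": "powershell.exe -enc SQBFAFgA",
}

for field_name, verdict in diagnose(observed, conditions):
    print(f"{field_name}: {verdict}")
```

The two verdicts point at different fixes: a missing field is a pipeline or mapping problem, while a failed substring match is a rule logic problem.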

VECTR and Structured Tracking

VECTR (from Security Risk Advisors, open-source since 2017) is a widely used platform for tracking purple team results. Each technique execution is logged as a test case: ATT&CK technique ID, exact execution parameters, expected artifact, actual alert outcome (fired / not fired / fired with wrong context). AI assistants can import VECTR export data and produce prioritized gap remediation roadmaps, ranking un-detected techniques by ATT&CK prevalence data and threat intelligence relevance to the organization's sector.
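The ranking step can be sketched without any particular tool. The record layout below is a hypothetical schema, not VECTR's actual export format, and the prevalence weights are placeholder numbers a team would replace with its own threat intelligence.

```python
from dataclasses import dataclass


@dataclass
class ExecutionRecord:
    """One purple team test case (hypothetical schema)."""
    technique_id: str
    command: str
    outcome: str  # "fired", "not_fired", or "wrong_context"


def remediation_queue(records, prevalence):
    """Rank records that did not cleanly fire, most prevalent technique first.

    prevalence maps technique ID -> weight (placeholder values).
    """
    open_records = [r for r in records if r.outcome != "fired"]
    return sorted(open_records,
                  key=lambda r: prevalence.get(r.technique_id, 0),
                  reverse=True)


records = [
    ExecutionRecord("T1059.001", "powershell.exe -enc <b64>", "not_fired"),
    ExecutionRecord("T1021.002", "psexec to <host>", "fired"),
    ExecutionRecord("T1567.002", "rclone copy <src> <bucket>", "wrong_context"),
]
prevalence = {"T1059.001": 0.9, "T1567.002": 0.4}

for record in remediation_queue(records, prevalence):
    print(record.technique_id, record.outcome)
```

Treating "fired with wrong context" as an open item, as done here, is a design choice: an alert that misleads the analyst still needs rule work.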

Building a Continuous Purple Team Program

One-time purple team exercises have limited value if detection rules decay — systems change, log sources are modified, rule conditions break silently when field names change across SIEM upgrades. Mature programs schedule quarterly purple team exercises targeting high-priority ATT&CK technique families, with AI-assisted tracking of coverage trend over time.
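Silent breakage from renamed fields can be caught between exercises with a simple presence check. The sketch below assumes a sample of current events is available as dicts; broken_fields and the renamed parent_image field are hypothetical.

```python
def broken_fields(rule_fields, sample_events):
    """Return rule fields that never appear in a sample of current events.

    A rule that references a field no event carries can never fire,
    which is how renamed fields break rules without raising any error.
    """
    present = set()
    for event in sample_events:
        present.update(event.keys())
    return sorted(set(rule_fields) - present)


rule_fields = ["Image", "CommandLine", "ParentImage"]
# Hypothetical post-upgrade sample in which the pipeline renamed ParentImage.
sample_events = [
    {"Image": "C:\\Windows\\System32\\cmd.exe",
     "CommandLine": "cmd /c dir",
     "parent_image": "C:\\Windows\\explorer.exe"},
]

print(broken_fields(rule_fields, sample_events))
```

A non-empty result does not prove the rule is broken (the sample may simply lack that event type), so it is best treated as a prompt for re-validation rather than an automatic failure.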

The FBI and CISA's Joint Cybersecurity Advisories, released roughly monthly, document techniques observed in current campaigns. Each advisory is a prioritized purple team input: run Atomic Red Team tests for every technique in the advisory, confirm detection coverage, file gaps as remediation tickets. AI assistants can parse advisory text, extract technique IDs, and generate the corresponding Atomic test execution plan automatically — reducing advisory-to-validation time from weeks to days.
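Extracting technique IDs from advisory text does not strictly need an AI model; a regular expression covers the common case. The sample text below is invented for illustration and is not quoted from any real advisory.

```python
import re

# ATT&CK technique IDs: "T" + four digits, optional ".NNN" sub-technique suffix.
TECHNIQUE_ID = re.compile(r"\bT\d{4}(?:\.\d{3})?\b")


def extract_techniques(advisory_text):
    """Return unique technique IDs in first-seen order."""
    return list(dict.fromkeys(TECHNIQUE_ID.findall(advisory_text)))


sample = (
    "Actors used PowerShell [T1059.001] and SMB admin shares (T1021.002). "
    "Exfiltration to cloud storage (T1567.002) followed; T1059.001 was reused."
)

print(extract_techniques(sample))
```

The resulting list can drive the test plan: one or more Atomic Red Team executions per extracted ID, with any technique lacking a firing rule filed as a gap.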

Key Terms

Purple Team Exercise: A structured, collaborative session where red team operators execute specific techniques transparently while detection engineers validate whether detection rules fire in real time.

SILENTSHIELD: CISA's no-cost adversary simulation program for federal civilian executive branch (FCEB) organizations, including a post-engagement purple team validation phase.

VECTR: Open-source purple team tracking platform for logging technique executions, alert outcomes, and coverage trends across exercises.

Technique Variant: An alternate execution path for the same ATT&CK technique that may evade detection rules tuned only for the baseline execution pattern.

Lesson 3 Quiz

What distinguishes the primary objective of a purple team exercise from that of a traditional red team engagement?

Correct. The fundamental difference is transparency and objective. Purple teams operate with full coordination to validate detection logic. Red teams operate covertly to measure blue team detection capability. These are complementary but distinct activities.

The key distinction is the objective and operational mode. Purple teams are transparent and validate detection rules; red teams are covert and measure detection capability.

The SolarWinds intrusion remained undetected for nine months in most environments primarily because:

Correct. The post-mortem analysis showed logs were available — even retroactively, organizations with validated detection rules found the artifacts. Those without detection coverage for supply-chain and cloud identity attack paths found nothing in their logs despite having collected the data.

Telemetry was being collected. The SolarWinds case demonstrated that having log data and having detection rules for what that data contains are two entirely separate things.

How do AI tools specifically accelerate purple team technique variant testing?

Correct. When a baseline detection rule fires, the real test is whether it holds against known evasion variants. AI can systematically enumerate LOLBin substitutions, encoding variations, and alternate execution paths — converting ad-hoc red team improvisation into structured variant coverage.

AI accelerates variant enumeration — systematically cataloging documented evasion paths for a technique so the red team's test coverage is comprehensive rather than improvised.

Lab 3 — Purple Team Variant Analysis

Use AI to enumerate technique variants and structure purple team test batteries

Scenario

Your team has drafted a Sigma rule for T1059.001 (PowerShell) that detects the baseline encoded command pattern (-EncodedCommand flag). Before deploying, you want to run a purple team validation covering known bypass variants. You also need to prepare a purple team test battery for T1055.001 (Process Injection: Dynamic-link Library Injection) following a recent engagement where this technique was used successfully.

Use the AI assistant to enumerate PowerShell detection bypass variants, identify which Sigma rule conditions would catch each variant and which would not, and generate a structured VECTR-compatible test case list for both techniques.

Start by asking the assistant to enumerate documented detection bypass variants for PowerShell encoded command execution. For each variant, ask whether your current rule condition (CommandLine contains '-EncodedCommand') would fire.

Module 7 · Lesson 4

From One-Time Exercise to Continuous Improvement

Detection coverage is not a milestone. It is a rate of change that must outpace the adversary's adaptation.

How do organizations build the processes, metrics, and tooling that sustain detection engineering improvement between pentests?

Mandiant's annual M-Trends report tracks attacker dwell time, the number of days an attacker remains in a network before detection, and reports its median each year. For 2011, the median dwell time was 416 days. By 2022, the global median had fallen to 16 days. The organizations driving this improvement shared a common characteristic: they had converted detection engineering from a project activity into an operational function with continuous improvement metrics, regular cadence, and feedback integration from both incident response and offensive exercises.

The organizations that remained at multi-hundred-day dwell times were not technologically unsophisticated. They had SIEMs, they had threat intelligence feeds, they conducted annual pentests. What they lacked was the closed feedback loop: pentest findings not converted to rules, IR findings not converted to rules, threat intel not converted to rules. Their detection posture was static while attackers adapted.

The Detection Engineering Maturity Model

The Detection Engineering Maturity Model (DEMM) used in this lesson defines five maturity levels. Level 1: reactive, with rules written only after incidents. Level 2: pentest-driven, with rules written from engagement findings but without systematic coverage tracking. Level 3: coverage-aware, with ATT&CK Navigator heatmaps maintained and coverage gaps tracked as work items. Level 4: continuously validated, with purple team exercises on a defined cadence, rule quality metrics tracked, and coverage trend measured over time. Level 5: threat-informed, with coverage priorities driven by current adversary activity, TI feeds integrated into the detection backlog, and AI-assisted technique variant enumeration as standard practice.

Most organizations that conduct annual pentests operate at Level 2. The jump to Level 3 requires one process change: converting the pentest report's ATT&CK technique list into ATT&CK Navigator annotations, then auditing current detection coverage against those annotations. AI tools can automate this annotation step in minutes.

Real Case — Microsoft DART Team Practices

Microsoft's Detection and Response Team (DART), which responds to hundreds of major incidents annually, publishes its detection logic in the Microsoft Sentinel GitHub repository. The team's documented practice includes post-incident rule authoring for every novel technique observed, structured ATT&CK coverage reviews after each major campaign (Hafnium Exchange exploitation in 2021, DEV-0537 / LAPSUS$ in 2022, Midnight Blizzard in 2023–2024), and AI-assisted parsing of incident telemetry to identify artifact patterns that existing rules missed. Their published detection rules average 4–6 new rules per major campaign, typically within 72 hours of campaign attribution.

Metrics That Drive Continuous Improvement

Detection programs that improve over time measure two core metrics. ATT&CK coverage percentage: the fraction of technique IDs in the organization's threat model for which at least one validated detection rule exists. This metric, tracked monthly, shows whether the program is keeping pace with technique documentation. A static or declining coverage percentage despite ongoing investment signals that rules are decaying (broken by system changes) as fast as new ones are written.

Mean time to detect (MTTD) for known techniques: measured during purple team exercises and tracked over time. If the baseline PowerShell detection rule fired in 40 seconds in Q1 and fires in 40 seconds in Q3, the rule is stable. If it stopped firing — often discovered only during the Q3 exercise — something changed in the environment (Sysmon version update changed field names, log forwarding pipeline dropped events) and the gap went unnoticed.

AI tools add a third emerging metric: variant coverage ratio. For each ATT&CK technique with a detection rule, what fraction of its documented execution variants does the rule catch? A rule that catches the baseline but misses five of seven documented variants has low variant coverage. AI can maintain this ratio automatically by querying technique documentation and assessing rule conditions against the variant list.
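The first and third metrics reduce to simple ratios. The sketch below is illustrative only: the threat model, rule counts, and variant list are made-up inputs, and the baseline rule is the plain substring match on the full flag name. The variants rely on PowerShell accepting unambiguous abbreviations of -EncodedCommand; "<b64>" is a placeholder payload and the list is not exhaustive.

```python
def coverage_percentage(threat_model, validated_rules):
    """Percent of threat-model techniques with at least one validated rule.

    validated_rules maps technique ID -> count of validated rules (assumed format).
    """
    techniques = set(threat_model)
    if not techniques:
        return 0.0
    covered = {t for t in techniques if validated_rules.get(t, 0) > 0}
    return 100.0 * len(covered) / len(techniques)


def variant_coverage(rule, variants):
    """Fraction of variant command lines the rule catches."""
    return sum(1 for v in variants if rule(v)) / len(variants)


def baseline_rule(command_line):
    """Case-insensitive substring match on the full flag name."""
    return "-encodedcommand" in command_line.lower()


threat_model = ["T1059.001", "T1021.002", "T1003.001", "T1567.002"]
validated_rules = {"T1059.001": 2, "T1021.002": 1, "T1003.001": 0}

variants = [
    "powershell.exe -EncodedCommand <b64>",
    "powershell.exe -encodedcommand <b64>",
    "powershell.exe -enc <b64>",
    "powershell.exe -ec <b64>",
    "powershell.exe -e <b64>",
]

print(f"coverage: {coverage_percentage(threat_model, validated_rules):.1f}%")
print(f"variant coverage: {variant_coverage(baseline_rule, variants):.2f}")
```

With these inputs the technique counts as covered in the first metric while the rule catches only two of five variants, which is exactly the blind spot the variant coverage ratio is meant to expose.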

Threat Intelligence Integration

CISA and FBI Joint Cybersecurity Advisories, ISAC threat reports, and vendor-published campaign analyses all contain ATT&CK technique lists for current adversary groups. An AI assistant can parse these documents, extract technique IDs, query the organization's ATT&CK Navigator coverage map, and produce a prioritized gap list in minutes. This converts threat intelligence from a reading exercise into a detection backlog input — a process that previously required a full-time analyst working for days.

Building the Feedback Process

The complete continuous improvement process has five steps, each with an AI acceleration point.
Step 1 (Technique identification): ATT&CK tagging of pentest findings, IR findings, and threat intel. AI parses reports and extracts technique IDs.
Step 2 (Coverage audit): Navigator annotation + gap list. AI compares technique list to current rule inventory.
Step 3 (Rule authoring): Sigma drafts from artifact documentation. AI generates first drafts for human review.
Step 4 (Validation): Atomic Red Team execution + purple team confirmation. AI enumerates variants for test battery.
Step 5 (Rule maintenance): Scheduled re-validation after system changes. AI flags rules whose logsource conditions may have broken due to environment changes.
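The rule maintenance step can be partly approximated with a static check: compare the fields each rule references against the fields the log pipeline currently produces. The sketch below assumes rules are already loaded as Sigma-like Python dictionaries and that a list of available field names can be exported from the SIEM; both inputs and the renamed field are hypothetical.

```python
def referenced_fields(rule):
    """Collect field names used in a Sigma-like rule's detection selections.

    Only selections written as a mapping of field -> value are handled;
    list-style selections are skipped in this simplified sketch.
    """
    fields = set()
    for name, selection in rule["detection"].items():
        if name == "condition" or not isinstance(selection, dict):
            continue
        for key in selection:
            # Drop value modifiers such as "|contains" or "|endswith".
            fields.add(key.split("|", 1)[0])
    return fields


def flag_broken_rules(rules, available_fields):
    """Map rule title -> sorted fields that no longer exist in the schema."""
    available = set(available_fields)
    flagged = {}
    for rule in rules:
        missing = referenced_fields(rule) - available
        if missing:
            flagged[rule["title"]] = sorted(missing)
    return flagged


# Hypothetical rules and a post-upgrade schema where "Image" was renamed.
rules = [
    {
        "title": "Encoded PowerShell",
        "detection": {
            "selection": {
                "Image|endswith": "\\powershell.exe",
                "CommandLine|contains": "-enc",
            },
            "condition": "selection",
        },
    },
    {
        "title": "Admin share access",
        "detection": {
            "selection": {"ShareName|endswith": "$"},
            "condition": "selection",
        },
    },
]
available_fields = {"process_path", "CommandLine", "ShareName"}

flagged = flag_broken_rules(rules, available_fields)
print(flagged)  # {'Encoded PowerShell': ['Image']}
```

A check like this only catches schema drift. It cannot confirm that a rule still fires, so it complements replay testing in Step 4 rather than replacing it.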

Organizations at Detection Engineering Maturity Level 4 run this cycle quarterly for high-priority techniques and annually for their full ATT&CK coverage map. AI tooling at each step compresses the total cycle time from months to weeks — making the difference between a program that keeps pace with adversary adaptation and one that perpetually lags.

Key Terms

Detection Engineering Maturity Model: MITRE's five-level framework describing progression from reactive rule-writing to threat-informed continuous detection improvement.

ATT&CK Coverage Percentage: The fraction of technique IDs in an organization's threat model for which at least one validated detection rule exists; primary metric for tracking detection program health.

Variant Coverage Ratio: The fraction of documented execution variants for a given ATT&CK technique that an existing detection rule would successfully detect.

Dwell Time: The median number of days an attacker remains in a network before detection; the primary outcome metric that detection engineering improvement programs seek to reduce.

Lesson 4 Quiz

From One-Time Exercise to Continuous Improvement

According to Mandiant M-Trends data, what did organizations that drove median dwell time from 416 days (2012) to 16 days (2022) have in common?

Correct. The M-Trends data consistently shows the differentiator is the closed feedback loop — pentest, IR, and TI findings systematically converted to detection rules — not technology investment alone. Organizations with static detection postures remained at high dwell times regardless of tool sophistication.

The M-Trends differentiator is the operational discipline of detection engineering — the feedback loop — not any specific technology choice or external advantage.

A detection program has an ATT&CK coverage percentage that remains static at 45% despite ongoing investment in new rule authoring. What does this most likely indicate?

Correct. Static coverage despite active investment is the hallmark of rule decay without re-validation. Sysmon version updates change field names; log forwarding pipelines drop sources; SIEM upgrades alter query syntax. Without scheduled validation, rules break silently, and each new rule merely offsets one that quietly stopped firing, so the coverage percentage stays flat.

Static coverage despite active investment most commonly signals silent rule decay — not a ceiling, framework growth, or platform limits.

What is the "variant coverage ratio" metric and why does it represent a more sophisticated view of detection quality than ATT&CK coverage percentage alone?

Correct. A high ATT&CK coverage percentage can mask shallow coverage — rules that catch the textbook execution but miss the five or six documented evasion variants real attackers use. Variant coverage ratio exposes this gap, showing whether detection logic is robust or brittle.

Variant coverage ratio addresses the depth problem: a rule can technically "cover" a technique while catching only the most naive execution pattern. The ratio measures robustness against documented real-world variants.

Lab 4 — Detection Program Health Assessment

Use AI to build a continuous improvement roadmap from ATT&CK coverage data

Scenario

You are the detection engineering lead for a mid-sized financial services firm. Your ATT&CK Navigator heatmap shows 38% technique coverage. A recent pentest identified 9 ATT&CK techniques across the Initial Access, Execution, Persistence, and Lateral Movement tactics. A new CISA advisory has just been released for a threat group targeting financial sector firms, listing 12 ATT&CK technique IDs, 7 of which overlap with your pentest findings. Your SIEM was upgraded last quarter, and the field names for two Sysmon event IDs changed.

Use the AI assistant to build a prioritized remediation roadmap: which gaps to close first, what rule re-validation is needed post-SIEM upgrade, and how to structure a 90-day sprint to improve coverage from 38% to 55%.

Describe your situation to the assistant and ask it to help you prioritize the 7 overlapping technique gaps, estimate detection rule authoring effort, and structure a 90-day sprint plan with milestones.
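Before prompting the assistant, it can help to pin down the scenario's arithmetic. The sketch below works out how many distinct techniques the pentest and advisory name together, and how many additional techniques must gain a validated rule to reach the target. The threat model size of 100 techniques is a hypothetical figure; the scenario does not state one.

```python
# Figures taken from the scenario.
pentest_count = 9
advisory_count = 12
overlap_count = 7

# Inclusion-exclusion: techniques named by either source.
distinct_techniques = pentest_count + advisory_count - overlap_count


def techniques_to_cover(model_size, current_pct, target_pct):
    """Additional techniques needing a validated rule to reach target_pct.

    Integer arithmetic avoids floating-point rounding surprises; the
    target count is rounded up so the target percentage is actually met.
    """
    current = current_pct * model_size // 100
    target = (target_pct * model_size + 99) // 100
    return target - current


# Hypothetical threat model size; substitute your own.
needed = techniques_to_cover(model_size=100, current_pct=38, target_pct=55)
print(distinct_techniques, needed)  # 14 17
```

Under that assumption, closing every one of the 14 distinct techniques would still leave the program short of 55%, and the real shortfall depends on how many of those techniques are already covered, which the scenario leaves open.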

AI Lab Assistant Detection Program Improvement Roadmap

Ready to help you build your detection improvement roadmap. Share your current coverage situation, the technique gaps from your pentest and the CISA advisory, and your environment details, and I'll help you prioritize and structure a sprint plan.

Module 7 — Module Test

Detection Engineering Feedback · 15 questions · Pass at 80%

1. Which real-world event most directly demonstrated that systematic offensive-defensive feedback loops were necessary for enterprise security?

Correct. Aurora's core lesson — known techniques, no detection rules — drove the formalization of detection engineering feedback as a discipline, including Google's founding of Project Zero.

Operation Aurora is the canonical case for detection engineering feedback gaps — known techniques left undetected by absent rules.

2. MITRE ATT&CK was built specifically to solve which problem in the detection engineering feedback process?

Correct. ATT&CK's foundational value is the shared vocabulary — "T1059.001" means the same thing to a red teamer and a detection engineer, enabling precise mapping from offensive finding to detection requirement.

ATT&CK's core contribution is the shared vocabulary connecting offensive techniques to defensive detection requirements — enabling the feedback loop.

3. In the four-component detection engineering feedback cycle, what happens at the "coverage gap analysis" step?

Correct. Coverage gap analysis is the audit step: comparing what the attacker did (and what artifacts it left) against what detection rules currently exist. The output is a prioritized gap list — the detection engineering work queue.

Coverage gap analysis is the audit between artifact documentation and rule authoring — determining which artifacts have no current detection rule.

4. What is the primary advantage of the Sigma rule format for detection engineering?

Correct. Write-once, deploy-everywhere via SIEM-specific compilation is Sigma's core value. The SigmaHQ project maintains converter tooling for all major platforms.

Sigma's primary value is platform agnosticism — one rule definition compiles to native queries for any supported SIEM.

5. The FIN7 / Carbanak campaign succeeded for four years primarily because:

Correct. The FBI indictments documented observable artifacts for every FIN7 technique. Victims had capable SIEMs. The gap was the absence of systematic detection rule development for known, observable technique signatures.

The FBI case documentation shows every technique had observable signatures in event logs. The failure was absent detection rules for documented behaviors.

6. Which component of a Sigma rule is most critical for preventing alert fatigue and SOC rule suppression?

Correct. Underdeveloped false positive documentation leads to high false positive rates and SOC fatigue, which ultimately results in rules being suppressed or disabled entirely — eliminating the detection coverage they were meant to provide.

The falsepositives section is the most commonly neglected and most operationally consequential component — it determines whether the rule is usable in production.

7. What is the gold standard method for validating that a newly written detection rule will actually fire under attacker conditions?

Correct. Replay testing is the only way to confirm a rule fires on real artifacts rather than hypothetical signals. Atomic Red Team provides the standardized execution scripts for this validation step.

Logic review and syntax validation are necessary but insufficient. Only replay testing against real technique execution confirms the rule fires on actual artifacts.

8. How does a purple team exercise fundamentally differ from a traditional red team engagement?

Correct. The fundamental difference is transparency and objective. Purple teams are collaborative detection validation exercises; red teams are covert capability assessments. Both are valuable; they answer different questions.

The defining difference is operational mode (transparent vs. covert) and objective (rule validation vs. detection capability measurement).

9. The SolarWinds / UNC2452 intrusion highlighted which specific detection engineering gap?

Correct. SolarWinds demonstrated that detection gaps are technique-specific. Organizations had logs; they lacked rules for supply-chain and cloud identity attack paths. Those with validated detection rules for these patterns found the artifacts retroactively.

The SolarWinds gap was technique-specific: absent detection rules for supply-chain delivery, SAML forgery, and cloud identity abuse — despite telemetry being collected.

10. CISA's SILENTSHIELD assessment program includes a purple team phase that documented which result at one assessed organization?

Correct. CISA's 2024 published assessment result — 15 of 17 techniques with no SIEM alert, despite logs being present — is a stark illustration of the gap between log collection and detection coverage.

CISA documented 15 of 17 techniques generating no alerts despite present log data — a critical illustration that detection coverage and log collection are not the same thing.

11. What does ATT&CK Navigator provide that makes it the standard tool for tracking detection coverage percentage?

Correct. ATT&CK Navigator's heatmap annotation capability is its core utility — making coverage gaps visually obvious and enabling coverage percentage calculation across the technique matrix.

ATT&CK Navigator provides visual heatmap annotation of the technique matrix — making coverage gaps visible and enabling coverage percentage tracking over time.

12. Which Detection Engineering Maturity Model level describes programs where coverage priorities are driven by current adversary activity with TI feeds integrated into the detection backlog?

Correct. Level 5 (Threat-informed) is the highest maturity level, characterized by real-time integration of threat intelligence into detection priorities, including AI-assisted technique variant enumeration as standard practice.

Level 5 — Threat-informed — is defined by TI-driven coverage priorities and AI-assisted variant coverage. Levels 3 and 4 cover coverage tracking and continuous validation respectively.

13. According to Mandiant M-Trends data, what was the median attacker dwell time in organizations with mature internal detection capabilities by 2022?

Correct. From 416 days in 2012 to 16 days in 2022 for organizations with internal detection programs: a 26-fold improvement driven by operational detection engineering disciplines, not just technology investment.

Mandiant M-Trends reported a 16-day median dwell time for 2022 among organizations with mature internal detection capabilities, down from 416 days in 2012.

14. What problem does the "variant coverage ratio" metric address that ATT&CK coverage percentage alone cannot?

Correct. High ATT&CK coverage percentage can mask brittle rules that catch only naive baseline executions. Variant coverage ratio exposes depth — whether the rule holds against the 5–7 documented evasion variants real attackers use.

Variant coverage ratio measures detection depth — whether rules catch real-world execution variants or just the textbook pattern — addressing a gap that coverage percentage cannot reveal.

15. When an AI assistant parses a CISA Joint Cybersecurity Advisory to support detection engineering, what is the specific output that converts the advisory from a reading exercise into an actionable detection backlog?

Correct. AI advisory parsing produces the specific, actionable output: a technique-level gap list cross-referenced against current coverage — converting intelligence documents into detection engineering work items without requiring days of manual analyst effort.

The actionable output is the technique-gap intersection: advisory ATT&CK IDs compared to current coverage, surfacing specific missing rules as detection engineering backlog items.