Module 2 · Lesson 1

GEOINT & SIGINT: AI Transforms Collection

From satellite pixels to finished intelligence — how machine learning reshaped the first link in the chain.

When a satellite photographs an entire country every 24 hours, who — or what — decides what matters?

In 2015, the National Geospatial-Intelligence Agency quietly deployed an automated object-recognition pipeline across its commercial satellite feed. Analysts who had spent years manually tagging aircraft on airfield imagery found that the system could complete in minutes a task that had previously taken a full shift. The program, later described in open NGA technical papers, marked the moment geospatial collection ceased to be the bottleneck; interpretation became the new frontier.

1.1 The Imagery Intelligence Revolution

Geospatial intelligence (GEOINT) encompasses satellite imagery, synthetic-aperture radar, multispectral sensors, and overhead video. For decades, the volume of raw imagery exceeded the human capacity to exploit it. The NRO estimated in public budget justifications that fewer than 10% of collected imagery was ever reviewed by an analyst. The remaining 90% sat in archives, potentially containing critical indicators that went unnoticed.

Deep-learning object-detection models — first demonstrated on public benchmarks like ImageNet (2012) and adapted for overhead imagery by DARPA's VIRAT program — changed that arithmetic. A convolutional neural network trained on labeled overhead imagery can flag military vehicles, construction activity, missile-erector positions, and changes in shipping patterns at a rate orders of magnitude faster than human review.

The commercial sector amplified this shift. Planet Labs, founded in 2010 and operational by 2014, began delivering daily global coverage at 3–5 meter resolution. By 2020 its fleet of Dove satellites provided near-daily revisit of every land mass. The volume of data made AI-assisted triage not optional but structurally necessary.

Documented Case — Xinjiang, 2018–2020

Australian Strategic Policy Institute researchers used Planet Labs imagery and AI-assisted change detection to document the construction of detention facilities in Xinjiang, China. The methodology — automated building-footprint extraction combined with human verification — identified over 380 suspected facilities. The analysis was later corroborated by leaked Chinese government documents (the "China Cables") and became a significant factor in diplomatic responses by the US, UK, and EU. This represents one of the clearest public examples of open-source AI-driven GEOINT producing policy-relevant intelligence.

1.2 Signals Intelligence and Machine Learning

SIGINT — the interception and exploitation of electronic signals — faces its own volume problem. The NSA's XKEYSCORE system, described in documents released by Edward Snowden in 2013, indexed internet traffic metadata at a scale that made manual review impossible. Machine learning addresses this through traffic analysis, voice-print identification, and automatic speech recognition (ASR) applied to collected audio.

NSA's TURBINE program, also described in Snowden documents and reported by The Intercept in 2014, used automated systems to manage implants — malicious software placed on targeted machines — at a scale that no human operator network could sustain. The automation of both collection and the management of collection infrastructure marked a qualitative shift in SIGINT operations.

On the open-source side, researchers at SRI International and IARPA's Babel program demonstrated by 2016 that ASR systems could achieve useful accuracy on low-resource languages such as Tagalog, Swahili, and Pashto within weeks of initial data collection, dramatically reducing the linguistic bottleneck in SIGINT exploitation.

GEOINT: Intelligence derived from the exploitation and analysis of imagery and geospatial information, including satellite and aerial photography and mapping data.

SIGINT: Intelligence gathered from intercepted electronic signals and communications, including communications intelligence (COMINT) and electronic intelligence (ELINT).

Change Detection: An AI technique that compares images of the same area taken at different times to automatically identify structural or activity changes of potential intelligence value.
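As a rough illustration of the change-detection idea defined above, the sketch below compares two small grayscale grids pixel by pixel and reports the share of pixels that changed. The function name `changed_fraction`, the threshold of 30, and the toy 4x4 scene are assumptions made for illustration. An operational system would also need image registration, radiometric normalization, and a learned model rather than a fixed threshold.

```python
def changed_fraction(before, after, threshold=30):
    """Return (fraction_changed, mask) for two equally sized grayscale grids.

    A pixel counts as changed when its absolute brightness difference
    exceeds `threshold` (an illustrative value on a 0-255 scale).
    """
    same_shape = len(before) == len(after) and all(
        len(a) == len(b) for a, b in zip(before, after)
    )
    if not same_shape or not before or not before[0]:
        raise ValueError("images must be non-empty and have identical dimensions")
    mask = [
        [abs(a - b) > threshold for a, b in zip(row_before, row_after)]
        for row_before, row_after in zip(before, after)
    ]
    total = sum(len(row) for row in mask)
    changed = sum(cell for row in mask for cell in row)
    return changed / total, mask


# Toy 4x4 scene: a bright 2x2 "structure" appears in the lower-right corner.
before = [[50] * 4 for _ in range(4)]
after = [row[:] for row in before]
for r in (2, 3):
    for c in (2, 3):
        after[r][c] = 200

fraction, mask = changed_fraction(before, after)
print(f"{fraction:.2%} of pixels changed")
```

The mask, not the single fraction, is what a human verifier would review: it shows where the change sits, which is the step ASPI-style workflows pair with analyst confirmation.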

1.3 Limitations and Failure Modes

AI collection tools carry significant failure modes that intelligence professionals must understand. Distribution shift — the gap between training data and real-world conditions — is acute in overhead imagery. A model trained on US military vehicle signatures may perform poorly on Russian or Chinese equivalents in different terrain and lighting conditions.

The 2003 Iraq WMD assessment illustrates how collection confidence can be mistaken for analytical confidence. While that failure predated modern ML, it established the institutional lesson: the quantity and apparent precision of collection can create false certainty. AI systems that produce confidence scores may lead analysts to over-weight automated findings.
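One way to test whether a confidence score deserves the weight analysts give it is a calibration check: group past automated flags by score and compare each group's stated confidence with how often the flags were later confirmed. The sketch below assumes a hypothetical review history of (score, confirmed) pairs; the function name `calibration_table` and all the numbers are invented for illustration.

```python
from collections import defaultdict


def calibration_table(records, n_bins=10):
    """Bin (score, confirmed) pairs by score and report the observed
    confirmation rate and sample size for each bin.

    Returns {bin_lower_bound: (confirmed_rate, count)}.
    """
    counts = defaultdict(lambda: [0, 0])  # bin index -> [confirmed, total]
    for score, confirmed in records:
        if not 0.0 <= score <= 1.0:
            raise ValueError("scores must lie in [0, 1]")
        idx = min(int(score * n_bins), n_bins - 1)
        counts[idx][0] += int(confirmed)
        counts[idx][1] += 1
    return {
        idx / n_bins: (hits / total, total)
        for idx, (hits, total) in sorted(counts.items())
    }


# Hypothetical history: model score, and whether follow-up collection
# confirmed the flagged activity.
history = [
    (0.91, True), (0.93, False), (0.95, True), (0.92, True),
    (0.97, False), (0.94, True), (0.55, False), (0.58, True),
]
table = calibration_table(history)
for lower, (rate, n) in table.items():
    print(f"scores from {lower:.1f}: {rate:.0%} confirmed over {n} flags")
```

If flags scored above 0.9 are confirmed far less than 90% of the time, the score is overconfident, and a small sample in a bin is itself a reason for caution.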

Adversarial camouflage and deception are active countermeasures. Academic papers from MIT Lincoln Laboratory (2019) and Chinese PLA-affiliated universities have both demonstrated that simple physical modifications to vehicles, such as patterns painted on rooftops, can cause classification errors in commercial detection models. Nation-states aware of these collection capabilities will exploit this weakness.

Key Takeaway

AI dramatically increases the volume of imagery and signals that can be processed, but introduces new failure modes around distribution shift, adversarial manipulation, and false confidence. The intelligence value of automated collection is determined by the quality of training data, the honesty of uncertainty quantification, and the discipline of human oversight at the analytical layer.

Lesson 1 Quiz

GEOINT & SIGINT — test your understanding before the lab.

What structural problem did AI address for the NGA's imagery exploitation pipeline?

Correct. The NRO estimated in public budget justifications that fewer than 10% of collected imagery was ever reviewed by an analyst. AI object-detection systems automated the triage layer, allowing far more of the archive to be exploited.

Not quite. The core problem was analyst bandwidth relative to collection volume — the "collection-exploitation gap." AI addressed the exploitation bottleneck, not resolution or language issues.

The ASPI Xinjiang facility mapping project is significant because it demonstrates:

Correct. ASPI used Planet Labs commercial imagery and AI change-detection, verified by human analysts, to identify over 380 suspected detention facilities. The work influenced diplomatic responses from multiple governments — a clear case of open-source AI-driven intelligence impact.

Incorrect. ASPI used commercial, not classified, imagery. Crucially, the automated findings required human verification — the project did not eliminate analyst review. The significance is the combination of AI scale and human judgment.

What is "distribution shift" as a failure mode for AI in GEOINT?

Correct. Distribution shift is the gap between training-data conditions and operational conditions. A vehicle-detection model trained on Western military hardware in European terrain may fail on different equipment in desert or arctic settings — with no warning signal to the analyst relying on it.

Not correct. Distribution shift is a machine-learning concept: the statistical mismatch between the data a model was trained on and the data it encounters in deployment. This is a major source of silent failure in AI-assisted intelligence.

Lab 1 — AI-Assisted GEOINT Analysis

Conversational lab · minimum 3 exchanges to complete

Scenario: Evaluating an AI Change-Detection Report

You are an all-source analyst at a fictional intelligence fusion cell. Your AI-assisted GEOINT system has flagged unusual construction activity at a facility in a country of concern and assigned it a 91% confidence score. Your supervisor asks you to assess the report before it goes to the senior analyst.

Discuss the methodology, the confidence score's meaning, potential failure modes, and what additional collection you would request — with your AI lab assistant below.

Start by asking: "What does a 91% confidence score from a change-detection model actually tell me about the intelligence value of this report?"

GEOINT Lab Assistant · AI & National Security · M2-L1

Welcome to the GEOINT lab. I'm your analytical training assistant for this exercise. You've received a change-detection report with a 91% confidence score on unusual construction activity. What's your first question about evaluating this finding?

Module 2 · Lesson 2

Biometric Surveillance & Facial Recognition at Scale

From airport checkpoints to urban mass surveillance — how AI biometrics became a tool of both security and control.

When a government can identify any face in any crowd in real time, what changes about the nature of public space?

On January 27, 2020, Russian activist Mikhail Bakhtin attended a protest in Moscow. He wore a balaclava and sunglasses. Within days, he was detained — identified by Moscow's facial-recognition network, which by that date comprised over 100,000 cameras integrated with a database built partly from social media. Russian authorities confirmed the system's role. It was the first publicly documented case of protest-related mass facial-recognition enforcement in a major city, reported by The New York Times on February 5, 2020.

2.1 The Architecture of Biometric Surveillance

Modern AI-powered biometric surveillance combines three capabilities that, individually, existed before deep learning: camera networks, identity databases, and matching algorithms. The 2012–2015 advances in convolutional neural networks reduced facial verification error rates from roughly 20% to under 1% on benchmark datasets, enabling reliable real-time identification in operational environments.

China's Sharp Eyes program (雪亮工程), launched in 2015 and formalized in a 2018 State Council directive, aimed to achieve video surveillance coverage of all public spaces in China by 2020. Vendors including Hikvision, Dahua, and SenseTime supplied hardware and AI platforms. By 2019, China had an estimated 200 million surveillance cameras — approximately one per seven residents — with facial recognition integrated into transportation hubs, residential complexes, schools, and mosques in Xinjiang.

The Xinjiang deployment was documented in detail by researchers at the Australian Strategic Policy Institute, Human Rights Watch (2018 report "Eradicating Ideological Viruses"), and The Intercept. The system was explicitly designed to flag members of an ethnic minority, the Uyghurs, for police attention: a documented instance of AI-assisted ethnic profiling at national scale.

Documented Case — US CBP Biometrics, 2018–Present

US Customs and Border Protection began deploying facial recognition at airport boarding gates in 2018 under the Biometric Entry-Exit program. By 2023 the system processed over 97 million travelers at over 200 airports, with CBP reporting a 97%+ match rate. The Government Accountability Office's 2022 review noted that CBP had not fully assessed privacy risks or demographic accuracy disparities before deployment — a documented governance gap in a democratic context.

2.2 Accuracy, Bias, and the Misidentification Problem

The NIST Face Recognition Vendor Test (FRVT), the authoritative public benchmark, documented in its 2019 report that most commercial facial recognition algorithms showed significantly higher false-positive rates for African-American and Asian faces compared to Caucasian faces — in some cases 10 to 100 times higher. For women and elderly subjects, error rates were also elevated.
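To make the scale of a "10 to 100 times higher" disparity concrete, the sketch below computes false-positive rates for two groups and the ratio between them. The counts are hypothetical and are not NIST data; the group labels and the function name are likewise illustrative.

```python
def false_positive_rate(false_positives, true_negatives):
    """FPR = FP / (FP + TN): the share of non-matching comparisons
    that the algorithm wrongly reports as matches."""
    non_matching = false_positives + true_negatives
    if non_matching == 0:
        raise ValueError("need at least one non-matching comparison")
    return false_positives / non_matching


# Hypothetical counts (NOT NIST data): each group is tested on 100,000
# comparisons between photos of different people.
trials = {
    "group_a": (10, 99_990),   # (false positives, true negatives)
    "group_b": (500, 99_500),
}
rates = {group: false_positive_rate(fp, tn) for group, (fp, tn) in trials.items()}
ratio = rates["group_b"] / rates["group_a"]

for group, rate in rates.items():
    print(f"{group}: FPR = {rate:.4%}")
print(f"group_b is falsely matched about {ratio:.0f}x as often as group_a")
```

Both rates look small in isolation. The disparity only becomes visible in the ratio, and its practical effect grows with the number of searches run against each group.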

These disparities have produced documented wrongful outcomes. Robert Williams, a Black man in Detroit, was arrested in January 2020 after a facial-recognition algorithm misidentified him as a shoplifting suspect. The Detroit Police Department had used a Michigan State Police database and a vendor algorithm. The case was investigated and reported by MIT Technology Review and the ACLU, which filed a formal complaint. Williams's arrest is believed to be the first documented wrongful arrest in the United States caused by facial recognition.

Two additional documented wrongful arrests, those of Michael Oliver (Detroit, 2019) and Nijeer Parks (New Jersey, 2019), showed the same pattern: algorithmic misidentification, insufficient secondary verification, and the arrest of a Black man. Parks was jailed for ten days before charges were dropped.

False Positive Rate: The proportion of non-matches incorrectly identified as matches. In facial recognition, a higher false positive rate means more innocent people flagged as suspects.

FRVT: NIST Face Recognition Vendor Test, the US government's primary independent benchmark for evaluating facial recognition algorithm performance, including accuracy disparities across demographic groups.

2.3 Democratic Governance Responses

In 2019, San Francisco became the first US city to ban government use of facial recognition technology, followed by Oakland, Boston, and Portland (Oregon). The EU's AI Act (2024) classifies real-time facial recognition in public spaces as a "prohibited AI practice" with narrow exceptions for serious crime investigation.

The UK's Information Commissioner's Office fined Clearview AI £7.5 million in 2022 for scraping billions of images from social media without consent to build a facial recognition database marketed to law enforcement. Similar enforcement actions were taken by authorities in France, Italy, Greece, and Australia.

These governance responses reflect a core tension: the same technology that enables efficient border screening and rapid suspect identification also enables mass ethnic surveillance and wrongful arrests. The accuracy disparities documented by NIST mean these harms fall disproportionately on minority populations.

Key Takeaway

AI facial recognition has transitioned from a theoretical concern to an operational reality used by governments across the democratic-authoritarian spectrum. Documented cases establish both its security utility and its capacity for harm — including wrongful arrests in the United States and ethnic surveillance in China. NIST data confirms that accuracy disparities are not hypothetical; they are measurable and unequal across demographic groups.

Lesson 2 Quiz

Biometric Surveillance & Facial Recognition — check your understanding.

What did the 2019 NIST Face Recognition Vendor Test (FRVT) establish about commercial facial recognition algorithms?

Correct. The FRVT 2019 report documented dramatic accuracy disparities across demographic groups, providing an empirical basis for concerns about discriminatory impact in law enforcement applications.

Incorrect. The FRVT 2019 report is the authoritative source here: it documented that most algorithms showed false-positive rates for African-American and Asian faces that were 10 to 100 times higher than for Caucasian faces — a finding with major implications for law enforcement deployment.

The Robert Williams wrongful arrest case (Detroit, 2020) is significant to AI policy because it:

Correct. The Williams case — documented by the ACLU and MIT Technology Review — established a direct causal chain from algorithmic misidentification to wrongful arrest, making abstract accuracy-disparity data concrete and policy-relevant.

Not correct. The Williams case is significant because it documents the real-world harm resulting from accuracy disparities and insufficient secondary verification — the first documented US wrongful arrest linked to facial recognition. There is no federal ban on law enforcement facial recognition in the US.

China's Sharp Eyes program and the Xinjiang facial recognition deployment are relevant to the broader AI governance debate because they demonstrate:

Correct. The documented Xinjiang deployment — confirmed by ASPI, Human Rights Watch, and leaked procurement documents — shows a system explicitly designed to flag members of an ethnic minority for police attention. This is not an accident of deployment but a design choice, representing AI as an instrument of ethnic control.

Incorrect. The Xinjiang case is significant precisely because it documents intentional ethnic targeting at scale — not a byproduct of imperfect accuracy, but a feature of system design. This is qualitatively different from surveillance programs in democracies, where legal constraints and oversight mechanisms, however imperfect, operate.

Lab 2 — Biometric Surveillance Policy Analysis

Conversational lab · minimum 3 exchanges to complete

Scenario: Drafting Biometric Use Policy for a Democratic Government

You are advising a national security committee in a fictional democratic country that is considering deploying facial recognition at major transportation hubs. Opposition lawmakers have raised concerns about accuracy disparities and civil liberties. The security ministry argues the technology is necessary to identify known terrorists at ports of entry.

Work through the policy tradeoffs with your AI assistant: what safeguards are necessary, what evidence thresholds should be required, and how can democratic accountability be maintained?

Start by asking: "What are the minimum safeguards that evidence from documented cases suggests must accompany any government biometric surveillance deployment?"

Biometric Policy Lab Assistant · AI & National Security · M2-L2

Welcome to the biometric policy lab. You're advising on a facial recognition deployment proposal. The security case and the civil liberties concerns are both grounded in documented evidence. Where would you like to begin your analysis?

Module 2 · Lesson 3

Social Media Intelligence & Influence Operations

How AI enables both the detection of coordinated disinformation and its industrialized production.

If a foreign government can simulate millions of authentic citizens online, what does democratic opinion actually measure?

The Internet Research Agency (IRA), a Russian government-linked organization based in St. Petersburg, operated a network of fake social media accounts that reached at least 126 million Americans on Facebook between 2015 and 2017, according to Facebook's testimony to the US Senate Intelligence Committee in October 2017. The operation used coordinated inauthentic behavior — human-operated accounts amplified by automated bots — to exacerbate divisions on immigration, race, and gun rights. Mueller Report Volume I (March 2019) documented the operation in detail, including payroll records, operational pseudonyms, and spending totals exceeding $1.25 million per month at peak.

3.1 The SOCMINT Exploitation Layer

Social Media Intelligence (SOCMINT) refers to the systematic collection and analysis of intelligence from social media platforms. It operates on two tracks simultaneously: as a target (platforms are vectors for foreign influence) and as a collection resource (they yield open-source intelligence about adversary intentions, movements, and networks).

AI tools used in SOCMINT collection include network graph analysis to map account relationships, natural language processing to detect narrative coordination, bot-detection classifiers to identify inauthentic accounts, and sentiment analysis to monitor population attitudes. US Indo-Pacific Command contracted with Graphika — a social network analysis firm — for open-source SOCMINT support beginning in 2017, a relationship that became publicly known through contracting databases.

Twitter's 2018 publication of the IRA dataset — over 10 million tweets from 3,841 identified accounts — established the first large public corpus for training influence-operation detection models. Stanford Internet Observatory researchers used this data to develop and publish detection methodologies that are now standard in the field.

Documented Case — Operation Secondary Infektion

EU DisinfoLab and the Atlantic Council's Digital Forensic Research Lab (DFRLab) documented "Operation Secondary Infektion" in a 2019 report: a Russian influence operation involving over 2,500 fake accounts across 300 platforms in 24 languages that ran for roughly six years (2014–2019). The operation created and promoted fabricated documents attributed to real European politicians. Attribution was confirmed through linguistic analysis of errors characteristic of native Russian speakers in English-language posts and through overlapping infrastructure with known GRU accounts.

3.2 Generative AI and the Industrialization of Disinformation

The IRA's 2016 operation required significant human labor — writers producing content in English, graphic designers creating memes, operators managing accounts. The emergence of large language models fundamentally alters this cost structure. A GPT-class model can generate contextually appropriate, grammatically native-sounding social media content at near-zero marginal cost per post.

The Graphika and Stanford Internet Observatory report "Unheard Voice" (August 2022) documented the first confirmed instance of AI-generated text being used in an influence operation: a network linked to a US public relations contractor that used AI-generated profiles and articles to promote pro-US policy positions. Ironically, this was a US-origin operation — demonstrating that the capability is not limited to adversaries.

OpenAI's May 2024 threat intelligence report documented five influence operations, linked to Russia, China, Iran, and Israel, that had used GPT models to generate social media content, translate materials, and create fake personas. OpenAI terminated the accounts. The report is the first from a major AI developer explicitly documenting its own platform's use in state-linked influence operations.

Coordinated Inauthentic Behavior: Facebook's policy term for the coordinated use of fake accounts to manipulate public discourse, in which accounts hide their true origin to make activity appear organic.

Narrative Laundering: The practice of introducing a fabricated or distorted claim through low-credibility channels, then having it picked up and amplified by legitimate outlets that fail to trace its origin.

3.3 Detection and Counter-Measures

Detection of AI-generated influence content has become a central challenge. Classifiers trained to identify AI text achieve high accuracy on models they were trained against, but show significant accuracy drops on newer or fine-tuned models — a documented "arms race" dynamic. The GROVER model (Allen AI, 2019) demonstrated that the best detector of AI-generated news was another AI trained specifically for the task — but also that GROVER-generated text could fool human readers 73% of the time.

Platform-level countermeasures include provenance labeling (Content Credentials / C2PA standard, now adopted by Adobe, Microsoft, Google, and camera manufacturers), behavioral anomaly detection (identifying accounts that post at inhuman rates or exhibit coordinated timing), and network clustering (identifying communities of accounts with implausible overlap).
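The behavioral signals named above (inhuman posting rates and coordinated timing) can be sketched with two simple checks. Everything here is an illustrative assumption: the function names, the thresholds of 60 posts per hour and 3 shared time buckets, and the toy post log. Real platform systems combine many more signals.

```python
from collections import defaultdict
from itertools import combinations


def flag_high_rate(posts, max_per_hour=60):
    """Flag accounts that exceed `max_per_hour` posts in any clock hour.

    `posts` is a list of (account, unix_timestamp) pairs.
    """
    per_hour = defaultdict(int)
    for account, ts in posts:
        per_hour[(account, ts // 3600)] += 1
    return {account for (account, _), n in per_hour.items() if n > max_per_hour}


def co_posting_pairs(posts, window=10, min_shared=3):
    """Return account pairs that post in the same `window`-second bucket
    at least `min_shared` times.

    Fixed buckets are a simplification: two posts a second apart can fall
    on either side of a bucket boundary and be missed.
    """
    buckets = defaultdict(set)
    for account, ts in posts:
        buckets[ts // window].add(account)
    shared = defaultdict(int)
    for accounts in buckets.values():
        for pair in combinations(sorted(accounts), 2):
            shared[pair] += 1
    return {pair for pair, n in shared.items() if n >= min_shared}


# Toy log: two accounts posting in lockstep, one ordinary user, and one
# account posting 100 times within a single hour.
posts = []
for t in (0, 100, 200, 300):
    posts += [("bot_a", t), ("bot_b", t + 2)]
posts += [("human", 57), ("human", 1234)]
posts += [("spammer", t) for t in range(7200, 10800, 36)]

print("high rate:", flag_high_rate(posts))
print("lockstep pairs:", co_posting_pairs(posts))
```

Neither check proves inauthenticity on its own. News bots and scheduled posts trip the same rules, which is why these signals typically feed human review rather than automatic attribution.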

The key structural problem is an asymmetry: generating convincing content is computationally cheap, while detection requires continuous model development against an evolving target. CISA's 2023 guidance on AI-generated disinformation identifies this asymmetry as the central governance challenge.

Key Takeaway

AI is simultaneously a tool for conducting influence operations at industrial scale and the primary means of detecting them. The documented shift from labor-intensive IRA-style operations to AI-generated content represents a meaningful escalation in the threat landscape. Detection capabilities exist but face a structural disadvantage: generation is cheap, detection is hard, and the arms race dynamic favors offense.

Lesson 3 Quiz

Social Media Intelligence & Influence Operations — check your understanding.

According to Facebook's Senate testimony and the Mueller Report, the Internet Research Agency's operation reached how many Americans on Facebook?

Correct. Facebook's October 2017 Senate testimony stated that IRA content reached at least 126 million Americans between 2015 and 2017 — a figure that underscored the scale of the operation and drove subsequent platform policy changes and regulatory scrutiny.

Incorrect. The number documented in Facebook's Senate testimony is 126 million Americans — approximately 38% of the US population — reached through organic IRA content and paid advertising between 2015 and 2017. This scale was central to the policy response.

OpenAI's May 2024 threat report is significant because it:

Correct. The May 2024 OpenAI report, which documented accounts linked to Russia, China, Iran, and Israel, represented an unprecedented transparency disclosure: a frontier AI company acknowledging that its platform had been actively used by state actors for information warfare, and publishing the details.

Not correct. The report documented confirmed use of GPT models by state-linked actors from Russia, China, Iran, and Israel for influence operations — not just two countries. No 100% detection capability was announced. The significance is the transparency disclosure itself.

What is the "arms race dynamic" in AI-generated disinformation detection?

Correct. The arms race dynamic is the structural asymmetry where generating convincing AI content is cheap and fast, while detection requires constant retraining against new models. CISA's 2023 guidance identifies this asymmetry as a central governance challenge. The GROVER research at Allen AI demonstrated this empirically.

Incorrect. In the context of AI disinformation, the arms race dynamic specifically refers to the cat-and-mouse between detection classifiers and content generation models. Detectors trained on one model's output become outdated when the generator is updated — an inherent structural advantage for offense over defense.

Lab 3 — Influence Operation Attribution

Conversational lab · minimum 3 exchanges to complete

Scenario: Assessing a Suspected AI-Assisted Influence Operation

You are a researcher at a fictional digital forensics organization. A major social media platform has shared a dataset of 2,000 suspended accounts that showed coordinated behavior amplifying divisive narratives in three swing states ahead of an election. Preliminary NLP analysis suggests 40% of the content may be AI-generated. You need to assess attribution confidence and decide what to publish.

Work through the methodology, evidentiary standards, attribution confidence levels, and the risks of premature or delayed disclosure with your AI lab assistant.

Start by asking: "What methodological steps are required before attributing a suspected influence operation to a specific state actor?"

Influence Operations Lab Assistant · AI & National Security · M2-L3

Welcome to the influence operations lab. You're working a potential AI-assisted influence operation dataset ahead of an election. Attribution claims can have significant geopolitical consequences — let's work through the methodology carefully. What's your first analytical question?

Module 2 · Lesson 4

Predictive Analytics & Pattern-of-Life Intelligence

From lethal targeting to crime prediction — how AI's ability to model human behavior became a national security capability and a constitutional challenge.

If an algorithm can predict with 87% accuracy where a person will be tomorrow based on their phone data today, what are the legal and ethical limits on acting on that prediction?

Documents released by Edward Snowden and analyzed by The Intercept in May 2015 revealed that the NSA ran a machine-learning system codenamed SKYNET that analyzed the metadata of Pakistani mobile phone users to identify suspected couriers for Al-Qaeda. The system ingested 55 million phone records, extracted behavioral features (call frequency, SIM swaps, travel patterns, phone-sharing behavior), and produced a ranked list of suspects. Patrick Ball, a data scientist at the Human Rights Data Analysis Group, later reviewed the methodology for Ars Technica and concluded that its false-positive rate was likely high enough to have flagged thousands of innocent people. Ahmad Muaffaq Zaidan, Al Jazeera's Islamabad bureau chief, whose movement patterns resembled the model's target profile, was identified in the leaked slides as the system's highest-scoring subject.

4.1 Pattern-of-Life Analysis at Scale

Pattern-of-life (POL) analysis uses accumulated behavioral data — location histories, communication metadata, financial transactions, social network associations — to construct predictive models of individual behavior. In counterterrorism applications, POL analysis identifies anomalies: individuals whose behavior departs from their established baseline in ways consistent with pre-attack preparation.
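The baseline-and-departure logic described above can be sketched as a z-score check on a single behavioral feature. The feature (daily call counts), the cutoff of 3.0, and the function name are assumptions for illustration. Real POL systems model many features jointly and over longer baselines.

```python
from statistics import mean, stdev


def baseline_z_scores(baseline, observations):
    """Score each new observation by how many baseline standard
    deviations it lies from the baseline mean."""
    if len(baseline) < 2:
        raise ValueError("need at least two baseline values")
    mu = mean(baseline)
    sigma = stdev(baseline)
    if sigma == 0:
        raise ValueError("baseline has no variation; z-scores are undefined")
    return [(x - mu) / sigma for x in observations]


# Hypothetical daily call counts for one subscriber.
baseline_calls = [10, 12, 11, 9, 10, 13, 12]
new_calls = [11, 30]

scores = baseline_z_scores(baseline_calls, new_calls)
flags = [abs(z) > 3.0 for z in scores]  # 3.0 is an illustrative cutoff
print(list(zip(new_calls, flags)))
```

A flag here means only that behavior changed. A wedding, a job change, or a family emergency produces the same statistical signature as the activity the analyst is looking for.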

The SKYNET program represents the earliest publicly documented large-scale application of machine learning to POL analysis for targeting. Its feature set (seven behavioral indicators including travel patterns and SIM card changes) was trained on a labeled dataset of known Al-Qaeda operatives. Patrick Ball's analysis identified a fundamental methodological flaw: the training set was too small and the target population too large, so even a model with a low false-positive rate would flag far more innocent people than genuine couriers.
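The arithmetic behind this flaw can be worked through directly. The sketch below uses the 55 million records cited in the lesson; every other number (2,000 genuine targets, a 50% detection rate, a 0.1% false-positive rate) is a hypothetical chosen for illustration, not a figure from the leaked documents.

```python
def flag_outcomes(population, true_targets, true_positive_rate, false_positive_rate):
    """Return (true_flags, false_flags, precision) for a classifier applied
    to an entire population.

    precision = true_flags / (true_flags + false_flags), i.e. the chance
    that a flagged person is a genuine target.
    """
    non_targets = population - true_targets
    true_flags = true_targets * true_positive_rate
    false_flags = non_targets * false_positive_rate
    precision = true_flags / (true_flags + false_flags)
    return true_flags, false_flags, precision


true_flags, false_flags, precision = flag_outcomes(
    population=55_000_000,      # records ingested, per the lesson text
    true_targets=2_000,         # hypothetical
    true_positive_rate=0.5,     # hypothetical
    false_positive_rate=0.001,  # hypothetical
)
print(f"genuine targets flagged: {true_flags:,.0f}")
print(f"innocent people flagged: {false_flags:,.0f}")
print(f"chance a flagged person is a genuine target: {precision:.1%}")
```

Under these assumptions the innocent flags outnumber the genuine ones by roughly fifty to one, even though the classifier is wrong about any given innocent person only one time in a thousand. Changing the hypothetical inputs shifts the numbers but not the pattern, as long as genuine targets remain a tiny share of the population.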

The drone targeting program in Yemen and Pakistan, documented by the Bureau of Investigative Journalism (2011–2020) and The Intercept's "Drone Papers" (October 2015), relied partly on SIM card and device tracking as targeting data. The Drone Papers included a leaked slide deck from a 2013 Special Operations Command assessment acknowledging that the US had "low confidence" in identifying specific individuals from device metadata alone. Innocent people were killed based on metadata that pointed to a device, not a verified individual.

Documented Case — Palantir & ICE, 2012–Present

Palantir Technologies provided its Gotham platform to US Immigration and Customs Enforcement beginning in 2012. The platform integrated license plate reader data, arrest records, utility data, social media, and phone records to build comprehensive dossiers on individuals and map their social networks. A 2018 Palantir internal document obtained by The Intercept described capabilities including "family tree" mapping and location prediction. A 2021 report by Georgetown Law's Center on Privacy and Technology documented the platform's use in at least 50 ICE field offices. The program represents domestic predictive analytics applied to immigration enforcement at national scale.

4.2 Predictive Policing: Evidence and Controversy

Predictive policing algorithms — including PredPol (now Geolitica), ShotSpotter, and HunchLab — were deployed across dozens of major US cities between 2011 and 2022. These systems predicted either locations (hotspot policing) or individuals (person-based prediction) likely to be involved in future crimes.

A 2021 investigation by the Los Angeles Times and documents obtained by the ACLU found that LAPD's person-based predictive policing system — contracted from Palantir — generated "chronic offender" lists that were disproportionately composed of Black and Latino men, and that being placed on the list generated additional police contact, which generated additional records, which reinforced placement on the list: a documented feedback loop.
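The feedback loop can be shown with a deliberately simple simulation. Two groups behave identically, but the group that starts with slightly more records is placed on the list, receives extra police contact, and accumulates records faster. All names and numbers below are hypothetical. This is a toy model of the mechanism, not a reconstruction of the LAPD system.

```python
def simulate_feedback(rounds=5, base_contacts=10, extra_for_listed=5,
                      initial_records=(12, 10)):
    """Simulate record counts for two groups with identical behavior.

    Each round, the group with more records is "listed" and receives
    `extra_for_listed` additional contacts; every contact adds one record.
    Returns the record counts after each round, starting with the initial state.
    """
    records = list(initial_records)
    history = [tuple(records)]
    for _ in range(rounds):
        listed = 0 if records[0] >= records[1] else 1
        for i in range(2):
            extra = extra_for_listed if i == listed else 0
            records[i] += base_contacts + extra
        history.append(tuple(records))
    return history


history = simulate_feedback()
for step, (a, b) in enumerate(history):
    print(f"round {step}: group A = {a}, group B = {b}, gap = {a - b}")
```

The gap widens every round even though the underlying behavior never differs. The records measure police attention, and the list then treats that measurement as evidence of risk.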

In 2022, Los Angeles and Santa Cruz both terminated their predictive policing contracts following city council votes citing the bias evidence and the feedback loop problem. Chicago terminated its "Strategic Subject List" (SSL), a person-based risk score system, in 2020 after a RAND Corporation evaluation found no evidence that the SSL reduced gun violence and documented racial disparities in scoring.

Pattern-of-Life AnalysisThe systematic collection and modeling of behavioral data over time to establish an individual's normal activity patterns and identify anomalous departures that may indicate threat-relevant behavior.

Base Rate Fallacy: The error of ignoring the rarity of the target event (e.g., actual terrorists in a large population) when interpreting classifier outputs, causing even accurate models to produce mostly false positives in practice.
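The arithmetic behind the base rate fallacy can be worked through in a few lines. The sketch below uses illustrative assumptions rather than measured figures: a population of 55 million, one real target per 100,000 people, and a classifier whose "99% accuracy" is read as 99% sensitivity and 99% specificity. The helper name `screening_outcomes` is hypothetical.

```python
def screening_outcomes(population, prevalence, sensitivity, specificity):
    """Expected true positives, false positives, and precision for a screen."""
    targets = population * prevalence          # people who really are targets
    non_targets = population - targets         # everyone else
    true_positives = targets * sensitivity
    false_positives = non_targets * (1.0 - specificity)
    precision = true_positives / (true_positives + false_positives)
    return true_positives, false_positives, precision


# Illustrative assumptions only (not measured values for any real program).
tp, fp, precision = screening_outcomes(
    population=55_000_000,
    prevalence=1e-5,      # 1 real target per 100,000 people
    sensitivity=0.99,     # share of real targets that get flagged
    specificity=0.99,     # share of non-targets that are correctly cleared
)
print(f"true positives:  {tp:,.0f}")
print(f"false positives: {fp:,.0f}")
print(f"false positives per true positive: {fp / tp:,.0f}")
print(f"share of flagged people who are real targets: {precision:.4%}")
```

Under these assumptions the screen flags roughly 1,000 innocent people for every real target, so fewer than 1 in 1,000 flagged individuals is a true positive despite the 99% headline accuracy.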

4.3 Legal and Constitutional Dimensions

The Fourth Amendment's prohibition on unreasonable searches without probable cause is structurally in tension with predictive analytics. Carpenter v. United States (2018, 5–4 Supreme Court) held that the government's acquisition of seven days or more of cell-site location information constitutes a Fourth Amendment search requiring a warrant. Chief Justice Roberts's majority opinion explicitly addressed the "seismic shifts in digital technology" and the capacity of comprehensive location data to "achieve near perfect surveillance."

The Carpenter ruling did not directly address predictive analytics or AI, but established the doctrinal principle that aggregation of location data creates constitutional concerns distinct from any single data point — a principle that applies directly to POL analysis systems. Academic and civil liberties organizations including the Electronic Frontier Foundation have argued that predictive analytics applied to individuals constitutes a search under Carpenter's logic.

Internationally, the EU AI Act (2024) classifies "AI systems used for real-time remote biometric identification in publicly accessible spaces for the purpose of law enforcement" and "AI systems used for risk assessment of natural persons" in criminal contexts as high-risk systems subject to mandatory conformity assessment, human oversight requirements, and — in the biometric case — near-prohibition.

Key Takeaway

Predictive analytics is among the highest-stakes AI applications in the national security domain because it can produce life-altering or lethal consequences based on probabilistic inference rather than evidence of specific acts. Documented cases, from SKYNET's high false-positive rate to the feedback loop in the LAPD's "chronic offender" lists and the racial disparities in Chicago's SSL scores, establish that these systems can harm innocent people at scale. Carpenter v. United States and the EU AI Act represent the leading edges of legal frameworks still catching up to the technology.

Lesson 4 Quiz

Predictive Analytics & Pattern-of-Life Intelligence — check your understanding.

What was the primary methodological flaw in the NSA's SKYNET program identified by mathematician Patrick Ball?

Correct. Ball's analysis demonstrated the base rate fallacy at work: when the true target population (actual Al-Qaeda operatives) is a tiny fraction of the 55-million-person dataset, even a highly accurate model will produce far more false positives than true positives in absolute terms. This is a fundamental limitation of predictive analytics applied to rare events in large populations.

Incorrect. Ball's critique centered on the base rate fallacy: the training dataset of known operatives was far too small relative to the 55-million-person target population. Even with good accuracy metrics on training data, the mathematical reality of rare-event classification means most positive outputs will be false positives. This is the core problem with predictive analytics applied to counterterrorism.

Chicago's Strategic Subject List (SSL) was terminated in 2020 primarily because:

Correct. The RAND evaluation was critical: it provided empirical evidence that the SSL failed on both its stated objectives (reducing violence) and equity grounds (racial disparities). This dual failure made the system politically and practically indefensible. No court ruling was involved — it was an administrative termination following evaluation evidence.

Incorrect. The SSL was terminated after a RAND Corporation evaluation found it had no demonstrated effect on gun violence and showed racial disparities in scoring. No court struck it down — Chicago's city administration ended the program based on the evaluation evidence. This illustrates how government-commissioned evaluation can function as an accountability mechanism.

What principle did Carpenter v. United States (2018) establish that is directly relevant to AI-assisted predictive analytics?

Correct. Carpenter's doctrinal innovation was the "aggregation principle" — that comprehensive location data is qualitatively different from individual data points in its constitutional significance. This principle, while established in a cell-site location case, applies directly to POL analysis systems that aggregate multiple data streams over extended periods.

Incorrect. Carpenter held that government acquisition of seven or more days of cell-site location data requires a warrant — establishing that comprehensive location data aggregation triggers Fourth Amendment protections. The Court did not prohibit all location collection or directly address AI, but the aggregation principle is directly applicable to POL analysis systems. No blanket prohibition on AI risk scoring was established.

Lab 4 — Predictive Analytics Oversight Design

Conversational lab · minimum 3 exchanges to complete

Scenario: Reviewing a Pattern-of-Life Targeting Request

You are a legal advisor to a fictional government counterterrorism unit. The operations team has submitted a targeting package for a suspected terrorist network coordinator based on 90 days of metadata analysis from a POL system. The algorithmic confidence score is 84%. The suspect is a dual national with potential protected speech activity in the metadata. No direct evidence of a specific attack plan exists.

Work through the legal, ethical, and evidentiary standards required before any action is authorized — including the base rate problem, Carpenter implications, and the difference between predictive probability and evidence of specific intent.

Start by asking: "What is the difference between an 84% algorithmic confidence score and probable cause, and why does that distinction matter here?"

Targeting Oversight Lab Assistant · AI & National Security · M2-L4

Welcome to the targeting oversight lab. You're reviewing a predictive analytics targeting package where the operational team has high confidence in the model output. The legal and evidentiary questions here have life-or-death stakes. Let's work through the analysis carefully. What's your first question?

Module 2 Test — Intelligence & Surveillance

15 questions · 80% required to pass · covers all four lessons

1. What organization's work on Xinjiang detention facilities is considered a landmark case of open-source AI-assisted GEOINT producing policy-relevant intelligence?

Correct. ASPI researchers used Planet Labs commercial imagery and AI change detection to identify over 380 suspected detention facilities, producing work that influenced diplomatic responses from multiple governments.

Incorrect. ASPI — the Australian Strategic Policy Institute — conducted the Xinjiang satellite imagery analysis using Planet Labs commercial imagery and AI-assisted change detection. The work influenced US, UK, and EU diplomatic responses.

2. The NSA's TURBINE program, described in Snowden documents and reported by The Intercept in 2014, automated which capability?

Correct. TURBINE automated the deployment and management of malware implants across targeted systems — a scale of offensive cyber operations that required AI-assisted management because human operators could not oversee thousands of simultaneous implants.

Incorrect. TURBINE was an automated implant-management system for malicious software placed on targeted machines — enabling offensive cyber operations at a scale that no human operator network could manage.

3. Which Supreme Court case established that aggregated location data over time creates Fourth Amendment search concerns, with direct implications for pattern-of-life AI systems?

Correct. Carpenter (2018) held that seven or more days of cell-site location data requires a warrant, establishing the aggregation principle: comprehensive location records are constitutionally different from individual data points. This principle directly applies to POL analytics systems.

Incorrect. Carpenter v. United States (2018) is the controlling case. Chief Justice Roberts's majority opinion addressed the "seismic shifts in digital technology" and held that aggregated location data enables "near perfect surveillance" requiring Fourth Amendment protection.

4. NIST's Face Recognition Vendor Test (2019) found that false-positive rates for African-American faces compared to Caucasian faces were:

Correct. The FRVT 2019 documented 10–100x higher false-positive rates for African-American and Asian faces across most commercial algorithms — the empirical basis for subsequent policy restrictions and wrongful arrest litigation.

Incorrect. NIST's FRVT 2019 documented false-positive rates for African-American and Asian faces that were 10 to 100 times higher than for Caucasian faces in most commercial algorithms — a finding that directly predicts the pattern of documented wrongful arrests.

5. What was the estimated monthly spending of the Internet Research Agency (IRA) at peak operations, as documented in the Mueller Report?

Correct. Mueller Report Volume I documented IRA spending exceeding $1.25 million per month at peak — establishing that this was a well-resourced professional operation, not an ad hoc effort.

Incorrect. The Mueller Report documented peak monthly spending exceeding $1.25 million, indicating a substantial, professional, state-linked operation with significant resources dedicated to the influence campaign.

6. China's Sharp Eyes program (雪亮工程) was formalized by which government document?

Correct. The 2018 State Council directive formalized Sharp Eyes, setting the target of comprehensive video surveillance coverage and integrating AI-powered facial recognition across transportation, residential, and commercial infrastructure.

Incorrect. Sharp Eyes was launched in 2015 and formalized in a 2018 State Council directive that established the target of comprehensive public space surveillance coverage, integrating AI facial recognition at national scale.

7. The "Unheard Voice" report (Graphika and Stanford Internet Observatory, 2022) was significant because it documented:

Correct. "Unheard Voice" documented a network linked to a US contractor using AI-generated profiles to promote pro-US policy positions — the first confirmed AI-text influence operation, and one that highlighted that democracies, not just adversaries, use these capabilities.

Incorrect. "Unheard Voice" documented the first confirmed use of AI-generated text in an influence operation, and it was linked to a US-origin contractor — not a foreign adversary. This was a significant finding because it showed democratic actors using the same techniques attributed to adversaries.

8. The base rate fallacy is critical to understanding predictive analytics in counterterrorism because:

Correct. This is the mathematical reality that made SKYNET problematic: if actual terrorists are 1-in-100,000 in a 55-million-person dataset, there are only about 550 real targets, so even a model that is 99% accurate (99% sensitivity and 99% specificity) will produce approximately 1,000 false positives for every true positive: about 550,000 false alarms against roughly 545 correct detections. The rarity of the target event dominates the false-positive count.

Incorrect. The base rate fallacy means that when looking for rare events (actual terrorists in a civilian population), the absolute number of false positives will vastly exceed true positives even with high model accuracy. This is why Patrick Ball concluded SKYNET's false-positive rate was likely very high despite its apparent technical sophistication.

9. What was Robert Williams's case (Detroit, 2020) the first documented instance of in the United States?

Correct. The Williams case — documented by the ACLU and MIT Technology Review — is the first documented wrongful arrest in the US attributed to facial recognition misidentification, making the NIST accuracy-disparity data concrete and its consequences real.

Incorrect. The Williams case (Detroit, January 2020) is documented as the first US wrongful arrest attributed to facial recognition misidentification. The case established that the accuracy disparities documented by NIST translate into real harm — in this case, the arrest of an innocent Black man.

10. DARPA's BABEL program demonstrated which specific capability relevant to SIGINT exploitation?

Correct. BABEL demonstrated that ASR systems could achieve useful accuracy on low-resource languages — Tagalog, Swahili, Pashto — within weeks, dramatically reducing the time and linguistic expertise required to exploit intercepted audio.

Incorrect. DARPA's BABEL program demonstrated automated speech recognition for low-resource languages, enabling SIGINT agencies to process intercepted audio in less-common languages without requiring years of linguist training — a significant capability expansion.

11. The EU AI Act (2024) classifies real-time facial recognition in public spaces for law enforcement as:

Correct. The EU AI Act classifies real-time remote biometric identification in public spaces for law enforcement as a prohibited practice with only narrow exceptions — one of the most restrictive regulatory approaches to facial recognition among major jurisdictions.

Incorrect. The EU AI Act classifies real-time facial recognition in public spaces for law enforcement as a near-prohibited practice — among the highest risk categories in the Act — with only narrow exceptions for investigating serious crimes. This represents the EU's position that the harms outweigh the routine law enforcement benefits.

12. Operation Secondary Infektion was attributed to Russia primarily through which analytical method?

Correct. The DFRLab and EU DisinfoLab attribution relied on computational linguistics — Russian-language errors embedded in English content — and technical infrastructure overlaps with known GRU-linked operations. This demonstrates how linguistic forensics and infrastructure analysis can support attribution even without direct access to intelligence sources.

Incorrect. The attribution in the 2019 DFRLab/EU DisinfoLab report relied on linguistic forensics (Russian-language typos in English posts revealing native Russian authorship) and infrastructure overlaps with known GRU accounts — techniques available to open-source researchers without classified access.

13. The LAPD predictive policing system's "feedback loop" problem refers to:

Correct. This feedback loop — documented by the LA Times and ACLU — is a structural feature of person-based predictive policing: the prediction increases police attention, which increases the probability of arrest and record creation, which increases the risk score, creating a self-reinforcing cycle that amplifies initial disparate impacts.

Incorrect. The feedback loop is the self-reinforcing cycle: placement on the list → increased police contact → new records → higher risk score → reinforced list placement. This structural problem, documented in the LAPD's Palantir contract, means the system's initial disparities compound over time rather than self-correct.

14. US Customs and Border Protection's Biometric Entry-Exit facial recognition program had processed how many travelers by 2023?

Correct. CBP reported over 97 million travelers processed at 200+ airports by 2023 — one of the largest democratic-government biometric surveillance deployments in the world, operating under authorities that the GAO found had not been fully assessed for privacy and demographic accuracy risks before deployment.

Incorrect. CBP's program processed over 97 million travelers at more than 200 airports by 2023. The GAO's 2022 review flagged that CBP had not fully assessed privacy risks or demographic accuracy disparities before deployment — a governance gap in a democratic context that mirrors the civil liberties concerns raised about authoritarian deployments.

15. OpenAI's May 2024 threat intelligence report documented influence operations using GPT models linked to actors from which countries?

Correct. OpenAI's May 2024 report documented five operations linked to actors in Russia, China, Iran, and Israel (notably including a US ally), underscoring that AI-assisted influence operations are not limited to adversaries of Western democracies.

Incorrect. The May 2024 OpenAI report documented operations linked to Russia, China, Iran, and Israel: four countries, including a close US ally. This breadth was significant: it demonstrated that AI-assisted influence operations are conducted by actors across the geopolitical spectrum, not only US adversaries.