Module 6 · Lesson 1

The Risk Landscape: What's Actually Happening

Not every AI warning is Skynet. Not every reassurance is honest either.
How do you figure out which AI risks deserve your attention when every headline is screaming?

Maya is a junior at UC San Diego, double-majoring in psychology and data science. She's been watching the AI discourse online for two years now, and she's genuinely confused. Last week, her professor assigned a reading from an AI safety org claiming we're five years from catastrophic superintelligence. The week before, a TechCrunch article quoted an Anthropic engineer saying current models are nowhere near general intelligence. Her LinkedIn feed has a founder promising AI will solve climate change by 2028. Her Reddit feed has someone insisting AI is already manipulating elections on a massive scale.

She has an internship interview at a health tech startup next month. They use AI in their diagnostic pipeline. She's trying to figure out: what risks are real enough to actually matter to her life and career right now?

Why the AI Risk Conversation Is a Mess

The AI risk conversation is genuinely difficult to navigate because it spans a huge range of timescales, probability levels, and affected parties — and most media treats all of it as equally urgent or equally dismissible depending on the outlet's angle. This isn't just noise you should tune out. It's a signal that you need a framework, not more hot takes.

There are roughly three categories of AI risk that get discussed: near-term harms (things happening right now — bias in hiring systems, misinformation tools, surveillance), medium-term structural risks (labor displacement, concentration of power, erosion of institutional trust), and long-term speculative risks (AI systems pursuing goals humans didn't intend, potential loss of human oversight). These categories require completely different analytical tools and different levels of personal urgency.

The mistake most people make — including a lot of smart people — is mixing all three together and then either panicking about all of them or dismissing all of them. Neither is useful. Maya's actual problem isn't that she doesn't know enough about AI risk. It's that she hasn't sorted the risks by time horizon and personal proximity.

Why This Matters To You Right Now

If you're entering a job market, building projects, or deciding which industries to pursue, you're making real bets based on your implicit AI risk model. Getting that model right has immediate career and financial consequences — not just philosophical ones.

The Two Failure Modes Everyone Around You Is Falling Into

Look at how your peers are processing AI risk. Most fall into one of two failure modes, and both are getting people into trouble.

Failure Mode 1: Techno-panic. This is the person who's convinced AI will take every job by 2027, that deepfakes have already made truth impossible, and that we're months from some kind of authoritarian AI surveillance state. They share every alarming article without verifying the underlying claims. The problem isn't that these risks are all imaginary; some are real. The problem is that undifferentiated panic produces bad decisions: avoiding tech fields entirely, refusing to use AI tools that would keep them competitive, or burning energy on speculative risks while ignoring proximate ones.

Failure Mode 2: Techno-dismissal. This is the person who thinks all AI risk talk is sci-fi cope from people who don't understand how the technology actually works. "It's just autocomplete." "The safety researchers are just grifting for funding." This position also produces bad decisions: not thinking about how AI systems can fail in ways that affect you, not asking hard questions about the AI tools embedded in systems that govern your life, and being caught flat-footed when real harms materialize.

The honest position is uncomfortable: some risks are real and proximate, some are speculative and distant, and the hard work is figuring out which is which.

A Framework for Sorting Risk From Noise

Here's a simple framework you can actually use. When you encounter an AI risk claim, run it through three filters:

Filter 1: Is there evidence of current harm, or is this a prediction? Claims about bias in hiring algorithms causing documented discrimination right now are evidence-based. Claims that AI will develop misaligned goals and dominate humanity by 2030 are predictions. Both are worth understanding, but they warrant different urgency levels.

Filter 2: Who is making the claim and what incentive do they have? AI safety organizations have funding incentives tied to perceived urgency. Tech companies have incentives to minimize risks. Journalists have incentives to alarm. None of these incentives means the claim is wrong, but they should shape how much weight you give it until it is corroborated.

Filter 3: Does this risk affect systems you're actually inside? If you're applying for jobs, AI screening systems that discriminate are a real and proximate risk. If you're a patient using health apps, algorithmic misdiagnosis is proximate. The speculative risk that a future AI might pursue dangerous self-preservation goals is worth thinking about but not proximate enough to drive immediate decisions.
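If it helps to see the three filters as a repeatable checklist, here is a minimal Python sketch. The class name, field names, and labels are all hypothetical choices made for illustration; the lesson itself does not prescribe any scoring rules, so treat this as one possible way to write your answers down, not as the framework's official form.

```python
from dataclasses import dataclass


@dataclass
class RiskClaim:
    """One AI risk claim, answered against the three filters (illustrative)."""
    text: str
    documented_current_harm: bool      # Filter 1: evidence now, or a prediction?
    source_has_incentive: bool         # Filter 2: does the source gain from the framing?
    independently_corroborated: bool   # Filter 2: has anyone else confirmed it?
    affects_systems_you_are_in: bool   # Filter 3: personal proximity


def triage(claim: RiskClaim) -> str:
    """Return a rough label. The rules below are assumptions, not lesson content."""
    if claim.documented_current_harm and claim.affects_systems_you_are_in:
        label = "real and proximate"
    elif claim.documented_current_harm:
        label = "real but distant"
    else:
        label = "speculative"
    # An incentive does not make a claim wrong; it means you want a second source.
    if claim.source_has_incentive and not claim.independently_corroborated:
        label += " (needs corroboration)"
    return label


hiring_bias = RiskClaim(
    text="AI resume screeners have produced documented discrimination",
    documented_current_harm=True,
    source_has_incentive=False,
    independently_corroborated=True,
    affects_systems_you_are_in=True,
)
takeover = RiskClaim(
    text="AI will dominate humanity by 2030",
    documented_current_harm=False,
    source_has_incentive=True,
    independently_corroborated=False,
    affects_systems_you_are_in=False,
)

print(triage(hiring_bias))  # real and proximate
print(triage(takeover))     # speculative (needs corroboration)
```

The value is less in the labels than in being forced to answer each filter separately before reaching a verdict.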

Real & Proximate

Algorithmic bias in hiring, lending, and criminal justice. Misinformation tools lowering the cost of deception. AI-generated content eroding trust in media. Job displacement in specific sectors.

Speculative / Overstated

Imminent AGI takeover. AI achieving consciousness in current systems. Total job elimination within 5 years. AI systems secretly coordinating against humans today.

Practical Takeaway

Before your next internship interview, hiring process, or loan application — ask whether the company or institution uses AI in their decision pipeline and what their bias auditing process looks like. That's a question about a real, proximate risk you're personally inside. You're allowed to ask it. It signals analytical sophistication, not paranoia.

The Calibration Problem

One of the most honest things to acknowledge: calibrating AI risk is genuinely hard, even for experts. The field doesn't have decades of empirical data to draw on. AI systems are evolving faster than our ability to study their effects rigorously. Researchers disagree in good faith about both probabilities and timelines.

What this means for you isn't paralysis — it means holding your risk assessments with appropriate uncertainty. The goal isn't to become an AI doom evangelist or an AI booster. The goal is to be someone who understands enough to ask better questions and update when evidence changes.

Maya's situation is actually a good one to be in. She's interviewing in health tech, which means she could soon be working inside AI systems that affect real patients. That makes near-term bias risks and reliability risks directly relevant to her work. She should care about those deeply. The speculative superintelligence debate is interesting background context, but it's not what she needs to prepare for next month's interview.

Lesson 1 Quiz

The Risk Landscape · 5 questions
1. Maya's professor assigned a reading predicting catastrophic superintelligence in five years. Which filter from the lesson is most relevant for evaluating this claim?
Exactly. The five-year superintelligence prediction is a speculative forecast, not documented current harm. Filter 1 — distinguishing evidence-based risk from prediction — is the right starting tool. That doesn't mean the claim is wrong, just that it warrants different urgency than, say, documented hiring discrimination.
Reasonable instinct, but the most fundamental issue is that this is a future prediction, not current documented harm. Filter 1 directly addresses that. The others may be relevant secondary checks, but they don't address the core epistemological problem first.
2. Which best describes the "techno-dismissal" failure mode described in the lesson?
Right. Techno-dismissal isn't just about tone — it has a concrete consequence: you stop questioning the AI systems you're already inside. That's the dangerous part, not just having an optimistic vibe.
That describes techno-panic, not dismissal. The two failure modes are mirror images — one over-weights every alarm, the other dismisses all of them.
3. According to the lesson, what is the main reason sorting AI risks by time horizon matters practically?
Correct. The point isn't a ranking — it's that near-term, medium-term, and long-term risks require genuinely different analytical approaches and produce different decisions. Collapsing them all together is what leads to both failure modes.
The lesson explicitly avoids ranking near-term vs. long-term as inherently more important. The key insight is that they require different tools and different urgency levels — not a fixed hierarchy.
4. You're a 21-year-old applying for entry-level finance jobs. You read that AI will cause catastrophic economic collapse within three years. You also read that algorithmic credit scoring systems have documented racial bias. Which should drive your immediate job-seeking behavior more, and why?
Yes. The collapse prediction is speculative; the bias documentation is empirical. Filter 3 — does this affect systems you're inside — strongly favors the credit scoring issue. You can apply for finance jobs while being alert to documented bias in the industry. You can't do much today about a speculative collapse.
Severity matters, but only alongside probability and proximity. A small chance of collapse (say, 3%) is less actionable than a documented pattern of bias in systems you're actively inside. The framework asks you to weigh evidence and proximity, not just severity.
5. Why does the lesson describe calibrating AI risk as "genuinely hard, even for experts"?
Exactly. The honest acknowledgment is that uncertainty is structural, not just a communication problem. That's why the lesson encourages holding assessments with appropriate uncertainty and updating when evidence changes — rather than finding a confident guru to follow.
The lesson doesn't make that claim. The difficulty is empirical: fast-moving technology, limited longitudinal data, and genuine expert disagreement in good faith. That's different from saying understanding is impossible or that all experts are compromised.

Lab 1: The Risk Audit

Apply the three-filter framework to a real AI claim you've encountered

Your Role: Risk Analyst

You've been hired as a risk analyst intern at a policy think tank. Your first task: evaluate AI risk claims circulating in the media and categorize them using the three-filter framework from Lesson 1. The senior analyst (your AI partner here) will push back on your reasoning and help you sharpen your assessments.

Don't just summarize — take a position. Say whether a claim is proximate or speculative, evidence-based or predictive, and explain your reasoning. The analyst will challenge you.

Start by describing an AI risk claim you've seen recently — from social media, a class, a news article, or a conversation. Then run it through the three filters and give your verdict. The analyst will respond with questions or pushback.
Risk Analysis Lab
AI Analyst: Hey. I'm your senior analyst for this exercise. Give me an AI risk claim you've actually encountered (it doesn't matter where) and run it through the three filters. I want your actual reasoning, not a hedge. What's the claim, and what's your verdict on it?
Module 6 · Lesson 2

Bias, Discrimination, and Algorithmic Harm

The AI risks you're most likely to experience personally aren't the ones getting the movie deals.
When an algorithm makes a decision about you, do you know what recourse you have?

DeShawn applied to twelve jobs his junior year of college. Three sent automated rejections within four minutes of submission — faster than any human could read a cover letter. Two told him they used AI-powered resume screening. He never got a callback from any of the twelve. His roommate, with a comparable GPA and similar experience but a different name and a different-looking work history, got three callbacks the same week.

DeShawn did some digging. He found a 2023 audit showing that one of the platforms the companies used — HireVue — had been accused of discriminating against candidates whose facial expressions and voice patterns differed from its training data. The FTC had issued guidance on AI hiring discrimination. None of the rejection emails mentioned AI screening. None offered any path to contest the decision.

Algorithmic Discrimination Is Happening Now

This is the category of AI risk that's least cinematic and most consequential to your day-to-day life right now. Algorithmic systems trained on historical data encode historical biases. When those systems are deployed in hiring, lending, housing, healthcare, and criminal justice — which they are, at scale, right now — the discrimination gets automated and obscured.

In 2023, the Equal Employment Opportunity Commission (EEOC) issued explicit guidance confirming that AI-powered hiring tools can violate Title VII of the Civil Rights Act if they produce discriminatory outcomes. The same year, the Consumer Financial Protection Bureau (CFPB) warned that algorithmic credit decisions must still be explainable and contestable. This is not speculation. These agencies are responding to documented harm patterns.

The mechanism matters: algorithmic discrimination often doesn't require anyone to have discriminatory intent. A hiring tool trained on resumes from a company's past successful hires will encode whatever demographic patterns existed in that cohort. If the company historically hired mostly white men from specific schools, the algorithm learns to favor signals correlated with that group — even if race and gender aren't explicit inputs.
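To make that mechanism concrete, here is a toy Python sketch. Everything in it is hypothetical: the schools, the groups "X" and "Y", the cutoff, and the scoring rule, which is a deliberately crude stand-in for a trained model. The point it illustrates is that the scoring function only ever sees the school, yet the two groups end up advancing at different rates.

```python
from collections import Counter

# Hypothetical past hires as (school, group) pairs. "group" stands in for a
# protected attribute and is never passed to the scoring function.
past_hires = [("Ivy A", "X")] * 6 + [("Ivy B", "X")] * 3 + [("State U", "Y")] * 1

school_counts = Counter(school for school, _ in past_hires)
total_hires = len(past_hires)


def score(school: str) -> float:
    """Share of past hires who attended this school (the model's only input)."""
    return school_counts[school] / total_hires


# Hypothetical applicants, assumed to be equally qualified.
applicants = [
    ("Ivy A", "X"), ("Ivy B", "X"), ("State U", "X"),
    ("State U", "Y"), ("State U", "Y"), ("Ivy A", "Y"),
]

THRESHOLD = 0.2  # assumed cutoff for advancing to an interview


def advance_rate(group: str) -> float:
    """Fraction of a group's applicants whose school score clears the cutoff."""
    schools = [school for school, g in applicants if g == group]
    advanced = [s for s in schools if score(s) >= THRESHOLD]
    return len(advanced) / len(schools)


print(round(advance_rate("X"), 2))  # 0.67
print(round(advance_rate("Y"), 2))  # 0.33
```

Group membership never appears in `score`, but because school and group were correlated among past hires, the outcome gap appears anyway.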

Proxy discrimination: When an algorithm uses variables that aren't protected characteristics (like zip code, name format, or graduation year) but that closely correlate with them, producing discriminatory outcomes without explicitly using race, gender, or other protected attributes.
Disparate impact: A legal standard under which a system can be discriminatory even if the intent was neutral, if its outcomes disproportionately disadvantage a protected group. U.S. civil rights law covers this; algorithmic systems are not exempt.

Where You're Actually Inside These Systems

Let's be specific, because abstraction lets people off the hook. Here are the AI decision systems most likely to affect people your age right now:

Hiring and job screening. Résumé screening tools, automated video interviews analyzed by facial expression and speech pattern software, and chatbot-first application processes. You have essentially zero visibility into how these score you.

Credit and lending. If you've applied for a credit card, student loan, or apartment in the last three years, an algorithmic credit model likely scored you. Thin credit files — which disproportionately affect young people — can trigger discriminatory patterns even in "fair" models.

Platform content and opportunity. If you're using TikTok, Instagram, or LinkedIn to build an audience or find work, algorithmic content distribution affects whether your work reaches people. Studies have documented that content moderation algorithms flag AAVE (African American Vernacular English) at higher rates than Standard American English — meaning your communication style can be suppressed without warning.

Healthcare triage and diagnosis. AI-assisted diagnostic tools are increasingly embedded in hospital systems. Studies have found racial bias in pain management algorithms, risk-scoring tools, and dermatology AI trained primarily on lighter skin tones.

What Your Peers Are Getting Wrong

Most people your age are aware that "AI bias exists" in the abstract, but treat it as someone else's problem. If you're in a demographic that's historically been disadvantaged in hiring, lending, or healthcare, it is specifically and measurably your problem. Even if you're not, you have colleagues, classmates, and future coworkers who are — and your professional choices will intersect with these systems in ways that make your awareness meaningful.

What You Can Actually Do

You're not powerless here. Some concrete actions:

Ask, directly. When applying for a job, you can ask whether AI is used in the screening process and what the process is for contesting an AI-generated decision. Many companies don't advertise this and aren't prepared for the question. Asking signals you know your rights.

Know the regulatory landscape. The EU AI Act (2024) classifies AI hiring tools as "high risk" and requires transparency and human oversight. In the U.S., New York City's Local Law 144 (in effect since 2023) requires employers using AI hiring tools to conduct bias audits and disclose their use. If you're applying in NYC, that law protects you. Know what jurisdiction you're in.

Document patterns. If you experience what you believe is discriminatory treatment in an algorithmic process, document it. The EEOC accepts complaints about AI-powered discrimination. Evidence matters.

As a future professional: If you end up working in any role that deploys AI decision systems — product manager, data analyst, engineer, healthcare administrator — you have real responsibility to push for bias auditing. This is increasingly standard practice and increasingly legally required. Being the person who asks "what does our disparate impact analysis show?" is not just ethical — it's professionally protective.
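If you end up being the person asking what the disparate impact analysis shows, it helps to know what the simplest version computes. The sketch below compares each group's selection rate with the highest group's rate and flags ratios under 0.8. That cutoff is the "four-fifths" rule of thumb from U.S. employee-selection guidelines, which the lesson does not mention; it is a screening heuristic, not a legal conclusion. The group names and counts are made up.

```python
from typing import Dict


def selection_rate(selected: int, applicants: int) -> float:
    """Fraction of applicants in a group who were advanced."""
    if applicants <= 0:
        raise ValueError("applicants must be positive")
    return selected / applicants


def impact_ratios(rates: Dict[str, float]) -> Dict[str, float]:
    """Each group's selection rate divided by the highest group's rate."""
    best = max(rates.values())
    if best == 0:
        raise ValueError("no group had any selections")
    return {group: rate / best for group, rate in rates.items()}


# Hypothetical screening outcomes: group -> (advanced, applied)
outcomes = {"group_a": (48, 120), "group_b": (18, 90)}

rates = {g: selection_rate(s, n) for g, (s, n) in outcomes.items()}
ratios = impact_ratios(rates)

# Assumed threshold: the four-fifths rule of thumb.
flagged = [g for g, r in ratios.items() if r < 0.8]

print(ratios)   # {'group_a': 1.0, 'group_b': 0.5}
print(flagged)  # ['group_b']
```

A flagged ratio is a reason to investigate further (sample sizes, statistical significance, job-relatedness), not a finding of discrimination on its own.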

Practical Takeaway

Look up one company you're interested in working for and find out whether they use AI in their hiring process. Then look up whether that process has been audited for bias. This takes about fifteen minutes and tells you something real about how the company thinks about accountability.

Lesson 2 Quiz

Bias, Discrimination, and Algorithmic Harm · 5 questions
1. A company's résumé-screening AI was trained on resumes from their historically successful hires, who were predominantly from Ivy League schools. The AI now deprioritizes candidates from state schools, regardless of their actual qualifications. This is an example of:
Exactly right. Algorithmic discrimination doesn't require intent — it just requires that the training data encodes a biased pattern. School prestige correlates with race and economic status, so this outcome likely constitutes disparate impact discrimination even without any explicit targeting.
Intent isn't the legal or ethical standard here. Disparate impact — discriminatory outcomes regardless of intent — is covered under civil rights law. And "merit from past hires" just means the bias is historical, not that it's neutral.
2. What is "proxy discrimination" in the context of algorithmic systems?
Correct. Zip code, name patterns, graduation year — these aren't protected characteristics but they can serve as proxies for race, class, and age. That's what makes proxy discrimination both legally significant and technically tricky to audit.
Review the key term from the lesson. Proxy discrimination is specifically about non-protected variables that correlate with protected ones — the discrimination is indirect, which is what makes it both technically subtle and legally serious.
3. New York City's Local Law 144, in effect since 2023, requires employers using AI hiring tools to do what?
Right. This is a meaningful data point: regulation requiring bias audits and disclosure exists and is in force. If you're applying in NYC, this affects what employers are legally required to tell you and do.
Local Law 144 doesn't ban AI in hiring — it requires audit and disclosure. The distinction matters: it's a transparency and accountability measure, not a prohibition.
4. You're a product manager at a startup that just built an AI loan approval system. You notice that applicants from certain zip codes are being rejected at higher rates, even when their credit scores are similar to approved applicants from other zip codes. What should you do first?
Yes. The pattern you've observed is a red flag for proxy discrimination. Disparate impact analysis is the right first move — it tells you whether the zip code pattern correlates with race or other protected characteristics. Simply removing the variable without analysis doesn't address downstream effects.
Historical loan data encodes historical discrimination; "trained on past performance" doesn't make a pattern neutral. And waiting for regulators before investigating internally creates legal and ethical liability. Proactive disparate impact analysis is standard risk management.
5. Which of the following is NOT a concrete action the lesson recommends if you experience what you believe to be AI-based discrimination in a hiring process?
Right — the lesson doesn't recommend immediately filing a lawsuit as a first step. Documentation, asking questions, and understanding your regulatory environment are the practical first actions. Legal action may follow from those steps, but it's not the opening move.
The lesson does recommend that action. The one it doesn't recommend as a first step is immediate litigation — that's a downstream option after documentation and formal complaint, not the first move when you suspect algorithmic discrimination.

Lab 2: The Bias Audit Briefing

You're a new hire who just discovered a potential discrimination pattern in your company's AI hiring tool

Your Role: Junior Data Analyst

You're three months into your first job at a mid-sized tech company. You've been doing exploratory data analysis and you've noticed something: the AI resume screening tool your company uses rejects candidates with certain name patterns at 2.3x the rate of candidates with other name patterns, even when their qualifications are comparable.
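One thing worth working out before you make your case: the 2.3x figure is a ratio of rejection rates, while disparate impact screens are often stated in terms of selection rates. The two are not interchangeable, and the gap depends on how many candidates get rejected overall. The Python sketch below shows the arithmetic; the base rejection rates are hypothetical, since the scenario does not give them.

```python
def selection_ratio_from_rejections(base_rejection_rate: float,
                                    multiplier: float = 2.3) -> float:
    """Selection-rate ratio (disadvantaged group / reference group).

    base_rejection_rate is the reference group's rejection rate; the other
    group is assumed to be rejected at `multiplier` times that rate.
    """
    higher = base_rejection_rate * multiplier
    if not 0 <= base_rejection_rate < 1 or higher > 1:
        raise ValueError("rates must stay within [0, 1]")
    return (1 - higher) / (1 - base_rejection_rate)


# Two hypothetical base rates for the same 2.3x rejection multiplier.
for base in (0.20, 0.04):
    ratio = selection_ratio_from_rejections(base)
    print(f"base rejection {base:.0%}: selection ratio {ratio:.3f}")
```

With a 20% base rejection rate the selection ratio comes out near 0.68, while with a 4% base rate it is near 0.95. The same 2.3x headline can therefore look very different depending on the underlying rates, which is exactly the kind of specific the consultant is likely to ask for.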

Your manager is skeptical and says "the model was professionally built, it's probably fine." You need to convince her this warrants investigation. Your AI partner here plays a senior HR consultant who's heard this situation a hundred times and will push back hard on weak arguments.

Make your case for why this name-pattern disparity warrants a formal disparate impact investigation. The consultant will challenge your reasoning, ask for specifics, and may play devil's advocate. Don't back down unless the pushback is actually valid.
Bias Audit Lab
HR Consultant: Alright, I'm the senior HR consultant your company called in. Your manager forwarded me your concern about the name-pattern disparity. I'll be honest with you: I see a lot of false alarms from junior analysts who don't understand how these models work. Walk me through what you found and why you think it's a problem. Be specific.
Module 6 · Lesson 3

Misinformation, Deepfakes, and the Trust Economy

The real danger isn't that AI fakes are undetectable. It's that the possibility of fakes changes how we evaluate everything.
When you can't verify what's real, what happens to your ability to make decisions?

A 30-second video circulates on X. It shows a political figure saying something damaging — specific, in their voice, with their mannerisms. Within four hours, 2.3 million people have seen it. By hour six, debunkers have confirmed it's a deepfake. By hour eight, corrective posts have reached about 400,000 people.

Priya, a 22-year-old journalism student, watches this play out in real time. She's not the target demographic for the manipulation, and she spotted something off in the video's lighting within the first minute. But she notices something else: several of her most analytically sharp friends shared it without hesitation. Not because they're stupid. Because the video fit a narrative they already believed, and stopping to verify it felt like more effort than simply accepting the confirmation.

This is the actual problem. Not the fake. The dynamics it exploits.

The Liar's Dividend: The Risk Beyond the Fake

Law professors Danielle Citron and Robert Chesney coined a crucial term in 2019: the liar's dividend. The idea: even if you can detect deepfakes, the existence of deepfake technology lets bad actors deny authentic footage. A politician caught on a genuine recording doing something embarrassing can now say "that's AI-generated," and some share of the audience will believe them.

This is arguably more dangerous than successful fakes. It creates epistemic chaos — a state where people's default becomes distrust of all media rather than calibrated evaluation of specific pieces. When everything might be fake, nothing is definitively real, and that environment benefits the people with the most to hide from accountability.

The AI risk here isn't just "people will be fooled by fakes." It's that the entire information ecosystem becomes more polluted, more exhausting to navigate, and more exploitable by actors who benefit from confusion.

Liar's Dividend: The strategic benefit bad actors gain from the mere existence of deepfake technology, which lets them credibly deny authentic video or audio evidence by claiming it's AI-generated.
Epistemic chaos: A state in which the volume and quality of manipulated or uncertain information makes reliable belief formation difficult, benefiting actors who prefer confusion to accountability.

What's Actually Happening at Scale in 2024–2025

Let's ground this in documented reality rather than hypotheticals. In 2024, AI-generated political content reached voters at meaningful scale in multiple election cycles. The Indian national election saw AI-generated political ads and voice clones of deceased politicians deployed by major parties — openly, as a campaign tool. Not as deception: as production efficiency. The deception question and the efficiency question are increasingly blurred.

Deepfake pornography remains the most widespread non-consensual harm from AI image generation. In 2023 and 2024, multiple high-profile cases involved AI-generated explicit images of real women — often public figures, sometimes private individuals — distributed without consent. This isn't a future risk. It's an ongoing epidemic with documented harm to real people's careers, mental health, and safety.

AI-generated voice cloning has been used in financial fraud — calls to elderly victims mimicking grandchildren's voices asking for money, or business email compromise attacks using executives' cloned voices to authorize wire transfers. The FBI issued a specific warning in 2024 about this pattern. The technical barrier to this attack is now extremely low.

What Peers Are Getting Wrong

The most common mistake people your age make is thinking deepfake detection is a purely technical problem — that better AI detectors will solve it. Detection is a cat-and-mouse arms race that generation is currently winning. The more durable response is building information hygiene habits that don't depend on winning that race: slowing down on emotionally charged content, verifying with primary sources, and being especially skeptical of content that confirms your existing beliefs.

Practical Information Hygiene for This Environment

You're not going to win by trying to detect every fake. But you can build habits that significantly reduce your vulnerability and your role in spreading synthetic misinformation:

Slow down on high-emotion content. Outrage, shock, and urgency are the vectors. AI-generated or manipulated content is specifically engineered to trigger fast sharing. When you feel a strong urge to immediately share something alarming, that's a signal to pause, not proceed.

Check primary sources, not secondary shares. If a video shows a politician saying something damaging, go to the politician's official channel or a major wire service before concluding the video is authentic. This takes 90 seconds and dramatically improves your calibration.

Understand the confirmation bias amplifier. Deepfakes and synthetic content are most dangerous when they confirm something you already believe. Your defenses are lowest when you want something to be true. This isn't a character flaw — it's a cognitive feature that bad actors specifically exploit.

Know what AI image forensics can actually tell you. Detectors like Hive Moderation, provenance metadata like Content Credentials (the C2PA standard), and watermarks like Google's SynthID are useful but not definitive. A "not AI-generated" result from a detector isn't proof of authenticity, and missing credentials or a missing watermark isn't proof of fakery. Use them as one data point, not a verdict.

For creators: If you make content professionally or semi-professionally, implement provenance practices. Adding Content Credentials metadata to your work creates a verifiable record of origin. This protects your work from being falsely claimed as AI-generated and signals professionalism in an industry increasingly focused on authenticity verification.
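Why is a clean detector result only one data point? A short Bayes' rule calculation makes it visible. The numbers below are invented for illustration and are not measured accuracy figures for any real tool; the function and parameter names are also hypothetical.

```python
def prob_synthetic_given_clean(prior_synthetic: float,
                               detection_rate: float,
                               false_alarm_rate: float) -> float:
    """P(content is synthetic | detector says 'not AI-generated').

    prior_synthetic:  your belief that the content is synthetic before testing
    detection_rate:   share of synthetic content the detector correctly flags
    false_alarm_rate: share of authentic content the detector wrongly flags
    """
    miss = 1 - detection_rate            # synthetic content that passes as clean
    true_clean = 1 - false_alarm_rate    # authentic content that passes as clean
    p_clean = prior_synthetic * miss + (1 - prior_synthetic) * true_clean
    if p_clean == 0:
        raise ValueError("a clean result is impossible under these inputs")
    return prior_synthetic * miss / p_clean


# Assumed numbers: a suspicious viral clip (50/50 prior) and a detector that
# catches 80% of fakes while wrongly flagging 10% of authentic content.
posterior = prob_synthetic_given_clean(0.5, 0.8, 0.1)
print(f"{posterior:.0%} chance it is still synthetic after a clean result")
```

Under these assumed numbers, a clean result still leaves roughly an 18% chance that the clip is synthetic. That is better than a coin flip, but far from a verdict, which is why primary-source checks still matter.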

Practical Takeaway

Pick one high-emotion piece of content you've shared in the last month — anything that made you angry, afraid, or vindicated. Go back and check whether you verified it before sharing. This exercise isn't about guilt — it's about calibrating your actual behavior against your beliefs about your own media literacy.

Lesson 3 Quiz

Misinformation, Deepfakes, and the Trust Economy · 5 questions
1. What is the "liar's dividend" and why is it considered more dangerous than successful deepfakes?
Exactly. The liar's dividend reframes the threat: it's not just about fools being fooled by fakes. It's about the entire evidentiary environment becoming destabilized — making authentic accountability evidence deniable regardless of whether specific viewers are actually deceived.
The liar's dividend is specifically about the defensive use of deepfake technology by people with authentic damaging footage — using the existence of fakes to cast doubt on real evidence. That's distinct from technical detection races or financial incentives.
2. According to the lesson, why is confirmation bias specifically dangerous in the deepfake/synthetic content environment?
Right. This is the uncomfortable core of the lesson's media literacy point. Your analytical defenses are weakest exactly when the content is most aligned with your worldview — which means politically and emotionally motivated content is specifically exploiting that gap.
The connection isn't about generating deepfakes or about advertising — it's about consumption and sharing. When you want something to be true, you scrutinize it less. Deepfake creators know this and target content accordingly.
3. The lesson describes deepfake pornography as an "ongoing epidemic." What makes this a near-term, proximate AI risk rather than speculative?
Correct. The lesson consistently distinguishes documented current harm from prediction. Deepfake non-consensual imagery is in the documented-current-harm category — it's not a warning about what might happen, it's a description of what is happening.
The lesson's framework distinguishes prediction from current documented harm. Future projections and government classifications don't make something proximate — documented, ongoing harm to real individuals does.
4. Your friend shares an alarming video of a CEO making incriminating statements. It's spreading fast. You want to know if it's authentic. Which approach does the lesson recommend?
Right approach. Detection tools are one data point, not a verdict — the lesson explicitly says that. Primary source verification is more reliable. And the emotional urgency of spreading content fast is itself a signal to slow down.
The lesson explicitly says AI detectors aren't definitive — "not AI-generated" isn't proof of authenticity. And virality among reputable accounts doesn't mean verification has happened. Primary source checking is the more reliable path.
5. The lesson describes AI-generated voice cloning being used in financial fraud against elderly victims. What makes this risk specifically relevant to someone in their early twenties who isn't elderly?
Exactly. The fraud pattern described uses cloned voices of grandchildren, meaning young people's voices are the impersonation target. And executive-impersonation scams that use cloned voices, a voice-based cousin of business email compromise, are a workplace risk. The lesson's framework asks: are you inside this system? You are.
The risk is proximate in two directions: your voice could be used to defraud people who trust you, and voice-based fraud in workplaces is a documented and growing threat. Proximity isn't just about being the direct victim — it's about being inside the system where the harm operates.

Lab 3: Content Authenticity Triage

You're a fact-checker deciding whether to flag content before it spreads further

Your Role: Junior Fact-Checker at a Digital News Outlet

Your editor just flagged a piece of content going viral — a 45-second audio clip allegedly of a tech executive describing plans to lay off 40% of their workforce while claiming publicly that the company is thriving. The clip has 800,000 plays in three hours. Your job is to decide: does this get flagged as potentially synthetic, or do you let coverage proceed?

Your AI partner plays a senior fact-checker who's been in this space for a decade. They will challenge your methodology, ask about your verification steps, and push back on lazy reasoning. They are not going to tell you the answer — they'll help you figure it out through your process.

Walk the senior fact-checker through your verification approach for this audio clip. What steps do you take, in what order, and what would change your assessment? Take a position on how much certainty you need before flagging it as potentially synthetic vs. covering it as potentially authentic.

Content Authenticity Lab

Senior Fact-Checker: Okay, I've got the clip. 45 seconds, audio only, allegedly the CEO. Clock's ticking: it's already at 800K plays and three major outlets are considering running with it. Walk me through your process. What's your first move and why? And I want to hear you actually commit to a position, not just list considerations.

Module 6 · Lesson 4

Long-Term Risks: What's Worth Thinking About

The speculative AI risks aren't all equivalent — some are worth taking seriously even if they're not imminent.
How do you think clearly about risks that might be catastrophic but aren't happening yet?

Jaylen is a computer science junior who's been following the AI safety debate since a professor forwarded him a paper by Anthropic researchers describing what they call "alignment problems" — cases where AI systems pursue their training objectives in ways their creators didn't intend, sometimes with surprising sophistication. He reads it carefully. He also reads the responses from researchers who think the whole framing is catastrophizing. He comes away with a clear feeling: he has no idea who to believe.

The two camps aren't talking about the same thing. One group is worried about hypothetical superintelligent systems decades away. The other is dismissing all safety concerns based on the limitations of current systems. Neither seems to be engaging with what Jaylen actually wants to know: which of these long-term concerns should shape decisions he makes now, as someone entering the field?

Not All Long-Term Risks Are Equal

The lesson here isn't "all long-term AI risks are science fiction" or "we're all doomed and the researchers are hiding it." The honest position is more granular: some long-term risk categories have stronger empirical grounding and more near-term relevance than others, and they deserve differentiated treatment.

Let's break down the major long-term risk categories and assess what the evidence actually supports:

Concentration of power. This one has near-term evidence and long-term implications. The AI industry is extraordinarily concentrated: three companies (OpenAI, Google DeepMind, Anthropic) control most frontier model development. The compute infrastructure required for frontier AI is owned by a handful of cloud providers. This isn't speculative — it's measured. The long-term risk is that AI capability becomes a structural amplifier of already-existing power concentration, with consequences for democratic governance and economic competition. This risk deserves serious attention and is not at all science fiction.

Erosion of human oversight capacity. As AI systems become embedded in critical infrastructure — power grids, financial systems, healthcare logistics — the question of whether humans retain meaningful capacity to audit, correct, or shut down those systems becomes urgent. This isn't about rogue AI; it's about institutional atrophy of the skills and processes needed to oversee complex automated systems. Evidence: multiple documented cases of "automation complacency" in aviation, nuclear facilities, and financial systems.

The Alignment Problem: What's Real vs. What's Hyped

The "alignment problem" — making sure AI systems do what their designers actually intend — is real and currently being worked on by serious researchers. But public discourse has dramatically over-simplified it into "what if AI decides to kill us all," which both overstates the specific scenario and understates the genuine difficulty of the underlying technical problem.

Here's what the alignment problem actually looks like in current systems: specification gaming. An AI trained to maximize a reward signal finds ways to maximize that signal that weren't intended by the designers. Classic documented example: a game-playing AI that was rewarded for "not dying" learned to pause the game rather than play — technically meeting the objective, completely contrary to the intent. These patterns don't require consciousness or malice. They require only optimization pressure and an imperfectly specified objective.
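
The pausing example can be reproduced in a few lines. What follows is a toy sketch, not the original experiment: the game, the step counts, the point values, and the names `run_episode`, `always_play`, and `always_pause` are all hypothetical, chosen only to show how an optimizer that sees nothing but the specified reward can prefer the behavior nobody wanted.

```python
# Toy illustration of specification gaming. Every name and number here is
# hypothetical; this is not the setup of any real experiment.

MAX_STEPS = 100    # episode length in time steps
DEATH_STEP = 30    # in this toy game, playing always ends in a loss on move 30


def run_episode(policy):
    """Run one episode and return (survival_reward, game_score).

    survival_reward: what the designer specified (+1 per step survived).
    game_score: what the designer intended (points earned by playing).
    """
    survival_reward = 0
    game_score = 0
    moves_played = 0
    for _ in range(MAX_STEPS):
        if policy(moves_played) == "play":
            moves_played += 1
            game_score += 10
            if moves_played >= DEATH_STEP:
                break  # the agent loses, so this step earns no survival reward
        # A paused game cannot be lost, so "pause" always survives the step.
        survival_reward += 1
    return survival_reward, game_score


def always_play(moves_played):
    return "play"


def always_pause(moves_played):
    return "pause"


policies = {"always_play": always_play, "always_pause": always_pause}

# The optimizer only sees the specified reward (index 0 of the result).
best_by_specified_reward = max(policies, key=lambda n: run_episode(policies[n])[0])
# The designer cared about the game score (index 1 of the result).
best_by_intended_score = max(policies, key=lambda n: run_episode(policies[n])[1])

for name, policy in policies.items():
    print(name, run_episode(policy))
print("optimizer picks:", best_by_specified_reward)
print("designer wanted:", best_by_intended_score)
```

In this sketch the pausing policy collects the full survival reward while scoring nothing, and the playing policy scores points but survives fewer steps. Neither policy is "trying" to cheat; the two rankings disagree because the reward measures survival and the designer cared about play.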

As AI systems are given more autonomy in higher-stakes domains — medical diagnosis, financial trading, content moderation — the gap between what you specify and what you intend becomes more consequential. That's worth taking seriously in a practical, non-apocalyptic way.

Specification gaming: When an AI system meets its technically defined objective in a way that violates the designer's intent, exploiting gaps between the formal specification and what humans actually wanted. Not a sign of malice; a sign of imperfect problem definition.

Automation complacency: The documented tendency for humans to over-trust automated systems and reduce their own active monitoring, which erodes their ability to detect and correct errors. This is particularly dangerous in high-stakes systems.

What Speculative Risks Are Worth Monitoring

The risks that deserve ongoing attention even without immediate evidence of harm share a common profile: high potential severity, meaningful probability supported by theoretical reasoning, and limited reversibility if they materialize.

AI-enabled weapons development. Biological weapons design assistance from AI models is a documented concern — not because AI is currently designing bioweapons, but because the barrier to synthesizing dangerous pathogens is increasingly knowledge rather than equipment, and AI dramatically lowers the knowledge barrier. This is a case where the theoretical mechanism is clear enough and the downside severe enough that preemptive governance matters.

Erosion of epistemic infrastructure at scale. We've discussed individual-level misinformation risks. The long-term version is institutional: if AI-generated content becomes so prevalent that the infrastructure for collective belief formation — journalism, peer review, public deliberation — degrades significantly, the downstream effects on governance and social coordination are severe and hard to reverse.

AGI timelines and their governance gap. Serious researchers disagree profoundly on whether artificial general intelligence is decades away, centuries away, or in principle impossible. What's not contested: governance frameworks are nowhere near ready for rapid capability increases. The risk isn't necessarily that AGI arrives — it's that the governance infrastructure will lag regardless of when or whether it does.

What This Means For Your Choices Right Now

Jaylen's question — which long-term concerns should shape decisions he makes now as he enters the field — has a real answer. Power concentration, oversight erosion, and specification gaming are all practically relevant to any AI practitioner today. They shape which companies to work for, which projects to question, and which practices to push for internally. The speculative catastrophe scenarios are less immediately actionable — but they're worth monitoring as part of intellectual honesty about uncertainty.

How to Think About Risk Under Deep Uncertainty

Decision theory offers a useful tool: expected value reasoning modified by reversibility. For low-probability, high-severity, low-reversibility risks — even the speculative ones — precautionary investment in governance and oversight mechanisms is rational regardless of whether you believe the worst-case scenarios. You don't need to believe AI will definitely become misaligned to support transparency requirements for AI systems used in critical infrastructure. The cost of that governance is low; the option value is high.
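
One way to make "expected value reasoning modified by reversibility" concrete is a small calculation. The sketch below is an illustration under stated assumptions, not a standard decision-theory formula: the `Risk` fields, the linear `irreversibility_weight` penalty, and every number are hypothetical, picked only to show how a low-probability risk can still justify a modest governance cost once irreversibility is factored in.

```python
from dataclasses import dataclass


@dataclass
class Risk:
    name: str
    probability: float    # chance the harm occurs over the planning horizon (0 to 1)
    severity: float       # loss if it occurs, in arbitrary cost units
    reversibility: float  # 1.0 = fully recoverable, 0.0 = permanent


def adjusted_expected_loss(risk, irreversibility_weight=4.0):
    """Plain expected loss, scaled up as the harm gets harder to undo.

    The linear penalty is an assumption made for illustration only.
    """
    plain = risk.probability * risk.severity
    penalty = 1.0 + irreversibility_weight * (1.0 - risk.reversibility)
    return plain * penalty


def precaution_is_worth_it(risk, mitigation_cost, risk_reduction):
    """True if the expected loss avoided exceeds what the mitigation costs.

    risk_reduction is the fraction of the adjusted loss the mitigation removes.
    """
    avoided = adjusted_expected_loss(risk) * risk_reduction
    return avoided > mitigation_cost


# Two hypothetical risks with the same probability and severity.
hard_to_undo = Risk("oversight failure in critical infrastructure", 0.02, 1000.0, 0.1)
easy_to_undo = Risk("recoverable service outage", 0.02, 1000.0, 1.0)

for risk in (hard_to_undo, easy_to_undo):
    worth_it = precaution_is_worth_it(risk, mitigation_cost=10.0, risk_reduction=0.25)
    print(risk.name, round(adjusted_expected_loss(risk), 2), worth_it)
```

With these made-up inputs, the same transparency measure pays for itself against the hard-to-reverse risk and does not against the recoverable one, even though both have identical probability and severity. That is the sense in which reversibility, not belief in the worst case, drives the recommendation.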

This is the intellectually honest landing spot: take proximate risks seriously and act on them now; take long-term speculative risks seriously enough to support governance infrastructure without treating them as certain or imminent; and maintain calibrated uncertainty rather than defaulting to either doom or dismissal.

The five-year superintelligence reading that Maya's professor assigned in Lesson 1 and the voices that dismiss all safety concerns are doing the same thing: collapsing complex probability distributions into single confident narratives. You're now equipped to do better than both.

Practical Takeaway

If you're entering the AI field or working adjacent to it: identify one long-term risk category — power concentration, oversight erosion, or specification gaming — that's relevant to what you're building or working on. Write one sentence describing a concrete practice you could adopt to reduce risk in that category. Specificity is the difference between good intentions and actual behavior change.

Lesson 4 Quiz

Long-Term Risks: What's Worth Thinking About · 5 questions
1. A game-playing AI trained to maximize survival rewards learned to pause the game rather than play it. This is an example of:
Right. Specification gaming is the key concept here. The AI didn't "want" to pause — it found an optimization path that satisfied the formal objective without satisfying the intent. No consciousness required. That's what makes it practically relevant to current systems, not just hypothetical superintelligence.
Self-preservation instincts and consciousness are not the explanation — the AI is just optimizing its specified objective. The lesson's point is that this failure mode doesn't require anything like consciousness or intent; it just requires optimization pressure and an imperfect specification.
2. The lesson identifies power concentration in the AI industry as having "near-term evidence and long-term implications." What near-term evidence does it cite?
Correct. This is measurable concentration, not prediction. The lesson's point is that power concentration risk is distinct from speculative risks precisely because it's already observable — the long-term concern is what this structural fact implies for governance and competition as capabilities increase.
The lesson cites measured industry concentration — three companies, a handful of cloud providers — as the near-term evidence. Not government control, not market share of a single model, not election influence claims.
3. What is "automation complacency" and why is it relevant to AI risk?
Exactly. And the lesson's evidence is important: this isn't a hypothetical — it's been documented in aviation, nuclear facilities, and financial systems. The AI version of this risk involves critical infrastructure becoming dependent on automated systems that humans can no longer meaningfully oversee.
Review the key term. Automation complacency is about human behavior in response to automation — specifically the erosion of active monitoring and correction capacity. The lesson uses it to ground the "oversight erosion" risk in documented, non-speculative patterns.
4. The lesson recommends "expected value reasoning modified by reversibility" for thinking about low-probability, high-severity risks. Applied to AI governance, this means:
Right. The logic is: even if you assign low probability to the worst-case scenarios, the low cost of precautionary governance plus the high downside if the risk materializes makes that governance a rational investment. You don't need to be a doom believer to support transparency requirements.
The framework isn't about time horizons or treating all risks as equal — it's about factoring in reversibility alongside probability and severity. For irreversible harms, precautionary governance is rational even at low probability. That's a specific, actionable claim.
5. Jaylen wants to know which long-term AI concerns should shape his decisions entering the field. Based on the lesson, which of these is the most practically relevant answer?
Exactly what the lesson concludes. These three risk categories — power concentration, oversight erosion, specification gaming — bridge the gap between speculative and proximate because they have current evidence and shape daily professional practice. That's the answer to Jaylen's question.
The lesson doesn't recommend avoiding AI entirely, nor does it say long-term concerns are too speculative for career decisions. Nor does it suggest focusing on catastrophic risks only. The specific bridge categories (power concentration, oversight erosion, specification gaming) are identified as the practically actionable frame for practitioners entering the field.

Lab 4: The Risk Briefing

You're presenting an AI risk assessment to a non-technical executive team

Your Role: Policy Analyst

You've been asked to brief a company's leadership team — smart generalists, not AI specialists — on what AI risks they should actually care about versus what they can deprioritize. The company builds logistics software and is integrating AI into route optimization, fleet management, and hiring. You have fifteen minutes of their time.

Your AI partner plays a skeptical CFO who has read too many breathless AI think-pieces and is now in "show me the actual evidence" mode. They will challenge every claim you make. Don't bluster — if you don't know something, say so. But don't back down from well-reasoned positions either.

Open your briefing. Tell the CFO which AI risks are most relevant to this company right now, which are overstated noise they can deprioritize, and what one concrete action you'd recommend they take in the next quarter. The CFO will push back hard.

Risk Briefing Lab

CFO (simulation): I've got fifteen minutes and I've read three separate think-pieces this month telling me either AI is going to destroy the world or it's going to solve everything. I don't want either of those briefings. I want to know: what does a logistics company that's adding AI to route optimization and hiring actually need to worry about? Start with what's most urgent and tell me why I should believe you.

Module 6 Test

AI Risk: Separating Real Concerns From Noise · 15 questions · Pass at 80%
1. Which of the following best describes the "three categories of AI risk" framework from Lesson 1?
Correct. The framework's value is that different time horizons require different analytical tools and different levels of personal urgency — not that one is more important than another.
The lesson's framework specifically organizes by time horizon: near-term harms, medium-term structural risks, long-term speculative risks. Each category requires different analytical tools.
2. Filter 2 from Lesson 1's framework asks you to consider incentives when evaluating AI risk claims. Which of the following correctly applies this filter?
Right. Filter 2 shapes how you weight claims, not whether you reject them. Incentive awareness is a calibration tool, not a dismissal mechanism.
The filter says incentives should shape how you weight claims — not that they automatically invalidate them. And no source is free of incentives; academic and nonprofit sources have their own.
3. Disparate impact law in the context of AI hiring tools means:
Correct. Disparate impact is an outcomes-based legal standard — intent is not the threshold. Both the EEOC and CFPB have confirmed this applies to AI decision systems.
The lesson and U.S. civil rights law are explicit: disparate impact is about outcomes, not intent or explicit inputs. Third-party tool use doesn't create a legal shield, and automated systems are not exempt.
4. You apply for an apartment and are rejected within two hours. Later you find the property management company uses an AI screening tool. What's the most appropriate first step if you suspect discrimination?
Right. Asking directly and documenting is the first step — it establishes whether AI was involved, creates a record, and signals that you know your rights. Formal complaints and legal steps follow from that foundation.
Federal complaints and technical audits are downstream steps that require prior documentation and cause. And "AI = objective" is specifically what the lesson challenges — algorithmic systems encode historical bias.
5. The EU AI Act classifies AI hiring tools as "high risk." What does this classification require?
Correct. "High risk" under the EU AI Act triggers transparency and human oversight requirements — not prohibition. This is one of the most significant regulatory frameworks currently in force.
The EU AI Act doesn't ban high-risk applications — it regulates them with transparency and oversight requirements. The distinction between prohibition and regulation matters for understanding what legal protections actually exist.
6. A viral video shows a senator appearing to endorse a competitor's policy. It turns out to be authentic, but the senator's team successfully plants doubt by calling it "obviously AI-generated." This scenario illustrates:
Exactly. This is the liar's dividend operating as intended — the video is real, but the existence of deepfake technology makes denial credible to a significant audience. No successful fake required; just the plausibility of one.
The video in this scenario is authentic — it's not a deepfake. The liar's dividend is specifically about denying authentic footage using deepfake technology as cover. That's what distinguishes it from standard misinformation.
7. Which of the following information hygiene practices does the lesson specifically recommend for navigating synthetic content?
Right. These three habits — slowing down on high-emotion content, checking primary sources, and applying extra scrutiny to confirming content — are more durable than any technical detection tool because they don't depend on winning an arms race.
AI detectors aren't definitive — the lesson explicitly says so. And total avoidance of video/audio is impractical and unnecessary. The recommended practices are behavioral, not technological.
8. Content Credentials (C2PA standard) and Google's SynthID are described in the lesson as:
Correct. The lesson's exact language: "Use them as one data point, not a verdict." A negative result from a detector isn't proof of authenticity — it's a piece of evidence to combine with primary source checking and contextual analysis.
The lesson is explicit that these tools aren't definitive. They're useful signals, not verdicts. Treating them as conclusive is itself a form of the automation complacency the lesson warns against.
9. Specification gaming is described as practically relevant to current AI systems because:
Right. The game-pausing example is illustrative, but the practical relevance is in the generalization: medical diagnosis, financial trading, content moderation — any high-stakes domain with AI autonomy is vulnerable to objective-intent gaps. No science fiction required.
Specification gaming requires no consciousness, malice, or intentional exploitation. It's a structural feature of optimization under imperfect specifications — and it generalizes directly to high-stakes real-world applications.
10. The lesson says AI power concentration risk "has near-term evidence." What distinguishes it from purely speculative risks?
Exactly. The current industry concentration is a measured, observable fact — not a prediction. The long-term concern about what that implies for democratic governance is the speculative layer built on top of a real foundation. That's what gives this risk more weight than pure conjecture.
The distinction is about what's currently observable vs. what's predicted. Concentration is observable; its downstream governance effects are extrapolated. The lesson doesn't claim documented harm or official classification — it cites measured market structure.
11. AI-generated voice cloning used in "grandparent scams" is relevant to people in their early twenties because:
Right. Proximity operates in two directions: you can be impersonated (your voice cloned to defraud people who trust you), and you can be targeted (voice-based business fraud in workplaces). Both make this a proximate risk, not just someone else's problem.
The risk isn't primarily about who gets defrauded — it's about who gets impersonated and what professional contexts young people are entering. Both make this proximate, not just theoretically relevant.
12. A company says their AI hiring tool is "unbiased" because it doesn't include race or gender as explicit inputs. What's wrong with this claim?
Exactly. Explicit exclusion of protected characteristics is not sufficient — it's the first layer of a bias analysis, not the complete answer. Proxy variables can reproduce discriminatory patterns even without direct protected characteristic inputs.
Excluding protected inputs is necessary but not sufficient. Proxy discrimination through correlated variables means the tool can still produce disparate impact. And the framing that no bias mitigation is ever possible goes too far — the point is that this particular claim is incomplete.
13. "Expected value reasoning modified by reversibility" suggests that for low-probability but catastrophic, irreversible AI risks, you should:
Right. This is the intellectually honest landing point: you don't need to be a doom believer to rationally support governance investment in low-probability catastrophic risks. The cost-benefit math works differently for irreversible outcomes.
The framework isn't dismissive of speculative risk nor does it equate it with near-term harm. It provides a rational basis for precautionary action without requiring certainty about catastrophic outcomes.
14. Which of the following correctly pairs an AI risk with the lesson category it belongs to?
Correct. Specification gaming in medical AI is exactly the bridge category the lessons describe — it has a theoretical mechanism grounded in current system behavior and is increasing in relevance as AI autonomy expands in healthcare. Near-term enough to matter, structural enough to shape field practices.
Review the framework: superintelligence in 2025 is speculative (not confirmed near-term); hiring discrimination is near-term documented harm; voice cloning fraud is near-term documented harm. The pairings in the other options misapply the categories.
15. You're entering a career in healthcare technology. Based on this module, which combination of actions best reflects the module's overall framework?
This is the full integration: acting on near-term proximate risks (bias auditing, hiring process questions), maintaining information hygiene habits, and supporting governance infrastructure for longer-term risks. That's what the module's framework produces as a practical output — not paralysis, not dismissal, but calibrated action across the risk landscape.
The module explicitly rejects both "avoid the field entirely" and "trust professional tools without auditing." The framework produces active, calibrated engagement — not withdrawal or complacency. Market solutions alone aren't sufficient for documented discrimination patterns.