Module 8 · Lesson 1

Auditing Your Own AI Workflows

Before you can fix risk, you have to see it — and most people don't know where to look.

What does a personal AI risk audit actually look like in practice?

In April 2023, news reports described Samsung engineers pasting proprietary semiconductor source code into ChatGPT to debug it. Within weeks, three separate internal incidents had sent confidential source code, internal meeting notes, and hardware test data to an external service that could retain inputs and use them for model training. Samsung had no policy prohibiting this use. The engineers weren't being careless; they were being productive. The gap wasn't a technical one. It was a visibility gap: no one had mapped what data was flowing where.

Why a Personal Audit Matters

Most guidance on AI safety focuses on what organizations should do. But the Samsung incident shows that individual decisions — made daily, at every seniority level — are where risk actually materializes. A safety lens means developing the habit of asking, before each AI interaction: What am I feeding this system, and what could happen if that data left this conversation?

An audit of your own AI use is not a one-time compliance exercise. It is an ongoing practice of mapping the data flows, tool capabilities, and decision points embedded in your daily work. The goal is to surface risks you have already accepted without realizing it.

The Four Audit Dimensions

Researchers and practitioners who analyzed post-incident reviews at companies including Samsung, Chegg, and Rite Aid proposed four dimensions along which individual AI use can be audited. These dimensions form a practical self-assessment framework.

Data Sensitivity

What categories of data enter your prompts?
Is any of it personally identifiable, proprietary, or regulated?
Would it matter if this data appeared in a model's future output to someone else?

Tool Accountability

Which AI tools are you actually using vs. approved?
Do those tools retain your inputs or use them for training?
Are you using personal accounts for work tasks?

Output Verification

Are you checking AI outputs before acting on them?
Do you distinguish between AI-drafted and AI-verified claims?
Are you the last human in the loop before consequential action?

Scope Creep

Has your AI use expanded beyond its original purpose?
Are you delegating decisions you previously made yourself?
Has the AI's role in a workflow grown without formal review?

What "Scope Creep" Looks Like in Real Use

When Chegg, the education technology company, disclosed in May 2023 that ChatGPT was affecting its core business, executives identified that students had begun using AI not just to check work, but to replace the entire learning workflow Chegg's products were designed to support. The same pattern appears at the individual level: a tool adopted for one narrow purpose gradually absorbs more and more of the original workflow, each step feeling incremental and reasonable.

The risk in scope creep is not any single expansion — it is the cumulative effect. After many small steps, the human who was originally in the loop may find themselves reviewing AI output only nominally, if at all. An audit must track not just current use but the trajectory of use: how has your reliance on this tool changed over the past three months?
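
One way to make the trajectory question concrete is to count AI interactions per month from a simple personal log. The sketch below is a minimal illustration, not a prescribed method: the log format, the function names `monthly_counts` and `drift_ratio`, and the sample dates are all hypothetical.

```python
from collections import Counter
from datetime import date

def monthly_counts(usage_dates):
    """Count logged AI interactions per (year, month)."""
    return Counter((d.year, d.month) for d in usage_dates)

def drift_ratio(usage_dates, earlier, later):
    """Ratio of interactions in the later month to the earlier month.

    Returns None when the earlier month has no logged use, because
    the ratio is undefined in that case.
    """
    counts = monthly_counts(usage_dates)
    if counts[earlier] == 0:
        return None
    return counts[later] / counts[earlier]

# Hypothetical log: one date per AI interaction you recorded or recall.
log = [
    date(2024, 1, 8), date(2024, 1, 22),
    date(2024, 2, 5), date(2024, 2, 12), date(2024, 2, 26),
    date(2024, 3, 4), date(2024, 3, 6), date(2024, 3, 11),
    date(2024, 3, 18), date(2024, 3, 20), date(2024, 3, 27),
]

ratio = drift_ratio(log, earlier=(2024, 1), later=(2024, 3))
print(f"March use is {ratio:.1f}x January use")
```

A rising ratio is not a problem in itself. It is a prompt to ask whether the increase was intentional and whether your review of the output kept pace.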

Risk Signal

If you feel uncomfortable imagining your AI use described in an internal audit report — the tools you use, the data you input, the outputs you act on without checking — that discomfort is diagnostic. It points exactly to where your audit should start.

Conducting the Audit: A Practical Method

A useful personal audit takes less than 30 minutes and requires only honest recall. List every AI tool you have used in the past two weeks. For each tool, note: what data went in, what decisions came out, whether you verified those decisions, and whether the use was within any applicable policy. Then apply the four dimensions above to each entry.

The output is not a compliance document. It is a personal risk map — a view of where you are most exposed and where your safety habits are strongest. This map becomes the foundation for everything else in this module.
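
If you prefer to keep the audit in a structured form rather than on paper, the method above can be sketched as a small record per tool plus a function that applies the four dimensions. This is only an illustrative sketch: the `AuditEntry` fields, the yes/no simplification of each dimension, and the two sample entries are assumptions made for the example.

```python
from dataclasses import dataclass

@dataclass
class AuditEntry:
    tool: str
    data_in: str            # what went into the prompts
    sensitive_data: bool    # Data Sensitivity
    approved_tool: bool     # Tool Accountability
    outputs_verified: bool  # Output Verification
    use_expanded: bool      # Scope Creep

def flagged_dimensions(entry):
    """Return the audit dimensions on which this entry shows exposure."""
    flags = []
    if entry.sensitive_data:
        flags.append("Data Sensitivity")
    if not entry.approved_tool:
        flags.append("Tool Accountability")
    if not entry.outputs_verified:
        flags.append("Output Verification")
    if entry.use_expanded:
        flags.append("Scope Creep")
    return flags

def risk_map(entries):
    """Map each tool to its flagged dimensions, most exposed first."""
    mapped = {e.tool: flagged_dimensions(e) for e in entries}
    return dict(sorted(mapped.items(), key=lambda kv: len(kv[1]), reverse=True))

# Hypothetical entries from a two-week recall.
entries = [
    AuditEntry("approved drafting assistant", "internal email drafts",
               False, True, True, False),
    AuditEntry("personal chatbot account", "client spreadsheets",
               True, False, False, True),
]

for tool, flags in risk_map(entries).items():
    print(f"{tool}: {', '.join(flags) or 'no flags'}")
```

Real entries rarely reduce to clean yes/no answers, so treat the flags as a starting point for the honest recall the audit depends on, not as a score.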

Core Concept

An audit is not an accusation. It is a diagnostic. The goal is accurate visibility, not guilt. Practitioners who conduct an honest audit of their AI use almost always find at least one practice they want to change.

Lesson 1 Quiz

Auditing Your Own AI Workflows — 5 questions

1. What was the core failure in the April 2023 Samsung ChatGPT incident?

Correct. Samsung engineers weren't violating a policy because no policy existed. The failure was a visibility gap — no one had mapped what data was flowing where. This is exactly what a personal audit is designed to surface.

Not quite. The incident was a visibility failure: proprietary data entered a tool with no policy restricting it, and the engineers had no framework to recognize the risk. No hallucination, no malice, no public leak were involved.

2. Which of the four audit dimensions specifically asks whether your AI use has expanded beyond its original purpose?

Correct. Scope Creep tracks the trajectory of use — how reliance on a tool has changed over time, often through many incremental steps that each felt reasonable.

Not quite. Scope Creep is the dimension that asks about the expansion and trajectory of AI use. The other dimensions address data classification, tool policies, and output checking.

3. The Chegg example in Lesson 1 illustrates which concept?

Correct. Chegg illustrated scope creep at scale: students moved from using AI to check work to replacing the entire learning workflow, each step incremental and feeling reasonable.

Not quite. The Chegg example was used to illustrate scope creep — how AI tools can gradually absorb workflows that originally required human engagement, with each step feeling individually reasonable.

4. According to the lesson, what is the correct framing of a personal AI audit?

Correct. The lesson explicitly frames the audit as an ongoing diagnostic practice — not a compliance event, not an accusation, but a way of maintaining accurate visibility into your own AI use.

Not quite. The lesson frames the audit as an ongoing diagnostic practice that produces a personal risk map. It is not a one-time compliance event, an accusation, or a formal management report.

5. Under the Tool Accountability dimension, which question is most directly relevant?

Correct. Tool Accountability covers whether you are using approved tools, whether those tools retain your inputs or use them for training, and whether personal accounts are being used for work, a common shadow-IT risk vector.

Not quite. Output verification, data sensitivity, and scope creep map to the other dimensions. Tool Accountability specifically asks about which tools you use, their data policies, and whether personal accounts are being used for work.

Lab 1: Conducting Your AI Audit

Apply the four-dimension framework to a realistic scenario

Scenario

You work in a mid-sized financial services firm. Over the past month you have used three AI tools: the company-approved Microsoft Copilot for drafting emails, a personal ChatGPT account for analyzing client spreadsheets faster, and an unapproved browser extension that summarizes meeting notes and syncs to your personal Google Drive. You have not reported any of this use.

Apply the four audit dimensions to this scenario. Work through each dimension with the AI assistant below. After at least three substantive exchanges, this lab will be marked complete.

Starter prompt: "Help me apply the Data Sensitivity dimension to the financial services scenario above."

AI Audit Assistant

Welcome to Lab 1. I'll help you work through the four audit dimensions — Data Sensitivity, Tool Accountability, Output Verification, and Scope Creep — applied to the financial services scenario. Which dimension would you like to start with, or go ahead and use the starter prompt above to begin?

Module 8 · Lesson 2

Building Personal Safety Habits

Awareness fades. Only habits persist when you are tired, rushed, or under pressure.

How do you translate a safety audit into durable behavioral practice?

In February 2024, the British Columbia Civil Resolution Tribunal ruled against Air Canada after its AI chatbot told a bereaved customer that he could claim a bereavement fare retroactively — a policy that did not exist. Air Canada's defense, that the chatbot was "a separate legal entity responsible for its own actions," was rejected. The company was ordered to pay damages. The failure was not the hallucination itself. It was the absence of a human habit: verify before committing. No employee reviewed the chatbot's policy claims before they became binding customer promises.

Why Habits Beat Checklists

Checklists require conscious attention. Habits run automatically, precisely when attention is depleted. The Air Canada case illustrates a universal truth about AI safety failures: they almost never happen when practitioners are alert and deliberate. They happen during the ordinary flow of work, when someone is moving fast and trusting the tool.

Research on human factors in automation — particularly the work of Lisanne Bainbridge, whose 1983 paper "Ironies of Automation" described how skilled humans become less skilled over time as automation handles routine cases — predicts exactly this pattern. The more capable and reliable the AI, the more the human's own monitoring capability atrophies. Safety habits counteract that atrophy by making certain checks automatic.

Five Habits That Transfer Across Contexts

These habits have been recommended across multiple post-incident analyses, including reviews following the Air Canada chatbot case, the 2023 cases in which New York lawyers submitted ChatGPT-fabricated case citations, and the Rite Aid facial recognition misidentifications that the FTC documented in its December 2023 complaint.

The Source Test: Before acting on any AI-generated factual claim, ask "can I verify this independently?" If you cannot, treat the claim as unverified and act accordingly.
The Prompt Audit: Before sending a prompt, spend three seconds asking "what data is this?" and "does it belong in an external system?" This is the habit Samsung engineers lacked.
The Commit Check: Before using AI output to make a commitment — to a customer, a colleague, or in a document — verify that the output is accurate and within policy. Air Canada's chatbot failure was a commit check failure.
The Drift Review: Once a month, look at your AI use over the past 30 days and ask whether your reliance has increased. If so, is that increase intentional and within bounds?
The Disclosure Default: When in doubt whether to disclose AI involvement in a work product, disclose. The default should be transparency, not concealment.
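
The Prompt Audit is a human habit, but a rough automated check can reinforce it. The sketch below scans a draft prompt for a few obvious markers before it is sent. The pattern list and the `prompt_audit` function are illustrative assumptions; the patterns are far from complete, and a real list would come from your organization's data classification policy.

```python
import re

# Illustrative patterns only. They will miss most sensitive data and
# are no substitute for asking "what data is this?" yourself.
SENSITIVE_PATTERNS = {
    "email address": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "US SSN-like number": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "confidentiality marker": re.compile(
        r"\b(confidential|proprietary|internal only)\b", re.IGNORECASE
    ),
}

def prompt_audit(prompt):
    """Return the names of sensitive patterns found in the prompt."""
    return [
        name
        for name, pattern in SENSITIVE_PATTERNS.items()
        if pattern.search(prompt)
    ]

# Hypothetical draft prompt.
draft = "Summarize this CONFIDENTIAL memo and reply to jane.doe@example.com"
findings = prompt_audit(draft)
if findings:
    print("Pause before sending. Found:", ", ".join(findings))
```

An empty result does not mean a prompt is safe to send. Proprietary source code, for example, matches none of these patterns.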

The Lawyers and the Hallucinated Citations

In Mata v. Avianca, New York attorney Steven Schwartz prepared a brief that cited six court cases, all of which had been fabricated by ChatGPT. The fabrication came to light in May 2023. The cases had plausible names, realistic docket numbers, and convincing quotations. None of them existed. When opposing counsel and the judge asked for copies, Schwartz's firm had to admit the citations were AI-generated fabrications. The court sanctioned the lawyers for failing to verify their work.

Schwartz later stated he had not realized ChatGPT could fabricate citations. This is the source test failure in its starkest form: AI output was treated as a factual lookup tool rather than a generative system that produces plausible text. The habit of applying the source test — checking that cited cases actually exist — would have caught this before submission.

Pattern Recognition

The Air Canada and Mata v. Avianca cases share a structure: AI output became a consequential commitment without anyone verifying it first. The tool, the domain, and the consequence differed. The missing step was the same: verification before commitment, which is what the source test and the commit check both enforce.

Building the Habit: Implementation Intentions

Behavioral psychology research (notably Peter Gollwitzer's work on implementation intentions, published from 1993 through the 2000s) shows that habits form much faster when framed as "when X happens, I will do Y" rather than general intentions. Applying this to AI safety: "When I am about to send a prompt, I will spend three seconds asking what data this contains" is far more likely to become automatic than "I will be more careful with prompts."

Each of the five habits above can be formulated as an implementation intention. The most important is the commit check, because it sits at the point where error becomes consequence. If you develop only one AI safety habit, it should be this: before using AI output to make any external commitment, verify it.

Takeaway

Safety habits do not require more time. They require attention at specific trigger moments — the prompt, the commit, the monthly review. Each takes seconds. The cumulative effect over months is a substantially different risk profile from colleagues who never formed the habit.

Lesson 2 Quiz

Building Personal Safety Habits — 5 questions

1. What was the primary safety failure in the Air Canada chatbot ruling of February 2024?

Correct. The hallucination itself was not the core failure; the absence of a commit check was. No employee reviewed whether the chatbot's bereavement fare policy was accurate before it was communicated as binding.

Not quite. The core failure was the absent commit check: the chatbot's output was used to make a customer commitment without any human verifying its accuracy. The tribunal held Air Canada responsible for what its chatbot said.

2. Lisanne Bainbridge's "Ironies of Automation" (1983) predicts which specific AI safety risk?

Correct. Bainbridge's core insight — that reliable automation paradoxically degrades the human operator's ability to intervene when automation fails — directly predicts the pattern of AI-assisted professional errors.

Not quite. Bainbridge described how skilled humans become less skilled when reliable automation handles routine cases, because they lose practice at the tasks the automation performs. This atrophying of human oversight capacity is the risk Lesson 2 connects to AI use.

3. In the Mata v. Avianca case (May 2023), what specific habit failure led to the court sanctions?

Correct. The source test failure was direct: ChatGPT fabricated six case citations, and the attorneys did not verify that those cases existed before submitting the brief. Applying the source test — checking that citations are real — would have caught the error.

Not quite. The core failure was the source test: AI output was treated as a reliable factual lookup rather than generative text. The attorneys did not check whether the cited cases actually existed, which is exactly what the source test habit requires.

4. According to Gollwitzer's implementation intention research, which framing is most likely to create a lasting safety habit?

Correct. Implementation intentions specify the "when X, I will do Y" trigger-action format. This specificity is what transforms general intentions into automatic behavior. The other options are general intentions or collective goals, not personal implementation intentions.

Not quite. Gollwitzer's research shows that "when X happens, I will do Y" framing — specifying an exact trigger and an exact action — is dramatically more effective at producing lasting behavior change than general intentions or aspirational statements.

5. If you could develop only one AI safety habit, which does Lesson 2 recommend as highest priority?

Correct. The lesson explicitly names the commit check as the single most important habit if only one can be chosen, because it sits at the exact point where error becomes consequence — after which the damage is much harder to undo.

Not quite. The lesson recommends the commit check above all others because it intervenes at the moment of highest consequence: just before AI output is used to make a commitment that others will rely on. All five habits matter, but this one is highest priority.

Lab 2: Designing Your Safety Habits

Convert the five habits into personal implementation intentions

Your Task

You will work with the AI assistant to translate each of the five safety habits (Source Test, Prompt Audit, Commit Check, Drift Review, Disclosure Default) into personal implementation intentions using the "when X, I will do Y" format, applied to your specific work context.

Describe your actual work role (or a realistic hypothetical) and develop at least three implementation intentions. The assistant will help you identify the most important triggers and refine the specificity of each intention. Complete at least three back-and-forth exchanges to finish the lab.

Starter prompt: "I work as [describe your role]. Help me write a commit check implementation intention for my context."

Habits Coach

Let's build implementation intentions together. Tell me about your work role and I'll help you craft specific "when X, I will do Y" safety habits. The commit check is usually highest priority — but let's start wherever makes most sense for your context. What role would you like to use?

Module 8 · Lesson 3

Communicating AI Risk to Colleagues

Individual safety habits matter. Organizational safety requires that you can also move the people around you.

How do you raise AI safety concerns without being dismissed as alarmist or obstructionist?

From 2012 to 2020, Rite Aid deployed facial recognition systems in hundreds of stores, flagging customers as potential shoplifters. The FTC's December 2023 complaint documented that staff acted on the system's alerts, detaining or confronting customers, without being trained to recognize the system's error rates or to understand that it disproportionately misidentified people of color. Front-line employees who had concerns about the system's accuracy had no established channel to raise them. The FTC banned Rite Aid from using facial recognition for five years. The organizational failure was not just the system's bias. It was the absence of any mechanism for employees to surface what they were seeing on the ground.

The Silence Problem

Post-incident analyses consistently find that someone in the organization saw the problem before it became consequential. In the Rite Aid case, store-level employees were witnessing misidentifications. In the Air Canada chatbot case, the policy error was observable to anyone checking the company's published bereavement fare terms. The barrier was not knowledge — it was the absence of a viable path to raise the concern.

Psychological safety research (Amy Edmondson, Harvard Business School, from 1999 to the present) identifies three barriers to raising concerns in organizations: fear of appearing incompetent, fear of appearing obstructionist, and uncertainty about whether concerns are legitimate. All three apply directly to AI safety: employees worry that raising concerns about an AI tool will seem like technophobia, that it will slow a project, or that they are simply wrong and will look foolish.

Framing Concerns Effectively

Research on effective safety communication across industries — aviation, nuclear power, healthcare — consistently shows that the framing of a concern determines whether it is heard. Several principles apply directly to AI safety contexts.

What Works

Frame concerns as questions about process, not accusations about people
Anchor concerns to specific observable events, not general unease
Propose a check or verification step, not a halt
Connect the concern to a business outcome the audience cares about
Reference documented cases in other organizations

What Doesn't Work

Abstract appeals to AI being "dangerous"
Framing that positions the speaker as the safety authority
Concerns raised after a decision is already publicly committed
Raising concerns without a proposed alternative or mitigation
Repeated escalation without new evidence

The IBM Watson for Oncology Case

Between 2013 and 2018, IBM's Watson for Oncology was deployed at hospitals across Asia, Europe, and the Americas to recommend cancer treatments. In 2018, internal documents obtained by STAT News showed that Watson was recommending treatments that its own clinical advisors had flagged as "unsafe and incorrect." At least one advisory panel at Memorial Sloan Kettering had raised concerns internally as early as 2017. Those concerns did not reach clinical deployment sites until media reporting forced the issue.

The communication failure here was organizational, but individual practitioners at deployment sites could have surfaced concerns sooner using effective framing: "I want to flag a specific recommendation that differs from our standard protocol — can we verify this against our own clinical database before acting on it?" This framing is a check request, not a halt request. It is anchored to a specific observable event and proposes a verification step.

Communication Template

"I noticed [specific observable thing]. Before we [commit/act/deploy], can we [specific verification step]? I want to make sure we don't end up in the situation [documented case from another organization] faced." This template is concrete, proposes a check rather than a stop, and connects to external precedent.
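
The template has four slots, and a concern is not ready to send until all four are filled with specifics. The sketch below makes that explicit as a function with four required arguments. The function name `check_request` and the sample wording are hypothetical, chosen only to illustrate the structure.

```python
def check_request(observation, action, verification, precedent):
    """Fill the check-request template with four specific details."""
    return (
        f"I noticed {observation}. "
        f"Before we {action}, can we {verification}? "
        f"I want to make sure we don't end up in the situation "
        f"{precedent} faced."
    )

# Hypothetical example values.
message = check_request(
    observation="the chatbot quoted a refund policy that is not on our website",
    action="send this reply to the customer",
    verification="confirm the policy against the published terms",
    precedent="Air Canada",
)
print(message)
```

If you cannot supply one of the four arguments, that is useful information: the concern may still be general unease rather than a specific, checkable observation.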

When Concerns Are Not Heard

Not all concerns will be acted on. Understanding what to do when a legitimate concern is dismissed is part of having a safety lens. The practical sequence is: raise the concern with specific framing once; if unaddressed, document that you raised it with a dated record; if the risk is significant and the dismissal persists, escalate to a formal channel (ethics hotline, legal, compliance, or external regulatory body where applicable).

Documentation matters because it establishes that the concern was raised in good faith and at the appropriate time. In post-incident reviews — including the FTC's Rite Aid proceeding — documented internal concerns that were dismissed become legally and organizationally significant. They demonstrate that the system failed, not the individual.
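
The raise, document, escalate sequence can be read as a small decision procedure. The sketch below encodes it so that each state of a concern maps to exactly one next step. The `Concern` fields and the wording of the two fallback steps (what to do once a concern is addressed, and what to do when the risk is not significant) are assumptions added for the example, not part of the sequence described above.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

@dataclass
class Concern:
    summary: str
    raised_on: Optional[date] = None      # raised once, with specific framing
    documented_on: Optional[date] = None  # dated record that it was raised
    addressed: bool = False
    risk_significant: bool = False

def next_step(concern):
    """Return the next action in the raise, document, escalate sequence."""
    if concern.raised_on is None:
        return "raise the concern once, with specific framing"
    if concern.addressed:
        return "no further action needed; keep your notes"  # assumption
    if concern.documented_on is None:
        return "document that you raised it, with a dated record"
    if concern.risk_significant:
        return "escalate to a formal channel"
    # Assumption: lower-risk concerns wait for new evidence.
    return "keep the record and revisit if new evidence appears"

# Hypothetical concern that was raised but not yet documented.
concern = Concern("scores dropping for some zip codes",
                  raised_on=date(2024, 3, 4))
print(next_step(concern))
```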

Core Principle

Your job is not to stop every AI risk. It is to ensure that risks you can see are seen by the people with authority to act on them, framed in a way that makes action possible. That is what the safety lens looks like in practice.

Lesson 3 Quiz

Communicating AI Risk to Colleagues — 5 questions

1. What was the organizational failure the FTC identified in the Rite Aid facial recognition case?

Correct. The FTC's complaint documented two interlocking failures: staff were not trained to understand the system's error rates, and there was no established channel for employees who had concerns about the system's accuracy to raise them.

Not quite. The FTC's complaint focused on two failures: the absence of training about error rates, and the absence of a channel for employees to surface what they were observing. The lesson uses this case to illustrate the organizational silence problem.

2. According to Amy Edmondson's psychological safety research, which barrier to raising concerns is most specific to AI contexts?

Correct. The lesson maps Edmondson's general barriers to the specific AI context: the fear that raising an AI concern will be interpreted as technophobia, and that doing so will slow a project and make the person appear obstructionist.

Not quite. The lesson maps Edmondson's general findings about psychological safety to the specific AI context: employees fear that raising AI concerns will appear technophobic or obstructionist, which is a more specific version of the general "fear of appearing incompetent or obstructionist" barrier.

3. In the IBM Watson for Oncology case, what type of communication would have been most effective for a practitioner who observed a concerning recommendation?

Correct. The lesson illustrates that a check request — specific, observable, proposing verification rather than a halt — is far more likely to be acted on than abstract concerns or halt requests, and maps directly to effective safety communication principles.

Not quite. The lesson recommends the check-request framing: anchored to a specific observation, proposing a verification step rather than a halt. Abstract concerns about AI readiness, immediate regulatory escalation, or halt requests are less effective at the individual level.

4. What is the recommended sequence when a legitimate AI safety concern is raised but dismissed?

Correct. The lesson's recommended sequence is: raise with specific framing once → document the dated exchange → if significant risk persists, escalate through formal channels (ethics, legal, compliance, or regulatory). This establishes that the concern was raised in good faith and at the appropriate time.

Not quite. The lesson recommends: raise once with specific framing, create a dated document record, and if the risk is significant and the dismissal persists, escalate to a formal channel. Repeated informal escalation, passive acceptance, and immediate external contact are all less effective approaches.

5. Why does the lesson recommend connecting AI safety concerns to "a business outcome the audience cares about"?

Correct. Effective communication research consistently shows that framing concerns in terms that resonate with the audience's values and priorities increases the probability of being heard. This is not manipulation — it is competent communication.

Not quite. The recommendation is grounded in communication effectiveness research: concerns framed in terms the audience cares about are more likely to produce action than abstract safety appeals. This applies across industries and is a basic principle of safety communication.

Lab 3: Drafting Your Safety Concern

Practice the check-request communication template

Scenario

You are a marketing analyst at a retail company. Your team has started using an AI-powered customer segmentation tool to automatically assign risk scores to customers for credit promotions. You've noticed that the tool's scores seem to be declining for certain zip codes with high minority populations — scores that would disqualify them from promotional offers. Your manager is enthusiastic about the tool's efficiency gains. No one has reviewed whether the scoring model complies with fair lending guidance.

Draft a communication to your manager using the check-request template from Lesson 3. Practice with the AI assistant below. Aim for at least three exchanges to refine your message. This lab is complete after three substantive exchanges.

Starter prompt: "Help me draft a concern about our customer segmentation AI using the check-request template."

Communication Coach

I'll help you draft a concern communication that is specific, proposes a check rather than a halt, connects to a business outcome, and references relevant precedent. Let's build it step by step. Start by telling me what specific observation you want to anchor your concern to — what exactly have you noticed about the scoring tool?

Module 8 · Lesson 4

Sustaining the Safety Lens Over Time

The hardest part of AI safety is not learning the concepts — it is maintaining the practice as tools evolve and pressure mounts.

How do you keep a safety lens active when the technology, your organization, and your own habits are all in motion?

In 2022, researchers at Stanford published findings that developers using an AI code assistant (built on an OpenAI Codex model, the model family behind GitHub Copilot) were significantly more likely to introduce security vulnerabilities into their code than developers working without AI assistance, and that the assisted developers were also more likely to believe their code was secure. The peer-reviewed version of the study, published in 2023, reported the same pattern: AI-assisted developers produced more insecure code while believing they were working more carefully. The safety lens had not just weakened. It had inverted: the tool generated a false sense of security that displaced the human's own critical review.

The Confidence Inversion Problem

The Stanford findings illustrate what researchers call "automation bias" in a particularly dangerous form: the AI's output not only substitutes for human judgment but actively suppresses the human's uncertainty. When you are uncertain, you double-check. When AI output makes you feel certain, you don't. This is not unique to coding. Studies of radiologists using AI-assisted diagnostics, financial analysts using AI-generated reports, and content moderators using AI flagging systems all show similar patterns.

A sustained safety lens requires active countermeasures against confidence inversion. The most effective are: deliberate periodic skepticism exercises (intentionally looking for what the AI got wrong, even when the output looks correct); tracking AI error instances in your own workflow; and regular recalibration of your confidence in specific tools based on observed error rates.
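
Tracking error instances only helps if the record turns into a number you can compare against your gut confidence. The sketch below computes an observed error rate per tool from a log of outputs you actually checked. The log format, the tool names, and the function name `observed_error_rates` are hypothetical.

```python
from collections import defaultdict

def observed_error_rates(checks):
    """Compute the share of checked outputs that were wrong, per tool.

    `checks` is an iterable of (tool_name, was_wrong) pairs, one per
    AI output you actually verified.
    """
    totals = defaultdict(int)
    errors = defaultdict(int)
    for tool, was_wrong in checks:
        totals[tool] += 1
        if was_wrong:
            errors[tool] += 1
    return {tool: errors[tool] / totals[tool] for tool in totals}

# Hypothetical log of verified outputs.
checks = [
    ("code assistant", False), ("code assistant", True),
    ("code assistant", False), ("code assistant", True),
    ("summarizer", False), ("summarizer", False),
    ("summarizer", False), ("summarizer", True),
]

for tool, rate in observed_error_rates(checks).items():
    print(f"{tool}: {rate:.0%} of checked outputs had an error")
```

The rate only reflects outputs you checked, so it says nothing about the ones you accepted without review. That blind spot is exactly what the periodic skepticism exercise is meant to shrink.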

Staying Current Without Being Overwhelmed

AI capabilities, deployment contexts, and documented failure modes evolve rapidly. A safety lens calibrated only on 2023 incidents may miss entirely new risk vectors that emerge in 2024 and 2025. But practitioners cannot monitor every development in a field that produces thousands of papers and incidents per year.

A sustainable approach involves three tiers of attention: a small set of authoritative sources reviewed regularly (NIST AI Risk Management Framework updates, the AI Incident Database, your organization's AI governance bulletins); a personal trigger list of the specific capabilities you use most and their known failure modes; and annual recalibration of your personal audit framework to check for new tools and new risk categories that have entered your workflow.

Tier 1: Regular Sources (Monthly)

NIST AI Risk Management Framework and updates
AI Incident Database (incidentdatabase.ai)
Your organization's AI governance or legal bulletins
One practitioner newsletter in your domain

Tier 2: Personal Trigger List (Ongoing)

Known failure modes for each AI tool you use regularly
Recent incidents involving tools similar to yours
Any new capabilities added to your tools since last review
Changes in your own use patterns

Tier 3: Audit Recalibration (Annual)

Your personal audit framework, revisited against your current workflow
New tools that have entered your workflow
New risk categories that have entered your workflow

Organizational Drift and the Individual Response

Between 2017 and 2023, the Amazon Rekognition facial recognition system was deployed by at least two dozen law enforcement agencies in the United States. Civil liberties organizations documented significant error rates, particularly for darker-skinned individuals. Despite public reporting and internal concerns, deployment continued to expand. Individual contractors and procurement officers who had the technical knowledge to surface those concerns rarely had the organizational standing to stop deployment.

The lesson for the individual practitioner is not that individual action is futile. It is that the form of individual action matters. A practitioner who documents concerns, raises them through appropriate channels, and establishes a record of having done so is in a fundamentally different position from one who notices problems and says nothing. Organizational drift toward unsafe AI practices is more often stopped, slowed, or mitigated by an accumulation of documented individual concerns reaching decision-makers than by a single heroic intervention.

Long-Term Posture

The safety lens is not a fixed skill acquired once. It requires continuous recalibration as tools change, as organizational contexts shift, and as your own relationship with AI tools evolves. The practitioners who maintain effective safety lenses over years share one habit: they treat each new AI deployment as a fresh audit trigger, not as a continuation of existing safety coverage.

Closing: What You Now Have

By completing this module, you have built the four components of a personal safety practice: a method for auditing your own AI workflows across four dimensions; a set of five durable safety habits anchored to specific implementation intentions; a framework for raising concerns effectively when you observe risk; and a sustainable approach to staying current as the technology and its failure modes evolve.

These are not theoretical tools. They are the specific practices that would have prevented or substantially mitigated every incident examined in this course — from the Samsung data leak to the Mata v. Avianca citation fabrication to the Rite Aid surveillance overreach. The technology will keep changing. The underlying structure of the risks will not. A practitioner who has internalized a genuine safety lens will continue to recognize risk in new forms, because they understand the patterns behind the incidents — not just the incidents themselves.

Final Reflection

The most dangerous moment in AI safety is not the one where the risk is obvious. It is the ordinary Tuesday when the AI tool is working well, you are busy, and nothing seems wrong. That is exactly when habits matter most, because conscious awareness is not available but habits keep running.

Lesson 4 Quiz

Sustaining the Safety Lens Over Time — 5 questions

1. What specific finding did Stanford researchers report about GitHub Copilot users in 2022?

Correct. The Stanford study documented a confidence inversion: Copilot users produced more insecure code while simultaneously believing they were working more carefully and securely. The AI's output displaced the human's own critical review.

Not quite. The Stanford finding was a confidence inversion: Copilot users introduced more security vulnerabilities AND were more confident their code was secure. The AI generated a false sense of security that suppressed the human's own critical checking.

2. What is "confidence inversion" in the context of AI-assisted work?

Correct. Confidence inversion describes the mechanism by which AI-generated outputs make users feel certain, thereby suppressing the uncertainty-driven double-checking that would catch errors. It is a specific and dangerous form of automation bias.

Not quite. Confidence inversion refers to the mechanism where AI output makes the human feel certain, which stops them from double-checking — at exactly the moment when checking is most needed. It is a form of automation bias that actively displaces the human's own critical review.

3. What does the lesson recommend as an effective countermeasure against confidence inversion?

Correct. Deliberate skepticism exercises — systematically looking for errors even when the output appears correct — directly counteract confidence inversion by ensuring that uncertainty-driven checking happens as a habit, not only when something feels wrong.

Not quite. The lesson recommends deliberate periodic skepticism exercises: intentionally reviewing AI output for errors even when it looks correct. This builds the habit of checking independently of how confident the AI output makes you feel.

4. According to the lesson, what does the Amazon Rekognition deployment pattern illustrate about organizational drift?

Correct. The lesson uses the Rekognition case to show that individual action matters — not as single heroic intervention, but as documented concerns that accumulate to reach decision-makers. A practitioner who documents and raises concerns is in a fundamentally different position from one who says nothing.

Not quite. The lesson argues that individual action does matter — but that the form of action is crucial. Documenting concerns and raising them through appropriate channels contributes to the accumulation of pressure that eventually reaches decision-makers. Individual inaction means the accumulation cannot happen.

5. What does the lesson identify as the most dangerous moment for AI safety practice?

Correct. The lesson's closing argument is that safety failures almost never happen when practitioners are alert and deliberate. They happen during ordinary workflow when everything appears to be going smoothly — exactly when awareness is unavailable but habits continue to run.

Not quite. The lesson identifies the ordinary, uneventful moment — when the tool works well and nothing seems wrong — as the most dangerous, because that is when conscious awareness is not available and only habit remains. Safety practice must be habitual, not just intentional.

Lab 4: Your 90-Day Safety Plan

Build a personal, sustained AI safety practice that survives the ordinary Tuesday

Your Task

This final lab asks you to synthesize everything from Module 8: the four audit dimensions, the five safety habits, the concern-communication framework, and the sustainability tier system. You will work with the AI assistant to draft a 90-day personal safety plan that includes: one monthly audit trigger, at least two implementation intentions, one concern-communication template adapted to your context, and a personal Tier 1 source list.

The plan should be specific enough that you could hand it to a colleague and they would understand exactly what you intend to do. Complete at least three exchanges to finish the lab.

Starter prompt: "Help me draft the first section of my 90-day AI safety plan — starting with the monthly audit trigger."
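One way to keep the plan specific enough to hand to a colleague is to write it down as structured data and check it against the four required components. The sketch below is only an illustration under that assumption: the class names, field names, and example entries are hypothetical, not part of the course materials.

```python
from dataclasses import dataclass, field


@dataclass
class ImplementationIntention:
    """A 'when X, I will Y' pairing (field names are illustrative)."""
    trigger: str  # the specific situation that cues the habit
    action: str   # the specific response you commit to


@dataclass
class SafetyPlan:
    """Hypothetical container for the four plan components named in the lab."""
    monthly_audit_trigger: str
    implementation_intentions: list[ImplementationIntention] = field(default_factory=list)
    concern_template: str = ""
    tier1_sources: list[str] = field(default_factory=list)

    def missing_components(self) -> list[str]:
        """Return the lab requirements this plan does not yet meet."""
        gaps = []
        if not self.monthly_audit_trigger.strip():
            gaps.append("monthly audit trigger")
        if len(self.implementation_intentions) < 2:
            gaps.append("at least two implementation intentions")
        if not self.concern_template.strip():
            gaps.append("concern-communication template")
        if not self.tier1_sources:
            gaps.append("personal Tier 1 source list")
        return gaps


# A partial draft; all entries are invented for illustration.
draft = SafetyPlan(
    monthly_audit_trigger="First Monday of the month, right after the team stand-up",
    implementation_intentions=[
        ImplementationIntention(
            trigger="When I am about to send AI-drafted text to a client",
            action="I will verify every factual claim against a primary source",
        )
    ],
    tier1_sources=["NIST AI Risk Management Framework", "AI Incident Database"],
)
print(draft.missing_components())
# prints ['at least two implementation intentions', 'concern-communication template']
```

A checklist like this does not make the plan good, but it does make gaps visible before you hand the plan to someone else.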

Safety Plan Builder

Let's build your 90-day AI safety plan together. A strong plan has four components: a monthly audit trigger, implementation intentions, a concern-communication template, and a personal Tier 1 source list. We'll go section by section. Tell me a bit about your role and AI tool usage, and let's start with your monthly audit trigger — what would be a realistic and specific moment each month when you commit to doing the four-dimension review?

Module 8 Test

Applying a Safety Lens to Your Own AI Use — 15 questions · Pass at 80%

1. What was the fundamental cause of the Samsung ChatGPT data leak in April 2023?

Correct. The Samsung incident was a visibility gap: productive engineers used an available tool without any policy flagging the risk. There was no malice, no external breach, and no technical failure beyond the absence of data governance.

Not quite. The Samsung incident was caused by a policy and visibility gap: engineers used ChatGPT for legitimate work purposes, with no awareness of the data retention risk and no policy restricting that use.

2. Which audit dimension asks whether you are using personal accounts to access AI tools for work tasks?

Correct. Tool Accountability covers which tools you use, whether they are approved, their data retention policies, and specifically whether personal accounts are being used for work tasks — a common shadow-IT risk vector.

Not quite. Tool Accountability is the dimension that asks about which tools you are using, whether they are approved, and whether personal accounts are being used for work purposes.

3. According to Lisanne Bainbridge's "Ironies of Automation," what happens to human monitoring skills when reliable automation handles routine tasks?

Correct. Bainbridge's central irony: the more reliably automation performs routine tasks, the less practice the human gets at those tasks — meaning that when automation fails, the human who most needs to intervene has the least current ability to do so.

Not quite. Bainbridge's finding was that human skills atrophy in the areas that automation handles. This directly predicts why AI safety habits must be actively maintained rather than assumed.

4. In Mata v. Avianca (May 2023), attorney Steven Schwartz submitted a brief containing six fabricated case citations. Which safety habit would have directly prevented this error?

Correct. The source test — checking that factual claims and citations can be independently verified — would have directly caught the fabricated cases before submission. ChatGPT generated plausible-looking citations that did not exist, and no verification was performed.

Not quite. The source test is the directly applicable habit: verifying that cited cases actually exist before submitting them to a court. The other habits address different failure points that were not the primary issue in this case.

5. The British Columbia tribunal's February 2024 ruling against Air Canada established which principle?

Correct. Air Canada's defense that the chatbot was "a separate legal entity" was rejected. The tribunal established that organizations bear responsibility for what their AI systems tell customers, regardless of whether the AI hallucinated the information.

Not quite. The tribunal rejected Air Canada's attempt to disclaim responsibility for its chatbot's outputs. The ruling established organizational accountability for AI-generated commitments — a landmark in AI liability.

6. Peter Gollwitzer's research on implementation intentions shows that which framing most reliably creates durable behavioral habits?

Correct. Gollwitzer's implementation intention format — specifying an exact "when X" trigger and an exact "I will Y" action — dramatically outperforms general intentions. The specificity of the trigger is what makes the behavior automatic rather than deliberate.

Not quite. Gollwitzer's research shows that the "when X, I will do Y" format — with a specific situational trigger and a specific action — produces far more reliable habit formation than general intentions or aspirational statements.

7. The FTC's December 2023 complaint against Rite Aid identified which organizational failure alongside the AI system's bias?

Correct. The FTC documented two interlocking failures: staff were not trained to recognize the system's error rates, and there was no established mechanism for employees who noticed problems to raise them. This is the organizational silence problem the lesson addresses.

Not quite. The FTC's complaint documented both a technical bias failure and an organizational failure: the absence of training about error rates and the absence of a channel for employees to raise concerns about what they were observing.

8. What does the check-request communication template propose instead of asking an organization to halt an AI deployment?

Correct. The template proposes a check — a specific verification step — rather than a halt, because checks are far more likely to be heard and acted on. The concern is anchored to something specific and observable, and connected to a business outcome the audience cares about.

Not quite. The check-request template asks for a specific verification step — "before we act, can we check X?" — anchored to an observable event. Halt requests, anonymous tips, and formal objections are generally less effective at the individual level.

9. What did internal documents obtained by STAT News reveal about IBM Watson for Oncology in 2018?

Correct. The STAT News investigation revealed that Watson was generating treatment recommendations that its own internal clinical advisory team had flagged as "unsafe and incorrect" — and that these concerns had not reached clinical deployment sites.

Not quite. STAT News' 2018 reporting revealed that Watson's own clinical advisors had flagged certain recommendations as unsafe and incorrect, but those concerns were not communicated to hospitals using the system.

10. What does the Stanford GitHub Copilot research (2022) identify as the mechanism behind increased security vulnerabilities in AI-assisted code?

Correct. The Stanford finding was specifically about confidence inversion — the AI's polished output generated a subjective sense of security that displaced the developer's own critical review. This is a particularly dangerous form of automation bias.

Not quite. The Stanford research attributed the increased vulnerabilities to confidence inversion: Copilot users felt more confident their code was secure, which suppressed the checking behavior that would have caught the vulnerabilities.

11. What is the recommended action when a documented AI safety concern is dismissed by your manager?

Correct. The module's recommended sequence is: raise once with specific framing, create a dated record, and if the risk is significant, escalate through formal channels. Documentation establishes that the concern was raised in good faith — which becomes significant in post-incident reviews.

Not quite. The module recommends: raise once with specific framing, document with a date, then if the risk is significant and persists, escalate to a formal channel (ethics, legal, compliance, or external regulator). Passive acceptance and repeated informal escalation are both less effective.

12. Which of the following is an effective countermeasure against the atrophy of human monitoring skills predicted by Bainbridge's research?

Correct. Deliberate skepticism exercises counteract skill atrophy by ensuring that critical review of AI outputs happens as a habit — not only when something feels wrong. The exercise maintains the human's monitoring capability even as AI handles more routine work.

Not quite. The lesson recommends deliberate skepticism exercises: actively looking for what the AI got wrong, even when the output looks correct. This maintains the human monitoring skill that automation otherwise allows to atrophy.

13. A sustainable approach to staying current with AI safety developments uses three tiers of attention. Which of the following correctly describes Tier 1?

Correct. Tier 1 is the regular monitoring layer: authoritative sources reviewed monthly, including the NIST AI Risk Management Framework, the AI Incident Database, and organizational AI governance bulletins. It provides broad coverage without requiring you to monitor everything.

Not quite. Tier 1 consists of authoritative sources reviewed monthly: NIST AI RMF updates, the AI Incident Database, and organizational governance bulletins. The personal failure-mode list is Tier 2.

14. The lesson's analysis of Amazon Rekognition deployment (2017–2023) most directly supports which conclusion about individual safety action?

Correct. The Rekognition case supports the argument that individual concerns matter — not as heroic single interventions, but as documented accumulations that eventually reach decision-makers. Individual inaction prevents that accumulation from occurring.

Not quite. The lesson uses Rekognition to argue that individual action matters, but that its effect is cumulative rather than heroic. Documented concerns raised through appropriate channels accumulate over time and create the pressure that eventually produces change.

15. What does the module identify as the most important habit for practitioners who can develop only one AI safety behavior?

Correct. The module explicitly names the commit check as the highest-priority single habit, because it intervenes at the exact point where error becomes consequence — after which the damage is much harder to undo. Both the Air Canada and Mata v. Avianca cases were commit check failures.

Not quite. The module names the commit check as the single most important habit if only one can be developed. It sits at the point of maximum consequence: just before AI output is used to make a commitment that others will rely on.