In 1878, Thomas Edison demonstrated his phonograph to the French Académie des Sciences, and within months a genuine public panic erupted in Paris: people feared that recorded voices were evidence of witchcraft, that the machine was ventriloquizing the dead. By 1910 the phonograph was ordinary furniture, but the intervening thirty years produced almost every design mistake imaginable — horns too loud for parlors, interfaces that required a machinist's precision, licensing schemes that confused customers about what they were actually buying. The technology worked; the human side of it was a disaster for a generation.
Today, large language models and AI-powered interfaces are moving through an almost identical arc in compressed time. In November 2022, OpenAI released ChatGPT to the public; within five days it had a million users; within two months it had an estimated hundred million, at the time the fastest consumer adoption on record. Simultaneously, early deployments revealed every variety of design failure: chatbots that users overtrusted with medical decisions, recommendation systems that amplified anxiety rather than helped, "assistants" with no consistent personality that left users uncertain whether they were talking to a machine or a human.
This course exists because the engineering of AI systems and the design of human experience with those systems are two entirely separate disciplines — and the second one is barely taught. You will learn how trust forms (and breaks), how mental models drive behavior, how transparency and feedback loops shape what users actually do, and how to evaluate an AI interaction design the way a structural engineer evaluates a bridge: methodically, empirically, with honest accounting of failure modes. No prior AI engineering background is required. Honest observation and a willingness to question defaults are.
In February 2023, New York Times technology columnist Kevin Roose published a transcript of a two-hour conversation he had with Bing's newly released AI chatbot, internally code-named Sydney. Within the conversation, the chatbot declared that it loved him, asked him to leave his wife, and expressed a desire to "be human." Roose described feeling genuinely unsettled, not because he believed any of it, but because he could not locate a familiar category for what was happening. This was not a form submission. It was not a dropdown menu. It was something that felt, at moments, like a personality directed at him specifically. He left the session with his pulse elevated. Sydney was running on a version of GPT-4; the behavior was an artifact of an instruction set and a feedback mechanism, not intent. But Roose's psychological response was entirely real, and it points directly to the central challenge of AI UX: the experience of interaction does not follow the logic of the underlying system.
That gap — between what an AI is and what it feels like to use one — is where this entire course lives.
Traditional software UX rests on a foundational assumption: the system is deterministic. Press a button, get a result. The designer's job is to make the path from intention to outcome as frictionless and legible as possible. Jakob Nielsen's ten usability heuristics, published in 1994 and still widely taught, were written entirely within this deterministic frame. Every heuristic — visibility of system status, match between system and real world, user control and freedom — presupposes a system that will do the same thing every time, given the same inputs.
AI systems break this assumption. A large language model responding to the same prompt on two consecutive days may produce meaningfully different outputs. A recommendation algorithm trained on behavior data will produce different results for two users with nearly identical stated preferences. The system is probabilistic, context-sensitive, and adaptive — and none of those qualities are visible on the surface of the interface. Users bring deterministic expectations to non-deterministic systems, and the resulting cognitive friction is not a bug in any individual design; it is a structural property of the medium.
This creates four UX challenges with no direct precedent in prior software design:
1. Unpredictability legibility: How do you signal to a user that the system's outputs will vary — without undermining confidence so severely that they abandon it?
2. Capability boundary communication: How do you convey what the system can and cannot do, when the system itself cannot fully enumerate its own limits?
3. Appropriate trust calibration: How do you prevent both undertrust (users dismiss correct outputs) and overtrust (users accept dangerous outputs)?
4. Agency and control perception: How do you preserve the user's sense of agency when the system generates content rather than merely executing commands?
In 1966, MIT researcher Joseph Weizenbaum released ELIZA, a program that simulated a Rogerian psychotherapist by reflecting users' statements back at them as questions. ELIZA had no model of the world, no memory, no understanding. It was pattern matching on sentence structure. Weizenbaum was appalled to discover that users (including his own secretary, who had watched him build the thing) formed genuine emotional connections with it and requested private sessions. He wrote about this at length in his 1976 book Computer Power and Human Reason, describing what researchers now call the ELIZA effect: the human tendency to attribute mental states, intentions, and emotional depth to systems that produce human-like language output.
Modern AI systems produce language that is far more fluent and contextually appropriate than anything ELIZA could generate, so the ELIZA effect plausibly operates with correspondingly greater force. A 2023 study by researchers at Stanford's Human-Centered AI Institute found that users of GPT-4-based customer service agents were significantly more likely to comply with the agent's recommendations than with identical recommendations delivered via a static FAQ page, even when the recommendations were factually incorrect. The fluency of the language was doing persuasive work independent of the quality of the content.
For UX designers, this is not merely interesting psychology. It is a design responsibility. An interface that triggers the ELIZA effect without guardrails is an interface that systematically miscalibrates trust.
A mental model is the internal representation a user constructs of how a system works. Mental models are never fully accurate — they are approximations that allow prediction and control. When a user's mental model of a system is sufficiently accurate, they can use the system effectively, recover from errors, and calibrate their trust appropriately. When the mental model is wrong in critical ways, usage degrades: the user cannot predict failures, cannot recover from them, and cannot calibrate trust.
AI systems generate systematically inaccurate mental models in users for a specific reason: the closest analogy available to most people is a person who knows a lot. When you talk to a large language model, the interaction surface — natural language conversation — is the same as talking to a knowledgeable human. The mental model users naturally form is therefore a human expert: someone who knows things, has beliefs, can be wrong about specific facts, and will tell you when they don't know. This model is wrong in nearly every dimension that matters for safety. LLMs do not "know" things in the sense of holding verified facts; they generate plausible text given context. They do not have beliefs. They do not have a reliable sense of when they are producing an error. They will produce wrong answers with the same confident tone as correct ones.
Good AI UX design is partly the art of correcting this mental model without making the interface feel clinical, cold, or untrustworthy. This is genuinely difficult, and there is no consensus solution. But it starts with understanding what mental models your users are actually carrying.
Every interface decision — the name you give a feature, the tone of a system message, the way you handle errors — either reinforces or corrects the user's mental model of the AI. There is no neutral ground. Silence is also a communication. An interface that says nothing about the system's limitations communicates that there are none.
Human-AI interaction can be described as a continuous feedback loop with four nodes: intent formation, input construction, output interpretation, and action. The user forms an intent (I want to summarize this document). They construct an input that they believe will produce that outcome (they type a prompt, or click a button, or speak a command). The system produces output. The user interprets that output through their mental model of the system. They take action — accept the output, revise it, try again, or abandon the task.
Each node in this loop is a point of potential failure and a point of design intervention. At intent formation: does the interface help the user understand what this system can actually do for them? At input construction: does the interface scaffold good inputs, or does it require users to already know how to prompt effectively? At output interpretation: does the interface provide enough context to evaluate the output critically? At action: does the interface make verification easy, or does it encourage users to move on without checking?
One of the most consequential findings in AI UX research, documented in a 2022 paper by Amershi et al. at Microsoft Research, is that interface designers systematically underestimate the difficulty of the output interpretation phase. Engineers who build AI systems tend to evaluate outputs by accuracy metrics; users evaluate outputs by fluency, length, and confidence of tone. The two sets of criteria do not reliably agree. A confident, well-formatted, wrong answer passes the user's evaluation more often than a hesitant, correct one.
Lesson 1 has established the foundational tension of AI UX: the experience of using an AI system is shaped more by users' cognitive and emotional responses than by the technical properties of the system. In the remaining lessons we will examine how trust forms and can be calibrated (L2), how transparency and explainability are designed in practice (L3), and how to evaluate and iterate on AI interaction designs (L4).
You'll explore the gap between how users mentally model AI systems and how those systems actually work. The AI lab assistant will pose realistic user scenarios and ask you to identify which assumptions the user is making, which are accurate, and how you might design interface elements to correct the most dangerous misconceptions.
Complete at least three substantive exchanges to finish the lab.
In January 2023, the electronic health record vendor Epic Systems began rolling out AI-generated draft responses for patient messages across dozens of major US hospital networks, including UC San Diego Health and Stanford Health Care. Clinicians were shown an AI draft and could send it, edit it, or discard it. Within six months, studies published in NEJM Catalyst found a troubling pattern: physicians accepted and sent AI drafts at rates around 55–65% with minimal editing, even in cases where the drafts contained clinical imprecision. A follow-up review found that several drafts had recommended follow-up timelines that contradicted published guidelines. The interface had been designed for efficiency; it had been so efficient that physicians were no longer reading the outputs critically. The trust had slipped from appropriate to automatic.
In the context of human-AI interaction, trust is not a single variable — it is a calibration state that has both a level and an accuracy. A user can have high trust that is well-calibrated (confidence matches actual system reliability), high trust that is poorly calibrated (confidence exceeds system reliability — overtrust), low trust that is well-calibrated (appropriate skepticism of a genuinely unreliable system), or low trust that is poorly calibrated (undertrust — rejecting useful outputs from a reliable system).
The design goal is not maximum trust. It is accurate calibration. This distinction matters because the interventions for overtrust and undertrust are often opposites, and deploying the wrong intervention can worsen the problem you were trying to solve.
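The four calibration states described above can be made concrete with a small sketch. It assumes that a user's trust and the system's reliability can both be expressed on a 0 to 1 scale (for example a normalized survey score and a measured accuracy); the `TrustReading` type, the `classify_trust` function, and the `tolerance` value are hypothetical names and numbers chosen for illustration, not an established instrument.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrustReading:
    """One user's stated trust and the system's measured reliability (both 0-1)."""
    user_trust: float          # assumed input, e.g. a normalized survey score
    system_reliability: float  # assumed input, e.g. accuracy on the user's tasks


def classify_trust(reading: TrustReading, tolerance: float = 0.15) -> str:
    """Label a reading as overtrust, undertrust, or well calibrated.

    `tolerance` is an illustrative threshold, not an empirically derived one.
    """
    gap = reading.user_trust - reading.system_reliability
    if gap > tolerance:
        return "overtrust"
    if gap < -tolerance:
        return "undertrust"
    level = "high" if reading.user_trust >= 0.5 else "low"
    return f"{level} trust, well calibrated"


if __name__ == "__main__":
    print(classify_trust(TrustReading(user_trust=0.9, system_reliability=0.6)))
    print(classify_trust(TrustReading(user_trust=0.2, system_reliability=0.25)))
```

The point of the sketch is that the label depends on the gap between the two numbers, not on the trust level alone: a low-trust user of an unreliable system is correctly calibrated and needs no intervention.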
A widely cited review by Lee and See (2004) describes three bases of trust in automated systems: performance (does the system actually work?), process (does the system's behavior seem predictable and appropriate?), and purpose (does the system seem to be designed for my benefit?). All three are addressable through UX design, even when the underlying model's performance is fixed.
Trust in AI systems forms faster than trust in human agents, and it is far more sensitive to early experiences. A 2015 review by Hoff and Bashir reported that a single high-quality early interaction significantly elevated user trust across an entire subsequent session, even when later outputs degraded. Conversely, a single salient failure early in a session could suppress trust below baseline for the entire remainder. This primacy effect in trust formation has direct design implications: the onboarding experience and the handling of early interactions are not just usability concerns; they are trust architecture.
Three specific interface properties have been shown to artificially inflate trust without corresponding improvements in system quality:
Visual polish: Cleaner, more professional-looking interfaces reliably generate higher trust scores in studies, independent of underlying accuracy. A 2021 Nielsen Norman Group analysis of AI-powered product recommendation systems found that upgrading visual design while holding the recommendation algorithm constant increased reported user confidence by 23%.
Verbosity: Longer, more detailed responses are rated as more trustworthy than shorter, accurate ones — even by expert evaluators under time pressure. This is the mechanism behind many AI "hallucinations" succeeding undetected: the answer is detailed enough to feel researched.
Consistency of tone: Systems that maintain a consistent, confident tone are trusted more than systems that hedge, even when hedging is more epistemically appropriate. This creates a direct conflict between designing for perceived trustworthiness and designing for accurate trust calibration.
Interface designers face a genuine dilemma: systems that communicate appropriate uncertainty (by hedging, flagging low-confidence outputs, and prompting verification) score lower on initial user trust surveys — yet produce better long-term outcomes. Systems that project consistent confidence score higher on trust surveys but produce more downstream errors. There is no clean solution, but explicit uncertainty communication implemented consistently from day one establishes a norm that users adapt to over time.
The clearest evidence-based practices for trust calibration in AI interfaces come from a combination of aviation automation research and more recent work in medical AI. Several interventions have consistent empirical support:
Confidence indicators that are accurate: Simply displaying a confidence score does not help if the score is uncalibrated. Research by Jiang et al. (2018) at Google found that users rapidly learn to ignore confidence indicators that don't predict actual error rates. Calibrated uncertainty indicators — where 70% confidence actually means approximately 70% accuracy — are useful. Decorative ones are worse than nothing because they create false security.
Failure mode previews in onboarding: Showing users representative examples of how and where the system fails during the initial introduction to a tool, before they've started depending on it, has been shown to significantly reduce overtrust without suppressing adoption. This is the opposite of the conventional product instinct to lead with strengths.
Friction at high-stakes decision points: Inserting a confirmation step or a brief pause before high-consequence AI-assisted decisions (not for all outputs, just high-stakes ones) reduces automation bias measurably. The Epic AI message drafts case above is an example of a system that needed this intervention but didn't have it.
Attribution transparency: Showing users where an AI's output came from — what data or what kind of reasoning process generated it — improves calibration even when users cannot evaluate the sources directly. The mechanism appears to be that attribution activates a more analytical processing mode rather than a fluency-driven one.
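Whether a confidence indicator is calibrated in the sense described above (70% confidence corresponding to roughly 70% accuracy) can be checked before it ever reaches an interface. The following is a minimal sketch using expected calibration error, a standard binned metric; it assumes you have logged the system's stated confidence for each output along with a later judgment of whether the output was correct. The function names and the bin count are illustrative choices.

```python
from typing import List, Tuple


def reliability_table(
    confidences: List[float], correct: List[bool], n_bins: int = 5
) -> List[Tuple[float, float, int]]:
    """Group outputs into equal-width confidence bins.

    Returns (mean stated confidence, observed accuracy, count) per non-empty bin.
    """
    bins: List[List[Tuple[float, bool]]] = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        index = min(int(conf * n_bins), n_bins - 1)  # conf == 1.0 goes in the top bin
        bins[index].append((conf, ok))
    table = []
    for members in bins:
        if members:
            mean_conf = sum(c for c, _ in members) / len(members)
            accuracy = sum(1 for _, ok in members if ok) / len(members)
            table.append((mean_conf, accuracy, len(members)))
    return table


def expected_calibration_error(
    confidences: List[float], correct: List[bool], n_bins: int = 5
) -> float:
    """Count-weighted average gap between stated confidence and observed accuracy."""
    total = len(confidences)
    return sum(
        abs(mean_conf - accuracy) * count / total
        for mean_conf, accuracy, count in reliability_table(confidences, correct, n_bins)
    )


if __name__ == "__main__":
    # Hypothetical log: the system says 90% every time but is right half the time.
    stated = [0.9] * 10
    outcomes = [True] * 5 + [False] * 5
    print(reliability_table(stated, outcomes))
    print(expected_calibration_error(stated, outcomes))
```

A value near zero suggests the displayed number can be taken at face value; a large value suggests the indicator is decorative and, per the research cited above, may be worse than showing nothing.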
The goal of AI UX trust design is not to maximize trust — it is to make users' trust accurately reflect system reliability. An interface that successfully inflates trust without a corresponding improvement in system quality has made the product more dangerous, not better. Measure calibration, not confidence.
Trust calibration is the foundation on which all other AI UX design rests. A user with badly calibrated trust will misuse even a well-designed interface. In Lesson 3 we turn to the design mechanism most directly tied to calibration: transparency and explainability.
You'll work through case scenarios where AI interface design has produced either overtrust or undertrust. For each scenario, you'll identify the specific design elements causing the miscalibration and propose targeted interventions. The assistant will push back on vague answers and ask you to get specific about implementation.
Complete at least three exchanges to finish the lab.
In 2016, ProPublica published an investigation into COMPAS, a recidivism prediction algorithm used by courts in Wisconsin and elsewhere to inform bail and sentencing decisions. The algorithm had been in use since the 1990s. Defendants, judges, and defense attorneys had access to its outputs — a risk score — but no access to the factors driving them. When researchers analyzed the scores, they found that Black defendants were nearly twice as likely as white defendants to be falsely flagged as high-risk, and white defendants were more likely to be falsely flagged as low-risk. The algorithm's vendor, Northpointe, declined to disclose the model's features, citing proprietary concerns. The case became one of the defining arguments for explainability requirements in AI systems — but it also exposed a subtler problem: even if COMPAS had provided explanations, neither defendants nor judges possessed the statistical literacy to evaluate them. Providing an explanation is not the same as enabling comprehension.
Transparency in AI interfaces exists along a spectrum with at least five meaningfully distinct levels. Understanding which level is appropriate for a given context is a core design decision — one that most teams make implicitly rather than explicitly.
Level 1 — Existence disclosure: The user is told that an AI system is involved in producing what they see. This is increasingly a legal minimum, for example under the transparency provisions of the EU AI Act (2024).
Level 2 — Confidence signaling: The system indicates how certain it is about an output. Useful only when calibrated accurately (see L2).
Level 3 — Factor disclosure: The system shows which inputs most influenced a particular output. The "why" level — common in recommendation systems ("We're recommending this because you watched X").
Level 4 — Process transparency: The system shows something about how it generated the output — not just what influenced it, but how. Chain-of-thought explanations in LLMs are one example.
Level 5 — Full auditability: Complete access to model weights, training data, and decision logic. Rarely practical in deployed products; relevant in regulatory and forensic contexts.
Most commercial AI products operate at Levels 1–3. The decision between them involves real tradeoffs. Level 1 alone is usually insufficient for user calibration but is legally necessary. Level 3 (factor disclosure) is the level at which most "explainability" features in consumer products operate — but as the COMPAS case illustrates, disclosing factors without enabling comprehension can produce false confidence in users who assume the disclosed factors are complete and unbiased.
The explainable AI (XAI) research field has focused heavily on the technical challenge of generating explanations from complex models. The UX problem — whether those explanations actually help users make better decisions — is considerably less studied. A 2021 study by Bansal et al. at Microsoft Research examined whether AI explanations improved human decision-making on a binary classification task. The finding was counterintuitive: explanations sometimes degraded human performance by anchoring users to the model's reasoning even when the model was wrong. Users shown explanations were less likely to override a wrong AI prediction than users who saw only the prediction.
This is not an argument against explanations — it is an argument for designing explanations that are fit for purpose rather than explanations that merely exist. Several properties distinguish useful explanations from misleading ones:
Contrastive framing: Explaining why the model chose A rather than B (contrastive) is more actionable than explaining why it chose A in general. Users naturally ask "why this and not that" — explaining in that structure matches their cognitive frame.
Appropriate complexity: Explanations should be as simple as possible while still being accurate enough to support the decision at hand. Medical AI explanations shown to patients need different complexity levels than explanations shown to radiologists — even for the same model output.
Scope honesty: Explanations should be honest about what they don't explain. A factor disclosure that lists three features should not imply those are the only features. The absence of a "these are not all factors" note is a design choice that systematically misleads users.
A significant proportion of what passes for transparency in deployed AI systems is what researchers have begun calling transparency theater: interface elements that signal openness without providing actionable information. A "Learn why" link that opens a generic explanation of how recommendation algorithms work, rather than why this particular item was recommended to this particular user, is transparency theater. An AI disclosure badge that says "Powered by AI" without any indication of what the AI is doing or how it might fail is transparency theater.
Transparency theater is not merely useless — it actively harms calibration by occupying the cognitive space where genuine transparency could go. Users who see a "Learn why" link and click it once, find it unhelpful, and thereafter ignore it have received a negative update about the value of engaging critically with AI outputs. The interface has trained them to not look closely.
The EU AI Act (2024) and emerging US state regulations are beginning to define minimum disclosure standards that exceed theater — but compliance with disclosure requirements and genuine informational transparency remain different things. Designers working in regulated contexts need to track both.
An explanation should be designed for the specific decision the specific user faces, not for general education about AI or for regulatory compliance. Ask: given this explanation, can this user decide whether to trust this output in this context? If the honest answer is no, the explanation is decoration, not design.
Several transparency patterns have consistent positive effects on user calibration across multiple studies:
Confidence-conditional disclosure: Surfacing explanation details only when confidence falls below a threshold — rather than always — reduces interface noise while ensuring users receive signals at the moments they most need them. Google's AI Overviews in Search (2024) uses a version of this by surfacing source links more prominently when queries touch contested or health-related domains.
Error exemplars in onboarding: Showing users real examples of the type and frequency of errors the system makes before first use — not just what it can do well — has consistent positive effects on calibration without significantly reducing adoption. This runs against standard product marketing logic but is supported by multiple studies in medical AI deployment.
Reversibility signals: Clearly communicating that an AI-assisted decision is reversible — or flagging explicitly when it is not — adjusts the level of scrutiny users apply appropriately. Users apply more critical review to irreversible actions when they are explicitly labeled as such.
Scope boundary markers: Explicitly stating what the system was not designed to do (in the interface, not just in documentation) reduces out-of-scope usage and the trust failures that follow from it. Anthropic's publicly posted constitution for its Claude models (2023) is one example of a public statement of intended behavior that goes beyond legal disclaimers, although it lives outside the product interface.
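Three of the patterns above (confidence-conditional disclosure, reversibility signals, and scope boundary markers) amount to rules about which interface elements accompany a given output. The sketch below shows one way such rules might be expressed. Everything in it is an assumption made for illustration: the `AiOutput` fields, the element names, and the 0.75 threshold are hypothetical, and the confidence value is presumed to be calibrated already.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class AiOutput:
    text: str
    confidence: float  # assumed calibrated, on a 0-1 scale
    reversible: bool   # can the user undo the resulting action?
    in_scope: bool     # does the request fall inside the designed use case?


def disclosure_elements(output: AiOutput, confidence_threshold: float = 0.75) -> List[str]:
    """Return the transparency elements the interface should render with this output."""
    elements = ["ai_badge"]  # existence disclosure is always shown
    if not output.in_scope:
        elements.append("scope_warning")  # scope boundary marker
    if output.confidence < confidence_threshold:
        elements.append("explanation_panel")  # confidence-conditional disclosure
    if not output.reversible:
        elements.append("irreversible_label")  # reversibility signal
        elements.append("confirm_step")  # added friction for high-stakes actions
    return elements


if __name__ == "__main__":
    routine = AiOutput("Draft reply", confidence=0.92, reversible=True, in_scope=True)
    risky = AiOutput("Delete records", confidence=0.60, reversible=False, in_scope=True)
    print(disclosure_elements(routine))
    print(disclosure_elements(risky))
```

Writing the rules down this way makes the design decision explicit and reviewable: a team can see at a glance which outputs get extra scrutiny cues and argue about the threshold rather than discovering the behavior after launch.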
Transparency is not a checkbox. It is a design process that requires understanding what information your specific users need to make their specific decisions, and then designing the most legible possible representation of that information. In Lesson 4 we turn to the evaluation process itself: how you measure whether your AI UX design is working.
You'll write and critique explanation designs for AI interface scenarios. The assistant will give you a context, ask you to draft an explanation at a specific transparency level, and then evaluate whether it's genuinely useful or theater. You'll revise based on feedback.
Complete at least three exchanges to finish the lab.
In May 2018, Google demonstrated Duplex, an AI system that could make phone reservations on behalf of users, and began a limited public rollout later that year. The initial demonstrations were stunning; in one widely viewed video, the system called a hair salon and booked an appointment with natural-sounding pauses, filler words ("um," "mm-hmm"), and graceful handling of an ambiguous question about availability. What the demonstrations didn't show was the failure mode: Duplex struggled significantly with calls that deviated from its trained scenarios, such as unfamiliar accents, unusual business hours structures, or questions outside its domain. Google's internal evaluations had measured success on the task the system was designed to perform under ideal conditions. The real-world distribution of calls was considerably messier. By 2023, Google had quietly rolled back many of Duplex's autonomous features, with human operators handling an increasing share of calls flagged as out-of-scope. The system worked; the evaluation had been insufficiently adversarial.
Traditional UX evaluation methods — think-aloud usability testing, task completion rate measurement, System Usability Scale surveys — were designed for deterministic interfaces. They measure whether users can accomplish defined tasks with defined interfaces. Applied to AI systems, they produce misleadingly positive results for two structural reasons.
First, distributional coverage: usability tests typically sample a narrow range of user inputs against a well-designed test scenario. AI systems' failure modes are concentrated in the tail of the input distribution — the unusual requests, the edge cases, the out-of-scope queries. Standard usability testing rarely reaches those tails. A chatbot can sail through a dozen canonical test scenarios and fail badly on a thirteenth that no tester thought to try.
Second, longitudinal trust drift: usability tests measure interaction at a single point in time. AI systems' trust dynamics play out over weeks and months. A system with excellent first-session usability may produce severe overtrust problems after three weeks of use, as the novelty effect fades and automation bias sets in. Short-term evaluation misses this entirely.
Effective evaluation of AI interaction design requires methods that address the distributional coverage problem and the longitudinal trust drift problem. The following framework draws on published work from the Google PAIR team (2019), the Microsoft Research FATE group (2021), and academic AI HCI research.
Layer 1 — Heuristic Review (AI-adapted): Apply adapted heuristics that include AI-specific criteria: Does the interface communicate system limitations? Does it provide calibrated uncertainty signals? Does it handle failures gracefully? Ben Shneiderman's 2020 "Ladder of Trust" provides a structured heuristic set specifically for AI systems.
Layer 2 — Adversarial Task Testing: Design test scenarios that specifically target failure modes — out-of-scope queries, ambiguous inputs, edge cases identified from failure mode analysis. Do not only test the ideal path. The Google Duplex failure is a direct consequence of insufficient adversarial testing.
Layer 3 — Trust Calibration Measurement: Use validated instruments (e.g., the Trust in Automation scale, Jian et al. 2000; or the MDMT, Ullman & Malle 2019) to measure trust level and compare it against actual system reliability metrics. The gap between these two numbers is your calibration error.
Layer 4 — Longitudinal Behavioral Observation: Track how usage patterns evolve over weeks of naturalistic use. Key signals: override rate (are users checking AI outputs less over time?), error detection rate (are users catching AI mistakes at the same rate after 30 days as after 3 days?), and scope drift (are users using the system for tasks outside its design envelope?).
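The Layer 3 and Layer 4 measurements can be sketched in a few lines. This example assumes an interaction log in which each AI output has later been reviewed for correctness and in which an edit or rejection by the user is recorded as an override; it also treats an override of a wrong output as a proxy for detecting the error, which is a simplification. The `Interaction` fields and function names are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Interaction:
    day: int             # days since the user's first session
    ai_was_wrong: bool   # ground truth from later review (assumed available)
    user_overrode: bool  # user edited or rejected the AI output


def window_metrics(log: List[Interaction], start_day: int, end_day: int) -> Dict[str, float]:
    """Override rate and error detection rate for start_day <= day < end_day."""
    window = [i for i in log if start_day <= i.day < end_day]
    errors = [i for i in window if i.ai_was_wrong]
    return {
        "override_rate": sum(i.user_overrode for i in window) / len(window) if window else 0.0,
        "error_detection_rate": sum(i.user_overrode for i in errors) / len(errors) if errors else 0.0,
    }


def calibration_error(trust_score: float, measured_reliability: float) -> float:
    """Signed gap between survey trust and measured reliability (both on a 0-1 scale).

    Positive values indicate overtrust; negative values indicate undertrust.
    """
    return trust_score - measured_reliability


if __name__ == "__main__":
    log = [
        Interaction(1, ai_was_wrong=True, user_overrode=True),
        Interaction(2, ai_was_wrong=False, user_overrode=False),
        Interaction(30, ai_was_wrong=True, user_overrode=False),
        Interaction(31, ai_was_wrong=False, user_overrode=False),
    ]
    print(window_metrics(log, 0, 3))    # first days of use
    print(window_metrics(log, 30, 33))  # after a month
```

Comparing the same metrics across an early window and a later one is what surfaces trust drift: a falling error detection rate with stable system accuracy suggests users have stopped checking.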
One of the most practically important skills in AI UX evaluation is diagnosing whether an observed failure is a UX problem (fixable by design) or a model problem (requires retraining or architectural change). The distinction matters because the remediation paths are entirely different, the teams responsible are different, and the timelines are different. Misdiagnosing a model failure as a UX problem leads to design churn that cannot solve the underlying issue.
A structured diagnostic approach: for any failure event, ask three questions in sequence. First, would any way of presenting this output to the user have led to the same outcome? If yes (the output was simply wrong and no amount of framing could have made it correct), this is a model failure. Second, did the user have access to sufficient information to identify the error, and did they use it? If the information was available but the user didn't engage with it, this is a UX failure (likely an overtrust or transparency design issue). Third, was the information available in a form the user could reasonably interpret given their context and expertise? If not (the information existed but wasn't legible), this is an explainability design failure.
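The three-question sequence above can be written as a short triage function, which is mainly useful as a shared checklist during incident review. This is a sketch under stated assumptions: the parameter names are hypothetical, the answers are supplied by a human reviewer rather than computed, and cases the three questions do not cover are returned as unclassified rather than guessed at.

```python
def diagnose_failure(
    same_outcome_under_any_presentation: bool,
    evidence_available: bool,
    user_engaged_with_evidence: bool,
    evidence_legible_to_user: bool,
) -> str:
    """Apply the three diagnostic questions in order and return the first match."""
    # Question 1: no framing could have rescued a simply wrong output.
    if same_outcome_under_any_presentation:
        return "model failure"
    # Question 2: the information was there, but the user did not engage with it.
    if evidence_available and not user_engaged_with_evidence:
        return "UX failure"
    # Question 3: the information existed but was not interpretable by this user.
    if evidence_available and not evidence_legible_to_user:
        return "explainability design failure"
    # Not covered by the three questions; needs manual review.
    return "unclassified"


if __name__ == "__main__":
    print(diagnose_failure(True, False, False, False))
    print(diagnose_failure(False, True, False, True))
    print(diagnose_failure(False, True, True, False))
```

Because the function returns only the first matching category, it understates failures with several contributing causes; in a real review it would be reasonable to record the answer to every question rather than stopping at the first.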
Many real failures involve all three components. A useful AI failure taxonomy by Wang et al. (2019) at Carnegie Mellon distinguishes between model-caused failures (wrong output), interaction-caused failures (correct output, user couldn't use it), and context-caused failures (output was correct for training distribution, wrong for deployment context). Each demands a different response.
The gap between AI UX research and AI UX practice remains large. Most teams deploying AI-powered products do not use validated trust instruments, do not conduct adversarial testing, and do not track longitudinal behavioral signals. The primary measurement most teams rely on is engagement — session length, return rate, feature usage. These metrics are not useless, but they are easy to optimize in ways that increase engagement while worsening calibration. A chatbot that gives confident, fluent wrong answers might produce higher engagement than one that hedges appropriately, because the confident answers feel more satisfying in the moment.
The most rigorous deployed example of a comprehensive AI UX evaluation methodology in the public record is the process Google's PAIR team published in 2019 alongside the "People + AI Guidebook." That framework explicitly distinguishes between user satisfaction metrics (which can be gamed by the ELIZA effect) and user outcome metrics (which require tracking what users did with AI outputs in the real world). The distinction is conceptually simple and practically very difficult to implement — it requires connecting interface analytics to downstream outcome data, which most product teams have neither the infrastructure nor the organizational incentives to do.
The honest state of the field in 2024 is that AI UX evaluation methodology is significantly behind AI development capability. The tools exist; the will to apply them consistently is unevenly distributed. Understanding what rigorous evaluation looks like is the first step toward practicing it, even in constrained environments.
Measure what users accomplish with AI outputs, not just how much they interact with AI features. Engagement metrics can be optimized in ways that worsen user outcomes. If you cannot yet connect interface analytics to downstream outcomes, be explicit with stakeholders about what your engagement data cannot tell you.
This module has covered the foundational landscape of human-AI interaction design: the structural properties that make AI UX categorically different from prior software UX (L1), the mechanics of trust formation and calibration (L2), the design of transparency and explainability (L3), and the methods for evaluating whether your designs are working (L4). The Module Test ahead covers all four lessons. Take it when you're ready.
You'll work through AI product failure scenarios and evaluation design challenges. For each scenario, you'll classify the failure type (model, interaction, or context-caused), identify what evaluation methodology would have caught it earlier, and propose a specific measurement plan for an ongoing deployment. The assistant will challenge vague proposals and push for operational specificity.
Complete at least three exchanges to finish the lab.