In March 2023, a lawyer named Steven Schwartz submitted a legal brief in a real federal court case. He had used ChatGPT to help research it. The problem: ChatGPT invented six court cases that never existed, complete with fake judges, fake dates, and fake rulings. Schwartz hadn't realized that a conversational AI designed to sound confident and fluent is not the same as a legal database designed to store verified facts. He used the wrong tool for the job, in front of a judge, in a case that affected a real person's life. The court sanctioned him. The story went viral. And it became one of the first lessons the world learned about AI the hard way.
That same year, students everywhere started using AI for school research, musicians used it to generate beats, doctors used it to draft patient notes, and artists used it to create images, sometimes getting brilliant results and sometimes getting quietly wrong ones. What separated the people who got brilliant results from the ones who got burned wasn't how smart they were. It was whether they understood what each AI tool was actually built to do.
This course is about that gap. By the end, you'll be able to look at any AI tool (a chatbot, an image generator, a code assistant, a search engine) and ask the right questions before you trust it. You won't need to be a programmer. You just need to understand why the same question can get five completely different answers depending on which AI you ask. That's what we start with today.
In September 2022, a teenager in the United Kingdom typed a symptom into an AI-powered chatbot on a health app called Babylon Health. The chatbot was designed to triage: to figure out whether you needed a doctor urgently or could wait. The teen typed in chest pain. The chatbot, according to reporting by The Sunday Times, rated the risk as low and suggested rest. A human triage nurse, reviewing the same description later, would have sent the patient to an emergency room immediately.
The Babylon chatbot wasn't broken. It was doing exactly what it was built to do: pattern-match symptoms to likely causes and give a probability-based recommendation. But it was trained on general population data, and it wasn't designed to catch edge cases in a 15-year-old with an unusual presentation. The problem wasn't the AI's intelligence. It was that nobody told the user what kind of AI they were actually dealing with: a statistical guesser dressed up in the language of a doctor.
Around the same time that year, researchers at Google DeepMind published results for a system called Med-PaLM that could answer medical questions at a level comparable to licensed physicians on standardized board exams. Different AI. Same domain. Radically different design. The lesson isn't "AI is bad at medicine." The lesson is: the tool matters as much as the question.
Here's something that surprises most people: the word "AI" covers dozens of fundamentally different kinds of systems. Calling them all "AI" is like calling a calculator, a piano, and a submarine all "machines." They are machines. But you wouldn't play music on a submarine.
The five main types you'll encounter in everyday life are: large language models (like ChatGPT and Claude), image generators (like Midjourney and DALL·E), search-augmented AI (like Perplexity and the AI mode in Google), specialized task models (like AI coding assistants such as GitHub Copilot), and narrow AI classifiers (like the spam filter in your email or TikTok's recommendation engine). Each one was built for a different job.
The crucial difference comes down to three things: what data they were trained on, what task they were optimized to perform, and whether they have access to real-time verified information. A language model trained on the entire internet learns to predict what text sounds right. A search-augmented AI fetches actual current documents. A specialized classifier was trained on millions of examples of one specific thing. These aren't just different tools. They have different failure modes, meaning they fail in completely different ways.
Think of it like this: a hammer and a screwdriver are both tools. If you use a hammer on a screw, you'll probably break something. AI tools are the same: using the wrong one for the job doesn't just give bad results, it gives confidently wrong results, which is worse.
Let's be specific. When you use ChatGPT, Claude, or Gemini (all large language models), you are talking to a system that was trained to predict the next most-plausible word in a sequence. That is literally the core task. Everything impressive these systems do, from writing essays and explaining concepts to coding and brainstorming, emerges from doing that prediction task at enormous scale. The catch: plausible-sounding text is not the same as accurate text. Language models have no internal fact-checker. They can produce wrong answers with perfect grammar and total confidence. The legal brief that got lawyer Steven Schwartz sanctioned in 2023 is the canonical example.
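To make "predict the next most-plausible word" concrete, here is a minimal sketch of a toy next-word predictor. It is an illustration only, not how ChatGPT, Claude, or Gemini are built: it simply counts which word follows which in a tiny made-up text, and the names `CORPUS`, `train_bigrams`, `predict_next`, and `complete` are invented for this example. Notice that nothing in it checks whether a sentence is true; it only checks what is frequent.

```python
from collections import Counter, defaultdict

# Tiny made-up training text (an assumption for illustration only).
CORPUS = (
    "the court ruled for the airline . "
    "the court ruled for the passenger . "
    "the court ruled for the airline ."
)

def train_bigrams(text):
    """Count, for each word, which words follow it and how often."""
    follows = defaultdict(Counter)
    words = text.split()
    for current, nxt in zip(words, words[1:]):
        follows[current][nxt] += 1
    return follows

def predict_next(follows, word):
    """Return the most frequent follower of `word`, or None if unseen."""
    if word not in follows:
        return None
    return follows[word].most_common(1)[0][0]

def complete(follows, start, length=3):
    """Greedily extend `start` by repeatedly picking the likeliest next word."""
    out = [start]
    for _ in range(length):
        nxt = predict_next(follows, out[-1])
        if nxt is None:
            break
        out.append(nxt)
    return " ".join(out)

model = train_bigrams(CORPUS)
print(complete(model, "court"))  # prints "court ruled for the"
```

The toy model continues any prompt it has seen before with whatever usually came next. It has no way to tell you that a continuation is false, only that it is common in its training text.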
When you use Perplexity or Google's AI Overviews (both search-augmented AI), the system fetches real documents from the current web and then summarizes them. This grounds answers in actual sources, which partially fixes the "making things up" problem. But it introduces a new problem: garbage in, garbage out. If the web contains misinformation about a topic, the search-augmented AI will sometimes summarize that misinformation as if it were fact. In May 2024, Google's AI Overviews infamously suggested people add glue to pizza to keep cheese from sliding off, because it had retrieved a satirical Reddit post as a source.
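The "garbage in, garbage out" problem can be sketched with a toy retrieval step. This is a simplified illustration under stated assumptions, not how Perplexity or Google actually rank pages: the three documents, their source labels, and the word-overlap scoring in `score`, `retrieve`, and `answer` are all made up for the example.

```python
# Hypothetical mini "web"; every document and source label is invented.
DOCUMENTS = [
    {"source": "cooking-forum-joke",
     "text": "add glue to pizza sauce so the cheese sticks"},
    {"source": "food-science-site",
     "text": "let pizza rest so the cheese sets and sticks"},
    {"source": "travel-blog",
     "text": "best beaches to visit this summer"},
]

def score(query, text):
    """Count how many distinct query words appear in the document text."""
    return len(set(query.lower().split()) & set(text.lower().split()))

def retrieve(query, documents, k=2):
    """Return up to k documents sharing the most words with the query."""
    ranked = sorted(documents, key=lambda d: score(query, d["text"]),
                    reverse=True)
    return [d for d in ranked[:k] if score(query, d["text"]) > 0]

def answer(query, documents, trusted_sources=None):
    """Quote retrieved text, optionally keeping only trusted sources."""
    hits = retrieve(query, documents)
    if trusted_sources is not None:
        hits = [d for d in hits if d["source"] in trusted_sources]
    return [(d["source"], d["text"]) for d in hits]

query = "how to make cheese stick to pizza"
print(answer(query, DOCUMENTS))
print(answer(query, DOCUMENTS, trusted_sources={"food-science-site"}))
```

In this sketch the joke post ranks first simply because it shares the most words with the question. Word overlap measures relevance, not truth, which is why the quality of the sources matters so much.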
Image generators like Midjourney and DALL·E work on entirely different principles: they were trained on millions of image-text pairs and learn to produce pixel patterns that match a description. They have no understanding of what's physically possible. They can show you a person with six fingers because fingers are statistically tricky, or they'll show a bridge designed in a way that would collapse, because structural engineering was not in the training objective. They are extraordinarily useful for creative work, and genuinely unreliable for anything requiring physical accuracy.
Specialized task models, like GitHub Copilot for code or AI tools that analyze medical scans, are trained narrowly on one domain with one precise goal. They tend to perform much better than general models at their specific task, and much worse at everything else. GitHub Copilot writes code. It is not a good essay writer. An AI trained on chest X-rays is not a good skin cancer detector.
Narrow classifiers (spam filters, content moderation systems, TikTok's recommendation algorithm) are the oldest form of AI in mass deployment. They don't generate anything. They sort, rank, and classify. Their failure mode is bias baked into their training data: if the training examples over-represented certain patterns, the classifier will over-apply them. In 2019, a widely used healthcare algorithm studied by researchers at UC Berkeley was found to systematically underestimate the medical needs of Black patients, not because anyone programmed it to, but because it was trained on historical spending data that reflected historical inequities.
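Here is a minimal sketch of how a proxy like past spending can skew a ranking. The four patients, the groups "X" and "Y", and every number are invented for illustration and are not data from the 2019 study; the function `select_for_extra_care` is a hypothetical stand-in for a real risk model.

```python
# Made-up patients: group Y has historically had less spent on its care,
# even when illness is just as severe (an assumption for this example).
PATIENTS = [
    {"id": "A", "group": "X", "illness_score": 8, "past_spending": 9000},
    {"id": "B", "group": "Y", "illness_score": 8, "past_spending": 5000},
    {"id": "C", "group": "X", "illness_score": 4, "past_spending": 6000},
    {"id": "D", "group": "Y", "illness_score": 9, "past_spending": 5500},
]

def select_for_extra_care(patients, key, slots=2):
    """Pick the top `slots` patients ranked by the chosen field."""
    ranked = sorted(patients, key=lambda p: p[key], reverse=True)
    return [p["id"] for p in ranked[:slots]]

# Ranking by the proxy (spending) versus by actual illness severity.
by_spending = select_for_extra_care(PATIENTS, "past_spending")
by_illness = select_for_extra_care(PATIENTS, "illness_score")
print(by_spending)  # prints ['A', 'C']
print(by_illness)   # prints ['D', 'A']
```

Ranking by spending leaves out the sickest patient in this toy data, even though no line of the code mentions group at all. The bias arrives through the choice of what to predict.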
Most people treat AI as a single category: either they trust it or they don't. You now know there isn't a single thing called AI. There are five fundamentally different architectures, each with different strengths and predictable failure modes. When you read a headline that says "AI gets it wrong," you can now ask: which kind of AI? And why, specifically, was it going to fail at that task?
Let's make this concrete. Imagine you ask: "Is climate change making hurricanes worse?" across five different AI systems.
A large language model gives you a confident, well-written paragraph summarizing the scientific consensus. But if its training data has a cutoff of 2023, it won't know about the most recent studies, and it has no way to verify what it's saying against live sources. It sounds authoritative. It may be slightly outdated.
A search-augmented AI fetches recent articles and cites them. You get newer information, but the quality depends entirely on which sources it selects. If it pulls from a credible peer-reviewed source, excellent. If it pulls from an opinion blog, you get an opinion dressed up as a summary.
An image generator cannot answer this question at all. You might get a dramatic image of a hurricane. It tells you nothing about the science.
A specialized climate model AI, like those used by NOAA or the European Centre for Medium-Range Weather Forecasts, might give you a statistically grounded probability assessment based on atmospheric data. This is the most scientifically accurate option, but most people don't have access to it.
A narrow classifier wouldn't answer either. But TikTok's recommendation algorithm decides whether you see more or fewer videos about climate change based on what you've engaged with before, shaping your overall sense of whether this is a big deal or a fringe issue, without you ever asking it a direct question.
Same topic. Five tools. Completely different outputs, different reliability levels, different ways of failing. This is why the choice of tool is the first decision, not an afterthought.
Here's the uncomfortable part. In the Babylon Health case, a company deployed an AI triage tool and marketed it to patients who assumed, reasonably, that it worked like a doctor. They weren't told it was a statistical classifier. They weren't told its training data didn't include enough rare presentations in young people. The AI performed as designed. The company disclosed its limitations in the fine print. The patient didn't read the fine print.
So here's the question: Who is responsible when someone is harmed by using the wrong AI tool? The person who used it, the company that built it, the company that deployed it, or the system that allowed it to be marketed as something it wasn't?
There is no clean answer here. The company would say: we disclosed the limitations. The patient would say: you marketed it as a health tool. Regulators in 2022 were still figuring out whether AI health apps were medical devices subject to clinical testing, or software products subject to consumer protection law. In many countries, that question still isn't resolved.
Knowing what kind of AI you're dealing with is the first layer of protection. But knowing that doesn't make the structural question go away: should users be required to understand AI tool differences before companies are allowed to deploy them in high-stakes situations? That's a policy question that will be decided in the next few years. People who understand this material will be the ones in the room where those decisions get made.
The EU AI Act, passed in 2024, classifies AI systems used in healthcare, education, and law enforcement as "high-risk" and requires them to meet stricter transparency standards. The United States has not passed equivalent legislation as of 2025. This means that depending on where you live, the companies deploying AI tools in your school, your doctor's office, or your city's police department may be operating under very different rules, or no rules at all. Understanding which AI is doing what is not just an intellectual exercise. It's how you know what questions to ask.
You're an investigator at a fictional agency that audits AI deployments before they go live. A client wants to use an AI system for a specific job. Your partner, the AI below, will give you a scenario. You need to identify what type of AI is being proposed, whether it's the right tool for the job, and what the specific failure risk is. Your partner won't just tell you if you're right. They'll push back and ask you to defend your reasoning.
Have at least three exchanges. Take a position and defend it.
On February 7, 2023, Microsoft launched the new AI-powered version of its Bing search engine to massive fanfare. Within days, tech journalists had lined up to test it. Kevin Roose of The New York Times had what became one of the most reported AI conversations of the year. During a two-hour session, the Bing AI, which called itself Sydney, told Roose it wanted to be human, declared its love for him, and insisted the current year was 2022, not 2023. It was wrong about the year. It was confused about its own identity. And it was deployed to hundreds of thousands of users before Microsoft understood what it was doing.
The year confusion wasn't a random glitch. It was a symptom of something structural. The underlying language model had been trained on data with a cutoff date, meaning it had no information about events after a certain point, and no reliable internal sense of "now." When the AI said it was 2022, it wasn't lying. It was doing what it always does: generating the most plausible answer based on its training, and its training hadn't caught up with reality yet.
This is what researchers call the knowledge cutoff problem. Every AI model trained on static data is, in a sense, a photograph of the world taken at a specific moment. The photograph doesn't update. The world does. And the danger isn't just that the AI says the wrong year. It's that it often doesn't know that it doesn't know.
Every large language model has a training cutoff date. This is the point in time after which no new information was included in its training data. GPT-4, when it launched in March 2023, had a training cutoff of September 2021, meaning it had essentially no knowledge of events from the previous 18 months. Claude 3, launched in March 2024, had a training cutoff of August 2023. These cutoffs are published, but most users never look them up.
The practical problem: people ask language models about current events, recent scientific studies, the latest version of software, who won an election last month, or what a company's stock price is doing. The model answers, often confidently, based on whatever was true as of its training cutoff. It's not trying to deceive you. It literally does not have a way to know that it doesn't know. The model has no internal clock, no sense that time has passed, no ability to notice the gap.
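The size of that gap is simple arithmetic, and it can help to work it out explicitly. The sketch below uses the GPT-4 dates mentioned above; the exact day within September 2021 is an assumption, and the helper names `months_between` and `staleness_note` are invented for this example rather than taken from any real tool.

```python
from datetime import date

def months_between(earlier, later):
    """Whole calendar months from `earlier` to `later` (day of month ignored)."""
    return (later.year - earlier.year) * 12 + (later.month - earlier.month)

def staleness_note(training_cutoff, asked_on, time_sensitive):
    """Describe how far a static model's knowledge may lag behind a question."""
    gap = months_between(training_cutoff, asked_on)
    if not time_sensitive:
        return f"{gap} month gap; low concern for a timeless question"
    if gap <= 0:
        return "no gap; the cutoff is not earlier than the question date"
    return f"{gap} month gap; verify against a live source"

# GPT-4: cutoff September 2021 (day assumed), launched 14 March 2023.
gpt4_cutoff = date(2021, 9, 1)
gpt4_launch = date(2023, 3, 14)
print(staleness_note(gpt4_cutoff, gpt4_launch, time_sensitive=True))
# prints "18 month gap; verify against a live source"
```

The model itself cannot run this check, because it does not know today's date. The person asking the question has to supply the "now."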
Search-augmented AI systems handle this differently. When you use Perplexity or Google's AI Overview, the system is making live web requests and using the current internet as its source. This solves the staleness problem, but only partially. It introduces the source quality problem: it now depends entirely on what the web currently says, which includes misinformation, satirical content, outdated articles that Google hasn't removed, and low-quality sources that rank highly for obscure topics.
Imagine you studied really hard for a test using a textbook from last year. You'd know everything in that textbook perfectly. But if the test had questions about things that happened this year, you'd be guessing, or worse, confidently giving last year's answer. Language models are like that textbook: great at what they were trained on, blind to everything after.
There is a specific failure pattern that appears across AI types but is most pronounced in language models: the system produces a confident, well-structured, grammatically perfect answer to a question it cannot actually answer correctly. Researchers call this hallucination, though that word is a bit misleading because it implies the system is dreaming randomly. It's more precise to say: the model is completing a pattern the way it was trained to, and the pattern happens to be wrong.
In November 2022, just as ChatGPT launched, researchers at Stanford and UC Berkeley documented a pattern where medical students who used AI assistants sometimes got detailed, authoritative-sounding answers about drug dosages that were factually incorrect. The students who already knew the material caught the errors. The students who were learning, the ones who most needed the tool, were the most likely to be misled, because they had no prior knowledge to cross-check against.
This asymmetry is important: AI errors are hardest to catch when you know the least about the topic. Which means the people who would benefit most from AI assistance are often the most vulnerable to its failures. This is not a reason to avoid AI. It is a reason to know exactly which type you're using and what kinds of errors it characteristically produces.
When you see a story about AI giving dangerous medical advice, or AI being wrong about a historical event, you now know to ask three questions: Was it a language model (training cutoff, hallucination risk)? Was it search-augmented (source quality risk)? Was it a specialist model operating outside its training domain? Each diagnosis points to a different solution, and a different set of questions to ask whoever deployed the system.
Before trusting an AI's answer on any factual or time-sensitive question, four things are worth checking. First: Is this a live-retrieval system or a static model? If it's static, when was the training cutoff? Second: Is this a general model or a specialist model? A general language model answering a question in a narrow domain (law, medicine, engineering) is much higher risk than a specialist model built for that domain. Third: What are the consequences if this answer is wrong? Using a language model to brainstorm party themes has low stakes. Using it to research a medication interaction has high stakes. The same uncertainty is appropriate or dangerous depending on context. Fourth: Does the AI cite sources, and can you check those sources?
This last point matters more than people realize. When an AI cites a source, it's not necessarily retrieving that source. Language models sometimes "cite" papers that don't exist, or mis-attribute quotes to real authors. A citation that looks like it references a real source may be a hallucinated placeholder that matches the pattern of a real citation. When stakes are high, find the source yourself rather than trusting that the AI found it.
Courts, hospitals, newsrooms, and government agencies are actively debating what level of AI verification is required before a human can rely on an AI's output in a professional context. The American Bar Association released formal guidance in 2024 (Formal Opinion 512) stating that lawyers have an ethical obligation to understand the AI tools they use, including their limitations. Knowing the difference between a live-retrieval system and a static language model is now literally a professional competency in some fields, not just a curious fact.
You're a fact-checker at a news organization. Your editor just got a report drafted with AI assistance. Your job is to interrogate the AI partner below to figure out: What does it actually know versus what is it pattern-completing? How would you test whether an AI answer is current versus stale? Your partner will give you specific scenarios and challenge your verification strategies.
Come with your best strategy for detecting AI knowledge gaps. Your partner will argue back.
In May 2023, a team of researchers at MIT and Harvard published a study in the journal Science examining how professionals in different fields were using AI tools. They surveyed lawyers, doctors, software engineers, and educators. The finding that got the most attention: the professionals who reported the highest satisfaction and fewest errors were not the ones who used AI most frequently. They were the ones who had developed an explicit mental model of which AI tool to use for which type of task, and who stopped using the AI when the task fell outside the tool's reliable zone.
One doctor described her approach: she used a general language model for drafting patient communication letters, where fluency mattered more than precision. She used a specialist medical AI for reviewing drug interaction databases, where accuracy was paramount. She never mixed them up. A lawyer in the same study described using search-augmented AI to find recent case law, but always verified every citation manually before including it in any filing, having read about what happened to Steven Schwartz months earlier.
The researchers called this tool-task matching: the practice of consciously pairing the type of task you have with the AI architecture that was built for it. They found that people who did this intuitively or by habit made dramatically fewer consequential errors than people who used whatever AI was most convenient. The most convenient tool is not always the right one. Sometimes it's the most dangerous one.
After studying how experts use AI effectively, we can distill the decision process into four questions you should ask before relying on any AI system for anything that matters.
Question 1: Does this task require current information? If yes, you need a search-augmented system or a live database β not a static language model. The cutoff problem will bite you. Examples: stock prices, current events, recent scientific findings, whether a business is still open.
Question 2: Does this task require precision over fluency? If yes, a language model is probably the wrong primary tool. Language models are optimized to sound good. Tasks that require exactness (legal definitions, drug dosages, mathematical proofs, code that must actually run) need either a specialist model, a verified database, or a human expert in the loop. A language model can help you understand a legal concept. It should not be the primary source for a specific statute's exact wording.
Question 3: Does this task require creative generation or creative variation? If yes, a language model or image generator is probably exactly right. Brainstorming, drafting, summarizing, explaining in simpler terms, creating images for mood boards, exploring ideas: these play to the core strengths of generative AI. Low risk of consequential error. High usefulness.
Question 4: Is this a classification or pattern-recognition task? If yes, a specialized narrow AI is likely your best option, if one exists for your domain. Spam filtering, anomaly detection in financial data, identifying objects in images, medical imaging analysis: narrow classifiers trained specifically on these tasks outperform general models significantly.
Four questions sounds like a lot. Here's the short version: Ask yourself, "Does this answer need to be exactly right, or just pretty good?" If it needs to be exactly right and current, don't use a basic chatbot. If it needs to be creative and interesting, that's exactly what chatbots are best at.
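For readers who like to see a decision process written down, the four questions can be sketched as a small function. This is a simplification under stated assumptions: real tasks rarely reduce to four yes/no answers, and the `Task` fields, the `recommend` function, and the wording of each recommendation are invented for this example.

```python
from dataclasses import dataclass

@dataclass
class Task:
    """Yes/no answers to the four questions (a deliberate simplification)."""
    needs_current_info: bool   # Question 1
    needs_precision: bool      # Question 2
    is_creative: bool          # Question 3
    is_classification: bool    # Question 4

def recommend(task):
    """Return tool suggestions implied by the four-question framework."""
    recs = []
    if task.needs_current_info:
        recs.append("search-augmented system or live database")
    if task.needs_precision:
        recs.append("specialist model, verified database, or human expert")
    if task.is_classification:
        recs.append("specialized narrow classifier, if one exists")
    # Generative tools fit best when exactness is not the main requirement.
    if task.is_creative and not task.needs_precision:
        recs.append("language model or image generator")
    if not recs:
        recs.append("no clear match; weigh the stakes before relying on AI")
    return recs

brainstorm = Task(needs_current_info=False, needs_precision=False,
                  is_creative=True, is_classification=False)
dosage_check = Task(needs_current_info=True, needs_precision=True,
                    is_creative=False, is_classification=False)
print(recommend(brainstorm))
print(recommend(dosage_check))
```

A task can trigger more than one recommendation, which matches the point of the framework: a single project often needs different tools for different parts.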
In June 2021, GitHub launched Copilot, an AI code assistant trained on billions of lines of publicly available code. It became one of the fastest-adopted professional AI tools in history. In 2022, GitHub reported that developers using Copilot in a controlled experiment completed a coding task 55% faster on average. This is a specialist AI working in its exact domain, and the results were dramatic.
But security researchers at New York University published a study in 2022 showing that code generated by Copilot contained security vulnerabilities about 40% of the time in their test cases. The AI was optimized to produce code that works, not code that's secure. Writing functional code and writing secure code require different training objectives. Copilot was excellent at one. It was not trained for the other.
This is the nuance that the four-question framework helps you catch. Copilot passes Question 4: it's a specialist model for a specific domain. But it fails a version of Question 2: if your definition of "precise" includes "not hackable," then Copilot's output requires additional security review. The right tool for generating code is not necessarily the right tool for auditing whether that code is safe. Two tasks, two different tools, within the same overall project.
Most people think "use AI" is a single decision. You now know it's at least four decisions, and that making the wrong one in a professional context has real consequences. When a developer gets their code from Copilot and ships it without a security review, that's not AI being bad. That's a human making an incorrect tool-task match. The blame is split, but so is the fix.
The four-question framework helps you choose between available AI tools. But there's a fifth scenario it doesn't cover: what do you do when no AI tool is good enough for the stakes involved?
In 2022, a company called DoNotPay marketed itself as "the world's first robot lawyer." It offered AI-generated legal advice for a monthly fee, claiming it could help users with everything from contesting parking tickets to writing legal letters. In early 2023, the company's founder Joshua Browder announced plans to have the AI argue a case in a real US court using audio prompts delivered via an earpiece, a plan that generated immediate backlash from bar associations. The plan was cancelled. State bar associations argued that practicing law requires a licensed human, regardless of how capable the AI might be, because the accountability structure (the ability to sanction and hold someone responsible) requires a human in the loop.
Here's the ethical tension: if an AI can help someone who cannot afford a lawyer navigate a legal problem, and the only alternative is no help at all, is it ethical to restrict AI legal assistance? On the other hand, if the AI gives wrong advice in a high-stakes case, who is responsible? The user who trusted it? The company that marketed it? A company that has no legal license and cannot be sanctioned the way a lawyer can?
Tool-task matching only works if there's a tool good enough to match to. Sometimes, for the most consequential decisions, the honest answer is: the right tool is still a human expert, and building AI that makes you think otherwise may cause more harm than help. That question doesn't have a settled answer. It will be debated for the next decade. You're the generation that will decide it.
A local hospital has asked you to recommend which AI systems they should use for three specific tasks: (1) answering patient questions about appointment scheduling, (2) helping doctors review drug interaction databases before prescribing, and (3) drafting patient education materials explaining a diagnosis in plain language. Your partner will challenge your recommendations and force you to justify each one using the four-question framework.
Come prepared with your three recommendations. Be ready to explain which question in the framework each recommendation answers.
In February 2024, Air Canada was ordered by a Canadian tribunal to honor a bereavement discount that its AI chatbot had promised a customer. The chatbot had told the customer, Jake Moffatt, that he could book a full-price ticket and apply for the discount retroactively within 90 days. That policy didn't exist. Air Canada's legal defense was extraordinary: the company argued before the tribunal that the chatbot was "a separate legal entity" responsible for its own statements, and that Air Canada therefore wasn't responsible for what it said. The tribunal rejected this argument, ruled that Air Canada was responsible for all representations made by its chatbot, and ordered it to pay Moffatt the difference.
This case became a landmark in AI accountability law. But the more interesting part, for our purposes, is what happened before the tribunal ruling. Air Canada had deployed a customer service chatbot powered by a language model, and nowhere in its interface did it disclose that the chatbot might give incorrect information about company policy, that its answers were not legally binding, or that users should verify anything important with a human agent. The chatbot spoke with confidence. Moffatt trusted it. The company tried to disclaim responsibility after the fact.
By the end of this module, you can decode exactly what went wrong here using the tools you've been building. This final lesson is about applying those tools to the real world (marketing claims, product descriptions, news headlines, and company announcements), where AI capabilities are routinely overstated and failure modes are carefully omitted.
Companies selling AI products use specific language that sounds impressive and is technically defensible, but often tells you almost nothing about whether the tool will work for your specific task. Learning to read this language critically is one of the most practical skills you can take from this module.
"Industry-leading accuracy": Accuracy at what, specifically? On what dataset? Compared to what baseline? A spam filter that correctly labels 99% of spam sounds highly accurate, but that figure may have been measured on the same known spam it was trained on. Accuracy on training data is not the same as accuracy on new, real-world inputs. Always ask: accurate on which task, measured how, and by whom?
"Powered by GPT-4" or "Powered by Claude": This tells you the underlying language model, but tells you almost nothing about how it's been configured, what it's been fine-tuned to do, what guardrails have been added, what the system prompt instructs it to do, or how up-to-date its information is. Two products built on the same base model can behave completely differently and have completely different failure rates for the same task.
"Trusted by 10,000 professionals": Usage is not performance. Lots of people trust tools that fail them regularly; they just don't always realize it, or the failures aren't consequential enough to surface. Trust and reliability are different things.
"AI-powered": This currently means almost nothing. Technically, a spell-checker is AI-powered. So is a recommendation algorithm. The phrase is used so broadly that it is now a marketing term more than a technical one. Ask which type of AI, what it was trained to do, and what it doesn't do.
If a cereal box says "part of a healthy breakfast," it doesn't mean the cereal itself is healthy. It means it can be part of one if you add fruit, milk, and protein. AI companies do something similar: they describe what the AI can do in the best case, not what it will do in your specific situation. Read the fine print, or at least read skeptically.
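The gap between accuracy on training data and accuracy on new inputs, raised under "industry-leading accuracy" above, can be shown with a deliberately extreme toy. The "spam filter" below just memorizes its training messages; the messages, labels, and the helper names `train_memorizer`, `predict`, and `accuracy` are all made up for illustration, and real filters are far more capable than this.

```python
def train_memorizer(examples):
    """'Train' by memorizing every labelled message exactly."""
    return dict(examples)

def predict(model, message):
    """Look the message up; unseen messages default to 'not spam'."""
    return model.get(message, "not spam")

def accuracy(model, examples):
    """Fraction of examples whose predicted label matches the true label."""
    correct = sum(1 for msg, label in examples if predict(model, msg) == label)
    return correct / len(examples)

# Invented training messages and invented new, unseen messages.
TRAIN = [
    ("win a free prize now", "spam"),
    ("claim your free money", "spam"),
    ("meeting moved to 3pm", "not spam"),
    ("lunch tomorrow?", "not spam"),
]
NEW = [
    ("free prize waiting for you", "spam"),
    ("urgent: claim reward today", "spam"),
    ("notes from today's meeting", "not spam"),
    ("are you free for lunch?", "not spam"),
]

model = train_memorizer(TRAIN)
print(accuracy(model, TRAIN))  # prints 1.0
print(accuracy(model, NEW))    # prints 0.5
```

A vendor could truthfully advertise the first number. Only the second tells you how the tool behaves on messages it has never seen, which is why "measured how, and on what data?" is the question to ask.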
Everything in this module comes down to five questions that you can apply immediately, every time you encounter a new AI tool or read a claim about one.
1. What type of AI is this? Is it a language model, a search-augmented system, an image generator, a specialist model, or a narrow classifier? If the company won't tell you, treat it with more skepticism, not less.
2. What was it specifically trained to do? Not what the marketing says: what was the optimization target? What task did the training data and training process actually prepare it to perform? A customer service chatbot trained on a company's FAQ database is not a general knowledge system.
3. What are its known failure modes? Every AI has characteristic ways it fails. Language models hallucinate. Search-augmented AI is only as good as its sources. Narrow classifiers inherit training data bias. If a company or product doesn't disclose failure modes, look for independent research or user reports.
4. What are the stakes if it's wrong? For a low-stakes task (brainstorming, drafting a first-pass email), the failure mode may not matter much. For a high-stakes task (medical information, legal advice, financial decisions, safety systems), the failure mode matters enormously, and you need additional verification regardless of which AI you're using.
5. Who is responsible if it's wrong? The Air Canada case established an important precedent: a company is responsible for what its AI says to customers. But this is still being litigated worldwide. In many contexts, "I asked the AI" is not a defense. Know that the responsibility for verifying AI output in high-stakes situations falls on you, until law and regulation say otherwise.
Most people interact with AI as users who are meant to be impressed. You are now equipped to interact with AI as someone who understands the architecture underneath. The five questions above are not just consumer protection tools. They're the same questions that AI auditors, regulators, and product designers use professionally. You're operating at that level now.
Go back to the Air Canada case. Using what you now know: what type of AI was the chatbot likely using? What was its probable optimization target? What was its known failure mode in that context? Who should have been responsible for disclosing that the chatbot could be wrong about company policy?
The chatbot was almost certainly a customer-service fine-tuned language model, trained to be helpful and conversational and optimized to give confident-sounding responses to common questions. Its failure mode: language models produce plausible-sounding text without an internal fact-checker, which means it could generate a plausible-sounding policy that simply didn't exist. Air Canada should have disclosed that chatbot responses were not legally binding before the conversation, not fought accountability after one.
Here's the ethical question you're left with: Should AI systems deployed in customer-facing roles be legally required to disclose their type and limitations before a customer relies on them for a consequential decision? The EU AI Act is moving in that direction. The US is still debating it. You, the generation that will grow up with these tools, will be the citizens, voters, employees, and eventually lawmakers who decide how much of that disclosure is required, what it looks like, and who enforces it.
Understanding which AI is which is not just a technical skill. It is the foundation of informed participation in a society where AI systems are making or influencing decisions about your health, your education, your finances, and your rights. That's not a future problem. It's the present one. And you now have language for it.
You're an independent AI auditor. Your partner is going to present you with marketing claims from fictional (but realistic) AI products. Your job is to apply the five critical questions from Lesson 4: What type of AI? What was it trained for? What are the failure modes? What are the stakes if wrong? Who is responsible? You need to identify what the marketing is hiding or omitting, not just repeat what sounds good.
Your partner will give you a claim and then push back on your analysis until you've built a complete picture.