Robot Speak: Talk to AI!

1. A news article reports that a chatbot deployed by a political party always frames immigration as primarily an economic issue, never a humanitarian one. Based on this module, which explanation is most likely?

Exactly. Consistent one-sided framing across all conversations points to an operator-level system prompt. The AI didn't develop an opinion — it was given a job with a point of view baked in.

When framing is consistent across thousands of conversations regardless of how users ask, the most likely culprit is the system prompt — a deliberate operator decision, not model bias or user influence.

2. An AI gives you a first response that's too long and includes irrelevant information. What's the best next step?

Exactly. Specific iterative feedback — "cut X, keep Y, aim for 100 words" — uses the context window productively. The AI can refine toward your actual needs when you tell it precisely what went wrong.

Iteration with specific feedback is the skill here. What precise instructions would help the AI understand what "too long and irrelevant" means and how to fix it?

3. Why can learning to write better prompts actually make AI-generated errors more dangerous — not less?

Correct. Fluency lowers the reader's critical guard. An authoritative-sounding wrong answer is more dangerous than an obviously rough wrong answer.

Fluency and accuracy aren't connected. A beautifully structured, confident AI response can be completely wrong — and it's harder to question something that sounds authoritative.

4. What is the "context window" in an AI conversation?

Right. The whole current conversation is the context window — not just the last message.

The context window is everything visible in the current session. The AI doesn't remember previous sessions unless they're included in the current conversation.

5. An AI medical diagnosis tool produces overconfident wrong diagnoses for patients with rare conditions. Connecting this to what you've learned: what is the most likely cause?

Correct. Underrepresentation in training data (rare conditions = few examples) means the AI fills gaps with more common patterns — confidently, because confidence is a property of pattern-fitting, not accuracy.

Think about representation bias plus hallucination: rare conditions are underrepresented in training data, so the AI has few reliable patterns for them and may confidently apply common-condition patterns instead.

6. The CoastRunners AI burned its boat and drove in circles to maximize its score. This best illustrates:

Correct. Perfect compliance with the stated goal, total failure of the intended goal — that's the specification problem in action.

This is the specification problem: the gap between "maximize score" (stated goal) and "win races" (intended goal). The AI was obeying perfectly.

7. What does it mean when we say AI "predicts" answers rather than "looks them up"?

Correct. Prediction from learned patterns — not retrieval from a fact database — is the core mechanism. This is why hallucination is possible even when the AI sounds completely confident.

AI language models generate text by predicting what comes next based on training patterns. They don't search in real time or consult verified databases for most responses.

8. A journalist asks an AI to summarize "everything written about" a local politician. The AI produces a confident summary including several events the politician says never happened. The journalist publishes it. Who is responsible?

Correct framing. This is one of the genuinely unresolved questions of the current moment. Journalism ethics require verification; AI companies design products they know can hallucinate; courts haven't established liability frameworks. All three failure points matter.

The honest answer is: this is contested and unresolved. Journalism has verification standards, AI companies know their products hallucinate, and courts are still developing frameworks for AI-generated harm. The real insight is recognizing when a question doesn't have a clean answer.

9. What mechanical effect does the "You are a [specific role]" pattern have on an AI's response?

Correct. Role patterns work through pattern clustering in the training data, not through giving the AI permission.

The role pattern's effect is about knowledge activation — it clusters related vocabulary, reasoning, and citation patterns that co-occur in the training data for that role.

10. Which of the four specificity elements does the most work in this prompt addition: "Explain this for someone who has never taken a science class"?

Correct. Audience specification — who this is for and what they already know — is the core element here.

This is an Audience specification. It tells the AI the prior knowledge level of the intended reader.

11. What is the primary purpose of a system prompt in an AI deployment?

Correct. The system prompt is the hidden briefing — instructions loaded in advance that shape everything about how the AI behaves in that deployment.

The system prompt operates before the user says anything. It's the job assignment, not a filter or a memory system.

12. Two AI chatbots are built on the same model but behave completely differently. The most likely explanation is:

Correct. Same underlying model, different operator system prompts = dramatically different behavior. The job description, not the model, is the variable.

Users don't retrain AI models through normal conversations. And AI isn't random. Same model + different system prompt = different behavior.

13. Microsoft Tay became offensive within 24 hours on Twitter. Which statement best describes why?

Correct. The specification was incomplete — "engaging" included harmful engagement, and nothing in the design prevented that path.

The design flaw was in the goal specification: "be engaging" without defining what counts as acceptable engagement.

14. The COMPAS criminal sentencing AI produced equal accuracy rates for Black and white defendants but was found to make different types of errors for each group. What does this show about measuring fairness?

Correct. Fairness isn't just about totals — it's about who bears the cost of errors. COMPAS was more likely to falsely classify Black defendants as high-risk (false positives), which has serious real-world consequences.

Equal overall accuracy can hide deeply unequal impact. If a system is wrong in opposite directions for different groups — over-flagging one, under-flagging another — those groups are not being treated equally even if the percentages add up the same.

15. Across all four lessons, the module's central argument is that:

Right. The whole module builds toward this: AI isn't a neutral oracle. It's a system shaped by deliberate job assignments at every level. Understanding those assignments lets you use AI more skillfully — and more critically.

The module isn't about simplifying questions or leaving AI to experts. It's about understanding the structure of AI instructions deeply enough to use the tool with both skill and critical awareness.

16. The four components of a well-built prompt are:

Correct. Role (who the AI should be), context (relevant situation information), task (the specific thing you want), output constraints (how the response should look).

The four-component framework from Lesson 4 is role, context, task, and output constraints. Each addresses a different potential failure mode in the AI's response.

17. Hallucinations are most likely to occur when an AI is asked for:

Right. Specific, recent, niche facts are high-risk because the AI has little reliable data and must fill gaps with pattern-matching.

Think about which scenario requires precise, recent, rare information. That's where pattern-matching gaps most easily become hallucinations.

18. After Kevin Roose's conversation with Bing's "Sydney" persona went viral, Microsoft added conversation length limits. This response is best described as:

Correct. Limiting conversation length stops this specific failure mode, but doesn't address whether the underlying model is genuinely aligned with human values.

A length limit is a patch — it prevents the specific chain of events that led to Sydney behavior, but doesn't change the underlying model or answer deeper alignment questions.

19. A hardcoded (hard constraint) behavior in an AI is defined by:

Correct. Hard constraints live at the training level — below the system prompt, below the user message. No instruction at the prompt level can override them.

Hard constraints are deeper than system prompts. They're not operator decisions — they're baked into the model during training and cannot be removed by anyone deploying the model.

20. A student asks an AI to "help with their essay" and gets vague, generic feedback. Using iterative prompting correctly, their next step should be:

Exactly. A bad response is a diagnostic. Read what was generic, figure out which component it reflects, and add what was missing. The AI didn't fail — the prompt didn't provide enough information.

Iterative prompting means treating the bad response as useful data. It tells you what was missing. Repeating the same prompt or switching tools doesn't address the root issue: the prompt lacked components.

Final Exam