Prompt Engineering: Get More Out

1. The "Telephone Operator Trap" describes using AI in output mode when you need:

Correct. Output mode produces artifacts; reasoning mode helps you think. Conflating them — asking for a deliverable when you need analysis — consistently produces less useful results for complex tasks.

The trap is using AI as a text generator when you need it as a thinking partner: asking "write my conclusion" when you should ask "here's my argument; what would make this conclusion strong, and why?" The second engages reasoning; the first just generates average text.

2. NotebookLM reduces hallucination primarily by:

Correct. Document grounding is NotebookLM's core design: it answers from the sources you gave it rather than from general training data, which narrows the room for hallucination.

NotebookLM's anti-hallucination advantage is grounding: it answers from the documents you provide rather than from its general training data.

3. According to the module, when should you update an existing prompt entry versus creating a new one (branching)?

The right criterion: same use case, better version = update. Different enough use case that both versions would be useful in the same week = branch. You're either tightening a tool or building a second tool with shared DNA.

Update when you're improving the prompt for its original use case. Branch when the modified version serves a meaningfully different context — the signal is whether you'd use both versions in the same week for different things.

4. If AI output is confidently wrong in the same direction on every run, what type of problem is that?

Correct. Consistent directional error means the model is drawing on wrong information or reasoning from a misleading frame. The fix is in the information or framing, not in reasoning technique.

Consistent same-direction error is a different failure mode from inconsistency. It points to a knowledge gap or misleading prompt structure — adding examples or CoT won't fix a systematically wrong premise.

5. The minimum viable fields for a prompt library entry are:

There are four minimum fields, and the last two (what worked, what to watch out for) are often the most valuable because they capture reasoning, not just the prompt text.

The four minimum fields are: the prompt itself, what it's for, what worked, and what to watch out for. Everything else is optional optimization on top of this foundation.
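
As a rough illustration of the four minimum fields, a library entry could be modeled as a small record. This is only a sketch: the class name `PromptEntry`, the field names, and the `missing_fields` helper are hypothetical and do not come from the module.

```python
from dataclasses import dataclass


@dataclass
class PromptEntry:
    """One prompt library entry with the four minimum fields (names are hypothetical)."""

    prompt: str         # the prompt itself
    purpose: str        # what it's for
    what_worked: str    # what worked
    watch_out_for: str  # what to watch out for

    def missing_fields(self) -> list[str]:
        # Report any of the four fields that was left blank.
        return [name for name, value in vars(self).items() if not value.strip()]


entry = PromptEntry(
    prompt="Rewrite this paragraph for a non-technical reader: {text}",
    purpose="Simplifying internal documentation",
    what_worked="Naming the reader up front",
    watch_out_for="",
)
print(entry.missing_fields())  # ['watch_out_for']
```

A check like this makes it harder to save an entry that records the prompt text but drops the reasoning fields.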

6. What is the primary reason AI produces generic writing output when given a generic prompt?

Correct. Specific inputs produce specific outputs. The model fills the gap in your brief with the most statistically average content for the category.

The issue isn't design, tier, or editing requirements — it's the absence of specific raw material in the prompt. Generic prompt = category-average output.

7. Creating fifteen folder categories before you have fifteen prompts is an example of:

Over-categorization means designing structure for a collection that doesn't exist yet. Empty buckets are organizational noise: they make the system feel heavier than it is and predict eventual abandonment.

This is over-categorization. Building elaborate structure before the content exists to fill it is one of two main failure modes. The cure is to derive buckets from actual use patterns, not from what seems comprehensive.

8. The three-layer personal AI stack described in Lesson 4 consists of:

Correct. Daily driver, research layer, specialist — three layers based on task type, not tool tier or cost.

The three layers are organized by task function: general tasks, current-fact tasks, and one specific recurring task. Not by cost or rotation.

9. An AI output consistently sounds like it was written for any company in the industry — not the specific company you're targeting. The most likely prompt-level cause is:

Generic output = thin context. When the AI doesn't have specifics — company, role, what makes this situation different from similar situations — it defaults to the broadest applicable version, which reads as generic and interchangeable.

The diagnostic framework maps "too generic" specifically to a thin context field. Adding specifics about the target audience, what they care about, and what distinguishes this situation is the fix — not adjusting tone or format.

10. A recruiter is using AI to rank candidates and notices the rankings are inconsistent — the same resume sometimes gets a high rank, sometimes low. What's the most likely diagnosis?

Correct. Inconsistency across runs for similar inputs is the signature of an ambiguity problem — the model is making different guesses about what "good" means each time.

Inconsistency on similar inputs = ambiguity problem. The model doesn't have a stable definition of "correct" to work from. Few-shot examples of actual approved rankings would provide that definition explicitly.
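
One way to supply that stable definition is to assemble approved rankings into a few-shot prompt. The sketch below only builds the prompt text; the function name, the 1 to 5 scale, and the sample resumes are assumptions made for illustration.

```python
def build_few_shot_prompt(criteria, examples, new_resume):
    """Assemble a ranking prompt with approved examples placed before the new resume.

    `examples` is a list of (resume_summary, rank, reason) tuples taken from
    rankings a human has already approved (hypothetical data here).
    """
    lines = [
        f"Rank each resume from 1 (weak) to 5 (strong) using these criteria: {criteria}",
        "",
    ]
    for summary, rank, reason in examples:
        lines += [f"Resume: {summary}", f"Rank: {rank}", f"Reason: {reason}", ""]
    # The new resume uses the same layout, with the rank left open for the model.
    lines += [f"Resume: {new_resume}", "Rank:"]
    return "\n".join(lines)


prompt = build_few_shot_prompt(
    criteria="shipped B2B products, led cross-functional teams",
    examples=[
        ("Led a 6-person team shipping a B2B billing product", 5, "Meets both criteria"),
        ("Two years of consumer app QA, no leadership role", 2, "Meets neither criterion"),
    ],
    new_resume="Product manager for an internal analytics tool, led 3 engineers",
)
print(prompt)
```

Because every example pairs a rank with a reason, the model sees what "good" means here instead of guessing on each run.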

11. You're writing a research paper and prompt chain step three produced a solid literature review. Step four asks you to build the argument. Which context method is most appropriate if you're continuing in the same session?

Correct. In the same session, the full literature review is already in the context window. Your step four prompt can directly reference it — "using the literature review above, construct an argument that..." — without any copying or summarizing needed.

Opening a new session cuts you off from all prior context. Stay in session when possible — the context window holds everything you've exchanged, and later prompts can exploit that accumulated specificity directly.

12. Claude's "Constitutional AI" training approach produces a model that is more likely than ChatGPT to:

Correct. Constitutional AI trains self-evaluation, which manifests as qualifications, pushback, and reasoning-heavy responses.

Constitutional AI is a self-evaluation framework — it makes Claude assess its outputs, leading to more qualifications and reasoning, not faster or shorter responses.

13. You're preparing for a job interview at a product company. You want honest evaluation of your answers to common interview questions — not just validation. Which tool should you use?

Right. When you specifically want honest critique over validation, Claude's Constitutional AI orientation is the structural advantage. It's trained to evaluate honestly, not to make you feel good.

The key requirement is honest evaluation. In the module's framing, Claude's Constitutional AI training orients it toward evaluating rather than validating, while ChatGPT's RLHF training makes it more likely to affirm than critique.

14. You get a draft cover letter from AI and the opening is too safe. What's the most effective next step?

Right. Targeted surgical feedback produces better revisions than vague improvement requests or starting over. The iteration loop is where quality is built.

Vague requests for improvement produce vague changes. Identify what's specifically wrong and direct a specific fix — that's the iteration skill.

15. What are the three components of a strong role assignment?

Correct. Identity gives the model a vantage point. Disposition gives it a communication style. Stakes give it a goal. All three together produce significantly more useful output than any one alone.

Job title and years of experience are part of identity, but they're not the complete framework. Disposition (how this person actually communicates) and Stakes (what they're trying to achieve) are the layers most people miss.
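
The three components can be kept separate until the prompt is assembled, so that a missing layer is obvious. This is a minimal sketch; `build_role_prompt` and the sample wording are hypothetical, not the module's template.

```python
def build_role_prompt(identity: str, disposition: str, stakes: str) -> str:
    """Combine identity, disposition, and stakes into one role assignment."""
    components = {"identity": identity, "disposition": disposition, "stakes": stakes}
    # Refuse to build a role that skips one of the three layers.
    missing = [name for name, text in components.items() if not text.strip()]
    if missing:
        raise ValueError(f"Role assignment is missing: {', '.join(missing)}")
    return f"You are {identity.strip()}. {disposition.strip()} {stakes.strip()}"


role = build_role_prompt(
    identity="a hiring manager with ten years of experience at a product company",
    disposition="You are direct and point out weak answers without softening them.",
    stakes="Your goal is to decide whether this candidate can own a roadmap within six months.",
)
print(role)
```

Passing only a job title raises an error, which mirrors the point that identity alone is an incomplete role.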

16. Why does AI output often look "done" even when it isn't?

The surface finish creates a "finished" signal. That's the formatting illusion — the look of done-ness independent of the quality of the content.

It's not about expertise or length — it's about the visual and structural signals we use to categorize things as "rough draft" vs. "finished." AI always delivers those "finished" signals.

17. Gemini's primary advantage over ChatGPT and Claude for research tasks is:

Correct. Gemini's differentiator for research is real-time web access and Google integration — not parameter size or reasoning depth.

Gemini's research advantage comes from real-time web access, not raw capability. It can retrieve current information; the others (without browsing) cannot.

18. The "feedback loop that actually works" for prompt iteration has four steps. Which step does the module say most people skip?

The compare step. Without holding both outputs side by side, you can't confirm the change helped — which means you're iterating blind and can't build the pattern recognition that makes future iterations faster.

The skipped step is comparison. Use → notice → add constraint → use again → compare. Without comparing outputs, you don't know if the change was actually an improvement, and you miss the learning that builds prompt engineering intuition over time.
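
The compare step can be as simple as a line-by-line diff of the two outputs. The sketch below uses Python's standard `difflib` module; the function name and the sample cover letter lines are assumptions for illustration.

```python
import difflib


def compare_outputs(before: str, after: str) -> list[str]:
    """Return a unified diff of two AI outputs, compared line by line."""
    return list(
        difflib.unified_diff(
            before.splitlines(),
            after.splitlines(),
            fromfile="before",
            tofile="after",
            lineterm="",
        )
    )


before = "Dear team,\nI am writing to apply for the role."
after = "Dear team,\nYour launch post is the reason I am applying."
diff = compare_outputs(before, after)
print("\n".join(diff))
```

Lines prefixed with `-` came from the earlier output and lines prefixed with `+` from the revised one, which makes it easier to judge whether the added constraint actually helped.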

19. What is "explicit carry-forward" in cross-session chaining?

Exactly. New sessions have no access to prior exchanges. Explicit carry-forward means you physically include the relevant prior output — pasted, quoted, labeled — so the model is working with the actual content rather than inferring what you mean by "our previous analysis."

AI sessions don't share memory across conversations. Explicit carry-forward is exactly that — explicit. You paste or quote the actual prior output into your new message. No platform feature or session ID solves the cross-session context gap for you.
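
In practice, explicit carry-forward amounts to wrapping the prior output in clear labels and placing it ahead of the new request. The following is a sketch under that assumption; the function name and the delimiter style are hypothetical choices, not a platform feature.

```python
def carry_forward(prior_output: str, label: str, new_request: str) -> str:
    """Build a new-session message that includes the actual prior output."""
    tag = label.upper()
    return (
        f"Below is the {label} from an earlier session.\n"
        f"--- BEGIN {tag} ---\n"
        f"{prior_output.strip()}\n"
        f"--- END {tag} ---\n\n"
        f"{new_request}"
    )


message = carry_forward(
    prior_output="Three studies link remote work to higher retention...",
    label="literature review",
    new_request="Using the literature review above, construct an argument for a hybrid policy.",
)
print(message)
```

The labels let the new request point at the pasted content by name rather than at "our previous analysis," which the new session has never seen.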

20. The correct fix for Format Mismatch is:

The fix is structure instructions that are specific about format, structure, and length. "Write in exactly 150 words as a single paragraph" is more effective than "keep it short." Specificity is the fix.

Format Mismatch is a structural problem — the model produced the wrong form. The fix must address structure directly, not information content or tone.
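
A specific structure instruction has a side benefit: it can be checked mechanically. The sketch below tests an output against a word count and a single-paragraph requirement; the function name and the rule that a blank line separates paragraphs are assumptions for illustration.

```python
def meets_format(text: str, word_count: int, single_paragraph: bool = True) -> bool:
    """Check an output against an exact word count and paragraph requirement."""
    words = len(text.split())
    # Treat blank lines as paragraph breaks (an assumption about the output).
    paragraphs = [p for p in text.split("\n\n") if p.strip()]
    if single_paragraph and len(paragraphs) != 1:
        return False
    return words == word_count


print(meets_format("Short and to the point.", word_count=5))      # True
print(meets_format("Short and\n\nto the point.", word_count=5))   # False
```

A vague instruction like "keep it short" offers nothing comparable to test against.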

Final Exam