Problem ontology, value specification, and the architecture of AI project failure.
Amazon's experimental hiring AI learned from roughly ten years of the company's hiring history. The problem: that history was predominantly male. The AI didn't just replicate this bias; it actively penalized resumes that included the word "women's" (as in "women's chess club") and downgraded graduates of two all-women's colleges. Amazon's stated goal was to identify candidates who would succeed. Its operationalized goal was to find candidates similar to past successful hires. Those two goals were not the same, and the difference was the problem. According to Reuters reporting in 2018, Amazon had disbanded the team by early 2017. The AI had been working perfectly on the specified problem while failing on the actual one.
AI project failures typically trace to one of three specification failures:
Rigorous problem specification requires explicit value decomposition:
Before building any AI system, explicitly state: what are the terminal values? What instrumental metrics are you using? Why do those metrics produce those values? Under what conditions would they diverge? If you can't answer these questions, you can't safely build the system.
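One way to make these four questions hard to skip is to record the answers in a structure that can be checked before work starts. The following is a minimal sketch, not a prescribed method: the class names `InstrumentalMetric` and `ValueSpec`, the `gaps` method, and the example entries are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class InstrumentalMetric:
    name: str
    rationale: str  # why this metric is expected to track a terminal value
    divergence_conditions: list[str] = field(default_factory=list)


@dataclass
class ValueSpec:
    terminal_values: list[str]
    metrics: list[InstrumentalMetric]

    def gaps(self) -> list[str]:
        """Return the specification questions that are still unanswered."""
        problems = []
        if not self.terminal_values:
            problems.append("no terminal values stated")
        if not self.metrics:
            problems.append("no instrumental metrics stated")
        for metric in self.metrics:
            if not metric.rationale.strip():
                problems.append(f"{metric.name}: no rationale given")
            if not metric.divergence_conditions:
                problems.append(f"{metric.name}: no divergence conditions identified")
        return problems


# Illustrative spec loosely modeled on the hiring example (hypothetical wording).
hiring_spec = ValueSpec(
    terminal_values=["identify candidates who will succeed in the role"],
    metrics=[
        InstrumentalMetric(
            name="similarity_to_past_hires",
            rationale="past successful hires indicate what success looks like",
        )
    ],
)

for problem in hiring_spec.gaps():
    print(problem)
```

In this example the spec states a terminal value, a metric, and a rationale, but nobody has written down when the metric and the value would diverge, so `gaps()` reports that one open question. A non-empty result is a signal to stop and finish the specification, not something the code can resolve.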
5 questions — free, untracked, retake anytime.
What was the specification gap in Amazon's hiring AI?
What is the 'decomposition test' in value specification?
What is 'temporal myopia' as a problem specification failure?
What is the difference between 'terminal values' and 'instrumental values' in AI specification?
How does stakeholder exclusion from problem definition produce specification failures?
Apply rigorous value decomposition to AI problem specification.
Practice rigorous value decomposition for AI specification.
Prompt architecture, context engineering, and the epistemics of system design.
Research published in 2023 found that models prompted with "You are an expert X" produced outputs rated as higher quality by domain experts than the same models without persona specification — even though the model weights didn't change. Further research found that prompts containing logical inconsistencies, contradictions, or ambiguous instructions produced highly variable outputs that could be exploited adversarially. The system prompt is not just instructions — it is the epistemic context that shapes what patterns the model draws on. Understanding this mechanistically, not just empirically, is required for reliable system design.
Large-scale AI deployments require systematic context engineering beyond individual prompt design:
Reliable AI systems at scale typically combine: a carefully designed system prompt, a RAG layer for factual grounding, dynamic context injection for relevance, and guardrail layers for constraint enforcement. No single element is sufficient.
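The layering described above can be sketched in a few lines. This is a toy illustration under stated assumptions: the document store, the keyword retrieval, the blocked-term list, and the `echo_model` stand-in are all hypothetical placeholders, and a real deployment would replace each with a proper retrieval index, policy checks, and an actual model call.

```python
SYSTEM_PROMPT = "You are a support assistant. Answer only from the provided context."

# Hypothetical in-memory store standing in for a real retrieval index.
DOCUMENTS = {
    "refund": "Refunds are available within 30 days of purchase.",
    "shipping": "Standard shipping takes 5 to 7 business days.",
}

# Toy guardrail list; real constraint enforcement is far more involved.
BLOCKED_TERMS = ("password", "credit card number")


def retrieve(query):
    """RAG layer (toy): return documents whose keyword appears in the query."""
    q = query.lower()
    return [text for keyword, text in DOCUMENTS.items() if keyword in q]


def build_prompt(query, user_context):
    """Combine system prompt, retrieved grounding, and per-session context."""
    retrieved = retrieve(query)
    grounding = "\n".join(retrieved) if retrieved else "(no relevant documents found)"
    session = ", ".join(f"{k}={v}" for k, v in sorted(user_context.items()))
    return (
        f"{SYSTEM_PROMPT}\n\n"
        f"Context:\n{grounding}\n\n"
        f"Session: {session}\n\n"
        f"User: {query}"
    )


def passes_guardrail(text):
    """Guardrail layer (toy): reject text containing any blocked term."""
    lowered = text.lower()
    return not any(term in lowered for term in BLOCKED_TERMS)


def answer(query, user_context, model):
    """Run the full pipeline: input guardrail, prompt assembly, model, output guardrail."""
    if not passes_guardrail(query):
        return "Request declined by input guardrail."
    response = model(build_prompt(query, user_context))
    if not passes_guardrail(response):
        return "Response withheld by output guardrail."
    return response


def echo_model(prompt):
    """Stand-in for a model call: returns the first line of the retrieved context."""
    return prompt.split("Context:\n", 1)[1].splitlines()[0]


print(answer("How does a refund work?", {"plan": "basic"}, echo_model))
print(answer("What is my password?", {}, echo_model))
```

Each layer is a separate function so that it can be tested and replaced independently. That reflects the point that no single element carries the reliability of the whole system.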
5 questions — free, untracked, retake anytime.
How do logical contradictions in system prompts create security vulnerabilities?
What does RAG (Retrieval-Augmented Generation) address in AI system design?
What is 'uncertainty architecture' as a design requirement for AI systems?
Why is no single element (system prompt, RAG, guardrails) sufficient for reliable AI systems at scale?
What mechanistic insight explains why specific prompts produce better outputs than vague ones?
Design a full AI system architecture for a complex use case.
Evaluation methodology, benchmark design, and the epistemics of AI quality assessment.
Beginning in 2021, researchers found that AI coding assistants, GitHub Copilot among them, generated security vulnerabilities at measurable rates: CWE (Common Weakness Enumeration) issues in 40% or more of security-sensitive code completions in some studies. Developers using these tools often didn't notice: the code looked correct, compiled, and passed basic tests. The vulnerability was in logical correctness under adversarial conditions, something standard evaluation didn't test. The evaluation problem: most developers (and most evaluation frameworks) test for 'does this work?' not 'can this be exploited?'
Rigorous AI evaluation requires multiple distinct dimensions measured independently:
Evaluations measure what they measure — which may not be what you care about:
An AI system that passes your evaluation framework is safe only within the scope of what your evaluation measured. Knowing what your evaluation doesn't measure is as important as knowing what it does.
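A report format can make the scope of an evaluation explicit by distinguishing "failed" from "never measured". The sketch below assumes a hypothetical `evaluate` function, hypothetical dimension names, and arbitrary thresholds; it illustrates the idea and is not a standard framework.

```python
def evaluate(scores, thresholds):
    """Classify each required dimension as passed, failed, or not measured.

    scores:     dimension name -> measured score (only dimensions actually tested)
    thresholds: dimension name -> minimum acceptable score (all required dimensions)
    """
    report = {"passed": [], "failed": [], "not_measured": []}
    for dimension, minimum in thresholds.items():
        if dimension not in scores:
            # Absence of a score is reported, never silently treated as a pass.
            report["not_measured"].append(dimension)
        elif scores[dimension] >= minimum:
            report["passed"].append(dimension)
        else:
            report["failed"].append(dimension)
    return report


# Hypothetical requirements for a code-generation assistant.
thresholds = {
    "functional_correctness": 0.95,
    "security": 0.99,
    "robustness": 0.90,
}

# Only two of the three dimensions were actually tested.
scores = {"functional_correctness": 0.97, "robustness": 0.85}

report = evaluate(scores, thresholds)
for status, dimensions in report.items():
    print(status, dimensions)
```

Here the system passes functional correctness, fails robustness, and has no security result at all. Listing `security` under `not_measured` keeps a "does this work?" pass from being read as an answer to "can this be exploited?".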
5 questions — free, untracked, retake anytime.
What did the GitHub Copilot security vulnerability study reveal about evaluation frameworks?
What is 'construct validity' as an evaluation design concern?
What is 'failure mode characterization' and why is it more informative than overall accuracy?
What is 'benchmark saturation' as an evaluation problem?
Why must evaluators know what their evaluation framework doesn't measure?
Design a rigorous multi-dimensional evaluation framework.
Design a rigorous evaluation framework with explicit scope limitations.
Organizational design, automation bias mitigation, and the human factors of AI systems.
The 2009 crash of Air France 447 occurred partly because of automation dependency: when the autopilot disconnected after iced-over pitot tubes produced unreliable airspeed readings, the pilots, who had been managing automated systems rather than manually flying, were unable to diagnose and respond to the situation. They had the controls; they lacked the situational awareness. The final report by the BEA, the French accident investigation authority, highlighted a structural problem: designing systems for normal operations creates skill atrophy that makes failure situations worse. This is the automation paradox: the more reliable the automation, the less practiced humans are at handling the cases when it fails.
The automation paradox has direct implications for AI workflow design:
Meaningful human oversight requires organizational design, not just technical design:
Human oversight isn't just a workflow feature — it requires organizational culture, training, incentive structures, and role design that make meaningful oversight possible in practice, not just in theory.
5 questions — free, untracked, retake anytime.
What is the 'automation paradox' and why does it matter for AI workflow design?
How does incentive structure affect oversight quality in human-AI workflows?
What is 'role preservation' as an organizational design requirement for AI-augmented workplaces?
Why must override culture be explicitly designed rather than assumed?
What does Air France 447 illustrate about designing AI systems for failure cases?
Design organizational infrastructure for meaningful AI oversight.
Design the organizational infrastructure for meaningful AI oversight.
Structured adversarial evaluation, failure mode taxonomy, and pre-deployment safety engineering.
In 2022, Anthropic researchers published "Discovering Language Model Behaviors with Model-Written Evaluations," research on structured elicitation: systematic attempts to find behaviors models could exhibit that weren't intended or anticipated. They found that models could produce outputs that were harmful in specific contexts even when safety training had reduced harmful outputs in general contexts. The key insight: a model's general safety performance on standard evaluations does not predict its behavior under targeted adversarial elicitation. Standard evaluation and adversarial evaluation measure different things, and both are necessary.
Rigorous pre-deployment safety evaluation requires a structured adversarial program:
Effective red-teaming requires a structured failure mode taxonomy:
Red-team coverage should be proportional to harm severity times probability. Safety failures in high-stakes applications deserve exhaustive testing; minor usability issues deserve proportionally less attention.
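The coverage principle is simple arithmetic: score each failure mode as severity times probability, then divide the testing budget in proportion to those scores. The sketch below uses a hypothetical `allocate_coverage` function with made-up failure modes, severity scores, and probabilities; in practice those inputs come from a threat model and are themselves uncertain estimates.

```python
def allocate_coverage(failure_modes, total_hours):
    """Split a red-team budget in proportion to severity * probability.

    failure_modes: name -> (severity, probability), where severity is on any
                   consistent scale and probability is between 0 and 1.
    """
    risk = {name: severity * probability
            for name, (severity, probability) in failure_modes.items()}
    total_risk = sum(risk.values())
    if total_risk == 0:
        raise ValueError("all failure modes have zero estimated risk")
    return {name: total_hours * r / total_risk for name, r in risk.items()}


# Illustrative estimates only (severity on a 1-10 scale).
failure_modes = {
    "harmful_instructions": (10, 0.05),  # risk score 0.5
    "privacy_leak": (8, 0.10),           # risk score 0.8
    "formatting_glitch": (1, 0.20),      # risk score 0.2
}

allocation = allocate_coverage(failure_modes, total_hours=150)
for name, hours in allocation.items():
    print(f"{name}: {hours:.1f} hours")
```

With these numbers the risk scores sum to 1.5, so a 150-hour budget splits into roughly 50, 80, and 20 hours. The formatting glitch is the most likely failure but gets the least time because its severity is low.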
5 questions — free, untracked, retake anytime.
What did Anthropic's behavior elicitation research demonstrate about standard vs. adversarial evaluation?
Why does rigorous red-teaming require both internal and external teams?
What is a 'threat model' in the context of AI red-teaming?
What is 'automated adversarial generation' and what does it address?
How should red-team coverage be prioritized according to the coverage principle?
Design a comprehensive adversarial evaluation program.
Deployment ethics, ongoing monitoring, and the governance of production AI systems.
When users found in 2020 that Twitter's image-cropping algorithm systematically cropped images to show white faces over Black faces, the company's initial response was that its pre-release testing had found no evidence of racial or gender bias. Further research by its own Responsible ML team, published in 2021, confirmed the disparity. The algorithm had been deployed and operating for years, cropping millions of images, before the bias was identified. The team that examined it internally noted that the monitoring infrastructure to detect bias post-deployment hadn't existed; the finding came from users outside the company who noticed the pattern. The monitoring failure preceded the bias failure: no one was watching for it.
The Twitter image cropping case illustrates two failures: a biased algorithm and absent monitoring. The second failure made the first failure invisible for years. Deployers of AI systems are responsible for building the monitoring infrastructure to detect problems, not waiting for outsiders to find them.
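A post-deployment bias monitor can start very small: log each decision with the relevant group label, compute per-group outcome rates, and raise an alert when the rates drift too far apart. The sketch below is a minimal illustration, not a description of any system Twitter used. The function names, the event format, and the 0.8 ratio threshold are all assumptions, and a production monitor would also need adequate sample sizes, statistical tests, and a careful choice of which groups and outcomes to track.

```python
from collections import defaultdict


def selection_rates(events):
    """Compute the fraction of positive outcomes per group.

    events: iterable of (group, selected) pairs, where selected is a bool.
    """
    counts = defaultdict(lambda: [0, 0])  # group -> [selected count, total count]
    for group, selected in events:
        counts[group][0] += int(selected)
        counts[group][1] += 1
    return {group: chosen / total for group, (chosen, total) in counts.items()}


def disparity_alert(events, min_ratio=0.8):
    """Return (alert, rates); alert is True when lowest/highest rate < min_ratio.

    min_ratio is an assumed threshold chosen for illustration.
    """
    rates = selection_rates(events)
    if not rates:
        return False, rates
    highest = max(rates.values())
    if highest == 0:
        return False, rates
    return min(rates.values()) / highest < min_ratio, rates


# Synthetic log: group A selected 60 times out of 100, group B 40 out of 100.
events = (
    [("A", True)] * 60 + [("A", False)] * 40
    + [("B", True)] * 40 + [("B", False)] * 60
)

alert, rates = disparity_alert(events)
print(rates)
print("alert" if alert else "ok")
```

In the synthetic log the lower rate is about two thirds of the higher one, which falls below the assumed threshold and triggers the alert. The value of even a crude check like this is that it runs continuously on production data, so a disparity surfaces to the deployer first.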
5 questions — free, untracked, retake anytime.
What two failures did the Twitter image cropping case illustrate?
What is 'distribution shift detection' in production AI monitoring?
What is a 'feedback loop' monitoring concern in production AI?
What does 'external audit access' require for high-stakes AI systems?
What does the Twitter case illustrate about deployers' monitoring responsibilities?
Design a comprehensive production AI monitoring and governance framework.
Equity by design, participatory development, and the politics of AI access.
The MIT Media Lab's Civic Media project documented a pattern in AI civic tech deployment: well-funded organizations built AI tools for underserved communities without involving those communities in design. The tools reflected the designers' assumptions about what those communities needed — which differed significantly from what communities said they needed when asked. Tools designed for efficiency often conflicted with community values around relationship and trust. Tools designed for data collection conflicted with communities' justified distrust of institutions that had used data against them. The pattern had a name: "parachute tech" — solutions dropped in from outside without community anchoring.
Equity considerations integrated throughout design produce different results than equity reviews at the end:
Participatory design — involving affected communities in design — is not just ethically required; it produces better tools:
Building AI tools for underserved communities without them produces tools that serve designers' assumptions rather than communities' needs. Participation is not just ethical obligation — it is the technical requirement for building tools that actually work.
5 questions — free, untracked, retake anytime.
What is 'parachute tech' and why does it fail?
Why is participatory design a technical requirement, not just an ethical obligation?
What is 'data sovereignty' in the context of equity-centered AI design?
What does 'equity by design' require compared to 'equity by afterthought'?
Why is the question 'are there better-positioned builders?' important in equity-centered AI development?
Design an equity-centered AI development process.
Professional ethics, whistleblowing, and the long-term obligations of those who build consequential AI.
In 2024, several current and former employees of major AI companies signed an open letter, "A Right to Warn about Advanced Artificial Intelligence," arguing that AI companies' confidentiality agreements prevented them from raising safety concerns publicly, even when internal escalation had failed. The letter called for the right to report safety concerns to regulators and the public without retaliation. This is a new frontier in professional ethics: what obligations do people who build consequential AI systems have when the organizations they work for resist safety concerns? The question isn't hypothetical; it's current.
The builder's responsibility doesn't end at deployment — it evolves:
When organizations resist safety concerns, builders face a classic ethical sequence:
Building consequential AI carries ongoing professional responsibility. The people who build these systems have technical knowledge that uniquely positions them to identify when something is wrong — and that positioning creates responsibility that doesn't end at deployment or when employment ends.
5 questions — free, untracked, retake anytime.
What did the 'A Right to Warn' open letter address?
Why do builders have unique post-deployment monitoring obligations?
What distinguishes 'exit' from 'whistleblowing' as responses to organizational safety resistance?
What is 'proactive disclosure' as an ongoing builder responsibility?
Why does the professional responsibility of AI builders extend beyond employment?
Synthesize the curriculum and develop your builder's ethics framework.
Synthesize the curriculum and develop your personal framework for building responsibly.
8 questions covering all lessons. Free, untracked, retake anytime.
divergence in AI specification means:
Logical contradictions in system prompts create:
The GitHub Copilot security vulnerability finding shows:
The automation paradox means that for high-stakes AI-assisted domains:
Anthropic's behavior elicitation research demonstrated that:
The Twitter image cropping case illustrates:
Participatory design is a technical requirement (not just an ethical obligation) because:
The professional responsibility of AI builders: