When Lion Air Flight 610 crashed into the Java Sea on October 29, 2018, killing all 189 people aboard, investigators eventually traced a central factor to the Maneuvering Characteristics Augmentation System (MCAS), a flight-control algorithm Boeing engineers had designed to handle certain aerodynamic conditions. The system received data from a single angle-of-attack sensor. That sensor gave a faulty reading. The algorithm, following its instructions with perfect consistency, pushed the nose down repeatedly. Pilots had fewer than ten minutes and incomplete information about which automated system was fighting them. A second crash, Ethiopian Airlines Flight 302 on March 10, 2019, followed an almost identical pattern, killing 157 more people.
The MCAS logic was not irrational. It was doing exactly what it was designed to do. But the judgment about how much authority to give an automated system in a life-or-death situation — and whether pilots needed to know it existed — was a human judgment. Committees of engineers, managers, and regulators made choices that encoded certain priorities and omitted others. The algorithm had no capacity to recognise when context had changed so fundamentally that its own authority should be questioned.
Moral reasoning is not rule-following. Rules are inputs to moral reasoning — they inform it, constrain it, provide starting points. But genuinely ethical decisions require the reasoner to weigh competing values, account for unique context, tolerate ambiguity, and accept responsibility for outcomes. These are not functions; they are capacities.
Philosophers distinguish between rule-based ethics (Kantian deontology: follow the universal rule), outcome-based ethics (utilitarianism: maximise aggregate welfare), and virtue ethics (act as a person of good character would act). Human moral agents can draw on all three frameworks simultaneously and shift emphasis depending on circumstances. An experienced judge, doctor, or soldier does this constantly, often without conscious deliberation.
Current AI systems, including large language models, are trained on human moral language and can generate sophisticated-sounding moral arguments. But they do not hold stakes in outcomes. They are not responsible parties. They cannot be harmed, shamed, or held accountable in any meaningful social sense. This absence of stake is not a technical limitation to be engineered away — it is a structural feature of what these systems are.
A 2018 MIT-led study on autonomous vehicle ethics (Awad et al., "The Moral Machine Experiment," published in Nature) collected roughly 40 million decisions from people in 233 countries and territories and found that moral preferences varied systematically by culture, age, and social role. There is no single correct encoding. Any system that claims to have solved this problem has actually smuggled in somebody's values without admitting it.
In medicine, clinical guidelines exist precisely because individual judgment is fallible — but the guidelines do not replace judgment. The landmark 2016 case at Addenbrooke's Hospital in Cambridge, UK, documented how an AI diagnostic system achieved higher accuracy than junior doctors on certain pathology reads while simultaneously missing critical social context (a patient's expressed refusal of treatment) that altered the entire care pathway. The system performed its task correctly within its defined scope. It had no mechanism to recognise that the scope was wrong.
In criminal justice, the COMPAS algorithm used across multiple US jurisdictions to predict recidivism risk was analysed by ProPublica in 2016. The algorithm's mathematical outputs were internally consistent. What the algorithm could not do was exercise the kind of contextual judgment a parole officer with twenty years of experience might apply: recognising when a person's circumstances had changed in ways the historical training data could not capture, or when the data itself was a product of discriminatory policing.
In warfare, U.S. Department of Defense policy on autonomy in weapon systems (Directive 3000.09), reinforced by the AI ethical principles the department adopted in 2020, requires that commanders and operators be able to exercise "appropriate levels of human judgment over the use of force." This is not sentimentality. It reflects the legal and moral reality that accountability for killing requires a human agent who can be held responsible.
The practical implication is not that AI should never be involved in consequential decisions. It already is, and often beneficially. The implication is that the human who configures, oversees, and interprets AI outputs in high-stakes contexts holds a skill that cannot be automated: the ability to recognise when the machine's output is technically correct but contextually wrong, and to bear responsibility for acting on that recognition.
Workers who develop this capacity — who can articulate why a particular AI recommendation is inappropriate for a specific context, and who are willing to own that judgment — will be genuinely difficult to replace. Workers who simply relay AI outputs without applying contextual scrutiny are, in a real sense, performing the same function as a conduit, and conduits are easy to remove.
There is a difference between a decision that is optimal by measurable criteria and a decision that is right given the full human context. AI can pursue the first. Only humans can be responsible for the second.
You will present the AI assistant with scenarios drawn from real-world documented cases where automated systems produced technically correct outputs that were contextually inappropriate. Your task is to articulate why the context changes the ethical calculus — and to probe the AI's reasoning for gaps.
Complete at least three exchanges. Push back on answers that feel too tidy. Real moral reasoning is rarely clean.
In 2019, the Cleveland Clinic began deploying an AI triage system to help route patients through its emergency department. The system performed well on objective metrics: wait time reduction, appropriate severity classification, resource allocation. By 2022, published analysis showed that patient satisfaction scores — a separate tracked metric — had become more divergent from clinical efficiency scores than at any prior point in the hospital's measurement history.
Interviews with patients who gave low satisfaction scores while receiving objectively fast, high-quality clinical care returned a consistent theme: they felt no one had listened to them. The triage process had become faster and more accurate, but the moment of human acknowledgment — the nurse who said "that sounds frightening, let's get you seen" — had been compressed or removed from the workflow. The clinical outcome improved. The human experience deteriorated. These are not the same thing, and the difference matters to the people receiving care.
Empathy in professional contexts is not a soft skill appended to competence — it is frequently the mechanism through which competence operates. A therapist who correctly identifies a cognitive distortion but communicates it without empathy may worsen the patient's condition. A lawyer who understands a client's legal position but not their emotional state may recommend a technically optimal settlement the client will reject. A manager who diagnoses a team conflict accurately but delivers the assessment without relational sensitivity may entrench the conflict rather than resolve it.
The economic value of empathy has been increasingly documented. A 2020 Harvard Business Review analysis of 889 companies across 45 industries found a statistically significant positive correlation between empathy scores (measured by Businessolver's Empathy Monitor survey instrument) and financial performance, employee retention, and customer loyalty. This is not a coincidence — empathy generates trust, and trust reduces the friction costs that permeate every transaction.
AI systems can simulate empathic language with increasing sophistication. What they cannot do is experience the other person's reality in any sense that creates genuine mutual understanding. The simulation can be useful — but it is different in kind from the real thing, and people — particularly in high-stakes moments — increasingly notice the difference.
A 2023 Stanford study (Liebrenz et al.) examined patient responses to mental health support from AI chatbots versus human therapists. Patients rated AI interactions as "helpful" and "informative" but systematically described them as lacking something they couldn't always name. Follow-up interviews identified the missing element as felt mutuality — the sense that the other party was genuinely affected by what they heard, not merely processing it.
Trust is not the same as reliability. A vending machine is reliable. Trusted relationships involve vulnerability, reciprocity, and accumulated shared history. Clients who trust a professional do so because they believe that professional genuinely cares about their outcome — not merely that the professional will perform their contracted tasks accurately.
This distinction has direct career implications. Research by Maister, Green, and Galford in The Trusted Advisor identified four components of professional trust: credibility, reliability, intimacy, and self-orientation (specifically, low self-orientation — the degree to which the advisor puts the client's interest ahead of their own). AI systems score extremely well on credibility and reliability in many domains. They score zero on intimacy, and the question of self-orientation is philosophically incoherent for a system with no interests. The trust equation that clients apply to human advisors simply does not map to AI in its current form.
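The four components are usually combined in the book's "trust equation": trustworthiness = (credibility + reliability + intimacy) / self-orientation. The following is a minimal Python sketch of that arithmetic, not an implementation from the book. The function name `trust_quotient`, the 1-to-10 rating scale, and the example scores are all assumptions made for illustration. Returning `None` when self-orientation is missing is one way to represent the point above that the quantity is undefined for a system with no interests.

```python
from typing import Optional


def trust_quotient(credibility: float, reliability: float,
                   intimacy: float,
                   self_orientation: Optional[float]) -> Optional[float]:
    """Return (C + R + I) / S, or None when S is not meaningful.

    Scores are hypothetical ratings on a 1-10 scale (an assumption).
    """
    if self_orientation is None or self_orientation <= 0:
        # No coherent self-orientation score: the ratio cannot be computed.
        return None
    return (credibility + reliability + intimacy) / self_orientation


# Invented example scores, for illustration only.
human_advisor = trust_quotient(8, 8, 7, 2)
ai_system = trust_quotient(10, 10, 0, None)

print(human_advisor)  # 11.5
print(ai_system)      # None
```

The sketch makes the structural claim visible: high credibility and reliability scores do not produce a trust value when the denominator has no meaning.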
Professionals who actively cultivate the relational dimensions of their work — who invest in understanding clients as people, who are present and genuinely responsive in difficult moments — are accumulating a form of capital that compounds over time and is structurally difficult for automation to displace.
In fields including social work, palliative care, conflict mediation, executive coaching, and crisis counselling, the human relationship is not the delivery mechanism for a separate service — it is the service. Attempts to automate these fields have consistently demonstrated that efficiency gains in administrative components do not compensate for losses in relational quality.
A 2021 systematic review published in JAMA Network Open examining AI-assisted mental health interventions found that AI tools were effective at delivering psychoeducation (information about conditions and treatments) and structured exercises (CBT worksheets, mood tracking), but consistently ineffective as primary providers for complex relational trauma, grief, and personality disorders — conditions where the therapeutic relationship itself is the treatment mechanism.
The professional who is known — not just competent — is the one whose position is most durable. Relationships are accumulated evidence that you are safe to trust. AI can assist your work; it cannot accumulate your reputation with the specific people who know you.
Relational capital is often invisible until it's gone. In this lab, you'll work with the AI assistant to map the specific trust relationships and empathic capacities in your own work that AI cannot replicate. You'll also examine where simulated empathy from AI is already close enough to be useful — and where it isn't.
Be specific about your actual role or a role you're familiar with. Vague answers produce vague insights.
In February 2023, Getty Images filed a lawsuit in the U.S. District Court for the District of Delaware against Stability AI, alleging that Stability AI had scraped more than 12 million images from Getty's collection and used them to train its Stable Diffusion model without license or compensation. The case brought into sharp focus a technical reality about how generative image AI works: it does not create from nothing. It identifies and recombines statistical patterns learned from enormous quantities of existing human-created work.
The legal question of copyright was unresolved at the time of writing. But the creative question was more immediately interesting: what distinguished the output of Stable Diffusion from the inputs it was trained on? Art directors and designers who worked with generative AI tools through 2022 and 2023 consistently reported the same observation — the outputs were frequently impressive, often beautiful, and almost never surprising in the way that genuinely original human creative work surprises. They were, in the words of one senior creative director at Pentagram interviewed by Fast Company in 2023, "the average of everything they've consumed, rendered with extraordinary skill."
There is a model of creativity — sometimes called the combinatorial model — that holds that all creative acts are novel recombinations of existing elements. Under this model, AI creativity and human creativity differ only in degree, not in kind. This view has gained traction in popular writing about AI, particularly in the claim that "AI is just doing what humans do, only faster."
The philosopher Margaret Boden's taxonomy of creativity offers a more careful analysis. Boden distinguishes between combinational creativity (novel combinations of familiar ideas), exploratory creativity (pushing the boundaries of an established conceptual space), and transformational creativity (fundamentally restructuring the conceptual space itself). Current AI systems operate primarily in the first category and occasionally in the second. Transformational creativity — the kind that generates new paradigms rather than new exemplars — remains observationally rare in AI output and is not structurally expected from systems that optimise for prediction of existing patterns.
The documented examples of genuinely transformational creative work — Einstein's reconception of time and space, Coltrane's development of sheets of sound in jazz improvisation, Picasso's cubist deconstruction of visual perspective — share a common feature: they broke with pattern rather than extending it. Systems trained to predict and reproduce patterns are not structured to break with them.
A 2023 study by Noy and Zhang at MIT (published in Science) found that college-educated professionals who used ChatGPT for occupation-specific writing tasks finished faster and produced outputs that evaluators rated as significantly higher quality, and that the gap between weaker and stronger writers narrowed. The average quality rose; the variance fell. This is the signature of a tool that lifts the floor while compressing the ceiling: it eliminates the worst outputs and pulls the rest toward a common level.
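The "floor rises, ceiling compresses" pattern can be made concrete with a few lines of Python. This is a sketch using invented scores on a hypothetical 1-to-10 scale; the numbers are not data from the study, and the helper name `summarize` is an assumption for illustration.

```python
from statistics import mean, pstdev

# Hypothetical evaluator scores (1-10), invented for illustration only.
without_tool = [2, 4, 5, 6, 9]
with_tool = [6, 6, 7, 7, 7]


def summarize(scores):
    """Return the mean, spread (population std dev), floor, and ceiling."""
    return {
        "mean": mean(scores),
        "spread": pstdev(scores),
        "floor": min(scores),
        "ceiling": max(scores),
    }


before = summarize(without_tool)
after = summarize(with_tool)

print(before["mean"], after["mean"])        # 5.2 6.6
print(before["floor"], after["floor"])      # 2 6
print(before["ceiling"], after["ceiling"])  # 9 7
print(after["spread"] < before["spread"])   # True
```

In this toy data the mean and the minimum both rise while the maximum and the spread both fall, which is the shape of outcome the paragraph describes.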
Human creativity in professional contexts carries intentional stakes that AI creativity does not. When a novelist makes a structural choice about their narrative, they are committing something — a vision, a risk, an argument about what matters. When an architect chooses a material or a spatial configuration, they are expressing a position about how people should inhabit space. When an advertising creative reframes a client's product around an unexpected cultural insight, they are making a bet with their professional reputation on the line.
These stakes are not incidental to the creative work — they are part of what makes the work creative rather than generative. The creative director who pitches a campaign that could fail publicly, and who chose it anyway because they believed in it, is performing a different act than a system that generates a thousand campaign concepts and presents the statistically most likely to be approved.
Clients increasingly understand this distinction. A 2023 survey by the in-house agency community InSource found that 71% of corporate marketing directors said they valued having a creative lead who "had a genuine point of view and was willing to defend it" more than having faster or cheaper content production. The creative act they were purchasing was one that included human judgment and risk.
The most productive frame for creative professionals is not "AI vs. human creativity" but "what does human creative judgment add to AI generative capacity?" The answer is: direction, curation, intentional constraint, and conceptual transformation.
When the designer Stefan Sagmeister talks about what makes his studio's work distinctive, he describes the role of radical constraint: choosing to do things in ways that are deliberately harder, that break established pattern, that produce discomfort as a creative feature rather than a defect. This is exactly the kind of intentional departure from statistical pattern that AI tools, left to their own optimisation, will not produce. The human creative practitioner who can direct AI tools toward genuinely original ends, rather than letting the tools steer the work toward statistically safe outputs, is exercising a skill that the AI cannot supply from within itself.
The creative worker at risk from AI is the one whose value was always in execution speed rather than conceptual originality. The one whose position strengthens is the one who brings a genuine point of view — a willingness to break pattern intentionally, to make bets, and to be accountable for creative choices that could fail.
In this lab you will collaborate with the AI assistant on a real creative challenge from your own field. Your goal is to actively direct the AI toward genuinely surprising outputs — using intentional constraint, deliberate pattern-breaking, and your own conceptual judgment — rather than accepting statistically safe suggestions.
After at least three exchanges, reflect with the AI on which contributions came from you versus the tool, and what that tells you about where your irreplaceable value sits.
At 3:27 PM on January 15, 2009, US Airways Flight 1549 struck a flock of Canada geese at about 2,818 feet, shortly after takeoff from LaGuardia, and lost almost all thrust in both engines. Captain Chesley Sullenberger had approximately 208 seconds before impact. The aircraft's automated systems were functioning correctly throughout: they provided accurate data about the aircraft's state. What they could not provide was a decision about which of several imperfect options to choose under conditions of extreme time pressure, incomplete information, and unprecedented circumstances that no simulation had modelled exactly.
Sullenberger's decision to land on the Hudson River rather than attempt a return to LaGuardia or divert to Teterboro was later analysed in detail by the NTSB. Simulations run afterward showed that a return to LaGuardia could have succeeded, but only if the turn began immediately after the bird strike; once a 35-second delay was added to account for the time a crew needs to recognise and assess the emergency, the simulated return failed. He decided correctly with the information available at the time of decision. All 155 people aboard survived. The NTSB report noted that his decision integrated accumulated judgment from 40 years of flying experience in a way that could not be decomposed into retrievable, explicit rules.
AI systems, including the most sophisticated planning and decision-support tools, perform best in environments with well-defined state spaces, measurable outcomes, and sufficient historical data to train reliable models. These conditions are met in chess, Go, protein folding prediction, and many aspects of financial trading. They are met incompletely or not at all in the conditions that define leadership: novel situations, conflicting values, incomplete information, and outcomes that cannot be fully specified in advance.
The economist Frank Knight drew a distinction in 1921 that remains analytically important: the difference between risk (quantifiable probability distributions over known outcomes) and uncertainty (situations where the outcome space itself is unknown or where probabilities cannot be meaningfully assigned). AI tools are exceptional at managing risk. Genuine uncertainty, now usually called Knightian uncertainty, is the domain where human judgment remains structurally necessary.
The CEO making a strategic pivot into an unexplored market, the general deciding whether a ceasefire negotiation is genuine or tactical, the physician choosing between two treatments where the patient's case is sufficiently unusual that no robust clinical data applies directly — these are all situations of Knightian uncertainty. They require judgment that is not reducible to optimisation over a known probability space.
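Knight's distinction can be sketched in code. Under risk, every outcome has a known probability and an expected value can be computed; under uncertainty, the same calculation has nothing to operate on. The example below is a minimal illustration under stated assumptions: the function name `expected_value`, the use of `None` to mark an unknown probability, and the payoff figures are all hypothetical.

```python
import math
from typing import Optional, Sequence, Tuple

# (probability, payoff); probability is None when it cannot be assigned.
Outcome = Tuple[Optional[float], float]


def expected_value(outcomes: Sequence[Outcome]) -> Optional[float]:
    """Expected payoff under risk; None when the distribution is unknown."""
    probs = [p for p, _ in outcomes]
    if any(p is None for p in probs):
        # Knightian uncertainty: there is no distribution to optimise over.
        return None
    if not math.isclose(sum(probs), 1.0):
        # Probabilities do not cover the outcome space, so it is incomplete.
        return None
    return sum(p * payoff for p, payoff in outcomes)


# Invented scenarios, for illustration only.
risky_bet = [(0.5, 100.0), (0.5, -40.0)]      # known odds
new_market = [(None, 500.0), (None, -300.0)]  # odds cannot be assigned

print(expected_value(risky_bet))   # 30.0
print(expected_value(new_market))  # None
```

The `None` result is the point of the sketch: an optimiser has no number to return, so the choice falls back to a person's judgment.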
A 2022 McKinsey Global Survey found that 57% of C-suite executives reported that AI-generated analysis had improved their ability to identify risks — but only 12% reported delegating final strategic decisions to AI recommendations. The pattern is consistent: AI improves the information environment for human decisions; it does not displace the decision itself at the highest levels of consequence.
Leadership involves not only making decisions but absorbing the consequences of them in a way that maintains the legitimacy of the institution and the trust of the people affected. When a decision goes wrong, someone must stand accountable — not merely as a procedural requirement, but as a social and psychological necessity for the people who were harmed or disappointed. This is not a function that can be outsourced to a system that has no capacity for shame, repair, or genuine responsibility.
When Knight Capital Group's automated trading algorithm malfunctioned on August 1, 2012, and generated $440 million in losses in 45 minutes, destroying the firm, the accountability was assigned to humans: the engineers who had not properly managed the deployment, the managers who had not implemented adequate safeguards, the executives who had approved the system's architecture. The algorithm itself bore no accountability. This asymmetry — where humans are accountable for automated systems but automated systems cannot be accountable for themselves — means that the leadership function of absorbing and responding to failure is permanently human.
Effective leaders in AI-augmented environments are developing a specific new capability: knowing when to override a well-performing system based on contextual information the system cannot access. The FAA's report on airline automation dependency (2013) documented a pattern in which highly automated cockpits were producing pilots who were excellent at monitoring automated systems but degraded in their ability to exercise manual judgment when automation failed. The solution was not to remove automation — it was to structure deliberate practice of exactly the judgment the automation could not supply.
The practical work of building durable leadership capability in an AI-augmented environment involves three specific investments. First: cultivate comfort with Knightian uncertainty. This means seeking out decision experiences where the outcome space is genuinely unclear, rather than only making decisions within well-defined analytical frameworks. Second: practice explicit accountability narratives, meaning the capacity to explain, after the fact, why a decision was made, what information was available, and which values were weighed. Leaders who can narrate their decision-making process are demonstrating a kind of transparency that AI systems cannot authentically provide. Third: develop the skill of calibrated AI override: knowing when the AI's recommendation, while analytically defensible, misses something contextually critical, and being willing to diverge from it on record.
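One way to practice diverging "on record" is to capture each override in a structured note. The sketch below shows a possible shape for such a record in Python. It is an illustration, not a prescribed format: the class name `OverrideRecord`, its fields, and the example scenario are all hypothetical.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List


@dataclass
class OverrideRecord:
    """A hypothetical record of a decision that diverged from an AI recommendation."""
    decision_maker: str
    ai_recommendation: str
    decision_taken: str
    information_available: List[str]
    values_weighed: List[str]
    missing_context: str
    # Timestamp is set when the record is created, in UTC.
    recorded_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )

    def narrative(self) -> str:
        """Render the accountability narrative as plain prose."""
        return (
            f"{self.decision_maker} chose '{self.decision_taken}' over the AI "
            f"recommendation '{self.ai_recommendation}'. "
            f"Information available: {'; '.join(self.information_available)}. "
            f"Values weighed: {', '.join(self.values_weighed)}. "
            f"Context the system lacked: {self.missing_context}."
        )


# Invented scenario, for illustration only.
record = OverrideRecord(
    decision_maker="Regional director",
    ai_recommendation="cancel the pilot programme",
    decision_taken="extend the pilot by one quarter",
    information_available=["usage below forecast", "two key staff on leave"],
    values_weighed=["commitments made to partners", "cost control"],
    missing_context="the staffing gap was temporary and already resolved",
)
print(record.narrative())
```

Writing the record at decision time, rather than reconstructing it later, keeps the narrative tied to what was actually known when the call was made.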
AI makes the information environment for leadership decisions richer. It does not eliminate the need for someone to decide, to commit, and to be accountable for being wrong. That someone is always a human — and the quality of that human's judgment determines whether the richer information environment actually produces better outcomes or only better-documented bad ones.
You'll present the AI assistant with a real or plausible decision scenario from your professional domain. The AI will give you a recommendation. Your job is to identify what contextual information the AI couldn't access, articulate why the recommendation might be wrong despite being analytically reasonable, and practice the accountability narrative — explaining your override decision as a leader would.
After at least three exchanges, ask the AI to challenge your override reasoning. Can you defend it? This is calibrated override under pressure.