In early 2023, a mid-sized law firm called Levidow, Levidow & Oberman in New York had a problem that would end up in every newspaper in the country. One of their lawyers, Steven Schwartz, was working on a personal injury lawsuit — a case about a man who said he was injured on an Avianca Airlines flight. The research was piling up, and Schwartz used ChatGPT to help him find relevant court cases to cite in his legal brief.
ChatGPT gave him a list of cases. Varghese v. China Southern Airlines. Shaboon v. EgyptAir. They looked real. They had docket numbers. They had judge names and quoted passages of legal opinions. Schwartz submitted the brief to a federal court in May 2023.
The problem? Every single case was fabricated. ChatGPT had invented them — completely. The judges cited didn't exist. The rulings quoted never happened. When the opposing lawyer tried to find the cases, they found nothing. The court was furious. Schwartz was fined $5,000 and publicly sanctioned. The story ran in The New York Times, The Guardian, and on every legal blog in the country.
But here's the thing almost nobody asked: what was the input that caused this? Schwartz had typed something like "find cases relevant to passenger injury on international flights." That was a perfectly reasonable thing to ask a human researcher. But when you feed that input to an AI that is designed to produce fluent, confident-sounding text — not to check legal databases — the output is almost guaranteed to sound real and be wrong.
Let's start with the most basic idea in this entire module, because everything else depends on it: every AI workflow is just a pipeline. Something goes in. Something comes out. The thing that goes in is called the input. The thing that comes out is called the output.
If that sounds obvious, good. It is obvious — until you start looking at real workflows and realizing how many people never think carefully about what they're putting in. Steven Schwartz's mistake wasn't that he used AI. It was that he fed his AI a request designed for a human expert — and then trusted the output without understanding what the AI was actually doing with that input.
An AI doesn't "know" what you mean. It processes what you give it. The input is the only way the AI has of understanding your goal. Change the input, and you change the output entirely — even if the AI model is identical.
When most people think about giving an AI an input, they picture typing something. But modern AI workflows accept many different kinds of inputs. Understanding which kind you're dealing with changes how you design everything downstream.
Text inputs are the most common — a message, a prompt, a document. Structured data inputs are things like spreadsheet rows, form submissions, or database records where the information is already organized into categories. File inputs include PDFs, images, audio recordings, or videos. Event inputs are signals — things like "an email just arrived" or "someone filled out this form" — that trigger the workflow to start at all.
Here's why this matters practically: a no-code tool like Make.com (formerly Integromat) treats each of these differently. If you're trying to build a workflow that reads customer support emails and sorts them by urgency, your input is an event (new email arrives) plus text (the email body). The AI needs both. If you only wire up the event trigger but forget to pass the email text, the AI gets a signal that says "something happened" — but no information about what. The output will be garbage, or nothing at all.
Think of it like this: imagine you're a chef and someone orders food — but they only ring the bell to say "order up" without leaving a ticket. You know a customer wants something, but you have no idea what. That's exactly what happens when an AI gets an event trigger with no data attached. The bell rang, but there's no ticket.
For older readers thinking about institutional design: this is why enterprise AI systems spend enormous engineering effort on what they call data ingestion pipelines — the plumbing that gets the right inputs to the right AI at the right time. The AI model itself is often the smallest engineering problem. Getting clean, complete, correctly-formatted inputs to it is where most of the real work lives.
There's a phrase that's been in computing since the 1960s: garbage in, garbage out. It means: if you feed a system bad input, you get bad output, and the system is not to blame. But in AI, this phrase has a twist that people often miss.
In traditional software, "garbage in, garbage out" is obvious — if you put the wrong number into a calculator, it gives you the wrong answer. But with AI — especially large language models like the one Steven Schwartz used — the garbage output doesn't look like garbage. It looks brilliant. The AI produces fluent, confident, well-formatted text that reads as if an expert wrote it. The cases it invented had proper legal citation format. They sounded authoritative. That's the danger: the output quality is decoupled from whether the input was appropriate for the task.
This is the core principle you're going to carry through this whole module: the quality of the output is not a reliable signal of the quality of the input. You cannot look at a polished AI output and work backward to assume the input was well-designed. You have to think about the input first, always, before you look at what came out.
When you read a news story about an AI "going wrong" — making a bad medical diagnosis, flagging the wrong person, generating fake news — the first question most people ask is "what's wrong with the AI?" You now know to ask a different question first: what was the input? That question puts you ahead of most adults who cover AI in the press.
Steven Schwartz was fined and sanctioned. But here's a question the court didn't fully resolve, and that legal scholars are still arguing about: whose responsibility is the input?
Schwartz claimed he didn't know ChatGPT could fabricate cases. OpenAI, who makes ChatGPT, had published warnings that the tool could "hallucinate" — make things up. But those warnings were buried in terms-of-service documents that almost no one reads. Law schools weren't teaching students about AI limitations. The software was designed to look confident regardless of whether it was correct.
So: was Schwartz negligent for not knowing more? Was OpenAI responsible for making a tool that sounds authoritative even when it's wrong? Was the law firm responsible for not having policies? Was the court system responsible for not updating its rules about AI-assisted research faster?
There's no clean answer here. Different people — lawyers, ethicists, engineers — reach different conclusions. But what's certain is that the input design question is inseparable from the responsibility question. Who decides what goes into an AI system, and what standards they're held to, is one of the defining legal and ethical problems of the next decade. You now have enough context to have a real opinion about it.
In 2021, a team of engineers at Spotify published a technical paper describing how their music recommendation system, called BaRT (Bandits for Recommendations as Treatments), actually worked. Most users thought Spotify just "knew" what they liked — that there was one big brain deciding their weekly playlist. The reality was stranger and more interesting.
BaRT was a chain. One AI model processed raw listening data — what songs you played, how long you listened, when you skipped. Its output: a numerical representation of your taste profile, called an embedding. That embedding became the input to a second model, which compared your profile to thousands of other users' profiles and identified clusters of people who listened similarly. That cluster assignment became the input to a third model, which evaluated candidate songs not yet in your history. And so on — through multiple layers, each output feeding the next input, until the system produced a ranked list of 30 songs for your Discover Weekly playlist every Monday morning.
In 2022, Spotify engineers noticed something odd: certain genres were being systematically under-recommended to users who had shown interest in them. They traced the problem backwards through the chain. It wasn't the final ranking model. It wasn't even the clustering model. The problem was in the very first step — the way raw listening data was being normalized (adjusted to a standard scale) before being turned into embeddings. A small mathematical bias in that first output was being amplified at every subsequent step until, by the end of the chain, it had grown large enough to meaningfully change what music millions of people heard.
What Spotify built is called a pipeline — a series of connected steps where the output of each step becomes the input to the next one. In no-code AI tools, you build pipelines all the time, often without thinking of them that way. When you set up a Make.com scenario that: (1) monitors a Gmail inbox, (2) sends the email text to an AI to classify its sentiment, (3) takes that sentiment score and uses it to route the email to different folders — you've built a three-step pipeline.
The key insight from Spotify's experience: errors don't stay where they start. They propagate. A small problem in Step 1 becomes a medium problem in Step 2 becomes a large problem in Step 3. This is called error amplification, and it's the reason engineers who build multi-step AI workflows spend so much time checking intermediate outputs — the outputs in the middle of the chain that most users never see.
In a no-code tool like Zapier or Make, intermediate outputs are the data that passes between your "modules" or "steps." You can usually inspect them in a test run. Most beginners skip this inspection. After this lesson, you won't.
Here's something that surprises almost everyone building their first multi-step workflow: two outputs can contain the same information and still break your pipeline — because of how that information is formatted.
Imagine Step 1 of your workflow produces an AI output that says: "The sentiment of this email is: Positive (confidence: 87%)". Now Step 2 needs to route the email based on the sentiment. But Step 2 is expecting a simple label — just the word "Positive" or "Negative." Instead it gets a full sentence. If you haven't set up Step 2 to parse (extract the right part from) that sentence, it doesn't know what to do. The workflow breaks — not because the AI was wrong, but because the format of the output didn't match what the next step needed as input.
This is why professional workflow builders obsess over what are called output schemas — a specification of exactly what format an output will take. In no-code tools, you often handle this by instructing the AI to respond in JSON format (a structured data format), or by adding a "text parser" step between two AI steps to reformat the output before it becomes the next input.
Think of it like passing a note in class. If you write "Meet me at the gym" and fold it into a paper airplane, but your friend is expecting a flat note they can read quickly — the information is right, but the format is wrong. Your message doesn't get delivered the way you intended. Reformatting the output is like unfolding the airplane before passing it on.
At an institutional level — in companies building real production AI systems — this formatting problem is so common it has a standard solution: contract testing. Each step in a pipeline defines a "contract" specifying what format its output will take, and automated tests check that every output matches the contract. For no-code builders, the equivalent is: always run a test with real data and inspect the output before connecting the next step.
Pipelines aren't always linear. Sometimes one output splits and becomes the input to multiple different steps at once. This is called a fan-out. In Make.com, this looks like a single module with multiple routes connected to its output. In Zapier, it looks like a single trigger feeding multiple parallel action paths.
Fan-outs are powerful but carry a specific risk: if the shared output is wrong, every branch it feeds is wrong simultaneously. In Spotify's case, that first normalization step's biased output fed into every subsequent step — it was effectively a fan-out point. One error at the start propagated through the whole system at once.
When you design a workflow, identifying your fan-out points — the moments where one output splits into many — tells you exactly where you need the most rigorous output checking. A problem at a fan-out point is always worse than a problem at a regular step.
When AI tools produce unexpected results, most people's instinct is to look at the final output and ask "why did the AI say that?" You now know to trace the pipeline backward. The answer is almost always somewhere in an intermediate output — usually in the formatting or in a fan-out point earlier in the chain. That's how Spotify's engineers found their bug, and it's the same skill you'll use every time you debug a no-code workflow.
Spotify's genre bias problem raises an uncomfortable question about accountability in multi-step AI systems. The engineers who wrote the normalization code — Step 1 — didn't intend to bias recommendations. The engineers who built the clustering model — Step 2 — had no idea their input was slightly off. By the time the bias showed up in what users actually heard, it had passed through so many hands that no single person felt responsible for it.
This is a real and growing problem in AI ethics: responsibility diffusion. In a long pipeline, each step can have a different team, a different vendor, or even a different company behind it. When the final output is harmful or unfair, who is accountable? The team that wrote Step 1? The team that approved the final output? The company that deployed the system to millions of users?
There's no consensus answer. Some legal scholars argue for strict liability on the final deployer — whoever puts the system in front of users owns all the outputs. Others argue for transparency requirements — every step in a pipeline should be documented so that the source of an error can be traced. What's certain is that understanding how outputs-become-inputs is not just a technical skill. It's the skill that makes accountability conversations possible at all.
At 9:30 AM on August 1, 2012, the New York Stock Exchange opened for trading. Within the first 45 minutes, Knight Capital Group — one of the largest market-making firms in the United States, handling about 10% of all US equity trading volume — began executing orders at a pace that made no sense. Their automated trading system was buying and selling the same stocks in rapid cycles, moving markets, accumulating enormous positions no one had authorized.
By 10:15 AM, Knight Capital had lost $440 million dollars. The company was essentially destroyed by lunchtime. What happened? Engineers had deployed new software the night before, and during the deployment, an old piece of code — called the "Power Peg" algorithm — was accidentally left active on eight of their servers. When the market opened, the opening bell was the trigger. The new code was supposed to receive that trigger. But the old Power Peg code received it too, simultaneously, from eight different servers — and it had been sitting dormant for years, waiting for exactly that signal.
The trigger — market open — was identical whether the right system received it or the wrong one. The algorithm didn't know it wasn't supposed to be listening. It just knew: the trigger fired, execute. And so it did, 4 million times in 45 minutes.
In any AI workflow — from a billion-dollar trading system to a simple Zapier automation you build in an afternoon — nothing happens until a trigger fires. A trigger is the event that starts the workflow. It's the first input, in a sense, except it's not data about a task — it's a signal that says "it's time to run."
In no-code tools, triggers come in several forms. Schedule triggers fire at a set time — "run every day at 8 AM." Event triggers fire when something specific happens — "a new row was added to this Google Sheet" or "a new email arrived in this inbox." Webhook triggers fire when an external service sends a signal — "a customer just completed a purchase on this website." Manual triggers fire when you click a button.
The trigger controls when and how often your AI runs. A workflow that generates a daily summary report should be triggered by a schedule. A workflow that processes customer support tickets should be triggered by events. Using the wrong trigger type can mean your AI runs at the wrong time, runs too often, runs with incomplete data, or — as Knight Capital demonstrated — runs when it was never supposed to run at all.
Knight Capital's disaster was actually a trigger scope problem. The trigger — market open — was broadcast to every system that was listening. The problem was that an unintended system was listening. In no-code workflows, the equivalent situation arises constantly: you set up a trigger on a shared resource and forget that multiple workflows are monitoring the same thing.
Imagine you build a workflow in Make.com that monitors a shared company Google Sheet for new rows. Every time a new row is added, your AI processes it and sends an email. Then a colleague sets up their own automation on the same sheet. Now every time a new row is added, both workflows trigger. Your AI sends two emails. The colleague's automation runs when it shouldn't. Nobody planned for this.
The professional fix is trigger scoping: making sure a trigger is as specific as possible. Instead of "any new row," use "a new row where the Status column says 'Ready to Process.'" Instead of "any new email," use "a new email with a specific subject line prefix or a specific sender." The more precisely you define your trigger, the less likely it is to fire at the wrong time — or be hijacked by something else listening on the same channel.
Picture a classroom where the teacher says "everyone who is done, raise your hand." Three students who aren't done raise their hands by accident because they thought the teacher was talking about a different question. If the teacher had said "everyone who is done with Question 5, raise your hand" — a more scoped trigger — only the right students would respond. Specificity in triggers works exactly the same way.
There's another dimension to triggers that beginners almost always overlook: how often the trigger fires. This matters for two reasons — cost and quality.
On cost: most no-code AI tools charge per operation or per AI call. If your trigger fires 500 times a day because you set it to "check for new emails every minute," but most of those triggers find nothing new, you're burning through your plan's operation limit on empty runs. Choosing a trigger that fires only when there's actually something to process — like a webhook that fires only when a new email arrives, rather than a schedule that polls for new emails every minute — is dramatically more efficient.
On quality: if your trigger fires faster than your AI can process the previous batch, you create a backlog — a queue of inputs waiting to be processed. For most hobby workflows, this doesn't matter. But if you're building something real — a customer service bot, a content moderation system, a data pipeline — backlog means latency, and latency means people are waiting. Understanding trigger frequency is how you size your workflow for the real load it'll face.
Knight Capital's $440 million loss happened in 45 minutes. That's roughly $163,000 per second. And it happened because no one had a clear inventory of which systems were listening to which triggers. Today, large financial institutions are required by the SEC to maintain what's called a "systems and technology inventory" — a documented list of every automated system and what triggers it responds to. Understanding triggers isn't just a no-code skill. It's the foundation of regulatory compliance for any institution that uses automated decision-making.
Here's the ethical question this lesson leaves open: the Power Peg algorithm at Knight Capital was designed years earlier to serve a legitimate purpose. It wasn't malicious. The engineers who deployed the new code didn't know the old code was still listening. When a system causes harm through a trigger no one intended to fire — harm that happened through a series of reasonable-looking individual decisions — who bears responsibility for the outcome?
Knight Capital's shareholders, who had nothing to do with the deployment decision, lost most of the value of their investment in 45 minutes. The engineers who made the deployment mistake didn't go to prison. The firm eventually survived through an emergency bailout from investors. But the question of who should pay when automation runs where it shouldn't — and how to prevent it — remains one of the most contested questions in AI governance today.
On September 23, 1999, NASA's Mars Climate Orbiter — a spacecraft the size of a small car, built over four years and launched 9 months earlier — arrived at Mars and began its orbital insertion burn. It disappeared. Communications went silent. The spacecraft was never heard from again.
The investigation that followed found one of the most embarrassing errors in space exploration history. Lockheed Martin, the contractor that built the spacecraft's navigation software, had programmed it to output thruster force in imperial units — pound-force. NASA's navigation team at the Jet Propulsion Laboratory had written their software to receive thruster force data in metric units — newtons. For nine months, the spacecraft had been receiving navigation corrections that were wrong by a factor of 4.45 — because one pound-force equals 4.45 newtons.
Every thruster correction over nine months was off by that factor. By the time it reached Mars, the Orbiter was on the wrong trajectory. It entered the Martian atmosphere at the wrong angle and was destroyed. The output of one system — in pound-force — was treated as the input to another system — expecting newtons. Nobody caught it because both systems were technically "working correctly." The connection between them was broken.
In no-code AI workflows, you spend surprisingly little time thinking about the individual AI tools. The hard part — where most workflows fail — is the connection between tools. When Tool A produces an output and Tool B needs it as input, three things have to match: the format of the data, the units or scale of any values, and the field names used to label that data.
NASA's problem was units. In a Make.com workflow, your equivalent problems might look like this: your AI produces a date in format "MM/DD/YYYY" but the Google Sheets step expects "YYYY-MM-DD." Your sentiment analysis AI returns a score between -1 and 1, but your routing step expects a label like "Positive" or "Negative." Your email parser pulls the sender's name as "First Last" but your CRM needs two separate fields: "First" and "Last."
In every case, the individual tools are working fine. The data is correct. The connection is broken. This is why understanding how to map fields — explicitly defining which output field connects to which input field — is the most practical skill in no-code workflow building.
When you connect a tool in Make.com, it's using that tool's API under the hood. You don't have to write code to use an API — no-code tools handle that part. But you do need to understand the concept of API documentation, because it tells you exactly what inputs a tool accepts and exactly what fields will appear in its output.
Think of API documentation as a menu. It lists every "dish" the service offers (every action you can call), what "ingredients" it requires (required input fields), what optional additions you can specify (optional input fields), and exactly what will come back on your plate (output fields and their formats). The OpenAI API, for example, tells you: send a "messages" array with "role" and "content" fields; receive back a "choices" array containing a "message" object with a "content" field. Every time you use OpenAI through Make.com, that's what's happening — you just don't see the JSON directly.
The practical skill here is: before you connect two tools, look at what the first tool outputs and what the second tool needs as input. If they don't match, add a data transformation step in between. Don't assume two tools "just work together" because they're both popular. Assume they don't match until you've verified they do.
Imagine you're passing a message between two friends who speak different languages. The first friend writes in Spanish. The second friend only reads French. Even if the message is perfectly written, it won't work without a translator in the middle. A data transformation step is that translator. You need it whenever two tools "speak" different data formats.
NASA's engineers ran checks on the individual systems — the navigation software at Lockheed Martin worked correctly in pounds. The navigation software at JPL worked correctly in newtons. But nobody ran an end-to-end test — a test that sent real data from one system all the way through to the other and checked that the result made physical sense.
In Make.com and Zapier, both tools provide a "test" or "run once" feature for every workflow you build. The strong advice in this lesson is simple: always test with real, representative data — not the fake test data the tools sometimes pre-fill for you. Fake test data often has unrealistically clean formats. Real data has messy names, unexpected characters, blank fields, and values in unexpected ranges. Testing with real data reveals connection problems before they reach users or customers.
A specific habit to build: after running a test, expand every intermediate output in the test log and ask yourself: "Does this output look exactly like what the next step is expecting?" If the answer is "I'm not sure," find out before you turn the workflow on. The five minutes you spend checking intermediate outputs can save you the equivalent of a $327 million spacecraft.
When someone tells you they "built an AI workflow," the first question you can now ask is: "How did you handle the connections between tools?" Most people who build no-code workflows think about the AI step in the middle — the fun, interesting part. They give much less thought to how data flows into it and out of it. You now know that the connections are where the real engineering judgment lives. That knowledge puts you in the top tier of no-code builders before you've built your first workflow.
Here's the ethical dimension to sit with: NASA's unit mismatch was nobody's deliberate fault. Lockheed Martin followed a specification. JPL followed a specification. The specifications weren't coordinated with each other. $327 million of public money was lost. Three hundred and twenty-seven million dollars that could have funded schools, medical research, or climate science.
In AI systems today, there are countless integration points between tools built by different companies, running on different data standards, maintained by different teams. When a healthcare AI misses a diagnosis because a data field from one hospital's electronic records system was interpreted incorrectly by another system's AI, who is responsible? The hospital that generated the data? The AI company that built the model? The no-code developer who connected them? The regulatory body that approved the integration?
The deeper your understanding of how inputs and outputs connect across tool boundaries, the more clearly you can see where these accountability gaps live — and the better positioned you are to argue for specific solutions, rather than just pointing at "AI" as an abstract problem.
A startup is about to launch an AI-powered hiring tool. The tool reads job applications and outputs a score from 1–10 that determines whether each applicant gets a human review. You've been asked to audit the input design before launch day.
Your AI partner below is playing the role of the workflow's lead engineer. They're confident the system is ready. Your job is to challenge them — ask hard questions about what's going into the AI, whether the input is appropriate for this task, and what could go wrong.
You've inherited a three-step no-code workflow: (1) an AI reads customer emails and extracts the topic and urgency, (2) a routing step sends high-urgency emails to the priority queue, (3) an AI drafts a response. The final responses are all wrong — they're responding to the wrong topic, as if the AI didn't understand the email at all.
Your AI partner is playing the role of the previous developer who built the workflow. They're going to show you what each step's output looks like. Your job is to figure out at which step the data breaks down — and why.
You're designing a no-code AI workflow for a small online store. The workflow should: (1) detect when a customer leaves a review on the store's Google Business profile, (2) classify the review as positive or negative, (3) if negative, automatically alert the store owner and draft a response. You need to design the trigger strategy before building anything.
Your AI partner is a skeptical fellow architect who will push back on your trigger choices. They want to know: what type of trigger, how often it fires, how specifically scoped it is, and what happens if the trigger fires at the wrong moment.
A hospital is connecting its patient intake forms (which output data in one format) to an AI triage tool (which expects data in a different format) to an electronic health records system (which has its own schema). You've been brought in as an independent auditor to review the connection design before the system goes live with real patients.
Your AI partner is the integration developer who built the connections. They believe everything is fine. Your job is to probe the specific data fields, units, and formats at each connection point — and identify at least two potential failure modes before the system goes live.