How training data encodes values and historical bias — and why data collection is an ethical choice, not a technical one
In 2016, researchers at MIT trained an image recognition system on a publicly available dataset of labeled images. The system learned to classify objects accurately. But it also learned something else: the dataset had been labeled overwhelmingly by people in wealthy countries, using cultural assumptions common to those places. When deployed in different cultural contexts, the system made systematic errors — misidentifying objects and scenes that were normal and important in other places. The bias was not in the algorithm. It was in the data collection decisions made years before training began.
The dataset had been collected without thinking about ethics. The ethics had arrived anyway — encoded into every prediction the system made.
AI systems learn from training data. That training data is not neutral. It is a collection of decisions: what to measure, who to include, what examples to label, how to label them. Each of these decisions reflects values. Who decided what examples the data should include? Whose perspectives are represented in the labels? Whose are missing?
Historical bias in corpora: Most large-scale training datasets reflect historical data. If you train on historical hiring records, the dataset includes the biases of past hiring decisions. If you train on historical criminal justice data, the data encodes the biases of past enforcement and conviction patterns. The model learns not what is fair or accurate, but what happened — and what happened was often biased.
Labeling bias: How are data points labeled (classified, categorized, scored)? The people doing the labeling bring their own biases. If you ask 100 people to label images as "professional" or "unprofessional," they will disagree. Once those judgments are collapsed into a single label per image, the dataset presents a contested opinion as ground truth. The model will learn the annotators' labeling distribution, not an objective fact.
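To make the labeling point concrete, here is a minimal sketch of how a majority vote can hide disagreement. The image IDs and vote counts are invented for illustration, and majority_label is a hypothetical helper, not part of any labeling tool.

```python
from collections import Counter

def majority_label(votes):
    """Return the most common label and the share of annotators who chose it."""
    counts = Counter(votes)
    label, n = counts.most_common(1)[0]
    return label, n / len(votes)

# Hypothetical votes from 10 annotators per image (illustrative, not real data).
annotations = {
    "image_a": ["professional"] * 9 + ["unprofessional"] * 1,
    "image_b": ["professional"] * 6 + ["unprofessional"] * 4,
}

for image_id, votes in annotations.items():
    label, agreement = majority_label(votes)
    # Both images receive the same final label, although agreement differs a lot.
    print(f"{image_id}: label={label}, agreement={agreement:.0%}")
```

If only the final label is stored, a model cannot tell the near-unanimous image from the contested one. Keeping the agreement rate alongside each label is one way to preserve that information.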
Collection scope and consent: Whose data is in the dataset? If a dataset comes from a specific platform or geographic region, it is not representative of "humanity" — it is representative of that subset. This is often invisible. A dataset of social media posts is a dataset of people on social media, not people in general. A dataset of hiring records is a dataset of hiring decisions made under specific conditions, not of the broader population.
Large-scale training datasets often depend on data collection at massive scale: scraping billions of images from the web, purchasing datasets from brokers, aggregating public records. At that scale, individuals become anonymous records, explicit consent is rarely sought, and the people collecting the data are far removed from the people whose data is included. Questions about whose data you have, how it was collected, and what consent was given become easy to ignore in practice, but they do not become any less important ethically.
Before training begins, ethical AI development requires understanding what the training data actually represents. What population does it come from? What time period? What measurement biases exist in how the data was collected? What perspectives are included, and what are absent? This understanding is foundational to ethical AI. Without it, you cannot know what biases you are baking into your system.
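One of these questions, what population the data comes from, can be checked with simple arithmetic. The sketch below compares each group's share of a dataset with its share of the population the system will serve. The group names, counts, and reference shares are assumptions made up for the example, and representation_gaps is a hypothetical helper.

```python
def representation_gaps(dataset_counts, reference_shares):
    """Return dataset share minus reference share for each reference group.

    A negative value means the group is under-represented in the dataset.
    """
    total = sum(dataset_counts.values())
    gaps = {}
    for group, ref_share in reference_shares.items():
        data_share = dataset_counts.get(group, 0) / total
        gaps[group] = data_share - ref_share
    return gaps

# Hypothetical numbers: records per region in the dataset, and each region's
# share of the population the deployed system is meant to serve.
dataset_counts = {"region_a": 700, "region_b": 250, "region_c": 50}
reference_shares = {"region_a": 0.40, "region_b": 0.30, "region_c": 0.30}

for group, gap in representation_gaps(dataset_counts, reference_shares).items():
    print(f"{group}: {gap:+.0%} relative to the reference population")
```

A check like this only covers groups you thought to name and can count. It does not capture time period, measurement bias, or missing perspectives, which still need to be examined separately.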
Training data is a snapshot of the past — how things were measured, categorized, and decided at a particular moment. AI systems trained on that data will perpetuate those historical decisions unless explicitly designed not to. The ethical question is not whether your model will learn from the data — it will. The question is whether you are willing to examine what values that data encodes, and whether you accept those values in your deployed system.
Investigate how values are encoded in a training dataset before AI learning begins
You are examining a real training dataset that will be used to build an AI system. You cannot change the data — it will be used as-is. Your job is to identify the ethical issues embedded in the data collection and labeling decisions.
Choose a real dataset (ImageNet, the COMPAS recidivism data, hiring records, medical records, or another). Identify: (1) What population does this data represent, and who is excluded? (2) What historical biases are likely encoded in this data? (3) What labeling choices were made, and what values do they reflect? (4) How would a model trained on this data perpetuate or amplify the issues you identified?
How training approaches, fine-tuning, and architectural decisions embed values into AI systems
Researchers at a major AI lab spent months training a large language model on diverse internet text. Then came fine-tuning — the process of training the model to be more helpful, more harmless, and more honest. This fine-tuning involved human annotators selecting examples of good and bad model behavior. But how good and bad were defined reflected the annotators' values. A response that seemed respectful to one person seemed dismissive to another. A response that seemed honest to one person seemed recklessly blunt to another. By the time fine-tuning was done, the model had learned very specific values — the values of the people who did the annotation work.
The model had been trained to be helpful, harmless, and honest. It had also been trained to embody the specific ethical choices of a few hundred people.
How a model is trained determines what it learns. Different training approaches make different ethical trade-offs.
Reinforcement Learning from Human Feedback (RLHF): This approach trains a model to maximize human ratings of its outputs. But human raters bring their own biases and values. The model learns to optimize for what those specific raters find good. If raters come from a narrow demographic or cultural group, the model will learn their preferences as if they were universal values.
Adversarial training: Some systems are trained to resist attacks or harmful uses. But what counts as harmful? A system trained by one group may refuse requests that another group would consider legitimate. The training choices encode disagreements about what the system should do.
Few-shot learning: Systems trained on a small number of examples of desired behavior will learn the patterns in those examples — including their biases. If your few examples all show men as leaders and women as support staff, the model will learn that pattern.
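The point about raters can be illustrated with a toy example. This is not an RLHF implementation. It only shows the step where human ratings are aggregated into a preference, and how the result depends on who the raters are. The response styles, rater pools, and scores are all invented for illustration.

```python
def preferred_response(ratings):
    """Return the response style with the highest mean rating, plus all means."""
    means = {resp: sum(scores) / len(scores) for resp, scores in ratings.items()}
    return max(means, key=means.get), means

def merge(*pools):
    """Combine ratings from several rater pools without modifying the inputs."""
    merged = {}
    for pool in pools:
        for resp, scores in pool.items():
            merged.setdefault(resp, []).extend(scores)
    return merged

# Hypothetical 1-5 ratings of two response styles from two rater pools.
pool_a = {"blunt": [5, 4, 5], "cushioned": [3, 3, 2]}
pool_b = {"blunt": [2, 1, 2], "cushioned": [4, 5, 4]}

print(preferred_response(pool_a)[0])                 # preference using pool A only
print(preferred_response(merge(pool_a, pool_b))[0])  # preference using both pools
```

With pool A alone, the blunt style wins. With both pools combined, the cushioned style wins. A model optimized against the first set of ratings would treat one group's taste as the definition of a good answer.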
The fine-tuning and labeling work behind AI systems is often done by contract workers, sometimes called "ghost workers," many of them in low-wage countries and paid below-market rates for cognitively demanding work. These workers are doing value alignment: deciding what good AI behavior looks like. But they are rarely told that their values are being encoded into the system, and their pay rarely reflects the long-term impact of the systems they shape. The ethical issue is not just what values get encoded. It is also that the people encoding them often do not know that is what they are doing.
Even technical architecture reflects values. A system designed to maximize engagement will learn to prioritize attention-grabbing over accuracy. A system designed to maximize fairness will make different trade-offs than a system designed to maximize accuracy. A system designed to be transparent (explaining its decisions) will work differently than a black-box system. None of these choices is technically neutral. Each is an ethical choice about what the system should optimize for.
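A small sketch shows how the choice of objective changes what a system surfaces. The items, their scores, and the weight_engagement parameter are hypothetical, and the scoring rule is a deliberately simple weighted sum rather than any real ranking system.

```python
# Hypothetical content items with made-up scores between 0 and 1.
items = [
    {"title": "Shocking claim", "predicted_clicks": 0.9, "accuracy_score": 0.3},
    {"title": "Careful explainer", "predicted_clicks": 0.4, "accuracy_score": 0.95},
    {"title": "Balanced report", "predicted_clicks": 0.6, "accuracy_score": 0.8},
]

def rank(items, weight_engagement):
    """Order titles by a weighted mix of engagement and accuracy.

    weight_engagement=1.0 optimizes purely for clicks, 0.0 purely for accuracy.
    """
    def score(item):
        return (weight_engagement * item["predicted_clicks"]
                + (1 - weight_engagement) * item["accuracy_score"])
    return [item["title"] for item in sorted(items, key=score, reverse=True)]

for weight in (1.0, 0.5, 0.0):
    print(weight, rank(items, weight))
```

The same items, scored by the same code, produce a different top result at each weight. Whoever sets that weight is deciding what the system values, whether or not the decision is described that way.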
The technical decisions about how to train, fine-tune, and architect an AI system are decisions about what values that system will embody. These are ethical choices. Organizations that treat design as purely technical — letting engineers choose training approaches without thinking about what values those approaches encode — have made an ethical choice: to accept whatever values emerge from the technical choices, without deliberate examination.
Evaluate the values embedded in technical decisions about how an AI system was built
You are analyzing the design decisions made in building a real AI system. Your task is to understand what values those decisions embed. Choose a real system (ChatGPT, DALL-E, a recommendation algorithm, a decision-support tool, or another).
For that system: (1) Identify one training approach or architectural choice that was made (e.g., RLHF fine-tuning, optimization for engagement, content filtering, accuracy maximization). (2) What values does that choice reflect? (3) What trade-offs does it make (is there something it is good at, and something it is worse at)? (4) Who benefited from that choice, and who might have been disadvantaged?
How organizational structures and safety processes try to address ethical concerns — and what their limitations are
A major AI company created a safety team tasked with identifying risks and harms in systems before deployment. The team found serious issues: performance disparities across demographic groups, concerning patterns in system outputs, risks of misuse. The team recommended stronger testing, changes to training data, and delayed deployment. But deployment was scheduled. Revenue forecasts had been made. The stakeholders who wanted deployment had more authority than the safety team. The system deployed anyway, with substitute mitigations rather than the changes the safety team had proposed. The harms the team had predicted occurred.
The safety process had worked. The organizational power dynamics had overridden it.
Red teaming, the practice of adversarially testing an AI system to find harms and risks, is a standard safety practice. Red teams look for biased outputs, toxic outputs, security vulnerabilities, and ways the system could be misused. When red teams work well, they catch serious issues before deployment. But red teaming has limitations.
Scope limitations: A red team can only test what it thinks to test. Novel harms that no one predicted will not be caught.
Measurement limitations: Some harms are hard to test for. Subtle discrimination, long-term effects, and emergent behaviors when systems interact with each other are difficult to identify in controlled testing.
Power imbalance: Red teams typically report to leadership. When the red team finds issues that conflict with deployment timelines or revenue goals, their recommendations can be overridden.
Some organizations create ethics boards or committees to oversee AI development. These boards can provide expertise and oversight. But ethics boards also have structural limitations: they often lack enforcement power, they may not have access to sufficient technical information to make informed recommendations, and their decisions can be overridden by business units. An ethics board that can be ignored when its recommendations cost money is not governance. It is governance theater.
Rather than deploying a system to all users at once, staged deployment releases the system to a small group first, monitors for problems, and expands gradually. Staged deployment allows organizations to catch harms before they affect many people. But staged deployment is only effective if: (1) The system is actually monitored during staging (many are not), (2) Problems are acted upon (sometimes staged problems are ignored to stay on deployment schedule), (3) The staged population is representative (deploying to a narrow group first may miss issues that only manifest with different populations).
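The three conditions above can be written as an explicit gate that must pass before a rollout expands. This is a minimal sketch under stated assumptions: StageReport, may_expand, the group names, and the 5% error threshold are all hypothetical and would need to be defined by the organization.

```python
from dataclasses import dataclass

@dataclass
class StageReport:
    monitored: bool            # was the stage actually monitored?
    error_rate_by_group: dict  # group name -> observed error rate
    required_groups: set       # groups the staged population must include

def may_expand(report, max_error_rate=0.05):
    """Return (ok, reasons). Expansion is allowed only when no reason is found."""
    reasons = []
    if not report.monitored:
        reasons.append("stage was not monitored")
    missing = report.required_groups - set(report.error_rate_by_group)
    if missing:
        reasons.append(f"no data for groups: {sorted(missing)}")
    failing = sorted(g for g, rate in report.error_rate_by_group.items()
                     if rate > max_error_rate)
    if failing:
        reasons.append(f"error rate above threshold for: {failing}")
    return (not reasons), reasons

# Hypothetical stage: monitored, but one required group was never observed.
report = StageReport(
    monitored=True,
    error_rate_by_group={"group_a": 0.02},
    required_groups={"group_a", "group_b"},
)
ok, reasons = may_expand(report)
print(ok, reasons)
```

A gate like this only matters if a failed check actually stops the expansion. If the result can be overridden to keep a schedule, the code documents the problem without preventing it.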
The strongest safety processes have structural features:
Independence: Safety teams report to executives independent of product teams, not through the product chain of command.
Authority: Safety recommendations can delay or block deployment, not just be noted.
Resource protection: Safety work is funded and staffed regardless of revenue impact.
Transparency: Safety findings are reported to independent oversight (boards, regulators), not just internal leadership.
An organization can have extensive safety processes — red teams, ethics boards, staged deployment, monitoring — and still deploy harmful systems. Safety processes work only if they have real authority to prevent deployment. Without authority, safety is a process that makes deployment look thoughtful while the core decision (when to deploy) remains driven by commercial considerations. Ethical AI requires that safety processes can actually delay or block deployment when serious harms are found.
Assess whether an organization's safety structures actually prevent harm or create appearances of safety
An AI development organization has implemented safety processes. Your job is to evaluate whether those processes are adequate — do they have the structure to actually prevent harmful deployments? Choose a real organization and research its safety practices (companies publish information about red teams, ethics boards, deployment procedures).
For that organization: (1) Describe the safety processes they have implemented. (2) Analyze their structure: do safety teams have independent reporting? Do they have authority to block deployment? Are they staffed and funded regardless of product timelines? (3) Identify gaps: what risks are not being tested for? What populations might not be represented in staged testing? (4) Assess whether the processes create actual governance or governance theater.
How non-technical stakeholders design AI systems — through requirements, commissioning, customization, and the questions they ask
A government agency needed an AI system for benefit eligibility. The procurement team specified the requirements: "Determine eligibility with 95% accuracy." Technical teams built a system that met that spec. But the specification had not addressed fairness. The system was 95% accurate overall but achieved that through systematically biased decisions — higher accuracy for some groups, lower for others. By the time the system was in use, it was making decisions that affected millions of people. The procurement decision had designed the system — through what it asked for and what it did not ask for.
Non-technical stakeholders had designed an AI system without understanding they were doing so.
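The arithmetic behind a case like this is worth seeing once. The figures below are invented for illustration and do not come from the agency described above. They show how a system can meet a 95% overall accuracy target while performing much worse for a smaller group.

```python
# Hypothetical decision counts for two groups (illustrative numbers only).
groups = {
    "group_a": {"cases": 9000, "correct": 8730},
    "group_b": {"cases": 1000, "correct": 770},
}

total_cases = sum(g["cases"] for g in groups.values())
total_correct = sum(g["correct"] for g in groups.values())
overall_accuracy = total_correct / total_cases

per_group_accuracy = {
    name: g["correct"] / g["cases"] for name, g in groups.items()
}

print(f"overall: {overall_accuracy:.0%}")
for name, acc in per_group_accuracy.items():
    print(f"{name}: {acc:.0%}")
```

In this example the overall figure is 95%, but it is made up of 97% for the larger group and 77% for the smaller one. A specification that names only the overall number is satisfied by both a fair system and this one.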
Most AI system design happens at the requirement and commissioning level, not in the engineering phase. When you specify what an AI system should do, you are designing it. When you ask for accuracy without asking about fairness, you are designing a system that may be unfair. When you choose to automate a decision fully instead of keeping humans in the loop, you are designing who will be accountable. When you decide to deploy to a certain population first, you are designing whose risks are addressed first.
Customization and prompting: For systems you license (like large language models), customization choices are design choices. How you fine-tune the system, what instructions you give it, what constraints you set — these shape what the system will do and what values it will express to your users.
Commissioning choices: When you commission an AI system — whether from an external vendor or an internal team — the requirements and success metrics you specify design the system. Specifying accuracy designs for one thing. Specifying fairness designs for something different.
If you commission an AI system and do not ask about fairness, you are not being neutral. You are designing for whatever fairness emerges — which is often the fairness of the training data, which often reflects historical biases. Not asking about ethics is an ethical choice: accepting the ethics that the system will have without deliberate design.
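One way to see the difference a requirement makes is to write the brief as something a delivered system can be checked against. The sketch below is a hypothetical acceptance check. The requirement names (min_overall_accuracy, max_group_accuracy_gap), the thresholds, and the results are assumptions chosen for the example, not a standard.

```python
def check_against_brief(results, brief):
    """Return a list of unmet requirements (an empty list means the brief is met).

    results: group name -> (correct_decisions, total_decisions)
    brief:   requirement name -> threshold (hypothetical requirement names)
    """
    total_correct = sum(correct for correct, _ in results.values())
    total_cases = sum(cases for _, cases in results.values())
    overall = total_correct / total_cases
    per_group = {g: correct / cases for g, (correct, cases) in results.items()}

    unmet = []
    if overall < brief["min_overall_accuracy"]:
        unmet.append(f"overall accuracy {overall:.2f} is below the minimum")
    if "max_group_accuracy_gap" in brief:
        gap = max(per_group.values()) - min(per_group.values())
        if gap > brief["max_group_accuracy_gap"]:
            unmet.append(f"accuracy gap between groups is {gap:.2f}")
    return unmet

# Hypothetical test results for a delivered system.
results = {"group_a": (8730, 9000), "group_b": (770, 1000)}

accuracy_only_brief = {"min_overall_accuracy": 0.95}
brief_with_fairness = {"min_overall_accuracy": 0.95, "max_group_accuracy_gap": 0.05}

print(check_against_brief(results, accuracy_only_brief))
print(check_against_brief(results, brief_with_fairness))
```

The same system passes the first brief and fails the second. Nothing about the system changed between the two checks. Only the question the commissioner asked did. An accuracy gap is just one possible fairness measure, and choosing which measure to require is itself a value judgment.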
Ethical accountability for AI systems does not only belong to engineers and data scientists. Everyone involved in specifying, commissioning, deploying, or using AI systems is participating in design. This means everyone involved has some responsibility for the values that emerge. A procurement officer who specifies accuracy without fairness shares responsibility for the resulting unfairness. A manager who deploys a system without checking for bias shares responsibility for that bias. A user who acts on the system's outputs without questioning them shares responsibility for the results.
This responsibility is not individual blame — organizations share responsibility for creating conditions where ethical design is possible. But it is accountability: for the questions asked, the questions not asked, and the values that emerge as a result.
You do not need to be an engineer to design an AI system. You design through the requirements you specify, the metrics you measure, the questions you ask, and the questions you do not ask. You design through who you commission the system from, what constraints you require, and what you accept without requirement. You design through where you deploy first and how you monitor. Recognizing this is the foundation of ethical AI: understanding that design is distributed across everyone involved, and that everyone involved has some responsibility for what gets designed.
Write the commissioning brief that embeds ethical design into an AI system before it is built
You are commissioning an AI system for your organization. This is your opportunity to design ethics into the system from the start. You will write the commissioning brief — the requirements that will shape what system is built. Your brief should specify what the system should do AND the values it should embody.
For a specific use case: (1) Write your functional requirements (what the system should do). (2) Write your fairness requirements (which populations should be treated equally, and how you will measure it). (3) Write your transparency/explainability requirements (where humans need to understand why decisions were made). (4) Write your human oversight requirements (where humans stay in control). (5) Write your validation requirements (what testing will verify the system meets these specs).