The World Press Photo Foundation announced its 2023 Digital Storytelling winner: a series by Ukrainian photographer Maxim Dondyuk documenting the war in Ukraine. But the surrounding conversation was dominated by a different image: an AI-generated submission by Boris Eldagsen titled "PSEUDOMNESIA: The Electrician," which won first prize in the Creative category of the open competition at the Sony World Photography Awards. Eldagsen then refused the prize, publicly declaring he had submitted the AI image as a test to see whether the contest was "ready for AI." The resulting debate divided the photographic world and pushed major competitions to draft AI disclosure policies.
It was not a scandal of malice. It was a scandal of category collapse: the question of what a photograph is had been left unanswered for too long, and AI forced the reckoning.
Photography's authority has always rested on a single claim: the image was produced by light reflected from a real scene, captured by a camera at a specific moment. This indexical relationship (the idea that a photograph is a physical trace of reality, like a footprint in mud) is why photographs carry legal weight in courtrooms, why they move us to donate to disaster relief, and why photojournalism has shaped geopolitical outcomes from the execution of Nguyễn Văn Lém in 1968 to Abu Ghraib in 2004.
Generative AI breaks the indexical bond. A model like Midjourney or Stable Diffusion produces images by learning statistical patterns across hundreds of millions of photographs and then synthesising new pixel arrangements that match a text prompt. No camera. No light. No moment. The resulting image may be indistinguishable from a documentary photograph, but it is not one. It is a hallucinated average of photographic style.
The danger is not simply that AI images exist. Composite and manipulated images have existed since the 1860s. The danger is scale and accessibility: any person with a free account can now produce photorealistic images of events that never occurred, people who never posed, and atrocities that were never committed, all in seconds and at zero marginal cost.
May 2023: The Pentagon explosion hoax. An AI-generated image depicting an explosion near the Pentagon was shared on Twitter (now X) by accounts including a verified account impersonating Bloomberg. The image briefly triggered a dip in U.S. stock markets before the Arlington County Fire Department confirmed no explosion had occurred. The image was detectable as AI-generated on close inspection (the architectural details were inconsistent), but it spread before verification could catch up.
January 2024: Taylor Swift NCII. Non-consensual intimate AI-generated images of Taylor Swift were shared millions of times on X before the platform took action. The incident accelerated legislation in the United States; multiple states passed laws specifically criminalising AI-generated non-consensual intimate imagery (NCII) within months. Microsoft, whose Designer tool was reportedly used to generate some images, subsequently tightened its content filters.
September 2023: Balenciaga "refugee couture." A viral set of AI-generated images depicted refugees wearing high-fashion Balenciaga clothing in devastated environments. The images were clearly labelled as AI art by their creator, but once stripped of context on social media, they were shared by thousands of accounts as commentary on real luxury brands, raising questions about dignity, representation, and the aestheticisation of suffering without consent from those being depicted.
KEY CONCEPT: INDEXICALITY
Semiotician Charles Sanders Peirce distinguished between icons (images that resemble), symbols (arbitrary signs), and indices (signs causally connected to what they represent). A thermometer is an index of temperature; smoke is an index of fire; a photograph is an index of a real scene. AI-generated images are icons (they resemble photographs), but they are not indices. Ethical photography practice in the AI era requires clearly communicating this distinction to audiences.
The Content Authenticity Initiative (CAI), founded by Adobe with The New York Times and Twitter in 2019, helped establish the Coalition for Content Provenance and Authenticity (C2PA), which publishes the C2PA standard. C2PA embeds cryptographically signed metadata into image files, creating a tamper-evident record of the image's origin, edits, and authorship: a "nutrition label" for media.
In 2023, Leica became the first camera manufacturer to embed C2PA credentials in hardware, launching the M11-P. Canon and Nikon announced similar programmes. Adobe Firefly, a generative AI tool, automatically attaches Content Credentials to AI-generated images. The standard is technically sound but faces an adoption problem: credentials are stripped when images are uploaded to most social platforms, and there is no legal requirement to apply them.
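To make "tamper-evident" concrete, here is a minimal sketch of the underlying idea: bind a provenance claim to a hash of the image bytes, then sign the claim. It is not the C2PA format. Real Content Credentials use certificate-based signatures and a structured manifest, whereas this sketch assumes a shared secret key (`SIGNING_KEY`) and hypothetical helper names (`make_manifest`, `verify_manifest`) purely for illustration.

```python
import hashlib
import hmac
import json

# Hypothetical signing key, for illustration only. Real Content Credentials
# rely on certificate-based signatures, not a shared secret like this.
SIGNING_KEY = b"demo-key-not-for-production"


def make_manifest(image_bytes: bytes, author: str, tool: str) -> dict:
    """Build a provenance claim bound to the image's hash, then sign it."""
    claim = {
        "author": author,
        "tool": tool,
        "image_sha256": hashlib.sha256(image_bytes).hexdigest(),
    }
    payload = json.dumps(claim, sort_keys=True).encode("utf-8")
    signature = hmac.new(SIGNING_KEY, payload, hashlib.sha256).hexdigest()
    return {"claim": claim, "signature": signature}


def verify_manifest(image_bytes: bytes, manifest: dict) -> bool:
    """Return True only if the claim is unaltered and matches these bytes."""
    payload = json.dumps(manifest["claim"], sort_keys=True).encode("utf-8")
    expected = hmac.new(SIGNING_KEY, payload, hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, manifest["signature"]):
        return False  # the claim itself was edited after signing
    actual_hash = hashlib.sha256(image_bytes).hexdigest()
    return actual_hash == manifest["claim"]["image_sha256"]


original = b"pretend these are the bytes of an image file"
manifest = make_manifest(original, author="A. Photographer", tool="Camera X")

print(verify_manifest(original, manifest))             # untouched image
print(verify_manifest(original + b"edit", manifest))   # pixels changed
```

Either kind of tampering is caught: changing the image breaks the hash match, and changing the claim breaks the signature. What the sketch cannot show is the adoption problem described above, because a credential that has been stripped entirely leaves nothing to verify.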
AI detection software (tools like Hive Moderation, AI or Not, and Illuminarty) attempts to identify generated images by statistical signatures. Accuracy rates above 90% are reported in controlled tests, but adversarial techniques (lightly post-processing AI images) can defeat most detectors. Detection is a useful layer but not a reliable solution on its own.
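A short worked calculation shows why a headline accuracy figure is not enough on its own. The sketch below assumes a detector that is right 90% of the time on both AI and non-AI images (an illustrative reading of "above 90%", not a measured property of any named tool) and asks what share of flagged images are actually AI-generated at different base rates.

```python
def flagged_precision(sensitivity: float, specificity: float, prevalence: float) -> float:
    """Share of images flagged as AI that really are AI (Bayes' rule)."""
    true_positives = sensitivity * prevalence
    false_positives = (1.0 - specificity) * (1.0 - prevalence)
    return true_positives / (true_positives + false_positives)


# Assumed figures: 90% sensitivity and 90% specificity; the prevalence values
# are hypothetical shares of AI-generated images in an upload stream.
for prevalence in (0.50, 0.10, 0.01):
    precision = flagged_precision(0.90, 0.90, prevalence)
    print(f"AI share of uploads {prevalence:.0%}: {precision:.1%} of flags are correct")
```

Under these assumptions, when only 1 in 100 uploads is AI-generated, roughly 1 flag in 12 is correct, so most flagged images would be genuine photographs. That is one reason to treat detection as a supporting signal rather than a verdict.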
PHOTOGRAPHER'S ETHICAL ANCHOR
The National Press Photographers Association (NPPA) updated its Code of Ethics in 2023 to state that members must "clearly label" AI-generated or AI-altered images and must never use AI to fabricate news events. The World Press Photo contest now requires entrants to disclose any use of generative AI, and submissions found to contain undisclosed AI generation are disqualified. As a photographer working with AI tools, voluntary adoption of these standards before they are legally mandated is a mark of professional integrity.
In this lab you will interrogate the ethical stakes of AI-generated photorealistic images. Use the AI assistant to think through real scenarios: What disclosure obligations do photographers have? How do audiences calibrate trust? When does a synthetic image cross from art into deception?
Engage with at least three substantive exchanges to complete this lab.
In February 2023, Getty Images filed suit against Stability AI in the United States District Court for the District of Delaware, alleging that Stability AI had scraped and used more than 12 million photographs from Getty's collection to train Stable Diffusion without licence, without compensation, and without removing Getty's watermarks (which sometimes appeared, distorted, in generated outputs). The case is ongoing as of 2024 and is one of the most prominent copyright disputes in AI image generation to date.
The suit does not stand alone. In January 2023, a class-action lawsuit brought by artists including Sarah Andersen, Kelly McKernan, and Karla Ortiz against Stability AI, Midjourney, and DeviantArt alleged that the companies had trained on billions of images scraped from the web without consent. The central question is not whether copying occurred (the companies conceded scraping) but whether training on images constitutes copyright infringement under U.S. law.
The dominant training datasets for image generation models, including LAION-5B (5.85 billion image-text pairs), LAION-Aesthetics, and Common Crawl, were assembled by scraping publicly accessible URLs from the internet. Images posted to Flickr, ArtStation, DeviantArt, personal portfolio sites, and stock agencies were included at scale. The LAION team estimated in 2022 that roughly 47% of LAION-5B's images were hosted on just five platforms: Flickr, Wikimedia, WordPress, Imgur, and Facebook.
The legal status of this scraping is contested across jurisdictions. In the United States, the fair use doctrine, tested in cases like Authors Guild v. Google (2015), which permitted Google to index and display snippets of books, may protect transformative uses of copyrighted material for training purposes. However, AI training is distinct from indexing: the model does not store images but learns to reproduce their statistical features, and outputs can be stylistically indistinguishable from specific artists' work. Whether this constitutes infringement remains an open legal question.
In the European Union, the AI Act (formally adopted in 2024) requires providers of general-purpose AI models to publish "sufficiently detailed summaries" of the data used for training, specifically to enable copyright holders to assert their rights. This represents the first significant legislative mandate for AI training data transparency.
Several mechanisms now exist, or are emerging, for photographers and artists to opt out of AI training data:
Spawning's "Have I Been Trained?" (haveibeentrained.com) allows artists to search LAION-5B for their images and submit opt-out requests. As of 2023, over 80 million images had been opted out. However, opt-out is not retroactive: the images have already been used in training runs, and new models built on different datasets are not affected.
Robots.txt extensions (the "ai-crawlers" token proposed by the Spawning API and adopted by some web crawlers) allow website owners to signal that their content should not be scraped for AI training. Major AI companies including Google DeepMind, Common Crawl, and OpenAI have committed to honouring these signals, but compliance is not legally enforceable in most jurisdictions.
Adobe Stock's consent-based model is the clearest positive alternative: Adobe compensates contributors whose images are used to train Adobe Firefly, with bonus payments based on image usage in training. This opt-in, compensated model represents the ethical standard that advocates argue should be industry-wide.
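The robots.txt signalling described above can be sketched with Python's standard `urllib.robotparser`. This example does not use the "ai-crawlers" token mentioned in the text; as an assumption, it blocks individual crawler tokens instead (`GPTBot`, `CCBot`, and `Google-Extended`, as published by OpenAI, Common Crawl, and Google). The site and URL are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Example robots.txt body for a hypothetical portfolio site. The per-crawler
# tokens below are an assumption about which crawlers the owner wants to block.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: CCBot
Disallow: /

User-agent: Google-Extended
Disallow: /

User-agent: *
Allow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

portfolio_url = "https://example.com/portfolio/image-001.jpg"
for agent in ("GPTBot", "CCBot", "Google-Extended", "Googlebot"):
    allowed = parser.can_fetch(agent, portfolio_url)
    print(f"{agent:16} may fetch: {allowed}")
```

The file only expresses a preference. As noted above, nothing in the protocol prevents a crawler from ignoring it, which is why compliance depends on voluntary commitments.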
FACIAL RECOGNITION AND STREET PHOTOGRAPHY
Beyond training data, AI facial recognition tools have created a parallel consent crisis in street photography. Clearview AI scraped over 30 billion facial images from the public internet, including social media, to build a facial recognition database sold to law enforcement. In 2022, Clearview was fined €20 million by France's CNIL and ordered to delete French citizens' data. The case established that publicly posted photographs retain privacy protections; publication does not equal consent to facial recognition indexing.
U.S. copyright law does not protect artistic style, only specific expression. This means that prompting an AI to generate images "in the style of Annie Leibovitz" or "in the style of Steve McCurry" is not, on current legal interpretation, copyright infringement, because no specific copyrighted image is being reproduced. However, this same principle means that photographers cannot copyright their personal visual style, leaving them legally exposed even as their work is commercially exploited.
The right of publicity, which protects individuals' names, likenesses, and personas from commercial exploitation, offers a parallel avenue. In 2024, a Tennessee law (the ELVIS Act) extended right-of-publicity protections specifically to AI-generated vocal imitations, and similar proposals for visual likeness are under discussion in multiple states. For photographers whose recognisable subjects are reproduced in AI outputs, right-of-publicity claims may offer stronger protection than copyright.
PRACTICE STANDARD
When using AI tools in your photographic practice, proactively investigate what data your chosen tool was trained on. Prefer tools with transparent, consent-based training programmes (such as Adobe Firefly). If you generate images incorporating recognisable people's likenesses, obtain explicit consent. Document your workflow decisions; they are increasingly required by galleries, agencies, and competitions.
This lab focuses on the consent and copyright dimensions of AI image generation. Explore questions about training data collection, photographers' rights, opt-out mechanisms, and what a fair compensation framework might look like.
Engage with at least three substantive exchanges to complete the lab.
In March 2023, Bloomberg journalists ran a systematic test of five major AI image generators (Stable Diffusion, DALL-E 2, Midjourney, Adobe Firefly, and DreamStudio), prompting each to generate images of people in high-status professions (CEO, lawyer, doctor, judge) and low-status or criminal contexts (fast-food worker, criminal, social-services recipient). The results were striking: all five systems overrepresented white men in high-status roles and darker-skinned individuals in low-status or criminal contexts, in proportions that significantly exceeded even the biases already present in the U.S. workforce.
The study did not indicate deliberate design choices. It reflected a structural reality: the training data, scraped predominantly from English-language Western internet sources, encoded the photographic and representational norms of those sources, which themselves reflect decades of systemic inequity in how people are photographed, published, and distributed.
Bias in AI image generation operates at multiple levels. At the data level, training datasets over-represent certain demographics, geographies, and aesthetics. LAION-5B, for instance, draws heavily from English-language platforms; content from Africa, South Asia, and Latin America is proportionally underrepresented relative to global population. A model trained on this data will generate images that skew toward Western visual norms by default.
At the labelling level, image-text pairs in training data reflect the assumptions of their labellers. If images of women in leadership roles are less commonly captioned "CEO" than equivalent images of men (because fewer such images existed in training corpora, or because labellers used different language), the model learns a skewed association.
At the RLHF level (Reinforcement Learning from Human Feedback), human raters shape which outputs are rewarded. If raters share demographic or aesthetic preferences (as they are statistically likely to, given that many rater pools are recruited from online platforms with their own demographic skews), those preferences become encoded in the model.
The result is a feedback amplification loop: biased photography norms produce biased training data, which produces biased models, which produce biased outputs used in advertising, editorial, and cultural production, reinforcing the original norms at scale.
Google's Gemini image generation tool launched in February 2024, and its ability to generate images of people was suspended almost immediately following a wave of controversy. Users discovered that prompts for historical images ("Nazi German soldiers," "the Founding Fathers of the United States," "a medieval English knight") produced historically inaccurate images showing racially diverse groups in contexts where such diversity was anachronistic or distorting. Google's vice president of product for Gemini acknowledged the tool had "missed the mark," and the company paused image generation of people entirely while recalibrating.
The episode illustrated the difficulty of correcting for bias without overcorrecting: Google had implemented diversity-promoting interventions in response to the documented whiteness-by-default problem in AI image generation, but without adequate safeguards for historical contexts where diversity injection distorted rather than improved accuracy. Critics across the political spectrum noted that the episode revealed how deeply political the choices embedded in AI training and tuning actually are.
THE STOCHASTIC PARROT PROBLEM
Researchers Emily Bender, Timnit Gebru, Angelina McMillan-Major, and Shmargaret Shmitchell coined the term "stochastic parrot" (2021) to describe how large language and multimodal models "parrot" patterns from training data without understanding. Applied to image generation: when a model produces a "doctor," it reproduces the statistical average of how doctors appear in its training data, not the reality of who doctors are globally. This is not intelligence; it is very large-scale pattern completion. The ethical weight falls on those who deploy these outputs as representations of reality.
Underrepresentation in training data does not just produce biased outputs; it can produce technical failure for certain populations. In 2015, Google Photos' image classifier notoriously labelled photographs of Black people as "gorillas," a failure traced to underrepresentation of darker-skinned subjects in training data. Google's solution, first reported by Wired in 2018, was to block all searches for "gorilla," "chimp," "chimpanzee," and "monkey" in Google Photos, a workaround that persisted for at least eight years rather than fixing the underlying training data problem.
In facial recognition, a 2018 MIT Media Lab study by Joy Buolamwini and Timnit Gebru ("Gender Shades") found error rates up to 34.7% for darker-skinned women in commercial facial analysis systems, compared to 0.8% for lighter-skinned men. The study directly prompted IBM, Microsoft, and Amazon to audit and retrain their facial recognition products. Amazon halted sales of its Rekognition tool to law enforcement in 2020 after further studies showed error rates that risked wrongful identification of Black individuals in criminal investigations.
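The core of a disaggregated audit of this kind is simple: compute the error rate separately for each subgroup instead of reporting one overall figure. The sketch below uses invented records, not data from Gender Shades or any vendor, and a hypothetical helper `error_rates_by_group`.

```python
from collections import defaultdict

# Invented audit records for illustration: (subgroup, predicted, actual).
records = [
    ("darker-skinned women", "male", "female"),
    ("darker-skinned women", "female", "female"),
    ("darker-skinned women", "male", "female"),
    ("darker-skinned women", "female", "female"),
    ("lighter-skinned men", "male", "male"),
    ("lighter-skinned men", "male", "male"),
    ("lighter-skinned men", "male", "male"),
    ("lighter-skinned men", "male", "male"),
]


def error_rates_by_group(rows):
    """Return {subgroup: fraction of misclassified examples}."""
    totals = defaultdict(int)
    errors = defaultdict(int)
    for group, predicted, actual in rows:
        totals[group] += 1
        if predicted != actual:
            errors[group] += 1
    return {group: errors[group] / totals[group] for group in totals}


overall = sum(predicted != actual for _, predicted, actual in records) / len(records)
rates = error_rates_by_group(records)

print(f"overall error rate: {overall:.0%}")
for group, rate in rates.items():
    print(f"{group}: {rate:.0%}")
```

In this toy data the overall figure (25%) hides the fact that every error falls on one subgroup, which is the pattern a single aggregate accuracy number conceals.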
ETHICAL PRACTICE FOR PHOTOGRAPHERS
When using AI generation or enhancement tools, actively interrogate the demographic assumptions of the output. Does the tool's default "person" represent the global diversity of human appearances? Are you using AI-generated imagery in commercial or editorial contexts where representational accuracy matters? Consider whether AI-generated images in your work risk perpetuating the erasure or misrepresentation of already-underrepresented communities, and whether photographic fieldwork with real subjects would be more ethical and more accurate.
This lab explores how AI image generation systems encode and amplify demographic bias, and what ethical obligations photographers have when using these tools. Discuss real cases, interrogate the mechanisms of bias, and consider practical mitigation strategies.
Engage with at least three substantive exchanges to complete the lab.
When the European Parliament approved its position on the EU AI Act (the world's first comprehensive AI regulatory framework) in June 2023, the text included a specific requirement, now Article 50 of the adopted Act, that AI systems generating synthetic media must ensure outputs are "marked in a machine-readable format and detectable as artificially generated or manipulated." The regulation further requires disclosure when text, images, audio, or video have been AI-generated, with specific provisions under which deepfakes used for satire or artistic purposes must still carry labels indicating their synthetic nature.
For photographers and visual media professionals, the AI Act established a clear trajectory: disclosure is becoming a legal obligation, not a professional courtesy. The question is no longer whether to label AI-generated imagery, but how, and what the enforcement consequences of non-compliance will be when the Act's provisions fully take effect in 2025 and 2026.
As of 2024, the disclosure landscape for AI-generated imagery is a patchwork of voluntary standards, platform policies, and emerging legislation:
EU AI Act (2024): Mandates machine-readable watermarking and human-readable disclosure for AI-generated content produced by general-purpose AI systems. High-risk applications (including certain uses of facial recognition) face stricter transparency requirements. Penalties for non-compliance can reach €15 million or 3% of global annual turnover for generative AI providers.
U.S. NO FAKES Act (proposed, 2023): Would create a federal right for individuals to control digital replicas of their voice or likeness, including AI-generated images. As of 2024, the bill had bipartisan support but had not passed.
China's Deep Synthesis Regulations (effective January 2023): Require service providers to obtain user consent before using their likeness or voice, to label AI-generated content conspicuously, and to maintain logs of AI-generated content for 15 days. China is currently the most aggressive jurisdiction in mandating AI content labelling.
Platform policies: Meta announced in February 2024 that it would label AI-generated images on Facebook, Instagram, and Threads using industry-standard signals (including C2PA metadata and Google's SynthID watermarking). YouTube requires creators to disclose AI-generated content in videos, particularly those depicting realistic people or events. TikTok implemented similar requirements in 2023.
Google's SynthID, developed by Google DeepMind and launched in 2023, embeds an imperceptible digital watermark directly into the pixel values of AI-generated images. The watermark is designed to survive common post-processing operations (cropping, resizing, JPEG compression) and can be detected by Google's verification tool without access to the original model. Google has opened SynthID to third-party developers via its Vertex AI platform.
Adobe Content Credentials (built on C2PA) take a different approach: rather than altering pixel values, they attach a signed metadata manifest to the image file that records generation history. This approach is fully transparent and human-readable but depends on platforms preserving metadata, which most social platforms currently do not.
Invisible watermarking vs. metadata represent two philosophies: invisible watermarks are harder to strip but can be defeated by adversarial processing; metadata is transparent and auditable but easily stripped. Most technologists argue both approaches are needed in combination.
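The trade-off between the two philosophies can be illustrated with a deliberately crude toy model. The sketch below assumes an "image" is just a list of 8-bit pixel values plus a metadata dictionary, and hides a bit pattern in each pixel's least significant bit. This is not how SynthID works, and a least-significant-bit mark is far more fragile than a production watermark; all function names are hypothetical.

```python
import random


def embed_lsb(pixels, bits):
    """Hide one bit in the least significant bit of each leading pixel."""
    marked = list(pixels)
    for i, bit in enumerate(bits):
        marked[i] = (marked[i] & ~1) | bit
    return marked


def read_lsb(pixels, length):
    """Read back the hidden bits from the first `length` pixels."""
    return [p & 1 for p in pixels[:length]]


def strip_metadata(image):
    """Model a platform upload that keeps pixels but drops metadata."""
    return {"pixels": list(image["pixels"]), "metadata": {}}


def add_noise(image, seed=0):
    """Model adversarial post-processing: nudge each pixel by -1, 0 or +1."""
    rng = random.Random(seed)
    noisy = [min(255, max(0, p + rng.choice((-1, 0, 1)))) for p in image["pixels"]]
    return {"pixels": noisy, "metadata": dict(image["metadata"])}


WATERMARK = [1, 0, 1, 1, 0, 0, 1, 0] * 4
pixel_rng = random.Random(42)
image = {
    "pixels": embed_lsb([pixel_rng.randrange(256) for _ in range(64)], WATERMARK),
    "metadata": {"generator": "hypothetical-model", "ai_generated": True},
}

uploaded = strip_metadata(image)   # metadata lost, pixel mark intact
attacked = add_noise(image)        # metadata intact, pixel mark damaged

print("after upload :", uploaded["metadata"], read_lsb(uploaded["pixels"], 32) == WATERMARK)
print("after attack :", attacked["metadata"], read_lsb(attacked["pixels"], 32) == WATERMARK)
```

Each signal fails under a different operation, which is the practical argument for using both together.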
THE 2024 U.S. ELECTION AND AI IMAGERY
The 2024 U.S. presidential election cycle produced the first documented large-scale use of AI-generated political imagery in campaign advertising. In January 2024, a robocall using an AI-generated voice imitating President Biden instructed New Hampshire voters not to vote in the Democratic primary, a case that prompted the FCC to ban AI-generated voices in robocalls. On the image side, AI-generated photos of political candidates in fabricated scenarios circulated on social media throughout the campaign cycle, prompting the FEC to consider requiring disclosure of AI content in political advertising.
Major photojournalism institutions updated their AI policies substantively in 2023-2024. The Associated Press policy, revised in 2023, prohibits using AI to generate photorealistic images for editorial use but permits AI tools for image organisation, search, and non-editorial tasks. Reuters maintains a similar prohibition. The New York Times states that photojournalists may not alter or generate images of news events using AI.
For documentary and fine-art photographers, the standards are less prescriptive but the professional community is increasingly aligned: images submitted to competitions or published with documentary intent must disclose AI involvement. The Photographic Society of America updated its exhibition rules in 2023, creating separate "Creative AI" divisions for AI-generated work, distinct from traditional and digitally manipulated photography categories.
The deeper shift is conceptual: photography is ceasing to be treated as a single medium and is fracturing into disclosure-dependent categories (documentary, editorial, commercial, creative AI), each with its own ethical and legal framework. Photographers who engage with AI tools need to be fluent in which category their work occupies and what obligations attach to it.
YOUR DISCLOSURE CHECKLIST
Before publishing any image with AI involvement: (1) Determine the publication context: documentary, editorial, commercial, or creative/art. (2) Check the platform's current AI labelling requirements. (3) If publishing in the EU, assess whether EU AI Act disclosure obligations apply. (4) Apply C2PA Content Credentials using Adobe's free tools or camera-native solutions. (5) Include a human-readable disclosure in caption or alt text. (6) Retain documentation of your AI workflow (generation prompts, tools used, edit history) for a minimum of 12 months.
This lab focuses on practical compliance and professional ethics around AI image disclosure. Explore how the EU AI Act, platform policies, and photojournalism standards apply to real-world publishing decisions, and how to build a disclosure workflow into your practice.
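One way to build the six steps above into a workflow is to record them per image and check what is still outstanding before publishing. The sketch below is a minimal illustration; the `DisclosureRecord` type, its field names, and `outstanding_steps` are hypothetical and not drawn from any standard or tool.

```python
from dataclasses import dataclass, field

# Hypothetical set of contexts, mirroring step 1 of the checklist.
CONTEXTS = {"documentary", "editorial", "commercial", "creative"}


@dataclass
class DisclosureRecord:
    context: str = ""                       # step 1
    platform_rules_checked: bool = False    # step 2
    publishing_in_eu: bool = False
    eu_obligations_assessed: bool = False   # step 3 (only if publishing in the EU)
    credentials_attached: bool = False      # step 4
    caption_disclosure: str = ""            # step 5
    workflow_log: list[str] = field(default_factory=list)  # step 6


def outstanding_steps(record: DisclosureRecord) -> list[str]:
    """Return the checklist steps that are still incomplete."""
    missing = []
    if record.context not in CONTEXTS:
        missing.append("1: determine publication context")
    if not record.platform_rules_checked:
        missing.append("2: check platform AI labelling requirements")
    if record.publishing_in_eu and not record.eu_obligations_assessed:
        missing.append("3: assess EU AI Act obligations")
    if not record.credentials_attached:
        missing.append("4: apply Content Credentials")
    if not record.caption_disclosure.strip():
        missing.append("5: add human-readable disclosure")
    if not record.workflow_log:
        missing.append("6: retain workflow documentation")
    return missing


draft = DisclosureRecord(
    context="editorial",
    platform_rules_checked=True,
    publishing_in_eu=True,
    caption_disclosure="Background extended with generative AI.",
)
for step in outstanding_steps(draft):
    print("TODO", step)
```

The 12-month retention period is not modelled here; a real workflow would also need to store dates alongside the log entries.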
Engage with at least three substantive exchanges to complete the lab.