In early 2023, Heinz ran a global campaign called "A.I. Ketchup." The brand prompted several AI image generators with the phrase "ketchup" — no brand name, just the word. Every generator, across every style, defaulted to a bottle that looked unmistakably like Heinz. The campaign ran across 18 markets and earned over 1.3 billion impressions organically. The creative director cited this as proof that brand equity had become encoded into the training data of generative models themselves.
Between 2022 and 2025 more than a dozen serious generative image platforms launched or reached maturity. Each makes slightly different tradeoffs: image quality versus speed, artistic range versus brand controllability, ease of use versus depth of customization. For designers, the instinct to pick one tool and master it is understandable — but it leaves capability on the table.
This lesson maps the major platforms, explains the architectural differences that produce different results, and gives you a professional framework for deciding which tool belongs in which part of a real project pipeline.
All four platforms covered in this lesson (Midjourney, DALL·E 3, Stable Diffusion, and Adobe Firefly) use diffusion models: a process that starts from random noise and iteratively refines it toward an image matching a text description. But the training data, model architecture, and inference controls differ significantly.
Midjourney uses a proprietary model trained on a curated dataset that skews toward fine art, photography, and design. Its aesthetic quality comes partly from heavy curation of what it learned from. It gives users relatively few direct parameters but rewards experienced prompt engineers who understand its aesthetic vocabulary.
DALL·E 3 uses a GPT-4-based system to rewrite your prompt before sending it to the image model — meaning it aggressively interprets your intent. This makes it very good at literal fidelity but can override stylistic choices you make deliberately.
Stable Diffusion (specifically SDXL and SD 3.x) exposes its weights publicly, enabling the entire ecosystem of ControlNet (pose/depth control), LoRA (lightweight fine-tuning), and img2img workflows that make professional production pipelines possible.
Adobe Firefly was trained exclusively on Adobe Stock images, openly licensed content, and public domain material — giving it a unique commercial safety guarantee that competitors cannot currently match at scale.
Closed-weight models (Midjourney, DALL·E) are faster to use but impossible to customise at the model level. Open-weight models (Stable Diffusion) require more infrastructure but allow brand-specific fine-tuning that closed models cannot replicate.
No single tool wins across all criteria. A mature design pipeline in 2025 typically uses Firefly for client-deliverable assets requiring legal clarity, Midjourney for concept ideation, DALL·E for precise comps, and Stable Diffusion for bespoke brand fine-tuning. The skill is knowing when to switch.
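The rule of thumb above can be written down as a small lookup, which is handy when a studio wants the platform choice documented per asset. This is a minimal sketch, not an established tool: the `Task` fields, the `suggest_platform` name, and the priority order (legal clarity first, then fine-tuning, then literal fidelity) are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    """Hypothetical description of what a single asset needs."""
    needs_legal_clarity: bool = False      # client deliverable with legal sign-off
    needs_brand_fine_tuning: bool = False  # bespoke, brand-specific model behaviour
    needs_literal_fidelity: bool = False   # precise comp that follows the brief closely


def suggest_platform(task: Task) -> str:
    """Return a starting suggestion; the check order is an assumed priority."""
    if task.needs_legal_clarity:
        return "Adobe Firefly"
    if task.needs_brand_fine_tuning:
        return "Stable Diffusion"
    if task.needs_literal_fidelity:
        return "DALL·E 3"
    # Default: open-ended concept ideation.
    return "Midjourney"


if __name__ == "__main__":
    print(suggest_platform(Task(needs_legal_clarity=True)))
    print(suggest_platform(Task()))
```

Putting legal clarity first reflects the lesson's emphasis on client deliverables; reorder the checks if your projects weigh the criteria differently.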
You'll receive three real-world design scenarios. For each, work through which generative image platform (Midjourney, DALL·E 3, Stable Diffusion, or Adobe Firefly) is the best choice — and why. Your coach will push you to justify decisions based on capability, legal requirements, and workflow fit.
When Coca-Cola used DALL·E 3 and Stable Diffusion for their 2024 "Masterpiece" campaign extensions, the creative team published internal notes describing their prompt development process. Initial prompts produced images that were aesthetically impressive but unusable — wrong aspect ratios, brand colour drift, inconsistent bottle silhouettes. The team spent three weeks developing a prompt library of 47 tested templates before achieving production-quality consistency across 200+ assets. The insight: prompt engineering at production scale is a repeatable craft, not creative inspiration.
Most designers approach AI image prompts the way they'd write a creative brief to a human illustrator: descriptive, evocative, relatively loose. This produces aesthetically interesting results but rarely production-ready ones. Generative models need a different kind of input: one that balances subject specification, style parameters, technical constraints, and negative prompts (what you don't want).
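One way to keep those ingredients separate is to store each one in its own field and only join them at the last moment. The sketch below covers just the four ingredients named above (subject, style, technical constraints, and what you don't want). The `PromptSpec` class, its method names, and the example wording are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class PromptSpec:
    """Hypothetical container that keeps prompt ingredients apart until render time."""
    subject: str
    style: List[str] = field(default_factory=list)
    technical: List[str] = field(default_factory=list)
    negative: List[str] = field(default_factory=list)

    def positive_prompt(self) -> str:
        # Subject first, then style terms, then technical constraints.
        parts = [self.subject, *self.style, *self.technical]
        return ", ".join(p.strip() for p in parts if p.strip())

    def negative_prompt(self) -> str:
        return ", ".join(n.strip() for n in self.negative if n.strip())

    def missing_ingredients(self) -> List[str]:
        """Name the ingredients that are still empty, to flag weak prompts."""
        checks = {
            "subject": bool(self.subject.strip()),
            "style": bool(self.style),
            "technical": bool(self.technical),
            "negative": bool(self.negative),
        }
        return [name for name, filled in checks.items() if not filled]


spec = PromptSpec(
    subject="glass ketchup bottle on a white table",
    style=["studio product photography"],
    technical=["soft overhead lighting", "square composition"],
    negative=["text", "hands"],
)
print(spec.positive_prompt())
print(spec.negative_prompt())
```

Keeping the fields apart makes it easy to see which part of a prompt is thin before spending generations on it.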
The anatomy of a professional prompt has five layers. Mastering them is the difference between a designer who occasionally gets lucky with AI and one who uses it reliably for client deliverables.
Midjourney responds strongly to emotional and aesthetic vocabulary: "ethereal," "desolate," "brutalist opulence." DALL·E 3 responds better to structural description: "a wide-angle photograph of a concrete building at dusk, dramatic shadows, no people." Knowing which register each model responds to saves hours of iteration.
Professional prompt engineering is iterative, not single-shot. The production workflow used by studios like BUCK and Superside typically follows three stages: Discovery (broad, high-chaos prompts to explore the solution space), Refinement (narrowing style and subject with lower chaos/temperature), and Production Lock (a fixed prompt template + seed that can generate consistent variants).
Seed locking — specifying a fixed random seed in Midjourney or Stable Diffusion — is particularly powerful for campaign work because it allows stylistically consistent images of different subjects using the same visual language.
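A production lock can be stored as a small record: one template, one seed, and the subject as the only moving part. The sketch below renders Midjourney-style prompt strings using its `--ar`, `--chaos`, and `--seed` parameters. The `PromptLock` class, its field names, and the example values are hypothetical, and how closely a fixed seed reproduces a look varies by platform and model version.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PromptLock:
    """Hypothetical 'Production Lock': a frozen template plus a fixed seed."""
    template: str            # must contain the placeholder {subject}
    seed: int
    aspect_ratio: str = "16:9"
    chaos: int = 0           # low chaos for production, higher for discovery

    def render(self, subject: str) -> str:
        text = self.template.format(subject=subject)
        return f"{text} --ar {self.aspect_ratio} --chaos {self.chaos} --seed {self.seed}"


lock = PromptLock(
    template="{subject}, flat vector illustration, warm palette",
    seed=1234,
)

# Same visual language, different subjects.
for subject in ["a bicycle", "a coffee cup", "a city skyline"]:
    print(lock.render(subject))
```

Because the record is frozen, a locked template cannot be edited by accident; a new look means a new `PromptLock`, which fits the idea of versioning prompts like any other design asset.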
Treat your best prompts as design assets — version-controlled, documented, and shared across your team. The Coca-Cola team's 47-template prompt library was as much a brand asset as their style guide. Build yours deliberately.
You'll practice the Five-Layer Prompt Framework by building, critiquing, and refining prompts for real design scenarios. Your coach will analyse your prompts, identify which layers are weak, and help you revise toward production quality.
When the Wes Anderson style became a dominant AI aesthetic trend in 2023, the studio Curious Refuge documented how they built full AI short films. Their key finding: text-to-image alone produced beautiful single frames but no temporal or compositional consistency. Their production pipeline used ControlNet with OpenPose to lock character body positions across frames, img2img at 0.4–0.6 strength to restyle photographic references into the target aesthetic, and inpainting to swap elements within otherwise-locked compositions. The combination turned a single-shot tool into a production-capable system.
Text-to-image is the entry point to generative tools, but it has a fundamental limitation for design work: stochastic composition. Every generation produces a random compositional arrangement within the prompt's constraints. For editorial work, this is acceptable. For campaign assets, packaging, or multi-image series requiring visual consistency, it is a blocking problem.
The three techniques in this lesson — img2img, inpainting, and ControlNet — solve this by giving designers control over composition, content, and structure that text prompts alone cannot provide.
Img2img takes an existing image as input alongside a text prompt. The model uses the reference image's structure as a starting point, then rewrites it toward the prompt. The denoising strength parameter (0.0–1.0) controls how much the model departs from the reference: 0.0 produces an identical copy, 1.0 produces a purely text-driven result. The productive range for most design applications is 0.35–0.65.
Practical applications include: restyling a mood board photograph into an illustrated look; converting a rough sketch into a polished render; adapting a competitor's aesthetic into a new visual language; creating dark/light mode variants of the same scene.
0.2–0.35: Preserve structure almost entirely, subtle style change only. 0.4–0.55: Major style shift while keeping composition and subject. 0.6–0.75: Significant departure — shape and rough composition survive. 0.8+: Only faint traces of the original remain.
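The strength bands above can be kept as a lookup table so a team describes settings in the same words. The sketch also includes a simplified model of why 0.0 returns the reference and 1.0 ignores it: it assumes the tool runs only a `strength` share of the denoising steps, which is how some img2img implementations behave but is not stated in this lesson. The function names are hypothetical, and values that fall between the listed bands are deliberately left unlabelled.

```python
from typing import Optional

# Bands copied from the lesson text: (low, high, description).
STRENGTH_BANDS = [
    (0.20, 0.35, "structure preserved almost entirely, subtle style change"),
    (0.40, 0.55, "major style shift, composition and subject kept"),
    (0.60, 0.75, "significant departure, shape and rough composition survive"),
    (0.80, 1.00, "only faint traces of the original remain"),
]


def describe_strength(strength: float) -> Optional[str]:
    """Return the band description, or None if the value sits between bands."""
    if not 0.0 <= strength <= 1.0:
        raise ValueError("strength must be between 0.0 and 1.0")
    for low, high, label in STRENGTH_BANDS:
        if low <= strength <= high:
            return label
    return None


def denoising_steps_run(total_steps: int, strength: float) -> int:
    """Assumed convention: only a `strength` share of the schedule is run."""
    if not 0.0 <= strength <= 1.0:
        raise ValueError("strength must be between 0.0 and 1.0")
    return min(int(total_steps * strength), total_steps)


print(describe_strength(0.5))
print(denoising_steps_run(30, 0.5))  # half of a 30-step schedule under this assumption
```

Under this assumption a strength of 0.0 runs no denoising steps, so the reference comes back unchanged, and 1.0 runs the full schedule from noise.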
Inpainting allows a designer to mask a specific region of an image and regenerate only that region while the rest remains pixel-identical. This is one of the most practically powerful tools in the generative pipeline: it solves the "almost perfect" problem that plagues text-to-image outputs.
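The masking idea can be shown as a compositing step on a tiny pixel grid: new pixels where the mask is set, original pixels everywhere else. This is an idealised sketch with hypothetical names. Real tools do the regeneration inside the model and often feather the mask edge, so treat it as an illustration of the principle, not of any product's internals.

```python
from typing import List

Image = List[List[int]]   # tiny grayscale grid, one int per pixel
Mask = List[List[bool]]   # True marks pixels to regenerate


def composite_inpaint(original: Image, generated: Image, mask: Mask) -> Image:
    """Take generated pixels where the mask is True, original pixels elsewhere."""
    if not (len(original) == len(generated) == len(mask)):
        raise ValueError("original, generated, and mask must have the same height")
    result: Image = []
    for orig_row, gen_row, mask_row in zip(original, generated, mask):
        if not (len(orig_row) == len(gen_row) == len(mask_row)):
            raise ValueError("rows must have the same width")
        result.append([g if m else o for o, g, m in zip(orig_row, gen_row, mask_row)])
    return result


original = [[1, 2], [3, 4]]
generated = [[9, 9], [9, 9]]
mask = [[False, True], [False, False]]

print(composite_inpaint(original, generated, mask))  # [[1, 9], [3, 4]]
```

Only the masked pixel changes; everything outside the mask is copied from the original, which is what makes the technique safe for fixing one detail in an approved image.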
Common design applications: removing AI-generated hands and regenerating them (the persistent hand failure mode); swapping background environments while keeping a hero subject; replacing product labels or colours; adding a missing design element to an otherwise-locked composition; and cleaning up compositional artefacts without restarting generation.
Inpainting is available natively in Stable Diffusion (via AUTOMATIC1111, ComfyUI), in Adobe Firefly's Generative Fill, and in Photoshop's AI features from version 25.0 onwards. Midjourney added a limited inpainting tool ("Vary Region") in late 2023.
ControlNet conditions a Stable Diffusion generation on a structural input rather than (or in addition to) a text prompt. The structural input can be a pose skeleton (OpenPose), a depth map, an edge map (Canny), a surface normal map, or a segmentation mask. This allows designers to specify the exact spatial layout of an image before generation begins.
The most important ControlNet modes for designers: OpenPose for human figure positioning (critical for fashion, lifestyle, and character work); Canny Edge for preserving product shapes and architectural layouts; Depth for controlling spatial depth while allowing surface texture and style to vary; MLSD for architectural line preservation in interior and exterior visualisation.
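When planning a series, it helps to record which structural input each job uses, so every variant is conditioned on the same layout. The sketch below only plans jobs and does not call any image model. The goal labels, the `ControlledJob` record, the `plan_variants` function, and the file name are all hypothetical; the mode names follow the list above.

```python
from dataclasses import dataclass
from typing import List

# Goal labels are hypothetical; the modes are the four named in the lesson.
MODE_FOR_GOAL = {
    "figure_pose": "openpose",
    "product_shape": "canny",
    "spatial_depth": "depth",
    "architectural_lines": "mlsd",
}


@dataclass(frozen=True)
class ControlledJob:
    prompt: str
    control_mode: str
    control_image: str  # path to the pose skeleton, edge map, or depth map


def plan_variants(goal: str, control_image: str, prompts: List[str]) -> List[ControlledJob]:
    """Pair every prompt with the same structural input so the layout stays fixed."""
    if goal not in MODE_FOR_GOAL:
        raise ValueError(f"unknown goal: {goal!r}")
    mode = MODE_FOR_GOAL[goal]
    return [ControlledJob(prompt, mode, control_image) for prompt in prompts]


jobs = plan_variants(
    "product_shape",
    "bottle_edges.png",
    ["matte red label, beach at sunset", "matte red label, studio backdrop"],
)
for job in jobs:
    print(job)
```

The prompts change the surface and setting, while the shared control image keeps the product outline in the same place across the set.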
The professional pipeline is almost never a single text-to-image generation. It's: reference → img2img (style transfer) → inpainting (fix problems) → ControlNet (lock structure for variants). Master this sequence and you control AI outputs rather than accept them.
You'll work through designing a complete generative production pipeline for a scenario requiring visual consistency across multiple assets. Your coach will help you decide which advanced techniques to use at each stage and why.
In February 2023, the US Copyright Office issued a decision holding that AI-generated images produced without sufficient human creative authorship are not eligible for copyright protection. The case involved Kris Kashtanova's graphic novel "Zarya of the Dawn," created with Midjourney. The Office granted copyright to the text and overall arrangement but cancelled protection for all individual AI-generated images within it. The ruling established the human authorship threshold that now governs every commercial AI image use in the United States.
The US Copyright Office's 2023 Zarya ruling established a principle that has since been echoed in guidance from copyright offices in the EU, UK, and Australia: copyright protection requires human creative expression. Entering a text prompt and accepting a generated result does not meet this threshold. The copyright question hinges on how much human creative decision-making shaped the final output.
Factors that may establish stronger authorship claims: significant post-production editing in Photoshop or Illustrator; extensive use of img2img with a designer's own photography as reference; detailed ControlNet conditioning that reflects specific creative decisions; or substantial selection and arrangement of multiple AI elements into a designed composition.
This area of law is actively evolving and varies by jurisdiction. US guidance as of 2025 suggests that works where AI generation is merely a component of a larger human-authored design work are more defensible than works where the AI output is used without modification.
Currently: pure AI-generated images (prompt → accept → deliver) are unlikely to be copyright-protected in the US. Your own post-production additions may be copyrightable separately. The underlying AI output itself is in a contested legal space. Always consult your legal team before claiming copyright in AI-assisted commercial work.
Parallel to copyright ownership questions, major litigation is underway about whether training AI models on copyrighted images constitutes infringement. In January 2023, three visual artists filed a US class action against Stability AI, Midjourney, and DeviantArt, and Getty Images brought its own claims against Stability AI. Getty's UK case, filed in January 2023, alleged that Stability AI scraped and used 12 million Getty images without licence. As of 2025 these cases remain unresolved but have produced two immediate practical effects.
First, Adobe Firefly's "trained on licensed data" proposition became a significant commercial differentiator — prompting enterprises including Publicis Groupe and WPP to mandate Firefly for client deliverables requiring legal clarity. Second, Midjourney and Stability AI both added opt-out mechanisms for creators whose work appeared in training data, and several jurisdictions are legislating mandatory opt-out rights.
Each platform grants different usage rights through its Terms of Service, independent of copyright law:
| Platform | Commercial Use | Ownership Grant | Key Restriction |
|---|---|---|---|
| Midjourney (paid) | Yes (paid plans) | Non-exclusive licence; generated images not copyright-protected | Enterprise plan required for companies >$1M revenue |
| DALL·E 3 (OpenAI) | Yes | OpenAI assigns all rights to user; no OpenAI claim on outputs | Cannot use to deceive or misrepresent |
| Stable Diffusion | Yes (CreativeML OpenRAIL-M licence) | User owns outputs; model itself is licensed not sold | Cannot use for specific prohibited applications |
| Adobe Firefly | Yes | User retains full rights; Adobe IP indemnification applies | Must have active Creative Cloud subscription |
Beyond legal compliance, professional designers face ethical questions that law has not yet addressed: disclosure (should clients know AI was used?), representation (who should be depicted in AI imagery, and with what care?), displacement (what obligations exist toward illustrators and photographers whose work may have trained the models?), and environmental cost (large-scale AI image generation carries significant compute energy costs).
Several major design industry bodies — including the AIGA and the Design Council UK — published AI ethics frameworks in 2023–2024. These consistently recommend: proactive client disclosure of AI tool use; intentional representation review of AI outputs before publication; and support for opt-out mechanisms for living artists.
For client commercial deliverables requiring legal certainty: use Adobe Firefly and document it. For internal ideation or work where copyright ownership is not critical: any platform is workable under its ToS. For large enterprise accounts: understand the revenue thresholds in Midjourney's Enterprise plan. And for all AI work: build disclosure into your client relationship from the start — the law will catch up, and transparency protects you.
You'll analyse specific scenarios involving copyright, platform terms, and ethical decisions in AI image use. Your coach will present situations that require you to apply what you've learned about the Zarya ruling, platform ToS differences, and industry ethics frameworks.