When Ghostwriter977 posted "Heart on My Sleeve" to TikTok and YouTube in April 2023, the track sounded indistinguishable from a collaboration between Drake and The Weeknd. Neither artist had touched it. Within 72 hours the song accumulated millions of streams, triggered DMCA takedowns by Universal Music Group, and sparked a congressional inquiry into AI-generated music. The track was built using voice-cloning AI layered over a produced beat — a workflow that, by mid-2024, any creator could replicate with consumer tools.
The incident crystallised a question the music industry had been quietly avoiding: when AI can reproduce any artist's voice and style with high fidelity, what counts as original work?
Text-to-music platforms such as Suno (launched publicly December 2023) and Udio (April 2024) accept a natural-language prompt and return a complete audio production — vocals, instrumentation, mixing — in under a minute. The underlying models are trained on vast corpora of recorded music and learn to associate stylistic descriptors ("upbeat lo-fi hip hop, melancholic lyrics about city lights") with corresponding audio features.
Suno's architecture conditions a diffusion-based audio model on both text embeddings and a learned musical "grammar" that handles structure (verse, chorus, bridge). Udio uses a similar approach but also lets users export stems (isolated vocal or instrumental tracks), giving producers more flexibility for downstream editing in a DAW.
In June 2024, the Recording Industry Association of America (RIAA) filed lawsuits against both Suno and Udio on behalf of the major labels, alleging that their training datasets contained copyrighted recordings used without licence. The cases remained active through 2025 and are expected to set precedent for the entire generative audio space.
In June 2024 the RIAA, representing major labels including Universal, Sony, and Warner, sued Suno and Udio, seeking statutory damages of up to $150,000 per infringed work. The suits allege the models were trained on the labels' catalogues without permission or payment, a legal theory similar to the one raised against image generators in the 2023 Getty Images v. Stability AI case.
What these tools do well: rapid prototyping of musical ideas, generating background music for video or podcast content, exploring genre combinations that would require a large session band to test conventionally, and producing royalty-free commercial music for small creators who cannot afford licensing fees.
Persistent limitations: outputs frequently exhibit what engineers call "hallucinated lyrics" — phonetically plausible syllables that mean nothing. Long-form coherence remains weak; a four-minute track generated in a single pass often loses its harmonic or rhythmic identity mid-way. Live-instrument nuance — the expressive imperfections that make a jazz piano recording feel human — is still largely absent.
Producers at studios including Hypnosis Music (London) have reported using Suno and Udio as demo sketch tools — generating a reference track to communicate a brief to session musicians, rather than as a final deliverable. This "AI as translator" workflow has become common in advertising music houses.
For commercial work in 2024–2025, AI-generated music is most defensible legally when the human creator writes all lyrics, edits the arrangement in a DAW, re-records key elements with live instruments, and uses the AI output strictly as a structural scaffold. This preserves both copyright protection and artistic authorship.
You're helping a small documentary filmmaker find background music for a 3-minute emotional scene set in a rain-soaked city at night. You have access to a text-to-music tool but no budget for licensing. In this lab, you'll work with an AI assistant to craft and refine prompts, understand what makes a music generation prompt effective, and think through the legal and creative trade-offs involved.
When the Screen Actors Guild–American Federation of Television and Radio Artists (SAG-AFTRA) went on strike in July 2023, one of its central demands was a prohibition on studios using AI voice clones of actors without consent and compensation. The union had documented cases where production companies had cloned voice performances from existing recordings to generate new lines, avoiding re-hire costs entirely. The strike lasted nearly four months and ended in November 2023 with a deal that included AI provisions, marking the first major collective bargaining agreement to address generative voice AI.
The strike forced the entire audio-production industry to confront a question that podcast producers, audiobook narrators, and radio talent had been asking quietly: if a studio can clone your voice once, do they ever need to hire you again?
ElevenLabs, founded in 2022, emerged as the dominant consumer voice-cloning platform. By early 2024 it offered "Instant Voice Cloning" — upload 30 seconds of audio, receive a model that replicates timbre, pacing, and emotional inflection. The platform was implicated in a January 2024 incident where a robocall using a cloned voice of President Biden urged New Hampshire Democratic primary voters not to vote. ElevenLabs terminated the responsible account within hours and tightened identity verification.
Descript took a different approach, marketing "Overdub" — a voice clone feature embedded in a podcast editing environment. Overdub requires the user to record themselves reading a consent script before any clone is activated, creating at least a procedural consent layer. The feature was designed explicitly for podcasters who want to fix spoken errors by typing corrections rather than re-recording.
Adobe Podcast (now part of Adobe Premiere) added "Enhance Speech" in 2023 — not cloning, but AI-powered audio clean-up that removes background noise, room reverb, and microphone imperfections from any recording. This tool democratised broadcast-quality audio for creators with consumer microphones and home studios.
A political operative used ElevenLabs to clone President Biden's voice and distribute a robocall to New Hampshire voters discouraging primary participation. The Federal Communications Commission subsequently proposed new rules requiring disclosure of AI-generated audio in political advertising. The case became a landmark in AI misuse documentation.
For independent podcast producers, AI has introduced four genuinely transformative capabilities: automated transcription (tools like Whisper from OpenAI achieve near-human accuracy on clear speech), noise removal (Adobe Enhance Speech, Auphonic), filler-word deletion (Descript's "remove filler words" feature scans transcripts and cuts "um," "uh," and "like" without manual editing), and AI-generated chapter summaries and show notes (Riverside.fm's AI summary feature was adopted by over 10,000 shows by late 2023).
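The filler-word step can be illustrated with a minimal sketch. It assumes the transcript arrives as a list of words with start and end timestamps, which is a common output shape for speech-to-text tools; the `Word` class, the `FILLERS` set, and `keep_segments` are hypothetical names for illustration and do not reflect how Descript implements its feature.

```python
from dataclasses import dataclass

# Hypothetical filler list. "like" is left out on purpose: it is often a
# legitimate word, so a real tool needs context to decide when to cut it.
FILLERS = {"um", "uh"}


@dataclass
class Word:
    text: str
    start: float  # seconds from the start of the recording
    end: float    # seconds from the start of the recording


def keep_segments(words, fillers=FILLERS):
    """Return (start, end) spans of audio to keep once fillers are dropped.

    Consecutive kept words are merged into a single span so that the
    editor makes as few cuts as possible.
    """
    segments = []
    previous_kept = False
    for w in words:
        is_filler = w.text.lower().strip(".,!?") in fillers
        if is_filler:
            previous_kept = False  # the next kept word starts a new span
            continue
        if previous_kept:
            start, _ = segments[-1]
            segments[-1] = (start, w.end)  # extend the current span
        else:
            segments.append((w.start, w.end))
        previous_kept = True
    return segments


transcript = [
    Word("So", 0.0, 0.3),
    Word("um,", 0.3, 0.7),
    Word("the", 0.7, 0.9),
    Word("case", 0.9, 1.4),
]
print(keep_segments(transcript))
```

The returned spans would then be handed to an audio editor or a tool such as ffmpeg to cut and join; in practice a short crossfade at each cut helps avoid audible clicks.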
Larger productions use AI differently. Spotify announced in September 2023 that it would use AI voice translation to dub popular podcast episodes into Spanish, French, and German while preserving the original host's vocal character across languages. The pilot used OpenAI's voice generation technology and was tested with podcasts by Lex Fridman, Dax Shepard, and Steven Bartlett.
Ethical voice-cloning practice requires: (1) explicit recorded consent from the voice owner before any clone is created, (2) clear disclosure to audiences when a cloned voice is used, (3) a defined scope — the clone may not be used for purposes the owner did not agree to, and (4) compensation terms negotiated in advance. SAG-AFTRA's 2023 AI agreement codified versions of all four requirements.
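The four requirements above can be treated as a pre-flight checklist. The following is a minimal sketch of that idea; `VoiceConsent` and `missing_requirements` are hypothetical names, and the check is a record-keeping aid rather than a statement of what any contract or law requires.

```python
from dataclasses import dataclass, field


@dataclass
class VoiceConsent:
    recorded_consent: bool          # (1) explicit recorded consent exists
    audience_disclosure: bool       # (2) listeners will be told a clone is used
    allowed_uses: set = field(default_factory=set)  # (3) agreed scope
    compensation_agreed: bool = False               # (4) terms set in advance


def missing_requirements(consent, intended_use):
    """List which of the four requirements are unmet for the intended use."""
    problems = []
    if not consent.recorded_consent:
        problems.append("no explicit recorded consent")
    if not consent.audience_disclosure:
        problems.append("no audience disclosure planned")
    if intended_use not in consent.allowed_uses:
        problems.append(f"use '{intended_use}' is outside the agreed scope")
    if not consent.compensation_agreed:
        problems.append("compensation terms not agreed")
    return problems


consent = VoiceConsent(
    recorded_consent=True,
    audience_disclosure=True,
    allowed_uses={"corrections"},
    compensation_agreed=True,
)
print(missing_requirements(consent, "corrections"))
print(missing_requirements(consent, "new episodes"))
```

An empty list means all four boxes are ticked for that use. The second call shows the scope requirement doing its job: consent given for corrections does not stretch to cover new episodes.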
You host an independently produced true-crime podcast. Your lead narrator — who recorded 40 episodes with you — has moved abroad and is unavailable for re-recording. You're considering using an ElevenLabs voice clone to record corrections and new episodes. In this lab, explore the ethical and legal dimensions of that decision with the AI assistant, including consent requirements, disclosure obligations, and what SAG-AFTRA's framework says about independent creators.
The Writers Guild of America's May–September 2023 strike produced landmark AI provisions that, while aimed at screenwriting, set a template the film music industry watched closely. Simultaneously, Hollywood composers' union Local 47 began negotiating AI provisions of its own, after reports emerged that at least two major streaming productions had used AI-generated score elements — synthesised from existing soundtracks — without composer knowledge or credit. The productions involved adaptive score tools that auto-generated ambient underscore for scenes the human composer had not been contracted to score.
The incidents illustrated a particular risk in AI scoring: the technology is most likely to displace work at the margins, namely the incidental, atmospheric cues that make up the bulk of a score's runtime and, because backend royalties generally track the minutes of music used, earn composers the majority of those royalties.
AIVA (Artificial Intelligence Virtual Artist), founded in Luxembourg in 2016, was among the first AI composers to receive performing rights society registration. By 2023 it had been used on over 300 commercial game soundtracks and was the preferred AI scoring tool at several mid-tier game studios seeking rapid iteration of ambient music without session musician costs. AIVA allows users to set key, time signature, tempo, and instrumentation, then generates multi-part MIDI arrangements that can be exported to any DAW.
Mubert operates differently — instead of composing new pieces, it uses a library of AI-generated stems tagged by mood and energy to assemble real-time adaptive music. Game developers integrate Mubert's API directly into their game engines, allowing the soundtrack to respond to player state without pre-composed trigger cues. Several indie games released in 2023–2024 used Mubert for their entire ambient layers.
Meta's AudioCraft (released open-source August 2023) included MusicGen — a model capable of generating short musical passages from text and reference audio. Meta's decision to open-source the model meant that within weeks, developers had fine-tuned versions for specific genres: film noir orchestral, chiptune, and cinematic trailer music, among others.
Traditional game music is composed in discrete loops triggered by game state — a system composers have been paid to create since the 1980s. Adaptive AI scoring replaces the trigger-based loop architecture with real-time generation that responds continuously. This eliminates not just the composition fee but the ongoing royalties composers earn each time a loop is used in a shipped game.
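The architectural difference can be sketched in a few lines. This is an illustration only: the loop file names, layer names, and gain curves are invented, and the continuous version mixes pre-made layers rather than generating audio, so it shows the control flow of a continuously responding score and not any vendor's API.

```python
# Hypothetical pre-composed loops keyed by discrete game state.
LOOPS = {"explore": "explore_loop.ogg", "combat": "combat_loop.ogg"}


def trigger_based(state):
    """Classic approach: a discrete game state selects one composed loop."""
    return LOOPS[state]


def continuous_mix(tension):
    """Continuous approach: map tension in [0, 1] to per-layer gains.

    The score changes smoothly with the input instead of switching
    between a fixed set of loops.
    """
    t = min(1.0, max(0.0, tension))  # clamp out-of-range values
    return {
        "pads": round(1.0 - 0.5 * t, 2),          # fade down as tension rises
        "percussion": round(t, 2),                # fade up with tension
        "dissonant_strings": round(max(0.0, (t - 0.5) * 2.0), 2),  # top half only
    }


print(trigger_based("combat"))
print(continuous_mix(0.5))
```

In the trigger-based design every possible output is a loop someone composed. In the continuous design the game supplies a number each frame and the audio system decides what the player hears.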
Professional film composers interviewed by The Hollywood Reporter in mid-2023 consistently identified thematic development — the craft of writing a musical idea and then transforming it across a film's emotional arc — as the domain AI tools cannot yet replicate. A leitmotif that starts as a tender piano phrase in Act One and resurfaces as a full orchestral swell during the climax requires narrative understanding that present AI systems lack.
Sound designers have found a different balance. AI tools like Soundraw and Boomy handle generative background texture efficiently, freeing human sound designers to focus on the distinctive, character-defining sounds that make a film or game's audio world feel original — creature vocalisations, unique environmental reverbs, weapon sounds that reflect character personality.
The practical reality for working composers in 2024 is a two-tier market: AI handles ambient and functional music at scale; human composers are contracted for thematic content, emotional peak moments, and anything requiring narrative coherence across more than a few minutes.
On AIVA's Pro plan, copyright in the generated compositions is assigned to the user rather than retained by AIVA; on lower tiers AIVA keeps ownership and grants the user a licence. This is legally distinct from outputs generated by Suno or Udio, where copyright ownership remains contested pending the RIAA litigation outcomes.
You're the solo developer of an indie horror-adventure game. Your total music budget is $500. You need: an atmospheric ambient layer that responds to player tension, a distinctive main theme, and five short event stings (discovery, danger, safety, puzzle solved, death). In this lab, work with the AI assistant to design a realistic audio strategy that combines AI tools with your budget constraints — and think through the copyright implications of each tool choice.
At a 2023 industry summit documented by Music Week, Radiohead's Thom Yorke drew an explicit comparison between generative AI music tools and the Napster era: "The argument is exactly the same — it's someone else taking your life's work, packaging it, and offering it for free or almost free." The comparison was apt in structure if not in detail. Napster distributed existing recordings without payment; AI tools consume existing recordings as training data — also without payment — and produce new outputs that compete directly with the recordings they were trained on.
The distinction matters legally. Napster's liability was clear: distribution of copyrighted files. AI training-data liability is contested: the industry must establish whether training constitutes infringement, whether outputs are "substantially similar" to training data, and whether fair use applies at the scale of billions of audio samples. Those questions were actively before federal courts as of mid-2025.
In March 2023 the U.S. Copyright Office issued guidance stating that works generated entirely by AI, without sufficient human authorship, are not eligible for copyright protection. This creates a foundational tension: if AI-generated music cannot be copyrighted, creators who use it lose the ability to exclusively control or license their output.
However, the Copyright Office also acknowledged that human-AI collaborative works may be protectable to the extent of human authorship. A musician who writes all lyrics, directs the AI to generate an instrumental scaffold, then edits, arranges, and masters the result has contributed sufficient human authorship to likely secure copyright — though the exact threshold remains legally untested.
The European Union's AI Act, adopted in 2024, introduced transparency requirements that directly affect audio AI: providers of systems that generate synthetic audio must ensure the output is marked as artificially generated, and deepfake content that could be mistaken for authentic human-created material must be disclosed as such. This will affect how AI-generated music is labelled on streaming platforms for EU audiences.
Traditional music royalties flow through two channels: the master recording (owned by labels or artists) and the composition (owned by songwriters and publishers). When AI generates both the composition and the "recording" simultaneously, neither channel has a clear rights-holder. Streaming platforms in 2024 began requiring AI-generated track disclosures, but royalty routing for such tracks remained unresolved — meaning some AI-generated tracks were earning streaming revenue with no established mechanism for distributing it.
Universal Music Group's "artist-centric" model, announced October 2023, proposed that streaming platforms prioritise royalty payments for tracks with demonstrated human engagement metrics — a de facto disadvantage for AI-generated catalogue. Several major platforms, including Deezer, adopted variants of this model in 2024.
SoundCloud's AI music policy, updated February 2024, was among the first to explicitly distinguish between human-created, AI-assisted, and AI-generated tracks — and to route royalties differently for each category. AI-generated tracks without a discernible human author receive no performance royalties under the updated policy.
Licensing AI training data has emerged as a new revenue stream for rights-holders. In May 2024, Universal signed what was reported as a licensing agreement with an unnamed AI company to provide access to its catalogue for training purposes — a model some analysts predicted would become standard, creating a "music training data market" analogous to stock photography.
In 2024–2025, the safest approach for commercial audio work: (1) use AI for structural scaffolding, not final output; (2) document every human creative decision you make on top of AI output; (3) disclose AI involvement to any platform, client, or publisher that asks — most now have explicit policies; (4) avoid prompting AI to reproduce the style of a specific named artist, as this is the most legally exposed territory; and (5) check the specific copyright terms of whichever tool you use — AIVA, Mubert, and Soundraw handle ownership differently from Suno and Udio.
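Point (2), documenting human creative decisions, is easy to put into practice with a simple project log. The sketch below is one possible shape for such a log; the `Contribution` fields and helper names are hypothetical, and keeping a log like this is a record-keeping habit, not a guarantee of copyright protection.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class Contribution:
    step: str      # what was done
    made_by: str   # "human" or "ai"
    tool: str      # where it was done
    note: str = "" # free-text detail worth remembering later


def human_steps(log):
    """Return the names of the steps a human performed."""
    return [c.step for c in log if c.made_by == "human"]


def to_json(log):
    """Serialise the log so it can be stored alongside the project files."""
    return json.dumps([asdict(c) for c in log], indent=2)


log = [
    Contribution("lyrics", "human", "text editor", "all lyrics written by hand"),
    Contribution("instrumental scaffold", "ai", "text-to-music tool", "prompt saved with project"),
    Contribution("arrangement edit", "human", "DAW", "cut bridge, re-recorded guitar"),
]
print(human_steps(log))
print(to_json(log))
```

Saving the JSON next to the session files gives you something concrete to show a platform, client, or publisher who asks how the track was made.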
You run a small YouTube channel (95,000 subscribers) that publishes weekly video essays. A brand has approached you for a sponsored video and wants original background music that "sounds like Hans Zimmer's Interstellar score" — specifically because they love the emotional tone. You're considering using Suno to generate it. In this lab, work through the legal, ethical, and practical dimensions: what's the risk of that specific request? What alternatives exist? How should you disclose AI involvement to the brand and to YouTube?