Module 6 · Lesson 1

How AI Listens: Real-Time Audio Analysis in Live Shows

Before AI can shape sound, it must understand it: analysing pitch, dynamics, and space as the performance unfolds.
What does it actually mean for an AI system to "hear" a live performance, and how does that capability change what sound engineers can do?

At the 2019 Eurovision Song Contest in Tel Aviv, the production team used automated audio analysis across 26 competing broadcasts simultaneously, monitoring loudness levels, frequency balance, and dynamic range in real time across every feed. The system flagged deviations from EBU R128 loudness standards instantly, allowing engineers to intervene before viewers at home noticed anything wrong. This was not a futuristic experiment; it was a logistical necessity at broadcast scale.

The Anatomy of Real-Time Audio Analysis

Real-time audio analysis refers to the computational process of examining an incoming audio signal and extracting meaningful data about it with a latency low enough to influence the same performance being measured. In live production contexts, "real time" typically means a processing delay under 10 milliseconds, small enough that no audience member would perceive it as lag.
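To make the 10-millisecond figure concrete, the delay added by one block of buffered digital audio can be estimated from the buffer size and the sample rate. The sketch below is only an illustration: the function names, the 48 kHz sample rate, and the two-stage chain are assumptions, not details of any system named in this lesson.

```python
def buffer_latency_ms(buffer_frames: int, sample_rate_hz: int) -> float:
    """Time needed to fill one buffer of audio, in milliseconds."""
    return 1000.0 * buffer_frames / sample_rate_hz


def chain_latency_ms(buffer_frames: int, sample_rate_hz: int, stages: int) -> float:
    """Rough total delay if the signal passes through `stages` buffered steps."""
    return stages * buffer_latency_ms(buffer_frames, sample_rate_hz)


if __name__ == "__main__":
    BUDGET_MS = 10.0  # the "real time" ceiling quoted in the lesson
    for frames in (64, 128, 256, 512):
        # Assumed: 48 kHz audio and two buffered stages (analysis, then processing).
        total = chain_latency_ms(frames, 48_000, stages=2)
        verdict = "within" if total < BUDGET_MS else "over"
        print(f"{frames:>4} frames: {total:5.2f} ms ({verdict} budget)")
```

Under these assumptions, smaller buffers keep the chain inside the budget, at the cost of giving the analysis less audio to examine per block.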

The foundational measurements that AI-assisted systems track include loudness (LUFS/RMS), dynamic range, spectral balance (which frequency bands are dominant), transient detection (sharp attacks like snare hits or consonants), and harmonic content (what notes and chords are present). Each of these has been measurable by digital tools for decades; what AI adds is the ability to interpret these measurements contextually and act on them automatically.
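Several of the measurements listed above are simple to compute once the audio is available as numbers. The following sketch, which assumes samples are floats between -1.0 and 1.0, shows an RMS level, a crest factor (a rough proxy for dynamic range), and a naive transient detector. All function names and thresholds are illustrative; LUFS is deliberately left out because it requires the frequency weighting and gating defined in the broadcast loudness standards.

```python
import math


def rms_dbfs(samples):
    """RMS level in dB relative to full scale (1.0)."""
    mean_square = sum(s * s for s in samples) / len(samples)
    if mean_square == 0.0:
        return float("-inf")
    return 10.0 * math.log10(mean_square)


def peak_dbfs(samples):
    """Highest absolute sample value, in dB relative to full scale."""
    peak = max(abs(s) for s in samples)
    return float("-inf") if peak == 0.0 else 20.0 * math.log10(peak)


def crest_factor_db(samples):
    """Peak-to-RMS ratio in dB: high for punchy material, low for dense material."""
    return peak_dbfs(samples) - rms_dbfs(samples)


def detect_transients(samples, frame_size=64, jump_db=12.0, floor_db=-60.0):
    """Return start indices of frames whose RMS jumps by at least `jump_db`
    over the previous frame. The threshold values are hypothetical."""
    onsets = []
    previous = None
    for start in range(0, len(samples) - frame_size + 1, frame_size):
        level = max(rms_dbfs(samples[start:start + frame_size]), floor_db)
        if previous is not None and level - previous >= jump_db:
            onsets.append(start)
        previous = level
    return onsets


if __name__ == "__main__":
    sine = [math.sin(2 * math.pi * 1000 * n / 48_000) for n in range(480)]
    print(f"sine RMS: {rms_dbfs(sine):.2f} dBFS, crest: {crest_factor_db(sine):.2f} dB")
    burst = [0.0] * 256 + [0.5] * 256  # silence followed by a sudden attack
    print("transient frames start at:", detect_transients(burst))
```

What the lesson calls the AI layer sits on top of numbers like these: it interprets them in context rather than measuring anything new.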

Modern systems such as iZotope's DDLY dynamic delay and Eventide's H9 with machine learning-based algorithms can identify the character of an incoming signal (is this a dry vocal, a wet guitar, a kick drum transient?) and adjust processing parameters accordingly without a human turning a knob. The AI is not guessing; it is pattern-matching against training data drawn from thousands of previously labelled audio examples.

Real Case: Radiohead's 2016 Tour

FOH engineer Chris Kent used a combination of spectral analysis plug-ins running in Avid VENUE to monitor the interaction between Thom Yorke's vocal and the band's dense layered synth textures in real time. The system provided visual and audio alerts when specific frequency ranges became overloaded, allowing Kent to make EQ decisions before feedback or muddiness reached the audience. He described it as "having a second pair of ears that never gets fatigued."

Spectral Analysis and Feedback Detection

Feedback, that piercing squeal when a microphone picks up its own amplified signal, is one of the most disruptive events in live audio. Traditional solutions relied on engineers manually sweeping frequencies with a graphic EQ and listening. AI-assisted feedback suppression systems, such as those built into Shure's Axient Digital wireless system and dbx DriveRack processors, now detect the early signatures of feedback (a rapid, narrow-band loudness spike with a self-reinforcing harmonic pattern) and apply a narrow notch filter in milliseconds, before the feedback reaches an audible level.
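The notch filter mentioned above can be illustrated with a standard second-order (biquad) design. This sketch uses the widely published Audio EQ Cookbook notch formulas; it is a generic textbook filter, not the algorithm inside any of the products named here, and the 2.5 kHz centre frequency and Q of 30 are assumed values.

```python
import cmath
import math


def notch_coefficients(center_hz, q, sample_rate_hz):
    """Biquad notch coefficients (Audio EQ Cookbook form), normalised so a[0] == 1."""
    w0 = 2.0 * math.pi * center_hz / sample_rate_hz
    alpha = math.sin(w0) / (2.0 * q)
    cos_w0 = math.cos(w0)
    a0 = 1.0 + alpha
    b = (1.0 / a0, -2.0 * cos_w0 / a0, 1.0 / a0)
    a = (1.0, -2.0 * cos_w0 / a0, (1.0 - alpha) / a0)
    return b, a


def magnitude_db(b, a, freq_hz, sample_rate_hz):
    """Filter gain in dB at one frequency, evaluated on the unit circle."""
    z_inv = cmath.exp(-2j * math.pi * freq_hz / sample_rate_hz)
    numerator = b[0] + b[1] * z_inv + b[2] * z_inv ** 2
    denominator = a[0] + a[1] * z_inv + a[2] * z_inv ** 2
    magnitude = abs(numerator / denominator)
    return float("-inf") if magnitude == 0.0 else 20.0 * math.log10(magnitude)


if __name__ == "__main__":
    # Hypothetical feedback frequency of 2.5 kHz at a 48 kHz sample rate.
    b, a = notch_coefficients(2500.0, q=30.0, sample_rate_hz=48_000)
    for freq in (1000.0, 2400.0, 2500.0, 2600.0, 5000.0):
        print(f"{freq:7.0f} Hz: {magnitude_db(b, a, freq, 48_000):8.2f} dB")
```

A higher Q narrows the cut, which is why a well-placed notch can remove a ringing frequency while leaving nearby musical content largely untouched.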

The distinction from older "automatic feedback suppressors" is subtle but important: older systems simply detected high-loudness spikes and cut them. AI-trained systems distinguish between a loud transient (a rim shot, a shout) and true feedback onset by analysing the rate of change and spectral narrowness of the signal. Fewer false cuts mean less tonal damage to the sound.
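The two cues named above, rate of change and spectral narrowness, can be sketched as a hand-written rule. This is only a stand-in for illustration: commercial systems are described as trained rather than hand-coded, and the function names, the 60% energy share, and the four-frame window below are all assumptions. Each frame is assumed to be a list of magnitudes, one per frequency bin.

```python
def peak_share(spectrum):
    """Return (index of strongest bin, fraction of total energy in that bin)."""
    energies = [m * m for m in spectrum]
    total = sum(energies)
    if total == 0.0:
        return 0, 0.0
    peak_bin = max(range(len(energies)), key=energies.__getitem__)
    return peak_bin, energies[peak_bin] / total


def looks_like_feedback(frames, min_share=0.6, min_frames=4):
    """True if the last `min_frames` spectra show the same narrow dominant bin
    and that bin keeps growing from frame to frame."""
    if len(frames) < min_frames:
        return False
    recent = frames[-min_frames:]
    peaks = [peak_share(frame) for frame in recent]
    first_bin = peaks[0][0]
    # Spectral narrowness: one bin must hold most of the energy in every frame.
    if any(b != first_bin or share < min_share for b, share in peaks):
        return False
    # Rate of change: the level must rise steadily (self-reinforcing build-up).
    levels = [frame[first_bin] for frame in recent]
    return all(later > earlier for earlier, later in zip(levels, levels[1:]))


if __name__ == "__main__":
    quiet = [0.01] * 16
    ringing = [quiet[:5] + [level] + quiet[6:] for level in (0.2, 0.4, 0.8, 1.6)]
    rim_shot = [quiet, [0.5] * 16, [0.25] * 16, [0.1] * 16]  # broadband, decaying
    print("ringing tone ->", looks_like_feedback(ringing))
    print("rim shot     ->", looks_like_feedback(rim_shot))
```

A detector that only compared level against a threshold would flag both examples; checking narrowness and growth is what lets the rim shot pass untouched.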

A documented example: at the 2018 NFL Super Bowl halftime show with Justin Timberlake, the RF coordination team used Shure Wireless Workbench with automated spectrum scanning to monitor over 70 simultaneous wireless channels across the arena, detecting interference and rerouting frequencies automatically during the live broadcast.

Key Terms
LUFS: Loudness Units relative to Full Scale. The standard measurement for perceived loudness in broadcast and live audio, replacing older peak-based methods.
Spectral Analysis: Breaking an audio signal into its component frequencies to see which parts of the spectrum are active and at what levels.
Notch Filter: A very narrow EQ cut targeting a specific frequency, used to remove feedback without significantly altering overall sound quality.
Transient Detection: Identifying the sharp attack portion of a sound (the initial burst of energy before a note sustains), which is critical for distinguishing instruments and avoiding false-positive feedback suppression.
Why This Matters for Performers

Understanding how AI listens helps performers and directors make better decisions in rehearsal: placement of microphones, choice of monitoring configurations, and even staging positions all interact with what AI systems can reliably detect. A performer who walks upstage and turns away from a front-fill speaker changes the spectral signature of their microphone, and a well-configured AI analysis system can flag that change when the engineer might otherwise miss it.

Lesson 1 Quiz

How AI Listens: check your understanding
1. In a live audio context, what does "real time" processing typically mean in terms of latency?
Correct. Sub-10ms latency means no audience member perceives the processing delay as lag in the performance.
Not quite. Real-time audio processing in live contexts aims for under 10ms; anything above that becomes perceptible and disruptive.
2. What distinguishes AI-assisted feedback suppression from older automatic feedback suppressors?
Correct. By distinguishing feedback onset from loud transients like rim shots, AI systems cause fewer false cuts and less tonal damage.
Incorrect. The key difference is that AI analyses the character of the loudness spike, not just its level, to avoid cutting real musical content.
3. Which wireless system was documented as managing over 70 simultaneous wireless channels at the 2018 Super Bowl halftime show?
Correct. Shure's Wireless Workbench with automated spectrum scanning managed the complex RF environment of that production.
The lesson documented Shure Wireless Workbench with automated spectrum scanning at that Super Bowl production.
4. What is LUFS, and why did it replace older peak-based loudness standards?
Correct. LUFS accounts for how humans actually perceive loudness over time, rather than just measuring instantaneous peak values.
LUFS is a loudness measurement standard β€” it measures perceived loudness over time, which is more useful than simple peak levels for broadcast and live work.

Lab 1: Real-Time Audio Analysis

Discuss AI listening systems with your AI assistant

Lab Brief

In this lab you'll explore how AI real-time audio analysis applies to a specific live performance context. Discuss the technical concepts with your AI assistant, ask about real tools, and think through how these systems would function in a show you know or are working on.

Starter prompt: "I'm a sound designer for a mid-sized touring musical production. Walk me through how AI real-time spectral analysis could help me manage feedback risks during the show."
Module 6 · Lesson 2

Adaptive Mixing: AI That Responds to the Performance

From automated gain riding to machine-learning mix assistants: how AI is beginning to share the console with human engineers.
When an AI system adjusts a vocal fader in response to a performer's dynamics, is it augmenting human craft or replacing it?

At the 2022 Glastonbury Festival, the production team behind Billie Eilish's headline set used iZotope's RX 10 and auxiliary AI-driven gain-riding tools integrated into an Avid S6L console. According to the festival's technical production reports, the system was configured to maintain vocal intelligibility across the 100,000-capacity Pyramid Stage field, automatically compensating for Eilish's characteristic use of extreme dynamic range (from near-whispered verses to full-voice choruses) in real time without constant manual fader moves from the FOH engineer.

What Adaptive Mixing Actually Does

Adaptive mixing refers to audio systems that automatically adjust mix parameters (fader levels, EQ, compression ratios, reverb sends) in response to incoming audio content. The "adaptation" can be rule-based (if level exceeds threshold X, apply gain reduction Y) or model-based (a trained neural network predicts the appropriate setting based on learned examples of good mixes).

The most commercially mature form of adaptive mixing is automatic gain control (AGC), which has existed in broadcast for decades. But AI brings two significant upgrades: the ability to distinguish between different sound sources sharing a microphone (a singer speaking vs. singing, for instance) and the ability to consider multiple channels simultaneously and balance them as a system rather than individually.
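The rule-based form of adaptation (level exceeds threshold X, apply gain reduction Y) and a basic automatic gain rider can both be written in a few lines. The sketch below is a minimal illustration with assumed names and numbers: the -18 dB threshold, 4:1 ratio, -20 dB target, and 1 dB-per-frame step are hypothetical, and a model-based system would replace these fixed rules with a trained predictor.

```python
def rule_based_gain_db(level_db, threshold_db=-18.0, ratio=4.0):
    """Static rule: below the threshold do nothing; above it, let the output
    rise only 1 dB for every `ratio` dB of input. Returns gain change in dB."""
    if level_db <= threshold_db:
        return 0.0
    over = level_db - threshold_db
    return -(over - over / ratio)


def ride_gain(levels_db, target_db=-20.0, max_step_db=1.0):
    """Simple gain rider: nudge the gain toward whatever would bring each
    measured level to `target_db`, moving at most `max_step_db` per frame."""
    gain = 0.0
    gains = []
    for level in levels_db:
        desired = target_db - level
        step = max(-max_step_db, min(max_step_db, desired - gain))
        gain += step
        gains.append(gain)
    return gains


if __name__ == "__main__":
    print("gain at -6 dB input:", rule_based_gain_db(-6.0), "dB")
    # A whispered verse followed by a full-voice chorus (hypothetical levels).
    print("rider gains:", ride_gain([-32.0, -32.0, -32.0, -10.0, -10.0, -10.0]))
```

The step limit is the design choice that matters here: it stops the rider from snapping the fader on every syllable, which is the kind of behaviour a human engineer would also avoid.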

Yamaha's ProVisionaire platform and DiGiCo's SD-Rack both include scene-recall systems that can be triggered by timecode or MIDI, but newer hybrid AI systems go further: they do not just recall saved states but interpolate intelligently between them based on what they hear. If a performer sings longer than expected before the chorus, the system waits; it doesn't blindly execute a scene change at bar 32 regardless of what's happening musically.
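The difference between recalling a scene and interpolating toward it can be sketched as follows. This is an assumed, simplified model: scenes are plain dictionaries of fader levels in dB, the blend is linear, and the "listening" is reduced to a single flag saying whether the singer is still mid-phrase. None of these names or structures come from the consoles mentioned above.

```python
def interpolate_scenes(scene_a, scene_b, progress):
    """Blend two saved scenes; progress 0.0 gives scene_a, 1.0 gives scene_b."""
    t = min(1.0, max(0.0, progress))
    return {ch: (1.0 - t) * scene_a[ch] + t * scene_b[ch] for ch in scene_a}


def advance_progress(progress, vocal_phrase_active, step=0.25):
    """Hold the transition while the singer is still mid-phrase;
    otherwise move one step closer to the next scene."""
    if vocal_phrase_active:
        return progress
    return min(1.0, progress + step)


if __name__ == "__main__":
    verse = {"vocal": -5.0, "keys": -10.0, "drums": -12.0}   # hypothetical levels
    chorus = {"vocal": 0.0, "keys": -20.0, "drums": -6.0}
    progress = 0.0
    # The singer holds a note for two extra frames before the chorus lands.
    for phrase_active in (True, True, False, False, False, False):
        progress = advance_progress(progress, phrase_active)
        print(progress, interpolate_scenes(verse, chorus, progress))
```

A timecode-only system would jump to the chorus scene at a fixed bar; here the blend simply does not start until the phrase has ended.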

Real Case: Hamilton on Broadway

The original Broadway production of Hamilton (2015–present) uses a sophisticated monitor mixing system in which each cast member's in-ear monitor mix is partially automated. The system, documented in Sound on Sound's 2016 feature on the show, uses timecode-driven automation to pre-position faders based on scene, but the A1 and A2 engineers retain the ability to override in real time. The automation handles predictable elements (the orchestra balance, click distribution) while humans handle the unpredictable (a performer moving unexpectedly, ad-libs, audience interaction).

Machine Learning Mix Assistants

Beyond gain riding, AI mix assistants trained on large libraries of professionally mixed recordings are now commercially available. iZotope's Neutron 4 and Sonible's smart:comp 2 both use machine learning to analyse an incoming signal, compare it to target profiles, and suggest or apply EQ and compression settings automatically. While these tools are primarily used in studio recording, they are increasingly being tested in live contexts via laptop-based processing racks.

The conceptual shift is important: these systems do not apply a fixed formula. They learn what a "balanced mix" sounds like from thousands of examples, and then try to push incoming audio toward that learned ideal. A well-trained model will recognise that a muddy low-mid buildup on a piano in a small theatre needs different treatment than the same buildup in an arena.

Critics within the audio engineering community, including veteran FOH engineers like Monty Carlo (touring engineer, various major artists), have noted that AI mix assistants tend to produce competent-but-generic results. They excel at avoiding obvious mistakes but can suppress the idiosyncratic choices that define a great live mix. This tension between safety and artistry is central to the field's current debate about AI's role.

Key Terms
Gain Riding: Manually or automatically adjusting a channel's fader level over time to maintain consistent perceived loudness despite variations in performance dynamics.
AGC (Automatic Gain Control): A circuit or algorithm that reduces gain when signal levels rise too high and increases it when they fall too low; the rule-based predecessor to AI-driven adaptive mixing.
Scene Recall: The ability of a digital mixing console to instantly restore a saved collection of settings (fader positions, EQ, routing) on cue.
Timecode: A synchronisation signal (often SMPTE) that assigns a precise timestamp to each frame of a show, allowing automated systems to trigger events at exact moments.
The Human Judgment Layer

Every documented deployment of AI adaptive mixing in professional live production retains a human engineer with override authority. The AI handles volume; the human handles meaning. When Adele pauses unexpectedly before a chorus, the AI might misread the silence as an opportunity to normalise levels; the human engineer knows she is building drama and holds the mix. Knowing when not to act is a judgment that AI systems have not yet reliably demonstrated.

Lesson 2 Quiz

Adaptive Mixing: check your understanding
1. What is the key difference between rule-based adaptive mixing and model-based adaptive mixing?
Correct. Rule-based systems apply fixed thresholds; model-based systems learn from large libraries of professionally mixed audio and apply contextual judgment.
Not quite. Rule-based = fixed logic; model-based = learned from examples. The distinction matters for understanding what AI mixing actually does.
2. According to the lesson, what was the role of AI gain-riding tools in Billie Eilish's 2022 Glastonbury headline set?
Correct. Eilish's wide dynamic range, from whisper to full voice, made automated gain riding practically necessary for a 100,000-capacity outdoor venue.
The system was used to handle Eilish's extreme dynamic variation automatically, maintaining intelligibility at a 100,000-capacity site.
3. What criticism do some veteran engineers level at AI mix assistants like iZotope Neutron 4?
Correct. AI assistants tend to avoid mistakes well but may sand away the distinctive choices that make a legendary live mix.
The criticism documented in the lesson is about artistic mediocrity: AI mixes are safe but can lack the idiosyncratic decisions that make great live sound.
4. In Hamilton's Broadway production, what does the timecode-driven automation primarily handle versus what human engineers manage?
Correct. This division of labour (automation for predictable, humans for unpredictable) is the dominant model in professional live sound automation.
Hamilton's system divides labour: automation handles the reliable/predictable elements while engineers retain authority over anything unexpected.

Lab 2: Adaptive Mixing Strategies

Design an AI-assisted mixing workflow with your AI assistant

Lab Brief

In this lab, you'll work through the practical design of an AI-assisted mixing setup for a specific show type. Consider what the AI should automate, what should stay under human control, and how to configure the handoff between them.

Starter prompt: "I'm the A1 for a touring pop concert in 3,000-seat theatres. Help me design an AI-assisted mixing workflow that handles the predictable elements automatically while keeping artistic control where I need it."
Module 6 · Lesson 3

AI-Generated Music and Sound Design for Live Theatre

From generative underscores to adaptive soundscapes: how AI composition tools are entering the live theatre sound designer's toolkit.
When a theatre's soundscape responds generatively to what is happening on stage, where does sound design end and live composition begin?

In 2023, the UK-based theatre company Complicité collaborated with sound designer Gareth Fry on productions where generative audio tools were used to create evolving ambient textures that responded to dramatic rhythm rather than fixed timecode. Fry, speaking at the 2023 Theatre Sound Design symposium in London, described using Max/MSP with machine learning extensions to generate real-time sound environments that "listened" to cue calls and adapted their texture accordingly, producing different results each night while remaining structurally coherent.

Generative Audio in Live Contexts

Generative audio refers to sound that is created algorithmically in real time rather than played back from a fixed file. In live theatre, this approach addresses a longstanding challenge: recordings are static, but performances are not. A scene that runs longer one night because an audience is particularly reactive, or shorter because a performer makes a different choice, creates problems for a sound designer who has crafted a precise 47-second soundscape to accompany it.

Early generative theatre audio used randomised loops and probabilistic triggers in software like QLab or Max/MSP. AI-assisted generative audio goes further: it uses trained models to produce new audio content that fits stylistic constraints set by the designer (a specific harmonic palette, a textural density, a rhythmic feel) without cycling through the same patterns.
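The earlier, pre-AI approach of probabilistic triggers is easy to sketch, and it shows why generative audio copes with variable scene length. In the example below, each sound layer fires at random intervals around a designer-chosen average gap, for however long the scene runs. The layer names, gap values, and seeding scheme are all hypothetical; this is plain randomisation, not a trained model.

```python
import random

# Hypothetical layers: name -> average seconds between triggers.
LAYERS = {"wind_gust": 8.0, "distant_bell": 20.0, "low_drone_swell": 30.0}


def generate_events(duration_s, layers, seed):
    """Return a sorted list of (time_in_seconds, layer_name) trigger events.

    Each layer gets its own random stream derived from `seed`, so the same
    seed always produces the same texture, and a longer scene simply
    continues the pattern instead of reshuffling it."""
    events = []
    for name, mean_gap_s in layers.items():
        rng = random.Random(f"{seed}:{name}")
        t = rng.expovariate(1.0 / mean_gap_s)
        while t < duration_s:
            events.append((t, name))
            t += rng.expovariate(1.0 / mean_gap_s)
    return sorted(events)


if __name__ == "__main__":
    # The same scene on a night when it runs 47 seconds and one when it runs 60.
    for duration in (47.0, 60.0):
        events = generate_events(duration, LAYERS, seed="tuesday")
        print(f"{duration:.0f} s scene: {len(events)} triggers")
        for time_s, name in events:
            print(f"  {time_s:6.2f}s  {name}")
```

Changing the seed gives a different night's texture within the same density constraints, which is the property the AI-assisted tools extend to the audio content itself.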

Google's Magenta project has produced open-source tools, including MusicVAE, that allow sound designers to interpolate between two musical ideas (for instance, a tense string texture and a quiet ambient wash) and generate smooth transitions of variable length. Theatre sound designers can use these to create underscore that can extend or compress gracefully as a scene demands.

Real Case: Sleep No More (Punchdrunk)

Punchdrunk's long-running immersive production Sleep No More (running in New York since 2011 and Shanghai since 2016) uses a non-linear audio environment designed by Stephen Dobbie. While not fully AI-generated, the production's audio system uses zone-based adaptive playback that responds to audience density and movement data tracked by RFID cards given to each audience member. Louder zones become quieter as more people enter, maintaining an intimate atmosphere. This adaptive response to live data represents the operational logic that full AI generative systems now build on.
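The inverse relationship described above, where a zone gets quieter as it fills, reduces to a small mapping from occupancy to playback level. The sketch below is a guess at the general shape only: the half-decibel-per-person slope, the -12 dB floor, and the function name are invented for illustration and are not details of the production's actual system.

```python
def zone_gain_db(occupancy, base_db=0.0, db_per_person=-0.5, floor_db=-12.0):
    """Playback gain for one zone: start at `base_db`, drop as more people
    are counted in the zone, and never go below `floor_db`."""
    return max(floor_db, base_db + db_per_person * occupancy)


if __name__ == "__main__":
    # Hypothetical occupancy counts reported by the tracking system.
    for people in (0, 4, 10, 40):
        print(f"{people:>2} people -> {zone_gain_db(people):6.1f} dB")
```

The floor keeps a crowded room from falling silent, so the atmosphere is reduced rather than removed.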

AI Sound Design Tools for Theatre

Several tools have moved from experimental to practical use in professional theatre sound design in the past three years:

ElevenLabs and Resemble AI are being used to generate ambient voices, crowd sounds, and environmental audio layers that can be customised without requiring expensive field recording sessions. A sound designer who needs rain sounds with a specific spectral quality or distant crowd murmur at a specific density can generate these directly rather than searching a library.

Stable Audio (from Stability AI, released 2023) allows designers to generate short audio clips from text descriptions, such as "medieval tavern ambience, mid-frequency heavy, no modern sounds", with high enough quality for theatrical use as background layers. The tool does not replace custom design but accelerates the iteration process significantly.

Adobe Project Music Generative AI, demonstrated at Adobe MAX 2023, showed tools for generating looping underscore tracks that could vary in intensity over time based on simple parameter controls, which is directly applicable to theatrical underscore needs.

Critically, all of these tools in current professional deployment function as resources for human designers, not as autonomous composers. The designer still makes all structural and dramatic decisions; the AI generates raw material faster than conventional methods.

Key Terms
Generative Audio: Sound produced algorithmically in real time rather than played back from a fixed recording; it can vary each time it is triggered.
Underscore: Music or ambient sound played beneath dialogue or action in theatre to support emotional atmosphere without drawing attention to itself.
MusicVAE: A variational autoencoder model from Google Magenta that can interpolate between two musical sequences, generating smooth transitions of variable length.
RFID Tracking: Radio Frequency Identification, used in immersive theatre to track audience location data, enabling spatial audio systems to respond to where people actually are.
The Dramaturgy of Generative Sound

The most significant challenge in AI-generated theatre audio is not technical but dramaturgical: a generative system that produces beautiful, contextually appropriate sound every night is not necessarily producing sound that serves the specific dramatic arc of each performance. A human sound designer makes choices (silence here, texture there) that are part of a carefully considered interpretation of the text. AI generative systems optimise for coherence; human designers optimise for meaning. Those are different objectives.

Lesson 3 Quiz

AI-Generated Music and Sound Design: check your understanding
1. What core theatrical problem does generative audio address that fixed recordings cannot?
Correct. When a scene runs longer because of audience response, a fixed 47-second soundscape becomes either too short or too long; generative audio can adapt.
The core problem is duration mismatch: performances are variable, recordings are fixed. Generative audio can extend or compress to match.
2. What does Google Magenta's MusicVAE specifically enable for theatre sound designers?
Correct. MusicVAE lets designers define two musical endpoints (say, tension and calm) and generate transitions of whatever duration the scene requires.
MusicVAE interpolates between two musical states, generating transitions of variable length, which is directly useful for underscore that needs to flex with performance timing.
3. In Sleep No More (Punchdrunk), how does the audio system respond to audience behaviour?
Correct. This inverse relationship preserves intimacy: the system responds to live occupancy data rather than following a fixed playback schedule.
Sleep No More's system actually gets quieter as audiences enter zones, maintaining intimacy through inverse response to tracked occupancy data.
4. According to the lesson, how do current professional theatre sound designers primarily use AI generation tools?
Correct. AI accelerates material generation; the designer's interpretation, structure, and dramatic judgment remain human responsibilities.
In professional practice, AI tools generate material faster, but the designer makes all the choices about what to use, how, and why. The AI is not the designer.

Lab 3: Generative Sound Design

Develop a generative audio concept for a theatre production

Lab Brief

In this lab, you'll think through the design of a generative audio environment for a specific theatrical context. Consider the dramatic needs, the tools available, and how you would define the parameters within which AI-generated material operates.

Starter prompt: "I'm designing sound for a 75-minute one-person play about memory and grief. The running time varies by 5–8 minutes each night because the performer responds to the audience. How would I use generative audio tools to design a soundscape that works reliably despite this variation?"
Module 6 · Lesson 4

Ethics, Authorship, and the Future of AI Sound

Who owns a generative soundscape? What happens when AI replaces a musician's job? How do live performance communities navigate these questions now?
When an AI system generates the music that moves an audience to tears, who deserves credit, and what responsibilities come with that power?

In 2023, the American Federation of Musicians (AFM) made AI-generated music a central issue in Broadway contract negotiations. The union's contract with the Broadway League, concluded in March 2024, included specific language prohibiting the use of AI to replace live musicians in the orchestra pit without consent and compensation provisions. The AFM documented concerns that AI-generated backing tracks, already common in touring productions, were being used to reduce orchestra sizes, eliminating union jobs. This was not speculation; several touring productions had already reduced their live musician counts by 30–50% using pre-recorded and AI-assisted tracks.

The Labour Question

The most immediate and documented ethical challenge of AI in live music is economic displacement. When a touring production uses an AI-generated backing track instead of hiring a live rhythm section, specific musicians lose specific jobs. This is not a hypothetical future scenario; it is the present reality in mid-level touring production in the US and UK.

The scale of this displacement is tracked by the AFM and the UK's Musicians' Union (MU). The MU's 2023 survey of working musicians found that 38% of respondents had lost at least one booking in the prior year to a production that used recorded or AI-assisted backing tracks instead of live players. The touring circuit for covers bands and pit orchestras has been under pressure from pre-recorded backing tracks since the 1990s; AI-generated music accelerates the same trend.

Defenders of AI-assisted production argue that smaller productions using AI backing tracks are presenting shows that would otherwise not be economically viable at all, creating employment for performers, crew, and venue staff that would not exist if the show required a full live orchestra. This is a genuine tension, not a simple villain story: the AFM and MU both acknowledge that requiring full live orchestras in every production is not economically realistic and would eliminate more jobs total than it saves.

Real Case: West End AI Music Row 2023

In October 2023, a West End producer proposed replacing the live five-piece band in a mid-scale revival with an AI-generated adaptive backing track system. The Musicians' Union publicly opposed the plan and the production ultimately retained three live musicians while using AI-assisted elements for the rhythm section. The compromise was notable: it was not all-or-nothing, but a negotiated hybrid reflecting the genuine economic pressures on both sides.

Authorship and Copyright

When a sound designer uses Stable Audio to generate an ambient texture and incorporates it into a live show, who owns that texture? Current copyright law in the US and UK does not protect AI-generated output as such, because copyright requires human authorship. The designer who prompted and selected the output has a reasonable claim as the creative decision-maker, but the legal framework for this remains unsettled.

For live performance specifically, the authorship question intersects with collective authorship norms. Theatre sound design has always involved the designer, the director, the composer (if separate), and the performers all contributing to the final sound experience. AI-generated elements enter this collective authorship space without a clear role: they are not performers, not composers in the legal sense, and not employees of anyone in the production.

The Society of Motion Picture and Television Engineers (SMPTE) and Audio Engineering Society (AES) both have active working groups examining attribution standards for AI-assisted audio work. Their emerging consensus favours a disclosure model: productions should document and disclose where AI generation tools contributed to the final product, without necessarily changing ownership structures.

Consent and Voice Cloning

A specific and acutely contested area is AI voice cloning for live performance. Tools like ElevenLabs allow a convincing synthetic reproduction of any person's voice from a short sample. In 2023, several estates of deceased musicians (most prominently involving recordings associated with the late Amy Winehouse) faced public scrutiny when AI tools were used to generate new "performances" in their voices without clear licensing frameworks.

For live theatre, the issue arises when productions use AI-cloned voices for characters, narration, or even to extend the performance of a living actor across multiple productions simultaneously. The UK's Equity union and the US's SAG-AFTRA both negotiated AI voice consent provisions in 2023–2024 contracts, requiring explicit performer consent before any voice can be used to train or populate an AI voice model used in commercial production.

Key Terms
AFM: American Federation of Musicians, the US union representing professional musicians, including Broadway pit orchestras and touring production players.
Voice Cloning: Using AI to synthesise a convincing reproduction of a specific person's voice, capable of speaking or singing new content not originally recorded.
Disclosure Model: An attribution approach in which productions publicly document where AI tools contributed to the work, without necessarily changing legal ownership structures.
Collective Authorship: The norm in theatre that multiple creative contributors (designer, director, composer, performers) share credit for the final artistic product, without a single author.
The Practitioner's Responsibility

Every sound designer, music director, and audio engineer working in live performance today faces these questions practically, not theoretically. The tools are available now. Choosing to use them, choosing not to, and choosing how to disclose and compensate: these are decisions that current practitioners are making in active productions. Understanding the ethical landscape is not optional for a working professional in this field.

Lesson 4 Quiz

Ethics, Authorship, and the Future: check your understanding
1. What specific provision did the AFM secure in its 2024 Broadway League contract regarding AI?
Correct. The contract requires consent and compensation before AI can replace live musicians: not an outright ban, but a structured negotiation process.
The AFM secured consent and compensation requirements, not an outright ban, which would have been unenforceable given pre-recorded backing track practice.
2. What did the MU's 2023 survey find about working musicians and AI-related job losses?
Correct. The MU survey documented real economic impact: over a third of working musicians lost bookings to backing track and AI-assisted production in a single year.
The MU survey found 38% had lost bookings: real documented economic impact, not just fear of future displacement.
3. Under current US and UK copyright law, who owns AI-generated audio output?
Correct. Copyright law requires human authorship, so AI-generated content as such is currently unprotectable, leaving a legal grey area for designers who incorporate it.
Current law does not protect AI-generated output, because copyright requires human authorship. The designer who selected and used the output has claims, but the output itself is not automatically protected.
4. What approach do the AES and SMPTE working groups favour for AI attribution in audio production?
Correct. The emerging consensus is transparency through disclosure: audiences and collaborators should know where AI contributed, even if the legal framework for ownership remains unsettled.
AES/SMPTE working groups favour a disclosure model: transparency about where AI contributed, without mandating specific ownership or prohibition rules.

Lab 4: Ethics and AI Sound Policy

Work through a real ethical dilemma in AI-assisted live production

Lab Brief

In this lab, you'll engage with the ethical dimensions of AI sound in a specific production scenario. There are no easy right answers; the goal is to reason carefully through competing legitimate interests.

Starter prompt: "I'm the music director for a touring production of a musical. The producer wants to replace the five-piece live band with an AI-generated adaptive backing track to cut costs. I have concerns. Help me think through the ethical, legal, and practical dimensions of this decision."
AI Lab Assistant Performing Arts & AI · M6 L4
This is one of the most contested areas in performing arts right now, and there are real competing interests at stake. Let's work through this carefully. Tell me about the specific production context, or use the starter prompt to begin examining the ethical landscape.

Module 6 Test

AI for Music and Sound in Live Performance: 15 questions · 80% to pass
1. What is the typical latency threshold for audio processing to be considered "real time" in a live performance context?
Correct. Sub-10ms processing is not perceived as lag by audiences or performers.
The threshold is under 10ms; anything above becomes perceptible in a live performance context.
2. Which unit does AI audio analysis use to measure perceived loudness in broadcast and live production?
Correct. LUFS reflects perceived loudness over time, not just instantaneous peak levels.
LUFS (Loudness Units relative to Full Scale) is the current broadcast and live standard for measuring perceived loudness.
3. At the 2019 Eurovision Song Contest in Tel Aviv, AI audio analysis was used to monitor how many simultaneous broadcast feeds?
Correct. 26 competing broadcasts were monitored simultaneously for EBU R128 loudness compliance.
26 simultaneous broadcast feeds were monitored: the audio streams of all 26 competing countries.
4. How does AI-assisted feedback suppression distinguish true feedback from a loud musical transient like a rim shot?
Correct. Feedback has a characteristic rapid onset and narrow spectral signature, and AI analyses both to avoid false cuts.
AI analyses rate of change and spectral narrowness: feedback rises fast in a very narrow frequency band, unlike broadband transients.
5. In Hamilton's Broadway production, what does the timecode-driven automation primarily manage?
Correct. Automation handles predictable elements; human A1/A2 engineers handle unpredictable performance variables.
Automation manages predictable elements (orchestra balance, click distribution) while engineers retain override authority for the unpredictable.
6. What is the key advantage AI mix assistants add over traditional AGC (Automatic Gain Control)?
Correct. AI can contextualise multiple channels and distinguish source characters, whereas AGC simply responds to level thresholds.
AI adds contextual intelligence: distinguishing source types and balancing channels as a system, not just responding to individual level thresholds.
7. What criticism do experienced FOH engineers make about AI mix assistants like iZotope Neutron 4?
Correct. Competent-but-generic is the dominant criticism: AI avoids mistakes but may not make the distinctive choices that make a mix memorable.
The criticism is artistic, not technical: AI mix assistants optimise for averages and can eliminate the idiosyncratic choices that define a great mix.
8. What core theatrical problem does generative audio solve compared to fixed recordings?
Correct. A scene that runs longer than expected doesn't leave the sound designer with a loop or an awkward silence, because generative audio adapts.
Generative audio's core advantage is timing flexibility: it can match variable scene duration rather than forcing performers to match a fixed recording.
9. What does Google Magenta's MusicVAE enable for theatre sound designers?
Correct. MusicVAE lets designers define two musical endpoints and generate transitions of whatever length the scene requires.
MusicVAE interpolates between musical states (a tension texture and a calm texture, for instance), generating transitions of variable duration on demand.
10. In Punchdrunk's Sleep No More, how does the audio system respond to audience density in a zone?
Correct. The inverse response (quieter as more people enter) preserves the intimate atmosphere the production design requires.
Sleep No More's system gets quieter with more occupants, an inverse response that maintains the intimate, unsettling atmosphere of the production.
11. What percentage of MU members surveyed in 2023 had lost at least one booking to recorded or AI-assisted backing tracks?
Correct. 38%, a significant documented impact on working musicians' livelihoods in a single year.
The MU survey found 38% had lost bookings: real documented impact on working musicians in 2023.
12. Under current US and UK copyright law, can AI-generated audio output be copyright-protected?
Correct. The legal requirement for human authorship means pure AI-generated output is currently unprotectable by copyright in both jurisdictions.
Current law requires human authorship for copyright, so AI output as such is unprotectable. This creates a genuine legal grey area for designers using these tools.
13. What approach do AES and SMPTE working groups favour for crediting AI contributions in audio production?
Correct. Transparent disclosure is the emerging consensus: audiences and collaborators deserve to know where AI contributed.
AES/SMPTE favour a disclosure model: transparency about AI contributions without prescribing specific ownership or prohibition rules.
14. What outcome did the 2023 West End AI music dispute reach between a producer and the Musicians' Union?
Correct. The hybrid compromise (three live musicians plus AI elements) reflected the genuine economic pressures on both sides.
The outcome was a negotiated hybrid: three live musicians were retained while AI-assisted elements covered other parts. Neither side got everything it wanted.
15. What is the central distinction between how AI systems and human sound designers optimise in live performance, according to the module's analysis?
Correct. This is the module's core claim: coherence and meaning are different objectives, and current AI systems excel at one while humans remain uniquely capable of the other.
The module's core distinction: AI optimises for coherence (things fitting together); humans optimise for meaning (choices serving the dramatic arc). These require different kinds of intelligence.
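
Worked sketch for question 4: the distinction between feedback and a loud broadband transient can be illustrated with a few lines of Python. This is a minimal sketch under stated assumptions, not how any particular feedback suppressor works. The function names, the five-band frames, and both thresholds (0.8 for narrowness, 4.0 for growth) are hypothetical values chosen only to make the idea concrete: feedback keeps building in one narrow band, while a rim shot spreads its energy across many bands and then decays.

```python
# Illustrative sketch only: names, thresholds, and data are assumptions,
# not taken from any real feedback-suppression product.

def spectral_narrowness(band_energies):
    """Fraction of a frame's total energy that sits in its single loudest band."""
    total = sum(band_energies)
    if total <= 0:
        return 0.0
    return max(band_energies) / total


def looks_like_feedback(frames, narrowness_threshold=0.8, growth_threshold=4.0):
    """Return True when one band dominates the latest frame and has grown steadily.

    frames: list of analysis frames, oldest first; each frame is a list of
    per-band energies (same number of bands in every frame).
    """
    if len(frames) < 2:
        return False
    latest = frames[-1]
    peak_band = latest.index(max(latest))
    history = [frame[peak_band] for frame in frames]
    # Rate of change: the dominant band must rise in every consecutive frame.
    rising = all(later > earlier for earlier, later in zip(history, history[1:]))
    growth = history[-1] / history[0] if history[0] > 0 else float("inf")
    # Spectral narrowness: most of the energy must sit in that one band.
    narrow = spectral_narrowness(latest) >= narrowness_threshold
    return narrow and rising and growth >= growth_threshold


# Hypothetical data: 5 frequency bands, 4 consecutive analysis frames.
feedback_frames = [
    [1.0, 1.0, 2.0, 1.0, 1.0],
    [1.0, 1.0, 8.0, 1.0, 1.0],
    [1.0, 1.0, 32.0, 1.0, 1.0],
    [1.0, 1.0, 128.0, 1.0, 1.0],
]
rim_shot_frames = [
    [1.0, 1.0, 1.0, 1.0, 1.0],
    [40.0, 45.0, 50.0, 45.0, 40.0],
    [10.0, 12.0, 14.0, 12.0, 10.0],
    [3.0, 3.0, 4.0, 3.0, 3.0],
]

print(looks_like_feedback(feedback_frames))
print(looks_like_feedback(rim_shot_frames))
```

Requiring both conditions at once is the design choice that avoids false cuts: the rim shot is loud but neither narrow nor steadily rising, so it is left alone.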