At the 2019 Eurovision Song Contest in Tel Aviv, the production team used automated audio analysis across 26 competing broadcasts simultaneously, monitoring loudness levels, frequency balance, and dynamic range in real time across every feed. The system flagged deviations from EBU R128 loudness standards instantly, allowing engineers to intervene before viewers at home noticed anything wrong. This was not a futuristic experiment; it was a logistical necessity at broadcast scale.
Real-time audio analysis refers to the computational process of examining an incoming audio signal and extracting meaningful data about it with a latency low enough to influence the same performance being measured. In live production contexts, "real time" typically means a processing delay under 10 milliseconds, small enough that no audience member would perceive it as lag.
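The 10 millisecond figure can be related to buffer sizes with simple arithmetic: the time to fill one buffer is the buffer length divided by the sample rate. The sketch below is illustrative only; the function names, the 48 kHz sample rate, and the assumption of two buffers of delay (one on input, one on output) are not from any particular console or plug-in.

```python
def buffer_latency_ms(buffer_samples: int, sample_rate_hz: int) -> float:
    """Time needed to fill one processing buffer, in milliseconds."""
    return 1000.0 * buffer_samples / sample_rate_hz


def fits_budget(buffer_samples: int, sample_rate_hz: int,
                stages: int = 2, budget_ms: float = 10.0) -> bool:
    """True if `stages` buffers of delay stay under the latency budget.

    stages=2 is an assumption: one buffer on input, one on output.
    Converter and network delays are ignored here.
    """
    total_ms = stages * buffer_latency_ms(buffer_samples, sample_rate_hz)
    return total_ms < budget_ms


if __name__ == "__main__":
    for size in (64, 128, 256, 512):
        per_buffer = buffer_latency_ms(size, 48_000)
        ok = fits_budget(size, 48_000)
        print(f"{size:4d} samples at 48 kHz: {per_buffer:.2f} ms per buffer, "
              f"within 10 ms budget: {ok}")
```

Under these assumptions a 128-sample buffer fits comfortably, while a 256-sample buffer (about 5.33 ms each way) already exceeds the budget.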
The foundational measurements that AI-assisted systems track include loudness (LUFS/RMS), dynamic range, spectral balance (which frequency bands are dominant), transient detection (sharp attacks like snare hits or consonants), and harmonic content (what notes and chords are present). Each of these has been measurable by digital tools for decades; what AI adds is the ability to interpret these measurements contextually and act on them automatically.
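Two of the measurements listed above, level and dynamic range, can be sketched in a few lines. This is a minimal illustration, assuming floating-point samples in the range -1.0 to 1.0; it computes plain RMS and peak levels in dBFS and their difference (crest factor). It is not a LUFS meter: loudness in LUFS additionally requires the frequency weighting and gating defined in ITU-R BS.1770.

```python
import math


def rms_dbfs(samples):
    """RMS level in dBFS for samples in the range -1.0..1.0."""
    mean_square = sum(s * s for s in samples) / len(samples)
    return 10 * math.log10(mean_square) if mean_square > 0 else float("-inf")


def peak_dbfs(samples):
    """Highest absolute sample value, in dBFS."""
    peak = max(abs(s) for s in samples)
    return 20 * math.log10(peak) if peak > 0 else float("-inf")


def crest_factor_db(samples):
    """Peak-to-RMS ratio in dB, a rough indicator of dynamic range."""
    if not any(samples):
        return 0.0  # silence: no meaningful crest factor
    return peak_dbfs(samples) - rms_dbfs(samples)


if __name__ == "__main__":
    # Ten cycles of a full-scale 1 kHz sine at 48 kHz.
    sine = [math.sin(2 * math.pi * 1000 * n / 48_000) for n in range(480)]
    print(f"RMS:   {rms_dbfs(sine):.2f} dBFS")
    print(f"Peak:  {peak_dbfs(sine):.2f} dBFS")
    print(f"Crest: {crest_factor_db(sine):.2f} dB")
```

A steady sine wave has a crest factor of about 3 dB; percussive material such as a snare hit is much higher, which is one reason transient detection and loudness are tracked separately.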
Modern systems such as iZotope's DDLY dynamic delay and Eventide's H9 with machine learning-based algorithms can identify the character of an incoming signal (is this a dry vocal, a wet guitar, a kick drum transient?) and adjust processing parameters accordingly without a human turning a knob. The AI is not guessing; it is pattern-matching against training data drawn from thousands of previously labelled audio examples.
FOH engineer Chris Kent used a combination of spectral analysis plug-ins running in Avid VENUE to monitor the interaction between Thom Yorke's vocal and the band's dense layered synth textures in real time. The system provided visual and audio alerts when specific frequency ranges became overloaded, allowing Kent to make EQ decisions before feedback or muddiness reached the audience. He described it as "having a second pair of ears that never gets fatigued."
Feedback, that piercing squeal when a microphone picks up its own amplified signal, is one of the most disruptive events in live audio. Traditional solutions relied on engineers manually sweeping frequencies with a graphic EQ and listening. AI-assisted feedback suppression systems, such as those built into Shure's Axient Digital wireless system and dbx DriveRack processors, now detect the early signatures of feedback (a rapid, narrow-band loudness spike with a self-reinforcing harmonic pattern) and apply a narrow notch filter in milliseconds, before the feedback reaches an audible level.
The distinction from older "automatic feedback suppressors" is subtle but important: older systems simply detected high-loudness spikes and cut them. AI-trained systems distinguish between a loud transient (a rim shot, a shout) and true feedback onset by analysing the rate of change and spectral narrowness of the signal. Fewer false cuts mean less tonal damage to the sound.
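The difference between a loud transient and feedback onset can be illustrated with a simple hand-written rule. The sketch below is not the algorithm used by any commercial product, and it is rule-based rather than trained; the function names and thresholds are hypothetical. It takes a short history of magnitude spectra and flags feedback only when the same frequency bin is a narrow dominant peak and keeps growing across consecutive frames.

```python
def is_narrowband(spectrum, ratio_threshold=10.0):
    """True if the strongest bin dominates the average of the other bins."""
    peak = max(spectrum)
    rest = (sum(spectrum) - peak) / (len(spectrum) - 1)
    if rest == 0:
        return peak > 0
    return peak / rest >= ratio_threshold


def looks_like_feedback(frames, min_frames=4, ratio_threshold=10.0):
    """frames: list of equal-length magnitude spectra, oldest first.

    Flags feedback when one bin is the narrow dominant peak in each of the
    last `min_frames` frames and its magnitude rises in every frame.
    Thresholds are illustrative assumptions, not tuned values.
    """
    if len(frames) < min_frames:
        return False
    recent = frames[-min_frames:]
    peak_bins = [max(range(len(f)), key=f.__getitem__) for f in recent]
    same_bin = len(set(peak_bins)) == 1
    narrow = all(is_narrowband(f, ratio_threshold) for f in recent)
    b = peak_bins[0]
    growing = all(later[b] > earlier[b]
                  for earlier, later in zip(recent, recent[1:]))
    return same_bin and narrow and growing


if __name__ == "__main__":
    def frame(level, peak=None):
        f = [level] * 8
        if peak is not None:
            f[3] = peak
        return f

    ringing = [frame(0.01, p) for p in (0.2, 0.4, 0.8, 1.6)]   # one bin growing
    rim_shot = [frame(v) for v in (1.0, 0.5, 0.25, 0.12)]      # broadband, decaying
    print("ringing flagged: ", looks_like_feedback(ringing))
    print("rim shot flagged:", looks_like_feedback(rim_shot))
```

A simple loudness threshold would fire on both inputs; requiring narrowness and sustained growth is what spares the rim shot.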
A documented example: at the 2018 NFL Super Bowl halftime show with Justin Timberlake, the RF coordination team used Shure Wireless Workbench with automated spectrum scanning to monitor over 70 simultaneous wireless channels across the arena, detecting interference and rerouting frequencies automatically during the live broadcast.
Understanding how AI listens helps performers and directors make better decisions in rehearsal: placement of microphones, choice of monitoring configurations, and even staging positions all interact with what AI systems can reliably detect. A performer who walks upstage and turns away from a front-fill speaker changes the spectral signature of their microphone, and a well-configured AI analysis system can flag that change before the engineer would otherwise notice it.
In this lab you'll explore how AI real-time audio analysis applies to a specific live performance context. Discuss the technical concepts with your AI assistant, ask about real tools, and think through how these systems would function in a show you know or are working on.
At the 2022 Glastonbury Festival, the production team behind Billie Eilish's headline set used iZotope's RX 10 and auxiliary AI-driven gain-riding tools integrated into an Avid S6L console. According to the festival's technical production reports, the system was configured to maintain vocal intelligibility across the 100,000-capacity Pyramid Stage field, automatically compensating for Eilish's characteristic use of extreme dynamic range (from near-whispered verses to full-voice choruses) in real time without constant manual fader moves from the FOH engineer.
Adaptive mixing refers to audio systems that automatically adjust mix parameters (fader levels, EQ, compression ratios, reverb sends) in response to incoming audio content. The "adaptation" can be rule-based (if level exceeds threshold X, apply gain reduction Y) or model-based (a trained neural network predicts the appropriate setting based on learned examples of good mixes).
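The rule-based case ("if level exceeds threshold X, apply gain reduction Y") can be written down directly. The sketch below is a static downward-compression rule with hypothetical default values; real processors add attack and release smoothing, which is omitted here.

```python
def gain_reduction_db(level_db: float, threshold_db: float = -18.0,
                      ratio: float = 4.0) -> float:
    """Gain reduction (dB, as a positive number) for a given input level.

    Below the threshold nothing happens. Above it, the output rises by
    1 dB for every `ratio` dB of input, so the rest is removed.
    """
    over = level_db - threshold_db
    if over <= 0:
        return 0.0
    return over * (1.0 - 1.0 / ratio)


if __name__ == "__main__":
    for level in (-30.0, -18.0, -10.0, -2.0):
        cut = gain_reduction_db(level)
        print(f"input {level:6.1f} dB -> reduce by {cut:4.1f} dB "
              f"-> output {level - cut:6.1f} dB")
```

A model-based system replaces this fixed formula with a prediction learned from examples, but its output is the same kind of quantity: a parameter change to apply.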
The most commercially mature form of adaptive mixing is automatic gain control (AGC), which has existed in broadcast for decades. But AI brings two significant upgrades: the ability to distinguish between different sound sources sharing a microphone (a singer speaking vs. singing, for instance) and the ability to consider multiple channels simultaneously and balance them as a system rather than individually.
Yamaha's ProVisionaire platform and DiGiCo's SD-Rack both include scene-recall systems that can be triggered by timecode or MIDI, but newer hybrid AI systems go further: they do not just recall saved states but interpolate intelligently between them based on what they hear. If a performer sings longer than expected before the chorus, the system waits; it doesn't blindly execute a scene change at bar 32 regardless of what's happening musically.
The original Broadway production of Hamilton (2015–present) uses a sophisticated monitor mixing system in which each cast member's in-ear monitor mix is partially automated. The system, documented in Sound on Sound's 2016 feature on the show, uses timecode-driven automation to pre-position faders based on scene, but the A1 and A2 engineers retain the ability to override in real time. The automation handles predictable elements (the orchestra balance, click distribution) while humans handle the unpredictable (a performer moving unexpectedly, ad-libs, audience interaction).
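The idea of interpolating between saved scenes, and holding until the music is ready, can be sketched as follows. This is an illustration only: the scene dictionaries, the `cue_detected` flag (standing in for whatever the analysis stage reports), and the linear blend in dB are all assumptions, not the behaviour of any named console.

```python
def interpolate_scene(scene_a, scene_b, progress):
    """Linear blend of fader levels (dB) between two stored scenes.

    progress is clamped to 0.0..1.0; 0.0 is scene_a, 1.0 is scene_b.
    """
    t = min(1.0, max(0.0, progress))
    return {ch: (1 - t) * scene_a[ch] + t * scene_b[ch] for ch in scene_a}


def next_progress(progress, cue_detected, step=0.25):
    """Hold at the first scene until the musical cue is heard, then advance."""
    if progress == 0.0 and not cue_detected:
        return 0.0  # performer is still in the verse: wait
    return min(1.0, progress + step)


if __name__ == "__main__":
    verse = {"vox": -10.0, "keys": -20.0}
    chorus = {"vox": -4.0, "keys": -26.0}
    progress = 0.0
    # The cue arrives two analysis ticks later than a fixed timeline expected.
    for cue in (False, False, True, False, False, False):
        progress = next_progress(progress, cue)
        print(progress, interpolate_scene(verse, chorus, progress))
```

A timecode-only system would have started the transition on the first tick; here the blend begins only once the cue is reported.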
Beyond gain riding, AI mix assistants trained on large libraries of professionally mixed recordings are now commercially available. iZotope's Neutron 4 and Sonible's smart:comp 2 both use machine learning to analyse an incoming signal, compare it to target profiles, and suggest or apply EQ and compression settings automatically. While these tools are primarily used in studio recording, they are increasingly being tested in live contexts via laptop-based processing racks.
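The "compare to a target profile and suggest settings" step can be illustrated in a simplified form. The sketch below assumes per-band levels in dB have already been measured; the band names, tolerance, and maximum change are hypothetical. It does not represent how Neutron or smart:comp work internally, where the target and the mapping are learned rather than hand-written.

```python
def suggest_eq(measured_db, target_db, max_change_db=6.0, tolerance_db=1.0):
    """Per-band gain suggestions (dB) that move measured levels toward a target.

    Bands already within `tolerance_db` of the target are left alone, and
    no single suggestion exceeds `max_change_db` in either direction.
    """
    suggestions = {}
    for band, target in target_db.items():
        diff = target - measured_db[band]
        if abs(diff) <= tolerance_db:
            continue
        suggestions[band] = max(-max_change_db, min(max_change_db, diff))
    return suggestions


if __name__ == "__main__":
    measured = {"low": -12.0, "low_mid": -8.0, "mid": -14.0, "high": -25.0}
    target = {"low": -12.0, "low_mid": -14.0, "mid": -14.5, "high": -18.0}
    for band, gain in suggest_eq(measured, target).items():
        print(f"{band}: {gain:+.1f} dB")
```

Capping the change and ignoring small differences keeps the suggestions conservative, which matters more live than in the studio because there is no undo.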
The conceptual shift is important: these systems do not apply a fixed formula. They learn what a "balanced mix" sounds like from thousands of examples, and then try to push incoming audio toward that learned ideal. A well-trained model will recognise that a muddy low-mid buildup on a piano in a small theatre needs different treatment than the same buildup in an arena.
Critics within the audio engineering community, including veteran FOH engineers like Monty Carlo (touring engineer, various major artists), have noted that AI mix assistants tend to produce competent-but-generic results. They excel at avoiding obvious mistakes but can suppress the idiosyncratic choices that define a great live mix. This tension between safety and artistry is central to the field's current debate about AI's role.
Every documented deployment of AI adaptive mixing in professional live production retains a human engineer with override authority. The AI handles volume, the human handles meaning. When Adele pauses unexpectedly before a chorus, the AI might misread the silence as an opportunity to normalise levels; the human engineer knows she is building drama and holds the mix. Knowing when not to act is a judgment that AI systems have not yet reliably demonstrated.
In this lab, you'll work through the practical design of an AI-assisted mixing setup for a specific show type. Consider what the AI should automate, what should stay under human control, and how to configure the handoff between them.
In 2023, the UK-based theatre company Complicité collaborated with sound designer Gareth Fry on productions where generative audio tools were used to create evolving ambient textures that responded to dramatic rhythm rather than fixed timecode. Fry, speaking at the 2023 Theatre Sound Design symposium in London, described using Max/MSP with machine learning extensions to generate real-time sound environments that "listened" to cue calls and adapted their texture accordingly, producing different results each night while remaining structurally coherent.
Generative audio refers to sound that is created algorithmically in real time rather than played back from a fixed file. In live theatre, this approach addresses a longstanding challenge: recordings are static, but performances are not. A scene that runs longer one night because an audience is particularly reactive, or shorter because a performer makes a different choice, creates problems for a sound designer who has crafted a precise 47-second soundscape to accompany it.
Early generative theatre audio used randomised loops and probabilistic triggers in software like QLab or Max/MSP. AI-assisted generative audio goes further: it uses trained models to produce new audio content that fits stylistic constraints set by the designer (a specific harmonic palette, a textural density, a rhythmic feel) without cycling through the same patterns.
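The earlier, pre-AI approach of probabilistic triggers can be sketched as a weighted random walk between sound layers. The layer names and weights below are hypothetical, and this is plain Python rather than a QLab or Max/MSP patch; it only shows the logic a designer would be configuring.

```python
import random


def pick_next_layer(current, transitions, rng):
    """Choose the next sound layer using weighted transition probabilities."""
    options = transitions[current]
    names = list(options)
    weights = [options[name] for name in names]
    return rng.choices(names, weights=weights, k=1)[0]


def generate_sequence(start, transitions, length, seed=None):
    """Build a sequence of layers; a fixed seed makes a run repeatable."""
    rng = random.Random(seed)
    sequence = [start]
    while len(sequence) < length:
        sequence.append(pick_next_layer(sequence[-1], transitions, rng))
    return sequence


if __name__ == "__main__":
    # Hypothetical layers; no layer may follow itself, so loops never repeat
    # back to back, but the overall order differs from night to night.
    transitions = {
        "wind": {"drone": 0.7, "bells": 0.3},
        "drone": {"wind": 0.5, "bells": 0.5},
        "bells": {"wind": 1.0},
    }
    print(generate_sequence("wind", transitions, 8))
```

The limitation is visible in the table itself: the system can only reorder material the designer has already made, which is the gap trained generative models aim to close.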
Google's Magenta project has produced open-source tools, including MusicVAE, that allow sound designers to interpolate between two musical ideas (for instance, a tense string texture and a quiet ambient wash) and generate smooth transitions of variable length. Theatre sound designers can use these to create underscore that can extend or compress gracefully as a scene demands.
Punchdrunk's long-running immersive production Sleep No More (running in New York since 2011 and Shanghai since 2016) uses a non-linear audio environment designed by Stephen Dobbie. While not fully AI-generated, the production's audio system uses zone-based adaptive playback that responds to audience density and movement data tracked by RFID cards given to each audience member. Louder zones become quieter as more people enter, maintaining an intimate atmosphere. This adaptive response to live data represents the operational logic that full AI generative systems now build on.
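The operational logic described here, playback level responding to how many people are in a zone, can be sketched as a simple mapping. This is not a description of the production's actual system; the function name, the occupancy at which the floor is reached, and the dB values are all hypothetical.

```python
def zone_gain_db(occupancy: int, quiet_at: int = 30,
                 base_db: float = 0.0, floor_db: float = -12.0) -> float:
    """Lower a zone's playback level linearly as it fills, down to a floor.

    occupancy: number of audience members currently detected in the zone.
    quiet_at:  occupancy at which the floor level is reached (assumed value).
    """
    if occupancy <= 0:
        return base_db
    fraction = min(1.0, occupancy / quiet_at)
    return base_db + fraction * (floor_db - base_db)


if __name__ == "__main__":
    for people in (0, 5, 15, 30, 60):
        print(f"{people:2d} people -> {zone_gain_db(people):6.1f} dB")
```

A generative system would use the same kind of live input to shape what is produced, not only how loud it is played.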
Several tools have moved from experimental to practical use in professional theatre sound design in the past three years:
ElevenLabs and Resemble AI are being used to generate ambient voices, crowd sounds, and environmental audio layers that can be customised without requiring expensive field recording sessions. A sound designer who needs rain sounds with a specific spectral quality or distant crowd murmur at a specific density can generate these directly rather than searching a library.
Stable Audio (from Stability AI, released 2023) allows designers to generate short audio clips from text descriptions, such as "medieval tavern ambience, mid-frequency heavy, no modern sounds", with high enough quality for theatrical use as background layers. The tool does not replace custom design but accelerates the iteration process significantly.
Adobe Project Music Generative AI, demonstrated at Adobe MAX 2023, showed tools for generating looping underscore tracks that could vary in intensity over time based on simple parameter controls, which is directly applicable to theatrical underscore needs.
Critically, all of these tools in current professional deployment function as resources for human designers, not as autonomous composers. The designer still makes all structural and dramatic decisions; the AI generates raw material faster than conventional methods.
The most significant challenge in AI-generated theatre audio is not technical but dramaturgical: a generative system that produces beautiful, contextually appropriate sound every night is not necessarily producing sound that serves the specific dramatic arc of each performance. A human sound designer makes choices (silence here, texture there) that are part of a carefully considered interpretation of the text. AI generative systems optimise for coherence; human designers optimise for meaning. Those are different objectives.
In this lab, you'll think through the design of a generative audio environment for a specific theatrical context. Consider the dramatic needs, the tools available, and how you would define the parameters within which AI-generated material operates.
In 2023, the American Federation of Musicians (AFM) made AI-generated music a central issue in Broadway contract negotiations. The union's contract with the Broadway League, concluded in March 2024, included specific language prohibiting the use of AI to replace live musicians in the orchestra pit without consent and compensation provisions. The AFM documented concerns that AI-generated backing tracks, already common in touring productions, were being used to reduce orchestra sizes, eliminating union jobs. This was not speculation; several touring productions had already reduced their live musician counts by 30–50% using pre-recorded and AI-assisted tracks.
The most immediate and documented ethical challenge of AI in live music is economic displacement. When a touring production uses an AI-generated backing track instead of hiring a live rhythm section, specific musicians lose specific jobs. This is not a hypothetical future scenario; it is the present reality in mid-level touring production in the US and UK.
The scale of this displacement is tracked by the AFM and the UK's Musicians' Union (MU). The MU's 2023 survey of working musicians found that 38% of respondents had lost at least one booking in the prior year to a production that used recorded or AI-assisted backing tracks instead of live players. The touring circuit for covers bands and pit orchestras has been under pressure from pre-recorded backing tracks since the 1990s; AI-generated music accelerates the same trend.
Defenders of AI-assisted production argue that smaller productions using AI backing tracks are presenting shows that would otherwise not be economically viable at all, creating employment for performers, crew, and venue staff that would not exist if the show required a full live orchestra. This is a genuine tension, not a simple villain story: the AFM and MU both acknowledge that requiring full live orchestras in every production is not economically realistic and would eliminate more jobs total than it saves.
In October 2023, a West End producer proposed replacing the live five-piece band in a mid-scale revival with an AI-generated adaptive backing track system. The Musicians' Union publicly opposed the plan and the production ultimately retained three live musicians while using AI-assisted elements for the rhythm section. The compromise was notable: it was not all-or-nothing, but a negotiated hybrid reflecting the genuine economic pressures on both sides.
When a sound designer uses Stable Audio to generate an ambient texture and incorporates it into a live show, who owns that texture? US copyright law requires human authorship and does not protect purely AI-generated output as such. UK law does contain a provision for computer-generated works (Copyright, Designs and Patents Act 1988, s. 9(3)), but how it applies to generative AI output is largely untested. The designer who prompted and selected the output has a reasonable claim as the creative decision-maker, but the legal framework for this remains unsettled.
For live performance specifically, the authorship question intersects with collective authorship norms. Theatre sound design has always involved the designer, the director, the composer (if separate), and the performers all contributing to the final sound experience. AI-generated elements enter this collective authorship space without a clear role: they are not performers, not composers in the legal sense, and not employees of anyone in the production.
The Society of Motion Picture and Television Engineers (SMPTE) and Audio Engineering Society (AES) both have active working groups examining attribution standards for AI-assisted audio work. Their emerging consensus favours a disclosure model: productions should document and disclose where AI generation tools contributed to the final product, without necessarily changing ownership structures.
A specific and acutely contested area is AI voice cloning for live performance. Tools like ElevenLabs allow a convincing synthetic reproduction of any person's voice from a short sample. In 2023, several estates of deceased musicians, most prominently involving recordings associated with the late Amy Winehouse, faced public scrutiny when AI tools were used to generate new "performances" in their voices without clear licensing frameworks.
For live theatre, the issue arises when productions use AI-cloned voices for characters, narration, or even to extend the performance of a living actor across multiple productions simultaneously. The UK's Equity union and the US's SAG-AFTRA both negotiated AI voice consent provisions in 2023–2024 contracts, requiring explicit performer consent before any voice can be used to train or populate an AI voice model used in commercial production.
Every sound designer, music director, and audio engineer working in live performance today faces these questions practically, not theoretically. The tools are available now. Choosing to use them, choosing not to, and choosing how to disclose and compensate: these are decisions that current practitioners are making in active productions. Understanding the ethical landscape is not optional for a working professional in this field.
In this lab, you'll engage with the ethical dimensions of AI sound in a specific production scenario. There are no easy right answers; the goal is to reason carefully through competing legitimate interests.