AI and the future of music performance
AI is changing the way music is created, but at what cost?
This week, Google DeepMind and YouTube announced that they have partnered to create a new AI music generation model. Named Lyria, the model has two main functions: ‘Dream Track’, which creates a short AI-generated clip with the sound and style of particular artists; and the deceptively named ‘AI Music Tools’, which allows users to ‘create new music or instrumental sections from scratch, transform audio from one music style or instrument to another, and create instrumental and vocal accompaniments.’
In some ways this is nothing to write home about. New technologies have driven musical innovation for centuries, from the development of a cast iron frame for the piano, to the invention of the synthesiser and recording technologies. None have yet spelled the end of music-making as either pastime or profession, but have instead shifted the way we create, perform, and listen to music in new and exciting directions. So I’m not somebody who tends towards the reactionary when it comes to new music tech. And yet I am concerned about Lyria and what it represents.
It’s the latest in a growing list of AI systems of increasing levels of sophistication that have been created over the last few years to generate music and sound. JukeDeck, for example, launched publicly in 2015 and allowed users to generate royalty-free music, and Google’s 2017 NSynth combines instruments to generate new sounds (e.g. trombone and clarinet). The quality of the audio generated by Lyria is a step up from any of these. Sure, it’s far from perfect. On good speakers, it’s easy to hear that the voices created by Dream Track aren’t the real thing, and the generated clips are only a few seconds long. But this is only the first release. Given how quickly Large Language Models (LLMs) like ChatGPT have been taught to generate text, it would be naive to believe that Lyria won’t be developed as quickly, and that music will somehow be immune from being impacted by AI.
The speed at which AI is being developed means that the implications of new systems are much wider-ranging — and less foreseeable — than we’re used to dealing with. It’s an issue that governments and companies are having to grapple with; the US announced new rules around AI safety last month, and the UK government held an AI safety summit at the beginning of November at which Elon Musk described AI as an existential threat. Existential or otherwise, systems like Lyria present a serious challenge for the way that music careers function now, and they appear especially damaging where classical music is concerned.
‘AI Music Tools’ may sound innocuous, but with a bit more development these tools mean you won’t need an orchestra any more to record a film soundtrack. And why record a new Christmas album with an expensive human choir when you can generate one in seconds? Or hire a musician to play at a function when you can generate hours of instrumental improvisation for free? Lyria is being put out into a world where it’s already a financial challenge to be a performing musician. As of 2019, 55% of musicians in the UK earned less than £20,000 per year, and 79% less than £30,000. Being able to convincingly generate an orchestra might be great in the short term for composers, but it certainly isn’t for orchestral musicians. It will only exacerbate the idea that music isn’t really a job and that musicians should work for free. How will being a performing musician be sustainable under these conditions?
I hope I’m worrying about nothing. Maybe being able to generate sound so easily will make live music more valuable, and performing musicians will be in demand like never before. In a few years’ time composing might have become so cheap that more time can be allotted to music on school syllabuses, allowing children to enjoy music-making from a young age. One can only hope, but realistically this feels pretty utopian. The infrastructure for long-term music-making is slowly being eroded already, with disastrous results. If we take the UK as an example, music education has been deprioritised for years, with the result that the number of students sitting A-level music has fallen by 46% since 2010. Over the summer the Dartington Trust announced that its 2024 Music Summer Festival & School, which has been a cornerstone of UK music education for over seventy years, is being suspended. This week Oxford Brookes University announced the closure of its music department. This year has been a litany of funding debacles, with institutions like English National Opera and the Britten Sinfonia having their Arts Council funding slashed, and the BBC Singers only narrowly saved from closure.
Prioritising artificially generated music in this environment is like letting loose a wrecking ball on an already unsteady building. I want to believe Google DeepMind’s upbeat blurb, that Lyria will ‘open a new playground for creativity’ and ‘deliver a positive contribution to the future of music’. I really do. But the musical ecosystem is fragile, and Lyria feels like a wolf made out to sound like a sheep. In some specific ways it’s been created far more responsibly than LLMs like ChatGPT. All Lyria-generated material will be watermarked, and the developers seem to have taken care to avoid wild copyright infringements in the data sets (OpenAI, by contrast, are facing lawsuits for copyright infringement, including from the Authors Guild).
It’s great that developers are now working with copyright holders, but there are many more stakeholders in this ecosystem — such as performing musicians — who are going to be badly affected, and who aren’t yet being widely consulted. Responsibility isn’t just about copyright law. ‘Dream Track’ focuses entirely on music as product, not as process. Besides the fact that sophisticated music generation will be catastrophic for composition as a commercially viable career, it misses much of what has made music a meaningful activity for humans for millennia. Music isn’t just about the sound produced. It’s about the human interactions that go into making it, the thrill of finally mastering a new technique and achieving something you never thought you’d be able to, the hours of patience and consideration that make a performance really just a small part of what music-making is.
Part of the reason music is so meaningful to me now, and is central to my life and career, is not just because I listened to music when I was younger but because I learned to play the piano. Playing in chamber groups forced me to overcome social unease and engage with all kinds of personalities I might not otherwise have connected with, leading to some of the most memorable and enjoyable experiences of my early adulthood. Piano became the main way in which I communicated, with others and, perhaps just as importantly, with myself. Being a teenager is a confusing business — I often felt like I just didn’t have the words or conceptual frameworks to articulate my uncertain responses to the new situations and emotions that I was experiencing for the first time. When words seemed to have abandoned me, though, music stepped in. Practicing a piece was a way that I could sit with a thought or emotion for hours, sometimes weeks, and work through it within a framework provided by a composition. Where is any of that in an AI-created clip?
It’s unsurprising, I suppose, that AI systems are being developed that focus on music at the points of creation and consumption. The engineers are human. They are artistic consumers. These systems are being built from a place of admiration — the very reason that music, art and literature have been focused on by AI developers is that artistic creation is truly fascinating. Who wouldn’t want to be able to create the Mona Lisa if they could? Or write a play that will still be meaningful to people four hundred years after it was written? There’s a deep irony to the fact that these engineers might make it impossible for humans to continue in jobs that involve creating the art that inspired them in the first place — and that the same companies developing the tech that could make musicians redundant are materially benefitting from doing so. The average annual compensation of a senior research scientist at Google is estimated at approximately $692,500, and $900,000 at OpenAI. Never mind a year, that’s what some musicians can expect to earn in a lifetime.
Wherever Lyria goes next, music-making will be changing fast over the coming years, in ways that will be difficult to predict. I hope that AI is developed in such a way that it genuinely does turn out to be a useful creative tool, and allows music to continue to be a meaningful activity for humans and a viable career. Maybe the ways in which music is meaningful to future humans will be different to the ways in which it has been meaningful to me and my generation, who still remember CDs (just) and for whom live music involves human beings singing or playing instruments. Perhaps that might even be OK, and in thirty years’ time music will be meaningful in ways I can’t imagine with the capabilities we have right now. But embracing new musical norms may also come with profound loss. It’s disingenuous to pretend otherwise, and to present music-focused AI as a force for exclusively positive change. Because of cuts to music education, learning an instrument is in danger of becoming accessible mostly to the wealthy. If systems like Lyria accelerate this to the point that learning an instrument is a truly niche activity and completely pointless as far as a potential career is concerned, I struggle to think of ‘a positive contribution to the future of music’ that would be worth this loss.
AI is human created, and we can, at the moment, choose how to use it. There’s no inevitability to the way that this all plays out. The artistic community does have a voice, and can have a say in how these technologies develop. Lawsuits and legislation can make a difference. Now that it’s clear that copyright is an issue, Lyria’s engineers have at least secured permission for their data set rather than using recordings without it. I’m an eternal optimist, so I’m going to hold on to the hope that musicians will continue to be proactively consulted on AI development so it can be created in ways that really do work for them rather than against them. After all, AI was promised as a way to free humans up to pursue the activities they love, not to automate away the things that give us meaning.
Very interesting read. I completely agree about the devaluation of the arts in the UK and other places. As for AI, I struggle even with the term itself. I think it's mostly science fiction. My understanding of LLMs is that they 'learn' by doing pattern extractions on big data, which can then be assembled generatively. But basically, there's no creativity, just a sophisticated plagiarism of such a large selection of data that it feels original.
Stability Audio uses 800,000 songs for its training, and it sounds terrible. I played around with it for about 5 minutes before getting bored with its really bad piano 'compositions'. I haven't found meaningful audio from Lyria, or interacted with it. I'd be surprised if it sounds any better. I don't have the quote to hand, but a major VC fund stated the other day that unless fair use is applied to LLMs, the industry is basically bust. I see no reason why fair use should be applied, so if that makes the industry go bust, so be it. AI is not an economic life vest for Silicon Valley.
I'm sure LLMs have their uses. I doubt they'll change any fundamentals of music composition. Perhaps their most destructive quality would be for young musicians to take them seriously, and not bother to put the hard work into studying. Overall, I think for the most part it's a really stupid technology and a huge waste of computer power, that unless it can find some justification to exist should go the way of Crypto and the Metaverse.