Google’s AI music generator is like ChatGPT for audio

It can write 5-minute songs based on short text prompts.

Google has unveiled an advanced AI music generator that can turn a snippet of text into a song — but legal concerns might prevent the tech giant from ever sharing it with the public.

The AI revolution: ChatGPT, DALL-E 2, and other advanced AIs capable of generating impressive text or images in response to user prompts exploded in popularity in 2022, but they weren’t the first generative AIs, nor are they the only examples of what neural networks can do.

Several companies have also trained AIs to generate music in response to text, audio, or image prompts — OpenAI, the research firm behind ChatGPT and DALL-E 2, even released an AI music generator called “Jukebox” back in 2020.

These systems haven’t been as enthusiastically embraced as their text- and image-generating counterparts, though, mainly because their outputs aren’t as impressive — most are low-fidelity, simplistic, and lacking in traditional song structures, such as repeating choruses.

What’s new? Music-making AIs are getting better, and perhaps the most impressive example of the technology is MusicLM, an AI music generator Google unveiled in January 2023.

The system can generate clips up to 5 minutes long based on text descriptions, and while the music isn’t going to win any Grammys, the audio does sound more like something a human might record than the clips generated by other AIs.

How it works: Google trained MusicLM on a dataset of more than 280,000 hours of music. To tie that audio to text, the system relies on MuLan, a model trained to link music to descriptions written in natural language.

The researchers then created MusicCaps, a publicly accessible dataset of more than 5,500 music clips, to evaluate the AI music generator. Expert musicians wrote a caption for each of these clips, as well as a list of aspects describing it, such as its genre or mood.

During the evaluation stage, Google pitted MusicLM against two other text-to-music AIs — Mubert and Riffusion — using several quantitative metrics for assessing a clip’s audio quality and adherence to a text description. 

They also presented human evaluators with MusicCaps’ descriptions and two audio clips — these might be two clips produced by AIs or one AI-generated clip and the music upon which the MusicCaps description was based. The evaluators then chose which of the clips they thought best matched the description. 
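
As a rough illustration of how pairwise judgments like these can be summarized, the sketch below tallies how often each system's clip was picked when it appeared in a comparison. The tuple format, the function name `tally_wins`, and the sample judgments are all assumptions made for this example. They are not taken from Google's paper, and the paper's actual scoring may differ.

```python
from collections import Counter


def tally_wins(judgments):
    """Return each system's win rate across pairwise comparisons.

    `judgments` is a list of (system_a, system_b, winner) tuples. This
    format is hypothetical and chosen only for illustration.
    """
    wins = Counter()
    appearances = Counter()
    for system_a, system_b, winner in judgments:
        if winner not in (system_a, system_b):
            raise ValueError(f"winner {winner!r} was not one of the pair")
        appearances[system_a] += 1
        appearances[system_b] += 1
        wins[winner] += 1
    # Win rate = comparisons won / comparisons the system took part in.
    return {name: wins[name] / appearances[name] for name in appearances}


# Made-up judgments, not data from the paper. "reference" stands for the
# original music a MusicCaps description was written about.
example = [
    ("MusicLM", "Mubert", "MusicLM"),
    ("MusicLM", "Riffusion", "MusicLM"),
    ("Mubert", "Riffusion", "Riffusion"),
    ("MusicLM", "reference", "reference"),
]
rates = tally_wins(example)
```

A higher win rate means evaluators more often judged that system's clip to be the better match for the description.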

According to a paper by Agostinelli et al. that Google shared on the preprint server arXiv, MusicLM outperformed the other AIs across the board.

Looking ahead: Google’s AI music generator may be able to produce audio that sounds closer to human-written music, but it still can’t replicate traditional song structures, and the vocals it creates are of particularly poor quality, with unintelligible lyrics.

Google says future work on the system could focus on those issues, on improving the overall quality of the audio, and on addressing the problem that’s preventing it from releasing MusicLM to the public: about 1% of its output can be approximately matched to audio in its training data.

“We acknowledge the risk of potential misappropriation of creative content associated to the use case … We strongly emphasize the need for more future work in tackling these risks associated to music generation,” the researchers wrote.
