Nvidia unveiled a new artificial intelligence model designed to generate music and audio, with features that allow voice modifications and the creation of novel sounds.
Named Fugatto, short for Foundational Generative Audio Transformer Opus 1, the technology targets producers in the music, film, and video game industries.
While Nvidia, the largest supplier of chips and software for AI systems, showcased the model’s capabilities, the company has no immediate plans to release it publicly. Fugatto joins a growing roster of generative AI tools, including those developed by startups like Runway and major players like Meta Platforms, that generate audio or video from text prompts.
Nvidia’s version stands out with its ability to modify existing audio. For instance, it can transform piano music into vocals or change a recorded voice’s accent and emotional tone. “Generative AI will bring new possibilities to music, video games, and ordinary creators,” said Bryan Catanzaro, Nvidia’s vice president of applied deep learning research.
One of Fugatto’s features lets it create unique sounds, such as a trumpet that mimics a barking dog. Catanzaro compared its potential to the transformative effect of synthesizers on music over the past 50 years.
The announcement comes amid growing tension between tech companies and the entertainment industry. While companies like OpenAI negotiate with Hollywood studios about AI use, disputes over voice imitation, like Scarlett Johansson’s recent claims against OpenAI, have highlighted the challenges of integrating such technology responsibly.
Fugatto was trained on open-source data, but Nvidia remains cautious about its release, citing potential misuse, including generating inappropriate content or infringing copyrights. “Generative technology carries risks,” Catanzaro said, adding that the company is debating how to manage these concerns.
Other AI leaders like OpenAI and Meta have also withheld public releases of their generative audio and video models, reflecting the industry’s broader uncertainty about controlling misuse and safeguarding against copyright violations.