Nvidia reveals AI model to modify voices and sounds

The announcement comes amid growing tension between tech companies and the entertainment industry

Nvidia unveiled a new artificial intelligence model designed to generate music and audio, with features that allow voice modifications and the creation of novel sounds.

Named Fugatto, short for Foundational Generative Audio Transformer Opus 1, the technology targets producers in the music, film, and video game industries.

Nvidia, the largest supplier of chips and software for AI systems, showcased the model’s capabilities but has no immediate plans to release it publicly. Fugatto joins a growing roster of generative AI tools, including those developed by startups like Runway and major players like Meta Platforms, that generate audio or video from text prompts.

Nvidia’s version stands out with its ability to modify existing audio. For instance, it can transform piano music into vocals or change a recorded voice’s accent and emotional tone. “Generative AI will bring new possibilities to music, video games, and ordinary creators,” said Bryan Catanzaro, Nvidia’s vice president of applied deep learning research.

One of Fugatto’s features allows it to create unique sounds, like making a trumpet mimic a barking dog. Catanzaro compared its potential to the transformative effect of synthesizers on music over the past 50 years.

The announcement comes amid growing tension between tech companies and the entertainment industry. While companies like OpenAI negotiate with Hollywood studios about AI use, disputes over voice imitation, such as Scarlett Johansson’s recent claims against OpenAI, have highlighted the challenges of integrating such technology responsibly.

Fugatto was trained on open-source data, but Nvidia remains cautious about its release, citing potential misuse, including generating inappropriate content or infringing copyrights. “Generative technology carries risks,” Catanzaro said, adding that the company is debating how to manage these concerns.

Other AI leaders like OpenAI and Meta have also withheld public releases of their generative audio and video models, reflecting the industry’s broader uncertainty about controlling misuse and safeguarding against copyright violations.
