Nvidia reveals AI model to modify voices and sounds

The announcement comes amid growing tension between tech companies and the entertainment industry

Nvidia unveiled a new artificial intelligence model designed to generate music and audio, with features that allow voice modifications and the creation of novel sounds.

Named Fugatto, short for Foundational Generative Audio Transformer Opus 1, the technology targets producers in the music, film, and video game industries.

Nvidia, the largest supplier of chips and software for AI systems, showcased the model’s capabilities but has no immediate plans to release it publicly. Fugatto joins a growing roster of generative AI tools that produce audio or video from text prompts, including those developed by startups like Runway and major players like Meta Platforms.

Nvidia’s version stands out with its ability to modify existing audio. For instance, it can transform piano music into vocals or change a recorded voice’s accent and emotional tone. “Generative AI will bring new possibilities to music, video games, and ordinary creators,” said Bryan Catanzaro, Nvidia’s vice president of applied deep learning research.

Fugatto can also create sounds that have not existed before, such as a trumpet that barks like a dog. Catanzaro compared its potential to the transformative effect synthesizers have had on music over the past 50 years.

The announcement comes amid growing tension between tech companies and the entertainment industry. While companies like OpenAI negotiate with Hollywood studios about AI use, disputes over voice imitation, like Scarlett Johansson’s recent claims against OpenAI, have highlighted the challenges of integrating such technology responsibly.

Fugatto was trained on open-source data, but Nvidia remains cautious about its release, citing potential misuse, including generating inappropriate content or infringing copyrights. “Generative technology carries risks,” Catanzaro said, adding that the company is debating how to manage these concerns.

Other AI leaders like OpenAI and Meta have also withheld public releases of their generative audio and video models, reflecting the industry’s broader uncertainty about controlling misuse and safeguarding against copyright violations.
