First, OpenAI introduced a tool that let people create digital images simply by describing what they wanted to see. Then it developed similar technology capable of generating full-motion videos reminiscent of Hollywood productions. Now it has unveiled technology capable of replicating someone's voice. The renowned artificial intelligence startup announced on Friday that a select group of businesses was testing a new OpenAI system called Voice Engine, designed to replicate a person's voice from a 15-second recording.


When a user uploads a recording of themselves along with a paragraph of text, the system reads the text aloud in a synthetic voice that closely resembles the user's own. Notably, the text does not need to be in the user's native language. For instance, an English speaker can have their voice replicated in Spanish, French, Chinese, or various other languages.


What Danger Does Voice Engine Pose?


OpenAI is cautious about distributing this technology widely because it is still working to understand the potential risks. Like image and video generators, a voice generator could contribute to the spread of misinformation on social media platforms. It could also enable criminals to impersonate individuals online or during phone conversations.


The company expressed particular concern about the possibility of this technology being utilised to bypass voice authenticators that safeguard access to online banking accounts and other personal applications.


“This is a sensitive thing, and it is important to get it right,” Jeff Harris, a product manager at OpenAI, said in an interview, as reported by the New York Times.


The company is exploring methods of adding watermarks to synthetic voices or implementing controls to prevent individuals from utilising the technology with the voices of politicians or other notable figures.


OpenAI is among several companies that have developed advanced AI technology capable of swiftly generating synthetic voices. These companies include tech giants like Google and startups like New York-based ElevenLabs.


Since last year, OpenAI has utilised its technology to enable a version of ChatGPT that can speak. Additionally, it has long provided businesses with various voices for similar applications, all created from recordings provided by voice actors.


However, the company has yet to release a public tool that allows individuals and businesses to replicate voices from short clips as Voice Engine does. According to Harris, the ability to replicate any voice in this manner is what poses a risk. He emphasised that the technology could be particularly hazardous in an election year.