OpenAI says it has conducted a small test of its latest voice cloning product, Voice Engine, with a handful of select partners. The results show promising applications for the technology, but safety concerns could prevent a wider release.

OpenAI says Voice Engine can clone a human voice from a single 15-second recording. The tool can then "generate natural-sounding speech that closely resembles the original speaker."

Once a voice is cloned, Voice Engine can convert text input into audible speech using "emotional and realistic voices." The tool's power enables exciting applications, but it also raises serious safety concerns.

Promising use cases

OpenAI began testing Voice Engine late last year to see how a small group of select participants could use the technology.

Some examples of how Voice Engine test partners used the product include:

  • Adaptive teaching – Age of Learning used Voice Engine to offer reading assistance to children, create voice-over content for learning materials, and deliver personalized verbal responses when interacting with students.
  • Translating content – HeyGen used Voice Engine for video translation so that product marketing and sales demos could reach a wider market. The translated audio retains the speaker's native accent, so when a native French speaker's voice is translated into English, you still hear their French accent.
  • Providing broader social services – Dimagi trains health workers in remote settings. It used Voice Engine to train workers in underserved languages and provide interactive feedback.
  • Supporting non-verbal people – Livox enables non-verbal people to communicate via alternative communication devices. Voice Engine lets these users choose a voice that best represents them, rather than one that sounds robotic.
  • Helping patients recover their voice – Lifespan piloted a program offering Voice Engine to individuals with speech impairments caused by cancer or neurological conditions.

Voice Engine isn't the first AI voice cloning tool, but the examples in OpenAI's blog post indicate that it represents the state of the art and may even be better than ElevenLabs.

OpenAI's post includes audio samples illustrating the natural tone and emotional quality the tool can produce.

Security concerns

OpenAI said it was impressed by the use cases the test participants developed, but that further safety measures would need to be in place before the company decides "whether and how to deploy this technology at scale."

According to OpenAI, technology that can accurately reproduce a person's voice poses serious risks that are especially pressing in an election year. The fake Biden robocalls and the fake video of Senate candidate Kari Lake are cases in point.

In addition to the clear restrictions in the general usage policies, trial participants were required to obtain "explicit and informed consent from the original speaker" and were not allowed to build products that let individuals create their own voices.

OpenAI says it has implemented additional safety measures, including an audio watermark. It didn't explain exactly how the watermark works, but said it enables "proactive monitoring" of Voice Engine usage.

Other major players in the AI industry are also concerned about this kind of technology entering circulation.

What’s next?

Will the rest of us be able to play around with Voice Engine? It's unlikely, and perhaps that's a good thing. The potential for malicious use is enormous.

OpenAI is already recommending that institutions such as banks phase out voice authentication as a security measure.

Voice Engine embeds an audio watermark, but OpenAI says more work is needed to reliably detect when audiovisual content is AI-generated.

Even if OpenAI decides not to release Voice Engine, others will. The days when you could trust your eyes and ears are over.

This article was originally published at