A 6-minute video demo of ElevenLabs:
ElevenLabs, a generative AI voice service, lets you clone any voice and create AI-generated audio with a few clicks. I put it to the test. Here is how it works.
How does it work
You can choose from many available artificial voices or upload a voice sample to create your own custom voice. For example, a 30-second recording of you reading an NYTimes article will do, as will any audio file you have of another person.
Instant Voice Cloning in ElevenLabs
After uploading the voice sample, you can generate an audio file in your cloned voice by providing text.
I have used the service multiple times, mainly to showcase its capabilities in workshops on AI.
Now, I wanted to find out what my voice would sound like if I created an AI-cloned version of it.
I recorded my voice on the site, reading a description of a talk I am giving at AI Con Las Vegas in June.
You can upload an audio file you recorded locally or record directly on the site.
Generating Voice Audio in ElevenLabs
Either way, you can instantly use the voice to generate audio files of you saying anything.
I wanted to find out how much better or worse the purely AI-generated version sounds.
For an exact comparison, I used the same paragraph of text and generated it as an English voice using ElevenLabs.
Here are the results:
Compare Voices
Natural Voice uploaded to ElevenLabs
I am reading a few sentences in English.
The AI-generated audio
Based on my 30-second upload, with the same text input.
ElevenLabs turns me into a native speaker. But does it still sound like me? No, because my accent is obliterated. If you want to sound like a native English speaker, ElevenLabs works perfectly for that.
But a key feature is certainly the option to speak in your own voice in 27 languages. Let me say the same text in German.
AI-generated voice in German
Here, I sound like someone from northern Germany. I could use that voice to appeal to people from there, but Austrians would not relate to it.
Tuning your AI voice
ElevenLabs has settings to adjust the voice's dynamics, its speed, and how closely it sticks to the original voice.
Voice settings in ElevenLabs:
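If you would rather script this than click through the web interface, the same kind of tuning can be sketched against the ElevenLabs REST API. This is a minimal sketch under assumptions, not something I tested for this post: the endpoint path, the `xi-api-key` header, the `eleven_multilingual_v2` model ID, and the `stability` and `similarity_boost` settings follow the public API documentation as I understand it, and mapping them to "voice dynamic" and "how much it should stick to the original voice" is my own reading. The helper names `build_tts_request` and `synthesize` are hypothetical, and the speed setting is left out.

```python
import json
import urllib.request

# Assumed public REST base URL; check the current API docs before relying on it.
API_BASE = "https://api.elevenlabs.io/v1"


def build_tts_request(api_key, voice_id, text, stability=0.5,
                      similarity_boost=0.75,
                      model_id="eleven_multilingual_v2"):
    """Build (but do not send) a text-to-speech request for a cloned voice.

    Assumption: stability and similarity_boost are values between 0 and 1.
    """
    settings = {"stability": stability, "similarity_boost": similarity_boost}
    for name, value in settings.items():
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1, got {value}")

    payload = {
        "text": text,
        "model_id": model_id,
        "voice_settings": settings,
    }
    return urllib.request.Request(
        url=f"{API_BASE}/text-to-speech/{voice_id}",
        data=json.dumps(payload).encode("utf-8"),
        headers={"xi-api-key": api_key, "Content-Type": "application/json"},
        method="POST",
    )


def synthesize(request, out_path):
    """Send the request and write the returned audio bytes to out_path."""
    with urllib.request.urlopen(request) as response:
        audio = response.read()
    with open(out_path, "wb") as audio_file:
        audio_file.write(audio)
    return len(audio)


if __name__ == "__main__":
    # Placeholder credentials: replace with your own key and voice ID.
    req = build_tts_request("YOUR_API_KEY", "YOUR_VOICE_ID",
                            "Hallo, das ist meine geklonte Stimme.",
                            stability=0.3, similarity_boost=0.9)
    print(req.full_url)
    # synthesize(req, "output.mp3")  # uncomment to call the API for real
```

Building the request separately from sending it makes it easy to inspect the settings before spending any of your character quota.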
In the “Creator” plan of ElevenLabs, “Professional Voice Cloning” is listed as a feature. From the help page:
Professional Voice Cloning (PVC), unlike Instant Voice Cloning (IVC) which lets you clone voices with very short samples nearly instantaneously, allows you to train a hyper-realistic model of a voice. This is achieved by training a dedicated model on a large set of voice data to produce a model that’s indistinguishable from the original voice.
The difference is that you can’t use this voice instantly. ElevenLabs says it can take up to 4 weeks before your voice is ready to use when using this feature. I might update this post with audio samples generated with this feature. I recommend signing up for the newsletter to get my new content.
Harmful use of the service
The dangers of such a service are demonstrated in this fake campaign example.
With the click of a checkbox, you confirm that you are the rightful owner of the voice. You can upload a voice sample of Joe Biden, Arnold Schwarzenegger, or anyone else you like. So far, ElevenLabs has not flagged any of the famous voices I tried to upload. I think detecting them is a feature it needs to prevent misuse, which will certainly happen.
Use cases
Overall, ElevenLabs is an excellent AI app that enables creators of all kinds to do all sorts of things with audio:
- Create podcast ads
- Turn your written content into audio
- Translate your content into 28 languages
- Create audiobooks
- Dub your videos
and much more.
Conclusion
I am surprised at how well the service works and how it can be used to scale audio content production.
Working a lot with audio? Check out ElevenLabs; it might be the right tool to speed up some of your workflows.