I have been playing with voice cloning to try to get a sense of where the technology is and how it can be used in instructional design. There are a number of moral and legal issues with voice cloning, of course: how do you know that the voice you are hearing belongs to who you think it does, and that the words are ones that person actually said? But there are some very useful applications of this technology in education, such as cross-lingual voice cloning for dubbing videos in other languages. As Perez et al. put it: “The rapid progress of modern AI tools for automatic speech recognition and machine translation is leading to a progressive cost reduction to produce publishable subtitles for educational videos in multiple languages.” It has also been used by an instructor with ALS who had lost her voice to the illness but got it back virtually through voice cloning. It can also be used to create more natural voices in educational materials that use text to speech (TTS).
But not all of these tools are created equal. In my experiments I used Orson Welles's voice because it is so distinctive (and I am a fan). A voice that distinctive should be low-hanging fruit for AI. To be fair, I am only trying free or freemium tools, so your experiments will certainly vary. I work a lot with adjunct instructors and students, so cost is a factor.
The first one I tried was VoCloner. I was not impressed. It sounds more like Tennessee Williams bumped a little towards the North Atlantic:
The next one, PlayHT, gave much better results:
PlayHT seems to have picked up the cadence of his voice pretty well. PlayHT only lets you clone one voice; if you want to use other voices, they want you to pay. But you can delete a voice and create a new one any time you want. I look forward to more chicanery with this later, maybe for #DS106?
If you have tools that you are using for voice cloning that you think our faculty might find useful, feel free to post a comment below or send me a note. Thanks!