We’ve been fascinated by the advances AI is enabling in language translation – partly because we implement it on shows with a global audience, and partly because, well, it’s fun to mess with new tech. It’s part of the job!
Foreign-language films are always a bit torturous to watch, no matter how great they are: either the dubbing is off-putting and detracts from your enjoyment of the story, or your eyes are trying to keep up with the subtitles and the pictures at the same time, and failing miserably – especially when they’re out of sync. Well, thanks to recent advances in AI, you will soon be able to watch any foreign content in your native tongue, and it’ll feel like that precious dialogue is actually coming out of the actors’ mouths.
In 2022, the movie “Fall” made headlines when the team at Flawless used AI tools to replace more than 30 F-bombs in the film with PG-13 language. They created a 3D deepfake model of each actor’s face so they could reanimate the mouths to match the new words, and the results are seamless. In January 2023, Flawless revealed it was using the same AI technology to redub “Fall” into other languages. In this translation process, not only is the video re-lip-synced, but the AI also carries the original speaker’s vocal tone into the new language. The results are, again, seamless.
Now, in 2024, we have access to powerful tools that were previously the preserve of major studios, and that is opening up many new avenues for content creators. But can we get the same level of quality as a high-end movie production? To find out, I uploaded a 60-second clip of Barack Obama speaking in English to HeyGen Labs AI translator (one of the many tools available to the public) and selected my target language: Japanese. A few minutes later I downloaded my redubbed video file, and the results speak for themselves…
There it is: Obama speaking Japanese in his own tone of voice, as if he had spoken it his whole life. And this was generated from the original video file alone – I supplied no extra images of Obama’s face to build the 3D face replacement, and no additional audio recordings for the AI to recreate his voice. Just upload the video and the AI does the work.
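To make the simplicity of that workflow concrete, here is a minimal sketch of what the upload-and-translate step boils down to. Everything in it is hypothetical for illustration – the `build_translation_job` helper, the field names, and the placeholder endpoint are my assumptions, not HeyGen’s actual API:

```python
# Hypothetical sketch of an AI-dubbing request -- NOT HeyGen's real API.
# The endpoint, helper, and field names below are illustrative assumptions.
import json

API_BASE = "https://api.example-dubbing.com/v1"  # placeholder endpoint

def build_translation_job(video_url: str, target_language: str) -> dict:
    """Assemble a translation-job payload: source video plus target language.

    Note what is absent: no face images and no voice samples, because the
    tool infers the speaker's face and voice from the video itself.
    """
    return {
        "video_url": video_url,
        "target_language": target_language,
        "lip_sync": True,        # reanimate the mouth to match the new audio
        "preserve_voice": True,  # keep the original speaker's vocal tone
    }

# Example: a 60-second clip translated into Japanese.
job = build_translation_job("https://example.com/obama_clip.mp4", "ja")
print(json.dumps(job, indent=2))
```

The point of the sketch is how little the creator has to supply: one video URL and one language code, with everything else inferred by the model.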
For content creators, this tech is still in its infancy, with rough edges and limitations. But as AI dubbing improves, how can we expect these tools to change the way we make content? If a video can be shared in every language, what does that mean for international distribution?
I’d be interested to hear others’ thoughts on this: what difference do you think this particular advance will make to the video landscape in the future? And has anyone got a live example where they’ve implemented it?