When podcast networks began using AI to translate shows into different languages last year, media buyers were skeptical. Voices ranged from awkwardly robotic to wildly inaccurate.
The tech has come a long way since then, which is why iHeartMedia chose to take its time. More than a year after iHeartMedia execs told Digiday they planned to debut a handful of translated podcasts to grow their international audience and advertising business, those shows are finally seeing the light of day.
The audio media giant has rolled out AI-translated versions of 10 of its popular podcast shows, in six different languages – and in the original hosts’ own voices, thanks to voice-cloning technology.
“We really wanted to make sure we didn’t launch until the technology was good and we felt really good about putting these shows out in the world and doing it for a range of shows,” said Will Pearson, president of iHeartMedia’s podcast arm. “The technology has just improved so much in the last 9 to 12 months.”
And of course, media buyers need to be convinced too.
The AI translations faced two main challenges: ensuring accuracy and preserving each show’s unique tone, Pearson said. The shows are not re-recorded by native speakers. iHeartMedia worked with AI audio and video company Speechlab to clone the voices of its podcast hosts and use them to read the show’s transcript in multiple languages. This allows listeners in international markets to enjoy these shows in their native languages – with the original podcast hosts’ voices.
iHeartMedia tested multiple AI voice companies before partnering with Speechlab, and secured buy-in from podcast hosts to launch the translated shows. That was the easy part, according to Pearson.
The harder part was getting the technology and process up to par. Speechlab cloned these hosts’ voices from snippets of their shows, and refined the audio to translate it into different languages while preserving the original tone and personality. These audio clips were shared with the podcast teams at iHeartMedia, who went back and forth with the shows’ teams, the technology team and human teams of native speakers to review the AI-generated audio snippets.
The podcast network would share feedback with Speechlab’s team. For example, one host who has a slight accent had some words mistranslated. iHeartMedia also tested out the translations with groups of listeners who were native speakers.
Once this process was ironed out, it took “a matter of days” to translate a wider swath of podcast episodes, according to Pearson. About 15 episodes per show will roll out over the course of the next several weeks.
“The technology has gotten to the point where it’s really coming back very strong, even on the first pass,” he said. Podcast hosts vetted the AI-translated episodes, he noted.
“If you listened to one of these [shows] 12 to 18 months ago, it might have been accurate but not really in the sense of what a translator would have done. To make something feel conversational, it shouldn’t be one-to-one. There’s a nuance of language… you want the conversational flow to feel right – more for the chat shows than the narrative style shows. When Jay Shetty delivers his content, he has his own delivery,” Pearson said. “We want to make sure it captures that essence as much as possible.”
About a quarter of iHeartMedia’s monthly podcast downloads come from outside of the U.S., according to Pearson. “That’s not an insignificant amount of listenership,” Pearson said. “We do sell internationally. This is the early stages of getting new shows out in the world and getting to monetize from there.” iHeartMedia had about 555.6 million global podcast streams and downloads in May 2025, according to Podtrac data.
The AI audio translation technology isn’t replacing human translators at iHeartMedia. Pearson said it would have been “cost-prohibitive” to translate these podcast shows with translation editors. Pearson declined to share how much it cost to get these AI-translated versions of the podcast shows up and running. iHeartMedia worked with about two dozen internal and external people to help with the tech, edits and translations, he noted.
Kristen Coseo, director of podcast and digital audio strategy at Ocean Media, is skeptical of the current state of the AI technology, but lauded the iHeartMedia initiative.
“The ability to deliver content in language while preserving the host’s voice could maintain brand consistency and enhance listener connection, making it a smart strategy,” she said. “While AI voice cloning has significantly improved, there’s still a risk of sounding robotic or missing emotional nuance. If the quality is high, this could be a game-changer for reaching new markets.”
Coseo said she’d consider buying ads in the translated podcast episodes if there was data showing strong listener engagement and retention in international markets. “The opportunity to target diverse audiences with tailored content is compelling, but it hinges on the translations being seamless and culturally resonant,” Coseo said. “Authenticity is critical in podcasting.”
Another agency media buyer, who asked to speak anonymously because they hadn’t heard the AI-translated episodes, said that while the idea and concept are good, they would need to consider the authenticity of the show’s emotion, context and rhythm before they bought ads in AI-translated podcasts. “Just because the content can be translated via AI doesn’t mean it should be,” they said. “We all know that word-for-word things don’t translate exactly. So, I’d be cautious and want to bring in someone that speaks the language to make sure it sounds the way it’s supposed to, both from a content and advertising perspective.”
Media companies are finding more ways to use AI audio technology. Audio companies like Spotify and PodcastOne are experimenting with AI-translated podcasts as well. News publishers are also increasingly using AI to create audio products. Time and Business Insider launched AI-generated audio news briefings this month.
iHeartMedia has disclaimers in its episodes and show descriptions that the podcasts are translated using AI technology. The episodes will be published weekly. iHeartMedia will analyze which podcasts do well in different markets, and evolve the strategy from there, Pearson said.
“I don’t know how these things will perform. But we do feel pretty bullish on the growth of [this]. This is cost-effective enough that this is worth the investment to test this out,” he said.
The translated shows include “On Purpose with Jay Shetty”, “Revisionist History with Malcolm Gladwell”, “Stuff You Missed in History Class”, “Stuff They Don’t Want You to Know”, “Before Breakfast with Laura Vanderkam”, “How to Money”, “Stuff to Blow Your Mind”, “Betrayal”, “The Girlfriends” and “Murder 101”. The shows will first come out in Spanish, then French, Arabic, Portuguese, Hindi and Mandarin, with plans to expand to even more shows and languages in the future.