Brands having unique, consistent, high-quality synthetic voices across their products and communications might sound like science fiction. But John Campbell, managing director of voice experience agency Rabbit & Pork (part of the Tipi Group), tells us that they’re here already – and we’re only at the start of a synthetic audio boom. For The Drum’s Audio Deep Dive, he explains how brands can jump in.
What do Top Gun: Maverick, The Andy Warhol Diaries and The Book of Boba Fett have in common?
They all feature synthetic voices: Val Kilmer’s in Top Gun: Maverick, Andy Warhol’s in The Andy Warhol Diaries and a younger-sounding Mark Hamill’s in The Book of Boba Fett.
And artificial intelligence (AI)-powered synthetic voices are not reserved for the television and film industry, or viral videos of ‘fake Tom Cruise.’ Many brands are now creating their own synthetic voices to represent them. Why? Just look at the smart speaker in your living room.
The synthetic advantage
With a synthetic voice, you can programmatically generate spoken audio on the fly. Rather than pre-recording each line of text in a recording studio, you can simply request the audio to be generated in a fraction of a second.
To start, you need to provide the system with ‘training data’ – usually recordings of a voice actor reading a selection of lines.
There are still certain scenarios where a voiceover artist would be the best option. For example, we recently created an Alexa Skill for Sky Bet where Sky Sports presenter Jeff Stelling provided the voiceover.
However, if we wanted to have a match preview containing more dynamic text, written on a weekly basis, a synthetic voice would be a better option. Rather than stitching together a generic ‘Manchester United will play Liverpool’ from a cache of recordings, or paying for expensive weekly recording sessions, we can have a synthetic voice give us: ‘Manchester United face Liverpool this bank holiday at Old Trafford. Manager Erik ten Hag...’
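To make that concrete, here is a minimal Python sketch of how the text for such a preview might be assembled each week before it is handed to a synthetic voice. It is an illustration under stated assumptions, not a description of how the Sky Bet skill works: the `fixture` fields and the function names are hypothetical, and the only standard involved is SSML, the W3C speech markup that many text-to-speech services accept.

```python
from xml.sax.saxutils import escape


def build_preview(fixture: dict) -> str:
    """Turn this week's fixture data into a sentence for the voice to read.

    The field names (home, away, when, venue) are hypothetical.
    """
    return (
        f"{fixture['home']} face {fixture['away']} "
        f"this {fixture['when']} at {fixture['venue']}."
    )


def to_ssml(text: str) -> str:
    """Wrap plain text in an SSML <speak> element, escaping XML characters."""
    return f"<speak>{escape(text)}</speak>"


# Example data for one week; in practice this would come from a fixtures feed.
fixture = {
    "home": "Manchester United",
    "away": "Liverpool",
    "when": "bank holiday",
    "venue": "Old Trafford",
}

script = build_preview(fixture)
ssml = to_ssml(script)
print(ssml)
```

The resulting string would then be sent to whichever speech service hosts the brand’s voice. That step is left out here because it depends entirely on the provider.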
The coming synthetic audio boom
Synthetic voice technology has been around for years. But in the last five years, the cost and time needed to create a synthetic voice have fallen drastically, and access to the technology has become much easier. Synthetic voices can even be created at home by uploading audio clips to a website for training.
There’s also been a boom in places where brands can deploy synthetic voice assets: call centers, point-of-sale machines, radio ads, YouTube clips, narration on website copy and conversational experiences.
The key for brands is making sure that the voice is consistent, unique and in keeping with the brand’s visual aesthetic.
There are three tested strategies for doing this well:
1. Use an existing voice
If your brand has an existing voiceover artist or celebrity who is already known as the brand’s voice, this is often the easiest option. For example, KFC has created a synthetic voice of its famous founder, Colonel Sanders, who has already been the voice of the brand for years.
2. Create a voice
Use a voice actor to create a new, unique voice for your brand. The BBC recently created a voice for its website to read out articles. It was ‘mixed’ in such a way that you can’t pin the voice to an exact accent or location in the UK.
3. Borrow a voice
Ask a celebrity or well-known person to lend their voice. This can be a risky strategy if that celebrity were to go on to be involved in controversy. But with that voice come the celebrity’s popularity and the audience’s pre-existing perceptions.
With each of these options, there are legal and ethical questions that need to be answered. What is the remit of where the voice can be used? Are there things that a celebrity wouldn’t want their voice to say? These questions may steer brands away from using a celebrity.
How can voice search and synthetic voice be combined?
Smart speakers are a massive growth area for synthetic voices. Many of the top brands have created Alexa Skills (voice apps), where the brand controls the conversational experience. Here many brands must use built-in Alexa voices to add a voiceover to their app. But if a brand has its own synthetic voice, the experience becomes much more customized to that brand identity.
Beyond voice apps, voice search is a key consideration on smart speakers. It’s not far-fetched to see a future where Alexa and Google could use different brands’ synthetic voices to answer questions, rather than the default assistant voice. For example, Alexa could respond to ‘how many calories are in a can of Coke?’ with ‘according to Coca-Cola, there are 123...’ in a voice unique to that brand.
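One way a skill can speak in a brand’s own voice today is to play audio generated in that voice instead of letting Alexa read the text. The Python sketch below builds a skill response that does this using the SSML `audio` tag. Treat it as an assumption-laden illustration: the function name and URL are hypothetical, and the clip would need to be hosted over HTTPS in a format Alexa accepts.

```python
import json
from xml.sax.saxutils import quoteattr


def brand_voice_response(audio_url: str) -> dict:
    """Build an Alexa skill response that plays a brand-voice clip.

    audio_url is a hypothetical location of audio already generated
    with the brand's synthetic voice.
    """
    # quoteattr adds the surrounding quotes and escapes XML characters.
    ssml = f"<speak><audio src={quoteattr(audio_url)}/></speak>"
    return {
        "version": "1.0",
        "response": {
            "outputSpeech": {"type": "SSML", "ssml": ssml},
            "shouldEndSession": True,
        },
    }


response = brand_voice_response("https://example.com/brand-voice/welcome.mp3")
print(json.dumps(response, indent=2))
```

The trade-off is that the audio has to exist before Alexa can play it, so a skill would typically generate and host each clip with the brand’s voice provider before returning the response.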
As the use of smart speakers grows outside the home, with brands beginning to use voice within marketing operations, the use of synthetic voices is only going to accelerate.