Choosing the right AI voice is key to maximizing audio engagement.

You need a lifelike voice that listeners enjoy listening to and find easy to understand. It should also reflect the nature of your content and brand.

We've brought together 7 tips for choosing an AI voice that truly resonates with your target audience, so you can create automated audio that keeps listeners listening:

  1. Choose the right language and locale
  2. Think about representation
  3. Create demos
  4. Listen to audio in your niche
  5. Support voice actors
  6. Get audience feedback
  7. Consider a custom voice

Keep reading or listening to learn more.

1. Choose the right language and locale

The most important thing is that the AI voice speaks your language. But if your language is spoken in multiple regions, you might also need to consider locale.

Our voice library covers 70 languages across more than 130 locales. For example, you can have your English-language content read in an Australian, British, Canadian, Hong Kong, Indian, Irish, Kenyan, New Zealand, Nigerian, Filipino, Singaporean, South African, Tanzanian, US, or Welsh accent.

We also use different natural language processing algorithms for each locale. This helps ensure that voices pronounce words, dates, and other elements as a native speaker would. For example, en-US voices read '10/01' as "October first", whereas en-GB voices say "the tenth of January".
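
To make the date example concrete, here is a minimal Python sketch of locale-dependent date reading. It is an illustration only, not the BeyondWords implementation: the `spoken_date` function and the `DAY_FIRST` table are hypothetical names, and the sketch assumes that en-US writes the month first while en-GB writes the day first.

```python
from datetime import date

# Assumption for illustration: which locales write the day before the month.
DAY_FIRST = {"en-US": False, "en-GB": True}

MONTHS = [
    "January", "February", "March", "April", "May", "June",
    "July", "August", "September", "October", "November", "December",
]

_UNITS = ["first", "second", "third", "fourth", "fifth",
          "sixth", "seventh", "eighth", "ninth"]
_TEENS = ["tenth", "eleventh", "twelfth", "thirteenth", "fourteenth",
          "fifteenth", "sixteenth", "seventeenth", "eighteenth", "nineteenth"]
# Spoken ordinals for days 1 to 31; index 0 is "first".
ORDINALS = (
    _UNITS + _TEENS + ["twentieth"]
    + [f"twenty-{u}" for u in _UNITS]
    + ["thirtieth", "thirty-first"]
)


def spoken_date(text: str, locale: str) -> str:
    """Expand a numeric 'NN/NN' date into words for the given locale."""
    first, second = (int(part) for part in text.split("/"))
    day_first = DAY_FIRST[locale]
    day, month = (first, second) if day_first else (second, first)
    # Validate the combination; 2000 is a leap year, so 29 February passes.
    date(2000, month, day)
    if day_first:
        return f"the {ORDINALS[day - 1]} of {MONTHS[month - 1]}"
    return f"{MONTHS[month - 1]} {ORDINALS[day - 1]}"


print(spoken_date("10/01", "en-US"))  # October first
print(spoken_date("10/01", "en-GB"))  # the tenth of January
```

The same written text produces two different spoken dates, which is why choosing the locale matters as much as choosing the language.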

2. Think about representation

Biases concerning accent, age, and gender can affect people's perceptions of voices. Listeners make judgements about traits like intelligence and trustworthiness based on accent alone.¹

These judgements often reflect stereotypes held by the general public. For example, one study found that female voice-overs are considered more soothing, whereas male voice-overs are considered more forceful.²

Rather than appealing to these biases, you can make your audio more authentic, engaging, and inclusive by focusing on representation. Millennials and Gen Zs in the UK agree that, as a culture, we’re more open to hearing from diverse voices than ever before.³

  • Represent the author: If you are converting authored content into audio, using the author's voice adds a personal touch. With our voice cloning service, you can create an AI voice that sounds just like you or a writer on your team. The next-best option is to choose an AI voice that best matches the accent, age, and gender of the author.
  • Represent the audience: People can better identify with voices that sound like their own, and tend to find them more trustworthy.⁴ This is particularly important when you're covering in-group topics — for example, women's lifestyle or regional news.

Our AI voice library includes adult male and female voices from 130+ locales. We also offer a number of children's AI voices, which are ideal for making child-friendly audio.

3. Create demos

Listening to voice demos can help you make a subjective call on how natural the AI voice sounds and whether it will suit your content.

When you're choosing AI voices in BeyondWords, you can press the play button alongside each voice to hear a short preview. The voice will typically say "Hello, you are listening to a preview of this voice" (or a translation). You can also listen to these previews via our voice demo tool.

[Image: Creating an extended voice demo in the BeyondWords Text-to-Speech Editor]

After shortlisting your favorite voices, you might want to create extended demos in our Text-to-Speech Editor. You can enter any text and set a different voice for each paragraph, then review the audio output. This makes it easy to experiment and compare different options.

4. Listen to audio in your niche

If you don't know what kind of voice will suit your content, it's helpful to listen to other audio in your niche. Whether it's delivered by a human or AI, this will give you a good idea of what voice types work.

News broadcasting, for example, is associated with a distinct speaking style. This helps reporters to be clear, neutral, and trustworthy. As a result, similar voices are being used for AI-narrated news articles. Amazon Polly has even developed specific 'Newscaster' voices:⁶ Matthew (US English), Joanna (US English), Lupe (US Spanish), and Amy (British English). All four are available through our library.

5. Support voice actors

Most AI voice providers don't pay commission to the voice actors who contribute to their voice models.

We believe that voice actors should be fairly compensated for the long-term value their voice clones provide. That's why the performers behind our exclusive voices earn royalties based on how much their voice is used, as well as an upfront recording fee.

This means that you can use some of the best AI voices available while supporting real voice actors. Our paid users can already get beta access to 'Joe' (male, adult, British English). More exclusive voices, including 'Jodi' (female, adult, US English) and 'Adam' (male, adult, US English), are coming soon.

You can also support voice actors by creating custom voices.

6. Get audience feedback

[Image: A/B testing to compare engagement with AI voices]

Through testing and analytics, you can empirically measure which voices are best at keeping listeners engaged. A/B testing, or split testing, gives the most reliable results:

  1. Create two versions of your audio: one with Voice A, and one with Voice B
  2. Serve version A to half your audience, and version B to the other half
  3. Compare completion rates to see which version engaged listeners best
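
The three steps above can be sketched in Python. This is a minimal illustration under stated assumptions, not a BeyondWords feature or API: `assign_variant` and `compare_completion` are hypothetical helper names, listeners are split by hashing an ID you already hold, and the comparison uses a standard two-proportion z-test. The counts in the usage lines are made up for illustration.

```python
import hashlib
import math


def assign_variant(listener_id: str) -> str:
    """Deterministically place a listener in group A or B (roughly 50/50)."""
    digest = hashlib.sha256(listener_id.encode("utf-8")).digest()
    return "A" if digest[0] % 2 == 0 else "B"


def compare_completion(completed_a: int, plays_a: int,
                       completed_b: int, plays_b: int):
    """Return (rate_a, rate_b, z, p_value) for a two-proportion z-test."""
    rate_a = completed_a / plays_a
    rate_b = completed_b / plays_b
    # Pooled completion rate under the assumption of no real difference.
    pooled = (completed_a + completed_b) / (plays_a + plays_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / plays_a + 1 / plays_b))
    z = (rate_a - rate_b) / std_err if std_err > 0 else 0.0
    # Two-sided p-value from the normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return rate_a, rate_b, z, p_value


# Illustrative counts only: 1,000 plays per voice.
rate_a, rate_b, z, p_value = compare_completion(600, 1000, 500, 1000)
print(f"Voice A: {rate_a:.0%}, Voice B: {rate_b:.0%}, p = {p_value:.2g}")
```

A small p-value suggests the gap in completion rates is unlikely to be chance alone. With only a few plays per version the test has little power, so it is worth letting the experiment run until both groups are reasonably large.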

You can monitor listener retention and other key metrics through BeyondWords Analytics. We also offer Google Analytics and Google Tag Manager integration.

Alternatively, you could ask your audience for subjective feedback. Create samples of your favorite voices and run a poll or survey asking listeners which one they like best.

7. Consider a custom voice

[Image: Voice actor Joe Coen in the recording studio]

If you can't find the AI voice you're looking for, or want to invest in a unique sound, you can create a custom AI voice that perfectly represents your content and brand. This gives you all the advantages of text-to-speech publishing, combined with the engagement power of a particular speaker's voice.

As we have already discussed, if you are creating audio versions of authored content, you may wish to clone the voice of the author(s). Having writing read aloud in the writer's own voice adds a personal touch, which can help to engage listeners and lend your audio more authenticity.

Alternatively, you can follow in the footsteps of publishers like News24 by casting a voice actor. This can result in a higher-quality voice model and help you to achieve strong and consistent audio branding.

When commissioning an AI voice, you will need to create a services agreement with the chosen speaker, which sets out all the terms and conditions concerning the voice clone. We offer a free template, called the Voice Cloning Contract, that covers the commercial and contractual areas likely to be relevant.

Find your favorite AI voice today

The voice provider or platform you choose will determine which AI voices you can use — and how you can use them.

At BeyondWords, every user gets access to an AI voice library featuring 500+ voices from Amazon Polly, Yandex, Microsoft Azure, and Google Cloud. And, thanks to automatic SSML tagging, these AI voices sound better when they're used through BeyondWords.

Paying users also get access to exclusive voices created in collaboration with voice actors, as well as our voice cloning service. Our text-to-speech platform also equips you with the audio production, distribution, monetization, and analytics tools needed to make the most of any voice.

Sign up free today or book a demo with our team.


Sources

  1. (2007) "How Accents Affect Perception of Intelligence, Physical Attractiveness, and Trustworthiness of Middle-Eastern-, Latin-American-, British- and Standard-American-English-Accented Speakers," Intuition: The BYU Undergraduate Journal of Psychology, Vol. 3, Iss. 1, Article 3.
  2. Consumers Hear Differences in Male, Female Voices, Marketing Charts, published March 2010
  3. Culture Next: 2021, Spotify, published September 2021
  4. Bestelmeyer, P. E., Belin, P., & Ladd, D. R. (2015). A Neural Marker for Social Bias Toward In-group Accents. Cerebral Cortex, 25(10), 3953–3961. https://doi.org/10.1093/cercor/bhu282
  5. Medill Study Finds Preference for Female Voices and Local Accents, Northwestern, published March 2020
  6. NTTS Newscaster Speaking Style, AWS, accessed May 2022