Enhancing AI-generated audio articles with pronunciation rules

Maria Lazareva 30.Apr.2024

AI voices have advanced rapidly over the last few years and are now almost indistinguishable from human voices. However, nuances in text can still pose challenges, particularly in news reporting, where publishers need automation tools to match the speed of the news cycle, and reader expectations for audio quality are high.

At BeyondWords, all content is preprocessed before speech synthesis to make pronunciations as accurate as possible. This text preprocessing enables our AI voices to recognise number ranges, decimal points, shortened words and more.
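To make the idea of text preprocessing concrete, here is a minimal sketch in Python of how number ranges, decimal points and shortened words could be expanded into speakable text. It is an illustration only, not the BeyondWords implementation: the `normalize` function and the `ABBREVIATIONS` table are hypothetical names, and a production system would need to handle many more cases (dates, currencies, capitalised abbreviations and so on).

```python
import re

# Hypothetical abbreviation table, assumed for illustration only.
ABBREVIATIONS = {"approx.": "approximately", "vs.": "versus"}


def normalize(text: str) -> str:
    """Expand a few non-standard tokens into words a voice can read."""
    # Shortened words: swap each known abbreviation for its full form.
    for short, full in ABBREVIATIONS.items():
        text = text.replace(short, full)
    # Number ranges: "10-20" or "10–20" becomes "10 to 20".
    text = re.sub(r"(\d+)\s*[-–]\s*(\d+)", r"\1 to \2", text)
    # Decimal points: "3.5" becomes "3 point 5".
    text = re.sub(r"(\d+)\.(\d+)", r"\1 point \2", text)
    return text


print(normalize("Inflation rose approx. 3.5% over 10-20 months."))
# prints "Inflation rose approximately 3 point 5% over 10 to 20 months."
```

Even this small sketch shows why the work is best done once, upstream of the voice: the range pattern would also rewrite a date such as 2024-04-30, so real preprocessing has to weigh context rather than apply patterns blindly.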

For example, compare the two audio clips below, generated without and with BeyondWords text preprocessing:

Audio generated without BeyondWords

Audio generated with BeyondWords

In some cases, however, publishers require further customization, which is possible with pronunciation rules.

Pronunciation rules enable smooth narration of other non-standard words, such as acronyms, figures in data-based financial analysis, sports scores, and technical terms and jargon in specialized fields. They’re also helpful for new or unusual words that often enter the news cycle.

Below are examples where pronunciation rules with BeyondWords can be applied to nuanced text:

Stock symbols in financial reporting, like FTSE 100
Uncommon names of people or places, like American rapper A$AP Rocky or the Welsh town of Cwmystwyth
Political terminology, like Sen. Tim Scott, R-S.C., in the United States
Investment and financial reporting with references to codes like 401(k)

Setting pronunciation rules allows publishers to achieve consistency, ensuring that words are spoken uniformly across different contexts. Rules also help to create scalable and precise AI-generated audio articles over time, eliminating the need for manual editing.

Using pronunciation rules, anyone in your organization can create and manage pronunciations without complicated SSML or relying on developer assistance.

You can easily create and manage pronunciations through the BeyondWords API or the dashboard. Rules can be applied across the entire organization, within a specific project, or to individual articles. Using rules, you can choose to substitute one word for another, dictate whether an acronym is pronounced as a complete word or as individual letters, or specify exact pronunciations using the International Phonetic Alphabet (IPA).
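For readers curious about what these rule types spare them from writing by hand, the sketch below maps each one to the standard SSML markup it roughly corresponds to. This is an assumption-laden illustration, not a description of how BeyondWords stores or applies rules: the `Rule` class, the kind names ("substitute", "word", "letters", "ipa") and the `apply_rules` function are all hypothetical. The `<sub>`, `<say-as>` and `<phoneme>` elements come from the W3C SSML specification, and `interpret-as="characters"` is a commonly supported value rather than one the specification itself defines.

```python
import re
from dataclasses import dataclass
from xml.sax.saxutils import escape, quoteattr


@dataclass
class Rule:
    term: str        # text to match in the article
    kind: str        # hypothetical names: "substitute", "word", "letters", "ipa"
    value: str = ""  # replacement text or IPA string, when the kind needs one


def to_ssml(rule: Rule) -> str:
    """Return the SSML fragment that a rule roughly corresponds to."""
    term = escape(rule.term)
    if rule.kind == "substitute":
        return f"<sub alias={quoteattr(rule.value)}>{term}</sub>"
    if rule.kind == "letters":
        return f'<say-as interpret-as="characters">{term}</say-as>'
    if rule.kind == "ipa":
        return f'<phoneme alphabet="ipa" ph={quoteattr(rule.value)}>{term}</phoneme>'
    if rule.kind == "word":
        return term  # read as an ordinary word, so no markup is needed
    raise ValueError(f"unknown rule kind: {rule.kind}")


def apply_rules(text: str, rules: list[Rule]) -> str:
    """Wrap every matched term in markup and escape the remaining text."""
    if not rules:
        return f"<speak>{escape(text)}</speak>"
    by_term = {rule.term: rule for rule in rules}
    # Longest terms first, so "FTSE 100" is matched before a shorter "FTSE".
    terms = sorted(by_term, key=len, reverse=True)
    pattern = "(" + "|".join(re.escape(t) for t in terms) + ")"
    pieces = re.split(pattern, text)
    body = "".join(
        to_ssml(by_term[p]) if p in by_term else escape(p) for p in pieces
    )
    return f"<speak>{body}</speak>"


rules = [
    Rule("FTSE 100", "substitute", "footsie one hundred"),
    Rule("401(k)", "substitute", "four oh one k"),
    Rule("BBC", "letters"),
]
print(apply_rules("The FTSE 100 fell as 401(k) savers sold.", rules))
```

Matching all terms in a single pass, rather than replacing them one after another, keeps a later rule from rewriting markup that an earlier rule has already inserted.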

Customizing pronunciation

You can create custom pronunciations from the Settings section of your projects in the Rules tab, or using the API.

You can choose between the following pronunciation customizations: “Substitute”, “Say as a word”, “Say as letter sequence”, “Say in a specific language” or “Custom pronunciations”. Watch the demo below:

Customizing pronunciations with BeyondWords

Check out our Docs and Guides for a step-by-step walkthrough of pronunciation customization. Our AI voice and audio publishing platform is purpose-built for content publishers.

To learn more about how BeyondWords can help you, schedule a demo with our team or create an account today.

Next: See phonemic transcription guidelines for Afrikaans, Danish, English [AU/GB/NZ/ZA], English [CA/US], German [AT/CH/DE], Mandarin [Pinyin], Norwegian and Swedish.

Note: (1) New rules are only applied to content generated after the rule is created. To apply rules to older content, you will need to regenerate it. (2) You should not add rules for numbers on their own unless the rule is scoped to a single audio. Issues with numbers should be reported to us through the Voice Issue tab. (3) The Custom Pronunciation rule type is currently limited to voice clones, but we are working on making it available for our standard voices in the future.