Instagram has added a text-to-speech feature to Reels, allowing creators to convert video captions into audio.
The move appears to be an effort to keep up with TikTok, which launched text-to-speech in December last year.
TikTok’s AI voices are popular (the #texttospeech hashtag alone currently has 1.5 billion views), but they have also drawn a lot of criticism. Instagram could go down the same route.
For starters, it offers only two voices, both with an American accent: Voice 1 (a female voice) and Voice 2 (a male voice). It also suffers from the same problems with inaccurate pronunciation and robotic-sounding speech.
These limitations are understandable in the context of these platforms. But audiences are increasingly exposed to advanced synthetic speech, so their expectations are high.
BeyondWords, for example, offers a library of over 720 voices across 64 languages, and the ability to create a custom voice. We also use natural language processing (NLP) and speech synthesis markup language (SSML) to ensure more accurate text-to-speech. And we’ve powered over one billion listens for over 120 global publishers.
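For readers unfamiliar with SSML, here is a minimal sketch of how markup can steer pronunciation. It uses the standard `<speak>` and `<sub>` elements from the W3C SSML specification and Python's built-in XML library. The `build_ssml` helper and the sample sentence are hypothetical, made up for illustration, and this is not a description of how BeyondWords or Instagram actually process text.

```python
import xml.etree.ElementTree as ET


def build_ssml(before: str, written: str, spoken: str, after: str) -> str:
    """Wrap a sentence in SSML, telling the voice to read `written` as `spoken`.

    Hypothetical helper for illustration only.
    """
    speak = ET.Element("speak")      # root element of every SSML document
    speak.text = before              # text that comes before the substitution
    # <sub alias="..."> asks the synthesizer to say the alias
    # instead of the literal text it wraps.
    sub = ET.SubElement(speak, "sub", alias=spoken)
    sub.text = written
    sub.tail = after                 # text that comes after the substitution
    return ET.tostring(speak, encoding="unicode")


ssml = build_ssml("New on ", "IG", "Instagram", ": text-to-speech for Reels.")
print(ssml)
```

A speech engine that supports SSML would read the abbreviation "IG" aloud as "Instagram", which is the kind of correction that plain captions cannot express.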
So, I expect Instagram’s text-to-speech feature to get its fair share of criticism. But it’s an impressive feature that will no doubt assist creators and their audiences.
How to use the text-to-speech feature on Instagram
- Open Instagram and go to the Reels camera
- Create your video, then select ‘Preview’
- Tap ‘Aa’ to add a text caption
- Tap the text bubble, then ‘...’, then ‘Text-to-Speech’
- Select a voice, then tap ‘Done’
- Make any other edits, then share your Reel