
Plivo introduces 27 Languages & 40+ additional Text-to-Speech voices with Amazon Polly Integration


We are excited to announce that Plivo now supports Amazon Polly, adding more than 40 voices, 27 languages, and new APIs that give developers more control over synthesized speech in applications that use text-to-speech. With Amazon Polly, Plivo developers can now control the volume, pitch, rate, and pronunciation of the voices that interact with their users.

Why is it important?

Text-to-speech has long been an important tool in the developer’s arsenal. It lets developers build interactive voice applications by generating speech dynamically rather than playing static, pre-recorded media files. The Plivo platform supports this capability through the Plivo <Speak> XML element. With basic text-to-speech, developers can only choose between a male or female voice in a subset of languages, without the pauses, tonal modulation, or other qualities of natural speech. The result is often functional but mechanical-sounding speech, with no choice of voice or tone, that does not give the customer a lifelike experience.

Enter Speech Synthesis Markup Language (SSML). SSML was designed by the W3C as an XML-based markup language to assist in generating natural-sounding synthesized speech. Amazon Polly, the world leader in SSML speech synthesis, was our natural choice for integration into the Plivo platform. With dozens of lifelike voices across a variety of languages, you can now select the ideal voice and build speech-enabled applications that work in many different countries. You can also adjust speech rate, pitch, loudness, and even emphasis to give your customers a more contextual and localized voice experience.

For Text-to-Speech, listening is believing. Experience the difference between basic text-to-speech vs Amazon Polly advanced TTS here:

[Audio sample: Basic Voice]

[Audio sample: SSML Enriched Voice with Amazon Polly]

Integrating Advanced Text-to-Speech in your Application with Plivo

To synthesize SSML speech on Plivo, specify one of the Amazon Polly voices in the ‘voice’ attribute of Plivo’s <Speak> XML element. Note that Polly voice names must be prefixed with ‘Polly.’ (for example, ‘Polly.Joey’).

For example:

<Response>
  <Speak voice="Polly.Joey">
    <say-as interpret-as="digits">1836</say-as> is your <say-as interpret-as="spell-out">OTP</say-as> for Plee-voh.
  </Speak>
</Response>
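
If you generate this XML from application code, building it with an XML library rather than string concatenation keeps the markup well-formed. The following is a minimal sketch using only Python's standard library; the function name build_otp_response is hypothetical and is not part of any Plivo SDK. It reproduces the example above for an arbitrary OTP.

```python
import xml.etree.ElementTree as ET


def build_otp_response(otp: str, voice: str = "Polly.Joey") -> str:
    """Return Plivo XML that reads an OTP digit by digit (illustrative helper)."""
    # Polly voices must carry the "Polly." prefix, as noted above.
    if not voice.startswith("Polly."):
        raise ValueError("Amazon Polly voice names must start with 'Polly.'")

    response = ET.Element("Response")
    speak = ET.SubElement(response, "Speak", {"voice": voice})

    # <say-as interpret-as="digits"> reads "1836" as "one eight three six".
    digits = ET.SubElement(speak, "say-as", {"interpret-as": "digits"})
    digits.text = otp
    digits.tail = " is your "

    # <say-as interpret-as="spell-out"> reads "OTP" letter by letter.
    spelled = ET.SubElement(speak, "say-as", {"interpret-as": "spell-out"})
    spelled.text = "OTP"
    spelled.tail = " for Plee-voh."

    return ET.tostring(response, encoding="unicode")


if __name__ == "__main__":
    print(build_otp_response("1836"))
```

Your application would return the resulting string as the body of its response to Plivo; how that response is served depends on your web framework and is outside the scope of this sketch.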

Here is the complete list of Amazon Polly SSML tags and voices supported on the Plivo platform:

SSML Tags

The following SSML tags are supported for use in Plivo’s XML:

| Action | SSML Tag | Description |
|---|---|---|
| Adding a pause | <break> | Use this tag to include a pause in the speech. |
| Emphasizing words | <emphasis> | Use this tag to change the rate and volume of the speech. |
| Specifying another language for specific words | <lang> | Use this tag to set the natural language of the text. |
| Adding a pause between paragraphs | <p> | Use this tag to represent a paragraph. |
| Using phonetic pronunciation | <phoneme> | Use this tag for a phonetic pronunciation of the text. |
| Controlling volume, speaking rate, and pitch | <prosody> | Use this tag to modify the volume, pitch, and rate of the tagged text. |
| Adding a pause between sentences | <s> | Use this tag to represent a sentence. This will add a strong break before and after the tag. |
| Controlling how special types of words are spoken | <say-as> | Use this tag to describe how to interpret the text. |
| Pronouncing acronyms and abbreviations | <sub> | Use this tag to pronounce the specified words or phrases as different words or phrases. |
| Improving pronunciation by specifying parts of speech | <w> | Use this tag to customize the pronunciation of words by specifying the part of speech. |
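
To illustrate how several of these tags combine inside a single <Speak> element, here is a minimal Python sketch that uses only the standard library. The helper name build_reminder and the message text are hypothetical. The attribute values (time="500ms", level="strong", rate="slow", volume="loud") are standard SSML values; we assume here that Plivo passes them through to Polly unchanged, so check the documentation before relying on a specific value.

```python
import xml.etree.ElementTree as ET


def build_reminder(name: str, time_text: str, voice: str = "Polly.Joanna") -> str:
    """Return Plivo XML combining <break>, <emphasis>, and <prosody> (illustrative)."""
    response = ET.Element("Response")
    speak = ET.SubElement(response, "Speak", {"voice": voice})
    speak.text = f"Hello {name}."

    # <break>: pause for half a second after the greeting.
    pause = ET.SubElement(speak, "break", {"time": "500ms"})
    pause.tail = " Your appointment is at "

    # <emphasis>: stress the appointment time.
    strong = ET.SubElement(speak, "emphasis", {"level": "strong"})
    strong.text = time_text
    strong.tail = ". "

    # <prosody>: speak the closing instruction slower and louder.
    slow = ET.SubElement(speak, "prosody", {"rate": "slow", "volume": "loud"})
    slow.text = "Please arrive ten minutes early."

    # ElementTree escapes characters such as & in text automatically.
    return ET.tostring(response, encoding="unicode")


if __name__ == "__main__":
    print(build_reminder("Ana", "3 PM"))
```

Because the text is inserted through the XML library, caller-supplied values containing characters like & or < cannot break the markup.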

SSML Voices

Choose from the wide array of Amazon Polly SSML voices supported for use with Plivo XML:

| Language | Female | Male |
|---|---|---|
| Australian English (en-AU) | Nicole | Russell |
| Brazilian Portuguese (pt-BR) | Vitoria | Ricardo |
| Canadian French (fr-CA) | Chantal | |
| Danish (da-DK) | Naja | Mads |
| Dutch (nl-NL) | Lotte | Ruben |
| French (fr-FR) | Lea, Celine | Mathieu |
| German (de-DE) | Vicki, Marlene | Hans |
| Hindi (hi-IN) | Aditi | |
| Italian (it-IT) | Carla | Giorgio |
| Japanese (ja-JP) | Mizuki | Takumi |
| Korean (ko-KR) | Seoyeon | |
| Mandarin Chinese (cmn-CN) | Zhiyu | |
| Norwegian (nb-NO) | Liv | |
| Polish (pl-PL) | Ewa, Maja | Jacek, Jan |
| Portuguese - Iberic (pt-PT) | Ines | Cristiano |
| Romanian (ro-RO) | Carmen | |
| Russian (ru-RU) | Tatyana | Maxim |
| Spanish - Castilian (es-ES) | Conchita | Enrique |
| Swedish (sv-SE) | Astrid | |
| Turkish (tr-TR) | Filiz | |
| UK English (en-GB) | Amy, Emma | Brian |
| US English (en-US) | Joanna, Salli, Kendra, Kimberly | Matthew, Justin, Joey |
| US Spanish (es-US) | Penelope | Miguel |
| Welsh (cy-GB) | Gwyneth | |
| Welsh English (en-GB-WLS) | | Geraint |
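
An application that serves several locales may want to choose the voice from the caller's language. The sketch below shows one way to do that in Python; the VOICES mapping copies only a few rows from the voice table above, and the pick_voice helper and its fallback rule (use the other gender when the requested one is not offered) are our own assumptions, not Plivo behavior.

```python
# A few rows copied from the voice table; extend with the languages you need.
VOICES = {
    "en-US": {"female": ["Joanna", "Salli", "Kendra", "Kimberly"],
              "male": ["Matthew", "Justin", "Joey"]},
    "en-GB": {"female": ["Amy", "Emma"], "male": ["Brian"]},
    "fr-CA": {"female": ["Chantal"], "male": []},
    "en-GB-WLS": {"female": [], "male": ["Geraint"]},
}


def pick_voice(language: str, gender: str = "female") -> str:
    """Return a 'Polly.'-prefixed voice name for the <Speak> voice attribute."""
    if language not in VOICES:
        raise ValueError(f"No voices configured for language {language!r}")
    if gender not in ("female", "male"):
        raise ValueError("gender must be 'female' or 'male'")

    by_gender = VOICES[language]
    other = "male" if gender == "female" else "female"
    # Fall back to the other gender when the language offers only one.
    candidates = by_gender[gender] or by_gender[other]
    return f"Polly.{candidates[0]}"


if __name__ == "__main__":
    print(pick_voice("en-US", "male"))
    print(pick_voice("fr-CA", "male"))
```

The returned string can be used directly as the value of the ‘voice’ attribute in <Speak>.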

What’s Next:

  • Check out the detailed documentation to learn more about what’s possible with SSML and Amazon Polly.

  • Meet us at Booth No. 815 at AWS re:Invent to talk about this announcement and much more.
