We are excited to announce that Plivo now supports Amazon Polly, adding more than 40 voices, 27 languages, and new APIs that give developers more control over synthesized speech in applications that use text-to-speech. With Amazon Polly, Plivo developers can now control the volume, pitch, rate, and pronunciation of the voices that interact with their users.
Why is it important?
Text-to-speech has long been an important tool in the developer’s toolkit. It allows developers to create interactive voice applications by generating speech dynamically, rather than playing static, pre-recorded media files. The Plivo platform supports this capability through Plivo <Speak> XML. Until now, with basic text-to-speech, developers could only choose a generic male or female voice in a subset of languages, without the pauses, tonal modulation, or other qualities of natural speech. The result is speech that is functional but mechanical, available in a limited set of languages, with no choice of voice or tone, and that does not give the customer a lifelike experience.
Enter Speech Synthesis Markup Language (SSML). SSML was designed by the W3C as an XML-based markup language that helps generate natural-sounding synthesized speech. Amazon Polly, a leading SSML speech synthesis service, was our natural choice for integration into the Plivo platform. With dozens of lifelike voices across a variety of languages, you can now select the ideal voice and build speech-enabled applications that work in many different countries. You can also adjust speech rate, pitch, loudness, and even emphasis to give your customers a more contextual and localized voice experience.
For text-to-speech, listening is believing. Hear the difference between basic text-to-speech and Amazon Polly advanced TTS here:
SSML Enriched Voice with Amazon Polly
Integrating Advanced Text-to-Speech in your Application with Plivo
To synthesize SSML speech on Plivo, specify one of the many Amazon Polly voices in the ‘voice’ attribute of Plivo’s <Speak> XML. Note that Polly voice names must be prefixed with ‘Polly.’, as in ‘Polly.Joey’.
<Response>
    <Speak voice="Polly.Joey">
        <say-as interpret-as="digits">1836</say-as> is your <say-as interpret-as="spell-out">OTP</say-as> for Plee-voh.
    </Speak>
</Response>
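If your application builds this XML on the server, the same response can be generated programmatically. The snippet below is a minimal sketch using only Python's standard library; the function name build_otp_response and its parameters are hypothetical and are not part of any Plivo SDK.

```python
import xml.etree.ElementTree as ET


def build_otp_response(otp: str, voice: str = "Joey") -> str:
    """Return Plivo XML that reads an OTP digit by digit with a Polly voice.

    Hypothetical helper for illustration; not a Plivo SDK function.
    """
    response = ET.Element("Response")
    # Polly voices must carry the "Polly." prefix in the voice attribute.
    speak = ET.SubElement(response, "Speak", {"voice": f"Polly.{voice}"})

    # Read the code as individual digits rather than as a number.
    digits = ET.SubElement(speak, "say-as", {"interpret-as": "digits"})
    digits.text = otp
    digits.tail = " is your "

    # Spell the acronym out letter by letter.
    spelled = ET.SubElement(speak, "say-as", {"interpret-as": "spell-out"})
    spelled.text = "OTP"
    spelled.tail = " for Plee-voh."

    return ET.tostring(response, encoding="unicode")


xml_doc = build_otp_response("1836")
print(xml_doc)
```

Building the document with an XML library rather than string concatenation keeps the output well-formed and escapes any special characters in the spoken text.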
Here is the complete list of Amazon Polly voice attributes supported on the Plivo platform:
The following SSML tags are supported for use in Plivo’s XML:
| Action | SSML tag | Description |
|---|---|---|
| Adding a pause | <break> | Use this tag to include a pause in the speech. |
| Emphasizing words | <emphasis> | Use this tag to change the rate and volume of the speech. |
| Specifying another language for specific words | <lang> | Use this tag to set the natural language of the text. |
| Adding a pause between paragraphs | <p> | Use this tag to represent a paragraph. |
| Using phonetic pronunciation | <phoneme> | Use this tag for a phonetic pronunciation of the text. |
| Controlling volume, speaking rate, and pitch | <prosody> | Use this tag to modify the volume, pitch, and rate of the tagged text. |
| Adding a pause between sentences | <s> | Use this tag to represent a sentence. This adds a strong break before and after the tag. |
| Controlling how special types of words are spoken | <say-as> | Use this tag to describe how to interpret the text. |
| Pronouncing acronyms and abbreviations | <sub> | Use this tag to pronounce the specified words or phrases as different words or phrases. |
| Improving pronunciation by specifying parts of speech | <w> | Use this tag to customize the pronunciation of words by specifying the part of speech. |
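As an illustration of how these tags combine, the sketch below nests <break> and <prosody> inside a <Speak> element. The attribute values (a time given in milliseconds and a rate of "slow") follow the SSML specification; the helper name speak_with_pause is hypothetical, and the exact set of attribute values accepted on Plivo is an assumption you should verify against the Plivo documentation.

```python
import xml.etree.ElementTree as ET


def speak_with_pause(first: str, second: str, pause_ms: int = 500,
                     rate: str = "slow", voice: str = "Polly.Joey") -> str:
    """Speak `first`, pause, then speak `second` at a different rate.

    Hypothetical helper for illustration; not a Plivo SDK function.
    """
    response = ET.Element("Response")
    speak = ET.SubElement(response, "Speak", {"voice": voice})
    speak.text = first

    # <break> inserts a pause; SSML accepts a duration such as "500ms".
    ET.SubElement(speak, "break", {"time": f"{pause_ms}ms"})

    # <prosody> changes the speaking rate of the text it wraps.
    slowed = ET.SubElement(speak, "prosody", {"rate": rate})
    slowed.text = second

    return ET.tostring(response, encoding="unicode")


ssml_doc = speak_with_pause("Your order has shipped.",
                            "Your tracking number is 4 2 7.")
print(ssml_doc)
```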
Choose from the wide array of Amazon Polly SSML voices supported for use with Plivo XML:
| Language | Female voices | Male voices |
|---|---|---|
| Australian English (en-AU) | Nicole | Russell |
| Brazilian Portuguese (pt-BR) | Vitoria | Ricardo |
| Canadian French (fr-CA) | Chantal | |
| Mandarin Chinese (cmn-CN) | Zhiyu | |
| Portuguese - Iberic (pt-PT) | Ines | Cristiano |
| Spanish - Castilian (es-ES) | Conchita | Enrique |
| UK English (en-GB) | | |
| US English (en-US) | | |
| US Spanish (es-US) | Penelope | Miguel |
| Welsh English (en-GB-WLS) | | Geraint |
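If your application picks a voice at runtime, a small lookup table keyed by language code keeps the ‘Polly.’ prefix in one place. The sketch below is hypothetical: POLLY_VOICES and polly_voice are illustrative names, the mapping includes only the voices named in the table above, and the female/male grouping is an assumption based on the table's column order. Languages whose voices are not listed above are left out.

```python
from typing import Optional

# Voices taken from the table above; gender grouping is an assumption.
POLLY_VOICES = {
    "en-AU": {"female": "Nicole", "male": "Russell"},
    "pt-BR": {"female": "Vitoria", "male": "Ricardo"},
    "fr-CA": {"female": "Chantal"},
    "cmn-CN": {"female": "Zhiyu"},
    "pt-PT": {"female": "Ines", "male": "Cristiano"},
    "es-ES": {"female": "Conchita", "male": "Enrique"},
    "es-US": {"female": "Penelope", "male": "Miguel"},
    "en-GB-WLS": {"male": "Geraint"},
}


def polly_voice(language: str, gender: str = "female") -> Optional[str]:
    """Return the value for the <Speak> voice attribute, or None if unlisted."""
    name = POLLY_VOICES.get(language, {}).get(gender)
    # Polly voices must carry the "Polly." prefix on Plivo.
    return f"Polly.{name}" if name else None


print(polly_voice("pt-BR", "male"))
print(polly_voice("fr-CA", "male"))
```

Returning None for an unlisted combination lets the caller decide on a fallback, such as a different gender in the same language.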