This page covers the XML elements for audio output: converting text to speech, playing audio files, and sending DTMF tones.
Speak
The <Speak> element converts text to speech and plays it to the caller. Use it for dynamic messages that can’t be prerecorded.
Basic Usage
<Response>
    <Speak>Hello! Welcome to our service.</Speak>
</Response>
Python
from plivo import plivoxml

response = plivoxml.ResponseElement()
response.add(plivoxml.SpeakElement('Hello! Welcome to our service.'))
print(response.to_string())
Speak Attributes
| Attribute | Type | Default | Description |
| --- | --- | --- | --- |
| voice | string | WOMAN | Voice tone. Allowed: WOMAN, MAN |
| language | string | en-US | Language for speech. See supported languages below |
| loop | integer | 1 | Number of times to repeat. 0 = infinite |
Change Voice and Language
<Response>
    <Speak voice="MAN" language="en-GB">
        Good day! This message uses a British male voice.
    </Speak>
</Response>
from plivo import plivoxml

response = plivoxml.ResponseElement()
response.add(plivoxml.SpeakElement(
    'Good day! This message uses a British male voice.',
    voice='MAN',
    language='en-GB'
))
print(response.to_string())
Loop a Message
Play a message multiple times:
<Response>
    <Speak loop="3">Please hold. Your call is important to us.</Speak>
</Response>
Set loop="0" to repeat indefinitely until the call ends:
<Response>
    <Speak loop="0">Please wait while we connect you.</Speak>
</Response>
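The attribute rules above can be checked before any XML is returned. The following is a minimal sketch using only Python's standard library, not the Plivo SDK; `build_speak` and `ALLOWED_VOICES` are hypothetical names introduced here, and the check covers only the basic WOMAN and MAN voices from the attribute table, not Polly voices.

```python
import xml.etree.ElementTree as ET

# Basic voices listed in the Speak attribute table (Polly voices not covered).
ALLOWED_VOICES = {"WOMAN", "MAN"}


def build_speak(text, voice="WOMAN", language="en-US", loop=1):
    """Return a <Response><Speak> document as a string (hypothetical helper)."""
    if voice not in ALLOWED_VOICES:
        raise ValueError(f"voice must be one of {sorted(ALLOWED_VOICES)}")
    # bool is a subclass of int, so reject it explicitly.
    if isinstance(loop, bool) or not isinstance(loop, int) or loop < 0:
        raise ValueError("loop must be a non-negative integer (0 = infinite)")
    response = ET.Element("Response")
    speak = ET.SubElement(
        response, "Speak", voice=voice, language=language, loop=str(loop)
    )
    speak.text = text
    return ET.tostring(response, encoding="unicode")


print(build_speak("Please hold. Your call is important to us.", loop=3))
```

Rejecting bad values at build time surfaces a typo such as `voice="ROBOT"` in application logs rather than on a live call.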
Supported Languages
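| Language | Code | Woman | Man |
| --- | --- | --- | --- |
| Danish | da-DK | Yes | No |
| Dutch | nl-NL | Yes | Yes |
| English (Australian) | en-AU | Yes | Yes |
| English (British) | en-GB | Yes | Yes |
| English (USA) | en-US | Yes | Yes |
| French | fr-FR | Yes | Yes |
| French (Canadian) | fr-CA | Yes | No |
| German | de-DE | Yes | Yes |
| Italian | it-IT | Yes | Yes |
| Polish | pl-PL | Yes | Yes |
| Portuguese | pt-PT | No | Yes |
| Portuguese (Brazilian) | pt-BR | Yes | Yes |
| Russian | ru-RU | Yes | No |
| Spanish | es-ES | Yes | Yes |
| Spanish (USA) | es-US | Yes | Yes |
| Swedish | sv-SE | Yes | No |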
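Because not every language offers both voices, an application may want to fall back to the voice that is available. The sketch below copies the table above into a dictionary; `VOICE_SUPPORT` and `pick_voice` are hypothetical names, not part of any Plivo SDK.

```python
# Voice availability copied from the table above: code -> (woman, man).
VOICE_SUPPORT = {
    "da-DK": (True, False),
    "nl-NL": (True, True),
    "en-AU": (True, True),
    "en-GB": (True, True),
    "en-US": (True, True),
    "fr-FR": (True, True),
    "fr-CA": (True, False),
    "de-DE": (True, True),
    "it-IT": (True, True),
    "pl-PL": (True, True),
    "pt-PT": (False, True),
    "pt-BR": (True, True),
    "ru-RU": (True, False),
    "es-ES": (True, True),
    "es-US": (True, True),
    "sv-SE": (True, False),
}


def pick_voice(language, preferred="WOMAN"):
    """Return the preferred voice if the language has it, else the other one."""
    if language not in VOICE_SUPPORT:
        raise ValueError(f"unsupported language code: {language}")
    woman, man = VOICE_SUPPORT[language]
    available = {"WOMAN": woman, "MAN": man}
    if preferred not in available:
        raise ValueError("preferred must be WOMAN or MAN")
    if available[preferred]:
        return preferred
    # Every language in the table has at least one voice, so the other exists.
    return "MAN" if preferred == "WOMAN" else "WOMAN"


print(pick_voice("pt-PT"))         # European Portuguese has no WOMAN voice
print(pick_voice("da-DK", "MAN"))  # Danish has no MAN voice
```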
SSML Support
Speech Synthesis Markup Language (SSML) provides fine-grained control over pronunciation, pitch, rate, and pauses. Use Polly voices for SSML support.
<Response>
    <Speak voice="Polly.Joey" language="en-US">
        <prosody rate="medium">
            Hello and welcome to Plivo.
            <break time="500ms"/>
            The word <say-as interpret-as="spell-out">SSML</say-as>
            stands for Speech Synthesis Markup Language.
        </prosody>
    </Speak>
</Response>
from plivo import plivoxml

# Note: this example builds a different SSML message from the XML above,
# using <say-as>, <s>, and <w> child elements.
response = plivoxml.ResponseElement()
speak = plivoxml.SpeakElement(
    content="The word",
    voice="Polly.Joey",
    language="en-US"
)
speak.add_say_as("read", interpret_as="characters")
speak.add_s("may be interpreted as either the present simple form")
speak.add_w("read", role="amazon:VB")
speak.add_s("or the past participle form")
speak.add_w("read", role="amazon:VBD")
response.add(speak)
print(response.to_string())
| Tag | Description | Example |
| --- | --- | --- |
| <break> | Add a pause | <break time="500ms"/> |
| <say-as> | Control pronunciation | <say-as interpret-as="spell-out">ABC</say-as> |
| <prosody> | Modify pitch, rate, volume | <prosody rate="slow">Slowly</prosody> |
| <emphasis> | Add emphasis | <emphasis level="strong">Important</emphasis> |
| <p> | Paragraph pause | <p>First paragraph.</p> |
| <s> | Sentence pause | <s>First sentence.</s> |
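SSML mixes plain text with child tags, which is easy to get wrong when concatenating strings by hand. As an illustration, the sketch below rebuilds the `Polly.Joey` XML example with Python's standard `xml.etree.ElementTree` instead of the Plivo SDK. Text before the first child goes in `.text`, and text following a child goes in that child's `.tail`. The variable names are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

response = ET.Element("Response")
speak = ET.SubElement(response, "Speak", voice="Polly.Joey", language="en-US")
prosody = ET.SubElement(speak, "prosody", rate="medium")

# Text that appears before the first child element.
prosody.text = "Hello and welcome to Plivo. "

# <break> has no content; the words after it are stored as its tail.
pause = ET.SubElement(prosody, "break", time="500ms")
pause.tail = "The word "

# Hyphenated attribute names cannot be keyword arguments, so pass a dict.
say_as = ET.SubElement(prosody, "say-as", {"interpret-as": "spell-out"})
say_as.text = "SSML"
say_as.tail = " stands for Speech Synthesis Markup Language."

ssml_xml = ET.tostring(response, encoding="unicode")
print(ssml_xml)
```

Building the tree this way also escapes characters such as `&` and `<` in the spoken text automatically.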
Speak Nesting
<Speak> can be nested inside:
<GetDigits> - Play message while collecting input
<GetInput> - Play message while collecting speech/digits
<PreAnswer> - Play message before answering
<Response>
    <GetDigits action="/handle-input/" numDigits="1">
        <Speak>Press 1 for sales, press 2 for support.</Speak>
    </GetDigits>
</Response>
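A menu prompt like the one above is often generated from data rather than hard-coded. The sketch below nests `<Speak>` inside `<GetDigits>` using only the standard library; `build_menu` is a hypothetical helper and the option labels are illustrative.

```python
import xml.etree.ElementTree as ET


def build_menu(action_url, options):
    """Build a one-digit menu; options maps a digit to a label."""
    if not options:
        raise ValueError("at least one menu option is required")
    prompt = ", ".join(
        f"press {digit} for {label}" for digit, label in options.items()
    )
    prompt = prompt[0].upper() + prompt[1:] + "."
    response = ET.Element("Response")
    get_digits = ET.SubElement(
        response, "GetDigits", action=action_url, numDigits="1"
    )
    # The nested <Speak> plays while digits are being collected.
    ET.SubElement(get_digits, "Speak").text = prompt
    return ET.tostring(response, encoding="unicode")


print(build_menu("/handle-input/", {"1": "sales", "2": "support"}))
```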
Play
The <Play> element plays an audio file to the caller. Use it for pre-recorded messages, music, or sound effects.
Basic Usage
<Response>
    <Play>https://example.com/audio/welcome.mp3</Play>
</Response>
Python
from plivo import plivoxml

response = plivoxml.ResponseElement()
response.add(plivoxml.PlayElement('https://example.com/audio/welcome.mp3'))
print(response.to_string())
Play Attributes
| Attribute | Type | Default | Description |
| --- | --- | --- | --- |
| loop | integer | 1 | Number of times to play the audio. 0 = infinite loop |
Loop Audio
Play hold music on repeat:
<Response>
    <Play loop="0">https://example.com/audio/hold-music.mp3</Play>
</Response>
from plivo import plivoxml

response = plivoxml.ResponseElement()
response.add(plivoxml.PlayElement(
    'https://example.com/audio/hold-music.mp3',
    loop=0
))
print(response.to_string())
Supported Formats
| Format | Extension | Notes |
| --- | --- | --- |
| MP3 | .mp3 | Recommended for smaller file sizes |
| WAV | .wav | Highest quality, larger files |
Requirements:
Audio must be served over HTTPS
Maximum file size: 10 MB
Recommended: 8kHz or 16kHz sample rate, mono
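The requirements above lend themselves to a pre-flight check before a URL is placed in `<Play>`. This is a rough sketch: `check_audio` is a hypothetical helper, the caller supplies the file size, and "10 MB" is assumed here to mean 10 × 1024 × 1024 bytes.

```python
import posixpath
from urllib.parse import urlparse

MAX_BYTES = 10 * 1024 * 1024  # assumption: 10 MB interpreted as 10 MiB
ALLOWED_EXTENSIONS = {".mp3", ".wav"}


def check_audio(url, size_bytes):
    """Return a list of problems; an empty list means the file looks usable."""
    problems = []
    parsed = urlparse(url)
    if parsed.scheme != "https":
        problems.append("URL must use HTTPS")
    extension = posixpath.splitext(parsed.path)[1].lower()
    if extension not in ALLOWED_EXTENSIONS:
        problems.append("file extension must be .mp3 or .wav")
    if size_bytes > MAX_BYTES:
        problems.append("file is larger than 10 MB")
    return problems


print(check_audio("https://example.com/audio/welcome.mp3", 250_000))
print(check_audio("http://example.com/audio/welcome.ogg", 12 * 1024 * 1024))
```

The check looks only at the URL and a reported size; it does not inspect sample rate or channel count.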
Combine with Speak
<Response>
    <Play>https://example.com/audio/intro-jingle.mp3</Play>
    <Speak>Welcome to Acme Corporation. How can we help you today?</Speak>
</Response>
Play During IVR
Nest <Play> inside <GetDigits> to play audio while collecting input:
<Response>
    <GetDigits action="/handle-input/" numDigits="1" timeout="10">
        <Play>https://example.com/audio/menu-options.mp3</Play>
    </GetDigits>
    <Speak>We didn't receive any input. Goodbye.</Speak>
</Response>
Play Nesting
<Play> can be nested inside:
<GetDigits> - Play while collecting digits
<GetInput> - Play while collecting speech/digits
<PreAnswer> - Play before answering the call
Play Best Practices
Use HTTPS - Audio URLs must use HTTPS
Optimize file size - Compress audio for faster loading
Host reliably - Use a CDN for audio file hosting
Test audio quality - Ensure audio is clear at phone quality (8kHz)
Provide fallback - Use <Speak> as backup if audio fails to load
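The page does not say how a `<Speak>` fallback should be wired up. One possible approach, sketched below, is to decide when the XML is generated: emit `<Play>` if the audio URL passes a reachability check, otherwise emit `<Speak>`. `play_or_speak` and its injected `is_reachable` callable are assumptions for illustration; a real check might issue an HTTP HEAD request.

```python
import xml.etree.ElementTree as ET


def play_or_speak(audio_url, fallback_text, is_reachable):
    """Emit <Play> when the audio looks available, else a <Speak> fallback."""
    response = ET.Element("Response")
    if audio_url.startswith("https://") and is_reachable(audio_url):
        ET.SubElement(response, "Play").text = audio_url
    else:
        ET.SubElement(response, "Speak").text = fallback_text
    return ET.tostring(response, encoding="unicode")


# Simulate an unreachable host by injecting a check that always fails.
print(play_or_speak(
    "https://example.com/audio/welcome.mp3",
    "Welcome to our service.",
    lambda url: False,
))
# <Response><Speak>Welcome to our service.</Speak></Response>
```

Passing the check in as a function keeps the builder testable without network access.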
DTMF
The <DTMF> element sends DTMF (Dual-Tone Multi-Frequency) tones on the current call. Use it to navigate IVR systems, enter PINs, or interact with telephony systems.
Basic Usage
<Response>
    <DTMF>1234</DTMF>
</Response>
Python
from plivo import plivoxml

response = plivoxml.ResponseElement()
response.add(plivoxml.DTMFElement('1234'))
print(response.to_string())
DTMF Attributes
| Attribute | Type | Default | Description |
| --- | --- | --- | --- |
| async | boolean | true | Send asynchronously and continue to next element |
Allowed Characters
| Character | Description |
| --- | --- |
| 0-9 | Digit tones |
| * | Star key |
| # | Pound/hash key |
| w | Wait 0.5 seconds |
| W | Wait 1 second |
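The allowed character set maps directly onto a regular expression, which can be used to reject bad input before it reaches `<DTMF>`. A small sketch, with `is_valid_dtmf` as a hypothetical helper name:

```python
import re

# Digits, star, pound, and the two wait characters from the table above.
DTMF_PATTERN = re.compile(r"[0-9*#wW]+")


def is_valid_dtmf(digits):
    """Return True if every character is an allowed DTMF character."""
    return DTMF_PATTERN.fullmatch(digits) is not None


for candidate in ["1234#", "wwww5678", "*67", "12a4", ""]:
    print(repr(candidate), is_valid_dtmf(candidate))
```

Treating the empty string as invalid is a choice made in this sketch, not a rule stated on this page.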
With Pauses
Use w (0.5s) or W (1s) to add delays between tones:
<Response>
    <DTMF>1ww2ww3ww4</DTMF>
</Response>
Each ww pair adds up to a 1-second pause, so this sends 1, waits 1 second, sends 2, waits 1 second, and so on.
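The pause arithmetic can be checked with a short calculation. The sketch below counts only the `w` and `W` waits and ignores how long each tone itself lasts, which this page does not specify; `tone_schedule` and `PAUSE_SECONDS` are hypothetical names.

```python
# Wait durations from the allowed-characters table.
PAUSE_SECONDS = {"w": 0.5, "W": 1.0}


def tone_schedule(digits):
    """Return (tone, seconds of accumulated pause before it) pairs."""
    offset = 0.0
    schedule = []
    for char in digits:
        if char in PAUSE_SECONDS:
            offset += PAUSE_SECONDS[char]
        else:
            schedule.append((char, offset))
    return schedule


print(tone_schedule("1ww2ww3ww4"))
# [('1', 0.0), ('2', 1.0), ('3', 2.0), ('4', 3.0)]
```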
Navigate External IVR
When dialing an external number with an IVR:
<Response>
    <Dial>
        <Number sendDigits="wwww1234#">+14155559999</Number>
    </Dial>
</Response>
Sending digits to a dialed number is typically done with the sendDigits attribute on <Number>, as above, rather than with the <DTMF> element.
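The leading `wwww` gives the remote IVR time to answer before the digits are sent. The sketch below derives that prefix from a delay in seconds, assuming `sendDigits` treats `w` as the same 0.5-second wait listed for `<DTMF>`. `wait_prefix` and `build_dial` are hypothetical helpers built on the standard library.

```python
import math
import xml.etree.ElementTree as ET


def wait_prefix(seconds):
    """Return enough 'w' characters (0.5 s each, assumed) to cover seconds."""
    return "w" * max(0, math.ceil(seconds / 0.5))


def build_dial(number, digits, delay_seconds=2.0):
    """Build <Dial><Number sendDigits=...> with a leading wait."""
    response = ET.Element("Response")
    dial = ET.SubElement(response, "Dial")
    target = ET.SubElement(
        dial, "Number", sendDigits=wait_prefix(delay_seconds) + digits
    )
    target.text = number
    return ET.tostring(response, encoding="unicode")


print(build_dial("+14155559999", "1234#"))
# <Response><Dial><Number sendDigits="wwww1234#">+14155559999</Number></Dial></Response>
```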
Send During Call
Send tones during an active call:
<Response>
    <Speak>Sending your confirmation code now.</Speak>
    <DTMF>5678</DTMF>
    <Speak>Code sent.</Speak>
</Response>
Synchronous vs Asynchronous
Async (default): DTMF sends while next element starts
<DTMF async="true">123</DTMF>
<Speak>Processing...</Speak>
Sync: Wait for DTMF to complete before continuing
<DTMF async="false">123</DTMF>
<Speak>DTMF complete.</Speak>
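When generating this XML from Python without the SDK, note that `async` is a reserved word in Python and cannot be passed as a keyword argument. The sketch below sets it through an attribute dictionary instead; `build_dtmf` and its `wait_for_completion` flag are hypothetical names chosen for this example.

```python
import xml.etree.ElementTree as ET


def build_dtmf(digits, wait_for_completion=False):
    """Return a <DTMF> element; async="false" waits for the tones to finish."""
    # "async" is a Python keyword, so pass it in a dict rather than as a kwarg.
    attributes = {"async": "false" if wait_for_completion else "true"}
    dtmf = ET.Element("DTMF", attributes)
    dtmf.text = digits
    return ET.tostring(dtmf, encoding="unicode")


print(build_dtmf("123"))
# <DTMF async="true">123</DTMF>
print(build_dtmf("123", wait_for_completion=True))
# <DTMF async="false">123</DTMF>
```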
DTMF Use Cases
| Scenario | Example |
| --- | --- |
| Enter PIN | <DTMF>1234#</DTMF> |
| Navigate IVR menu | <DTMF>1</DTMF> |
| Enter extension | <DTMF>wwww5678</DTMF> |
| Star code | <DTMF>*67</DTMF> |
Combined with Dial
When using with <Dial>, prefer sendDigits on the <Number> element:
<Response>
    <Dial>
        <Number sendDigits="wwww123#">+14155551234</Number>
    </Dial>
</Response>