Text to Speech

Text to Speech is an easy way to add dynamic voice to your phone application. Unlike pre-recorded audio, it can deliver spoken versions of text that changes over time, such as live information (e.g., sports scores) and other dynamic messages (e.g., emergency updates). With simple code changes, you can change the message, the voice (e.g., male or female), who receives the message, and much more.

This tutorial will use Plivo’s Speak element to read out text as speech to the caller. More specifically, here is what will happen behind the scenes:

  1. A call is made to the Plivo phone number assigned to your text to speech application.
  2. When the call is connected, the caller will hear the message read out. To demonstrate the capabilities of the Speak element, we’ll have Plivo read the message in three different languages (English, French, and Spanish).
  3. After the message is read, the call is automatically hung up.
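
The flow above is driven by the XML your server returns. As a rough sketch (not the helper library's own implementation), the snippet below builds that kind of XML with Python's standard library only. The function name `build_speak_response` and the score string are made up for illustration; the `Speak` element and its `language` and `voice` attributes are the ones used later in this tutorial.

```python
import xml.etree.ElementTree as ET


def build_speak_response(messages):
    """Return Response XML with one <Speak> per (text, language, voice) tuple."""
    response = ET.Element("Response")
    for text, language, voice in messages:
        speak = ET.SubElement(
            response, "Speak", {"language": language, "voice": voice}
        )
        speak.text = text
    # Nothing follows the last <Speak>, so the call ends after it is read.
    return ET.tostring(response, encoding="unicode")


if __name__ == "__main__":
    score = "Home 2, Away 1"  # hypothetical live data
    print(build_speak_response([
        ("The current score is " + score, "en-US", "WOMAN"),
    ]))
```

Because the text is assembled at request time, each caller can hear a different message without any audio being recorded in advance.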

Prerequisites

  1. Sign up for a free Plivo trial account.
  2. Check out our Helper Libraries page and install the right helper based on the programming language you want to use.
  3. Buy a Plivo phone number. A phone number is required to receive calls. You can buy a Plivo phone number in more than 20 countries through the Buy Numbers tab on your Plivo account UI. Check the Voice API coverage page for all the supported countries.
  4. Use a web hosting service to host your web application. There are many inexpensive cloud hosting providers that you can use for just a few dollars a month. Follow the instructions of your hosting provider to host your web application.

Set up a Web Server

Let’s assume your web server is located at myvoiceapp.com. Below is a snippet that sets up a route on your web server; let’s call it text-to-speech. When an HTTP request is sent to myvoiceapp.com/text-to-speech, this route is invoked. You will then configure this URL in your Plivo application.

Note: For PHP, the route will be myvoiceapp.com/text-to-speech.php.

  1. Copy the relevant code below into a text file and save it. Let’s call it text-to-speech.
    Note: Make sure to use the appropriate file extension for your code (e.g., text-to-speech.py for Python).
  2. Now that you have the code, you will need to expose your server to the public Internet. This way, Plivo will know where to find your app when a particular phone number is dialed. Moving forward, we will assume that your app is available at myvoiceapp.com.
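
To make the route's job concrete, here is a framework-free sketch using only Python's standard library. It is not one of the helper-library examples; the names `answer_app` and `SPEAK_XML`, the greeting text, and port 8000 are assumptions chosen for illustration. It shows the two things the route has to do: match the /text-to-speech path and return XML with a text/xml content type.

```python
from wsgiref.simple_server import make_server

# Hypothetical fixed payload; a real app would generate this per request.
SPEAK_XML = (
    '<Response>'
    '<Speak language="en-US" voice="MAN">Hello from the answer URL.</Speak>'
    '</Response>'
)


def answer_app(environ, start_response):
    """WSGI app: return Speak XML for /text-to-speech, 404 for anything else."""
    path = environ.get("PATH_INFO", "").rstrip("/")
    if path != "/text-to-speech":
        start_response("404 Not Found", [("Content-Type", "text/plain")])
        return [b"Not found"]
    body = SPEAK_XML.encode("utf-8")
    start_response("200 OK", [
        ("Content-Type", "text/xml"),
        ("Content-Length", str(len(body))),
    ])
    return [body]


if __name__ == "__main__":
    # Port 8000 is an arbitrary choice for local testing.
    with make_server("", 8000, answer_app) as server:
        server.serve_forever()
```

The handler ignores the HTTP method, so it answers both GET and POST requests.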

Code

Python

from flask import Flask, Response
import plivo, plivoxml

app = Flask(__name__)

@app.route('/text-to-speech/', methods=['GET','POST'])
def speak_xml():
    # Generate a Speak XML with the details of the text to play on the call.
    r = plivoxml.Response()

    # Add Speak XML Tag with English text
    body1 = u'Congratulations! You have successfully made a call with Pleevo.'
    params1 = {
        'language': "en-US", # Language used to read out the text.
        'voice': "MAN" # The tone to be used for reading out the text.
    }
    r.addSpeak(body1, **params1)

    # Add Speak XML Tag with French text
    body2 = u'Félicitations! Vous avez effectué avec succès un appel avec Pleevo.'
    params2 = {
        'language': "fr-FR", # Language used to read out the text.
        'voice': "WOMAN" # The tone to be used for reading out the text.
    }
    r.addSpeak(body2, **params2)

    # Add Speak XML Tag with Spanish text
    body3 = u'¡Enhorabuena! Usted ha realizado con éxito una llamada con Pleevo.'
    params3 = {
        'language': "es-US", # Language used to read out the text.
        'voice' : "MAN" # The tone to be used for reading out the text.
    }
    r.addSpeak(body3, **params3)

    print(r.to_xml())
    return Response(str(r), mimetype='text/xml')

if __name__ == "__main__":
    app.run(host='0.0.0.0', debug=True)

Ruby

require 'rubygems'
require 'sinatra'
require 'plivo'
include Plivo

get '/text-to-speech/' do
    # Generate a Speak XML with the details of the text to play on the call.
    r = Response.new()

    # Add Speak XML Tag with English text
    body1 = 'Congratulations! You have successfully made a call with Pleevo.'
    params1 = {
        'language'=> "en-US",
        'voice' => "MAN"
    }
    r.addSpeak(body1, params1)

    # Add Speak XML Tag with French Text
    body2 = 'Félicitations! Vous avez effectué avec succès un appel avec Pleevo.'
    params2 = {
        'language' => "fr-FR",
        'voice' => "WOMAN"
    }
    r.addSpeak(body2, params2)

    # Add Speak XML Tag with Spanish Text
    body3 = '¡Enhorabuena! Usted ha realizado con éxito una llamada con Pleevo.'
    params3 = {
        'language' => "es-US",
        'voice' => "MAN"
    }
    r.addSpeak(body3, params3)

    puts r.to_xml()
    content_type 'text/xml'
    return r.to_s()
end

Node.js

var plivo = require('plivo');
var express = require('express');
var app = express();

app.set('port', (process.env.PORT || 5000));

app.all('/text-to-speech/', function(request, response) {
    // Generate a Speak XML with the details of the text to play on the call.
    var r = plivo.Response();

    // Add Speak XML Tag with English text
    var body1 = "Congratulations! You have successfully made a call with Pleevo.";
    var params1 = {
        'language': "en-US", // Language used to read out the text.
        'voice': "MAN" // The tone to be used for reading out the text.
    };
    r.addSpeak(body1, params1);

    // Add Speak XML Tag with French text
    var body2 = "Félicitations! Vous avez effectué avec succès un appel avec Pleevo.";
    var params2 = {
        'language': "fr-FR", // Language used to read out the text.
        'voice':"WOMAN" // The tone to be used for reading out the text.
    };
    r.addSpeak(body2, params2);

    // Add Speak XML Tag with Spanish text
    var body3 = "¡Enhorabuena! Usted ha realizado con éxito una llamada con Pleevo.";
    var params3 = {
        'language': "es-US", // Language used to read out the text.
        'voice' : "MAN" // The tone to be used for reading out the text.
    };
    r.addSpeak(body3, params3);
    console.log (r.toXML());

    response.set({'Content-Type': 'text/xml'});
    response.send(r.toXML());

});

app.listen(app.get('port'), function() {
    console.log('Node app is running on port', app.get('port'));
});

PHP

<?php
    require 'vendor/autoload.php';
    use Plivo\Response;
    # Generate a Speak XML with the details of the text to play on the call.
    $r = new Response();

    # Add Speak XML Tag with English text
    $body1 = 'Congratulations! You have successfully made a call with Pleevo.';
    $params1 = array(
        'language' => "en-US", # Language used to read out the text.
        'voice' => "MAN" # The tone to be used for reading out the text.
    );
    $r->addSpeak($body1,$params1);

    # Add Speak XML Tag with French text
    $body2 = 'Félicitations! Vous avez effectué avec succès un appel avec Pleevo.';
    $params2 = array(
        'language' => "fr-FR", # Language used to read out the text.
        'voice' => "WOMAN" # The tone to be used for reading out the text.
    );
    $r->addSpeak($body2,$params2);

    # Add Speak XML Tag with Spanish text
    $body3 = '¡Enhorabuena! Usted ha realizado con éxito una llamada con Pleevo.';
    $params3 = array(
        'language' => "es-US", # Language used to read out the text.
        'voice' => "MAN" # The tone to be used for reading out the text.
    );
    $r->addSpeak($body3,$params3);

    Header('Content-type: text/xml');
    echo($r->toXML());
?>

Java

package plivoexample;

import java.io.IOException;
import com.plivo.helper.exception.PlivoException;
import com.plivo.helper.xml.elements.PlivoResponse;
import com.plivo.helper.xml.elements.Speak;
import javax.servlet.ServletException;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;
import org.eclipse.jetty.server.Server;
import org.eclipse.jetty.servlet.ServletContextHandler;
import org.eclipse.jetty.servlet.ServletHolder;

public class textToSpeech extends HttpServlet {
    private static final long serialVersionUID = 1L;
    @Override
    protected void doGet(HttpServletRequest req, HttpServletResponse resp)
            throws ServletException, IOException {
        // Generate a Speak XML with the details of the text to play on the call.
        PlivoResponse response = new PlivoResponse();

        // Add Speak XML Tag with English text
        Speak spk1 = new Speak("Congratulations! You have successfully made a call with Pleevo.");
        spk1.setLanguage("en-US"); // Language used to read out the text.
        spk1.setVoice("MAN"); // The tone to be used for reading out the text.
        response.append(spk1);

        // Add Speak XML Tag with French Text
        Speak spk2 = new Speak("Félicitations! Vous avez effectué avec succès un appel avec Pleevo.");
        spk2.setLanguage("fr-FR"); // Language used to read out the text.
        spk2.setVoice("WOMAN"); // The tone to be used for reading out the text.
        response.append(spk2);

        // Add Speak XML Tag with Spanish Text
        Speak spk3 = new Speak("¡Enhorabuena! Usted ha realizado con éxito una llamada con Pleevo.");
        spk3.setLanguage("es-US"); // Language used to read out the text.
        spk3.setVoice("MAN"); // The tone to be used for reading out the text.
        response.append(spk3);

        try {
            System.out.println(response.toXML());
            resp.addHeader("Content-Type", "text/xml");
            resp.getWriter().print(response.toXML());
        } catch (PlivoException e) {
            e.printStackTrace();
        }
    }

    public static void main(String[] args) throws Exception {
        String port = System.getenv("PORT");
        if(port==null)
            port ="8000";
        Server server = new Server(Integer.valueOf(port));
        ServletContextHandler context = new ServletContextHandler(ServletContextHandler.SESSIONS);
        context.setContextPath("/");
        server.setHandler(context);
        context.addServlet(new ServletHolder(new textToSpeech()),"/text-to-speech/");
        server.start();
        server.join();
    }
}

C#

using System;
using System.Collections.Generic;
using System.Diagnostics;
using RestSharp;
using Plivo.XML;
using Nancy;

namespace text_to_speech
{
    public class Program : NancyModule
    {
        public Program()
        {
            Get["/text-to-speech/"] = x =>
            {
                // Generate a Speak XML with the details of the text to play on the call.
                Plivo.XML.Response resp = new Plivo.XML.Response();

                // Add Speak XML Tag with English text
                resp.AddSpeak("Congratulations! You have successfully made a call with Pleevo.", new Dictionary<string, string>()
                {
                    {"language", "en-US"}, // Language used to read out the text.
                    {"voice", "MAN"} // The tone to be used for reading out the text.
                });

                // Add Speak XML Tag with French text
                resp.AddSpeak("Félicitations! Vous avez effectué avec succès un appel avec Pleevo.", new Dictionary<string, string>()
                {
                    {"language", "fr-FR"}, // Language used to read out the text.
                    {"voice", "WOMAN"} // The tone to be used for reading out the text.
                });

                // Add Speak XML Tag with Spanish text
                resp.AddSpeak("¡Enhorabuena! Usted ha realizado con éxito una llamada con Pleevo.", new Dictionary<string, string>()
                {
                    {"language", "es-US"}, // Language used to read out the text.
                    {"voice", "MAN"} // The tone to be used for reading out the text.
                });

                Debug.WriteLine(resp.ToString());

                var output = resp.ToString();
                var res = (Nancy.Response)output;
                res.ContentType = "text/xml";
                return res;
            };
        }
    }
}

Create an Application

  1. Create an application by visiting the Application page and clicking New Application, or by using Plivo’s Application API.
  2. Give your application a name. Let’s call it Text to Speech. Enter your server URL (e.g., http://myvoiceapp.com/text-to-speech) in the Answer URL field and set the method to POST or GET. See our Application API docs to learn how to modify your application through our APIs.
  3. Click on Create to save your application.
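
For the API route mentioned in step 1, the following is a minimal sketch that only builds the HTTP request and does not send it. It assumes the Application API accepts a JSON POST to `/v1/Account/{auth_id}/Application/` with `app_name`, `answer_url`, and `answer_method` fields and uses HTTP Basic authentication; check the Application API docs before relying on it. The function name and the placeholder credentials are hypothetical.

```python
import base64
import json
import urllib.request

API_BASE = "https://api.plivo.com/v1"  # assumed base URL


def build_create_application_request(auth_id, auth_token, app_name,
                                     answer_url, answer_method="POST"):
    """Build (but do not send) a request that would create an application."""
    payload = {
        "app_name": app_name,
        "answer_url": answer_url,
        "answer_method": answer_method,
    }
    credentials = base64.b64encode(
        "{}:{}".format(auth_id, auth_token).encode("utf-8")
    ).decode("ascii")
    return urllib.request.Request(
        url="{}/Account/{}/Application/".format(API_BASE, auth_id),
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": "Basic " + credentials,
        },
        method="POST",
    )


if __name__ == "__main__":
    request = build_create_application_request(
        "YOUR_AUTH_ID", "YOUR_AUTH_TOKEN",  # placeholders, not real credentials
        "Text to Speech",  # name from this tutorial; the API may restrict characters
        "http://myvoiceapp.com/text-to-speech",
    )
    print(request.get_method(), request.full_url)
    # To actually send it: urllib.request.urlopen(request)
```

Keeping request construction separate from sending makes it easy to inspect the URL and body before any account is touched.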

Create Text to Speech Application

Assign a Plivo number to your app

  1. Navigate to the Numbers page and select the phone number you want to use for this app.
  2. Select Text to Speech (name of the app) from the Plivo App dropdown list.
  3. Click on Update to save.

If you don’t have a Plivo phone number, go to the Buy Numbers page to purchase one.

Test it out

Now, make a call to the Plivo number that you associated with your application. Plivo will receive the call and request the answer URL, which resides at myvoiceapp.com/text-to-speech. Then, Plivo will execute the XML instructions returned by your URL and read the message out in three different languages. Below is the XML output.

Sample XML

<Response>
    <Speak language="en-US" voice="MAN">
        Congratulations! You have successfully made a call with Pleevo.
    </Speak>
    <Speak language="fr-FR" voice="WOMAN">
        Félicitations! Vous avez effectué avec succès un appel avec Pleevo.
    </Speak>
    <Speak language="es-US" voice="MAN">
        ¡Enhorabuena! Usted ha realizado con éxito una llamada con Pleevo.
    </Speak>
</Response>
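
Before placing a real call, it can help to check the XML your route returns. The sketch below, again standard-library Python only, parses a Response document and lists each Speak element's language, voice, and text. The function name `summarize_speak_xml` and the short sample string are made up for illustration; in practice you would pass in the body fetched from your own answer URL.

```python
import xml.etree.ElementTree as ET


def summarize_speak_xml(xml_text):
    """Return a (language, voice, text) tuple for each <Speak> in a <Response>."""
    root = ET.fromstring(xml_text.strip())
    if root.tag != "Response":
        raise ValueError("expected <Response> as the root element")
    return [
        (speak.get("language"), speak.get("voice"), (speak.text or "").strip())
        for speak in root.findall("Speak")
    ]


if __name__ == "__main__":
    sample = """
    <Response>
        <Speak language="en-US" voice="MAN">Hello</Speak>
        <Speak language="fr-FR" voice="WOMAN">Bonjour</Speak>
    </Response>
    """
    for language, voice, text in summarize_speak_xml(sample):
        print(language, voice, text)
```

A missing attribute comes back as `None`, which makes it easy to spot a Speak element that is relying on a default.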

Next Steps

Learn how to have Plivo connect a call to a second person.

  1. Make an Outbound Call
  2. Play a Text-to-speech Message
  3. Connect Call to a Second Person
  4. Greet Caller by Name
  5. Play MP3/WAV Audio to Caller
  6. Hangup Using API
  7. Receive Incoming Call
  8. Forward Incoming Call
  9. Record Using API
  10. Screen Incoming Call
  11. Reject incoming call
  12. Get Details of all Calls
  13. Get Details of a single Call
  14. Get Details of Live Calls
  15. Build an IVR Phone Menu
  16. Conference Call
  17. Call Forward
  18. SIP Endpoint (Direct-Dial)
  19. Inbound Trunk
  20. Outbound Trunk