Audio Streaming beta

Plivo’s audio streaming feature lets businesses stream raw audio from active calls in real time to their applications or third-party systems.

You can stream audio by using XML instructions or APIs.

XML

With the Stream XML element, you can stream raw media from a live phone call over a WebSocket connection. To begin streaming, return XML like this during the call.

<Response>
	<Stream bidirectional="true" keepCallAlive="true">wss://yourstream.websocket.io/audiostream</Stream>
</Response>

See our XML reference documentation for complete details.
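If your application builds this XML in code rather than serving a static file, the following is a minimal sketch using only the Python standard library. The function name build_stream_xml and its parameters are illustrative assumptions, not part of any Plivo SDK; the element and attribute names are taken from the XML above.

```python
import xml.etree.ElementTree as ET


def build_stream_xml(ws_url: str, bidirectional: bool = True,
                     keep_call_alive: bool = True) -> str:
    """Return a <Response> document containing a single <Stream> element.

    Hypothetical helper: only the element and attribute names come from
    the XML example in this guide.
    """
    response = ET.Element("Response")
    stream = ET.SubElement(response, "Stream", {
        # XML attribute values are strings, so booleans become "true"/"false".
        "bidirectional": str(bidirectional).lower(),
        "keepCallAlive": str(keep_call_alive).lower(),
    })
    stream.text = ws_url
    return ET.tostring(response, encoding="unicode")


if __name__ == "__main__":
    print(build_stream_xml("wss://yourstream.websocket.io/audiostream"))
```

Your web framework would return this string with an XML content type from the URL that Plivo requests during the call.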

API

Alternatively, you can use Plivo APIs to initiate and manage audio streams.

curl -i --user AUTH_ID:AUTH_TOKEN \
    -H "Content-Type: application/json" \
    -d '{"service_url": "wss://yourstream.ngrok.io/audiostream"}' \
    https://api.plivo.com/v1/Account/{auth_id}/Call/{call_uuid}/Stream/

See our audio streaming API reference documentation for complete details.
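The same request can be assembled in Python without an SDK. This sketch only builds the request object so you can inspect it before sending; the helper name build_start_stream_request is an assumption for illustration, while the URL pattern, JSON body, and basic authentication mirror the cURL command above.

```python
import base64
import json
import urllib.request

API_BASE = "https://api.plivo.com/v1/Account"


def build_start_stream_request(auth_id: str, auth_token: str,
                               call_uuid: str, service_url: str):
    """Build (but do not send) the POST request that starts an audio stream."""
    url = f"{API_BASE}/{auth_id}/Call/{call_uuid}/Stream/"
    body = json.dumps({"service_url": service_url}).encode("utf-8")
    # HTTP basic auth, equivalent to curl's --user AUTH_ID:AUTH_TOKEN.
    credentials = base64.b64encode(
        f"{auth_id}:{auth_token}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {credentials}",
        },
    )


if __name__ == "__main__":
    req = build_start_stream_request(
        "AUTH_ID", "AUTH_TOKEN", "CALL_UUID",
        "wss://yourstream.ngrok.io/audiostream")
    print(req.get_method(), req.full_url)
    # To actually send it: urllib.request.urlopen(req)
```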

Retrieving an audio stream

You can retrieve a specific stream using the stream’s UUID, as shown in this cURL command.

curl -i --user AUTH_ID:AUTH_TOKEN \
    https://api.plivo.com/v1/Account/{auth_id}/Call/{call_uuid}/Stream/{stream_uuid}/

Sample response:

{
"api_id": "2053be7b-10e6-11ee-9cd1-0242ac110009",
"audio_track": "both",
"bidirectional": false,
"bill_duration": 27,
"billed_amount": "0.00300",
"call_uuid": "78737f83-4660-490d-98e1-025dfe4b5c8f",
"created_at": "2023-06-21 13:23:44.136962+00:00",
"end_time": "2023-06-21 18:53:43+05:30",
"plivo_auth_id": "MAY2RJNZKZNJMWOTXXX",
"resource_uri": "/v1/Account/MAY2RJNZKZNJMWOTXXX/Call/78737f83-4660-490d-98e1-025dfe4b5c8f/Stream/20170ada-f610-433b-8758-c02a2aab3662/",
"rounded_bill_duration": 60,
"service_url": "wss://mysocket.com/wss/v2/1/demo/",
"start_time": "2023-06-21 18:53:16+05:30",
"stream_id": "20170ada-f610-433b-8758-c02a2aab3662"
}

You can also retrieve all of a call’s streams. For syntax and details, see the audio streaming API reference.

Audio stream transmission files

Audio streams are transmitted over WebSockets as JSON messages.

At the start of an audio stream, Plivo first sends an event to mark the successful connection.

Event on starting the audio stream

{
  "sequenceNumber": 0,
  "event": "start",
  "start": {
    "callId": "8c43a765-94fa-4ee9-b9a3-242703e41f63",
    "streamId": "b77e037d-4119-44b5-902d-25826b654539",
    "accountId": "155747",
    "tracks": [
      "inbound",
      "outbound"
    ],
    "mediaFormat": {
      "encoding": "audio/x-l16",
      "sampleRate": 8000
    }
  },
  "extra_headers": "{}"
}
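A WebSocket server typically reads the start event once and keeps the media format for decoding later messages. The sketch below parses the event shown above into a small record; the StreamInfo class and parse_start_event function are illustrative names, and only the JSON field names come from the sample.

```python
import json
from dataclasses import dataclass


@dataclass
class StreamInfo:
    """Details worth keeping from a start event (illustrative structure)."""
    call_id: str
    stream_id: str
    tracks: list
    encoding: str
    sample_rate: int


def parse_start_event(raw: str) -> StreamInfo:
    """Parse the JSON text of a start event into a StreamInfo record."""
    message = json.loads(raw)
    if message.get("event") != "start":
        raise ValueError(f"expected a start event, got {message.get('event')!r}")
    start = message["start"]
    media_format = start["mediaFormat"]
    return StreamInfo(
        call_id=start["callId"],
        stream_id=start["streamId"],
        tracks=list(start["tracks"]),
        encoding=media_format["encoding"],
        sample_rate=int(media_format["sampleRate"]),
    )
```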

Subsequent events include the actual audio being sent to the WebSocket in Base64-encoded format.

Event on receiving an inbound media event

{
  "sequenceNumber": 887,
  "streamId": "20170ada-f610-433b-8758-c02a2aab3662",
  "event": "media",
  "media": {
    "track": "inbound",
    "timestamp": "1687353805345",
    "chunk": 469,
    "payload": "CAAIAAgACAAIAAgACAAIAAgACAAIAAgACAAIAAgACAAIAAgACAAIAAgACAAIAAgACAAIAAgACAAIAAgACAAIAAgACAAIAAgACAAIAAgACAAIAAgACAAIAAgACAAIAAgACAAIAAgACAAIAAgACAAIAAgACAAIAAgACAAIAAgACAAIAAgACAAIAAgACAAIAAgACAAIAAgACAAIAAgACAAIAAgACAAIAAgACAAIAAgACAAIAAgACAAIAAgACAAIAAgACAAIAAgACAAIAAgACAAIAAgACAAIAAgACAAIAAgACAAIAAgACAAIAAgACAAIAAgACAAIAAgACAAIAAgACAAIAAgACAAIAAgACAAIAAgACAAIAAgACAAIAAgACAAIAAgACAAIAAgACAAIAAgACAAIAAgACAAIAAgACAAIAAgACAA="
  },
  "extra_headers": "{}"
}

A similar event is sent for outbound audio streams; for them, the track value is “outbound.”
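To work with the audio itself, decode the Base64 payload into PCM samples. The sketch below assumes the audio/x-l16 encoding and 16-bit little-endian samples; the byte order is an inference from the sample payload above (which decodes to small values under that reading), not something this guide states, so verify it against your own streams. The function name decode_media_event is illustrative.

```python
import base64
import json
import struct


def decode_media_event(raw: str):
    """Return (track, samples) for a media event.

    Assumes audio/x-l16 with 16-bit little-endian samples (an assumption,
    see the note above). Samples are returned as a tuple of signed ints.
    """
    message = json.loads(raw)
    if message.get("event") != "media":
        raise ValueError(f"expected a media event, got {message.get('event')!r}")
    media = message["media"]
    pcm = base64.b64decode(media["payload"])
    count = len(pcm) // 2  # two bytes per 16-bit sample
    samples = struct.unpack(f"<{count}h", pcm[:count * 2])
    return media["track"], samples
```

If the stream uses audio/x-mulaw instead, the payload holds one byte per sample and needs a mu-law decoder rather than struct.unpack.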

Plivo also lets you send audio back from the client via the WebSocket.

Play audio event

You can use the playAudio event to transmit audio through the WebSocket. When the bidirectional attribute is set to true, Plivo delivers the audio sent by your application to the party on the call.

Attributes

event

Indicates the event type. playAudio is the value required to transmit audio over the WebSocket.

media

An object containing media metadata and payload.

contentType

The audio codec format.
Allowed values: audio/x-l16, audio/x-mulaw

sampleRate

Sample rate of the audio transmitted.
Allowed values: 8000, 16000

payload

Base64-encoded string of raw audio

Sample Request

{
  "event": "playAudio",
  "media": {
    "contentType": "audio/x-l16",
    "sampleRate": 8000,
    "payload": "base64 encoded raw audio.."
  }
}
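A playAudio message can be built from raw audio bytes as in the sketch below. The helper name build_play_audio is an assumption for illustration; the field names and allowed values are the ones listed in the attributes above.

```python
import base64
import json

ALLOWED_CONTENT_TYPES = {"audio/x-l16", "audio/x-mulaw"}
ALLOWED_SAMPLE_RATES = {8000, 16000}


def build_play_audio(raw_audio: bytes, content_type: str = "audio/x-l16",
                     sample_rate: int = 8000) -> str:
    """Return the JSON text of a playAudio event for the given raw audio."""
    if content_type not in ALLOWED_CONTENT_TYPES:
        raise ValueError(f"unsupported contentType: {content_type}")
    if sample_rate not in ALLOWED_SAMPLE_RATES:
        raise ValueError(f"unsupported sampleRate: {sample_rate}")
    return json.dumps({
        "event": "playAudio",
        "media": {
            "contentType": content_type,
            "sampleRate": sample_rate,
            # The payload must be a Base64 string, not raw bytes.
            "payload": base64.b64encode(raw_audio).decode("ascii"),
        },
    })
```

Send the returned string as a text message over the same WebSocket connection that Plivo opened to your application.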

Checkpoint event

Send a checkpoint event via the WebSocket after you have queued the desired audio events. Once the buffered audio events preceding the checkpoint have been played back to the end user, Plivo responds with a playedStream event that carries the same name.

Request

{
  "event": "checkpoint",
  "streamId": "20170ada-f610-433b-8758-c02a2aab3662",
  "name": "customer greeting audio"
}

Response

{
  "event": "playedStream",
  "streamId": "20170ada-f610-433b-8758-c02a2aab3662",
  "name": "customer greeting audio"
}
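One way to use checkpoints is to remember the names you have sent and tick them off as the matching responses arrive. The sketch below does that in memory; the CheckpointTracker class is a hypothetical helper, and it matches on the playedStream event name shown in the sample response above.

```python
import json


class CheckpointTracker:
    """Track checkpoints that have been sent but not yet confirmed as played."""

    def __init__(self, stream_id: str):
        self.stream_id = stream_id
        self.pending = set()

    def checkpoint(self, name: str) -> str:
        """Record the checkpoint and return the JSON text to send."""
        self.pending.add(name)
        return json.dumps({
            "event": "checkpoint",
            "streamId": self.stream_id,
            "name": name,
        })

    def handle(self, raw: str) -> bool:
        """Return True if raw confirms one of the pending checkpoints."""
        message = json.loads(raw)
        if (message.get("event") == "playedStream"
                and message.get("name") in self.pending):
            self.pending.discard(message["name"])
            return True
        return False
```

For example, queue the playAudio events for a greeting, send tracker.checkpoint("customer greeting audio"), and start listening for the caller's reply once handle() returns True.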

Clear audio event

You can interrupt playback and clear the buffered audio by sending the clearAudio event over the same WebSocket connection. Plivo clears all buffered media events, so you can then send new playAudio events.

You can send the clear audio event using the format below.

Sample Request

{
  "event": "clearAudio",
  "streamId": "20170ada-f610-433b-8758-c02a2aab3662"
}

Response

{
  "sequenceNumber": 0,
  "event": "clearedAudio",
  "streamId": "20170ada-f610-433b-8758-c02a2aab3662"
}
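A common use of clearAudio is interrupting a prompt when the caller starts speaking and then playing something else. The sketch below returns the two messages in the order they would be sent; the function names are illustrative, and whether to wait for the clearedAudio response before sending new audio is left to the caller, because this guide does not specify it.

```python
import base64
import json


def interrupt_and_replace(stream_id: str, new_audio: bytes,
                          content_type: str = "audio/x-l16",
                          sample_rate: int = 8000) -> list:
    """Return [clearAudio, playAudio] JSON messages, in sending order."""
    clear = {"event": "clearAudio", "streamId": stream_id}
    play = {
        "event": "playAudio",
        "media": {
            "contentType": content_type,
            "sampleRate": sample_rate,
            "payload": base64.b64encode(new_audio).decode("ascii"),
        },
    }
    return [json.dumps(clear), json.dumps(play)]


def is_cleared(raw: str, stream_id: str) -> bool:
    """Return True if raw is the clearedAudio response for this stream."""
    message = json.loads(raw)
    return (message.get("event") == "clearedAudio"
            and message.get("streamId") == stream_id)
```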

WebSocket connection failures

If the WebSocket connection fails, either on the initial attempt or because an established connection drops, Plivo retries the connection twice before disconnecting.

A sample audio streaming application

Here’s some sample Python code, using Flask, that illustrates managing audio streams.

from flask import Flask
import plivo

app = Flask(__name__)

auth_id = '<auth_id>'
auth_token = '<auth_token>'
call_uuid = '506f0d9f-5961-4c3f-b595-732b36c24e29'  # UUID of the call for which the stream is initiated
service_url = 'wss://8309-49-36-97-128.ngrok.io'  # WebSocket URL to which Plivo sends raw audio
bidirectional = False  # whether the audio stream is one-way or bidirectional
audio_track = 'both'  # which track is forked for the audio stream
stream_timeout = 86400  # maximum duration, in seconds, for which audio is streamed
status_callback_url = 'https://<yourdomain>.com/events/'  # URL to which Plivo sends audio streaming parameters
status_callback_method = 'POST'  # HTTP method used to invoke the status callback URL
content_type = 'audio/x-l16;rate=16000'  # preferred audio codec and sampling rate
extra_headers = {"Test": "test1"}  # additional headers sent to the WebSocket

client = plivo.RestClient(auth_id, auth_token)

@app.route('/start_stream')
def start_audio_stream():
    try:
        response = client.calls.start_stream(
            call_uuid=call_uuid,
            service_url=service_url,
            bidirectional=bidirectional,
            audio_track=audio_track,
            stream_timeout=stream_timeout,
            status_callback_url=status_callback_url,
            status_callback_method=status_callback_method,
            content_type=content_type,
            extra_headers=extra_headers
        )
        return str(response)
    except plivo.exceptions.PlivoRestError as e:
        return str(e)

@app.route('/stop_stream/<stream_uuid>')
def stop_audio_stream(stream_uuid):
    try:
        response = client.calls.delete_specific_stream(call_uuid, stream_uuid)
        return str(response)
    except plivo.exceptions.PlivoRestError as e:
        return str(e)

@app.route('/retrieve_stream/<stream_uuid>')
def retrieve_audio_stream(stream_uuid):
    try:
        response = client.calls.get_details_of_specific_stream(call_uuid, stream_uuid)
        return str(response)
    except plivo.exceptions.PlivoRestError as e:
        return str(e)

if __name__ == '__main__':
    app.run()

The Flask application defines three routes:

  • /start_stream initiates the audio stream by calling the start_audio_stream function.
  • /stop_stream/<stream_uuid> stops the audio stream by calling the stop_audio_stream function. The stream_uuid is passed as a parameter in the URL.
  • /retrieve_stream/<stream_uuid> retrieves the audio stream by calling the retrieve_audio_stream function. The stream_uuid is passed as a parameter in the URL.

Buffering

If the WebSocket connection is slow or unstable because of network problems, Plivo buffers audio packets for up to 40 seconds. Each time the buffer reaches 30%, 60%, or 90% of its capacity, Plivo sends a connection_degraded event to the designated status callback URL to notify your application of a potentially unstable WebSocket connection.
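To get a feel for what those thresholds mean, the arithmetic below converts them into seconds of audio and approximate bytes. It assumes the percentages are fractions of the 40-second buffer and that audio is uncompressed 16-bit mono (audio/x-l16) on a single track; both are assumptions for illustration, not documented behavior.

```python
BUFFER_SECONDS = 40
THRESHOLD_PERCENTS = (30, 60, 90)


def degraded_thresholds(sample_rate: int = 8000, bytes_per_sample: int = 2):
    """Return (percent, seconds_buffered, approx_bytes) for each threshold.

    Assumes one mono track of uncompressed PCM; real buffer accounting
    on Plivo's side may differ.
    """
    bytes_per_second = sample_rate * bytes_per_sample
    return [
        (percent,
         percent * BUFFER_SECONDS / 100,
         percent * BUFFER_SECONDS * bytes_per_second // 100)
        for percent in THRESHOLD_PERCENTS
    ]


if __name__ == "__main__":
    for percent, seconds, size in degraded_thresholds():
        print(f"{percent}% of buffer: {seconds:.0f} s of audio, about {size} bytes")
```

Under these assumptions, the first connection_degraded event would arrive once your application is about 12 seconds behind.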

Pricing

Audio streaming is priced at $0.003 per minute per stream, over and above the expected charges for voice minutes associated with a call.
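As a rough estimate, the streaming charge can be computed from a stream's duration. The sketch assumes durations are rounded up to whole minutes, which is inferred from the sample response earlier in this guide (bill_duration 27, rounded_bill_duration 60, billed_amount 0.00300); confirm the rounding rule against your own billing data. The function name stream_cost is illustrative.

```python
from decimal import Decimal

PRICE_PER_MINUTE = Decimal("0.003")  # USD per minute per stream


def stream_cost(bill_duration_seconds: int) -> Decimal:
    """Estimate the streaming charge, rounding up to whole minutes (assumed)."""
    if bill_duration_seconds < 0:
        raise ValueError("duration must not be negative")
    billed_minutes = -(-bill_duration_seconds // 60)  # integer ceiling division
    return PRICE_PER_MINUTE * billed_minutes


if __name__ == "__main__":
    print(stream_cost(27))   # one billed minute
    print(stream_cost(125))  # three billed minutes
```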