Dynamically Caching Audio Files for Superior Voice Experience

Nov 23, 2018
Fetching a resource over the network can be slow and expensive, which makes the ability to cache and reuse previously fetched audio resources a critical tool for optimizing performance. Plivo has always allowed caching of audio files, but usage patterns vary widely: some customers use a single audio file for months, while others change an audio file every five minutes. A single caching policy can’t cover such diverse use cases. To address this, we’re introducing user-controlled audio file caching, which lets you choose how long your audio files are stored on our servers.

Modifying the caching behavior of your audio files doesn’t require any changes in the Plivo application. Our implementation is based on standard client-side caching directives for RESTful communication over HTTP. All you need to do is ensure that each response from your web server carries the right directive in the Cache-Control response header, which tells our servers whether and for how long to cache the audio resource. You have three options: no-cache, no-store, and max-age.

Cache-Control Response

no-cache indicates that the returned response can’t be used to satisfy a subsequent request to the same URL without first checking with the server to see whether the response has changed. Responding with Cache-Control: no-cache ensures Plivo always checks whether the file has changed.

no-store is simpler: it tells Plivo’s servers and all intermediate caches not to store any version of the returned response. Responding with Cache-Control: no-store ensures Plivo doesn’t keep a local copy of the file in its cache and always requests the resource from your web server.


max-age sets how long a cached copy stays fresh. Responding with Cache-Control: max-age=<new max-age value in seconds> tells Plivo to cache the returned audio file for that many seconds. Once the max-age value has elapsed, Plivo considers its cached copy stale.
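As a rough illustration, here is how a web server might build the header for each of the three options. This is a minimal Python sketch, not part of any Plivo SDK; the function names `cache_control_value` and `audio_response_headers` and the `audio/mpeg` content type are assumptions made for the example.

```python
from typing import Dict, Optional


def cache_control_value(policy: str, max_age_seconds: Optional[int] = None) -> str:
    """Return the Cache-Control header value for one of the three directives."""
    if policy in ("no-cache", "no-store"):
        return policy
    if policy == "max-age":
        if max_age_seconds is None or max_age_seconds < 0:
            raise ValueError("max-age needs a non-negative number of seconds")
        return f"max-age={int(max_age_seconds)}"
    raise ValueError(f"unsupported policy: {policy!r}")


def audio_response_headers(
    policy: str,
    max_age_seconds: Optional[int] = None,
    content_type: str = "audio/mpeg",  # assumed content type for the example
) -> Dict[str, str]:
    """Build the response headers your web server would send with an audio file."""
    return {
        "Content-Type": content_type,
        "Cache-Control": cache_control_value(policy, max_age_seconds),
    }
```

A file that changes every five minutes could be served with `audio_response_headers("max-age", 300)`, while a file that must never be stored could use `audio_response_headers("no-store")`.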

Validating cached responses with ETags

Suppose that max-age seconds have passed since the initial fetch and Plivo has initiated a new request for the same resource. Plivo first checks its local cache and finds the previous response, but it can’t use that response because it has expired. At this point Plivo could dispatch a new request and fetch the full response again, but that would be inefficient: if the resource hasn’t changed, there’s no reason to download the same information that’s already in the cache.
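The expiry check described here boils down to comparing the age of the cached copy with its max-age value. The Python sketch below is only an illustration of that comparison, not Plivo’s actual implementation; the `CachedAudio` record and the `is_fresh` function are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class CachedAudio:
    """Hypothetical record for one cached audio response."""
    body: bytes
    fetched_at: float  # time of the original fetch, in seconds since the epoch
    max_age: int       # value taken from Cache-Control: max-age=<seconds>


def is_fresh(entry: CachedAudio, now: float) -> bool:
    """A cached copy is fresh while its age is below max-age; after that it is stale."""
    age = now - entry.fetched_at
    return age < entry.max_age
```

A stale entry isn’t necessarily useless: it can still be reused if the origin server confirms that the file hasn’t changed.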

Validation tokens, specified in an ETag header, are designed to address this situation. The ETag value of a resource is typically an MD5 hash or some other fingerprint of the contents of the file.

Plivo sets the If-None-Match request header to the ETag value of the resource in its cache, allowing your web server to respond with a new version or a 304 Not Modified response to instruct Plivo to use its cached version. This enables more efficient resource update checks, as no data is transferred if the resource has not changed.
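On your web server, this exchange could look roughly like the following Python sketch. It is an assumption-laden example rather than a prescribed implementation: `compute_etag` and `handle_audio_request` are hypothetical names, the fingerprint is a SHA-256 digest (an MD5 hash would work the same way), the max-age of 300 seconds is arbitrary, and weak validators and the `*` wildcard are not handled.

```python
import hashlib
from typing import Dict, Optional, Tuple

CACHE_CONTROL = "max-age=300"  # assumed policy for this example


def compute_etag(body: bytes) -> str:
    """Fingerprint the file contents; ETag values are quoted strings."""
    return '"' + hashlib.sha256(body).hexdigest() + '"'


def handle_audio_request(
    body: bytes, if_none_match: Optional[str] = None
) -> Tuple[int, Dict[str, str], bytes]:
    """Return (status, headers, payload) for a request for one audio file."""
    etag = compute_etag(body)
    headers = {"ETag": etag, "Cache-Control": CACHE_CONTROL}
    if if_none_match is not None:
        # If-None-Match may carry several comma-separated ETags.
        candidates = [tag.strip() for tag in if_none_match.split(",")]
        if etag in candidates:
            # The cached copy is still current, so send no body.
            return 304, headers, b""
    # First fetch, or the file has changed: send the full response.
    return 200, headers, body
```

The first request returns 200 with an ETag. A later request that sends the same value in If-None-Match gets a 304 with an empty payload, and a request made after the file changes gets a 200 with the new contents.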

For optimal voice quality on calls, Plivo uses multiple media servers and accesses the one closest to the physical location of the user to cut latency. Every Plivo media server maintains its own local cache, which means that the first call landing on a media server that hasn’t fetched your file yet will always result in a cache miss, even if other media servers already hold a copy.
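To see why per-server caches produce a miss on each server, consider the toy model below. It is written in Python purely for illustration and says nothing about how Plivo’s media servers are actually built; `MediaServerCache`, `fetch_from_origin`, and the example URL are all hypothetical, and expiry is ignored to keep the sketch short.

```python
from typing import Callable, Dict, List, Tuple


class MediaServerCache:
    """Toy model of one media server with its own local cache."""

    def __init__(self) -> None:
        self._store: Dict[str, bytes] = {}

    def get(self, url: str, fetch: Callable[[str], bytes]) -> Tuple[bytes, str]:
        """Return the audio bytes and whether this lookup was a hit or a miss."""
        if url in self._store:
            return self._store[url], "hit"
        body = fetch(url)  # go back to the customer's web server
        self._store[url] = body
        return body, "miss"


origin_requests: List[str] = []


def fetch_from_origin(url: str) -> bytes:
    """Stand-in for an HTTP request to your web server."""
    origin_requests.append(url)
    return b"audio-bytes"


server_a = MediaServerCache()
server_b = MediaServerCache()
url = "https://example.com/greeting.mp3"  # hypothetical audio URL

results = [
    server_a.get(url, fetch_from_origin)[1],  # first call on server A
    server_a.get(url, fetch_from_origin)[1],  # second call on server A
    server_b.get(url, fetch_from_origin)[1],  # first call on server B
]
```

In this model the second call on server A is served from its cache, but the first call on server B still goes back to the origin because server B has never seen the file.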
