Pipecat is an open-source framework for building conversational AI agents. It orchestrates speech-to-text (STT), language models (LLM), and text-to-speech (TTS) services into a unified pipeline.
Connect Plivo Audio Streaming to Pipecat to build AI voice agents that handle inbound and outbound phone calls.
How It Works
Phone Call ↔ Plivo ↔ WebSocket Stream ↔ Pipecat ↔ AI Services
├── STT (Deepgram)
├── LLM (OpenAI/Gemini)
└── TTS (Cartesia/ElevenLabs)
- Plivo handles phone call routing and streams real-time audio over WebSocket
- Pipecat receives audio and orchestrates the AI pipeline
- STT service converts speech to text
- LLM processes the text and generates a response
- TTS service converts the response back to speech
- Plivo plays the audio to the caller
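To make the audio hop in the list above concrete, here is a minimal sketch of pulling raw audio bytes out of one WebSocket message. It assumes each audio message is JSON with an `event` field set to `media` and a base64-encoded `media.payload`; treat that shape as an assumption and check it against Plivo's Audio Streaming reference. The function name `extract_audio` is hypothetical. In a real bot, Pipecat's transport handles this step, so the code is illustrative only.

```python
import base64
import json
from typing import Optional


def extract_audio(raw_message: str) -> Optional[bytes]:
    """Return decoded audio bytes from a media message, or None for other events."""
    message = json.loads(raw_message)
    # Assumed message shape: {"event": "media", "media": {"payload": "<base64>"}}
    if message.get("event") != "media":
        return None
    payload = message.get("media", {}).get("payload", "")
    return base64.b64decode(payload)


# Example with a synthetic message (not captured from a real call).
sample = json.dumps(
    {"event": "media", "media": {"payload": base64.b64encode(b"\x00\x01").decode()}}
)
audio = extract_audio(sample)
```

The decoded bytes are what an STT service would consume; non-audio events return `None` so the caller can ignore them.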
Choose Your Stack
Pipecat supports multiple AI service combinations. Choose based on your requirements:
Standard Pipelines (STT → LLM → TTS)
| Guide | STT | LLM | TTS | Best For |
|---|---|---|---|---|
| OpenAI + Cartesia | Deepgram | OpenAI GPT-4o | Cartesia | Low latency, expressive voices |
| OpenAI + ElevenLabs | Deepgram | OpenAI GPT-4o | ElevenLabs | Natural voices, voice cloning |
| Gemini + Cartesia | Deepgram | Google Gemini | Cartesia | Cost-effective, fast responses |
| Gemini + ElevenLabs | Deepgram | Google Gemini | ElevenLabs | Balance of cost and voice quality |
Speech-to-Speech (Direct Audio Processing)
| Guide | Model | Best For |
|---|---|---|
| OpenAI Realtime | GPT-4o Realtime | Lowest latency, native multimodal |
| Gemini Live | Gemini Live | Multimodal with video support |
Speech-to-speech models process audio directly without intermediate text conversion, resulting in lower latency and more natural conversations.
Prerequisites
| Requirement | Description |
|---|---|
| Plivo Account | Sign up and get Auth ID and Auth Token |
| Phone Number | Purchase a voice-enabled number |
| Pipecat | Install via pip install pipecat-ai |
| AI Service Accounts | Credentials for your chosen STT, LLM, and TTS providers |
Quick Start
1. Clone the Examples
git clone https://github.com/pipecat-ai/pipecat-examples.git
cd pipecat-examples/plivo-chatbot
2. Choose Inbound or Outbound
- Inbound calls: cd inbound (receive calls on your Plivo number)
- Outbound calls: cd outbound (initiate calls programmatically)
3. Configure Environment
Edit .env with your credentials. The required keys vary by provider stack.
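Because a missing key tends to surface later as a silent bot or an authentication error, it can help to check the environment before starting the server. The sketch below is one way to do that. The variable names in `REQUIRED_KEYS` are hypothetical placeholders; use the names listed in the example's own env template for your stack. It also assumes the values from .env have already been loaded into the process environment.

```python
import os
from typing import Iterable, List, Mapping

# Hypothetical key names for an OpenAI + Cartesia stack; adjust to your .env.
REQUIRED_KEYS = (
    "PLIVO_AUTH_ID",
    "PLIVO_AUTH_TOKEN",
    "DEEPGRAM_API_KEY",
    "OPENAI_API_KEY",
    "CARTESIA_API_KEY",
)


def missing_keys(
    env: Mapping[str, str], required: Iterable[str] = REQUIRED_KEYS
) -> List[str]:
    """Return the required names that are absent or blank in env."""
    return [name for name in required if not env.get(name, "").strip()]


# Report any gaps in the current process environment.
missing = missing_keys(os.environ)
if missing:
    print("Missing credentials:", ", ".join(missing))
```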
4. Start the Server
uv sync && uv run server.py
5. Expose for Development (Inbound Only)
Expose your local server with a tunnel such as ngrok, then set your Plivo number’s Answer URL to the public URL, for example https://your-ngrok-url.ngrok.io/
Inbound vs Outbound Calls
Inbound Calls
Your Plivo number receives calls and connects them to your Pipecat bot.
Setup:
- Configure the Answer URL on your Plivo number
- Plivo sends the call to your server
- Server returns XML with a <Stream> element
- A WebSocket connection is established with Pipecat
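The XML step in this flow can be sketched as follows. This is an illustration, not the example server's actual code: the function name `build_answer_xml` is hypothetical, and the attribute values shown (`bidirectional`, `keepCallAlive`, `contentType`) are assumptions to verify against Plivo's Stream XML reference for your use case.

```python
import xml.etree.ElementTree as ET


def build_answer_xml(websocket_url: str) -> str:
    """Build an answer document that tells Plivo to stream call audio to websocket_url."""
    response = ET.Element("Response")
    stream = ET.SubElement(
        response,
        "Stream",
        {
            # Assumed settings: two-way audio, keep the call up while streaming.
            "bidirectional": "true",
            "keepCallAlive": "true",
            "contentType": "audio/x-mulaw;rate=8000",
        },
    )
    # The element text is the WebSocket URL Plivo should connect to.
    stream.text = websocket_url
    return ET.tostring(response, encoding="unicode")


# Hypothetical WebSocket endpoint on the tunnel host.
print(build_answer_xml("wss://your-ngrok-url.ngrok.io/ws"))
```

Building the document with an XML library rather than string formatting keeps the URL properly escaped if it ever carries query parameters.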
Outbound Calls
Your application initiates calls to phone numbers.
Setup:
- Start your Pipecat server
- Call the /start endpoint with the target phone number
- Plivo places the call and connects to your bot
curl -X POST http://localhost:7860/start \
-H "Content-Type: application/json" \
-d '{"phone_number": "+1234567890"}'
Pass custom data to your bot:
curl -X POST http://localhost:7860/start \
-H "Content-Type: application/json" \
-d '{
"phone_number": "+1234567890",
"user_name": "John",
"context": "appointment reminder"
}'
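If you trigger calls from application code rather than a shell, the same request can be built with Python's standard library. This is a minimal sketch: `build_start_request` is a hypothetical helper, and the base URL assumes the server is listening on localhost:7860 as in the curl examples.

```python
import json
import urllib.request
from typing import Any


def build_start_request(
    base_url: str, phone_number: str, **extra: Any
) -> urllib.request.Request:
    """Build a POST to /start; extra keyword arguments become custom JSON fields."""
    payload = {"phone_number": phone_number, **extra}
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/start",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_start_request(
    "http://localhost:7860",
    "+1234567890",
    user_name="John",
    context="appointment reminder",
)
print(request.full_url, request.data.decode("utf-8"))
# With the server running, urllib.request.urlopen(request) would send it.
```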
Access this data in your bot via runner_args.body.
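A defensive way to read those fields is sketched below. It assumes `runner_args.body` behaves like a dictionary holding the JSON fields posted to /start, and that it may be empty or missing for calls started without custom data. The helper name `read_call_context` and the default values are hypothetical.

```python
from typing import Any, Dict, Mapping, Optional


def read_call_context(body: Optional[Mapping[str, Any]]) -> Dict[str, str]:
    """Extract the custom fields, falling back to defaults when they are absent."""
    body = body or {}
    return {
        "user_name": str(body.get("user_name", "there")),
        "context": str(body.get("context", "")),
    }


# Example with the payload from the curl command above.
call_context = read_call_context(
    {"phone_number": "+1234567890", "user_name": "John", "context": "appointment reminder"}
)
greeting = f"Hi {call_context['user_name']}, this is your {call_context['context']}."
```

Inside a bot you would pass `runner_args.body` to the helper and feed the result into the LLM's system prompt or opening message.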
Troubleshooting
| Issue | Solution |
|---|---|
| Call doesn’t connect | Verify ngrok URL matches Plivo Answer URL |
| No audio | Check WebSocket connection in Pipecat logs |
| Bot not responding | Verify AI service API keys in .env |
| Authentication errors | Check Plivo Auth ID and Token |
Debug logs:
- Server logs: Terminal running server.py
- Bot logs: bot_<room_name>.log files
- Plivo logs: Console > Logs > Calls