Pipecat is an open-source framework for building conversational AI agents. It orchestrates speech-to-text (STT), language models (LLM), and text-to-speech (TTS) services into a unified pipeline. Connect Plivo Audio Streaming to Pipecat to build AI voice agents that handle inbound and outbound phone calls.

How It Works

Phone Call ↔ Plivo ↔ WebSocket Stream ↔ Pipecat ↔ AI Services
                                            ├── STT (Deepgram)
                                            ├── LLM (OpenAI/Gemini)
                                            └── TTS (Cartesia/ElevenLabs)
  1. Plivo handles phone call routing and streams real-time audio over WebSocket
  2. Pipecat receives audio and orchestrates the AI pipeline
  3. STT service converts speech to text
  4. LLM processes the text and generates a response
  5. TTS service converts the response back to speech
  6. Plivo plays the audio to the caller
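
The flow in steps 2 to 5 can be sketched as a chain of three stages. This is an illustration of the data flow only: the function names below are hypothetical stand-ins, not Pipecat's API, and the stubs return canned values instead of calling a real STT, LLM, or TTS provider.

```python
# Hypothetical stand-ins for the three services. A real deployment would call
# an STT, LLM, and TTS provider here; these stubs only show the order of work.

def fake_stt(audio: bytes) -> str:
    """Pretend to transcribe caller audio (step 3)."""
    return audio.decode("utf-8")


def fake_llm(text: str) -> str:
    """Pretend to generate a reply from the transcript (step 4)."""
    return f"You said: {text}"


def fake_tts(text: str) -> bytes:
    """Pretend to synthesize speech from the reply (step 5)."""
    return text.encode("utf-8")


def handle_turn(audio_in: bytes, stt=fake_stt, llm=fake_llm, tts=fake_tts) -> bytes:
    """Run one conversational turn: audio in, audio out."""
    transcript = stt(audio_in)   # speech -> text
    reply = llm(transcript)      # text -> response text
    return tts(reply)            # response text -> speech


if __name__ == "__main__":
    print(handle_turn(b"hello"))
```

In a real agent the stages run as a streaming pipeline rather than one blocking call per turn, so the stubs above should be read as a mental model, not a template.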

Choose Your Stack

Pipecat supports multiple AI service combinations. Choose based on your requirements:

Standard Pipelines (STT → LLM → TTS)

Guide               | STT      | LLM           | TTS        | Best For
OpenAI + Cartesia   | Deepgram | OpenAI GPT-4o | Cartesia   | Low latency, expressive voices
OpenAI + ElevenLabs | Deepgram | OpenAI GPT-4o | ElevenLabs | Natural voices, voice cloning
Gemini + Cartesia   | Deepgram | Google Gemini | Cartesia   | Cost-effective, fast responses
Gemini + ElevenLabs | Deepgram | Google Gemini | ElevenLabs | Balance of cost and voice quality

Speech-to-Speech (Direct Audio Processing)

Guide           | Model           | Best For
OpenAI Realtime | GPT-4o Realtime | Lowest latency, native multimodal
Gemini Live     | Gemini Live     | Multimodal with video support

Speech-to-speech models process audio directly without intermediate text conversion, resulting in lower latency and more natural conversations.

Prerequisites

Requirement         | Description
Plivo Account       | Sign up and get Auth ID and Auth Token
Phone Number        | Purchase a voice-enabled number
Pipecat             | Install via pip install pipecat-ai
AI Service Accounts | Credentials for your chosen STT, LLM, and TTS providers

Quick Start

1. Clone the Examples

git clone https://github.com/pipecat-ai/pipecat-examples.git
cd pipecat-examples/plivo-chatbot

2. Choose Inbound or Outbound

  • Inbound calls: cd inbound - Receive calls on your Plivo number
  • Outbound calls: cd outbound - Initiate calls programmatically

3. Configure Environment

cp env.example .env
Edit .env with your credentials (varies by provider stack).
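
Because the required credentials vary by provider stack, it can help to check for missing values before starting the server. The sketch below is one way to do that. The variable names in EXAMPLE_REQUIRED are assumptions for illustration; the authoritative list is whatever env.example contains for your chosen stack.

```python
import os
from typing import Iterable, List, Mapping

# Assumed variable names, shown for illustration only.
# Check env.example in the example you cloned for the real ones.
EXAMPLE_REQUIRED = (
    "PLIVO_AUTH_ID",
    "PLIVO_AUTH_TOKEN",
    "DEEPGRAM_API_KEY",
    "OPENAI_API_KEY",
    "CARTESIA_API_KEY",
)


def missing_vars(required: Iterable[str], env: Mapping[str, str] = os.environ) -> List[str]:
    """Return the names in `required` that are unset or blank in `env`."""
    return [name for name in required if not env.get(name, "").strip()]


if __name__ == "__main__":
    missing = missing_vars(EXAMPLE_REQUIRED)
    if missing:
        print("Missing credentials:", ", ".join(missing))
    else:
        print("All example credentials are set.")
```

Note that this reads the process environment, so the values in .env must already be loaded (for example by the tool that starts the server) for the check to see them.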

4. Start the Server

uv sync && uv run server.py

5. Expose for Development (Inbound Only)

ngrok http 7860
Set your Plivo number’s Answer URL to the HTTPS forwarding URL that ngrok prints, for example https://your-ngrok-url.ngrok.io/

Inbound vs Outbound Calls

Inbound Calls

Your Plivo number receives calls and connects them to your Pipecat bot. Setup:
  1. Configure the Answer URL on your Plivo number
  2. When a call arrives, Plivo sends an HTTP request to your server at that URL
  3. Your server returns XML containing a <Stream> element
  4. Plivo opens a WebSocket connection to Pipecat and streams the call audio over it
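
The XML response in step 3 can be sketched with Python's standard library. The attribute values and the /ws path below are assumptions for illustration, not taken from this guide; confirm them against Plivo's Stream XML reference and the server.py in the example you cloned.

```python
import xml.etree.ElementTree as ET


def build_stream_xml(ws_url: str) -> str:
    """Build a Plivo XML response that points the call audio at `ws_url`."""
    response = ET.Element("Response")
    stream = ET.SubElement(
        response,
        "Stream",
        {
            # Assumed settings: two-way audio, keep the call open while
            # streaming, and 8 kHz mu-law audio. Adjust to your setup.
            "bidirectional": "true",
            "keepCallAlive": "true",
            "contentType": "audio/x-mulaw;rate=8000",
        },
    )
    # The element text is the WebSocket URL that Plivo should connect to.
    stream.text = ws_url
    return ET.tostring(response, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical URL: replace with your own public wss:// address.
    print(build_stream_xml("wss://your-ngrok-url.ngrok.io/ws"))
```

The WebSocket URL must be publicly reachable, so during development it usually shares the same ngrok host as the Answer URL, with the wss:// scheme instead of https://.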

Outbound Calls

Your application initiates calls to phone numbers. Setup:
  1. Start your Pipecat server
  2. Call the /start endpoint with the target phone number
  3. Plivo places the call and connects it to your bot
Example request:
curl -X POST http://localhost:7860/start \
  -H "Content-Type: application/json" \
  -d '{"phone_number": "+1234567890"}'
Pass custom data to your bot:
curl -X POST http://localhost:7860/start \
  -H "Content-Type: application/json" \
  -d '{
    "phone_number": "+1234567890",
    "user_name": "John",
    "context": "appointment reminder"
  }'
Access this data in your bot via runner_args.body.
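
The same request can be built from Python, and the custom fields read back on the bot side. This is a sketch under assumptions: build_start_request and read_call_context are hypothetical helper names, and it assumes runner_args.body arrives as a dict (or None) holding the JSON fields sent to /start.

```python
import json
import urllib.request
from typing import Optional


def build_start_request(base_url: str, phone_number: str, **custom_data) -> urllib.request.Request:
    """Build (but do not send) a POST to /start with optional custom fields."""
    payload = {"phone_number": phone_number, **custom_data}
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/start",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def read_call_context(body: Optional[dict]) -> dict:
    """Pull the custom fields out of the request body, with safe defaults.

    Assumes `body` is what the bot receives as runner_args.body.
    """
    body = body or {}
    return {
        "user_name": body.get("user_name", ""),
        "context": body.get("context", ""),
    }


if __name__ == "__main__":
    req = build_start_request(
        "http://localhost:7860",
        "+1234567890",
        user_name="John",
        context="appointment reminder",
    )
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
```

To actually place the call, pass the request to urllib.request.urlopen(req) while the server is running. Reading fields with defaults, as read_call_context does, keeps the bot from failing when a caller of /start omits the optional data.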

Troubleshooting

Issue                 | Solution
Call doesn’t connect  | Verify ngrok URL matches Plivo Answer URL
No audio              | Check WebSocket connection in Pipecat logs
Bot not responding    | Verify AI service API keys in .env
Authentication errors | Check Plivo Auth ID and Token

Debug logs:
  • Server logs: Terminal running server.py
  • Bot logs: bot_<room_name>.log files
  • Plivo logs: Console > Logs > Calls