Build a voice agent using Deepgram for speech recognition, OpenAI GPT-4o for conversation, and ElevenLabs for natural-sounding text-to-speech synthesis.
Best for: Applications requiring natural voices, voice cloning, or multilingual support.
Prerequisites
| Service | What You Need |
|---|---|
| Plivo | Auth ID, Auth Token, Voice-enabled phone number |
| Deepgram | API key from console.deepgram.com |
| OpenAI | API key from platform.openai.com |
| ElevenLabs | API key from elevenlabs.io |
Installation
pip install "pipecat-ai[deepgram,openai,elevenlabs]"
Environment Variables
# Plivo credentials
PLIVO_AUTH_ID=your_auth_id
PLIVO_AUTH_TOKEN=your_auth_token
PLIVO_PHONE_NUMBER=+1234567890
# AI service credentials
DEEPGRAM_API_KEY=your_deepgram_key
OPENAI_API_KEY=sk-your_openai_key
ELEVENLABS_API_KEY=your_elevenlabs_key
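Because `os.getenv` returns `None` for an unset variable instead of raising, a missing key can go unnoticed until a service is first used. The sketch below is one way to check the environment at startup using only the standard library. The helper name `missing_env_vars` is hypothetical; the variable names are the ones listed above.

```python
import os

# Variable names taken from the list above.
REQUIRED_VARS = (
    "PLIVO_AUTH_ID",
    "PLIVO_AUTH_TOKEN",
    "PLIVO_PHONE_NUMBER",
    "DEEPGRAM_API_KEY",
    "OPENAI_API_KEY",
    "ELEVENLABS_API_KEY",
)


def missing_env_vars(env=None):
    """Return the required variable names that are unset or blank (hypothetical helper)."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not (env.get(name) or "").strip()]
```

Calling `missing_env_vars()` before constructing any services, and exiting with the returned names if the list is non-empty, gives a clearer failure than an authentication error in the middle of a call.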
Pipeline Configuration
import os

from pipecat.services.deepgram import DeepgramSTTService
from pipecat.services.openai import OpenAILLMService
from pipecat.services.elevenlabs import ElevenLabsTTSService

# Speech-to-Text
stt = DeepgramSTTService(
    api_key=os.getenv("DEEPGRAM_API_KEY"),
)

# Language Model
llm = OpenAILLMService(
    api_key=os.getenv("OPENAI_API_KEY"),
    model="gpt-4o",
)

# Text-to-Speech
tts = ElevenLabsTTSService(
    api_key=os.getenv("ELEVENLABS_API_KEY"),
    voice_id="your_voice_id",  # Browse voices at elevenlabs.io/voice-library
)
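The three services only become an agent once they are placed in a pipeline together with the call's audio transport. As a rough sketch of the ordering (based on how Pipecat's telephony examples are commonly wired, so treat it as an assumption rather than a description of the example's exact code), audio flows from the transport input through STT, the LLM, and TTS, then back out through the transport. The helper `ordered_processors` and its `user_aggregator` and `assistant_aggregator` parameters are hypothetical names; in Pipecat the resulting list would be passed to `Pipeline` from `pipecat.pipeline.pipeline`.

```python
def ordered_processors(transport, stt, user_aggregator, llm, tts, assistant_aggregator):
    """Return processors in the order audio and text would flow through them.

    Hypothetical helper: `transport` is assumed to expose input() and output(),
    and the two aggregators are assumed to track the conversation context.
    """
    return [
        transport.input(),     # caller audio arriving from Plivo
        stt,                   # audio -> transcript
        user_aggregator,       # add the caller's words to the conversation context
        llm,                   # transcript -> response text
        tts,                   # response text -> audio
        transport.output(),    # audio sent back to the caller
        assistant_aggregator,  # record what the agent said
    ]
```

Placing the assistant-side aggregator after the transport output means the context records a reply only once it has been passed along for playback, which matters when the caller interrupts.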
Service Details
Deepgram STT
Real-time speech recognition with interim results and language detection.
| Option | Description |
|---|---|
| `DeepgramSTTService` | Standard WebSocket transcription |
| `DeepgramFluxSTTService` | Enhanced turn detection for conversations |
OpenAI LLM
Chat completion with GPT-4o supporting streaming responses and function calling.
| Model | Description |
|---|---|
| `gpt-4o` | Most capable, multimodal |
| `gpt-4o-mini` | Faster, cost-effective |
| `gpt-4-turbo` | Previous generation |
ElevenLabs TTS
Natural voice synthesis with word-level timing and voice cloning support.
| Feature | Description |
|---|---|
| WebSocket streaming | Real-time audio with low latency |
| Word-level timing | Precise synchronization |
| Voice cloning | Create custom voices |
| Multilingual | 29+ languages supported |
Service options:
- `ElevenLabsTTSService`: WebSocket-based, recommended for real-time use
- `ElevenLabsHttpTTSService`: HTTP-based, simpler setup
Quick Start
Inbound Calls
git clone https://github.com/pipecat-ai/pipecat-examples.git
cd pipecat-examples/plivo-chatbot/inbound
# Configure environment
cp env.example .env
# Edit .env with your credentials
# Start server
uv sync && uv run server.py
# Expose with ngrok (development)
ngrok http 7860
Set your Plivo number’s Answer URL to the public forwarding URL that ngrok prints.
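When a call arrives, Plivo requests the Answer URL and expects Plivo XML in response. For a streaming agent, that XML typically contains a `<Stream>` element pointing at a WebSocket endpoint. The sketch below builds such a response with the standard library. The `/ws` path, the example hostname, and the helper name `answer_xml` are assumptions; check the attribute values against Plivo's Stream XML reference and the example's `server.py`.

```python
import xml.etree.ElementTree as ET


def answer_xml(public_host, ws_path="/ws"):
    """Build a Plivo XML response that streams call audio to a WebSocket (sketch)."""
    response = ET.Element("Response")
    stream = ET.SubElement(
        response,
        "Stream",
        {
            "bidirectional": "true",  # the agent sends audio back as well as receiving it
            "keepCallAlive": "true",  # keep the call up while the stream is connected
            "contentType": "audio/x-mulaw;rate=8000",
        },
    )
    # Assumed endpoint: the public ngrok host plus a hypothetical /ws route.
    stream.text = f"wss://{public_host}{ws_path}"
    return ET.tostring(response, encoding="unicode")


print(answer_xml("example.ngrok.app"))
```

The WebSocket URL uses `wss://` with the same public host as the Answer URL, so a single ngrok tunnel can serve both the XML response and the audio stream.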
Outbound Calls
cd pipecat-examples/plivo-chatbot/outbound
cp env.example .env
uv sync && uv run server.py
# Initiate a call
curl -X POST http://localhost:7860/start \
  -H "Content-Type: application/json" \
  -d '{"phone_number": "+1234567890"}'
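The same request can be issued from Python, which is convenient when outbound calls are triggered by application code. This sketch uses only the standard library and mirrors the `curl` command above. The helper names `build_start_request` and `start_call` are hypothetical; the `/start` endpoint and the `phone_number` field are taken from that command.

```python
import json
import urllib.request


def build_start_request(phone_number, base_url="http://localhost:7860"):
    """Build the POST request that asks the example server to place a call."""
    body = json.dumps({"phone_number": phone_number}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/start",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def start_call(phone_number):
    """Send the request and return the raw response body as text."""
    with urllib.request.urlopen(build_start_request(phone_number), timeout=10) as resp:
        return resp.read().decode("utf-8")
```

Keeping request construction separate from sending makes it possible to inspect the URL and payload without a running server.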