Prerequisites
| Service | What You Need |
|---|---|
| Plivo | Auth ID, Auth Token, Voice-enabled phone number |
| Deepgram | API key from console.deepgram.com |
| OpenAI | API key from platform.openai.com |
| Cartesia | API key from play.cartesia.ai |
Installation
Environment Variables
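The credentials listed under Prerequisites are typically supplied through environment variables. The following is a minimal sketch of a startup check, not the guide's own configuration: the variable names in `REQUIRED_VARS` and the helpers `missing_vars` and `load_credentials` are assumptions chosen for illustration, so rename them to match your deployment.

```python
import os

# Assumed variable names (hypothetical); adjust to match your own setup.
REQUIRED_VARS = (
    "PLIVO_AUTH_ID",
    "PLIVO_AUTH_TOKEN",
    "PLIVO_PHONE_NUMBER",
    "DEEPGRAM_API_KEY",
    "OPENAI_API_KEY",
    "CARTESIA_API_KEY",
)


def missing_vars(env=None):
    """Return the required variable names that are unset or empty in env."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


def load_credentials(env=None):
    """Return a dict of credentials, raising if any required variable is missing."""
    env = os.environ if env is None else env
    missing = missing_vars(env)
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {name: env[name] for name in REQUIRED_VARS}
```

Calling `load_credentials()` once at startup fails fast with a readable message instead of surfacing an authentication error in the middle of a call.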
Pipeline Configuration
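The three services form a chain: caller audio is transcribed by Deepgram, the transcript is answered by OpenAI, and the reply is synthesized by Cartesia. The sketch below only illustrates that ordering with placeholder stages. It does not use the Pipecat API, and `make_stage`, `STAGES`, and `run_pipeline` are hypothetical names introduced for this example.

```python
from functools import reduce
from typing import Callable, Sequence

Stage = Callable[[str], str]


def make_stage(label: str) -> Stage:
    """Build a placeholder stage that wraps its input with the stage label."""
    def stage(payload: str) -> str:
        return f"{label}({payload})"
    return stage


# Order matters: speech is transcribed, then answered, then synthesized.
STAGES = (
    make_stage("deepgram_stt"),
    make_stage("openai_llm"),
    make_stage("cartesia_tts"),
)


def run_pipeline(payload: str, stages: Sequence[Stage] = STAGES) -> str:
    """Pass the payload through each stage in order and return the final result."""
    return reduce(lambda data, stage: stage(data), stages, payload)
```

In a real deployment each placeholder would be replaced by the corresponding service object, with the telephony transport supplying the input audio and consuming the output audio.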
Service Details
Deepgram STT
Real-time speech recognition with interim results and language detection.

| Option | Description |
|---|---|
| DeepgramSTTService | Standard WebSocket transcription |
| DeepgramFluxSTTService | Enhanced turn detection for conversations |

Use DeepgramFluxSTTService with ExternalUserTurnStrategies for better conversation flow.
OpenAI LLM
Chat completion with GPT-4o supporting streaming responses and function calling.

| Model | Description |
|---|---|
| gpt-4o | Most capable, multimodal |
| gpt-4o-mini | Faster, cost-effective |
| gpt-4-turbo | Previous generation |
Cartesia TTS
Real-time voice synthesis with word-level timing and interruption handling.

| Feature | Method |
|---|---|
| Spell out text | SPELL("ABC") |
| Add emotion | EMOTION_TAG("SARCASM") |
| Insert pause | PAUSE_TAG(0.5) |
| Adjust speed | SPEED_TAG(1.2) |
| Adjust volume | VOLUME_TAG(0.8) |
Quick Start
Inbound Calls
Outbound Calls
Related
- Pipecat Overview - Architecture and setup
- Deepgram Docs - STT configuration
- OpenAI Docs - LLM configuration
- Cartesia Docs - TTS configuration