Deepgram + OpenAI + Rime

Build a voice agent using Deepgram for speech recognition, OpenAI GPT-4o for conversation, and Rime for text-to-speech synthesis.

Best for: Applications requiring precise pronunciation control, custom phonemes, and word-level timing.

Prerequisites

Service	What You Need
Plivo	Auth ID, Auth Token, Voice-enabled phone number
Deepgram	API key from console.deepgram.com
OpenAI	API key from platform.openai.com
Rime	API key from rime.ai

Installation

pip install "pipecat-ai[deepgram,openai,rime]"

Environment Variables

# Plivo credentials
PLIVO_AUTH_ID=your_auth_id
PLIVO_AUTH_TOKEN=your_auth_token
PLIVO_PHONE_NUMBER=+1234567890

# AI service credentials
DEEPGRAM_API_KEY=your_deepgram_key
OPENAI_API_KEY=sk-your_openai_key
RIME_API_KEY=your_rime_key
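
Because a missing key usually only surfaces once a call is in progress, it can help to check the variables at startup. The following is a minimal sketch, not part of the example project; `REQUIRED_VARS` and `missing_env_vars` are hypothetical names, and the list simply mirrors the variables above.

```python
import os

# Names taken from the environment variable list above.
REQUIRED_VARS = (
    "PLIVO_AUTH_ID",
    "PLIVO_AUTH_TOKEN",
    "PLIVO_PHONE_NUMBER",
    "DEEPGRAM_API_KEY",
    "OPENAI_API_KEY",
    "RIME_API_KEY",
)


def missing_env_vars(env=None):
    """Return the required variable names that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]
```

Calling `missing_env_vars()` before building the pipeline and stopping when the returned list is non-empty gives a clear error message instead of an authentication failure during a call.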

Pipeline Configuration

import os

from pipecat.services.deepgram import DeepgramSTTService
from pipecat.services.openai import OpenAILLMService
from pipecat.services.rime import RimeTTSService

# Speech-to-Text
stt = DeepgramSTTService(
    api_key=os.getenv("DEEPGRAM_API_KEY"),
)

# Language Model
llm = OpenAILLMService(
    api_key=os.getenv("OPENAI_API_KEY"),
    model="gpt-4o",
)

# Text-to-Speech
tts = RimeTTSService(
    api_key=os.getenv("RIME_API_KEY"),
    # voice="your_voice_id",
)

Service Details

Deepgram STT

Real-time speech recognition with interim results and language detection.

Option	Description
`DeepgramSTTService`	Standard WebSocket transcription
`DeepgramFluxSTTService`	Enhanced turn detection for conversations

OpenAI LLM

Chat completions with GPT-4o, with support for streaming responses and function calling.

Model	Description
`gpt-4o`	Most capable, multimodal
`gpt-4o-mini`	Faster, cost-effective

Rime TTS

Real-time voice synthesis with word-level timing and precise pronunciation control.

Feature	Method
Spell out text	`SPELL("ABC")`
Insert pause	`PAUSE_TAG(0.5)`
Custom pronunciation	`PRONOUNCE(text, word, phoneme)`
Adjust speed inline	`INLINE_SPEED(text, 1.2)`

Service options:

`RimeTTSService` - WebSocket-based, real-time with word timestamps
`RimeHttpTTSService` - HTTP-based, simpler setup

Pronunciation Control

Rime excels at precise pronunciation control for names, technical terms, and branded content.

Custom Pronunciations

from pipecat.services.rime import PRONOUNCE

# Replace word with phoneme pronunciation
text = PRONOUNCE(
    "Welcome to Plivo",
    "Plivo",
    "plee-voh"
)
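
To illustrate the kind of transformation a pronunciation helper performs, here is a minimal sketch. It is not the library's implementation. It assumes the TTS engine reads brace-wrapped text as a phonetic spelling, and `replace_with_phonemes` is a hypothetical name.

```python
import re


def replace_with_phonemes(text: str, word: str, phoneme: str) -> str:
    """Replace whole-word occurrences of `word` with a brace-wrapped phoneme string.

    Assumption: the engine interprets `{...}` as a phonetic spelling.
    """
    pattern = rf"\b{re.escape(word)}\b"
    # A function replacement avoids backslash handling in the phoneme string.
    return re.sub(pattern, lambda _match: "{" + phoneme + "}", text)


marked = replace_with_phonemes("Welcome to Plivo", "Plivo", "plee-voh")
```

Matching on word boundaries keeps longer words that merely contain the target, such as a plural or a compound, from being rewritten by accident.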

Spelling Out Text

from pipecat.services.rime import SPELL

# Spell out acronyms or codes
text = SPELL("API")  # Says "A P I"
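
To make the intended effect concrete, the sketch below approximates the spoken result in plain Python. It does not reproduce Rime's markup or the `SPELL` helper's implementation; `spell_out` is a hypothetical name used only for illustration.

```python
def spell_out(code: str) -> str:
    """Separate the characters of `code` with spaces so each is read on its own."""
    return " ".join(ch for ch in code if not ch.isspace())


spoken = spell_out("API")
```

The same approach works for confirmation codes or order numbers that should be read character by character.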

Dynamic Speed Control

from pipecat.services.rime import INLINE_SPEED

# Speed up specific sections
text = INLINE_SPEED("Terms and conditions apply", 1.3)

Quick Start

Inbound Calls

git clone https://github.com/pipecat-ai/pipecat-examples.git
cd pipecat-examples/plivo-chatbot/inbound

# Configure environment
cp env.example .env
# Edit .env with your credentials

# Start server
uv sync && uv run server.py

# Expose with ngrok (development)
ngrok http 7860

Set your Plivo number’s Answer URL to your ngrok URL so that Plivo reaches the local server when a call comes in.
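
For orientation, the sketch below shows the kind of XML an Answer URL could return to hand the call audio to a WebSocket endpoint. It assumes the server replies with Plivo's `Stream` XML element. The `/ws` path, the host name, and the `build_answer_xml` function are placeholders for illustration, so check the example's `server.py` for the actual response.

```python
import xml.etree.ElementTree as ET


def build_answer_xml(public_host: str, ws_path: str = "/ws") -> str:
    """Build a Plivo XML response that streams call audio to a WebSocket URL.

    Assumptions: `public_host` is the ngrok host without a scheme, and the
    server accepts WebSocket connections at `ws_path`.
    """
    response = ET.Element("Response")
    stream = ET.SubElement(
        response,
        "Stream",
        {
            "bidirectional": "true",
            "keepCallAlive": "true",
            "contentType": "audio/x-mulaw;rate=8000",
        },
    )
    stream.text = f"wss://{public_host}{ws_path}"
    return ET.tostring(response, encoding="unicode")


answer_xml = build_answer_xml("example.ngrok.app")
```

Note that the stream URL uses `wss://` with the same public host that serves the Answer URL over HTTPS.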

Outbound Calls

cd pipecat-examples/plivo-chatbot/outbound

cp env.example .env
uv sync && uv run server.py

# Initiate a call
curl -X POST http://localhost:7860/start \
  -H "Content-Type: application/json" \
  -d '{"phone_number": "+1234567890"}'
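
The same request can be issued from Python when calls are triggered by application code. This is a minimal sketch using only the standard library; `build_start_request` is a hypothetical helper, and the endpoint and payload are taken from the curl command above.

```python
import json
import urllib.request


def build_start_request(phone_number: str, base_url: str = "http://localhost:7860"):
    """Build a POST request equivalent to the curl command above."""
    body = json.dumps({"phone_number": phone_number}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/start",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_start_request("+1234567890")

# To send it while the server is running:
# with urllib.request.urlopen(request) as response:
#     print(response.status, response.read().decode())
```

Building the request separately from sending it makes the payload easy to inspect or log before a call is placed.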

When to Use Rime

Choose Rime when:

You need precise control over pronunciation
Your content includes technical terms, names, or branded words
You want word-level timing for synchronized experiences
You need inline speed adjustments

Choose Cartesia or ElevenLabs when:

You need emotion/expression controls
You want voice cloning capabilities
You need broader multilingual support

Related

Pipecat Overview - Architecture and setup
Deepgram Docs - STT configuration
OpenAI Docs - LLM configuration
Rime Docs - TTS configuration
