LiveKit Agent with Custom TTS

Repository: https://github.com/nineninesix-ai/livekit-agent
Web embed example: https://github.com/nineninesix-ai/voice-agent-web-embed

A real-time voice AI assistant built with the LiveKit Agents framework, combining speech-to-text, a language model, and text-to-speech.

Features

Speech-to-Text: Deepgram STT with Flux General EN model
Language Model: OpenAI GPT-4o-mini for natural conversation
Text-to-Speech: OpenAI-compatible server using KaniTTS (https://github.com/nineninesix-ai/kanitts-vllm)
Voice Activity Detection: Silero VAD for accurate speech detection
Turn Detection: Multilingual turn detection for natural conversations
Noise Cancellation: Background voice cancellation (BVC) for clear audio

Prerequisites

Python ≥ 3.10
uv package manager
LiveKit Cloud account (sign up for free)
API keys for:
- LiveKit (API key, secret, and URL)
- Deepgram (for STT)
- OpenAI (for LLM)

Installation

1. Install LiveKit CLI

macOS:

brew install livekit-cli

Linux:

curl -sSL https://get.livekit.io/cli | bash

Windows:

winget install LiveKit.LiveKitCLI

Then authenticate with LiveKit Cloud:

lk cloud auth

2. Clone and Setup

git clone git@github.com:nineninesix-ai/livekit-agent.git
cd livekit-agent

3. Install Dependencies

uv sync

This will install all required dependencies from pyproject.toml:

livekit-agents with Deepgram, OpenAI, Silero, and turn-detector plugins
livekit-plugins-noise-cancellation for audio processing
python-dotenv for environment variable management

4. Configure Environment Variables

Create a .env.local file in the project root with the following variables:

LIVEKIT_API_KEY=your_livekit_api_key
LIVEKIT_API_SECRET=your_livekit_api_secret
LIVEKIT_URL=wss://your-project.livekit.cloud
DEEPGRAM_API_KEY=your_deepgram_api_key
OPENAI_API_KEY=your_openai_api_key
KANI_BASE_URL=http://localhost:8000/v1

Quick setup with LiveKit CLI:

lk app env -w

This automatically generates LiveKit credentials in .env.local.
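Before starting the agent, it can help to confirm that every variable listed above has a value. The following is a minimal sketch and not part of the project: the helper name missing_vars and the REQUIRED_VARS tuple are hypothetical. It assumes .env.local has already been loaded into the process environment (for example with python-dotenv), and it treats KANI_BASE_URL as optional because it has a default.

```python
from typing import Mapping

# Variables listed in this README. KANI_BASE_URL is omitted on purpose,
# because the README documents a default for it.
REQUIRED_VARS = (
    "LIVEKIT_API_KEY",
    "LIVEKIT_API_SECRET",
    "LIVEKIT_URL",
    "DEEPGRAM_API_KEY",
    "OPENAI_API_KEY",
)


def missing_vars(env: Mapping[str, str]) -> list[str]:
    """Return the required variable names that are unset or blank in env."""
    return [name for name in REQUIRED_VARS if not env.get(name, "").strip()]
```

Calling missing_vars(os.environ) returns an empty list when everything is configured, and otherwise the names that still need a value.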

5. Download Model Files

Download required model files (VAD, turn detection, etc.):

uv run agent.py download-files

Usage

Development Mode

Run the agent in development mode (connects to LiveKit Cloud):

uv run agent.py dev

This starts the agent and connects it to your LiveKit server. You can interact with it via:

LiveKit Agents Playground
Your own web/mobile frontend
Telephony integration

Console Mode

Test the agent locally in your terminal without a LiveKit connection:

uv run agent.py console

Production Mode

Run the agent in production:

uv run agent.py start

Configuration

Custom TTS Server

This project uses a custom TTS server instead of the default OpenAI TTS. The TTS server URL is configured via the KANI_BASE_URL environment variable, which defaults to http://localhost:8000/v1 if not set.

To use the local TTS server:

Set KANI_BASE_URL in your .env.local file (or leave unset to use the default)
Ensure your TTS server is running at the specified URL
The server must be OpenAI-compatible, meaning it accepts the same request format as the OpenAI API

Check our FastAPI implementation here: https://github.com/nineninesix-ai/kanitts-vllm
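The fallback behavior described above can be sketched as follows. This is an illustration, not the project's actual code: the function name resolve_tts_base_url is hypothetical, and treating an empty value as unset and trimming a trailing slash are assumptions of this sketch. The resulting string is the kind of value you would pass as the base URL of an OpenAI-compatible TTS client.

```python
import os
from typing import Mapping

# Default documented in this README.
DEFAULT_KANI_BASE_URL = "http://localhost:8000/v1"


def resolve_tts_base_url(env: Mapping[str, str] = os.environ) -> str:
    """Return KANI_BASE_URL if it is set and non-empty, else the default."""
    value = env.get("KANI_BASE_URL", "").strip()
    # Drop a trailing slash so endpoint paths can be appended consistently.
    return (value or DEFAULT_KANI_BASE_URL).rstrip("/")
```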

Audio Models Configuration

Current configuration:

STT: Deepgram Flux General EN with an eager end-of-turn threshold of 0.4
LLM: GPT-4o-mini
VAD: Silero VAD
Turn Detection: Multilingual model
Noise Cancellation: BVC (recommended for telephony)

Telephony Applications

For telephony use cases, the agent uses BVC (Background Voice Cancellation) noise cancellation. For even better results with telephone audio, consider using BVCTelephony:

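from livekit.agents import RoomInputOptions
from livekit.plugins import noise_cancellation
# Pass this as the room_input_options argument when starting the session
room_input_options=RoomInputOptions(
    noise_cancellation=noise_cancellation.BVCTelephony(),
)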

Deployment

Deploy to LiveKit Cloud

To deploy your agent to LiveKit Cloud for production use:

lk agent create

This command will:

Automatically generate Dockerfile, .dockerignore, and livekit.toml configuration files
Register your agent with your LiveKit Cloud project
Deploy the containerized agent to the cloud

Prerequisites for deployment:

LiveKit CLI authenticated with your cloud account (lk cloud auth)
All environment variables configured in your LiveKit Cloud project settings

After deployment, your agent will be available through:

LiveKit Agents Playground
Your custom web/mobile applications. Check our web embed app example here: https://github.com/nineninesix-ai/voice-agent-web-embed
Telephony integrations

Note: For self-hosted production environments, refer to the LiveKit documentation for custom deployment configurations.

Troubleshooting

Agent won't start:

Verify all environment variables are set correctly in .env.local
Ensure model files are downloaded: uv run agent.py download-files
Check that your API keys are valid

No audio in/out:

Verify your microphone/speaker permissions
Check LiveKit room configuration
Ensure noise cancellation is properly configured

TTS not working:

Ensure your TTS server is running at the URL set in KANI_BASE_URL (http://localhost:8000/v1 by default)
Check server logs for errors
Verify the server implements an OpenAI-compatible API
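One way to run the checks above is to send the server a single OpenAI-style speech request outside the agent. The sketch below uses only the standard library and assumes the server exposes the OpenAI /audio/speech route under its base URL. The function names are hypothetical, and the model and voice values are placeholders: replace them with whatever your TTS server expects.

```python
import json
import urllib.request


def build_speech_request(
    base_url: str = "http://localhost:8000/v1",
    text: str = "Hello from the TTS check.",
    model: str = "tts-1",  # placeholder; use the model name your server expects
    voice: str = "alloy",  # placeholder; use a voice your server provides
) -> urllib.request.Request:
    """Build an OpenAI-style POST request to {base_url}/audio/speech."""
    payload = {"model": model, "input": text, "voice": voice}
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/audio/speech",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def probe_tts(request: urllib.request.Request, timeout: float = 10.0) -> int:
    """Send the request and return the number of audio bytes received."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return len(response.read())
```

Calling probe_tts(build_speech_request()) raises urllib.error.URLError when nothing is listening at the URL and urllib.error.HTTPError when the server rejects the request, which helps separate a server that is down from one that is not API-compatible.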

License

Apache License 2.0