Text-to-Speech Models

Generate natural-sounding speech from text with Assisters TTS, our advanced voice synthesis model.

Assisters TTS v1

Our state-of-the-art text-to-speech model with 300+ natural voices in 100+ languages.

Specification	Value
Model ID	`assisters-tts-v1`
Voices	300+
Languages	100+
Max Input	4,096 characters
Price	$0.01 / 1,000 characters
Latency	~100ms first audio
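
To make the pricing and input limit concrete, here is a minimal sketch that estimates cost and request count for a piece of text. It assumes billing is strictly linear in character count with no rounding or minimum charge, which the table does not confirm; the function names are illustrative and not part of any SDK.

```python
import math

PRICE_PER_1K_CHARS = 0.01  # USD, taken from the specification table
MAX_INPUT_CHARS = 4096     # per-request input limit from the table


def estimate_cost(text: str) -> float:
    """Estimate the USD cost of synthesizing `text`.

    Assumption: cost scales linearly with the number of characters.
    """
    return len(text) / 1000 * PRICE_PER_1K_CHARS


def requests_needed(text: str, max_chars: int = MAX_INPUT_CHARS) -> int:
    """Return how many requests are needed if each one carries at most `max_chars`."""
    return math.ceil(len(text) / max_chars)
```

Under these assumptions, a 50,000-character manuscript would cost about $0.50 and need at least 13 requests.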

Capabilities

Natural Voices: Human-like speech with proper intonation
Multilingual: 100+ languages with native accents
Voice Variety: 300+ unique voices (male, female, various ages)
Streaming: Real-time audio streaming
Multiple Formats: MP3, Opus, AAC, FLAC, WAV, and raw PCM output

Example Usage

from openai import OpenAI

client = OpenAI(
    base_url="https://api.assisters.dev/v1",
    api_key="your-api-key"
)

response = client.audio.speech.create(
    model="assisters-tts-v1",
    voice="alloy",
    input="Hello! Welcome to Assisters. I'm excited to help you build amazing applications."
)

# Save audio file
response.stream_to_file("output.mp3")

With Different Voices

# Female voice
response = client.audio.speech.create(
    model="assisters-tts-v1",
    voice="nova",
    input="This is a friendly female voice."
)

# Male voice
response = client.audio.speech.create(
    model="assisters-tts-v1",
    voice="onyx",
    input="This is a deep male voice."
)

Streaming Audio

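from openai import OpenAI
import pyaudio

client = OpenAI(
    base_url="https://api.assisters.dev/v1",
    api_key="your-api-key"
)

# Open an output stream for raw PCM playback.
# Assumption: PCM output is 24 kHz, 16-bit, mono; adjust if the actual format differs.
audio = pyaudio.PyAudio()
audio_player = audio.open(format=pyaudio.paInt16, channels=1, rate=24000, output=True)

# Stream audio in real time
with client.audio.speech.with_streaming_response.create(
    model="assisters-tts-v1",
    voice="alloy",
    input="This text is being converted to speech in real-time!",
    response_format="pcm"  # Raw samples can be played as they arrive
) as response:
    for chunk in response.iter_bytes(chunk_size=1024):
        # Play each audio chunk as soon as it is received
        audio_player.write(chunk)

# Release the audio device
audio_player.stop_stream()
audio_player.close()
audio.terminate()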

Available Voices

Voice	Description	Best For
`alloy`	Neutral, balanced	General purpose
`echo`	Warm, friendly	Customer service
`fable`	Expressive, storytelling	Audiobooks
`onyx`	Deep, authoritative	Professional content
`nova`	Bright, energetic	Marketing, tutorials
`shimmer`	Soft, gentle	Meditation, wellness

300+ additional voices are available. See the voice gallery for the complete list with audio samples.

Parameters

Parameter	Type	Default	Description
`input`	string	required	Text to convert (max 4096 chars)
`model`	string	required	Model ID (`assisters-tts-v1`)
`voice`	string	required	Voice ID to use
`response_format`	string	"mp3"	Audio format
`speed`	float	1.0	Speed multiplier (0.25-4.0)
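
As a rough illustration of the constraints in this table, the sketch below validates arguments locally before a request is sent. The helper name `build_speech_params` is hypothetical, and the set of accepted formats is taken from the Output Formats table; the service may enforce additional rules that are not checked here.

```python
VALID_FORMATS = {"mp3", "opus", "aac", "flac", "wav", "pcm"}
MAX_INPUT_CHARS = 4096


def build_speech_params(text, voice, response_format="mp3", speed=1.0):
    """Validate arguments against the documented limits and return request kwargs."""
    if not text:
        raise ValueError("input must not be empty")
    if len(text) > MAX_INPUT_CHARS:
        raise ValueError(f"input exceeds {MAX_INPUT_CHARS} characters")
    if response_format not in VALID_FORMATS:
        raise ValueError(f"unsupported response_format: {response_format!r}")
    if not 0.25 <= speed <= 4.0:
        raise ValueError("speed must be between 0.25 and 4.0")
    return {
        "model": "assisters-tts-v1",
        "voice": voice,
        "input": text,
        "response_format": response_format,
        "speed": speed,
    }
```

The returned dictionary can be unpacked into the call shown earlier, for example `client.audio.speech.create(**build_speech_params("Hello", "alloy"))`.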

Output Formats

Format	Quality	File Size
`mp3`	Good	Small
`opus`	Good	Smallest
`aac`	Good	Small
`flac`	Lossless	Large
`wav`	Lossless	Largest
`pcm`	Raw	Varies
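
Raw `pcm` output has no header, so most players cannot open it directly. The sketch below wraps raw samples in a WAV container using Python's standard `wave` module. The defaults (24 kHz, 16-bit, mono) are an assumption and should be replaced with whatever the service actually returns; `pcm_to_wav` is an illustrative helper name.

```python
import io
import wave


def pcm_to_wav(pcm_bytes, sample_rate=24000, channels=1, sample_width=2):
    """Wrap raw PCM samples in a WAV container and return the WAV bytes.

    Assumption: samples are signed 16-bit little-endian, mono, 24 kHz.
    """
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav_file:
        wav_file.setnchannels(channels)
        wav_file.setsampwidth(sample_width)  # bytes per sample
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(pcm_bytes)
    return buffer.getvalue()
```

For example, `open("output.wav", "wb").write(pcm_to_wav(response.content))` would save a playable file from a response requested with `response_format="pcm"`.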

Use Cases

Voice Assistants

Build voice-enabled applications:

def speak_response(text):
    response = client.audio.speech.create(
        model="assisters-tts-v1",
        voice="nova",
        input=text,
        response_format="mp3"
    )
    play_audio(response.content)  # play_audio is a placeholder for your own playback function

Audiobook Generation

Convert text content to audiobooks:

def create_audiobook_chapter(chapter_text, chapter_num):
    response = client.audio.speech.create(
        model="assisters-tts-v1",
        voice="fable",  # Expressive voice for storytelling
        input=chapter_text,
        speed=0.9  # Slightly slower for better comprehension
    )
    response.stream_to_file(f"chapter_{chapter_num}.mp3")

Accessibility

Add audio to text content for accessibility:

def add_audio_to_article(article):
    response = client.audio.speech.create(
        model="assisters-tts-v1",
        voice="alloy",
        input=article.text
    )
    return response.content

IVR Systems

Create interactive voice response systems:

def generate_ivr_prompt(message):
    response = client.audio.speech.create(
        model="assisters-tts-v1",
        voice="echo",  # Professional, friendly
        input=message,
        response_format="wav"  # High quality for phone systems
    )
    return response.content

Best Practices

Choose the Right Voice

Match the voice to your use case and audience

Chunk Long Text

Split text longer than 4,096 characters at paragraph boundaries and send each chunk as a separate request
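
One way to do this chunking is sketched below. It packs whole paragraphs into chunks that stay within the limit and falls back to splitting at a space when a single paragraph is too long. The function name and the blank-line paragraph convention are assumptions made for illustration.

```python
def chunk_text(text: str, max_chars: int = 4096) -> list[str]:
    """Split text into chunks of at most `max_chars`, preferring paragraph boundaries."""
    chunks = []
    current = ""
    for paragraph in text.split("\n\n"):
        paragraph = paragraph.strip()
        if not paragraph:
            continue
        # A single oversized paragraph is split at the last space within the limit.
        while len(paragraph) > max_chars:
            if current:
                chunks.append(current)
                current = ""
            cut = paragraph.rfind(" ", 0, max_chars)
            if cut <= 0:
                cut = max_chars  # no space found: hard split
            chunks.append(paragraph[:cut].strip())
            paragraph = paragraph[cut:].strip()
        # Add the paragraph to the current chunk if it still fits.
        candidate = f"{current}\n\n{paragraph}" if current else paragraph
        if len(candidate) <= max_chars:
            current = candidate
        else:
            chunks.append(current)
            current = paragraph
    if current:
        chunks.append(current)
    return chunks
```

Each chunk can then be sent as its own request and the resulting audio files played or concatenated in order.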

Add Pauses

Use punctuation to control pacing naturally

Stream for Real-time

Use streaming for interactive applications

Tips for Natural Speech

Use proper punctuation: Commas add brief pauses, periods add longer pauses
Spell out numbers: Write “one hundred twenty-three” instead of “123”
Use phonetic spelling: For unusual words or names
Adjust speed: Slower for complex content, faster for simple messages
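
The number-spelling tip can be automated in a preprocessing step. The following is a minimal sketch that handles only standalone English integers from 0 to 999 and leaves decimals, comma-grouped numbers, and identifiers such as `v1` untouched. The function names are illustrative, and whether the model actually needs this normalization is an assumption worth testing with your own text.

```python
import re

ONES = [
    "zero", "one", "two", "three", "four", "five", "six", "seven", "eight",
    "nine", "ten", "eleven", "twelve", "thirteen", "fourteen", "fifteen",
    "sixteen", "seventeen", "eighteen", "nineteen",
]
TENS = ["", "", "twenty", "thirty", "forty", "fifty",
        "sixty", "seventy", "eighty", "ninety"]


def number_to_words(n: int) -> str:
    """Spell out an integer from 0 to 999 in English."""
    if not 0 <= n <= 999:
        raise ValueError("only 0-999 is supported in this sketch")
    if n < 20:
        return ONES[n]
    if n < 100:
        tens, ones = divmod(n, 10)
        return TENS[tens] + (f"-{ONES[ones]}" if ones else "")
    hundreds, rest = divmod(n, 100)
    words = f"{ONES[hundreds]} hundred"
    return f"{words} {number_to_words(rest)}" if rest else words


def spell_out_numbers(text: str) -> str:
    """Replace standalone integers of up to three digits with words."""
    pattern = r"(?<![\d.,])\b\d{1,3}\b(?![.,]?\d)"
    return re.sub(pattern, lambda m: number_to_words(int(m.group())), text)
```

Calling `spell_out_numbers(text)` before passing the text as `input` keeps the rest of the request unchanged.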

Related Models

Assisters Whisper v1

Convert speech to text

Assisters Chat v1

Generate text content for speech
