
Text-to-Speech Models

Generate natural-sounding speech from text with Assisters TTS, our advanced voice synthesis model.

Assisters TTS v1

Our state-of-the-art text-to-speech model with 300+ natural voices in 100+ languages.
Specification    Value
Model ID         assisters-tts-v1
Voices           300+
Languages        100+
Max Input        4,096 characters
Price            $0.01 / 1,000 characters
Latency          ~100 ms to first audio
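
As a rough illustration of the pricing row, the sketch below estimates the cost of a request from its character count. It assumes billing is strictly linear at $0.01 per 1,000 characters, with no rounding or minimum charge, and that characters are counted the way Python's len() counts them. None of this is confirmed by the table, and estimate_cost is a hypothetical helper, not part of any SDK.

```python
from decimal import Decimal

# Price taken from the specification table: $0.01 per 1,000 characters.
PRICE_PER_1K_CHARS = Decimal("0.01")


def estimate_cost(text: str) -> Decimal:
    """Estimate the USD cost of synthesizing `text`.

    Assumption: billing is linear in len(text), with no rounding
    and no minimum charge.
    """
    return PRICE_PER_1K_CHARS * len(text) / 1000


# A maximum-length request (4,096 characters)
print(estimate_cost("x" * 4096))  # 0.04096
```

Decimal is used instead of float so that small per-request amounts add up without binary rounding noise.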

Capabilities

  • Natural Voices: Human-like speech with proper intonation
  • Multilingual: 100+ languages with native accents
  • Voice Variety: 300+ unique voices (male, female, various ages)
  • Streaming: Real-time audio streaming
  • Multiple Formats: MP3, Opus, AAC, FLAC, WAV, and PCM output

Example Usage

from openai import OpenAI

client = OpenAI(
    base_url="https://api.assisters.dev/v1",
    api_key="your-api-key"
)

response = client.audio.speech.create(
    model="assisters-tts-v1",
    voice="alloy",
    input="Hello! Welcome to Assisters. I'm excited to help you build amazing applications."
)

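# Save the audio to a file (MP3 is the default response_format).
# write_to_file replaces stream_to_file, which is deprecated for
# non-streaming responses in recent versions of the openai package.
response.write_to_file("output.mp3")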

With Different Voices

# Female voice
response = client.audio.speech.create(
    model="assisters-tts-v1",
    voice="nova",
    input="This is a friendly female voice."
)

# Male voice
response = client.audio.speech.create(
    model="assisters-tts-v1",
    voice="onyx",
    input="This is a deep male voice."
)

Streaming Audio

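from openai import OpenAI
import pyaudio

client = OpenAI(
    base_url="https://api.assisters.dev/v1",
    api_key="your-api-key"
)

# Open a playback stream for raw PCM audio.
# Assumption: "pcm" output is 24 kHz, 16-bit, mono, as in the OpenAI API;
# check the actual sample format before relying on these values.
audio = pyaudio.PyAudio()
audio_player = audio.open(format=pyaudio.paInt16, channels=1, rate=24000, output=True)

# Stream audio in real-time
with client.audio.speech.with_streaming_response.create(
    model="assisters-tts-v1",
    voice="alloy",
    input="This text is being converted to speech in real-time!",
    response_format="pcm"  # raw samples can be played as they arrive
) as response:
    for chunk in response.iter_bytes(chunk_size=1024):
        # Play or process audio chunks
        audio_player.write(chunk)

# Release the audio device
audio_player.stop_stream()
audio_player.close()
audio.terminate()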

Available Voices

Voice      Description                Best For
alloy      Neutral, balanced          General purpose
echo       Warm, friendly             Customer service
fable      Expressive, storytelling   Audiobooks
onyx       Deep, authoritative        Professional content
nova       Bright, energetic          Marketing, tutorials
shimmer    Soft, gentle               Meditation, wellness

These six are a sample of the 300+ voices available. See the voice gallery for the complete list with audio samples.

Parameters

Parameter          Type     Default     Description
input              string   required    Text to convert (max 4,096 characters)
model              string   required    Model ID (assisters-tts-v1)
voice              string   required    Voice ID to use
response_format    string   "mp3"       Audio format
speed              float    1.0         Speed multiplier (0.25-4.0)
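
The limits in the table can be checked on the client before a request is sent. The following is a minimal sketch: build_speech_request is a hypothetical helper, not part of the openai package, and it assumes the service enforces exactly the limits listed above and the formats listed under Output Formats.

```python
MODEL_ID = "assisters-tts-v1"
MAX_INPUT_CHARS = 4096                 # "Max Input" from the parameters table
SPEED_RANGE = (0.25, 4.0)              # allowed speed multipliers
FORMATS = {"mp3", "opus", "aac", "flac", "wav", "pcm"}


def build_speech_request(text, voice, response_format="mp3", speed=1.0):
    """Validate the parameters and return keyword arguments for
    client.audio.speech.create(). Hypothetical helper for illustration."""
    if not text:
        raise ValueError("input must not be empty")
    if len(text) > MAX_INPUT_CHARS:
        raise ValueError(f"input exceeds {MAX_INPUT_CHARS} characters")
    if response_format not in FORMATS:
        raise ValueError(f"unsupported response_format: {response_format!r}")
    if not SPEED_RANGE[0] <= speed <= SPEED_RANGE[1]:
        raise ValueError(f"speed must be between {SPEED_RANGE[0]} and {SPEED_RANGE[1]}")
    return {
        "model": MODEL_ID,
        "voice": voice,
        "input": text,
        "response_format": response_format,
        "speed": speed,
    }


request = build_speech_request("Hello!", voice="nova", speed=1.25)
# With a configured client: client.audio.speech.create(**request)
```

Failing early with a ValueError avoids paying for a network round trip on requests the service would likely reject.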

Output Formats

Format    Quality     File Size
mp3       Good        Small
opus      Good        Smallest
aac       Good        Small
flac      Lossless    Large
wav       Lossless    Largest
pcm       Raw         Varies
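
Raw pcm output has no header, so most players cannot open it directly. The sketch below wraps PCM bytes in a WAV container using Python's standard wave module. The sample format is an assumption (24 kHz, 16-bit, mono, as used by the OpenAI API); this page does not state it, so verify it before relying on the defaults. pcm_to_wav is a hypothetical helper name.

```python
import io
import wave


def pcm_to_wav(pcm: bytes, sample_rate: int = 24000,
               channels: int = 1, sample_width: int = 2) -> bytes:
    """Wrap raw PCM samples in a WAV container.

    Assumption: the defaults (24 kHz, mono, 2 bytes per sample)
    match the service's pcm output.
    """
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav_file:
        wav_file.setnchannels(channels)
        wav_file.setsampwidth(sample_width)
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(pcm)
    return buffer.getvalue()


# One second of silence stands in for audio returned with response_format="pcm"
wav_bytes = pcm_to_wav(b"\x00\x00" * 24000)
```

If the wrong sample rate is assumed, the audio still plays but sounds too fast or too slow, which is a quick way to spot a mismatch.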

Use Cases

The snippets below reuse the client created in Example Usage.

Build voice-enabled applications:

def speak_response(text):
    response = client.audio.speech.create(
        model="assisters-tts-v1",
        voice="nova",
        input=text,
        response_format="mp3"
    )
    # play_audio is a placeholder for your own playback function
    play_audio(response.content)

Convert text content to audiobooks:

def create_audiobook_chapter(chapter_text, chapter_num):
    # chapter_text must fit within the 4,096-character input limit;
    # split longer chapters into several requests first
    response = client.audio.speech.create(
        model="assisters-tts-v1",
        voice="fable",  # Expressive voice for storytelling
        input=chapter_text,
        speed=0.9  # Slightly slower for better comprehension
    )
    response.write_to_file(f"chapter_{chapter_num}.mp3")

Add audio to text content for accessibility:

def add_audio_to_article(article):
    response = client.audio.speech.create(
        model="assisters-tts-v1",
        voice="alloy",
        input=article.text
    )
    return response.content  # MP3 bytes (the default format)

Create interactive voice response systems:

def generate_ivr_prompt(message):
    response = client.audio.speech.create(
        model="assisters-tts-v1",
        voice="echo",  # Warm, friendly
        input=message,
        response_format="wav"  # Lossless output for phone systems
    )
    return response.content

Best Practices

  • Choose the Right Voice: Match the voice to your use case and audience
  • Chunk Long Text: Split text longer than 4,096 characters into several requests, breaking at paragraph or sentence boundaries
  • Add Pauses: Use punctuation to control pacing naturally
  • Stream for Real-time: Use streaming for interactive applications
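
The advice on chunking long text can be sketched as follows. This is one possible approach, not an official utility: chunk_text is a hypothetical helper that prefers paragraph breaks, falls back to sentence ends for oversized paragraphs, and hard-cuts only when a single sentence exceeds the limit. The sentence splitter is a simple regular expression and will mis-split abbreviations such as "Dr.".

```python
import re

MAX_CHARS = 4096  # input limit from the specification


def chunk_text(text: str, limit: int = MAX_CHARS) -> list[str]:
    """Split `text` into chunks of at most `limit` characters."""
    # Step 1: break the text into pieces that each fit within the limit.
    # Each piece carries the separator to use when it is appended to a chunk.
    pieces: list[tuple[str, str]] = []
    for paragraph in re.split(r"\n\s*\n", text.strip()):
        if len(paragraph) <= limit:
            pieces.append(("\n\n", paragraph))
            continue
        separator = "\n\n"
        for sentence in re.split(r"(?<=[.!?])\s+", paragraph):
            # Hard-cut a sentence only if it alone exceeds the limit.
            for start in range(0, len(sentence), limit):
                pieces.append((separator, sentence[start:start + limit]))
                separator = " "

    # Step 2: greedily pack consecutive pieces into chunks.
    chunks: list[str] = []
    current = ""
    for separator, piece in pieces:
        if not piece:
            continue
        candidate = f"{current}{separator}{piece}" if current else piece
        if len(candidate) <= limit:
            current = candidate
        else:
            chunks.append(current)
            current = piece
    if current:
        chunks.append(current)
    return chunks
```

Each chunk can then be sent as its own request and the resulting audio files played or concatenated in order.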

Tips for Natural Speech

  1. Use proper punctuation: Commas add brief pauses, periods add longer pauses
  2. Spell out numbers: Write “one hundred twenty-three” instead of “123”
  3. Use phonetic spelling: Respell unusual words or names the way they should sound
  4. Adjust speed: Slower for complex content, faster for simple messages
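
Tip 2 can be automated for simple cases. The sketch below spells out standalone integers from 0 to 999 in English before the text is sent for synthesis. Both function names are hypothetical, and the scope is deliberately narrow: larger numbers, decimals, numbers with thousands separators, and digits attached to letters (such as "v1") are left untouched. Whether the model already reads digits well is not stated here, so treat this as optional preprocessing.

```python
import re

ONES = ["zero", "one", "two", "three", "four", "five", "six", "seven",
        "eight", "nine", "ten", "eleven", "twelve", "thirteen", "fourteen",
        "fifteen", "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = ["", "", "twenty", "thirty", "forty", "fifty",
        "sixty", "seventy", "eighty", "ninety"]


def spell_number(n: int) -> str:
    """Spell out an integer from 0 to 999 in English."""
    if not 0 <= n <= 999:
        raise ValueError("only 0-999 is supported in this sketch")
    if n < 20:
        return ONES[n]
    if n < 100:
        tens, ones = divmod(n, 10)
        return TENS[tens] + (f"-{ONES[ones]}" if ones else "")
    hundreds, rest = divmod(n, 100)
    return f"{ONES[hundreds]} hundred" + (f" {spell_number(rest)}" if rest else "")


# Match 1-3 digit integers that are not part of a longer number or word.
STANDALONE_INT = re.compile(r"(?<![\w.,])\d{1,3}(?!\w|[.,]\d)")


def spell_out_numbers(text: str) -> str:
    """Replace standalone integers (0-999) with their spelled-out form."""
    return STANDALONE_INT.sub(lambda m: spell_number(int(m.group())), text)


print(spell_out_numbers("Gate 42 opens in 5 minutes."))
# Gate forty-two opens in five minutes.
```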