Text-to-Speech Models
Generate natural-sounding speech from text with Assisters TTS, our advanced voice synthesis model.
Assisters TTS v1
Our state-of-the-art text-to-speech model with 300+ natural voices in 100+ languages.
| Specification | Value |
|---|---|
| Model ID | assisters-tts-v1 |
| Voices | 300+ |
| Languages | 100+ |
| Max Input | 4,096 characters |
| Price | $0.01 / 1,000 characters |
| Latency | ~100 ms to first audio |
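
As a quick check on the pricing row, the cost of a request can be estimated from its character count. This is a minimal sketch that assumes billing is strictly proportional to input length at $0.01 per 1,000 characters, with no minimum charge or rounding; `estimate_cost` is an illustrative name, not part of any SDK.

```python
from decimal import Decimal

# Price taken from the specification table: $0.01 per 1,000 characters.
PRICE_PER_1K_CHARS = Decimal("0.01")


def estimate_cost(text: str) -> Decimal:
    """Estimate the synthesis cost in USD, assuming purely linear billing."""
    return PRICE_PER_1K_CHARS * len(text) / 1000


# A maximum-length request of 4,096 characters:
print(estimate_cost("a" * 4096))  # 0.04096
```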
Capabilities
- Natural Voices: Human-like speech with proper intonation
- Multilingual: 100+ languages with native accents
- Voice Variety: 300+ unique voices (male, female, various ages)
- Streaming: Real-time audio streaming
- Multiple Formats: MP3, Opus, AAC, FLAC, WAV, and PCM output
Example Usage
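
A minimal sketch of a basic request, using only the parameters documented on this page. The endpoint URL, the bearer-token header, and the idea that the response body is the raw audio are placeholders and assumptions, not confirmed details of the Assisters API. `build_speech_request` and `synthesize_to_file` are illustrative names.

```python
import json
import urllib.request

# ASSUMPTION: placeholder endpoint and bearer-token auth. Replace both with
# the values given in your Assisters account documentation.
API_URL = "https://api.example.com/v1/audio/speech"


def build_speech_request(api_key: str, text: str, voice: str = "alloy",
                         response_format: str = "mp3") -> urllib.request.Request:
    """Build (but do not send) a POST request with the documented parameters."""
    payload = {
        "model": "assisters-tts-v1",
        "input": text,
        "voice": voice,
        "response_format": response_format,
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def synthesize_to_file(api_key: str, text: str, path: str) -> None:
    """Send the request and save the response body (assumed to be audio) to path."""
    request = build_speech_request(api_key, text)
    with urllib.request.urlopen(request) as response, open(path, "wb") as out:
        out.write(response.read())
```

Keeping request construction separate from sending makes the payload easy to inspect before any characters are billed.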
With Different Voices
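
To compare voices, the same text can be submitted once per voice. The sketch below only prepares the request bodies and suggested output filenames; how they are sent depends on your client. The voice IDs come from the Available Voices table, while `payloads_for_voices` and the filename pattern are hypothetical.

```python
# Voice IDs taken from the "Available Voices" table on this page.
SAMPLE_VOICES = ["alloy", "echo", "fable", "onyx", "nova", "shimmer"]


def payloads_for_voices(text: str, voices: list[str]) -> dict[str, dict]:
    """Return one request body per voice, keyed by a suggested output filename."""
    return {
        f"sample-{voice}.mp3": {
            "model": "assisters-tts-v1",
            "input": text,
            "voice": voice,
            "response_format": "mp3",
        }
        for voice in voices
    }


for filename, body in payloads_for_voices("Welcome to our service.", SAMPLE_VOICES).items():
    print(filename, "->", body["voice"])
```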
Streaming Audio
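
With streaming, audio can be played or saved as each chunk arrives instead of after the whole response is downloaded. The helper below is a sketch of the consuming side only: it accepts any iterable of byte chunks and writes them out immediately. How the chunks are obtained is an assumption that depends on your HTTP client and on how the API delivers streamed audio; `write_audio_stream` is an illustrative name.

```python
from typing import BinaryIO, Iterable


def write_audio_stream(chunks: Iterable[bytes], sink: BinaryIO) -> int:
    """Write audio chunks to sink as they arrive and return the byte count."""
    total = 0
    for chunk in chunks:
        if not chunk:      # ignore empty keep-alive chunks
            continue
        sink.write(chunk)
        sink.flush()       # pass data on without waiting for the full response
        total += len(chunk)
    return total
```

With the `requests` library, for example, the chunks could come from `response.iter_content(chunk_size=4096)` on a request made with `stream=True`, provided the API returns a chunked response.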
Available Voices
| Voice | Description | Best For |
|---|---|---|
| alloy | Neutral, balanced | General purpose |
| echo | Warm, friendly | Customer service |
| fable | Expressive, storytelling | Audiobooks |
| onyx | Deep, authoritative | Professional content |
| nova | Bright, energetic | Marketing, tutorials |
| shimmer | Soft, gentle | Meditation, wellness |
300+ additional voices are available. See the voice gallery for the complete list with audio samples.
Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| input | string | required | Text to convert (max 4,096 characters) |
| model | string | required | Model ID (assisters-tts-v1) |
| voice | string | required | Voice ID to use |
| response_format | string | "mp3" | Audio format |
| speed | float | 1.0 | Speed multiplier (0.25 to 4.0) |
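
The limits in the table can be checked on the client before a request is sent, which avoids a round trip for input that would be rejected. This is a sketch under two assumptions: that the 4,096-character limit matches Python's `len()` count, and that the accepted formats are exactly those in the Output Formats table. `build_payload` is an illustrative helper, not an SDK function.

```python
MAX_INPUT_CHARS = 4096
MIN_SPEED, MAX_SPEED = 0.25, 4.0
# Formats listed in the "Output Formats" table on this page.
OUTPUT_FORMATS = {"mp3", "opus", "aac", "flac", "wav", "pcm"}


def build_payload(text: str, voice: str, model: str = "assisters-tts-v1",
                  response_format: str = "mp3", speed: float = 1.0) -> dict:
    """Validate arguments against the documented limits and return a request body."""
    if not text:
        raise ValueError("input is required")
    if len(text) > MAX_INPUT_CHARS:
        raise ValueError(f"input exceeds {MAX_INPUT_CHARS} characters")
    if not voice:
        raise ValueError("voice is required")
    if response_format not in OUTPUT_FORMATS:
        raise ValueError(f"unsupported response_format: {response_format!r}")
    if not MIN_SPEED <= speed <= MAX_SPEED:
        raise ValueError(f"speed must be between {MIN_SPEED} and {MAX_SPEED}")
    return {
        "input": text,
        "model": model,
        "voice": voice,
        "response_format": response_format,
        "speed": speed,
    }
```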
Output Formats
| Format | Quality | File Size |
|---|---|---|
| mp3 | Good | Small |
| opus | Good | Smallest |
| aac | Good | Small |
| flac | Lossless | Large |
| wav | Lossless | Largest |
| pcm | Raw | Varies |
Use Cases
Voice Assistants
Build voice-enabled applications.
Audiobook Generation
Convert text content to audiobooks.
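
Because each request is limited to 4,096 characters, a chapter is usually synthesized as several audio segments that then need to be joined. The sketch below concatenates WAV segments with the standard-library `wave` module. It assumes every segment was requested as `wav` and shares the same channel count, sample width, and sample rate, which is plausible for a single voice but is not confirmed by this page. `join_wav_segments` is an illustrative name.

```python
import io
import wave


def join_wav_segments(segments: list[bytes]) -> bytes:
    """Concatenate WAV files that share channels, sample width, and sample rate."""
    if not segments:
        raise ValueError("no segments to join")
    with wave.open(io.BytesIO(segments[0]), "rb") as first:
        params = first.getparams()
    output = io.BytesIO()
    with wave.open(output, "wb") as writer:
        writer.setparams(params)
        for segment in segments:
            with wave.open(io.BytesIO(segment), "rb") as reader:
                # Compare (nchannels, sampwidth, framerate) with the first segment.
                if reader.getparams()[:3] != params[:3]:
                    raise ValueError("segments have mismatched audio parameters")
                writer.writeframes(reader.readframes(reader.getnframes()))
    return output.getvalue()
```

Compressed formats such as MP3 generally need a dedicated audio tool to join cleanly, so this approach is limited to WAV.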
Accessibility
Add audio to text content for accessibility.
IVR Systems
Create interactive voice response systems.
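
IVR menus typically reuse a fixed set of prompts, so one reasonable approach is to render each prompt once and regenerate it only when its text or voice changes. The sketch below derives a filename from a hash of the voice and text and reports which prompts still need synthesis. The prompt texts, the choice of the `onyx` voice, and the helper names are all hypothetical.

```python
import hashlib

# Hypothetical IVR menu: prompt name -> text to speak.
PROMPTS = {
    "welcome": "Thank you for calling. Please listen to the following options.",
    "menu": "For sales, press one. For support, press two.",
    "goodbye": "Thank you for calling. Goodbye.",
}


def prompt_filename(name: str, text: str, voice: str, fmt: str = "wav") -> str:
    """Derive a filename that changes whenever the prompt text or voice changes."""
    digest = hashlib.sha256(f"{voice}\n{text}".encode("utf-8")).hexdigest()[:12]
    return f"{name}-{voice}-{digest}.{fmt}"


def prompts_to_render(existing_files: set[str], voice: str = "onyx") -> dict[str, str]:
    """Return {filename: text} for prompts without an up-to-date audio file."""
    pending = {}
    for name, text in PROMPTS.items():
        filename = prompt_filename(name, text, voice)
        if filename not in existing_files:
            pending[filename] = text
    return pending
```

Each entry returned by `prompts_to_render` can then be sent as a separate synthesis request and saved under its filename.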
Best Practices
Choose the Right Voice
Match the voice to your use case and audience
Chunk Long Text
Split text longer than 4,096 characters into smaller chunks at paragraph boundaries and send each chunk as a separate request
Add Pauses
Use punctuation to control pacing naturally
Stream for Real-time
Use streaming for interactive applications
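
The "Chunk Long Text" practice above can be sketched as follows. The function packs whole paragraphs into chunks of at most 4,096 characters and falls back to a hard split only when a single paragraph exceeds the limit. It assumes paragraphs are separated by blank lines and that the limit matches Python's `len()` count; `chunk_text` is an illustrative name.

```python
MAX_CHARS = 4096


def chunk_text(text: str, limit: int = MAX_CHARS) -> list[str]:
    """Split text into chunks of at most limit characters, preferring paragraph breaks."""
    chunks: list[str] = []
    current = ""
    for paragraph in (p.strip() for p in text.split("\n\n")):
        if not paragraph:
            continue
        # Crude fallback: a paragraph longer than the limit is cut at the limit.
        while len(paragraph) > limit:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(paragraph[:limit])
            paragraph = paragraph[limit:]
        candidate = f"{current}\n\n{paragraph}" if current else paragraph
        if len(candidate) <= limit:
            current = candidate      # paragraph still fits in the open chunk
        else:
            chunks.append(current)   # close the open chunk and start a new one
            current = paragraph
    if current:
        chunks.append(current)
    return chunks
```

The hard split can land mid-word, so splitting oversized paragraphs at sentence boundaries first would give more natural audio.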
Tips for Natural Speech
- Use proper punctuation: Commas add brief pauses, periods add longer pauses
- Spell out numbers: “One hundred twenty-three” vs “123”
- Use phonetic spelling: For unusual words or names
- Adjust speed: Slower for complex content, faster for simple messages
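
The "spell out numbers" tip can be automated as a preprocessing step. This sketch handles standalone English integers from 0 to 999 and deliberately leaves anything else (such as `1,000`, `3.5`, or `1234`) untouched. It is an illustration of the idea, not a full text normalizer, and both function names are hypothetical.

```python
import re

ONES = ["zero", "one", "two", "three", "four", "five", "six", "seven", "eight",
        "nine", "ten", "eleven", "twelve", "thirteen", "fourteen", "fifteen",
        "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = ["", "", "twenty", "thirty", "forty", "fifty", "sixty", "seventy",
        "eighty", "ninety"]

# One to three digits that are not part of a longer, grouped, or decimal number.
STANDALONE_INT = re.compile(r"(?<![\d.,])\b\d{1,3}\b(?!\d|[.,]\d)")


def number_to_words(n: int) -> str:
    """Spell out an integer from 0 to 999 in English."""
    if not 0 <= n <= 999:
        raise ValueError("only 0 to 999 is supported in this sketch")
    if n < 20:
        return ONES[n]
    if n < 100:
        tens, ones = divmod(n, 10)
        return TENS[tens] + (f"-{ONES[ones]}" if ones else "")
    hundreds, rest = divmod(n, 100)
    words = f"{ONES[hundreds]} hundred"
    return f"{words} {number_to_words(rest)}" if rest else words


def spell_out_numbers(text: str) -> str:
    """Replace standalone integers of up to three digits with words."""
    return STANDALONE_INT.sub(lambda m: number_to_words(int(m.group())), text)


print(spell_out_numbers("Your order 123 ships in 2 days."))
# Your order one hundred twenty-three ships in two days.
```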