Speech-to-Text Models
Transcribe audio files with high accuracy using Assisters Whisper, our advanced speech recognition model.
Assisters Whisper v1
Our state-of-the-art speech recognition model that transcribes audio in 100+ languages with exceptional accuracy.

| Specification | Value |
|---|---|
| Model ID | assisters-whisper-v1 |
| Languages | 100+ |
| Max Audio Length | 25 minutes |
| Price | $0.01 / minute |
| Latency | ~1x real-time |
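The price and length limits in the table lend themselves to a quick pre-flight check. The sketch below is only an illustration: it assumes cost scales pro rata with duration (the table does not say whether billing rounds up to whole minutes), and the function names are hypothetical.

```python
PRICE_PER_MINUTE_USD = 0.01  # "Price" row in the specification table
MAX_AUDIO_MINUTES = 25       # "Max Audio Length" row


def estimate_cost_usd(duration_seconds: float) -> float:
    """Pro-rata estimate; actual billing may round differently."""
    if duration_seconds < 0:
        raise ValueError("duration_seconds must be non-negative")
    minutes = duration_seconds / 60
    return round(minutes * PRICE_PER_MINUTE_USD, 4)


def fits_single_request(duration_seconds: float) -> bool:
    """True when the audio is within the documented 25-minute limit."""
    return duration_seconds <= MAX_AUDIO_MINUTES * 60
```

Under these assumptions, a 10-minute recording would be estimated at about $0.10 and fits in a single request.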
Capabilities
- Multilingual: Transcribe 100+ languages automatically
- High Accuracy: State-of-the-art word error rate
- Speaker Diarization: Identify different speakers (coming soon)
- Timestamps: Word and segment-level timestamps
- Translation: Translate audio to English
Supported Formats
MP3, MP4, M4A, WAV, WEBM, FLAC, OGG, and more.
Example Usage
With Timestamps
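A minimal sketch of the request fields for a timestamped transcription. It assumes timestamps are returned with the `verbose_json` format and that `timestamp_granularities` accepts the values `"word"` and `"segment"`; both are inferred from the capability list and parameter table, not confirmed by this page. The endpoint, authentication, and file upload are not documented here, so they are left out, and the helper name is hypothetical.

```python
def timestamp_request_fields(granularities=("segment",)):
    """Build the non-file fields for a transcription with timestamps."""
    allowed = {"word", "segment"}  # assumed values, see the note above
    unknown = set(granularities) - allowed
    if unknown:
        raise ValueError(f"unsupported granularities: {sorted(unknown)}")
    return {
        "model": "assisters-whisper-v1",
        "response_format": "verbose_json",
        "timestamp_granularities": list(granularities),
    }


fields = timestamp_request_fields(("word", "segment"))
```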
Translation to English
With Language Hint
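A small sketch of passing a language hint through the `language` parameter. ISO 639-1 codes are two lowercase letters, so the helper below (a hypothetical name) checks only that shape; it does not confirm that the code is assigned or that the model supports it. Sending the request is omitted because the endpoint is not shown on this page.

```python
import re

_ISO_639_1_SHAPE = re.compile(r"[a-z]{2}")


def language_hint_fields(language: str) -> dict:
    """Normalize a language code and return the fields that carry the hint."""
    code = language.strip().lower()
    if not _ISO_639_1_SHAPE.fullmatch(code):
        raise ValueError(f"expected a two-letter ISO 639-1 code, got {language!r}")
    return {"model": "assisters-whisper-v1", "language": code}
```

For example, `language_hint_fields("es")` would mark the audio as Spanish, while a three-letter code such as `"spa"` is rejected before any request is made.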
Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| file | file | required | Audio file to transcribe |
| model | string | required | Model ID (assisters-whisper-v1) |
| language | string | auto | ISO-639-1 language code |
| prompt | string | null | Guide the model's style |
| response_format | string | "json" | Output format |
| temperature | float | 0 | Sampling temperature |
| timestamp_granularities | array | null | Timestamp detail level |
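One way to work with the parameter table is to validate names locally and send only the values that differ from the documented defaults, leaving the rest to the server. This is a sketch under that design assumption; `build_fields` is a hypothetical helper, and `file` stands in for whatever file handle or path your HTTP client expects.

```python
# Optional parameters and their documented defaults.
DEFAULTS = {
    "language": "auto",
    "prompt": None,
    "response_format": "json",
    "temperature": 0,
    "timestamp_granularities": None,
}
REQUIRED = ("file", "model")


def build_fields(**params):
    """Reject unknown or missing parameters and drop values equal to defaults."""
    unknown = set(params) - set(DEFAULTS) - set(REQUIRED)
    if unknown:
        raise TypeError(f"unknown parameters: {sorted(unknown)}")
    missing = [name for name in REQUIRED if name not in params]
    if missing:
        raise ValueError(f"missing required parameters: {missing}")
    return {
        name: value
        for name, value in params.items()
        if name in REQUIRED or value != DEFAULTS[name]
    }
```

A misspelled parameter such as `langauge` then fails fast on the client instead of being silently ignored.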
Response Formats
| Format | Description |
|---|---|
| json | Simple JSON with text |
| text | Plain text only |
| srt | SubRip subtitle format |
| verbose_json | Detailed JSON with timestamps |
| vtt | WebVTT subtitle format |
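When saving results, the chosen format usually decides the output file's extension. The mapping below is a sketch using conventional extensions for each format in the table; the helper name is hypothetical.

```python
from pathlib import Path

# Conventional file extensions for each documented response format.
EXTENSIONS = {
    "json": ".json",
    "text": ".txt",
    "srt": ".srt",
    "verbose_json": ".json",
    "vtt": ".vtt",
}


def transcript_path(audio_path, response_format):
    """Return where to save a transcript, next to the source audio file."""
    try:
        suffix = EXTENSIONS[response_format]
    except KeyError:
        raise ValueError(f"unknown response_format: {response_format!r}") from None
    return Path(audio_path).with_suffix(suffix)
```

For instance, `transcript_path("talks/intro.mp3", "srt")` points at `talks/intro.srt`.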
Use Cases
Meeting Transcription
Transcribe meetings and calls.
Subtitle Generation
Create subtitles for videos.
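Requesting the `srt` or `vtt` response format is the direct route to subtitles. If you instead hold timestamped segments, for example after merging several chunked transcriptions, SubRip text can be assembled locally. The sketch below assumes each segment is a `(start_seconds, end_seconds, text)` tuple, which is an illustrative shape and not the API's response schema.

```python
def srt_timestamp(seconds: float) -> str:
    """Format seconds as the HH:MM:SS,mmm timestamp used by SubRip."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, millis = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def segments_to_srt(segments) -> str:
    """Render (start, end, text) tuples as numbered SubRip cues."""
    cues = []
    for index, (start, end, text) in enumerate(segments, start=1):
        cues.append(
            f"{index}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text.strip()}\n"
        )
    # Cues are separated by a blank line.
    return "\n".join(cues)
```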
Podcast Processing
Transcribe podcasts for search and accessibility.
Voice Notes
Convert voice memos to text.
Supported Languages
Assisters Whisper v1 supports 100+ languages, including:
- Major Languages: English, Spanish, French, German, Italian, Portuguese, Dutch, Russian, Chinese (Simplified & Traditional), Japanese, Korean, Arabic, Hindi, and more.
- Regional Languages: Catalan, Welsh, Icelandic, Latvian, Lithuanian, Slovenian, and many others.
Best Practices
- Use Language Hints: Specify the language when known for better accuracy
- Clean Audio: Higher quality audio produces better transcriptions
- Chunk Long Audio: Split files longer than 25 minutes into chunks
- Use Prompts: Guide the model with context-specific terminology
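For the chunking advice, the first step is deciding where to cut. The sketch below computes back-to-back time ranges no longer than the 25-minute limit. It is illustrative only: the function name is hypothetical, the actual cutting needs an audio tool of your choice, and cutting at fixed offsets can split a word, so aligning cuts with pauses is likely to give cleaner results.

```python
MAX_CHUNK_SECONDS = 25 * 60  # documented maximum audio length


def chunk_ranges(duration_seconds, max_seconds=MAX_CHUNK_SECONDS):
    """Return (start, end) pairs in seconds that cover the whole recording."""
    if duration_seconds < 0 or max_seconds <= 0:
        raise ValueError("duration must be >= 0 and max_seconds must be > 0")
    ranges = []
    start = 0.0
    while start < duration_seconds:
        end = min(start + max_seconds, duration_seconds)
        ranges.append((start, end))
        start = end
    return ranges
```

A one-hour recording yields three ranges: two full 25-minute chunks and a final 10-minute remainder.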