thomasmol/whisper-diarization
Official
⚡️ Blazing fast audio transcription with speaker diarization | Whisper Large V3 Turbo | word & sentence level timestamps | prompt
Capabilities
No capability data available
Cost
Community model (estimated from hardware time)
Input Parameters
| Name | Type | Description | Default | Constraints |
|---|---|---|---|---|
| file | string (uri) | An audio file. Alternative to `file_string` and `file_url`. | — | — |
| file_string | string | A Base64-encoded audio file. Alternative to `file` and `file_url`. | — | — |
| file_url | string | A direct URL to an audio file. Alternative to `file` and `file_string`. | — | — |
| group_segments | boolean | Group segments from the same speaker that are less than 2 seconds apart. | true | — |
| language | string | Language of the spoken words as a language code such as 'en'. Leave empty to auto-detect the language. | — | — |
| num_speakers | integer | Number of speakers. Leave empty to auto-detect. | — | min: 1, max: 50 |
| prompt | string | Vocabulary: provide names, acronyms, and loanwords in a list. Use punctuation for best accuracy. | — | — |
| translate | boolean | Translate the speech into English. | false | — |
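
As an illustration of how the parameters above fit together, here is a minimal Python sketch that assembles an input dictionary and checks the one documented constraint (`num_speakers` between 1 and 50). The helper names `build_input` and `encode_audio` are hypothetical and not part of the model or of any client library. The rule that exactly one audio source is supplied is an assumption inferred from the "either/or" wording of the descriptions, and the `file` (uri) parameter is left out for brevity.

```python
import base64
from typing import Any, Optional


def encode_audio(raw: bytes) -> str:
    """Base64-encode raw audio bytes for the file_string parameter."""
    return base64.b64encode(raw).decode("ascii")


def build_input(
    file_url: Optional[str] = None,
    file_string: Optional[str] = None,
    language: Optional[str] = None,
    num_speakers: Optional[int] = None,
    prompt: Optional[str] = None,
    group_segments: bool = True,
    translate: bool = False,
) -> dict[str, Any]:
    """Assemble an input dict using the parameter names from the table."""
    # Assumption: exactly one audio source should be provided.
    if (file_url is None) == (file_string is None):
        raise ValueError("provide exactly one of file_url or file_string")
    # Documented constraint: min 1, max 50.
    if num_speakers is not None and not 1 <= num_speakers <= 50:
        raise ValueError("num_speakers must be between 1 and 50")

    # Booleans always carry their documented defaults.
    payload: dict[str, Any] = {
        "group_segments": group_segments,
        "translate": translate,
    }
    # Optional values are omitted when unset so the model can auto-detect.
    optional = {
        "file_url": file_url,
        "file_string": file_string,
        "language": language,
        "num_speakers": num_speakers,
        "prompt": prompt,
    }
    payload.update({k: v for k, v in optional.items() if v is not None})
    return payload


# Hypothetical example URL; replace with a real audio file location.
payload = build_input(
    file_url="https://example.com/meeting.mp3",
    num_speakers=2,
    prompt="Replicate, Whisper, diarization.",
)
```

The resulting dictionary is meant to be passed as the `input` argument of a Replicate client call, for example `replicate.run("thomasmol/whisper-diarization", input=payload)` in the Python client. Depending on the client version, the model reference may also need an explicit version suffix.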
Version: 1495a9cddc83
Updated: 2/26/2026
5.9M runs