thomasmol/whisper-diarization
Official
⚡️ Blazing fast audio transcription with speaker diarization | Whisper Large V3 Turbo & pyannote 4.0 community-1 | word & sentence level timestamps | prompt
Capabilities
No capability data available
Cost
Community model (estimated from hardware time)
Input Parameters
| Name | Type | Description | Default | Constraints |
|---|---|---|---|---|
| file | string (uri) | An audio file. Alternative to file_string and file_url. | — | — |
| file_string | string | A Base64-encoded audio file. Alternative to file and file_url. | — | — |
| file_url | string | A direct URL to an audio file. Alternative to file and file_string. | — | — |
| language | string | Language of the spoken words as a language code, such as 'en'. Leave empty to auto-detect the language. | — | — |
| num_speakers | integer | Number of speakers. Leave empty to auto-detect. | — | min: 1, max: 50 |
| prompt | string | Vocabulary: a list of names, acronyms, and loanwords. Use punctuation for best accuracy. | — | — |
| translate | boolean | Translate the speech into English. | false | — |
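
The parameters above can be assembled into an input payload before calling the model. The following is a minimal sketch, not the model author's code: `build_input` is a hypothetical helper, and the rule that exactly one of `file`, `file_string`, or `file_url` must be set is an assumption inferred from the "Either provide / Or provide" wording. Only the 1 to 50 range for `num_speakers` and the `false` default for `translate` come from the table.

```python
from typing import Any, Dict, Optional


def build_input(
    *,
    file: Optional[str] = None,
    file_string: Optional[str] = None,
    file_url: Optional[str] = None,
    language: Optional[str] = None,
    num_speakers: Optional[int] = None,
    prompt: Optional[str] = None,
    translate: bool = False,
) -> Dict[str, Any]:
    """Build an input dict for the model (hypothetical helper)."""
    # Assumption: the three audio sources are mutually exclusive alternatives.
    sources = {"file": file, "file_string": file_string, "file_url": file_url}
    provided = {key: value for key, value in sources.items() if value}
    if len(provided) != 1:
        raise ValueError("Provide exactly one of file, file_string, or file_url.")

    # Constraint from the table: min 1, max 50.
    if num_speakers is not None and not 1 <= num_speakers <= 50:
        raise ValueError("num_speakers must be between 1 and 50.")

    payload: Dict[str, Any] = dict(provided)
    payload["translate"] = translate

    # Omit empty optional fields so the model can auto-detect them.
    optional = {"language": language, "num_speakers": num_speakers, "prompt": prompt}
    for key, value in optional.items():
        if value is not None and value != "":
            payload[key] = value
    return payload


if __name__ == "__main__":
    example = build_input(
        file_url="https://example.com/interview.mp3",  # placeholder URL
        num_speakers=2,
        prompt="Replicate, pyannote, Whisper.",
    )
    print(example)
```

With the `replicate` Python client installed and an API token configured, a payload like this could be passed as `replicate.run("thomasmol/whisper-diarization", input=example)`. Whether the unversioned model reference resolves for this model is not confirmed here; a full version identifier may be needed.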
Version: 744c4f2bffae
Updated: 6/26/2026
8.5M runs