rafaelgalle/whisper-diarization-advanced

Ultra-fast, customizable speech-to-text and speaker diarization for noisy, multi-speaker audio. Includes advanced noise reduction, stereo channel support, and flexible audio preprocessing—ideal for call centers, meetings, and podcasts.

Cost

Community model (cost estimated from hardware time)

Input Parameters

Name	Type	Description	Default	Constraints
`file_path`	string(uri)	Audio file path	`—`	—
`file_string`	string	Base64 encoded audio	`—`	—
`file_url`	string	Direct URL to audio	`—`	—
`highpass_freq`	integer	Highpass filter frequency (Hz)	`45`	—
`language`	string	Language code (e.g., 'en', 'pt')	`—`	—
`lowpass_freq`	integer	Lowpass filter frequency (Hz)	`8000`	—
`num_speakers`	integer	Number of speakers (leave empty to auto-detect)	`—`	min: 1, max: 50
`preprocess`	integer	Preprocessing level: 0=None, 1=Sanitize, 2=+Filter, 3=+ReduceNoise, 4=+Normalize	`0`	min: 0, max: 4
`prompt`	string	Custom prompt with names/acronyms separated by punctuation	`—`	—
`prop_decrease`	number	Noise reduction proportion	`0.3`	min: 0, max: 1
`stationary`	boolean	Use stationary noise reduction	`true`	—
`target_dBFS`	number	Target loudness in dBFS	`-18`	—
`translate`	boolean	Translate to English	`false`	—
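
The parameters in the table can be assembled into a request payload roughly as follows. This is a minimal sketch, not the model author's code: the helper names `active_stages`, `build_input`, and `transcribe` are hypothetical, the reading of `preprocess` as cumulative (each level adds one stage to the previous ones) is an assumption based on the `+` notation in the description, and the output format of the model is not documented here, so the result is returned untouched. The call uses `replicate.run` from the `replicate` Python client; depending on the client version, the model reference may need a `:<version id>` suffix, which this page shows only in shortened form.

```python
from typing import Any, Optional

# Model name as given in the page title.
MODEL = "rafaelgalle/whisper-diarization-advanced"

# Stage names from the `preprocess` description. Assumption: levels are
# cumulative, so level N enables the first N stages.
PREPROCESS_STAGES = ["Sanitize", "Filter", "ReduceNoise", "Normalize"]


def active_stages(level: int) -> list[str]:
    """Return the preprocessing stages assumed to run at a given level."""
    if not 0 <= level <= 4:
        raise ValueError("preprocess must be between 0 and 4")
    return PREPROCESS_STAGES[:level]


def build_input(
    file_url: str,
    *,
    language: Optional[str] = None,
    num_speakers: Optional[int] = None,
    preprocess: int = 0,
    prop_decrease: float = 0.3,
) -> dict[str, Any]:
    """Build an input dict, enforcing the constraints listed in the table."""
    if num_speakers is not None and not 1 <= num_speakers <= 50:
        raise ValueError("num_speakers must be between 1 and 50")
    if not 0 <= preprocess <= 4:
        raise ValueError("preprocess must be between 0 and 4")
    if not 0.0 <= prop_decrease <= 1.0:
        raise ValueError("prop_decrease must be between 0 and 1")

    payload: dict[str, Any] = {
        "file_url": file_url,
        "preprocess": preprocess,
        "prop_decrease": prop_decrease,
    }
    # Optional fields are omitted when unset so the model's own defaults
    # (for example, speaker auto-detection) apply.
    if language is not None:
        payload["language"] = language
    if num_speakers is not None:
        payload["num_speakers"] = num_speakers
    return payload


def transcribe(inputs: dict[str, Any], model_ref: str = MODEL) -> Any:
    """Send the payload to Replicate and return the raw model output."""
    # Third-party client; expects REPLICATE_API_TOKEN in the environment.
    import replicate

    return replicate.run(model_ref, input=inputs)
```

For a noisy two-speaker call recording, something like `transcribe(build_input("https://example.com/call.wav", num_speakers=2, preprocess=3))` would request sanitizing, filtering, and noise reduction under the cumulative assumption; the URL is a placeholder.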

Version: 68dd22041e73 | Updated: 7/25/2026 | 579.5K runs