
lucataco/csm-1b

CSM (Conversational Speech Model) is a speech generation model from Sesame that generates RVQ audio codes from text and audio inputs.

Capabilities

No capability data available

Cost

Community model (estimated from hardware time)

Input Parameters

max_audio_length_ms integer

Maximum audio length in milliseconds

Default: 10000, min: 1000, max: 30000

speaker integer

Speaker ID (0 or 1)

Default: 0

text string

Text to convert to speech

Default: "Hello from Sesame."

Version: 3e59b10a9894 | Updated: 6/8/2026 | 1.2K runs
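
The input parameters listed on this page can be checked before a request is sent. The following is a minimal sketch, not part of the model's own tooling: `build_csm_input` is a hypothetical helper name, and the rule that `text` must be non-empty is an assumption. The defaults and limits are copied from the parameter list above.

```python
from typing import Any, Dict

# Limits and allowed values taken from the parameter list on this page.
MIN_AUDIO_MS = 1000
MAX_AUDIO_MS = 30000
VALID_SPEAKERS = (0, 1)


def build_csm_input(
    text: str = "Hello from Sesame.",
    speaker: int = 0,
    max_audio_length_ms: int = 10000,
) -> Dict[str, Any]:
    """Hypothetical helper: validate the three inputs and return a payload dict."""
    # Assumption: an empty string is rejected; the page does not state this.
    if not isinstance(text, str) or not text.strip():
        raise ValueError("text must be a non-empty string")
    if speaker not in VALID_SPEAKERS:
        raise ValueError("speaker must be 0 or 1")
    if not MIN_AUDIO_MS <= max_audio_length_ms <= MAX_AUDIO_MS:
        raise ValueError(
            f"max_audio_length_ms must be between {MIN_AUDIO_MS} and {MAX_AUDIO_MS}"
        )
    return {
        "text": text,
        "speaker": speaker,
        "max_audio_length_ms": max_audio_length_ms,
    }


if __name__ == "__main__":
    # Build a payload for speaker 1 with a 15-second cap on the output audio.
    print(build_csm_input("Nice to meet you.", speaker=1, max_audio_length_ms=15000))
```

The resulting dictionary uses the parameter names exactly as they appear above, so it can be passed as the input object to whichever client is used to run the model.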