lucataco/csm-1b
Official
View on Replicate →
CSM (Conversational Speech Model) is a speech generation model from Sesame that generates RVQ (residual vector quantization) audio codes from text and audio inputs.
Capabilities
No capability data available
Cost
Community model (estimated from hardware time)
Input Parameters
| Name | Type | Description | Default | Constraints |
|---|---|---|---|---|
| max_audio_length_ms | integer | Maximum audio length in milliseconds | 10000 | min: 1000, max: 30000 |
| speaker | integer | Speaker ID (0 or 1) | 0 | one of: 0, 1 |
| text | string | Text to convert to speech | "Hello from Sesame." | none |
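
As a minimal sketch of how the parameters above fit together, the helper below checks arguments against the listed defaults and constraints before building an input payload. The function names `build_csm_input` and `generate_speech` are hypothetical. The call through the `replicate` Python client assumes the package is installed, that `REPLICATE_API_TOKEN` is set, and that the model can be addressed as `lucataco/csm-1b` without a pinned version. None of this is confirmed by this page.

```python
def build_csm_input(text="Hello from Sesame.", speaker=0, max_audio_length_ms=10000):
    """Return an input dict after checking it against the documented constraints."""
    if not isinstance(text, str):
        raise ValueError("text must be a string")
    # bool is a subclass of int in Python, so compare exact types.
    if type(speaker) is not int or speaker not in (0, 1):
        raise ValueError("speaker must be 0 or 1")
    if type(max_audio_length_ms) is not int:
        raise ValueError("max_audio_length_ms must be an integer")
    if not 1000 <= max_audio_length_ms <= 30000:
        raise ValueError("max_audio_length_ms must be between 1000 and 30000")
    return {
        "text": text,
        "speaker": speaker,
        "max_audio_length_ms": max_audio_length_ms,
    }


def generate_speech(**kwargs):
    """Hypothetical wrapper: run the model through the Replicate client."""
    # Imported lazily so the validation helper works without the package.
    import replicate

    # Assumption: the unversioned "owner/name" reference resolves to a
    # runnable version; pin a full version ID if your client requires one.
    return replicate.run("lucataco/csm-1b", input=build_csm_input(**kwargs))


if __name__ == "__main__":
    # Builds the payload only; no network request is made here.
    print(build_csm_input("Good morning.", speaker=1, max_audio_length_ms=5000))
```

Validating locally means an out-of-range `max_audio_length_ms` or an unknown `speaker` fails before any billable hardware time is used.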
Version: 3e59b10a9894
Updated: 6/8/2026
1.2K runs