qwen/qwen3-tts

A unified text-to-speech demo with three modes: preset voices (custom_voice), voice cloning (voice_clone), and voice design (voice_design)

Cost

Community model; cost is estimated from hardware time

Input Parameters

text required string

Text to synthesize into speech

language string

Language of the text (use 'auto' for automatic detection)

Default: "auto"
Options: auto, Chinese, English, Japanese, Korean, French, German, Spanish, Portuguese, Russian

mode string

TTS mode: 'custom_voice' uses preset speakers, 'voice_clone' clones from reference audio, 'voice_design' creates voice from description

Default: "custom_voice"
Options: custom_voice, voice_clone, voice_design

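
Because several parameters apply to only one mode, it can help to drop the ones that do not apply before sending a request. The following is a minimal client-side sketch, not part of the model itself: the helper name `filter_for_mode` and the grouping into "common" and "mode-specific" parameters are assumptions drawn from the parameter descriptions on this page. How the model treats parameters that do not match the mode is not documented here.

```python
# Mode-specific parameters, taken from the "only for ... mode" notes on this page.
MODE_PARAMS = {
    "custom_voice": {"speaker"},
    "voice_clone": {"reference_audio", "reference_text"},
    "voice_design": {"voice_description"},
}

# Parameters described without a mode restriction (assumed to apply to every mode).
COMMON_PARAMS = {"text", "language", "mode", "style_instruction"}


def filter_for_mode(params: dict) -> dict:
    """Return a copy of params without keys that belong to another mode.

    Hypothetical helper: falls back to the documented default mode
    ("custom_voice") when no mode is given.
    """
    mode = params.get("mode", "custom_voice")
    if mode not in MODE_PARAMS:
        raise ValueError(f"unknown mode: {mode!r}")
    allowed = COMMON_PARAMS | MODE_PARAMS[mode]
    return {key: value for key, value in params.items() if key in allowed}
```

For example, passing `{"text": "hi", "mode": "voice_clone", "speaker": "Ryan", "reference_audio": "clip.wav"}` returns the same dictionary without the `speaker` key.
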
reference_audio string

Reference audio file for voice cloning (only for 'voice_clone' mode)

reference_text string

Transcript of the reference audio (recommended for 'voice_clone' mode)
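
A voice_clone input might be assembled as in the sketch below. The function name `build_voice_clone_input` is hypothetical, and the sketch assumes that `reference_audio` must be non-empty in this mode and that it is passed as a URL or path string; the page only states that the parameter is a string. The transcript is included only when supplied, since it is recommended rather than required.

```python
from typing import Optional


def build_voice_clone_input(
    text: str,
    reference_audio: str,
    reference_text: Optional[str] = None,
    language: str = "auto",
) -> dict:
    """Build an input dictionary for 'voice_clone' mode (illustrative sketch)."""
    if not text:
        raise ValueError("text is required")
    if not reference_audio:
        # Assumption: cloning cannot work without a reference clip.
        raise ValueError("reference_audio is needed for voice_clone mode")
    payload = {
        "text": text,
        "language": language,
        "mode": "voice_clone",
        "reference_audio": reference_audio,
    }
    if reference_text:
        # Recommended, not required: transcript of the reference clip.
        payload["reference_text"] = reference_text
    return payload
```

The resulting dictionary can then be passed as the model input by whichever client you use; the audio location in any call, such as `https://example.com/sample.wav`, is a placeholder.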

speaker string

Preset speaker voice (only for 'custom_voice' mode)

Default: "Serena"
Options: Aiden, Dylan, Eric, Ono_anna, Ryan, Serena, Sohee, Uncle_fu, Vivian

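
Speaker names use mixed capitalization (for example `Ono_anna` and `Uncle_fu`), so a small lookup can catch typos before a request is made. This is a sketch under the assumption that the model expects the exact spellings listed above; `resolve_speaker` is a hypothetical helper, and the case-insensitive matching is a client-side convenience rather than documented model behavior.

```python
from typing import Optional

# Preset speakers as listed on this page.
SPEAKERS = (
    "Aiden", "Dylan", "Eric", "Ono_anna", "Ryan",
    "Serena", "Sohee", "Uncle_fu", "Vivian",
)
DEFAULT_SPEAKER = "Serena"

# Lowercased name -> canonical spelling.
_BY_LOWER = {name.lower(): name for name in SPEAKERS}


def resolve_speaker(name: Optional[str] = None) -> str:
    """Return the canonical speaker name, or the default when none is given."""
    if name is None:
        return DEFAULT_SPEAKER
    key = name.strip().lower()
    if key not in _BY_LOWER:
        raise ValueError(
            f"unknown speaker {name!r}; expected one of {', '.join(SPEAKERS)}"
        )
    return _BY_LOWER[key]
```

With this helper, `resolve_speaker("uncle_fu")` yields `"Uncle_fu"`, and an unlisted name raises `ValueError`.
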
style_instruction string

Optional style/emotion instruction (e.g., 'speak slowly and calmly', 'excited tone')

voice_description string

Natural language description of desired voice (only for 'voice_design' mode). Example: 'A warm, friendly female voice with a slight British accent'
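
A voice_design input combines the text with a voice description and, optionally, a style instruction. The sketch below is illustrative only: `build_voice_design_input` is a hypothetical name, and it assumes that `voice_description` must be non-empty in this mode and that `style_instruction` may accompany it, which the page does not state explicitly.

```python
from typing import Optional


def build_voice_design_input(
    text: str,
    voice_description: str,
    style_instruction: Optional[str] = None,
    language: str = "auto",
) -> dict:
    """Build an input dictionary for 'voice_design' mode (illustrative sketch)."""
    if not text:
        raise ValueError("text is required")
    if not voice_description:
        # Assumption: this mode needs a description to create a voice from.
        raise ValueError("voice_description is needed for voice_design mode")
    payload = {
        "text": text,
        "language": language,
        "mode": "voice_design",
        "voice_description": voice_description,
    }
    if style_instruction:
        # Optional style/emotion hint, e.g. "speak slowly and calmly".
        payload["style_instruction"] = style_instruction
    return payload
```

A call such as `build_voice_design_input("Welcome back.", "A warm, friendly female voice with a slight British accent", style_instruction="speak slowly and calmly")` produces a dictionary with `mode` set to `voice_design` and both descriptive fields filled in.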

Version: 501be1210291
Updated: 2/26/2026
Runs: 83.9K