zsxkib/mmaudio


Add sound to video using MMAudio V2, an AI model that synthesizes high-quality audio from video content.

Capabilities

Reference Images, Negative Prompt, Seed

Cost

Community model (estimated from hardware time)

Input Parameters

Name	Type	Description	Default	Constraints
`cfg_strength`	number	Guidance strength (CFG)	`4.5`	min: 1
`duration`	number	Duration of output in seconds	`8`	min: 1
`image`	string(uri)	Optional image file for image-to-audio generation (experimental)	`—`	—
`negative_prompt`	string	Negative prompt to avoid certain sounds	`"music"`	—
`num_steps`	integer	Number of inference steps	`25`	—
`prompt`	string	Text prompt for generated audio	`""`	—
`seed`	integer	Random seed. Use -1 or leave blank to randomize the seed	`—`	min: -1
`video`	string(uri)	Optional video file for video-to-audio generation	`—`	—
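
The parameters above map directly onto an input payload. The following is a minimal sketch, not an official client: it assumes the third-party `replicate` Python package and a `REPLICATE_API_TOKEN` environment variable, and the helper names `build_input` and `generate_audio` are hypothetical. The model reference may also need a full version ID appended (`owner/name:version`); this page shows only a truncated version hash, so it is left out here.

```python
from typing import Any, Dict, Optional

# Model reference as named on this page. Assumption: a full version ID
# may need to be appended as "zsxkib/mmaudio:<version>".
MODEL = "zsxkib/mmaudio"


def build_input(
    video: Optional[str] = None,
    image: Optional[str] = None,
    prompt: str = "",
    negative_prompt: str = "music",
    duration: float = 8,
    cfg_strength: float = 4.5,
    num_steps: int = 25,
    seed: Optional[int] = None,
) -> Dict[str, Any]:
    """Build an input payload using the defaults and constraints from the table."""
    if cfg_strength < 1:
        raise ValueError("cfg_strength must be >= 1")
    if duration < 1:
        raise ValueError("duration must be >= 1")
    if seed is not None and seed < -1:
        raise ValueError("seed must be >= -1")

    payload: Dict[str, Any] = {
        "prompt": prompt,
        "negative_prompt": negative_prompt,
        "duration": duration,
        "cfg_strength": cfg_strength,
        "num_steps": num_steps,
    }
    # Optional fields are omitted when unset; a missing seed means "randomize".
    if video is not None:
        payload["video"] = video
    if image is not None:
        payload["image"] = image
    if seed is not None:
        payload["seed"] = seed
    return payload


def generate_audio(**kwargs: Any) -> Any:
    """Run the model remotely. Requires network access and an API token."""
    import replicate  # third-party: pip install replicate

    return replicate.run(MODEL, input=build_input(**kwargs))
```

Calling `generate_audio(video="https://example.com/clip.mp4", prompt="rain on a tin roof")` would submit a video-to-audio request; the URL is a placeholder. Validating the constraints locally avoids spending a remote run on a request that the API would reject.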

Version: 62871fb59889. Updated: 7/25/2026. 4.8M runs.