meta/musicgen

Generate music from a prompt or melody

Capabilities

Seed, Top-P

Cost

Community model (estimated from hardware time)

Input Parameters

Name	Type	Description	Default	Constraints
`classifier_free_guidance`	integer	Increases the influence of inputs on the output. Higher values produce lower-variance outputs that adhere more closely to inputs.	`3`	—
`continuation`	boolean	If `True`, generated music will continue from `input_audio`. Otherwise, generated music will mimic `input_audio`'s melody.	`false`	—
`continuation_end`	integer	End time of the audio file to use for continuation. If -1 or None, will default to the end of the audio clip.	`—`	min: 0
`continuation_start`	integer	Start time of the audio file to use for continuation.	`0`	min: 0
`duration`	integer	Duration of the generated audio in seconds.	`8`	—
`input_audio`	string(uri)	An audio file that will influence the generated music. If `continuation` is `True`, the generated music will be a continuation of the audio file. Otherwise, the generated music will mimic the audio file's melody.	`—`	—
`model_version`	string	Model to use for generation.	`"stereo-melody-large"`	stereo-melody-large, stereo-large, melody-large, large
`multi_band_diffusion`	boolean	If `True`, the EnCodec tokens will be decoded with MultiBand Diffusion. Only works with non-stereo models.	`false`	—
`normalization_strategy`	string	Strategy for normalizing audio.	`"loudness"`	loudness, clip, peak, rms
`output_format`	string	Output format for generated audio.	`"wav"`	wav, mp3
`prompt`	string	A description of the music you want to generate.	`—`	—
`seed`	integer	Seed for random number generator. If None or -1, a random seed will be used.	`—`	—
`temperature`	number	Controls the 'conservativeness' of the sampling process. Higher temperature means more diversity.	`1`	—
`top_k`	integer	Reduces sampling to the k most likely tokens.	`250`	—
`top_p`	number	Reduces sampling to tokens with cumulative probability of p. When set to `0` (default), top_k sampling is used.	`0`	—
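
The defaults and constraints in the table above can be collected into a request payload. The sketch below is illustrative only: `build_musicgen_input` is a hypothetical helper, not part of any official client, and it checks only the constraints stated in the table (allowed enum values, `continuation_start` of at least 0, and MultiBand Diffusion being limited to non-stereo models). It assumes that "non-stereo" means a `model_version` that does not start with `stereo`.

```python
# Allowed values, copied from the Constraints column of the table.
ALLOWED_MODEL_VERSIONS = {"stereo-melody-large", "stereo-large", "melody-large", "large"}
ALLOWED_NORMALIZATION = {"loudness", "clip", "peak", "rms"}
ALLOWED_OUTPUT_FORMATS = {"wav", "mp3"}

# Parameters that the table lists without a default value.
OPTIONAL_WITHOUT_DEFAULT = {"input_audio", "continuation_end", "seed"}


def build_musicgen_input(prompt, **overrides):
    """Hypothetical helper: merge overrides onto the documented defaults
    and validate them against the documented constraints."""
    payload = {
        "prompt": prompt,
        "model_version": "stereo-melody-large",
        "duration": 8,
        "classifier_free_guidance": 3,
        "temperature": 1,
        "top_k": 250,
        "top_p": 0,  # 0 means top_k sampling is used
        "continuation": False,
        "continuation_start": 0,
        "multi_band_diffusion": False,
        "normalization_strategy": "loudness",
        "output_format": "wav",
    }

    # Reject names that do not appear in the parameter table.
    unknown = set(overrides) - set(payload) - OPTIONAL_WITHOUT_DEFAULT
    if unknown:
        raise ValueError(f"unknown parameters: {sorted(unknown)}")
    payload.update(overrides)

    if payload["model_version"] not in ALLOWED_MODEL_VERSIONS:
        raise ValueError("model_version is not one of the documented values")
    if payload["normalization_strategy"] not in ALLOWED_NORMALIZATION:
        raise ValueError("normalization_strategy is not one of the documented values")
    if payload["output_format"] not in ALLOWED_OUTPUT_FORMATS:
        raise ValueError("output_format is not one of the documented values")
    if payload["continuation_start"] < 0:
        raise ValueError("continuation_start must be >= 0")

    # Assumption: "non-stereo" means the version name does not start with "stereo".
    if payload["multi_band_diffusion"] and payload["model_version"].startswith("stereo"):
        raise ValueError("multi_band_diffusion only works with non-stereo models")

    return payload


if __name__ == "__main__":
    print(build_musicgen_input("warm lo-fi piano loop", duration=12, output_format="mp3"))
```

If the model is called through a client library, the returned dictionary would typically be passed as the input payload. The page does not name a client, so that step is left out here.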

Version: 671ac645ce5e
Updated: 7/25/2026
3.3M runs