
sakemin/musicgen-chord

Generate music restricted to chord sequences and tempo

Capabilities

Seed Top-P

Cost

Community model (estimated from hardware time)

Input Parameters

audio_chords string

An audio file that will condition the chord progression. Provide exactly one of `audio_chords` or `text_chords`.

audio_end integer

End time of the audio file to use for chord conditioning. If `None`, defaults to the end of the audio clip.

min: 0
audio_start integer

Start time of the audio file to use for chord conditioning.

Default: 0 min: 0
bpm number

BPM condition for the generated output. `text_chords` will be processed based on this value. The value is also appended to the end of `prompt`.

chroma_coefficient number

Coefficient by which the multi-hot chord chroma is multiplied.

Default: 1 min: 0.5, max: 2.5
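
To make "multi-hot chord chroma" concrete, here is a minimal sketch of a 12-bin chroma vector for a major triad, scaled by the coefficient. The bin layout (starting at C, one bin per semitone) and the helper name `major_chroma` are assumptions for illustration; the model's internal representation may differ.

```python
# Assumption: 12 pitch-class bins, starting at C, one per semitone.
PITCH_CLASSES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]
MAJOR_TRIAD = (0, 4, 7)  # root, major third, perfect fifth (in semitones)


def major_chroma(root, coefficient=1.0):
    """Return a multi-hot chroma vector for a major triad, scaled by coefficient."""
    start = PITCH_CLASSES.index(root)
    chroma = [0.0] * 12
    for interval in MAJOR_TRIAD:
        chroma[(start + interval) % 12] = coefficient
    return chroma


print(major_chroma("C", 1.5))
# [1.5, 0.0, 0.0, 0.0, 1.5, 0.0, 0.0, 1.5, 0.0, 0.0, 0.0, 0.0]
```

In this picture, a larger `chroma_coefficient` simply makes the active chord bins larger relative to the zero bins.
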
classifier_free_guidance integer

Increases the influence of inputs on the output. Higher values produce lower-variance outputs that adhere more closely to inputs.

Default: 3
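
As a rough illustration of what a guidance scale does, the sketch below uses the common classifier-free guidance formulation, which extrapolates from unconditional logits toward conditional ones. Whether this model applies exactly this formula is an assumption, and `apply_cfg` is a hypothetical helper, not part of the model's API.

```python
def apply_cfg(cond_logits, uncond_logits, guidance):
    """Blend logits: guidance=1 returns the conditional logits unchanged."""
    return [u + guidance * (c - u) for c, u in zip(cond_logits, uncond_logits)]


cond = [2.0, 0.0, -1.0]    # logits computed with the prompt and chord condition
uncond = [1.0, 0.0, 0.0]   # logits computed without conditioning
print(apply_cfg(cond, uncond, 3))  # [4.0, 0.0, -3.0]
```

With a scale of 3, differences between the conditional and unconditional predictions are tripled, which is why higher values follow the inputs more closely.
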
continuation boolean

If `True`, the generated music continues from `audio_chords`. In that case, chord conditioning is only possible when the chord condition is given with `text_chords`. If `False`, the generated music mimics the chord progression of `audio_chords`.

Default: false
duration integer

Duration of the generated audio in seconds.

Default: 8
multi_band_diffusion boolean

If `True`, the EnCodec tokens will be decoded with MultiBand Diffusion.

Default: false
normalization_strategy string

Strategy for normalizing audio.

Default: "loudness"
Options: loudness, clip, peak, rms
output_format string

Output format for generated audio.

Default: "wav"
Options: wav, mp3
prompt string

A description of the music you want to generate.

seed integer

Seed for random number generator. If `None` or `-1`, a random seed will be used.

temperature number

Controls the 'conservativeness' of the sampling process. Higher temperature means more diversity.

Default: 1
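
The effect of temperature can be sketched with a plain softmax: logits are divided by the temperature before being turned into probabilities. This is the standard formulation rather than code taken from the model, and `softmax_with_temperature` is a hypothetical helper name.

```python
import math


def softmax_with_temperature(logits, temperature=1.0):
    """Convert logits to probabilities after dividing by temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 1.0, 0.0]
sharp = softmax_with_temperature(logits, temperature=0.5)
flat = softmax_with_temperature(logits, temperature=2.0)
print(max(sharp) > max(flat))  # True: lower temperature concentrates probability
```

A flatter distribution gives less likely tokens a better chance of being sampled, which is the "more diversity" described above.
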
text_chords string

A text-based chord progression condition. A single uppercase letter (e.g. `C`) is treated as a major chord. Chord attributes (`maj`, `min`, `dim`, `aug`, `min6`, `maj6`, `min7`, `minmaj7`, `maj7`, `7`, `dim7`, `hdim7`, `sus2` and `sus4`) can be added after the root letter, separated by `:` (e.g. `A:min7`). Chord tokens are separated by `SPACE`, and each token is allocated to a single bar. To allocate more than one chord to a single bar, join the chords with `,` and no `SPACE` (e.g. `C,C:7 G, E:min A:min`). Provide exactly one of `audio_chords` or `text_chords`.
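
The format described above can be checked before submitting a request. The following is a minimal sketch of a parser, not the model's own code: `parse_text_chords` is a hypothetical helper, and accepting an accidental (`#` or `b`) after the root letter is an assumption the description does not confirm.

```python
import re

# Chord attributes listed in the parameter description.
QUALITIES = {
    "maj", "min", "dim", "aug", "min6", "maj6", "min7", "minmaj7",
    "maj7", "7", "dim7", "hdim7", "sus2", "sus4",
}
# Assumption: a root is A-G with an optional accidental (# or b).
ROOT = re.compile(r"^[A-G][#b]?$")


def parse_text_chords(text_chords):
    """Split a chord string into bars; each bar is a list of (root, quality)."""
    bars = []
    for token in text_chords.split():      # one space-separated token per bar
        bar = []
        for chord in token.split(","):     # several chords may share a bar
            if not chord:
                continue                   # tolerate a stray comma
            root, _, quality = chord.partition(":")
            quality = quality or "maj"     # a bare root is a major chord
            if not ROOT.match(root) or quality not in QUALITIES:
                raise ValueError(f"Unrecognized chord: {chord!r}")
            bar.append((root, quality))
        if bar:
            bars.append(bar)
    return bars


bars = parse_text_chords("C,C:7 G E:min A:min")
print(len(bars))  # 4
print(bars[0])    # [('C', 'maj'), ('C', '7')]
```

Counting the parsed bars is a quick way to see how many bars a progression occupies before choosing `duration`.
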

time_sig string

Time signature value for the generated output. `text_chords` will be processed based on this value. This will be appended at the end of `prompt`.
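
Because each `text_chords` token fills one bar, `bpm` and `time_sig` together determine how much of the progression fits in `duration`. The arithmetic can be sketched as follows, assuming `time_sig` is written like `4/4` and that `bpm` counts the beats named in the time signature. Both are assumptions, and the helper names are hypothetical.

```python
def seconds_per_bar(bpm, time_sig="4/4"):
    """Length of one bar in seconds (assumes bpm counts the time signature's beats)."""
    beats_per_bar = int(time_sig.split("/")[0])
    return beats_per_bar * 60.0 / bpm


def bars_in_duration(duration, bpm, time_sig="4/4"):
    """Number of whole bars that fit in `duration` seconds."""
    return int(duration // seconds_per_bar(bpm, time_sig))


print(seconds_per_bar(120, "4/4"))      # 2.0
print(bars_in_duration(8, 120, "4/4"))  # 4
```

Under these assumptions, the default `duration` of 8 seconds at 120 BPM in 4/4 covers four bars, so a progression with four chord tokens would fill the clip.
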

top_k integer

Reduces sampling to the k most likely tokens.

Default: 250
top_p number

Reduces sampling to tokens with cumulative probability of p. When set to `0` (default), top_k sampling is used.

Default: 0
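
The interplay between `top_k` and `top_p` described above can be sketched as a filter over token probabilities: when `top_p` is greater than 0 the smallest set of tokens reaching that cumulative probability is kept, otherwise the `top_k` most likely tokens are kept. This is an illustrative sketch of the general technique, with a hypothetical helper name, not the model's sampling code.

```python
def filter_candidates(probs, top_k=250, top_p=0.0):
    """Return the token indices kept for sampling, most likely first."""
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    if top_p > 0:
        kept, cumulative = [], 0.0
        for i in ranked:
            kept.append(i)
            cumulative += probs[i]
            if cumulative >= top_p:  # stop once the nucleus covers top_p
                break
        return kept
    return ranked[:top_k]            # top_p == 0: fall back to top_k


probs = [0.5, 0.25, 0.125, 0.125]
print(filter_candidates(probs, top_k=2))    # [0, 1]
print(filter_candidates(probs, top_p=0.7))  # [0, 1]
```

Sampling then draws only from the kept indices after renormalizing their probabilities.
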
Version: c940ab430857 Updated: 2/26/2026 3.1K runs