
cjwbw/aniportrait-audio2vid

Audio-Driven Synthesis of Photorealistic Portrait Animations

Capabilities

Aspect ratios: 1:1, 4:3, 3:4, 16:9, 9:16, 2:3, 3:2

Reference Images, Seed

Cost

Community model (estimated from hardware time)

Input Parameters

audio required string

Input audio

image required string

Input image

fps integer

Frames per second of the output video

Default: 30

guidance_scale number

Scale for classifier-free guidance

Default: 3.5

height integer

Height of output video

Default: 512

seed integer

Random seed. Leave blank to randomize the seed

steps integer

Number of inference steps

Default: 25

width integer

Width of output video

Default: 512
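
The parameters listed above can be assembled into a single input payload. The following is a minimal sketch only: build_input is a hypothetical helper (not part of any client library), the example.com URLs are placeholders, and passing audio and image as URL strings is an assumption, since the listing only says they are strings. How the payload is submitted depends on the client you use.

```python
from typing import Any, Dict, Optional

# Defaults copied from the parameter list above.
DEFAULTS: Dict[str, Any] = {
    "fps": 30,
    "guidance_scale": 3.5,
    "height": 512,
    "steps": 25,
    "width": 512,
}


def build_input(
    audio: str,
    image: str,
    seed: Optional[int] = None,
    **overrides: Any,
) -> Dict[str, Any]:
    """Assemble an input payload (hypothetical helper for illustration)."""
    # audio and image are the only required parameters.
    if not audio or not image:
        raise ValueError("both 'audio' and 'image' are required")
    # Reject names that do not appear in the parameter list.
    unknown = set(overrides) - set(DEFAULTS)
    if unknown:
        raise ValueError(f"unknown parameters: {sorted(unknown)}")
    payload: Dict[str, Any] = {"audio": audio, "image": image, **DEFAULTS, **overrides}
    # seed is optional; leaving it out lets the seed be randomized.
    if seed is not None:
        payload["seed"] = seed
    return payload


example = build_input(
    audio="https://example.com/speech.wav",    # placeholder, assumed to be a URL
    image="https://example.com/portrait.png",  # placeholder, assumed to be a URL
    seed=42,
    steps=30,
)
```

Setting seed explicitly, as in the example, is the usual way to make a run repeatable; omitting it leaves the choice to the model.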

Version: 3f976d8f2308 | Updated: 2/26/2026 | 14.9K runs