cjwbw/aniportrait-audio2vid
Official
Audio-Driven Synthesis of Photorealistic Portrait Animations
Capabilities
Aspect ratios: 1:1, 4:3, 3:4, 16:9, 9:16, 2:3, 3:2
Reference Images
Seed
Cost
Community model (estimated from hardware time)
Input Parameters
| Name | Type | Description | Default | Constraints |
|---|---|---|---|---|
| audio * | string (uri) | Input audio | — | — |
| image * | string (uri) | Input image | — | — |
| fps | integer | Frames per second in the output video | 30 | — |
| guidance_scale | number | Scale for classifier-free guidance | 3.5 | — |
| height | integer | Height of the output video | 512 | — |
| seed | integer | Random seed. Leave blank to randomize the seed | — | — |
| steps | integer | Number of inference steps | 25 | — |
| width | integer | Width of the output video | 512 | — |

\* Required parameter.
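
As a rough illustration of how the parameters above fit together, the sketch below assembles an input payload from the two required URIs and the listed defaults. The helper name `build_input` and the example URLs are hypothetical and not part of the model's API; only the parameter names and default values are taken from the table.

```python
from typing import Optional

# Defaults copied from the parameter table above.
DEFAULTS = {
    "fps": 30,
    "guidance_scale": 3.5,
    "height": 512,
    "steps": 25,
    "width": 512,
}


def build_input(audio: str, image: str, seed: Optional[int] = None, **overrides) -> dict:
    """Hypothetical helper: assemble an input payload for the model."""
    # audio and image are the only required parameters.
    if not audio or not image:
        raise ValueError("audio and image are required")
    # Reject names that do not appear in the parameter table.
    unknown = set(overrides) - set(DEFAULTS)
    if unknown:
        raise ValueError(f"unknown parameters: {sorted(unknown)}")
    payload = {"audio": audio, "image": image, **DEFAULTS, **overrides}
    # Leaving the seed out lets the model randomize it.
    if seed is not None:
        payload["seed"] = seed
    return payload


# Placeholder URLs, for illustration only.
payload = build_input(
    "https://example.com/speech.wav",
    "https://example.com/portrait.png",
    fps=25,
    seed=42,
)
```

With the Replicate Python client installed and an API token configured, a payload like this would typically be passed as the `input` argument of `replicate.run`, together with the model reference `cjwbw/aniportrait-audio2vid` and its full version identifier.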
Version: 3f976d8f2308
Updated: 2/26/2026
14.9K runs