bytedance/omni-human
Official
View on Replicate →
Turns your audio/video/images into professional-quality animated videos
Capabilities
Reference Images
Cost
Community model (estimated from hardware time)
Input Parameters
| Name | Type | Description | Default | Constraints |
|---|---|---|---|---|
| audio * | string (uri) | Input audio file (MP3, WAV, etc.). For best quality, keep the audio to 15 seconds or less; video quality begins to degrade beyond that. To process longer audio, split it into 15-second chunks. | — | — |
| image * | string (uri) | Input image containing a human subject, face, or character. | — | — |
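
The 15-second recommendation above can be applied before submitting a job. The following is a minimal sketch, assuming the source audio is an uncompressed WAV file (Python's standard `wave` module does not read MP3). The helper names `split_wav` and `build_inputs` are hypothetical and not part of the model's API; only the `audio` and `image` keys come from the parameter table. Uploading each chunk to a reachable URI is outside the scope of this sketch.

```python
from __future__ import annotations

import wave
from pathlib import Path

# Limit taken from the `audio` parameter description.
MAX_CHUNK_SECONDS = 15


def split_wav(src: str, out_dir: str,
              max_seconds: float = MAX_CHUNK_SECONDS) -> list[Path]:
    """Split a WAV file into consecutive chunks of at most `max_seconds`."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    chunks: list[Path] = []
    with wave.open(src, "rb") as reader:
        params = reader.getparams()
        frames_per_chunk = int(reader.getframerate() * max_seconds)
        index = 0
        while True:
            frames = reader.readframes(frames_per_chunk)
            if not frames:
                break  # end of the source audio
            path = out / f"{Path(src).stem}_{index:03d}.wav"
            with wave.open(str(path), "wb") as writer:
                # Copy channel count, sample width, and rate; the frame
                # count in the header is corrected when the file closes.
                writer.setparams(params)
                writer.writeframes(frames)
            chunks.append(path)
            index += 1
    return chunks


def build_inputs(image_uri: str, audio_uris: list[str]) -> list[dict]:
    """Pair one reference image with each audio chunk (hypothetical helper).

    The keys match the two required parameters in the table above.
    """
    return [{"audio": audio, "image": image_uri} for audio in audio_uris]
```

Each dictionary returned by `build_inputs` corresponds to one run of the model, so a 40-second recording would produce three runs whose output videos then need to be joined in order.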
Version: 566f1b030169
Updated: 2/26/2026
Runs: 153.5K
cinemasetfree.com