kwaivgi/kling-v3-omni-video
Kling Video 3.0 Omni: Unified multimodal video generation with reference images, video editing, native audio, and multi-shot control
Capabilities
Cost
Community model (estimated from hardware time)
Input Parameters
| Name | Type | Description | Default | Constraints |
|---|---|---|---|---|
| prompt (required) | string | Text prompt for video generation. Supports `<<<image_1>>>`, `<<<video_1>>>` template references. Max 2500 characters. | — | — |
| aspect_ratio | string | Aspect ratio. Required when not using start frame or video editing. | "16:9" | 16:9, 9:16, 1:1 |
| duration | integer | Video duration in seconds (3-15). Ignored for video editing (base). | 5 | min: 3, max: 15 |
| end_image | string (uri) | Last frame image. Requires start_image. Supports .jpg/.jpeg/.png, max 10MB, min 300px. | — | — |
| generate_audio | boolean | Generate native audio. Mutually exclusive with reference video. | false | — |
| keep_original_sound | boolean | Keep original sound from reference video. | true | — |
| mode | string | 'standard' generates 720p, 'pro' generates 1080p. | "pro" | standard, pro |
| multi_prompt | string | JSON array of shot definitions for multi-shot mode. Each shot: `{"prompt": "...", "duration": N}`. Max 6 shots, min duration 1s per shot, total must equal duration. Example: `[{"prompt":"A cat jumps","duration":3},{"prompt":"It lands","duration":2}]` | — | — |
| reference_images | array | Reference images for elements, scenes, or styles. Supports .jpg/.jpeg/.png. Max 7 without video, 4 with video. | — | — |
| reference_video | string (uri) | Reference video (.mp4/.mov). Duration 3-10s, resolution 720-2160px per side, max 200MB. | — | — |
| start_image | string (uri) | First frame image. Supports .jpg/.jpeg/.png, max 10MB, min 300px, aspect ratio 1:2.5 to 2.5:1. | — | — |
| video_reference_type | string | How to use reference video: 'feature' for style/camera reference, 'base' for video editing. | "feature" | feature, base |
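
Several of the constraints above span more than one parameter, so it can help to check an input before submitting it. The following is a minimal sketch, not part of the model's API: `build_multi_prompt` and `check_input` are hypothetical helper names, the input is assumed to be a plain dict keyed by the parameter names in the table, and skipping the duration check in 'base' mode is one reading of "Ignored for video editing (base)".

```python
import json


def build_multi_prompt(shots, duration):
    """Serialize (prompt, seconds) pairs into the JSON string multi_prompt expects."""
    if not 1 <= len(shots) <= 6:
        raise ValueError("multi_prompt takes between 1 and 6 shots")
    if any(seconds < 1 for _, seconds in shots):
        raise ValueError("each shot must last at least 1 second")
    total = sum(seconds for _, seconds in shots)
    if total != duration:
        raise ValueError(f"shot durations sum to {total}, expected {duration}")
    return json.dumps([{"prompt": p, "duration": s} for p, s in shots])


def check_input(params):
    """Return a list of rule violations for a hypothetical input dict."""
    problems = []
    has_video = bool(params.get("reference_video"))
    # 'base' means video editing; the table says duration is ignored there.
    is_edit = has_video and params.get("video_reference_type", "feature") == "base"

    prompt = params.get("prompt", "")
    if not prompt:
        problems.append("prompt is required")
    elif len(prompt) > 2500:
        problems.append("prompt exceeds 2500 characters")

    if not is_edit and not 3 <= params.get("duration", 5) <= 15:
        problems.append("duration must be between 3 and 15 seconds")
    if params.get("aspect_ratio", "16:9") not in {"16:9", "9:16", "1:1"}:
        problems.append("aspect_ratio must be 16:9, 9:16, or 1:1")
    if params.get("mode", "pro") not in {"standard", "pro"}:
        problems.append("mode must be 'standard' or 'pro'")
    if params.get("end_image") and not params.get("start_image"):
        problems.append("end_image requires start_image")
    if params.get("generate_audio") and has_video:
        problems.append("generate_audio cannot be combined with reference_video")

    # The image limit drops from 7 to 4 when a reference video is supplied.
    limit = 4 if has_video else 7
    if len(params.get("reference_images") or []) > limit:
        problems.append(f"at most {limit} reference_images allowed here")
    return problems
```

For example, `build_multi_prompt([("A cat jumps", 3), ("It lands", 2)], 5)` produces a string equivalent to the example in the table, and `check_input` returns an empty list when none of the rules it covers are violated. File-level limits such as image size, video resolution, and the 200MB cap are not checked here because they require inspecting the files themselves.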
1d449e255319 Updated: 2/26/2026 4.6K runs