cjwbw/controlvideo

Training-free Controllable Text-to-Video Generation

Capabilities

Seed

Cost

Community model (estimated from hardware time)

Input Parameters

video_path required string

Source video

condition string

Type of structure sequence used as the control condition

Default: "depth"
Options: depth, canny, pose
guidance_scale number

Scale for classifier-free guidance

Default: 12.5 (min: 1, max: 20)
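
Classifier-free guidance is commonly formulated as a blend of an unconditional and a prompt-conditioned noise prediction, with the scale controlling how far the result is pushed toward the prompt. The sketch below shows that standard formula only. The function name `apply_guidance` and the plain-list inputs are illustrative assumptions, and nothing here is taken from this model's own code.

```python
def apply_guidance(noise_uncond, noise_cond, guidance_scale=12.5):
    """Standard classifier-free guidance blend (illustrative, not this model's code)."""
    # Range taken from the parameter listing: min 1, max 20.
    if not 1 <= guidance_scale <= 20:
        raise ValueError("guidance_scale must be between 1 and 20")
    # Start at the unconditional prediction and move toward the conditioned one.
    return [u + guidance_scale * (c - u) for u, c in zip(noise_uncond, noise_cond)]
```

At a scale of 1 the blend reduces to the conditioned prediction; larger values push the result further in the direction of the prompt.
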
is_long_video boolean

Whether to use the hierarchical sampler to produce a long video

Default: false
num_inference_steps integer

Number of denoising steps

Default: 50
prompt string

Text description of target video

Default: "A striking mallard floats effortlessly on the sparkling pond."
seed string

Random seed. Leave blank to use a random seed

smoother_steps string

Timesteps at which to apply the interleaved-frame smoother, separated by commas

Default: "19, 20"
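
Because smoother_steps is passed as a comma-separated string rather than a list, it can help to check the value on the client side before submitting it. The helper below is a hypothetical sketch (`parse_smoother_steps` is not part of the model); how the model itself parses the string is not stated here.

```python
def parse_smoother_steps(value="19, 20"):
    """Turn a comma-separated string such as "19, 20" into a list of ints."""
    steps = []
    for part in value.split(","):
        part = part.strip()
        if not part:
            continue  # tolerate trailing commas or stray spaces
        steps.append(int(part))  # raises ValueError on non-numeric input
    return steps
```

A value that fails to parse here raises a ValueError locally, before any hardware time is spent on the run.
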
video_length integer

Length of synthesized video

Default: 15
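
The parameters above can be gathered into a single input mapping. The following is a minimal sketch, assuming the defaults and limits listed on this page; `build_input` and `ALLOWED_CONDITIONS` are hypothetical names, `clip.mp4` is a placeholder, and the step of actually submitting the mapping to the model is left out because it depends on the client you use.

```python
ALLOWED_CONDITIONS = {"depth", "canny", "pose"}  # options listed for `condition`

def build_input(
    video_path,
    prompt="A striking mallard floats effortlessly on the sparkling pond.",
    condition="depth",
    guidance_scale=12.5,
    is_long_video=False,
    num_inference_steps=50,
    seed="",
    smoother_steps="19, 20",
    video_length=15,
):
    """Validate the documented parameters and return them as a dict (hypothetical helper)."""
    if not video_path:
        raise ValueError("video_path is required")
    if condition not in ALLOWED_CONDITIONS:
        raise ValueError("condition must be one of: depth, canny, pose")
    if not 1 <= guidance_scale <= 20:
        raise ValueError("guidance_scale must be between 1 and 20")
    payload = {
        "video_path": video_path,
        "prompt": prompt,
        "condition": condition,
        "guidance_scale": guidance_scale,
        "is_long_video": is_long_video,
        "num_inference_steps": num_inference_steps,
        "smoother_steps": smoother_steps,
        "video_length": video_length,
    }
    # seed is typed as a string; leaving it out lets the seed be randomized.
    if str(seed).strip():
        payload["seed"] = str(seed)
    return payload

example = build_input("clip.mp4", condition="canny", seed=42)
```

Omitting the seed keeps runs randomized, while passing one (as in `example`) records the value needed to request the same seed again.
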
Version: 91710b3f53c9 | Updated: 2/26/2026 | 2.4K runs