cuuupid/qwen2-vl-2b

State-of-the-art open-source model for chatting with videos, and the newest model in the Qwen family

Capabilities

Aspect ratios: 1:1, 4:3, 3:4, 16:9, 9:16, 2:3, 3:2
Max Tokens

Cost

Community model (estimated from hardware time)

Input Parameters

video required string

Video to process

height integer

Height for the video

Default: 128, min: 128, max: 2048

max_duration number

Maximum duration of the video in seconds. Values above 360 may run out of VRAM.

Default: 60, min: 1, max: 768

max_tokens integer

Maximum number of tokens to generate

Default: 128, min: 1, max: 8192

prompt string

Prompt to use for the video

Default: "Describe the video."

repetition_penalty number

Repetition penalty for the model (1.1 is a good default).

Default: 1.1, min: 0.01, max: 1.5

temperature number

Temperature for the model (0.7 is a good default).

Default: 0.7, min: 0.01, max: 1

width integer

Width for the video

Default: 128 min: 128, max: 2048
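
The parameter list above can be turned into a small client-side check before a request is sent. The following is a minimal sketch, not the host's own validation: the helper name `build_input`, the `NUMERIC_PARAMS` table, and the example URL are hypothetical, and it assumes `video` is passed as a URL or path string, which the listing does not specify. The ranges and defaults are copied from the list above. How the resulting dict is submitted depends on the client in use and is not shown.

```python
from typing import Any

# Ranges and defaults copied from the parameter list above.
# Each entry: (expected kind, default, minimum, maximum)
NUMERIC_PARAMS = {
    "height": (int, 128, 128, 2048),
    "width": (int, 128, 128, 2048),
    "max_duration": (float, 60, 1, 768),
    "max_tokens": (int, 128, 1, 8192),
    "repetition_penalty": (float, 1.1, 0.01, 1.5),
    "temperature": (float, 0.7, 0.01, 1),
}
DEFAULT_PROMPT = "Describe the video."


def build_input(video: str, prompt: str = DEFAULT_PROMPT, **overrides: Any) -> dict:
    """Return an input dict with defaults filled in and ranges checked.

    Hypothetical helper: it only mirrors the documented limits locally.
    """
    if not video:
        raise ValueError("video is required")
    unknown = set(overrides) - set(NUMERIC_PARAMS)
    if unknown:
        raise ValueError(f"unknown parameters: {sorted(unknown)}")

    payload = {"video": video, "prompt": prompt}
    for name, (kind, default, lo, hi) in NUMERIC_PARAMS.items():
        value = overrides.get(name, default)
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            raise TypeError(f"{name} must be a number")
        if kind is int and not isinstance(value, int):
            raise TypeError(f"{name} must be an integer")
        if not lo <= value <= hi:
            raise ValueError(f"{name}={value} is outside [{lo}, {hi}]")
        payload[name] = value
    return payload


# Example: a shorter clip window and a longer answer than the defaults.
example = build_input(
    "https://example.com/clip.mp4",  # placeholder URL, not a real asset
    max_duration=30,
    max_tokens=256,
)
print(example)
```

Checking ranges locally only catches out-of-range values early. It does not guard against the VRAM limit noted for `max_duration` above 360, which depends on the hardware the model runs on.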
Version: b3e77005f199
Updated: 2/26/2026
Runs: 659