zsxkib/uform-gen

🖼️ Super fast 1.5B Image Captioning/VQA Multimodal LLM (Image-to-Text) 🖋️

Capabilities

Reference Images

Cost

Community model (estimated from hardware time)

Input Parameters

image (string, required)

Input image

prompt (string, optional)

Prompt to guide the caption generation

Default: "Describe the image in great detail"
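The two parameters above can be sketched as a small input builder. This is only an illustration under stated assumptions: `build_input` and `DEFAULT_PROMPT` are hypothetical names, not part of the model's API, and the example image URL is a placeholder (the listing says only that `image` is a string, not which encoding the model expects). The call that actually submits the input to the hosting service is omitted because the listing does not describe it.

```python
# Default prompt as documented in the listing.
DEFAULT_PROMPT = "Describe the image in great detail"


def build_input(image, prompt=None):
    """Assemble the input dictionary for one request (hypothetical helper)."""
    # `image` is the only required parameter and must be a string.
    if not isinstance(image, str) or not image:
        raise ValueError("image is required and must be a non-empty string")
    # Fall back to the documented default when no prompt is supplied.
    return {
        "image": image,
        "prompt": DEFAULT_PROMPT if prompt is None else prompt,
    }


# Captioning with the default prompt (placeholder URL, assumed format).
caption_input = build_input("https://example.com/cat.png")

# Visual question answering: override the prompt with a question.
vqa_input = build_input("https://example.com/cat.png", "What color is the cat?")
```

Leaving `prompt` unset gives plain captioning, while passing a question uses the same two fields for VQA.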

Version: e6fa8e2d0769

Updated: 2/26/2026

Runs: 2.3K