zsxkib/uform-gen

🖼️ Super fast 1.5B Image Captioning/VQA Multimodal LLM (Image-to-Text) 🖋️

Name	Type	Description	Default	Constraints
`image`*	string(uri)	Input image	`—`	—
`prompt`	string	Prompt to guide the caption generation	`"Describe the image in great detail"`	—
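
The two inputs in the table above can be assembled into a request payload along these lines. This is a minimal sketch, not the model author's code: `build_input` and `caption` are hypothetical helper names, and it assumes the model is called through the `replicate` Python client with `replicate.run`. The model reference is passed in by the caller because only a shortened version ID appears on this page; copy the full reference from the model page. The shape of the returned output is not documented here, so the sketch returns it unchanged.

```python
DEFAULT_PROMPT = "Describe the image in great detail"  # default listed in the table


def build_input(image, prompt=None):
    """Build the input dict from the two documented fields.

    `image` is required (a URI string). `prompt` is optional; when it is
    omitted the key is left out so the model falls back to its own default.
    """
    if not image:
        raise ValueError("`image` is required")
    payload = {"image": image}
    if prompt is not None:
        payload["prompt"] = prompt
    return payload


def caption(model_ref, image, prompt=None):
    """Run the model and return whatever the client returns.

    Assumption: `model_ref` is the full reference copied from the model
    page, and REPLICATE_API_TOKEN is set in the environment.
    """
    import replicate  # imported lazily so build_input works without the client

    return replicate.run(model_ref, input=build_input(image, prompt))
```

A call would look like `caption(model_ref, "https://example.com/photo.jpg", "What is in the foreground?")`, where the image URL is a placeholder. Leaving `prompt` out of the payload, instead of resending the default string, keeps the request in step with the model if its default ever changes.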

Version: e6fa8e2d0769 | Updated: 7/25/2026 | 2.3K runs