lucataco/florence-2-base

Florence-2: Advancing a Unified Representation for a Variety of Vision Tasks

Capabilities

Reference Images

Cost

Community model (estimated from hardware time)

Input Parameters

image required string

Grayscale input image

task_input string

Input task

Default: "Caption"
Options: Caption, Detailed Caption, More Detailed Caption, Caption to Phrase Grounding, Object Detection, Dense Region Caption, Region Proposal, OCR, OCR with Region

text_input string

Text input (optional)
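
The three parameters above can be assembled into a single input payload. The sketch below is a minimal illustration, not the model's official client code: the helper name `build_input` and the example URL are hypothetical, and it assumes the model is called through a client that accepts a dictionary keyed by the parameter names listed here.

```python
from typing import Optional

# Task names copied from the task_input option list on this page.
TASKS = (
    "Caption",
    "Detailed Caption",
    "More Detailed Caption",
    "Caption to Phrase Grounding",
    "Object Detection",
    "Dense Region Caption",
    "Region Proposal",
    "OCR",
    "OCR with Region",
)


def build_input(
    image: str,
    task_input: str = "Caption",
    text_input: Optional[str] = None,
) -> dict:
    """Build an input payload (hypothetical helper, not part of any library)."""
    if not image:
        raise ValueError("image is required")
    if task_input not in TASKS:
        raise ValueError(f"unknown task_input: {task_input!r}")

    payload = {"image": image, "task_input": task_input}
    # text_input is optional, so include it only when the caller supplies one.
    if text_input is not None:
        payload["text_input"] = text_input
    return payload


# Example with a placeholder image URL (an assumption for illustration only).
example = build_input(
    "https://example.com/photo.jpg",
    task_input="Caption to Phrase Grounding",
    text_input="a green car",
)
```

Validating `task_input` locally catches a misspelled task name before any hardware time is spent on the request.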

Version: c81609117f66
Updated: 2/26/2026
Runs: 132.3K