lucataco/florence-2-base
Official
Florence-2: Advancing a Unified Representation for a Variety of Vision Tasks
Capabilities
Reference Images
Cost
Community model (estimated from hardware time)
Input Parameters
| Name | Type | Description | Default | Constraints |
|---|---|---|---|---|
| image (required) | string (uri) | Grayscale input image | — | — |
| task_input | string | Input task | "Caption" | One of: Caption, Detailed Caption, More Detailed Caption, Caption to Phrase Grounding, Object Detection, Dense Region Caption, Region Proposal, OCR, OCR with Region |
| text_input | string | Text input (optional) | — | — |
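
As a rough illustration of how the parameters above fit together, the following sketch assembles and validates an input payload in plain Python. The helper name `build_input` and the `TASKS` tuple are hypothetical and not part of the model or of any client library; only the parameter names, the default, and the task list come from the table. The resulting dictionary is the kind of mapping that could be passed as `input=` to a client call such as `replicate.run`, but no network call is made here.

```python
from typing import Optional

# Task names copied from the Constraints column of the parameter table.
TASKS = (
    "Caption",
    "Detailed Caption",
    "More Detailed Caption",
    "Caption to Phrase Grounding",
    "Object Detection",
    "Dense Region Caption",
    "Region Proposal",
    "OCR",
    "OCR with Region",
)


def build_input(image: str, task_input: str = "Caption",
                text_input: Optional[str] = None) -> dict:
    """Hypothetical helper: assemble and validate the model's input payload."""
    if not image:
        raise ValueError("image is required (a URI string)")
    if task_input not in TASKS:
        raise ValueError(f"unknown task_input: {task_input!r}")
    payload = {"image": image, "task_input": task_input}
    # text_input is optional, so it is included only when provided.
    if text_input is not None:
        payload["text_input"] = text_input
    return payload
```

For example, `build_input("https://example.com/page.png", "OCR")` returns a payload with `image` and `task_input` set and no `text_input` key. The URL is a placeholder. Validating `task_input` locally is a design choice for catching typos before a request is sent; the service performs its own validation.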
Version: c81609117f66
Updated: 2/26/2026
Runs: 132.3K