ibm-granite/granite-vision-3.3-2b

Granite-vision-3.3-2b is a compact and efficient vision-language model, specifically designed for visual document understanding, enabling automated content extraction from tables, charts, infographics, plots, diagrams, and more.

Capabilities

Reference Images, Seed, System Prompt, Max Tokens, Top-P

Cost

Community model (estimated from hardware time)

Input Parameters

Name	Type	Description	Default	Constraints
`chat_template`	string	A template to format the prompt with. If not provided, the default prompt template will be used.	`—`	—
`frequency_penalty`	number	Frequency penalty	`0`	—
`image`	string (uri)	Deprecated single image input. Use the `images` input instead. Ignored if `images` is provided.	`—`	—
`images`	array	Image inputs for the model.	`—`	—
`max_tokens`	integer	The maximum number of tokens the model should generate as output.	`512`	—
`min_tokens`	integer	The minimum number of tokens the model should generate as output.	`0`	—
`presence_penalty`	number	Presence penalty	`0`	—
`prompt`	string	User prompt to send to the model.	`""`	—
`seed`	integer	Random seed. Leave blank to randomize the seed.	`—`	—
`stop_sequences`	string	A comma-separated list of sequences to stop generation at. For example, '<end>,<stop>' will stop generation at the first instance of '<end>' or '<stop>'.	`—`	—
`system_prompt`	string	System prompt to send to the model. The chat template provides a good default.	`—`	—
`temperature`	number	The value used to modulate the next token probabilities.	`0.6`	—
`top_k`	integer	The number of highest probability tokens to consider for generating the output. If > 0, only keep the top k tokens with highest probability (top-k filtering).	`50`	—
`top_p`	number	A probability threshold for generating the output. If < 1.0, only keep the top tokens with cumulative probability >= top_p (nucleus filtering). Nucleus filtering is described in Holtzman et al. (http://arxiv.org/abs/1904.09751).	`0.9`	—
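
As an illustration of how the parameters in the table fit together, here is a minimal sketch that assembles an input payload. The helper name `build_input`, the unknown-name check, and the example URL are hypothetical and not part of the model's API; only the parameter names and defaults come from the table.

```python
from typing import Any, Dict, List, Optional

# Parameter names taken from the table above.
ALLOWED = {
    "chat_template", "frequency_penalty", "image", "images", "max_tokens",
    "min_tokens", "presence_penalty", "prompt", "seed", "stop_sequences",
    "system_prompt", "temperature", "top_k", "top_p",
}


def build_input(
    prompt: str,
    images: Optional[List[str]] = None,
    **overrides: Any,
) -> Dict[str, Any]:
    """Assemble an input payload (hypothetical helper, not part of the API)."""
    # Defaults copied from the table; parameters without a default are omitted.
    payload: Dict[str, Any] = {
        "prompt": prompt,
        "max_tokens": 512,
        "min_tokens": 0,
        "temperature": 0.6,
        "top_k": 50,
        "top_p": 0.9,
        "frequency_penalty": 0,
        "presence_penalty": 0,
    }
    if images:
        # Use `images`; the single `image` input is documented as deprecated.
        payload["images"] = list(images)
    unknown = set(overrides) - ALLOWED
    if unknown:
        # This check is the sketch's own safeguard, not documented behavior.
        raise ValueError(f"Unknown parameter(s): {sorted(unknown)}")
    payload.update(overrides)
    return payload


# Example URL is a placeholder.
payload = build_input(
    "Extract the table as Markdown.",
    images=["https://example.com/invoice.png"],
    max_tokens=256,
)
```

With the Replicate Python client installed and an API token configured, a payload like this could presumably be passed as `replicate.run("ibm-granite/granite-vision-3.3-2b", input=payload)`; that call is not shown on this page, so treat it as an assumption.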

chat_template string

A template to format the prompt with. If not provided, the default prompt template will be used.

frequency_penalty number

Frequency penalty

Default: 0

image string

Deprecated single image input. Use the images input instead. Ignored if images is provided.

images array

Image inputs for the model.

max_tokens integer

The maximum number of tokens the model should generate as output.

Default: 512

min_tokens integer

The minimum number of tokens the model should generate as output.

Default: 0

presence_penalty number

Presence penalty

Default: 0

prompt string

User prompt to send to the model.

Default: ""

seed integer

Random seed. Leave blank to randomize the seed.

stop_sequences string

A comma-separated list of sequences to stop generation at. For example, '<end>,<stop>' will stop generation at the first instance of '<end>' or '<stop>'.
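
Because stop_sequences is a single string rather than a list, a small helper can make the encoding explicit. The sketch below is illustrative only: both function names are hypothetical, and the truncation function emulates the described behavior on the client side under the assumption that output is cut at the earliest match. The page does not say how whitespace around entries or commas inside a sequence are handled, so the sketch rejects commas outright.

```python
from typing import List


def join_stop_sequences(sequences: List[str]) -> str:
    """Encode stop sequences as the comma-separated string the parameter expects."""
    for seq in sequences:
        if "," in seq:
            # The comma is the delimiter and no escaping rule is documented.
            raise ValueError(f"Stop sequence may not contain a comma: {seq!r}")
    return ",".join(sequences)


def truncate_at_stop(text: str, stop_sequences: str) -> str:
    """Cut text at the earliest occurrence of any stop sequence (client-side emulation)."""
    stops = [s for s in stop_sequences.split(",") if s]
    cut = len(text)
    for stop in stops:
        index = text.find(stop)
        if index != -1:
            cut = min(cut, index)
    return text[:cut]


encoded = join_stop_sequences(["<end>", "<stop>"])
trimmed = truncate_at_stop("Total: 42<stop> trailing text<end>", encoded)
```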

system_prompt string

System prompt to send to the model. The chat template provides a good default.

temperature number

The value used to modulate the next token probabilities.

Default: 0.6
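
To make "modulate the next token probabilities" concrete, the following sketch shows the usual way temperature is applied: logits are divided by the temperature before the softmax. This is the standard technique, assumed here for illustration; the page does not describe the model server's actual implementation, and the function name and logits are made up.

```python
import math
from typing import List


def softmax_with_temperature(logits: List[float], temperature: float) -> List[float]:
    """Convert logits to probabilities after dividing by the temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive in this sketch")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


toy_logits = [2.0, 1.0, 0.0]  # illustrative values only
sharper = softmax_with_temperature(toy_logits, 0.6)  # the default temperature
flatter = softmax_with_temperature(toy_logits, 1.5)
```

Values below 1 concentrate probability on the highest-scoring tokens, while values above 1 spread it more evenly.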

top_k integer

The number of highest probability tokens to consider for generating the output. If > 0, only keep the top k tokens with highest probability (top-k filtering).

Default: 50

top_p number

A probability threshold for generating the output. If < 1.0, only keep the top tokens with cumulative probability >= top_p (nucleus filtering). Nucleus filtering is described in Holtzman et al. (http://arxiv.org/abs/1904.09751).

Default: 0.9
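
The top_k and top_p descriptions can be illustrated on a toy distribution. The sketch below applies top-k filtering first, renormalizes, and then keeps the smallest set of top tokens whose cumulative probability reaches top_p. The order of the two filters and the renormalization step are assumptions based on common sampling implementations, not something this page states; the function name and probabilities are made up.

```python
from typing import Dict, List, Tuple


def filter_top_k_top_p(
    probs: Dict[str, float], top_k: int = 50, top_p: float = 0.9
) -> Dict[str, float]:
    """Return the renormalized distribution left after top-k and nucleus filtering."""
    ranked: List[Tuple[str, float]] = sorted(
        probs.items(), key=lambda item: item[1], reverse=True
    )
    if top_k > 0:
        # Top-k filtering: keep only the k most probable tokens.
        ranked = ranked[:top_k]
    total = sum(p for _, p in ranked)
    ranked = [(token, p / total) for token, p in ranked]
    if top_p < 1.0:
        # Nucleus filtering: stop once cumulative probability reaches top_p.
        kept: List[Tuple[str, float]] = []
        cumulative = 0.0
        for token, p in ranked:
            kept.append((token, p))
            cumulative += p
            if cumulative >= top_p:
                break
        ranked = kept
    total = sum(p for _, p in ranked)
    return {token: p / total for token, p in ranked}


toy = {"a": 0.5, "b": 0.3, "c": 0.15, "d": 0.05}  # illustrative values only
nucleus = filter_top_k_top_p(toy, top_k=0, top_p=0.7)
greedy = filter_top_k_top_p(toy, top_k=1, top_p=0.9)
```

In this toy case, `nucleus` keeps only "a" and "b", since their combined probability already exceeds 0.7, and `greedy` keeps only "a".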

Version: 3339e8453ca9 | Updated: 6/26/2026 | Runs: 247.9K