yorickvp/llava-13b
Official
Visual instruction tuning towards large language and vision models with GPT-4 level capabilities
Capabilities
Reference Images
Max Tokens
Top-P
Cost
Community model (estimated from hardware time)
Input Parameters
| Name | Type | Description | Default | Constraints |
|---|---|---|---|---|
| image * | string (uri) | Input image | — | — |
| prompt * | string | Prompt to use for text generation | — | — |
| max_tokens | integer | Maximum number of tokens to generate. A word is generally 2-3 tokens | 1024 | min: 0 |
| temperature | number | Adjusts randomness of outputs; values greater than 1 are random and 0 is deterministic | 0.2 | min: 0 |
| top_p | number | When decoding text, samples from the top p percentage of most likely tokens; lower to ignore less likely tokens | 1 | min: 0, max: 1 |
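The parameters in the table can be assembled into an input payload before calling the model. The following is a minimal sketch, not an official client: the helper name `build_llava_input` and the example image URL are hypothetical, and the checks only mirror the defaults and constraints listed in the table.

```python
from typing import Any, Dict


def build_llava_input(
    image: str,
    prompt: str,
    max_tokens: int = 1024,
    temperature: float = 0.2,
    top_p: float = 1.0,
) -> Dict[str, Any]:
    """Build an input dict for yorickvp/llava-13b (hypothetical helper).

    Defaults and bounds are taken from the parameter table; everything
    else about this function is an assumption made for illustration.
    """
    # image and prompt are the two required fields (marked with * in the table)
    if not image:
        raise ValueError("image is required (a URI string)")
    if not prompt:
        raise ValueError("prompt is required")
    # max_tokens: min 0
    if max_tokens < 0:
        raise ValueError("max_tokens must be >= 0")
    # temperature: min 0 (0 is deterministic)
    if temperature < 0:
        raise ValueError("temperature must be >= 0")
    # top_p: min 0, max 1
    if not 0 <= top_p <= 1:
        raise ValueError("top_p must be between 0 and 1")
    return {
        "image": image,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
        "top_p": top_p,
    }


if __name__ == "__main__":
    # Example URL is a placeholder, not a real asset.
    payload = build_llava_input(
        image="https://example.com/photo.png",
        prompt="Describe this image.",
    )
    print(payload)
```

The resulting dictionary could then be passed as the `input` argument of a client call, for example `replicate.run` in the Replicate Python client, together with the full model version identifier from the model page. That call is left out here because it needs an API token and network access.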
Version: 80537f9eead1
Updated: 2/26/2026
34.0M runs