yorickvp/llava-13b

Visual instruction tuning towards large language and vision models with GPT-4 level capabilities

Capabilities

Reference Images, Max Tokens, Top-P

Community model (cost estimated from hardware time)

Name	Type	Description	Default	Constraints
`image`*	string(uri)	Input image	`—`	—
`prompt`*	string	Prompt to use for text generation	`—`	—
`max_tokens`	integer	Maximum number of tokens to generate. A word is generally 2-3 tokens	`1024`	min: 0
`temperature`	number	Adjusts the randomness of outputs: 0 is deterministic, and values greater than 1 are increasingly random	`0.2`	min: 0
`top_p`	number	When decoding text, samples from the smallest set of most likely tokens whose cumulative probability reaches p; lower the value to ignore less likely tokens	`1`	min: 0, max: 1
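
The table above can be turned into a small input builder. The following is a minimal sketch, not an official client: the function name `build_llava_input` and the example image URL are hypothetical, and only the parameter names, defaults, and constraints come from the table.

```python
from typing import Any, Dict


def build_llava_input(
    image: str,
    prompt: str,
    max_tokens: int = 1024,
    temperature: float = 0.2,
    top_p: float = 1.0,
) -> Dict[str, Any]:
    """Check arguments against the table's constraints and return an input dict.

    Hypothetical helper for illustration; defaults mirror the table.
    """
    if not image:
        raise ValueError("image is required (a URI string)")
    if not prompt:
        raise ValueError("prompt is required")
    if max_tokens < 0:
        raise ValueError("max_tokens must be >= 0")
    if temperature < 0:
        raise ValueError("temperature must be >= 0")
    if not 0 <= top_p <= 1:
        raise ValueError("top_p must be between 0 and 1")
    return {
        "image": image,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
        "top_p": top_p,
    }


if __name__ == "__main__":
    # Placeholder URL, assumed for illustration only.
    payload = build_llava_input(
        image="https://example.com/cat.png",
        prompt="Describe this image.",
    )
    print(payload)
```

The resulting dictionary would typically be passed as the input payload to whichever client serves the model; the exact client call is not specified on this page.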


Version: 80537f9eead1
Updated: 7/25/2026
Runs: 34.0M