
meta/meta-llama-3-70b-instruct

A 70-billion-parameter language model from Meta, fine-tuned for chat completions.

Capabilities

Max tokens, top-p

Cost

Community model (cost estimated from hardware time)

Input Parameters

frequency_penalty number

Penalty applied to tokens in proportion to how often they have already appeared in the generated text; higher values reduce repetition.

Default: 0.2
max_tokens integer

The maximum number of tokens the model should generate as output.

Default: 512
min_tokens integer

The minimum number of tokens the model should generate as output.

Default: 0
presence_penalty number

Flat penalty applied to any token that has already appeared in the generated text, encouraging the model to introduce new tokens.

Default: 1.15
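The distinction between the two penalties can be illustrated with a common formulation (a minimal sketch of how such penalties are typically applied to token scores; the model's exact internal implementation may differ):

```python
# Sketch of how frequency and presence penalties commonly adjust token
# logits (a standard formulation, not necessarily this model's exact code).
from collections import Counter

def apply_penalties(logits, generated_ids, frequency_penalty=0.2, presence_penalty=1.15):
    """Lower the scores of tokens already present in the output so far."""
    counts = Counter(generated_ids)
    adjusted = dict(logits)
    for tok, n in counts.items():
        if tok in adjusted:
            # frequency_penalty scales with how often the token appeared;
            # presence_penalty is a flat cost for appearing at all.
            adjusted[tok] -= n * frequency_penalty + presence_penalty
    return adjusted

logits = {"the": 3.0, "cat": 2.5, "dog": 2.4}
out = apply_penalties(logits, ["the", "the", "cat"])
# "the" appeared twice, "cat" once, "dog" not at all, so their
# scores drop by 1.55, 1.35, and 0 respectively.
```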
prompt string

The input prompt to send to the model.

Default: ""
prompt_template string

Prompt template. The placeholder `{prompt}` is replaced with the input prompt. To generate dialog output, use this template as a starting point and construct the prompt string manually, leaving `prompt_template` as `{prompt}`.

Default: "{prompt}"
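Constructing the prompt string manually might look like the sketch below. The special tokens follow Meta's published Llama 3 Instruct chat format; the `build_dialog` helper is hypothetical, written here only for illustration:

```python
# Sketch of building a Llama 3 Instruct dialog prompt by hand, to be
# passed with prompt_template="{prompt}" so it is used verbatim.
# Special tokens follow Meta's published Llama 3 chat format.

def build_dialog(system, turns):
    """turns: list of (role, content) pairs, e.g. [("user", "Hi")]."""
    parts = ["<|begin_of_text|>"]
    parts.append(f"<|start_header_id|>system<|end_header_id|>\n\n{system}<|eot_id|>")
    for role, content in turns:
        parts.append(f"<|start_header_id|>{role}<|end_header_id|>\n\n{content}<|eot_id|>")
    # End with an open assistant header to cue the model's reply.
    parts.append("<|start_header_id|>assistant<|end_header_id|>\n\n")
    return "".join(parts)

prompt = build_dialog("You are a helpful assistant.",
                      [("user", "Name a prime number.")])
```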
temperature number

The value used to modulate the next token probabilities.

Default: 0.6
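"Modulating the next token probabilities" means dividing the logits by the temperature before the softmax, so lower values sharpen the distribution and higher values flatten it. A minimal sketch:

```python
# Sketch of temperature scaling: softmax over logits / temperature.
# Lower temperature concentrates probability on the top token.
import math

def softmax_with_temperature(logits, temperature=0.6):
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

cool = softmax_with_temperature([2.0, 1.0, 0.0], temperature=0.6)
warm = softmax_with_temperature([2.0, 1.0, 0.0], temperature=1.5)
# cool puts more mass on the highest-logit token than warm does.
```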
top_k integer

The number of highest-probability tokens to consider when generating output. If > 0, only the top k tokens with the highest probability are kept (top-k filtering); 0 disables the filter.

Default: 50
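Top-k filtering can be sketched as masking every token outside the k best before sampling (illustrative only, not the model's internal code):

```python
# Minimal sketch of top-k filtering: keep only the k highest-scoring
# tokens and mask the rest with -inf so they can never be sampled.
import math

def top_k_filter(logits, k):
    """Return logits with everything outside the top k set to -inf."""
    if k <= 0:
        return dict(logits)  # k == 0 disables top-k filtering
    cutoff = sorted(logits.values(), reverse=True)[k - 1]
    return {t: (v if v >= cutoff else -math.inf) for t, v in logits.items()}

kept = top_k_filter({"a": 1.0, "b": 0.5, "c": 0.1, "d": -0.2}, k=2)
# Only "a" and "b" keep finite scores; "c" and "d" are masked out.
```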
top_p number

A probability threshold for generating the output. If < 1.0, only keep the top tokens with cumulative probability >= top_p (nucleus filtering). Nucleus filtering is described in Holtzman et al. (http://arxiv.org/abs/1904.09751).

Default: 0.9
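Nucleus filtering keeps the smallest set of top tokens whose cumulative probability reaches `top_p`, then renormalizes and samples from that set. A minimal sketch of the selection step:

```python
# Minimal sketch of nucleus (top-p) filtering per Holtzman et al.:
# keep the smallest prefix of tokens, sorted by probability, whose
# cumulative probability reaches top_p, then renormalize.

def nucleus_filter(probs, top_p=0.9):
    """probs: token -> probability (summing to 1). Returns the kept set."""
    kept, cumulative = {}, 0.0
    for tok, p in sorted(probs.items(), key=lambda kv: kv[1], reverse=True):
        kept[tok] = p
        cumulative += p
        if cumulative >= top_p:
            break
    # Renormalize the surviving probability mass before sampling.
    total = sum(kept.values())
    return {t: p / total for t, p in kept.items()}

dist = nucleus_filter({"a": 0.5, "b": 0.3, "c": 0.15, "d": 0.05}, top_p=0.9)
# "a"+"b" = 0.8 < 0.9, so "c" is also kept; "d" is dropped.
```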
Version: fbfb20b472b2 · Updated: 2/26/2026 · 165.6M runs