ibm-granite/granite-3.3-8b-instruct
Granite-3.3-8B-Instruct is an 8-billion-parameter language model with a 128K-token context length, fine-tuned for improved reasoning and instruction-following capabilities.
Cost
Community model (cost estimated from hardware time)
Input Parameters
| Name | Type | Description | Default | Constraints |
|---|---|---|---|---|
| add_generation_prompt | boolean | Add a generation prompt; passed to the chat template. | true | — |
| chat_template | string | A template to format the prompt with. If not specified, the chat template provided by the model is used. | — | — |
| chat_template_kwargs | object | Additional arguments passed to the chat template (see the grounded-documents example below). | {} | — |
| documents | array | Documents for the request; passed to the chat template (see the grounded-documents example below). | — | — |
| frequency_penalty | number | Frequency penalty; positive values penalize tokens in proportion to how often they have already appeared. | 0 | — |
| max_completion_tokens | integer | An upper bound on the number of tokens that can be generated for a completion, including visible output tokens and reasoning tokens. | — | — |
| max_tokens | integer | Deprecated in favor of max_completion_tokens. | — | — |
| messages | array | Chat completion API messages (see the first example below). | — | — |
| min_tokens | integer | The minimum number of tokens the model should generate as output. | 0 | — |
| presence_penalty | number | Presence penalty; positive values penalize tokens that have appeared at all. | 0 | — |
| prompt | string | Completion API user prompt. | — | — |
| response_format | object | An object specifying the format that the model must output (see the structured-output example below). | — | — |
| seed | integer | Random seed. Leave unspecified to randomize the seed. | — | — |
| stop | array | A list of sequences at which to stop generation. For example, ["<end>", "<stop>"] stops generation at the first instance of "<end>" or "<stop>". | — | — |
| stream | boolean | Request a streaming response (see the streaming example below). | false | — |
| system_prompt | string | Completion API system prompt. The chat template provides a good default. | — | — |
| temperature | number | The value used to modulate the next-token probabilities. | 0.6 | — |
| tool_choice | string | Tool choice for the request. To force a specific function, pass it as a JSON string (see the tool-calling example below). | — | — |
| tools | array | Tools for the request; passed to the chat template (see the tool-calling example below). | — | — |
| top_k | integer | The number of highest-probability tokens to consider when generating the output. If > 0, only the top k tokens with the highest probability are kept (top-k filtering). | 50 | — |
| top_p | number | A probability threshold for generating the output. If < 1.0, only the top tokens with cumulative probability >= top_p are kept (nucleus filtering, described in Holtzman et al., http://arxiv.org/abs/1904.09751). | 0.9 | — |
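The sketch below shows a minimal chat request using the parameters above. It assumes the model is served through the Replicate API and called with the official `replicate` Python client (`pip install replicate`, with `REPLICATE_API_TOKEN` set in the environment); the exact output shape (a list or an iterator of text chunks) can vary by client version.

```python
# A minimal sketch, assuming the `replicate` Python client and a
# REPLICATE_API_TOKEN environment variable.
import replicate

output = replicate.run(
    "ibm-granite/granite-3.3-8b-instruct",
    input={
        "messages": [
            {"role": "system", "content": "You are a concise assistant."},
            {"role": "user", "content": "Summarize nucleus sampling in one sentence."},
        ],
        "temperature": 0.6,           # default from the table above
        "top_p": 0.9,                 # nucleus-filtering threshold
        "max_completion_tokens": 256,
        "seed": 42,                   # fix the seed for reproducible sampling
    },
)

# For language models, `replicate.run` typically yields text chunks;
# join them to recover the full completion.
print("".join(output))
```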
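For the streaming example, setting `stream` requests a streaming response from the server; on the client side, the `replicate` library's `replicate.stream` helper consumes the resulting event stream. A sketch under the same client assumption as above:

```python
import replicate

# Stream tokens as they are generated; str(event) yields each text chunk.
for event in replicate.stream(
    "ibm-granite/granite-3.3-8b-instruct",
    input={
        "prompt": "Write a haiku about context windows.",
        "system_prompt": "You are a poet.",  # optional; the chat template default is reasonable
        "stream": True,
        "max_completion_tokens": 64,
    },
):
    print(str(event), end="")
print()
```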
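For the tool-calling example: tool definitions are passed through to the chat template, and per the table, pinning `tool_choice` to a specific function means encoding that choice as a JSON string. The OpenAI-style function schema below is an assumption about what the template accepts, and `get_weather` is a hypothetical tool, not part of this model.

```python
import json
import replicate

# Hypothetical weather tool in OpenAI-style function format; the exact
# schema the chat template expects is an assumption.
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the current weather for a city.",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    }
]

output = replicate.run(
    "ibm-granite/granite-3.3-8b-instruct",
    input={
        "messages": [{"role": "user", "content": "What's the weather in Oslo?"}],
        "tools": tools,
        # Per the table, a specific function choice is passed as a JSON string.
        "tool_choice": json.dumps(
            {"type": "function", "function": {"name": "get_weather"}}
        ),
        "temperature": 0,
    },
)
print("".join(output))
```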
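For the structured-output example: the table only says `response_format` is "an object specifying the format that the model must output" and does not pin down its schema. The OpenAI-style `json_schema` object below is a common convention and is purely an assumption here; the sketch also shows `stop` sequences in use.

```python
import replicate

output = replicate.run(
    "ibm-granite/granite-3.3-8b-instruct",
    input={
        "messages": [
            {"role": "user", "content": "Give the capital of Norway as JSON with a 'capital' key."}
        ],
        # OpenAI-style response_format is an assumption; the schema the
        # endpoint actually accepts may differ.
        "response_format": {
            "type": "json_schema",
            "json_schema": {
                "name": "capital",
                "schema": {
                    "type": "object",
                    "properties": {"capital": {"type": "string"}},
                    "required": ["capital"],
                },
            },
        },
        "temperature": 0,
        "stop": ["\n\n"],  # stop at the first blank line, per the `stop` parameter
    },
)
print("".join(output))
```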
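For the grounded-documents example: `documents` and `chat_template_kwargs` are both forwarded verbatim to the chat template. The `{"text": ...}` document shape and the `thinking` kwarg follow Granite 3.x chat-template conventions but are assumptions, not guaranteed by the table above.

```python
import replicate

output = replicate.run(
    "ibm-granite/granite-3.3-8b-instruct",
    input={
        "messages": [
            {"role": "user", "content": "According to the documents, when was the policy updated?"}
        ],
        # Documents go straight to the chat template; the {"text": ...}
        # shape follows Granite 3.x conventions (an assumption).
        "documents": [
            {"text": "Policy v2 took effect on 2024-06-01."},
            {"text": "Policy v1 was retired in May 2024."},
        ],
        # Extra kwargs also reach the template; Granite 3.3's `thinking`
        # switch for extended reasoning is likewise an assumption here.
        "chat_template_kwargs": {"thinking": True},
        "max_completion_tokens": 512,
    },
)
print("".join(output))
```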