ibm-granite/granite-speech-4.1-2b
Granite Speech 4.1 2B is a compact and efficient speech-language model, specifically designed for multilingual automatic speech recognition (ASR) and bidirectional automatic speech translation (AST) for English, French, German, Spanish, Portuguese and Jap
Capabilities
Cost
Community model (estimated from hardware time)
Input Parameters
| Name | Type | Description | Default | Constraints |
|---|---|---|---|---|
add_generation_prompt | boolean | Add generation prompt. Passed to the chat template. Defaults to True. | true | — |
audio | array | Completion API Audio input. | — | — |
chat_template | string | A template to format the prompt with. If not specified, the chat template provided by the model will be used. | — | — |
chat_template_kwargs | object | Additional arguments to be passed to the chat template. | — | — |
frequency_penalty | number | Frequency penalty | — | — |
max_completion_tokens | integer | An upper bound for the number of tokens that can be generated for a completion, including visible output tokens and reasoning tokens. | — | — |
max_tokens | integer | max_tokens is deprecated in favor of the max_completion_tokens field. | — | — |
messages | array | Chat completion API messages. | — | — |
min_tokens | integer | The minimum number of tokens the model should generate as output. | 0 | — |
presence_penalty | number | Presence penalty | — | — |
prompt | string | Completion API user prompt. | — | — |
repetition_penalty | number | Repetition penalty | — | — |
seed | integer | Random seed. Leave unspecified to randomize the seed. | — | — |
stop | array | A list of sequences to stop generation at. For example, ["<end>","<stop>"] will stop generation at the first instance of "<end>" or "<stop>". | — | — |
stream | boolean | Request streaming response. | — | — |
system_prompt | string | Completion API system prompt. The chat template provides a good default. | — | — |
temperature | number | The value used to modulate the next token probabilities. | — | — |
top_k | integer | The number of highest probability tokens to consider for generating the output. If > 0, only keep the top k tokens with highest probability (top-k filtering). | — | — |
top_p | number | A probability threshold for generating the output. If < 1.0, only keep the top tokens with cumulative probability >= top_p (nucleus filtering). Nucleus filtering is described in Holtzman et al. (http://arxiv.org/abs/1904.09751). | — | — |
add_generation_prompt boolean Add generation prompt. Passed to the chat template. Defaults to True.
true audio array Completion API Audio input.
chat_template string A template to format the prompt with. If not specified, the chat template provided by the model will be used.
chat_template_kwargs object Additional arguments to be passed to the chat template.
frequency_penalty number Frequency penalty
max_completion_tokens integer An upper bound for the number of tokens that can be generated for a completion, including visible output tokens and reasoning tokens.
max_tokens integer max_tokens is deprecated in favor of the max_completion_tokens field.
messages array Chat completion API messages.
min_tokens integer The minimum number of tokens the model should generate as output.
0 presence_penalty number Presence penalty
prompt string Completion API user prompt.
repetition_penalty number Repetition penalty
seed integer Random seed. Leave unspecified to randomize the seed.
stop array A list of sequences to stop generation at. For example, ["<end>","<stop>"] will stop generation at the first instance of "<end>" or "<stop>".
stream boolean Request streaming response.
system_prompt string Completion API system prompt. The chat template provides a good default.
temperature number The value used to modulate the next token probabilities.
top_k integer The number of highest probability tokens to consider for generating the output. If > 0, only keep the top k tokens with highest probability (top-k filtering).
top_p number A probability threshold for generating the output. If < 1.0, only keep the top tokens with cumulative probability >= top_p (nucleus filtering). Nucleus filtering is described in Holtzman et al. (http://arxiv.org/abs/1904.09751).
25d7a190df06 Updated: 6/26/2026 96 runs
cinemasetfree.com