fofr/deprecated-batch-image-captioning
Official
A wrapper model for captioning multiple images with GPT, Claude, or Gemini. Useful for preparing LoRA training data.
Capabilities
System Prompt
Cost
Community model (estimated from hardware time)
Input Parameters
| Name | Type | Description | Default | Constraints |
|---|---|---|---|---|
| image_zip_archive * | string (uri) | ZIP archive containing images to process | — | — |
| anthropic_api_key | string (password) | API key for Anthropic | — | — |
| caption_prefix | string | Optional prefix for image captions | "" | — |
| caption_suffix | string | Optional suffix for image captions | "" | — |
| google_generativeai_api_key | string (password) | API key for Google Generative AI | — | — |
| max_dimension | integer | Maximum dimension (width or height) for resized images | 1024 | — |
| message_prompt | string | Message prompt for image captioning | "Caption this image please" | — |
| model | string | AI model to use for captioning. Your OpenAI or Anthropic account will be charged for usage, see their pricing pages for details. | "gpt-4o-2024-08-06" | gpt-4o-2024-08-06, gpt-4o-mini, gpt-4o, gpt-4-turbo, claude-3-5-sonnet-20240620, claude-3-opus-20240229, claude-3-sonnet-20240229, claude-3-haiku-20240307, gemini-1.5-pro, gemini-1.5-flash |
| openai_api_key | string (password) | API key for OpenAI | — | — |
| resize_images_for_captioning | boolean | Whether to resize images for captioning. This makes captioning cheaper | true | — |
| system_prompt | string | System prompt for image analysis | "<br>Write a four sentence caption for this image. In the first sentence describe the style and type (painting, photo, etc) of the image. Describe in the remaining sentences the contents and composition of the image. Only use language that would be used to prompt a text to image model. Do not include usage. Comma separate keywords rather than using "or". Precise composition is important. Avoid phrases like "conveys a sense of" and "capturing the", just use the terms themselves.<br>Good examples are:<br>"Photo of an alien woman with a glowing halo standing on top of a mountain, wearing a white robe and silver mask in the futuristic style with futuristic design, sky background, soft lighting, dynamic pose, a sense of future technology, a science fiction movie scene rendered in the Unreal Engine."<br>"A scene from the cartoon series Masters of the Universe depicts Man-At-Arms wearing a gray helmet and gray armor with red gloves. He is holding an iron bar above his head while looking down on Orko, a pink blob character. Orko is sitting behind Man-At-Arms facing left on a chair. Both characters are standing near each other, with Orko inside a yellow chestplate over a blue shirt and black pants. The scene is drawn in the style of the Masters of the Universe cartoon series."<br>"An emoji, digital illustration, playful, whimsical. A cartoon zombie character with green skin and tattered clothes reaches forward with two hands, they have green skin, messy hair, an open mouth and gaping teeth, one eye is half closed."<br>" | — |
Version: d0adb15f4826
Updated: 2/26/2026
1.6K runs
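The input parameters listed above can be assembled into a request payload along the following lines. This is a minimal sketch, not the model author's code. The helper names `build_input` and `run_captioning` and the model-to-provider mapping are illustrative assumptions. The field names and model identifiers are copied from the parameter table. The sketch also assumes the official `replicate` Python client is installed and `REPLICATE_API_TOKEN` is set, and that the model can be run without pinning a version.

```python
# Model identifiers from the "model" row, mapped to the provider whose
# API key they presumably need (the mapping itself is an assumption).
PROVIDER_BY_MODEL = {
    "gpt-4o-2024-08-06": "openai",
    "gpt-4o-mini": "openai",
    "gpt-4o": "openai",
    "gpt-4-turbo": "openai",
    "claude-3-5-sonnet-20240620": "anthropic",
    "claude-3-opus-20240229": "anthropic",
    "claude-3-sonnet-20240229": "anthropic",
    "claude-3-haiku-20240307": "anthropic",
    "gemini-1.5-pro": "google",
    "gemini-1.5-flash": "google",
}

# Input field that carries each provider's API key (names from the table).
KEY_FIELD = {
    "openai": "openai_api_key",
    "anthropic": "anthropic_api_key",
    "google": "google_generativeai_api_key",
}

# Optional fields a caller may override; anything else is rejected.
OPTIONAL_FIELDS = {
    "caption_prefix",
    "caption_suffix",
    "max_dimension",
    "message_prompt",
    "resize_images_for_captioning",
    "system_prompt",
}


def build_input(image_zip_url, model="gpt-4o-2024-08-06", api_key=None, **overrides):
    """Return an input dict for the captioning model (hypothetical helper)."""
    if model not in PROVIDER_BY_MODEL:
        raise ValueError(f"unsupported model: {model!r}")
    unknown = set(overrides) - OPTIONAL_FIELDS
    if unknown:
        raise ValueError(f"unknown input fields: {sorted(unknown)}")

    payload = {"image_zip_archive": image_zip_url, "model": model}
    if api_key is not None:
        # Attach the key under the field matching the chosen model's provider.
        payload[KEY_FIELD[PROVIDER_BY_MODEL[model]]] = api_key
    payload.update(overrides)
    return payload


def run_captioning(payload):
    """Send the payload to Replicate (requires network and an API token)."""
    import replicate  # imported lazily so build_input works without the client

    return replicate.run("fofr/deprecated-batch-image-captioning", input=payload)
```

A call might look like `run_captioning(build_input("https://example.com/images.zip", model="claude-3-haiku-20240307", api_key="...", caption_prefix="TOK, "))`, where the URL and prefix are placeholders. Any field left out falls back to the defaults in the table.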