
fofr/deprecated-batch-image-captioning

A wrapper model for captioning multiple images using GPT, Claude, or Gemini. Useful for LoRA training.

Capabilities

System Prompt

Cost

Community model (estimated from hardware time)

Input Parameters

image_zip_archive required string

ZIP archive containing images to process
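
As a minimal sketch of preparing the archive, the snippet below zips the image files found in a local folder using Python's standard library. The helper name `build_image_zip` and the extension list are illustrative assumptions; the page does not state which image formats are accepted or whether nested folders are allowed, so files are stored flat.

```python
import zipfile
from pathlib import Path

# Assumption: illustrative extensions only; the page does not list accepted formats.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".webp"}


def build_image_zip(image_dir, zip_path):
    """Zip every image file directly inside image_dir; return the archived names."""
    image_dir = Path(image_dir)
    names = []
    with zipfile.ZipFile(zip_path, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        for path in sorted(image_dir.iterdir()):
            if path.is_file() and path.suffix.lower() in IMAGE_EXTENSIONS:
                # Store each file flat, under its own file name (no parent folders).
                archive.write(path, arcname=path.name)
                names.append(path.name)
    return names
```

The resulting ZIP file would then need to be uploaded or referenced in whatever way the hosting platform expects for the `image_zip_archive` string input.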

anthropic_api_key string

API key for Anthropic

caption_prefix string

Optional prefix for image captions

Default: ""
caption_suffix string

Optional suffix for image captions

Default: ""
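
For LoRA training, a prefix is a common place for a trigger word. The sketch below shows one plausible way `caption_prefix` and `caption_suffix` could combine with a generated caption. It assumes plain concatenation with no separator added; the page does not document the exact behavior, and `wrap_caption` is a hypothetical helper, not part of the model.

```python
def wrap_caption(caption, caption_prefix="", caption_suffix=""):
    """Return the caption with the optional prefix and suffix attached."""
    # Assumption: plain concatenation; any separator (", " or " ") must be
    # included in the prefix or suffix by the caller.
    return f"{caption_prefix}{caption}{caption_suffix}"


# Hypothetical trigger word "TOK" supplied as a prefix.
print(wrap_caption("a red car parked on a street", caption_prefix="TOK, "))
```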
google_generativeai_api_key string

API key for Google Generative AI

max_dimension integer

Maximum dimension (width or height) for resized images

Default: 1024
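
To illustrate what a maximum dimension means in practice, here is a small sketch that computes the target size for an image. It assumes the aspect ratio is preserved and that images already within the limit are left untouched; the page does not confirm either detail, and `fit_within` is a hypothetical helper.

```python
def fit_within(width, height, max_dimension=1024):
    """Scale (width, height) so the longer side is at most max_dimension."""
    longest = max(width, height)
    if longest <= max_dimension:
        # Assumption: small images are not upscaled.
        return width, height
    scale = max_dimension / longest
    # Round to whole pixels and never drop below 1 pixel on either side.
    return max(1, round(width * scale)), max(1, round(height * scale))


print(fit_within(2048, 1024))  # (1024, 512)
print(fit_within(800, 600))    # (800, 600)
```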
message_prompt string

Message prompt for image captioning

Default: "Caption this image please"
model string

AI model to use for captioning. Your OpenAI or Anthropic account will be charged for usage; see their pricing pages for details.

Default: "gpt-4o-2024-08-06"
Options: gpt-4o-2024-08-06, gpt-4o-mini, gpt-4o, gpt-4-turbo, claude-3-5-sonnet-20240620, claude-3-opus-20240229, claude-3-sonnet-20240229, claude-3-haiku-20240307, gemini-1.5-pro, gemini-1.5-flash
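
Because each model family belongs to a different provider, the chosen `model` presumably determines which of the three API key inputs must be filled in. The sketch below infers that mapping from the model name prefixes listed here; the mapping and the helper name `api_key_parameter_for` are assumptions, not documented behavior.

```python
# Assumption: the provider is inferred from the model name prefix.
PROVIDER_KEY_BY_PREFIX = {
    "gpt-": "openai_api_key",
    "claude-": "anthropic_api_key",
    "gemini-": "google_generativeai_api_key",
}


def api_key_parameter_for(model):
    """Return the name of the API key input that the given model likely needs."""
    for prefix, parameter in PROVIDER_KEY_BY_PREFIX.items():
        if model.startswith(prefix):
            return parameter
    raise ValueError(f"Unrecognized model name: {model!r}")


print(api_key_parameter_for("claude-3-haiku-20240307"))  # anthropic_api_key
```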
openai_api_key string

API key for OpenAI

resize_images_for_captioning boolean

Whether to resize images before captioning. Resizing makes captioning cheaper.

Default: true
system_prompt string

System prompt for image analysis

Default: " Write a four sentence caption for this image. In the first sentence describe the style and type (painting, photo, etc) of the image. Describe in the remaining sentences the contents and composition of the image. Only use language that would be used to prompt a text to image model. Do not include usage. Comma separate keywords rather than using "or". Precise composition is important. Avoid phrases like "conveys a sense of" and "capturing the", just use the terms themselves. Good examples are: "Photo of an alien woman with a glowing halo standing on top of a mountain, wearing a white robe and silver mask in the futuristic style with futuristic design, sky background, soft lighting, dynamic pose, a sense of future technology, a science fiction movie scene rendered in the Unreal Engine." "A scene from the cartoon series Masters of the Universe depicts Man-At-Arms wearing a gray helmet and gray armor with red gloves. He is holding an iron bar above his head while looking down on Orko, a pink blob character. Orko is sitting behind Man-At-Arms facing left on a chair. Both characters are standing near each other, with Orko inside a yellow chestplate over a blue shirt and black pants. The scene is drawn in the style of the Masters of the Universe cartoon series." "An emoji, digital illustration, playful, whimsical. A cartoon zombie character with green skin and tattered clothes reaches forward with two hands, they have green skin, messy hair, an open mouth and gaping teeth, one eye is half closed." "
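
Putting the parameters together, the following sketch assembles an input dictionary using the names and defaults listed on this page, and rejects misspelled parameter names. The helper `build_inputs` and the example values are hypothetical; how the dictionary is submitted depends on the hosting platform's client, which is not shown here. `system_prompt` is omitted from the defaults so that the default above applies unless it is overridden.

```python
# Defaults copied from the parameter list on this page.
DEFAULTS = {
    "caption_prefix": "",
    "caption_suffix": "",
    "max_dimension": 1024,
    "message_prompt": "Caption this image please",
    "model": "gpt-4o-2024-08-06",
    "resize_images_for_captioning": True,
}

# Inputs that have no default listed here or that are optional secrets.
KNOWN_PARAMETERS = set(DEFAULTS) | {
    "anthropic_api_key",
    "google_generativeai_api_key",
    "openai_api_key",
    "system_prompt",
}


def build_inputs(image_zip_archive, **overrides):
    """Merge overrides onto the documented defaults; reject unknown names."""
    unknown = set(overrides) - KNOWN_PARAMETERS
    if unknown:
        raise ValueError(f"Unknown input parameters: {sorted(unknown)}")
    inputs = dict(DEFAULTS)
    inputs.update(overrides)
    inputs["image_zip_archive"] = image_zip_archive  # the only required input
    return inputs


# Hypothetical values for illustration only.
inputs = build_inputs(
    "images.zip",
    model="gemini-1.5-flash",
    google_generativeai_api_key="YOUR_KEY_HERE",
    caption_prefix="TOK, ",
)
print(inputs["model"], inputs["max_dimension"])
```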
Version: d0adb15f4826 Updated: 2/26/2026 1.6K runs