lucataco/bulk-video-caption
Official
Video Preprocessing tool for captioning multiple videos using GPT, Claude or Gemini
Capabilities
System Prompt
Cost
Community model (estimated from hardware time)
Input Parameters
| Name | Type | Description | Default | Constraints |
|---|---|---|---|---|
| video_zip_archive (required) | string (uri) | ZIP archive containing videos to process | — | — |
| anthropic_api_key | string (password) | API key for Anthropic | — | — |
| caption_prefix | string | Optional prefix for video captions | "" | — |
| caption_suffix | string | Optional suffix for video captions | "" | — |
| frames_to_extract | integer | Number of frames to extract from each video for analysis | 2 | — |
| google_generativeai_api_key | string (password) | API key for Google Generative AI | — | — |
| include_csv | boolean | Whether to include CSV in output | true | — |
| model | string | AI model to use for captioning | "gpt-4o" | One of: gpt-4o, gpt-4o-mini, gpt-4-turbo, claude-3-5-sonnet-20240620, claude-3-opus-20240229, claude-3-sonnet-20240229, claude-3-haiku-20240307, gemini-1.5-pro, gemini-1.5-flash |
| openai_api_key | string (password) | API key for OpenAI | — | — |
| system_prompt | string | System prompt for caption generation | "Analyze these frames from a video and write a detailed caption. Describe the type of video (e.g., animation, live-action footage, etc.). Focus on consistent elements across frames and any notable motion or action. Describe the main subjects, setting, and overall mood of the video. Use clear, descriptive language suitable for text-to-video generation." | — |
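
The parameters above can be assembled into an input payload before a run. The following is a minimal sketch, not the model's own client code: the helper names `key_field_for` and `build_input` are hypothetical, and the rule that the API key field must match the provider of the chosen `model` (by name prefix) is an assumption that the listing does not state explicitly. Defaults and allowed model names are taken from the table.

```python
from typing import Optional

# Allowed values for `model`, copied from the Constraints column.
ALLOWED_MODELS = (
    "gpt-4o", "gpt-4o-mini", "gpt-4-turbo",
    "claude-3-5-sonnet-20240620", "claude-3-opus-20240229",
    "claude-3-sonnet-20240229", "claude-3-haiku-20240307",
    "gemini-1.5-pro", "gemini-1.5-flash",
)


def key_field_for(model: str) -> str:
    """Return the API key parameter for a model.

    Assumption: the provider is inferred from the model-name prefix.
    """
    if model not in ALLOWED_MODELS:
        raise ValueError(f"unsupported model: {model!r}")
    if model.startswith("gpt-"):
        return "openai_api_key"
    if model.startswith("claude-"):
        return "anthropic_api_key"
    return "google_generativeai_api_key"


def build_input(
    video_zip_archive: str,
    api_key: str,
    model: str = "gpt-4o",
    frames_to_extract: int = 2,
    caption_prefix: str = "",
    caption_suffix: str = "",
    include_csv: bool = True,
    system_prompt: Optional[str] = None,
) -> dict:
    """Build an input dict using the documented parameter names and defaults."""
    payload = {
        "video_zip_archive": video_zip_archive,
        "model": model,
        "frames_to_extract": frames_to_extract,
        "caption_prefix": caption_prefix,
        "caption_suffix": caption_suffix,
        "include_csv": include_csv,
    }
    # Only the key for the selected provider is included (assumption).
    payload[key_field_for(model)] = api_key
    # Leave system_prompt out so the model's default prompt applies.
    if system_prompt is not None:
        payload["system_prompt"] = system_prompt
    return payload
```

A dict built this way could then be passed as the `input` argument of `replicate.run` in the Replicate Python client, with the model referenced as `lucataco/bulk-video-caption`; whether a version suffix is also required for this model is not confirmed by the listing.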
Version: bd610b3c0ecd
Updated: 2/26/2026
179 runs