datalab-to/ocr

Detect and transcribe text in images with accurate bounding boxes, layout analysis, reading order, and table recognition, in 90 languages.

Capabilities

No capability data available

Cost

Community model (estimated from hardware time)

Input Parameters

file required string

Input file. Must be one of: .pdf, .doc, .docx, .ppt, .pptx, .png, .jpg, .jpeg, .webp
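A client-side check against the extension list above can catch unsupported files before a request is sent. This is a minimal sketch: the helper name is_supported_file is hypothetical, and case-insensitive matching is an assumption, since the listing does not say how the service treats upper-case extensions.

```python
from pathlib import Path

# Extensions copied from the `file` parameter description.
ALLOWED_EXTENSIONS = {
    ".pdf", ".doc", ".docx", ".ppt", ".pptx",
    ".png", ".jpg", ".jpeg", ".webp",
}


def is_supported_file(name: str) -> bool:
    """Return True if the file name ends in one of the listed extensions.

    Hypothetical helper. Assumption: matching is case-insensitive.
    """
    # Path.suffix returns only the final extension, e.g. ".gz" for "a.tar.gz".
    return Path(name).suffix.lower() in ALLOWED_EXTENSIONS
```

For example, is_supported_file("scan.pdf") is true and is_supported_file("notes.txt") is false.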

max_pages integer

Maximum number of pages to process. Mutually exclusive with page_range: set one or the other, not both

min: 1

page_range string

Pages to process, given as a comma-separated list of page numbers and inclusive ranges such as 0,5-10,20. Example: '0,2-4' will process pages 0, 2, 3, and 4. Mutually exclusive with max_pages: set one or the other, not both
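The page_range syntax and the mutual-exclusion rule can be checked locally before a request is sent. The sketch below is an illustration only, not the service's own parser: parse_page_range and build_input are hypothetical helper names, and rejecting reversed ranges such as 4-2 is an assumption about what counts as valid.

```python
from typing import Any, Optional


def parse_page_range(spec: str) -> list[int]:
    """Expand a spec such as '0,2-4' into [0, 2, 3, 4] (hypothetical helper)."""
    pages: list[int] = []
    for part in spec.split(","):
        part = part.strip()
        if not part:
            raise ValueError(f"empty entry in page range {spec!r}")
        if "-" in part:
            start_text, end_text = part.split("-", 1)
            start, end = int(start_text), int(end_text)
            # Assumption: a reversed range is treated as an error.
            if start > end:
                raise ValueError(f"reversed range {part!r}")
            pages.extend(range(start, end + 1))  # ranges are inclusive
        else:
            pages.append(int(part))
    return pages


def build_input(
    file: str,
    max_pages: Optional[int] = None,
    page_range: Optional[str] = None,
) -> dict[str, Any]:
    """Assemble an input dict using the parameter names from this page."""
    if max_pages is not None and page_range is not None:
        raise ValueError("max_pages and page_range are mutually exclusive")
    if max_pages is not None and max_pages < 1:
        raise ValueError("max_pages must be at least 1")
    payload: dict[str, Any] = {"file": file}
    if max_pages is not None:
        payload["max_pages"] = max_pages
    if page_range is not None:
        parse_page_range(page_range)  # raises ValueError on malformed input
        payload["page_range"] = page_range
    return payload
```

With these helpers, parse_page_range("0,2-4") returns [0, 2, 3, 4], matching the example above, and build_input raises ValueError when both max_pages and page_range are supplied.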

return_pages boolean

Return detailed page information, including text lines, bounding boxes, polygons, and character-level data. When disabled, only text and page_count are returned

Default: false

skip_cache boolean

Bypass the server-side cache and force re-processing. By default, identical requests are cached to save time and cost. Enable this to get fresh results

Default: false

visualize boolean

Draw red polygons on the input image(s) to visualize detected text regions and return the annotated images

Default: false

Version: 3e6db0d5311d Updated: 2/26/2026 25.3K runs