bytedance/bagel

🥯 ByteDance Seed's Bagel: a unified multimodal AI that generates, edits, and understands images in one 7B-parameter model 🥯

Capabilities: Reference Images, Seed

Community model (estimated from hardware time)

Name	Type	Description	Default	Constraints
`prompt`*	string	Text prompt for generation, editing, or understanding	`—`	—
`cfg_img_scale`	number	Image guidance scale for preserving input image details	`1.5`	min: 1, max: 10
`cfg_renorm_min`	number	Minimum CFG renorm value	`1`	min: 0, max: 1
`cfg_renorm_type`	string	CFG renormalization method	`"global"`	one of: global, local, text_channel
`cfg_text_scale`	number	Text guidance scale for how closely to follow the prompt	`4`	min: 1, max: 20
`enable_thinking`	boolean	Enable chain-of-thought reasoning for better results	`false`	—
`image`	string(uri)	Input image for editing or understanding tasks	`—`	—
`num_inference_steps`	integer	Number of denoising steps	`50`	min: 1, max: 100
`output_format`	string	Output image format	`"webp"`	one of: webp, jpg, png
`output_quality`	integer	Image compression quality for lossy formats	`90`	min: 1, max: 100
`seed`	integer	Random seed for reproducible results	`—`	—
`task`	string	Task to perform	`"text-to-image"`	one of: text-to-image, image-editing, image-understanding
`timestep_shift`	number	Distribution of denoising steps between composition and details	`3`	min: 1, max: 10
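
The table above can be turned into a small client-side check before a request is sent. The following is a minimal sketch, not part of the model's own tooling: `build_input` is a hypothetical helper, the ranges, choices, and defaults are copied from the table, and the rule that `image` must be present for the editing and understanding tasks is an assumption drawn from the parameter description rather than a documented requirement.

```python
from typing import Any

# Numeric ranges copied from the parameter table (inclusive bounds).
NUMERIC_RANGES = {
    "cfg_img_scale": (1, 10),
    "cfg_renorm_min": (0, 1),
    "cfg_text_scale": (1, 20),
    "num_inference_steps": (1, 100),
    "output_quality": (1, 100),
    "timestep_shift": (1, 10),
}

# Allowed values for the enumerated string parameters.
CHOICES = {
    "cfg_renorm_type": {"global", "local", "text_channel"},
    "output_format": {"webp", "jpg", "png"},
    "task": {"text-to-image", "image-editing", "image-understanding"},
}

# Defaults listed in the table; `image` and `seed` have none.
DEFAULTS = {
    "cfg_img_scale": 1.5,
    "cfg_renorm_min": 1,
    "cfg_renorm_type": "global",
    "cfg_text_scale": 4,
    "enable_thinking": False,
    "num_inference_steps": 50,
    "output_format": "webp",
    "output_quality": 90,
    "task": "text-to-image",
    "timestep_shift": 3,
}


def build_input(prompt: str, **overrides: Any) -> dict:
    """Hypothetical helper: merge overrides into the defaults and validate them."""
    if not prompt:
        raise ValueError("prompt is required")

    unknown = set(overrides) - (set(DEFAULTS) | {"image", "seed"})
    if unknown:
        raise ValueError(f"unknown parameters: {sorted(unknown)}")

    payload = {"prompt": prompt, **DEFAULTS, **overrides}

    for name, (low, high) in NUMERIC_RANGES.items():
        if not low <= payload[name] <= high:
            raise ValueError(f"{name} must be between {low} and {high}")

    for name, allowed in CHOICES.items():
        if payload[name] not in allowed:
            raise ValueError(f"{name} must be one of {sorted(allowed)}")

    # Assumption: tasks other than text-to-image need an input image.
    if payload["task"] != "text-to-image" and "image" not in payload:
        raise ValueError(f"task {payload['task']!r} needs an image URI")

    return payload


# Example: a text-to-image request with a fixed seed and fewer steps.
example = build_input("a bagel on a plate", seed=42, num_inference_steps=30)
```

The returned dictionary is plain data, so it could be passed as the input object to whichever client is used to call the model; sending the request is deliberately left out of this sketch.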


Version: 7dd8def79e50 | Updated: 7/25/2026 | 272.8K runs