CLIP Skip

Also known as: clip_skip

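CLIP Skip controls which layer of the CLIP text encoder supplies the prompt encoding. A value of 1 uses the final layer, and each step higher stops one layer earlier, so CLIP Skip 2 skips one final layer, CLIP Skip 3 skips two, and so on. Higher values tend to produce more stylized, abstract, or “anime-like” interpretations. Lower values tend to produce more literal, photorealistic results that closely follow the prompt.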

What It Does

CLIP is the text encoder that translates your prompt into a numerical representation the image model can understand. It has multiple layers, each capturing different levels of abstraction. The early layers capture broad concepts and composition, while the later layers capture specific, literal details.

When you skip one or more of the final layers (CLIP Skip 2 or higher), the model receives a more abstract, less literal interpretation of your prompt. This tends to produce images with a more stylized, artistic feel, which is why CLIP Skip 2 is popular in the anime and illustration community. At CLIP Skip 1 (no layers skipped), the model gets the full detailed encoding, which tends to give more photorealistic and prompt-literal outputs.
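The layer selection described above can be sketched in a few lines. This is a minimal illustration rather than CSF's implementation: the function name select_hidden_state and the toy 12-layer encoder are assumptions made for the example, and real pipelines do more work around this step than is shown here.

```python
from typing import List, Sequence


def select_hidden_state(
    hidden_states: Sequence[List[float]], clip_skip: int = 1
) -> List[float]:
    """Return the layer output that a given CLIP Skip value would pass on.

    hidden_states[i] holds the output of encoder layer i + 1, so the last
    entry is the final layer. CLIP Skip 1 returns that final entry, CLIP
    Skip 2 returns the one before it, and so on.
    """
    if not 1 <= clip_skip <= len(hidden_states):
        raise ValueError(
            f"clip_skip must be between 1 and {len(hidden_states)}, got {clip_skip}"
        )
    # Negative indexing counts back from the final layer.
    return hidden_states[-clip_skip]


# Toy stand-in for a 12-layer encoder (hypothetical data): each "output"
# is simply labeled with its layer number so the selection is easy to see.
layers = [[float(n)] for n in range(1, 13)]

final_layer = select_hidden_state(layers, clip_skip=1)        # layer 12
penultimate_layer = select_hidden_state(layers, clip_skip=2)  # layer 11
```

The numbering here follows this article's convention, where 1 means nothing is skipped. Other tools may number the setting differently, so check how a given tool counts before copying values across.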

Adjusting It in CSF

CLIP Skip is set in the Advanced Model Settings modal. Edit a production design, open its Image tab, click Advanced Settings, and pick the model to configure. Because this control only makes sense for models that use CLIP as their text encoder (SDXL, SD 1.5, and similar), it appears in the modal only when the selected model exposes a clip_skip parameter.

It renders as a small whole-number slider (typically 1–2), with the current value beside it. Save stores your choice for this design and model when it differs from the default; Reset to Defaults puts it back. Newer architectures like Flux don’t use CLIP this way, so you won’t find CLIP Skip on those models.
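The modal behavior described above (show the control only for models with a clip_skip parameter, store a value only when it differs from the default, and reset on request) could be modeled roughly as follows. Everything in this sketch is hypothetical: the function names, the in-memory dictionary, the default of 1, and the 1 to 2 range are assumptions chosen to match the description, not CSF's actual code or storage format.

```python
from typing import Dict, Tuple

DEFAULT_CLIP_SKIP = 1   # assumed default value
SLIDER_RANGE = (1, 2)   # assumed typical slider bounds

# Overrides are keyed by (design_id, model_id), mirroring "this design and model".
Overrides = Dict[Tuple[str, str], int]


def is_visible(model_params: Dict[str, object]) -> bool:
    """Show the control only when the model exposes a clip_skip parameter."""
    return "clip_skip" in model_params


def save(overrides: Overrides, design_id: str, model_id: str, value: int) -> None:
    """Store the value only when it differs from the default."""
    low, high = SLIDER_RANGE
    value = max(low, min(high, int(value)))  # keep it a whole number in range
    key = (design_id, model_id)
    if value == DEFAULT_CLIP_SKIP:
        # Assumption: saving the default clears any earlier override.
        overrides.pop(key, None)
    else:
        overrides[key] = value


def reset_to_defaults(overrides: Overrides, design_id: str, model_id: str) -> None:
    """Drop the stored override so the default applies again."""
    overrides.pop((design_id, model_id), None)


def effective_value(overrides: Overrides, design_id: str, model_id: str) -> int:
    """Return the stored override, or the default when none exists."""
    return overrides.get((design_id, model_id), DEFAULT_CLIP_SKIP)
```

Storing only non-default values keeps a design's saved settings small and means a later change to a model's default would carry through to every design that never overrode it.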

Value Ranges

CLIP Skip 1 (No skip)

Full encoder output. Most literal prompt interpretation. Best for photorealism and precise prompt following.

CLIP Skip 2

Slightly abstracted. The most popular non-default value, favored for anime and illustration because of its stylized results.

CLIP Skip 3+

Highly abstracted. The model works from a very loose interpretation of the prompt. Results can be unpredictable but artistically interesting.

Tips

CLIP Skip 1 is the default for most models and works best for photorealistic prompts.
Try CLIP Skip 2 for anime, illustration, or stylized art.
Values above 3 are rarely useful and can produce incoherent results. The slider typically stops at 2, so they are often not available anyway.
This parameter only applies to models that use CLIP as their text encoder (SDXL, SD 1.5, etc.), so it won’t appear for every model.