Reproducible Diffusers LoRA inference pipelines for adapters trained with ostris/ai-toolkit.
This is the index of supported API model ids for POST /v1/inference.
Each model page is meant to be a small engineering note: defaults that affect outputs, required inputs (e.g. ctrl_img), and the common causes of AI Toolkit preview vs inference mismatch.
If you don’t know which model ids your server build supports, call GET /v1/models (see HTTP API).
Run in the cloud (optional): to reproduce the examples on this page in a pinned runtime without local CUDA/driver setup (and reduce preview-vs-inference drift), use RunComfy's Cloud AI Toolkit (Train + Inference).
Model ids use underscores, while page URLs use hyphens: the page at /models/wan22-14b-i2v/ corresponds to model="wan22_14b_i2v". The supported ids are defined in src/schemas/models.py + src/pipelines/__init__.py. If you're trying to reproduce AI Toolkit training sample previews, start from the relevant model page below and align your request with the defaults it documents.
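The slug-to-id mapping is a straight hyphens-to-underscores swap. A minimal sketch of that convention (a hypothetical helper — the server derives ids from src/schemas/models.py, not from URLs):

```python
def page_slug_to_model_id(slug: str) -> str:
    """Map a docs URL like '/models/wan22-14b-i2v/' to the API model id.

    Hypothetical helper: the real mapping is defined server-side in
    src/schemas/models.py; this only mirrors the documented convention
    that model ids use underscores where page URLs use hyphens.
    """
    return slug.strip("/").removeprefix("models/").replace("-", "_")

print(page_slug_to_model_id("/models/wan22-14b-i2v/"))  # wan22_14b_i2v
```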
| Model page | API model id | Base checkpoint | Notes |
|---|---|---|---|
| FLUX (FLUX.1-dev) | flux | black-forest-labs/FLUX.1-dev | Pipeline file is named flux_dev.py, but the API model id is flux. |
| FLUX.2 (FLUX.2-dev) | flux2 | black-forest-labs/FLUX.2-dev | Uses AI Toolkit components and merges LoRA into the transformer at load time (scale is not dynamically adjustable). |
| FLUX.2-klein 4B | flux2_klein_4b | black-forest-labs/FLUX.2-klein-base-4B | Uses AI Toolkit components + Qwen3 text encoder; CFG/negative prompt are active. |
| FLUX.2-klein 9B | flux2_klein_9b | black-forest-labs/FLUX.2-klein-base-9B | Uses AI Toolkit components + Qwen3 text encoder; CFG/negative prompt are active. |
| Flex.1 (alpha) | flex1 | ostris/Flex.1-alpha | Implemented on top of diffusers.FluxPipeline (same family as FLUX.1-dev). |
| Flex.2 | flex2 | ostris/Flex.2-preview | Uses fuse_lora (weights are merged), so LoRA scale is fixed after load; changing loras[].network_multiplier requires a reload. |
| SDXL 1.0 | sdxl | stabilityai/stable-diffusion-xl-base-1.0 | Uses a DDPMScheduler config aligned to AI Toolkit defaults. |
| SD 1.5 | sd15 | stable-diffusion-v1-5/stable-diffusion-v1-5 | Uses a DDPMScheduler config aligned to AI Toolkit defaults. |
| Qwen Image | qwen_image | Qwen/Qwen-Image | Guidance is passed as true_cfg_scale in Diffusers. |
| Qwen Image (2512) | qwen_image_2512 | Qwen/Qwen-Image-2512 | Uses fuse_lora (weights are merged), so LoRA scale is not dynamically adjustable. |
| Z-Image | zimage | Tongyi-MAI/Z-Image | Base Z-Image model. Defaults (30 steps / CFG 4.0) are tuned for preview matching. |
| Z-Image Turbo | zimage_turbo | Tongyi-MAI/Z-Image-Turbo | Few-step model: the defaults (8 steps / CFG 1.0) are part of the model's intended regime. |
| Z-Image De-Turbo | zimage_deturbo | Tongyi-MAI/Z-Image-Turbo | Assembles the pipeline from separate components (transformer from ostris/Z-Image-De-Turbo). |
| Chroma | chroma | lodestones/Chroma1-Base | Requires AI Toolkit for the custom pipeline and quantization path. LoRA is merged before quantization. |
| HiDream I1 | hidream | HiDream-ai/HiDream-I1-Full | Heavy pipeline: loads Llama-3.1-8B-Instruct as an extra text encoder and fuses LoRA into the transformer. |
| Lumina Image 2.0 | lumina2 | Alpha-VLLM/Lumina-Image-2.0 | Uses a FlowMatch scheduler config aligned to AI Toolkit's Lumina2 sampler defaults. |
| OmniGen2 | omnigen2 | OmniGen2/OmniGen2 | Optional reference images (up to 3 via ctrl_img(_1..3)). Applies chat template for preview matching. |
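A minimal request body for one of the text-to-image ids above might look like the sketch below. Only model and loras[].network_multiplier appear on this page; prompt, steps, cfg_scale, and path are assumed field names — check the HTTP API page for the authoritative schema.

```python
import json

# Hypothetical body for POST /v1/inference. "model" and
# loras[].network_multiplier are documented on this page; the other
# field names are assumptions -- see the HTTP API page for the real schema.
payload = {
    "model": "zimage",                 # API model id from the table above
    "prompt": "a lighthouse at dusk",  # assumed field name
    "steps": 30,                       # assumed field name; 30 is the documented zimage default
    "cfg_scale": 4.0,                  # assumed field name; 4.0 is the documented zimage default
    "loras": [
        {
            "path": "my_adapter.safetensors",  # assumed field name
            "network_multiplier": 1.0,  # documented: fixed after load for fused models
        }
    ],
}
body = json.dumps(payload)
print(body)
```

Remember that for the fuse_lora models (flux2, flex2, qwen_image_2512), changing network_multiplier requires a pipeline reload.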
| Model page | API model id | Base checkpoint | Notes |
|---|---|---|---|
| FLUX Kontext (FLUX.1-Kontext-dev) | flux_kontext | black-forest-labs/FLUX.1-Kontext-dev | Requires a control image (ctrl_img). The server resizes it to match the requested output size. |
| Qwen Image Edit | qwen_image_edit | Qwen/Qwen-Image-Edit | Requires a control image (ctrl_img). Prompt encoding depends on the control image. |
| Qwen Image Edit Plus (2509) | qwen_image_edit_plus | Qwen/Qwen-Image-Edit-2509 | Multi-image edit. Supports up to 3 control images (ctrl_img_1..3). Prompt encoding uses those images. |
| Qwen Image Edit Plus (2511) | qwen_image_edit_plus_2511 | Qwen/Qwen-Image-Edit-2511 | Multi-image edit with fuse_lora (weights are merged). Supports up to 3 control images. |
| HiDream E1 (HiDream-E1-Full) | hidream_e1 | HiDream-ai/HiDream-E1-Full | Image editing. Requires ctrl_img. Uses fused LoRA (reload required for different scales). |
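For the editing models, ctrl_img (or ctrl_img_1..3 for the multi-image Plus variants) is required. A sketch of attaching control images, assuming the server accepts base64-encoded image bytes — the transport format (base64 vs multipart vs URL) is an assumption, not confirmed by this page:

```python
import base64

def encode_image(data: bytes) -> str:
    """Base64-encode raw image bytes for the request body.

    Assumption: the server accepts base64 strings for ctrl_img fields;
    verify the transport format on the HTTP API page.
    """
    return base64.b64encode(data).decode("ascii")

payload = {
    "model": "qwen_image_edit_plus",            # supports up to 3 control images
    "prompt": "replace the sky with a sunset",  # assumed field name
}
# ctrl_img_1..3: attach as many references as the edit needs (max 3).
for i, img_bytes in enumerate([b"\x89PNG...", b"\x89PNG..."], start=1):
    payload[f"ctrl_img_{i}"] = encode_image(img_bytes)

print(sorted(k for k in payload if k.startswith("ctrl_img")))
```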
| Model page | API model id | Base checkpoint | Notes |
|---|---|---|---|
| LTX-2 | ltx2 | Lightricks/LTX-2 | Unified T2V/I2V: if you provide ctrl_img it runs I2V; otherwise T2V. Outputs frames + audio (MP4). LoRA is converted and fused. |
| LTX-2.3 | ltx2.3 | dg845/LTX-2.3-Diffusers | Same as LTX-2 but with the LTX-2.3 base model. Uses VocoderWithBWE for improved audio. |
| Wan 2.1 T2V (14B) | wan21_14b | Wan-AI/Wan2.1-T2V-14B-Diffusers | Text-to-video. Uses diffusers.WanPipeline. |
| Wan 2.1 T2V (1.3B) | wan21_1b | Wan-AI/Wan2.1-T2V-1.3B-Diffusers | Text-to-video. Smaller checkpoint (1.3B) with the same API surface as 14B. |
| Wan 2.1 I2V (14B) | wan21_i2v_14b | Wan-AI/Wan2.1-I2V-14B-720P-Diffusers | Image-to-video. Requires ctrl_img. The 480p variant wan21_i2v_14b480p uses the same logic with a different base checkpoint. |
| Wan 2.2 T2V (A14B) | wan22_14b_t2v | Wan-AI/Wan2.2-T2V-A14B-Diffusers | MoE LoRA format: loras entries with transformer: "low" / "high" (either side optional). |
| Wan 2.2 I2V (A14B) | wan22_14b_i2v | ai-toolkit/Wan2.2-I2V-A14B-Diffusers-bf16 | Requires AI Toolkit for the custom Wan22Pipeline + first-frame conditioning. Uses the MoE LoRA format and requires ctrl_img. |
| Wan 2.2 TI2V (5B) | wan22_5b | Wan-AI/Wan2.2-TI2V-5B-Diffusers | Tier-2 model: requires AI Toolkit. Supports both T2V and I2V (provide ctrl_img for I2V). |
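The Wan 2.2 A14B MoE LoRA format splits the adapter across the low-noise and high-noise transformers. A sketch of a loras list targeting both experts — the transformer: "low" / "high" key is documented above; path is an assumed field name:

```python
# MoE LoRA entries for wan22_14b_t2v / wan22_14b_i2v: each entry targets
# one expert via "transformer" ("low" or "high"); either side may be omitted.
# "path" is an assumed field name -- check the HTTP API schema.
loras = [
    {"path": "adapter_low.safetensors",  "transformer": "low",  "network_multiplier": 1.0},
    {"path": "adapter_high.safetensors", "transformer": "high", "network_multiplier": 1.0},
]

# Sanity check: only the documented expert tags are used.
assert {entry["transformer"] for entry in loras} <= {"low", "high"}
print(len(loras), "expert adapters")
```

A request with only the "high" entry is also valid per the note above, since either side is optional.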