Reproducible Diffusers LoRA inference pipelines for adapters trained with ostris/ai-toolkit.
ai-toolkit-inference is a set of reference Diffusers pipelines plus an async FastAPI server for running LoRAs trained with ostris/ai-toolkit.
The core goal is training-preview ↔ inference parity. AI Toolkit samples are produced by a specific inference graph (scheduler wiring, resolution rules, LoRA injection, seeding); small differences in defaults between stacks can show up as visibly different outputs.
Each supported model id has its own page in the Model Catalog (with defaults, required inputs like `ctrl_img`, and preview-matching notes). If you're calling the server and you don't know which model ids are supported, start with `GET /v1/models` (documented under HTTP API).
## Repository layout

- `src/pipelines/` — one pipeline implementation per API model id
- `src/api/` + `src/server.py` — FastAPI server (`POST /v1/inference`, plus status/result polling)
- `docs/` — this documentation (published as GitHub Pages)

## HTTP API

- `POST /v1/inference` → returns `request_id`, `status_url`, `result_url`
- `GET /v1/requests/{request_id}/status`
- `GET /v1/requests/{request_id}/result`

The request schema is defined in `src/schemas/request.py`.
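The submit-then-poll flow above can be sketched as a small client. This is a sketch under assumptions: the request body fields (`model`, `prompt`) and the terminal status strings (`"succeeded"`, `"failed"`) are illustrative, not taken from `src/schemas/request.py`; check that file and the HTTP API docs for the authoritative shapes.

```python
"""Minimal client sketch for the async inference API.

Endpoint paths match the README; payload fields and status values
are assumptions -- see src/schemas/request.py for the real schema.
"""
import json
import time
import urllib.request


def submit(base_url: str, payload: dict) -> dict:
    """POST /v1/inference; returns the JSON body, which includes
    request_id, status_url, and result_url."""
    req = urllib.request.Request(
        f"{base_url}/v1/inference",
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def poll(fetch_status, interval: float = 2.0, timeout: float = 600.0) -> str:
    """Call fetch_status() until it returns a terminal state.

    fetch_status would typically GET /v1/requests/{request_id}/status
    and return the status string from the response.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        status = fetch_status()
        if status in ("succeeded", "failed"):  # assumed terminal states
            return status
        time.sleep(interval)
    raise TimeoutError("inference request did not finish in time")
```

Injecting `fetch_status` as a callable keeps the polling logic independent of the HTTP layer, so the same loop works with any client library.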
If AI Toolkit training samples look good but your inference looks different, the most common causes are:

- scheduler wiring that differs from the training preview graph
- `width`/`height` not rounded to a multiple of the model's `resolution_divisor`
- LoRA strength (`network_multiplier`) applied differently, or stale fused weights
- seeding differences
Each model page calls out the model-specific version of these pitfalls.
## Glossary

- **model / model id** — the API selector (e.g. `flux2`, `wan22_14b_i2v`). See Model Catalog or `GET /v1/models`.
- **ctrl_img** — control image input for edit/I2V models. In this server it's a string (URL or base64). See HTTP API.
- **resolution_divisor** — round `width`/`height` to a multiple of the model's `resolution_divisor`.
- **network_multiplier** (on `loras[]`) — LoRA strength. Some pipelines apply it dynamically; some fuse/merge weights and require a reload.

Note: if your main blocker is environment drift (CUDA/PyTorch/Diffusers versions, large model downloads, model-specific pipeline deps), it helps to run training + inference in a fixed runtime/container. RunComfy provides a managed runtime for AI Toolkit, but any reproducible GPU environment works — the reference behavior is still defined by this repo.
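The `resolution_divisor` rounding rule can be sketched as a small helper. Whether the pipelines round down, up, or to nearest is an assumption here (this sketch rounds down, never below one divisor); the per-model pages define the actual behavior.

```python
def round_to_multiple(value: int, divisor: int) -> int:
    """Snap a dimension to a multiple of the model's resolution_divisor.

    Assumption: rounds down, clamped to at least one divisor; check the
    model page for the pipeline's real rounding rule.
    """
    return max(divisor, (value // divisor) * divisor)


# e.g. for a model with resolution_divisor = 64:
# round_to_multiple(1000, 64) -> 960
# round_to_multiple(1024, 64) -> 1024
```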