Stable Diffusion WebUI Forge - Neo
[ Neo | Classic ]

Stable Diffusion WebUI Forge is a platform on top of the original Stable Diffusion WebUI by AUTOMATIC1111, to make development easier, optimize resource management, speed up inference, and study experimental features.
The name "Forge" is inspired by "Minecraft Forge". This project aims to become the Forge of Stable Diffusion WebUI.
- lllyasviel
(paraphrased)
"Neo" mainly serves as a continuation of the "latest" version of Forge, which was built on Gradio 4.40.0 before lllyasviel became too busy... Additionally, this fork is focused on optimization and usability, with the main goal of being able to run the latest popular models via an easy-to-use GUI.
[!Tip] How to Install
Features [May.]
Most base features of the original Automatic1111 Webui should still function
New Features
- Support Anima
- Support Flux.2-Klein
- 4B / 9B (not FLUX.2-Dev)
[!Important] To use Flux.2-Klein for regular img2img, toggle the functionality in Settings/Stable Diffusion
- Support Ernie-Image
- ernie-image / ernie-image-turbo
- Support Z-Image
- z-image / z-image-turbo
- Support Wan 2.2
- use Refiner to achieve High Noise / Low Noise switching
- enable Refiner in Settings/Refiner
[!Important] To export a video, you need to have FFmpeg installed
- Support Mugen
- display the Shift slider for xl preset in Settings/Presets/XL
- Support advanced SDXL models
[!Note]
- v-prediction: state_dict must include "v_pred"
- Zero Terminal SNR: state_dict must include "ztsnr"
- Rectified Flow: the model must include "rectified" in its path (e.g. file name or folder name)
- Support Qwen-Image / Qwen-Image-Edit
[!Note] To be detected as an Edit model, the model must include "qwen" and "edit" in its path (e.g. file name or folder name)
- Support Flux Kontext
[!Note] To be detected as a Kontext model, the model must include "kontext" in its path (e.g. file name or folder name)
- Implement ImageStitch Integrated
- support Multi-Image Inputs for flux.2-klein / flux-kontext / qwen-image-edit
- support FirstLastFrameToVideo for wan 2.2
- Support Nunchaku (SVDQ) Models
- flux-dev, flux-krea, flux-kontext, qwen-image, qwen-image-edit, z-image-turbo
- only Flux and Qwen support LoRA currently
- see Commandline
- Support Lumina-Image-2.0
- Neta-Lumina / NetaYume-Lumina
- Support Chroma1-HD
- Support MixedPrecision Models
- fp4mixed / fp8mixed / mxfp8 / nvfp4 / fp8_scaled
- Support Flux.2-Small-Decoder & Qwen2D VAE
[!Tip] Check out Download Models for where to get each model and the accompanying modules
[!Tip] Check out Inference References for how to use each model and the recommended parameters
- Rewrite Preset System
- now remembers the checkpoint/module selection and parameters for each preset
- Support uv package manager
- drastically speed up installation
- requires manually installing uv
- see Commandline
- Support SageAttention, FlashAttention, fp16_accumulation, torch._scaled_mm
- see Commandline
- Implement Triton Kernel for matmul in torch.int8
- speed up inference after quantization
- enable by selecting int8 in the Diffusion in Low Bits option
- Implement Radial Attention
- speed up Wan 2.2
- requires manually installing SpargeAttn
- Implement fast state_dict switching for Refiner
- enable in Settings/Refiner
- Implement RescaleCFG
- reduce burnt colors; mainly for v-pred checkpoints
- enable in Settings/UI Alternatives
- Implement MaHiRo
- alternative CFG calculation; improve prompt adherence
- enable in Settings/UI Alternatives
- Implement Spectrum
- training-free acceleration for all models
- Implement Epsilon Scaling
- enable in Settings/Stable Diffusion
- Implement torch.compile
- speed up inference after compilation
- Implement alternative Prompt Box layouts
- Implement tiled Conv2d for VAE
- reduce memory usage; reduce speed
- see Commandline
- Implement full precision calculation for Mask blur blending
- enable in Settings/img2img
- Support TAESD live preview for all models
- Support loading upscalers in half precision
- speed up; reduce quality
- enable in Settings/Upscaling
- Support running tile composition on GPU
- enable in Settings/Upscaling
- Support (short) videos in Extras tab
- Add support for .avif, .heif, and .jxl image formats
- Automatically determine the optimal row count for X/Y/Z Plot
- Update LLLite Controlnet
- Support Union Controlnet
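The detection notes in the list above (advanced SDXL, Qwen-Image-Edit, Flux Kontext) boil down to key and substring checks. The following is only a minimal sketch of those rules, not the loader's actual code: the function names are hypothetical, and it assumes the markers are exact state_dict keys and that the path match is case-insensitive.

```python
def detect_sdxl_traits(state_dict_keys, model_path):
    """Map the advanced SDXL rules from the notes onto simple checks.

    Assumptions (not confirmed by the README): "v_pred" and "ztsnr" are
    exact keys of the state_dict, and the path check ignores case.
    """
    keys = set(state_dict_keys)
    path = str(model_path).lower()
    return {
        "v_prediction": "v_pred" in keys,
        "zero_terminal_snr": "ztsnr" in keys,
        "rectified_flow": "rectified" in path,
    }


def detect_edit_family(model_path):
    """Classify a path as Qwen-Image-Edit, Flux Kontext, or neither."""
    path = str(model_path).lower()
    # The path covers both the file name and any folder name.
    if "qwen" in path and "edit" in path:
        return "qwen-image-edit"
    if "kontext" in path:
        return "flux-kontext"
    return None
```

For example, a file stored under a folder named qwen with edit in its file name would be classified as an Edit model, while a checkpoint whose path contains neither marker is treated as a regular model.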
Removed Features
- SD2
- SD3
- Forge Spaces
- Hypernetworks
- CLIP Interrogator
- Deepbooru Interrogator
- Textual Inversion Training
- Some built-in Extensions
- Some built-in Scripts
- Some Samplers & Schedulers
- Some Compatibility Settings
- Stealth Infotext
Optimizations
- [Comfy] Rewrite the Backend (memory_management.py, ModelPatcher, attention.py, etc.)
- No longer git clone any repository on fresh install
- No longer install open-clip
- Fix memory leak when switching checkpoints
- Restore the ability to drag-and-drop images onto gr.Image that already contains an image
- Speed up launch time
- Improve timer logs
- Remove unused cmd_args
- Remove unused args_parser
- Remove unused shared_options
- Remove legacy code
- Fix some typos
- Fix automatic Tiled VAE fallback
- Fix Tiling for SD1 and SDXL
- Pad conditioning for SDXL
- Remove duplicated upscaler code
- Update spandrel
- support new upscaler architectures
[!Important] Put every upscaler (.pth / .safetensors) inside the ESRGAN folder
[!Tip] Check out OpenModelDB for where to get upscalers
- Improve ForgeCanvas
- brush adjustments
- customization
- deobfuscate
- eraser
- hotkeys
- Optimize upscaler logic
- Optimize certain operations in Spandrel
- Optimize certain operations for VAE
- Speed up model loading
- Improve memory management
- Improve color correction
- Update the implementation for X/Y/Z Plot
- Update the implementation for Soft Inpainting
- Update the implementation for MultiDiffusion
- Update the implementation for uni_pc and LCM samplers
- Update the implementation of LoRAs
- Revamp settings
- improve formatting
- update descriptions
- Check for Extension updates in parallel
- Move embeddings folder into models folder
- Infotext Rewrite
- allow switching Models and Modules
- save emphasis properly
- correct default values
- ControlNet Rewrite
- change Units to gr.Tab
- remove multi-inputs, as they are "misleading"
- Disable Refiner by default
- enable again in Settings/Refiner
- No longer install bitsandbytes by default
- see Commandline
- Improved non-Nvidia support
- Lint & Format
- Update Pillow
- faster image processing
- Update protobuf
- faster insightface loading
- Update to latest PyTorch
- torch==2.11.0+cu130
[!Note] If your GPU does not support the latest PyTorch, manually install an older version of PyTorch
- Update some packages to newer versions
- Update recommended Python to 3.13.12
- many more... ™
Commandline
These flags can be added after the set COMMANDLINE_ARGS= line in webui-user.bat (on the same line; separate the flags with spaces)
[!Tip] Use python launch.py --help to see all available flags
- --xformers: Install the xformers package to speed up generation
[!Warning] xformers does not support RTX 50s
- --port: Specify a server port to use
- defaults to 7860
- --api: Enable API access
by. Neo
- --cuda-malloc: Improve memory allocation
- --cuda-stream: Enable async weight offloading
- --pin-shared-memory: Improve RAM utilization
- --expandable-segments: Enable experimental PyTorch allocator (may prevent OutOfMemory errors on certain platforms)
- --uv: Replace the python -m pip calls with uv pip to massively speed up package installation
- requires uv to be installed first (see Installation)
- --uv-symlink: Same as above, but additionally pass --link-mode symlink to the commands
- significantly reduces installation size (~7 GB to ~100 MB)
[!Important] Using symlink means it will directly access the packages from the cache folders; refrain from clearing the cache if using this option
- --model-ref: Points to a central models folder that contains all your models
- said folder should contain subfolders like Stable-diffusion, Lora, VAE, ESRGAN, etc.
[!Important] This simply replaces the models folder rather than adding on top of it
- --forge-ref-a1111-home: Point to an Automatic1111 installation to load its models folders
- e.g. Stable-diffusion, text_encoder, etc.
- --forge-ref-comfy-home: Point to a ComfyUI installation to load its models folders
- e.g. diffusion_models, clip, etc.
- --forge-ref-comfy-yaml: Point to the ComfyUI extra_model_paths.yaml to load its configurations
- e.g. base_path, checkpoints, etc.
- --sage: Install the sageattention package to speed up generation
- will also attempt to install triton automatically
- --flash: Install the flash_attn package to speed up generation
- --nunchaku: Install the nunchaku package to run inference on SVDQ models
- --bnb: Install the bitsandbytes package to do low-bit (nf4) inference
- --onnxruntime-gpu: Install onnxruntime with the latest GPU support
- --fast-fp8: Use the torch._scaled_mm function when the model type is float8_e4m3fn
- --fast-fp16: Enable the allow_fp16_accumulation option
- --autotune: Enable the torch.backends.cudnn.benchmark option
- this is slower in my experience...
- --tiled-conv2d: Replace Conv2d ops with tiled variants
- greater memory reduction for SD1 and SDXL VAE; less for Wan VAE
- 64 / 128 / 256 / 512
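Since --model-ref replaces the models folder outright, it can help to confirm that the central folder has the expected layout before pointing the flag at it. The helper below is a hypothetical pre-check, not part of Forge Neo; it only tests for the four subfolders named in the flag description, and the "etc." there means the real list is longer.

```python
from pathlib import Path

# Subfolders named in the --model-ref description; this list is not exhaustive.
EXPECTED_SUBFOLDERS = ("Stable-diffusion", "Lora", "VAE", "ESRGAN")


def missing_model_subfolders(model_root, expected=EXPECTED_SUBFOLDERS):
    """Return the expected subfolders that do not exist under model_root."""
    root = Path(model_root)
    return [name for name in expected if not (root / name).is_dir()]
```

An empty result means every listed subfolder is present; any returned names are folders to create or move before launching with the flag.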
Installation
- Install git
- Clone the Repo
git clone https://github.com/Haoming02/sd-webui-forge-classic sd-webui-forge-neo --branch neo
- Set up Python
Recommended Method
- Install uv
- Set up venv
cd sd-webui-forge-neo
uv venv venv --python 3.13 --seed
- Add the --uv flag to webui-user.bat
Deprecated Method
- Install Python 3.13.12
- Remember to enable Add Python to PATH
- (Optional) Configure Commandline
- Launch the WebUI via webui-user.bat
- During the first launch, it will automatically install all the requirements
- Once the installation is finished, the WebUI will start in a browser automatically
[!Tip] Check out Extra Installations for how to install git, uv, and FFmpeg
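Before the first launch, it may be worth checking that the external tools mentioned above are reachable on PATH. This is a small, hypothetical pre-flight sketch using only the Python standard library; the split into required and optional tools is an assumption based on the steps above (git and uv for the recommended method, FFmpeg only for video export).

```python
import shutil

# Assumed grouping: git and uv for the recommended install, ffmpeg for video export.
REQUIRED_TOOLS = ("git", "uv")
OPTIONAL_TOOLS = ("ffmpeg",)


def find_missing_tools(tools, which=shutil.which):
    """Return the tools that cannot be found on PATH.

    The which parameter defaults to shutil.which and can be swapped out
    for testing.
    """
    return [tool for tool in tools if which(tool) is None]
```

Calling find_missing_tools(REQUIRED_TOOLS) returns an empty list when both tools are installed; anything it returns still needs to be installed (see Extra Installations).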
Attention Functions
[!Important] The --xformers, --flash, and --sage args are only responsible for installing the packages; they do not decide whether the respective attention is used (this also means you can remove them once the packages are successfully installed)
[!Caution] Do not just blindly install all of them
Nowadays the native PyTorch scaled_dot_product_attention is usually as fast, and also more stable
Forge Neo tries to import the packages and automatically chooses the first available attention function in the following order:
- SageAttention
- FlashAttention
- xformers
- PyTorch
- Basic
[!Note] To skip a specific attention, add the respective disable arg such as --disable-sage
Issues & Requests
- Issues about removed features will simply be ignored
- Issues that are obviously user error will simply be ignored
- Issues regarding AMD GPU will simply be ignored
- Issues running non-official models will simply be ignored
- do not just randomly download every single finetune/quant you find
- Issues about 3rd-party Extensions will simply be ignored
- extensions should support the UI, not the other way around
- Issues caused by StabilityMatrix will simply be ignored
- only open an Issue if you can reproduce it on a clean install following the official Installation instructions
[!Caution]
- If you post NSFW images/videos, you will immediately be banned
- the decision is at my sole discretion; if you are unsure, just generate cats and dogs...
Special thanks to AUTOMATIC1111, lllyasviel, comfyanonymous, kijai, and city96,
along with the rest of the contributors,
for their invaluable efforts in the open-source image generation community
Buy me a Coffee ☕~
PayPal me 💳~