
Stable Diffusion WebUI Forge - Neo

[ Neo | Classic ]

UI

Stable Diffusion WebUI Forge is a platform on top of the original Stable Diffusion WebUI by AUTOMATIC1111, to make development easier, optimize resource management, speed up inference, and study experimental features.
The name "Forge" is inspired by "Minecraft Forge". This project aims to become the Forge of Stable Diffusion WebUI.

- lllyasviel
(paraphrased)


"Neo" mainly serves as an continuation for the "latest" version of Forge, which was built on Gradio 4.40.0 before lllyasviel became too busy... Additionally, this fork is focused on optimization and usability, with the main goal of being able to run the latest popular models via an easy-to-use GUI.

[!Tip] How to Install


Features [May.]

Most base features of the original Automatic1111 Webui should still function

New Features

[!Important] To use Flux.2-Klein for regular img2img, toggle the functionality in Settings/Stable Diffusion

  • Support Ernie-Image
    • ernie-image / ernie-image-turbo
  • Support Z-Image
    • z-image / z-image-turbo
  • Support Wan 2.2
    • use Refiner to achieve High Noise / Low Noise switching
      • enable Refiner in Settings/Refiner

[!Important] To export a video, you need to have FFmpeg installed

  • Support Mugen
    • display the Shift slider for xl preset in Settings/Presets/XL
  • Support advanced SDXL models

[!Note]

  • v-prediction: state_dict must include "v_pred"
  • Zero Terminal SNR: state_dict must include "ztsnr"
  • Rectified Flow: the model must include "rectified" in its path (e.g. file name or folder name)

[!Note] To be detected as an Edit model, the model must include "qwen" and "edit" in its path (e.g. file name or folder name)

[!Note] To be detected as a Kontext model, the model must include "kontext" in its path (e.g. file name or folder name)
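
The detection rules in the notes above can be sketched as plain substring checks. This is a minimal illustration, not the actual Forge Neo code: the function names and labels are hypothetical, and case-insensitive matching on the path is an assumption.

```python
# Hypothetical helpers illustrating the detection notes; not the real implementation.

# Keywords that must ALL appear somewhere in the model path (file or folder name).
PATH_RULES = {
    "rectified_flow": ("rectified",),
    "qwen_edit": ("qwen", "edit"),
    "kontext": ("kontext",),
}

# Keys that must be present in the checkpoint's state_dict.
STATE_DICT_MARKERS = {
    "v_prediction": "v_pred",
    "zero_terminal_snr": "ztsnr",
}


def detect_from_path(model_path: str) -> list[str]:
    """Return every label whose keywords all occur in the path."""
    path = model_path.lower()  # assumption: matching ignores case
    return [
        label
        for label, words in PATH_RULES.items()
        if all(word in path for word in words)
    ]


def detect_from_state_dict(keys) -> list[str]:
    """Return every label whose marker key exists in the state_dict keys."""
    keyset = set(keys)
    return [label for label, key in STATE_DICT_MARKERS.items() if key in keyset]
```

In practice this means renaming a file or moving it into a suitably named folder is enough to satisfy the path-based rules, while the v-prediction and Zero Terminal SNR markers have to be stored in the checkpoint itself.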

  • Implement ImageStitch Integrated
    • support Multi-Image Inputs for flux.2-klein / flux-kontext / qwen-image-edit
    • support FirstLastFrameToVideo for wan 2.2
  • Support Nunchaku (SVDQ) Models
    • flux-dev, flux-krea, flux-kontext, qwen-image, qwen-image-edit, z-image-turbo
    • only Flux and Qwen support LoRA currently
    • see Commandline
  • Support Lumina-Image-2.0
    • Neta-Lumina / NetaYume-Lumina
  • Support Chroma1-HD
  • Support MixedPrecision Models
    • fp4mixed / fp8mixed / mxfp8 / nvfp4 / fp8_scaled
  • Support Flux.2-Small-Decoder & Qwen2D VAE

[!Tip] Check out Download Models for where to get each model and the accompanying modules

[!Tip] Check out Inference References for how to use each model and the recommended parameters


  • Rewrite Preset System
    • now remembers the checkpoint/module selection and parameters for each preset
  • Support uv package manager
    • drastically speed up installation
    • requires manually installing uv
    • see Commandline
  • Support SageAttention, FlashAttention, fp16_accumulation, torch._scaled_mm
  • Implement Triton Kernel for matmul in torch.int8
    • speed up inference after quantization
    • enable by selecting int8 under Diffusion in Low Bits
  • Implement Radial Attention
    • speed up Wan 2.2
    • requires manually installing SpargeAttn
  • Implement fast state_dict switching for Refiner
    • enable in Settings/Refiner
  • Implement RescaleCFG
    • reduce burnt colors; mainly for v-pred checkpoints
    • enable in Settings/UI Alternatives
  • Implement MaHiRo
    • alternative CFG calculation; improve prompt adherence
    • enable in Settings/UI Alternatives
  • Implement Spectrum
    • training-free acceleration for all models
  • Implement Epsilon Scaling
    • enable in Settings/Stable Diffusion
  • Implement torch.compile
    • speed up inference after compilation
  • Implement alternative Prompt Box layouts
  • Implement tiled Conv2d for VAE
  • Implement full precision calculation for Mask blur blending
    • enable in Settings/img2img
  • Support TAESD live preview for all models
  • Support loading upscalers in half precision
    • speed up; reduce quality
    • enable in Settings/Upscaling
  • Support running tile composition on GPU
    • enable in Settings/Upscaling
  • Support (short) videos in Extras tab
  • Add support for .avif, .heif, and .jxl image formats
  • Automatically determine the optimal row count for X/Y/Z Plot
  • Update LLLite Controlnet
  • Support Union Controlnet
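
For the RescaleCFG entry in the list above, here is a minimal sketch of the general CFG rescaling idea: compute the usual guided prediction, rescale it so its standard deviation matches the conditional prediction, then blend the two. It works on flat lists of floats for readability. Whether Forge Neo uses exactly this form is an assumption, and the function name and the default `phi` are illustrative only.

```python
from statistics import pstdev


def rescale_cfg(cond, uncond, scale, phi=0.7):
    """Classifier-free guidance followed by a std-matching rescale.

    cond / uncond: conditional and unconditional predictions (flat lists).
    scale: guidance scale. phi: blend factor, 0 = plain CFG, 1 = fully rescaled.
    Raises ZeroDivisionError if the guided prediction is constant.
    """
    # Standard CFG: push the prediction away from the unconditional one.
    cfg = [u + scale * (c - u) for c, u in zip(cond, uncond)]
    # Bring the spread of the guided prediction back to that of `cond`.
    factor = pstdev(cond) / pstdev(cfg)
    # Blend the rescaled and the original guided prediction.
    return [phi * (x * factor) + (1.0 - phi) * x for x in cfg]
```

High guidance scales inflate the magnitude of the prediction, which is one source of the "burnt" look; matching the standard deviation counteracts that.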

Removed Features

  • SD2
  • SD3
  • Forge Spaces
  • Hypernetworks
  • CLIP Interrogator
  • Deepbooru Interrogator
  • Textual Inversion Training
  • Some built-in Extensions
  • Some built-in Scripts
  • Some Samplers & Schedulers
  • Some Compatibility Settings
  • Stealth Infotext

Optimizations

  • [Comfy] Rewrite the Backend (memory_management.py, ModelPatcher, attention.py, etc.)
  • No longer git clone any repository on fresh install
  • No longer install open-clip
  • Fix memory leak when switching checkpoints
  • Restore the ability to drag-and-drop images onto a gr.Image that already contains an image
  • Speed up launch time
  • Improve timer logs
  • Remove unused cmd_args
  • Remove unused args_parser
  • Remove unused shared_options
  • Remove legacy code
  • Fix some typos
  • Fix automatic Tiled VAE fallback
  • Fix Tiling for SD1 and SDXL
  • Pad conditioning for SDXL
  • Remove duplicated upscaler code
  • Update spandrel
    • support new upscaler architectures

[!Important] Put every upscaler (.pth / .safetensors) inside the ESRGAN folder

[!Tip] Check out OpenModelDB for where to get upscalers

  • Improve ForgeCanvas
    • brush adjustments
    • customization
    • deobfuscate
    • eraser
    • hotkeys
  • Optimize upscaler logic
  • Optimize certain operations in Spandrel
  • Optimize certain operations for VAE
  • Speed up model loading
  • Improve memory management
  • Improve color correction
  • Update the implementation for X/Y/Z Plot
  • Update the implementation for Soft Inpainting
  • Update the implementation for MultiDiffusion
  • Update the implementation for uni_pc and LCM samplers
  • Update the implementation of LoRAs
  • Revamp settings
    • improve formatting
    • update descriptions
  • Check for Extension updates in parallel
  • Move embeddings folder into models folder
  • Infotext Rewrite
    • allow switching Models and Modules
    • save emphasis properly
    • correct default values
  • ControlNet Rewrite
    • change Units to gr.Tab
    • remove multi-inputs, as they are "misleading"
  • Disable Refiner by default
    • enable again in Settings/Refiner
  • No longer install bitsandbytes by default
  • Improved non-Nvidia support
  • Lint & Format
  • Update Pillow
    • faster image processing
  • Update protobuf
    • faster insightface loading
  • Update to latest PyTorch
    • torch==2.11.0+cu130

[!Note] If your GPU does not support the latest PyTorch, manually install an older version of PyTorch

  • Update some packages to newer versions
  • Update recommended Python to 3.13.12
  • many more... :tm:

Commandline

These flags can be added after the set COMMANDLINE_ARGS= line in webui-user.bat (on the same line; separate each flag with a space)

[!Tip] Use python launch.py --help to see all available flags
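
As a small illustration of the "same line, space-separated" rule, the sketch below builds the line that would go into webui-user.bat. The helper name is hypothetical and the chosen flags and port number are only examples.

```python
def commandline_args_line(flags: list[str]) -> str:
    """Join all flags with single spaces on one `set COMMANDLINE_ARGS=` line."""
    return "set COMMANDLINE_ARGS=" + " ".join(flags)


# Example: use uv for installs, serve on a non-default port, enable the API.
print(commandline_args_line(["--uv", "--port 7861", "--api"]))
# prints: set COMMANDLINE_ARGS=--uv --port 7861 --api
```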

  • --xformers: Install the xformers package to speed up generation

[!Warning] xformers does not support RTX 50s

  • --port: Specify a server port to use
    • defaults to 7860
  • --api: Enable API access

by. Neo

  • --cuda-malloc: Improve memory allocation
  • --cuda-stream: Enable async weight offloading
  • --pin-shared-memory: Improve RAM utilization
  • --expandable-segments: Enable experimental PyTorch allocator (may prevent OutOfMemory errors on certain platforms)

  • --uv: Replace the python -m pip calls with uv pip to massively speed up package installation
  • --uv-symlink: Same as above; but additionally pass --link-mode symlink to the commands
    • significantly reduces installation size (~7 GB to ~100 MB)

[!Important] Using symlink means it will directly access the packages from the cache folders; refrain from clearing the cache if using this option

  • --model-ref: Point to a central models folder that contains all your models
    • said folder should contain subfolders like Stable-diffusion, Lora, VAE, ESRGAN, etc.

[!Important] This simply replaces the models folder rather than adding on top of it
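
Because --model-ref replaces the models folder outright, it can help to check the target layout before pointing the WebUI at it. The sketch below is a hypothetical pre-flight check; it only covers the subfolder names mentioned above, and the full set the WebUI expects is not listed here.

```python
from pathlib import Path

# Subfolder names taken from the text above; the real list is longer ("etc.").
EXPECTED_SUBFOLDERS = ("Stable-diffusion", "Lora", "VAE", "ESRGAN")


def missing_subfolders(model_ref: str, expected=EXPECTED_SUBFOLDERS) -> list[str]:
    """Return the expected subfolder names that do not exist under model_ref."""
    root = Path(model_ref)
    return [name for name in expected if not (root / name).is_dir()]
```

An empty result means every listed subfolder is present; anything returned is a folder to create or move before launching with the flag.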

  • --forge-ref-a1111-home: Point to an Automatic1111 installation to load its models folders
    • e.g. Stable-diffusion, text_encoder, etc.
  • --forge-ref-comfy-home: Point to a ComfyUI installation to load its models folders
    • e.g. diffusion_models, clip, etc.
  • --forge-ref-comfy-yaml: Point to the ComfyUI extra_model_paths.yaml to load its configurations
    • e.g. base_path, checkpoints, etc.

  • --sage: Install the sageattention package to speed up generation
    • will also attempt to install triton automatically
  • --flash: Install the flash_attn package to speed up generation
  • --nunchaku: Install the nunchaku package to run inference on SVDQ models
  • --bnb: Install the bitsandbytes package to do low-bits (nf4) inference
  • --onnxruntime-gpu: Install the onnxruntime with the latest GPU support

  • --fast-fp8: Use the torch._scaled_mm function when the model type is float8_e4m3fn
  • --fast-fp16: Enable the allow_fp16_accumulation option
  • --autotune: Enable the torch.backends.cudnn.benchmark option
    • this is slower in my experience...
  • --tiled-conv2d: Replace Conv2d ops with tiled variants
    • has greater reduction for SD1 and SDXL VAE; less for Wan VAE
    • 64 / 128 / 256 / 512

Installation

  1. Install git

  2. Clone the Repo

    git clone https://github.com/Haoming02/sd-webui-forge-classic sd-webui-forge-neo --branch neo
  3. Set up Python


Recommended Method
  • Install uv
  • Set up venv
    cd sd-webui-forge-neo
    uv venv venv --python 3.13 --seed
  • Add the --uv flag to webui-user.bat

Deprecated Method

  4. (Optional) Configure Commandline
  5. Launch the WebUI via webui-user.bat
  6. During the first launch, it will automatically install all the requirements
  7. Once the installation is finished, the WebUI will start in a browser automatically

[!Tip]

  • For Linux and macOS, refer to Wiki
  • For Docker (Nvidia), refer to Docker

[!Tip] Check out Extra Installations for how to install git, uv, and FFmpeg


Attention Functions

[!Important] The --xformers, --flash, and --sage args are only responsible for installing the packages; they do not control whether the respective attention function is used (this also means you can remove them once the packages are successfully installed)

[!Caution] Do not just blindly install all of them
Nowadays the native PyTorch scaled_dot_product_attention is usually as fast, and also more stable

Forge Neo tries to import the packages and automatically choose the first available attention function in the following order:

  1. SageAttention
  2. FlashAttention
  3. xformers
  4. PyTorch
  5. Basic

[!Note] To skip a specific attention, add the respective disable arg such as --disable-sage


Issues & Requests

  • Issues about removed features will simply be ignored
  • Issues that are obviously user error will simply be ignored
  • Issues regarding AMD GPU will simply be ignored
  • Issues running non-official models will simply be ignored
    • do not just randomly download every single finetune/quant you find
  • Issues about 3rd-party Extensions will simply be ignored
    • extensions should support the UI, not the other way around
  • Issues caused by StabilityMatrix will simply be ignored
    • only open an Issue if you can reproduce it on a clean install following the official Installation instructions

[!Caution]

  • If you post NSFW images/videos, you will immediately be banned
    • this is at my sole discretion; if you are unsure, just generate cats and dogs...

[!Tip] Check out the Wiki & FAQ


Special thanks to AUTOMATIC1111, lllyasviel, comfyanonymous, kijai, and city96,
along with the rest of the contributors,
for their invaluable efforts in the open-source image generation community


Buy me a Coffee ☕~
PayPal me 💳~


About

The good ol' Forge WebUI, now updated with new features~
Topics: ai, generative-ai, pytorch, stable-diffusion, stable-diffusion-webui, stable-diffusion-webui-forge

