
Windows AI Wheels

A curated collection of pre-compiled Python wheels for difficult-to-install AI/ML libraries on Windows.

Report a Broken Link · Request a New Wheel

Table of Contents
  1. About The Project
  2. Getting Started
  3. Available Wheels
     - PyTorch
     - Torchaudio
     - Flash Attention
     - xFormers
     - SageAttention
     - Nunchaku
     - Natten
     - triton
     - SpargeAttn
     - bitsandbytes

About The Project

This repository was created to address a common pain point for AI enthusiasts and developers on Windows: building complex Python packages from source. Libraries such as flash-attention and xformers are essential for high-performance AI tasks but often lack official pre-built wheels for Windows, forcing users into a complicated and error-prone compilation process.

The goal here is to provide a centralized, up-to-date collection of direct links to pre-compiled .whl files for these libraries, primarily for the ComfyUI community and other PyTorch users on Windows. This saves you time and lets you focus on what's important: creating amazing things with AI.

Find Windows AI Wheels

To make life even easier, use the Find Windows AI Wheels page to quickly search for the packages you need.

(back to top)

Getting Started

Follow these simple steps to use the wheels from this repository.

Prerequisites

  1. Python for Windows: Ensure you have a compatible Python version installed (PyTorch currently supports Python 3.9 - 3.14 on Windows). You can get it from the official Python website.

Installation

To install a wheel, use pip with the direct URL to the .whl file. Make sure to enclose the URL in quotes.

# Example of installing a specific flash-attention wheel
pip install "https://huggingface.co/lldacing/flash-attention-windows-wheel/resolve/main/flash_attn-2.7.4.post1+cu128torch2.7.0cxx11abiFALSE-cp312-cp312-win_amd64.whl"

[!TIP] Find the package you need in the Available Wheels section below, find the row that matches your environment (Python, PyTorch, CUDA version), and copy the link for the pip install command.
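
One way to check a link against your environment before installing is to read the tags in the wheel's file name. The sketch below is illustrative and not part of this repository: `parse_wheel_filename` and `matches_running_python` are hypothetical helper names, and the code assumes the standard wheel naming scheme (`name-version[-build]-python-abi-platform.whl`). It only compares the Python tag. PyTorch and CUDA versions appear in these file names only informally, inside the version string, so they still need to be checked by eye.

```python
import sys
from urllib.parse import unquote, urlparse


def parse_wheel_filename(url_or_name: str) -> dict:
    """Split a wheel file name (or a URL ending in one) into its tags."""
    # Keep only the last path component and decode sequences such as %2B ("+").
    name = unquote(urlparse(url_or_name).path.rsplit("/", 1)[-1])
    if not name.endswith(".whl"):
        raise ValueError(f"not a wheel file: {name}")
    parts = name[: -len(".whl")].split("-")
    # name-version[-build]-python-abi-platform gives 5 or 6 fields.
    if len(parts) not in (5, 6):
        raise ValueError(f"unexpected wheel file name: {name}")
    return {
        "distribution": parts[0],
        "version": parts[1],
        "python_tag": parts[-3],
        "abi_tag": parts[-2],
        "platform_tag": parts[-1],
    }


def matches_running_python(info: dict) -> bool:
    """Compare the wheel's CPython tag with the interpreter running this code."""
    tag = info["python_tag"]
    if not (tag.startswith("cp3") and tag[3:].isdigit()):
        return False  # tags such as "py3" are not handled by this sketch
    wheel_version = (3, int(tag[3:]))
    running = tuple(sys.version_info[:2])
    if info["abi_tag"] == "abi3":
        # abi3 wheels target the tagged version and newer ones.
        return running >= wheel_version
    return running == wheel_version


info = parse_wheel_filename(
    "flash_attn-2.7.4.post1+cu128torch2.7.0cxx11abiFALSE-cp312-cp312-win_amd64.whl"
)
print(info["python_tag"], info["platform_tag"], matches_running_python(info))
```

Run it with the same interpreter you will install into (for example, the Python bundled with your ComfyUI setup), since the comparison uses `sys.version_info` of the running process.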

(back to top)

Available Wheels

Here is the list of tracked packages.

🛠 PyTorch

The foundation of everything. Install this first from the official source.

For convenience, here are direct installation commands for specific versions on Linux/WSL with an NVIDIA GPU. For other configurations (CPU, macOS, ROCm), please use the official install page.

Stable Version (2.11.0)

This is the recommended version for most users.

| CUDA Version | Pip Install Command |
| --- | --- |
| CUDA 13.0 | `pip install torch torchvision --index-url https://download.pytorch.org/whl/cu130` |
| CUDA 12.8 | `pip install torch torchvision --index-url https://download.pytorch.org/whl/cu128` |
| CUDA 12.6 | `pip install torch torchvision --index-url https://download.pytorch.org/whl/cu126` |

Previous Stable Version (2.10.0)

| CUDA Version | Pip Install Command |
| --- | --- |
| CUDA 13.0 | `pip install "torch>=2.10.0.dev,<2.11.0" torchvision --index-url https://download.pytorch.org/whl/cu130` |
| CUDA 12.8 | `pip install "torch>=2.10.0.dev,<2.11.0" torchvision --index-url https://download.pytorch.org/whl/cu128` |
| CUDA 12.6 | `pip install "torch>=2.10.0.dev,<2.11.0" torchvision --index-url https://download.pytorch.org/whl/cu126` |

Previous Version (2.9.1)

| CUDA Version | Pip Install Command |
| --- | --- |
| CUDA 13.0 | `pip install "torch>=2.9.0.dev,<2.10.0" torchvision --index-url https://download.pytorch.org/whl/cu130` |
| CUDA 12.8 | `pip install "torch>=2.9.0.dev,<2.10.0" torchvision --index-url https://download.pytorch.org/whl/cu128` |
| CUDA 12.6 | `pip install "torch>=2.9.0.dev,<2.10.0" torchvision --index-url https://download.pytorch.org/whl/cu126` |

Previous Stable Version (2.8.0)

| CUDA Version | Pip Install Command |
| --- | --- |
| CUDA 12.9 | `pip install "torch>=2.8.0.dev,<2.9.0" torchvision --index-url https://download.pytorch.org/whl/cu129` |
| CUDA 12.8 | `pip install "torch>=2.8.0.dev,<2.9.0" torchvision --index-url https://download.pytorch.org/whl/cu128` |
| CUDA 12.6 | `pip install "torch>=2.8.0.dev,<2.9.0" torchvision --index-url https://download.pytorch.org/whl/cu126` |

Previous Stable Version (2.7.1)

| CUDA Version | Pip Install Command |
| --- | --- |
| CUDA 12.8 | `pip install torch==2.7.1 torchvision==0.22.1 torchaudio==2.7.1 --index-url https://download.pytorch.org/whl/cu128` |
| CUDA 12.6 | `pip install torch==2.7.1 torchvision==0.22.1 torchaudio==2.7.1 --index-url https://download.pytorch.org/whl/cu126` |
| CUDA 11.8 | `pip install torch==2.7.1 torchvision==0.22.1 torchaudio==2.7.1 --index-url https://download.pytorch.org/whl/cu118` |
| CPU only | `pip install torch==2.7.1 torchvision==0.22.1 torchaudio==2.7.1 --index-url https://download.pytorch.org/whl/cpu` |

Nightly Versions

Use these for access to the latest features, but expect potential instability.

PyTorch 2.12 (Nightly)

| CUDA Version | Pip Install Command |
| --- | --- |
| CUDA 13.0 | `pip install --pre torch torchvision --index-url https://download.pytorch.org/whl/nightly/cu130` |
| CUDA 12.8 | `pip install --pre torch torchvision --index-url https://download.pytorch.org/whl/nightly/cu128` |
| CUDA 12.6 | `pip install --pre torch torchvision --index-url https://download.pytorch.org/whl/nightly/cu126` |
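
The install commands in this section all follow one pattern: a CUDA version such as 12.8 becomes the `cu128` suffix of the index URL, `nightly/` is inserted for pre-release builds, and `cpu` is used for CPU-only builds. A minimal Python sketch of that mapping follows. The function name `pytorch_index_url` is hypothetical, and the sketch does not know which CUDA versions are actually published for a given PyTorch release, so check the official install page before relying on a generated URL.

```python
from typing import Optional

PYTORCH_WHEEL_INDEX = "https://download.pytorch.org/whl"


def pytorch_index_url(cuda_version: Optional[str] = None, nightly: bool = False) -> str:
    """Build the --index-url value used by the pip commands in this section."""
    base = f"{PYTORCH_WHEEL_INDEX}/nightly" if nightly else PYTORCH_WHEEL_INDEX
    if cuda_version is None:
        return f"{base}/cpu"  # CPU-only builds
    major, minor = cuda_version.split(".")
    return f"{base}/cu{major}{minor}"  # "12.8" becomes "cu128"


print("pip install torch torchvision --index-url", pytorch_index_url("12.8"))
print("pip install --pre torch torchvision --index-url", pytorch_index_url("13.0", nightly=True))
```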

▲▼▲▼▲▼▲▼▲▼▲▼▲▼▲▼▲▼▲▼▲▼▲▼▲▼▲▼▲

🛠 Torchaudio

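| Package Version | PyTorch Ver | Python Ver | CUDA Ver | Download Link |
| --- | --- | --- | --- | --- |
| 2.11.0a0 | 2.12.0 | 3.14 | 13.0 | Link |
| 2.11.0a0 | 2.12.0 | 3.13 | 13.0 | Link |
| 2.11.0a0 | 2.11.0 | 3.14 | 13.0 | Link |
| 2.11.0a0 | 2.11.0 | 3.13 | 13.0 | Link |
| 2.11.0a0 | 2.10.0 | 3.13 | 13.0 | Link |
| 2.11.0a0 | 2.10.0 | 3.12 | 13.0 | Link |
| 2.11.0a0 | 2.10.0 | 3.13 | 12.8 | Link |
| 2.8.0a0 | 2.9.0 | 3.12 | 12.8 | Link |
| 2.8.0a0 | 2.9.0 | 3.12 | 12.8 | Link |

# Torchcodec
pip install torchcodec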

▲▼▲▼▲▼▲▼▲▼▲▼▲▼▲▼▲▼▲▼▲▼▲▼▲▼▲▼▲

🛠 Flash Attention

High-performance attention implementation.

GitHub HuggingFace HuggingFace GitHub

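| Package Version | PyTorch Ver | Python Ver | CUDA Ver | CXX11 ABI | Download Link |
| --- | --- | --- | --- | --- | --- |
| 2.8.4 | 2.12.0 | 3.14 | 13.0 | | Link |
| 2.8.4 | 2.12.0 | 3.13 | 13.0 | | Link |
| 2.8.4 | 2.11.0 | 3.14 | 13.0 | | Link |
| 2.8.4 | 2.11.0 | 3.13 | 13.0 | | Link |
| 2.8.3 | 2.11.0 | 3.13 | 13.0 | | Link |
| 2.8.3 | 2.11.0 | 3.12 | 13.0 | | Link |
| 2.8.3 | 2.10.0 | 3.13 | 13.0 | | Link |
| 2.8.3 | 2.10.0 | 3.13 | 13.0 | | Link |
| 2.8.3 | 2.10.0 | 3.12 | 13.0 | | Link |
| 2.8.3 | 2.10.0 | 3.12 | 13.0 | | Link |
| 2.8.3 | 2.10.0 | 3.13 | 12.8 | | Link |
| 2.8.3 | 2.9.1 | 3.13 | 13.0 | | Link |
| 2.8.3 | 2.9.1 | 3.12 | 13.0 | | Link |
| 2.8.3 | 2.9.1 | 3.13 | 12.8 | | Link |
| 2.8.3 | 2.9.0 | 3.13 | 13.0 | | Link |
| 2.8.3 | 2.9.0 | 3.12 | 13.0 | | Link |
| 2.8.3 | 2.9.0 | 3.13 | 12.9 | | Link |
| 2.8.3 | 2.9.0 | 3.12 | 12.8 | | Link |
| 2.8.3 | 2.8.0 | 3.12 | 12.8 | | Link |
| 2.8.2 | 2.9.0 | 3.12 | 12.8 | | Link |
| 2.8.2 | 2.8.0 | 3.12 | 12.8 | | Link |
| 2.8.2 | 2.8.0 | 3.11 | 12.8 | | Link |
| 2.8.2 | 2.8.0 | 3.10 | 12.8 | | Link |
| 2.8.2 | 2.7.0 | 3.12 | 12.8 | | Link |
| 2.8.2 | 2.7.0 | 3.11 | 12.8 | | Link |
| 2.8.2 | 2.7.0 | 3.10 | 12.8 | | Link |
| 2.8.1 | 2.8.0 | 3.12 | 12.8 | | Link |
| 2.8.0.post2 | 2.8.0 | 3.12 | 12.8 | | Link |
| 2.7.4.post1 | 2.8.0 | 3.12 | 12.8 | | Link |
| 2.7.4.post1 | 2.8.0 | 3.10 | 12.8 | | Link |
| 2.7.4.post1 | 2.7.0 | 3.12 | 12.8 | | Link |
| 2.7.4.post1 | 2.7.0 | 3.11 | 12.8 | | Link |
| 2.7.4.post1 | 2.7.0 | 3.10 | 12.8 | | Link |
| 2.7.4 | 2.8.0 | 3.12 | 12.8 | | Link |
| 2.7.4 | 2.8.0 | 3.11 | 12.8 | | Link |
| 2.7.4 | 2.8.0 | 3.10 | 12.8 | | Link |
| 2.7.4 | 2.7.0 | 3.12 | 12.8 | | Link |
| 2.7.4 | 2.7.0 | 3.11 | 12.8 | | Link |
| 2.7.4 | 2.7.0 | 3.10 | 12.8 | | Link |
| 2.7.4 | 2.6.0 | 3.12 | 12.6 | | Link |
| 2.7.4 | 2.6.0 | 3.11 | 12.6 | | Link |
| 2.7.4 | 2.6.0 | 3.10 | 12.6 | | Link |
| 2.7.4 | 2.6.0 | 3.12 | 12.4 | | Link |
| 2.7.4 | 2.6.0 | 3.11 | 12.4 | | Link |
| 2.7.4 | 2.6.0 | 3.10 | 12.4 | | Link |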

▲▼▲▼▲▼▲▼▲▼▲▼▲▼▲▼▲▼▲▼▲▼▲▼▲▼▲▼▲

🛠 Flash Attention 3

Next-generation Flash Attention with improved performance and features.

GitHub GitHub

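| Package Version | PyTorch Ver | Python Ver | CUDA Ver | CXX11 ABI | Download Link |
| --- | --- | --- | --- | --- | --- |
| 3.0.0 | 2.10 | 3.9+ | 13.0 | | Link |
| 3.0.0 | 2.10 | 3.9+ | 13.0 | | Link |
| 3.0.0 | 2.10 | 3.9+ | 12.8 | | Link |
| 3.0.0 | 2.10 | 3.9+ | 12.8 | | Link |
| 3.0.0 | 2.9 | 3.9+ | 13.0 | | Link |
| 3.0.0 | 2.9 | 3.9+ | 12.8 | | Link |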

▲▼▲▼▲▼▲▼▲▼▲▼▲▼▲▼▲▼▲▼▲▼▲▼▲▼▲▼▲

🛠 Flash Attention 4

Latest Flash Attention implementation with cutting-edge optimizations.

(No wheels available - package not tracked)

▲▼▲▼▲▼▲▼▲▼▲▼▲▼▲▼▲▼▲▼▲▼▲▼▲▼▲▼▲

🛠 xformers

Another library for memory-efficient attention and other optimizations.

GitHub PyTorch

[!NOTE] PyTorch provides official pre-built wheels for xformers. You can often install it with `pip install xformers`.

| CUDA Version | Install |
| --- | --- |
| CUDA 12.6 | `pip3 install -U xformers --index-url https://download.pytorch.org/whl/cu126` |
| CUDA 12.8 | `pip3 install -U xformers --index-url https://download.pytorch.org/whl/cu128` |
| CUDA 13.0 | `pip3 install -U xformers --index-url https://download.pytorch.org/whl/cu130` |
ABI3 version, any Python 3.9-3.12

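| Package Version | PyTorch Ver | Python Ver | CUDA Ver | Download Link |
| --- | --- | --- | --- | --- |
| 0.0.34 | 2.11 | 3.9+ | 13.0 | Link |
| 0.0.34 | 2.10 | 3.9+ | 13.0 | Link |
| 0.0.34 | 2.10 | 3.9+ | 13.0 | Link |
| 0.0.33 | 2.10 | 3.9+ | 13.0 | Link |
| 0.0.33 | 2.9 | 3.9+ | 13.0 | Link |
| 0.0.32.post2 | 2.8.0 | 3.9+ | 12.9 | Link |
| 0.0.32.post2 | 2.8.0 | 3.9+ | 12.8 | Link |
| 0.0.32.post2 | 2.8.0 | 3.9+ | 12.6 | Link |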

▲▼▲▼▲▼▲▼▲▼▲▼▲▼▲▼▲▼▲▼▲▼▲▼▲▼▲▼▲

🛠 SageAttention

GitHub GitHub HuggingFace

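| Package Version | PyTorch Ver | Python Ver | CUDA Ver | Download Link |
| --- | --- | --- | --- | --- |
| 2.1.1 | 2.8.0 | 3.12 | 12.8 | Link |
| 2.1.1 | 2.7.0 | 3.10 | 12.8 | Link |
| 2.1.1 | 2.6.0 | 3.13 | 12.6 | Link |
| 2.1.1 | 2.6.0 | 3.12 | 12.6 | Link |
| 2.1.1 | 2.6.0 | 3.12 | 12.6 | Link |
| 2.1.1 | 2.6.0 | 3.11 | 12.6 | Link |
| 2.1.1 | 2.6.0 | 3.10 | 12.6 | Link |
| 2.1.1 | 2.6.0 | 3.9 | 12.6 | Link |
| 2.1.1 | 2.5.1 | 3.12 | 12.4 | Link |
| 2.1.1 | 2.5.1 | 3.11 | 12.4 | Link |
| 2.1.1 | 2.5.1 | 3.10 | 12.4 | Link |
| 2.1.1 | 2.5.1 | 3.9 | 12.4 | Link |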

◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇ ◇

🛠 SageAttention 2.2 (SageAttention2++)

[!NOTE] Only supports CUDA >= 12.8, therefore PyTorch >= 2.7.

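| Package Version | PyTorch Ver | Python Ver | CUDA Ver | Download Link |
| --- | --- | --- | --- | --- |
| 2.2.0.post4 | 2.9.0+ | 3.9+ | 13.0 | Link |
| 2.2.0.post4 | 2.9.0+ | 3.9+ | 12.8 | Link |
| 2.2.0.post3 | 2.10.0 | 3.12 | 13.0 | Link |
| 2.2.0.post3 | 2.10.0 | 3.13 | 12.8 | Link |
| 2.2.0.post3 | 2.10.0 | 3.12 | 12.8 | Link |
| 2.2.0.post3 | 2.9.0 | 3.13 | 13.0 | Link |
| 2.2.0.post3 | 2.9.0 | 3.13 | 12.9 | Link |
| 2.2.0.post3 | 2.9.0 | 3.9+ | 12.9 | Link |
| 2.2.0.post3 | 2.9.0 | 3.13 | 12.8 | Link |
| 2.2.0.post3 | 2.9.0 | 3.9+ | 12.8 | Link |
| 2.2.0.post3 | 2.8.0 | 3.13 | 12.9 | Link |
| 2.2.0.post3 | 2.8.0 | 3.9+ | 12.9 | Link |
| 2.2.0.post3 | 2.8.0 | 3.13 | 12.8 | Link |
| 2.2.0.post3 | 2.8.0 | 3.9+ | 12.8 | Link |
| 2.2.0.post3 | 2.7.1 | 3.9+ | 12.8 | Link |
| 2.2.0.post3 | 2.6.0 | 3.9+ | 12.6 | Link |
| 2.2.0.post3 | 2.5.1 | 3.9+ | 12.4 | Link |
| 2.2.0.post2 | 2.9.0 | 3.9+ | 12.8 | Link |
| 2.2.0.post2 | 2.8.0 | 3.9+ | 12.8 | Link |
| 2.2.0.post2 | 2.7.1 | 3.9+ | 12.8 | Link |
| 2.2.0.post2 | 2.6.0 | 3.9+ | 12.6 | Link |
| 2.2.0.post2 | 2.5.1 | 3.9+ | 12.4 | Link |
| 2.2.0 | 2.8.0 | 3.13 | 12.8 | Link |
| 2.2.0 | 2.8.0 | 3.12 | 12.8 | Link |
| 2.2.0 | 2.8.0 | 3.11 | 12.8 | Link |
| 2.2.0 | 2.8.0 | 3.10 | 12.8 | Link |
| 2.2.0 | 2.8.0 | 3.9 | 12.8 | Link |
| 2.2.0 | 2.7.1 | 3.13 | 12.8 | Link |
| 2.2.0 | 2.7.1 | 3.12 | 12.8 | Link |
| 2.2.0 | 2.7.1 | 3.11 | 12.8 | Link |
| 2.2.0 | 2.7.1 | 3.10 | 12.8 | Link |
| 2.2.0 | 2.7.1 | 3.9 | 12.8 | Link |

🛠 SageAttention 3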

GitHub

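| Package Version | PyTorch Ver | Python Ver | CUDA Ver | Download Link |
| --- | --- | --- | --- | --- |
| 1.0.0 | 2.9.1 | 3.13 | 13.0 | Link |
| 1.0.0 | 2.9.1 | 3.12 | 13.0 | Link |
| 1.0.0 | 2.8.0 | 3.13 | 12.8 | Link |
| 1.0.0 | 2.8.0 | 3.12 | 12.8 | Link |
| 1.0.0 | 2.8.0 | 3.11 | 12.8 | Link |
| 1.0.0 | 2.7.1 | 3.13 | 12.8 | Link |
| 1.0.0 | 2.7.1 | 3.12 | 12.8 | Link |
| 1.0.0 | 2.7.1 | 3.11 | 12.8 | Link |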

▲▼▲▼▲▼▲▼▲▼▲▼▲▼▲▼▲▼▲▼▲▼▲▼▲▼▲▼▲

🛠 Nunchaku

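| Package Version | PyTorch Ver | Python Ver | Download Link |
| --- | --- | --- | --- |
| 1.2.0 | 2.11 | 3.13 | Link |
| 1.2.0 | 2.11 | 3.12 | Link |
| 1.2.0 | 2.11 | 3.11 | Link |
| 1.2.0 | 2.11 | 3.10 | Link |
| 1.2.0 | 2.9 | 3.13 | Link |
| 1.2.0 | 2.9 | 3.12 | Link |
| 1.2.0 | 2.9 | 3.11 | Link |
| 1.2.0 | 2.9 | 3.10 | Link |
| 1.2.0 | 2.8 | 3.13 | Link |
| 1.2.0 | 2.8 | 3.12 | Link |
| 1.2.0 | 2.8 | 3.11 | Link |
| 1.2.0 | 2.7 | 3.13 | Link |
| 1.2.0 | 2.7 | 3.12 | Link |
| 1.2.0 | 2.7 | 3.11 | Link |
| 1.0.2 | 2.10 | 3.13 | Link |
| 1.0.2 | 2.10 | 3.12 | Link |
| 1.0.2 | 2.10 | 3.11 | Link |
| 1.0.2 | 2.10 | 3.10 | Link |
| 1.0.2 | 2.9 | 3.13 | Link |
| 1.0.2 | 2.9 | 3.12 | Link |
| 1.0.2 | 2.9 | 3.11 | Link |
| 1.0.2 | 2.9 | 3.10 | Link |
| 1.0.2 | 2.8 | 3.13 | Link |
| 1.0.2 | 2.8 | 3.12 | Link |
| 1.0.2 | 2.8 | 3.11 | Link |
| 1.0.2 | 2.8 | 3.10 | Link |
| 1.0.2 | 2.7 | 3.13 | Link |
| 1.0.2 | 2.7 | 3.12 | Link |
| 1.0.2 | 2.7 | 3.11 | Link |
| 1.0.2 | 2.7 | 3.10 | Link |
| 1.0.1 | 2.10 | 3.13 | Link |
| 1.0.1 | 2.10 | 3.12 | Link |
| 1.0.1 | 2.10 | 3.11 | Link |
| 1.0.1 | 2.10 | 3.10 | Link |
| 1.0.1 | 2.9 | 3.13 | Link |
| 1.0.1 | 2.9 | 3.13 | Link |
| 1.0.1 | 2.9 | 3.12 | Link |
| 1.0.1 | 2.9 | 3.12 | Link |
| 1.0.1 | 2.8 | 3.13 | Link |
| 1.0.1 | 2.8 | 3.13 | Link |
| 1.0.1 | 2.8 | 3.12 | Link |
| 1.0.1 | 2.8 | 3.11 | Link |
| 1.0.1 | 2.8 | 3.10 | Link |
| 1.0.1 | 2.7 | 3.13 | Link |
| 1.0.1 | 2.7 | 3.12 | Link |
| 1.0.1 | 2.7 | 3.11 | Link |
| 1.0.1 | 2.7 | 3.10 | Link |
| 1.0.1 | 2.6 | 3.13 | Link |
| 1.0.1 | 2.6 | 3.12 | Link |
| 1.0.1 | 2.6 | 3.11 | Link |
| 1.0.1 | 2.6 | 3.10 | Link |
| 1.0.1 | 2.5 | 3.12 | Link |
| 1.0.1 | 2.5 | 3.11 | Link |
| 1.0.1 | 2.5 | 3.10 | Link |
| 1.0.0 | 2.9 | 3.13 | Link |
| 1.0.0 | 2.9 | 3.12 | Link |
| 1.0.0 | 2.9 | 3.11 | Link |
| 1.0.0 | 2.9 | 3.10 | Link |
| 1.0.0 | 2.8 | 3.13 | Link |
| 1.0.0 | 2.8 | 3.12 | Link |
| 1.0.0 | 2.8 | 3.11 | Link |
| 1.0.0 | 2.8 | 3.10 | Link |
| 1.0.0 | 2.7 | 3.13 | Link |
| 1.0.0 | 2.7 | 3.12 | Link |
| 1.0.0 | 2.7 | 3.11 | Link |
| 1.0.0 | 2.7 | 3.10 | Link |
| 1.0.0 | 2.6 | 3.13 | Link |
| 1.0.0 | 2.6 | 3.12 | Link |
| 1.0.0 | 2.6 | 3.11 | Link |
| 1.0.0 | 2.6 | 3.10 | Link |
| 1.0.0 | 2.5 | 3.12 | Link |
| 1.0.0 | 2.5 | 3.11 | Link |
| 1.0.0 | 2.5 | 3.10 | Link |
| 0.3.2 | 2.9 | 3.12 | Link |
| 0.3.2 | 2.8 | 3.12 | Link |
| 0.3.2 | 2.8 | 3.11 | Link |
| 0.3.2 | 2.8 | 3.10 | Link |
| 0.3.2 | 2.7 | 3.12 | Link |
| 0.3.2 | 2.7 | 3.11 | Link |
| 0.3.2 | 2.7 | 3.10 | Link |
| 0.3.2 | 2.6 | 3.12 | Link |
| 0.3.2 | 2.6 | 3.11 | Link |
| 0.3.2 | 2.6 | 3.10 | Link |
| 0.3.2 | 2.5 | 3.12 | Link |
| 0.3.2 | 2.5 | 3.11 | Link |
| 0.3.2 | 2.5 | 3.10 | Link |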

▲▼▲▼▲▼▲▼▲▼▲▼▲▼▲▼▲▼▲▼▲▼▲▼▲▼▲▼▲

🛠 NATTEN

Neighborhood Attention Transformer.

GitHub HuggingFace

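| Package Version | PyTorch Ver | Python Ver | CUDA Ver | Download Link |
| --- | --- | --- | --- | --- |
| 0.17.5 | 2.7.0 | 3.12 | 12.8 | Link |
| 0.17.5 | 2.7.0 | 3.11 | 12.8 | Link |
| 0.17.5 | 2.7.0 | 3.10 | 12.8 | Link |
| 0.17.5 | 2.6.0 | 3.12 | 12.6 | Link |
| 0.17.5 | 2.6.0 | 3.11 | 12.6 | Link |
| 0.17.5 | 2.6.0 | 3.10 | 12.6 | Link |
| 0.17.3 | 2.5.1 | 3.12 | 12.4 | Link |
| 0.17.3 | 2.5.1 | 3.11 | 12.4 | Link |
| 0.17.3 | 2.5.1 | 3.10 | 12.4 | Link |
| 0.17.3 | 2.5.0 | 3.12 | 12.4 | Link |
| 0.17.3 | 2.5.0 | 3.11 | 12.4 | Link |
| 0.17.3 | 2.5.0 | 3.10 | 12.4 | Link |
| 0.17.3 | 2.4.1 | 3.12 | 12.4 | Link |
| 0.17.3 | 2.4.1 | 3.11 | 12.4 | Link |
| 0.17.3 | 2.4.1 | 3.10 | 12.4 | Link |
| 0.17.3 | 2.4.0 | 3.12 | 12.4 | Link |
| 0.17.3 | 2.4.0 | 3.11 | 12.4 | Link |
| 0.17.3 | 2.4.0 | 3.10 | 12.4 | Link |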

▲▼▲▼▲▼▲▼▲▼▲▼▲▼▲▼▲▼▲▼▲▼▲▼▲▼▲▼▲

🛠 Triton (Windows Fork)

Triton is a language and compiler for writing highly efficient custom deep-learning primitives. It is not officially supported on Windows, but a fork provides pre-built wheels.

GitHub

Supported GPUs:

[!NOTE] Different GPU architectures require different Triton versions due to compute capability support.

| Triton Version | Supported GPUs | Compute Capability |
| --- | --- | --- |
| 3.6.x | RTX 50xx (Blackwell), RTX 40xx, Ada Lovelace, Hopper | SM 8.9, 9.0, 10.0 |
| 3.5.x | RTX 30xx, 40xx, Ada Lovelace, Hopper | SM 8.0, 8.9, 9.0 |
| 3.4.x | RTX 20xx, 30xx, 40xx, Ada Lovelace, Hopper | SM 7.5, 8.0, 8.9, 9.0 |
| <= 3.2.x | GTX/RTX 16xx, RTX 20xx, 30xx, 40xx, Ada Lovelace, Hopper | SM 7.0, 7.5, 8.0, 8.9, 9.0 |

Installation:

| Package Version | PyTorch Ver | Compute Capability | Install |
| --- | --- | --- | --- |
| 3.6.x | >= 2.9 | SM 8.9+ | `pip install -U "triton-windows<3.7"` |
| 3.5.x | >= 2.9 | SM 8.0+ | `pip install -U "triton-windows<3.6"` |
| 3.4.x | >= 2.8 | SM 7.5+ | `pip install -U "triton-windows<3.5"` |
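
The installation table can be read as a lookup from compute capability to a pip requirement. The sketch below encodes just those three rows. `TRITON_WINDOWS_SPECS` and `triton_requirement` are hypothetical names, and the PyTorch version column is not checked here, so confirm it separately.

```python
from typing import Optional, Tuple

# (minimum compute capability, pip requirement), newest Triton first.
# Values are copied from the installation table; nothing else is assumed.
TRITON_WINDOWS_SPECS = [
    ((8, 9), "triton-windows<3.7"),
    ((8, 0), "triton-windows<3.6"),
    ((7, 5), "triton-windows<3.5"),
]


def triton_requirement(compute_capability: Tuple[int, int]) -> Optional[str]:
    """Return the newest listed requirement for a (major, minor) capability."""
    for minimum, requirement in TRITON_WINDOWS_SPECS:
        if tuple(compute_capability) >= minimum:
            return requirement
    return None  # older GPUs have no install command in the table


for capability in [(8, 9), (8, 6), (7, 5), (6, 1)]:
    print(capability, triton_requirement(capability))
```

With PyTorch already installed, `torch.cuda.get_device_capability()` returns the `(major, minor)` pair for the current GPU and can be passed straight to `triton_requirement`.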

Python libs:

[!IMPORTANT] Triton requires additional Python development libraries for building CUDA kernels. Download the package matching your Python version, extract the ZIP file, and copy the include and libs folders to your Python installation directory.

| Python Ver | Download |
| --- | --- |
| 3.13 | Link |
| 3.12 | Link |
| 3.11 | Link |
| 3.10 | Link |
| 3.9 | Link |
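
To confirm where the include and libs folders should go, and whether they are already present, a short check like the following may help. It is a sketch that assumes the "Python installation directory" is the interpreter's base prefix (`sys.base_prefix`), which is usually the case for a regular or embedded Python but is worth verifying for your setup. `python_dev_dirs` is a hypothetical helper name.

```python
import sys
from pathlib import Path
from typing import Dict, Optional


def python_dev_dirs(prefix: Optional[str] = None) -> Dict[str, bool]:
    """Report whether the include and libs folders exist under the prefix."""
    # Assumption: the installation directory is sys.base_prefix unless given.
    root = Path(prefix) if prefix is not None else Path(sys.base_prefix)
    return {name: (root / name).is_dir() for name in ("include", "libs")}


print("Python installation directory:", sys.base_prefix)
for folder, present in python_dev_dirs().items():
    print(f"{folder}: {'found' if present else 'missing'}")
```

Run it with the interpreter that Triton will use; a virtual environment reports the base installation it was created from.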

▲▼▲▼▲▼▲▼▲▼▲▼▲▼▲▼▲▼▲▼▲▼▲▼▲▼▲▼▲

🛠 bitsandbytes

A lightweight wrapper around CUDA custom functions, particularly for 8-bit optimizers, matrix multiplication (LLM.int8()), and quantization functions.

GitHub

▲▼▲▼▲▼▲▼▲▼▲▼▲▼▲▼▲▼▲▼▲▼▲▼▲▼▲▼▲

🛠 RadialAttention for ComfyUI

GitHub

(back to top)

▲▼▲▼▲▼▲▼▲▼▲▼▲▼▲▼▲▼▲▼▲▼▲▼▲▼▲▼▲

🛠 SpargeAttn

GitHub GitHub

| Package Version | PyTorch Ver | CUDA Ver | Download Link |
| --- | --- | --- | --- |
| 0.1.0.post1 | 2.8.0 | 12.8 | Link |
| 0.1.0.post1 | 2.7.1 | 12.8 | Link |

▲▼▲▼▲▼▲▼▲▼▲▼▲▼▲▼▲▼▲▼▲▼▲▼▲▼▲▼▲

🛠 Block Sparse Attention

GitHub HuggingFace

| Package Version | PyTorch Ver | Python Ver | CUDA Ver | Download Link |
| --- | --- | --- | --- | --- |
| 0.0.2.post1 | 2.11 | 3.13 | 13.0 | Link |
| 0.0.2.post1 | 2.10 | 3.13 | 13.0 | Link |
| 0.0.2.post1 | 2.9.1 | 3.13 | 13.0 | Link |

▲▼▲▼▲▼▲▼▲▼▲▼▲▼▲▼▲▼▲▼▲▼▲▼▲▼▲▼▲

🛠 DeepSpeed

| Package Version | Python Ver | Download Link |
| --- | --- | --- |
| 0.18.6 | 3.13 | Link |

▲▼▲▼▲▼▲▼▲▼▲▼▲▼▲▼▲▼▲▼▲▼▲▼▲▼▲▼▲

🛠 Fairseq

| Package Version | Python Ver | Download Link |
| --- | --- | --- |
| 0.12.2 | 3.13 | Link |

▲▼▲▼▲▼▲▼▲▼▲▼▲▼▲▼▲▼▲▼▲▼▲▼▲▼▲▼▲

🛠 causal_conv1d

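| Package Version | PyTorch Ver | Python Ver | CUDA Ver | CXX11 ABI | Download Link |
| --- | --- | --- | --- | --- | --- |
| 1.6.1 | 2.11.0 | 3.14 | 13.0 | | Link |
| 1.6.1 | 2.11.0 | 3.13 | 13.0 | | Link |
| 1.6.1 | 2.10.0 | 3.13 | 13.0 | | Link |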

▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀

🌐 Accessing Data Programmatically (wheels.json)

All wheel information in this repository is managed in the wheels.json file, which serves as the single source of truth. The tables in this README are automatically generated from this file.

This provides a stable, structured JSON endpoint for any external tool or application that needs to access this data without parsing Markdown.

➤ How to Use

You can access the raw JSON file directly via the following URL:

https://raw.githubusercontent.com/wildminder/AI-windows-whl/main/wheels.json

Example using curl:

curl -L -o wheels.json https://raw.githubusercontent.com/wildminder/AI-windows-whl/main/wheels.json

The file contains a list of packages, each with its metadata and an array of wheels, where each wheel object contains version details and a direct download url.
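
As a rough illustration of consuming this data from Python, the sketch below filters wheels by field values. The exact schema is an assumption: the key names `packages`, `name`, `python_version`, and `cuda_version` are guesses made for the example (only the wheels array and the url field are described above), and `iter_wheel_urls` and `SAMPLE` are hypothetical. Inspect the real file and adjust the keys before using it.

```python
import json
from typing import Any, Dict, Iterator

WHEELS_JSON_URL = "https://raw.githubusercontent.com/wildminder/AI-windows-whl/main/wheels.json"


def iter_wheel_urls(data: Dict[str, Any], package_name: str, **wanted: str) -> Iterator[str]:
    """Yield the url of every wheel whose fields equal all values in `wanted`."""
    for package in data.get("packages", []):  # assumed top-level key
        if package.get("name") != package_name:  # assumed metadata key
            continue
        for wheel in package.get("wheels", []):
            if all(wheel.get(key) == value for key, value in wanted.items()):
                yield wheel["url"]


# Made-up sample in the assumed shape; the real file may use different keys.
SAMPLE = json.loads("""
{"packages": [{"name": "demo", "wheels": [
  {"python_version": "3.12", "cuda_version": "12.8", "url": "https://example.com/demo-cp312.whl"},
  {"python_version": "3.13", "cuda_version": "13.0", "url": "https://example.com/demo-cp313.whl"}
]}]}
""")

for url in iter_wheel_urls(SAMPLE, "demo", python_version="3.12"):
    print(url)
```

To run the same filter against the live file, load it with `urllib.request.urlopen(WHEELS_JSON_URL)` and `json.load`, then pass the result in place of `SAMPLE`.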

(back to top)

▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀

➤ Contributing

Contributions are what make the open source community such an amazing place to learn, inspire, and create. Any contributions you make are greatly appreciated.

If you have found a new pre-built wheel or a reliable source, please fork the repo and create a pull request, or simply open an issue with the link.

(back to top)

➤ Acknowledgments

This repository is simply a collection of links. Huge thanks to the individuals and groups who do the hard work of building and hosting these wheels for the community:
