
vLLM Metal Plugin

High-performance LLM inference on Apple Silicon using MLX and vLLM

vLLM Metal is a plugin that enables vLLM to run on Apple Silicon Macs using MLX as the primary compute backend. It unifies MLX and PyTorch under a single lowering path.

Documentation: https://docs.vllm.ai/projects/vllm-metal/en/latest/


Latest News 🔥

  • [2026/04] We released v0.2.0! The unified paged varlen Metal kernel is now the default attention backend, giving 83x faster TTFT (time to first token) and 3.6x higher throughput compared to v0.1.0.

Requirements

  • macOS on Apple Silicon
  • Native arm64 Python 3.12. Rosetta/x86_64 Python is not supported.
  • Xcode Command Line Tools (xcode-select --install). vLLM core is compiled from source via clang++. The Metal kernels ship prebuilt, so no Metal compiler or toolchain is needed to run them.
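The requirements above can be checked before installing. The following is a minimal preflight sketch and is not part of vllm-metal: the helper names is_arm64, is_python_312, and preflight are hypothetical, and the sketch assumes that the python3 found on PATH is the interpreter you intend to use.

```shell
#!/usr/bin/env bash
# Preflight sketch (hypothetical helpers, not shipped with vllm-metal).

# Succeeds only when the given machine string is native Apple Silicon.
is_arm64() {
  [ "$1" = "arm64" ]
}

# Succeeds only for a version string in the Python 3.12 series, e.g. "3.12.4".
is_python_312() {
  case "$1" in
    3.12|3.12.*) return 0 ;;
    *) return 1 ;;
  esac
}

# Runs every check and returns nonzero if any requirement is unmet.
preflight() {
  preflight_status=0
  # uname -m reports x86_64 when the shell itself runs under Rosetta.
  is_arm64 "$(uname -m)" \
    || { echo "shell is not running as arm64" >&2; preflight_status=1; }
  # platform.machine() reports the architecture the interpreter runs as.
  is_arm64 "$(python3 -c 'import platform; print(platform.machine())')" \
    || { echo "python3 is not a native arm64 build" >&2; preflight_status=1; }
  is_python_312 "$(python3 -c 'import platform; print(platform.python_version())')" \
    || { echo "python3 is not Python 3.12" >&2; preflight_status=1; }
  # xcode-select -p prints the developer directory when the tools are installed.
  xcode-select -p >/dev/null 2>&1 \
    || { echo "Xcode Command Line Tools not found" >&2; preflight_status=1; }
  return "$preflight_status"
}
```

After sourcing the file, run preflight; a nonzero exit status means at least one requirement is unmet, and the messages on stderr say which.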

Supported Models

vllm-metal supports a growing set of models on Apple Silicon. See the full matrix in docs/supported_models.md.

Installation

curl -fsSL https://raw.githubusercontent.com/vllm-project/vllm-metal/main/install.sh | bash

The install script above installs the following under ~/.venv-vllm-metal (the default location):

  • vllm-metal plugin
  • vllm core
  • Related libraries

After you run source ~/.venv-vllm-metal/bin/activate, the vllm CLI is available and you can use vLLM right away.

For vllm CLI usage, see the official vLLM guide: https://docs.vllm.ai/en/latest/cli/
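As a sketch of the activation step, the hypothetical helper below (not part of the install script) activates the virtual environment only if it exists, then confirms that the vllm CLI is on PATH. The VLLM_METAL_VENV variable is an assumption introduced for illustration, not something the installer reads; the default path is the one given in the install notes.

```shell
# Hypothetical helper: activate the vllm-metal virtual environment.
# VLLM_METAL_VENV is an illustrative variable, not one the installer reads.
activate_vllm_metal() {
  venv_dir="${1:-${VLLM_METAL_VENV:-$HOME/.venv-vllm-metal}}"
  if [ ! -f "$venv_dir/bin/activate" ]; then
    echo "no virtual environment found at $venv_dir" >&2
    return 1
  fi
  # Source the activate script into the current shell.
  . "$venv_dir/bin/activate"
  # command -v succeeds only if the vllm executable is now on PATH.
  command -v vllm >/dev/null 2>&1 || {
    echo "vllm CLI not found after activation" >&2
    return 1
  }
}
```

A typical use would be activate_vllm_metal && vllm --help, which activates the environment and then lists the available subcommands.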

Optional: Rust frontend (experimental)

Pass --with-vllm-rs to also install vllm-rs, the experimental Rust frontend vendored in the bundled vLLM release. This requires the Rust toolchain (https://rustup.rs):

./install.sh --with-vllm-rs

See docs/rust_frontend.md for usage and architecture.

About

Community maintained hardware plugin for vLLM on Apple Silicon
