GitHub

Repository

Native MTP (multi-token prediction) speculative decoding on Apple Silicon | 2x to 2.5x increase in decode throughput (tokens per second) at temperature 0.6 | MLX-native, OpenAI- and Anthropic-compatible API serving, no external drafter.
Python
anthropic-compatible, apple-silicon, inference-engine, local-ai, metal, mlx, mtp, mtplx, native-mtp, openai-compatible, qwen, qwen3-next, speculative-decoding, speculative-sampling