AutoSubs

Local-first AI subtitles. No cloud, no subscription, no data leaving your machine.

Use it as a standalone app, or connect it to DaVinci Resolve, Adobe Premiere Pro, and After Effects.

🎙️ Transcription: Whisper, Moonshine, and Parakeet models via whisper-rs and ONNX Runtime
👥 Speaker Diarization: Identifies and labels different speakers in the transcript, enabling per-speaker styling
🌍 100+ Languages: Transcription and translation across a wide range of languages
💻 Cross-Platform: macOS (Apple Silicon/Intel), Windows (Vulkan/DirectML), Linux

Download

Platform	Installer
🪟 Windows	AutoSubs-windows-x86_64.exe
🍎 macOS (Apple Silicon)	AutoSubs-Mac-ARM.pkg
🍎 macOS (Intel)	AutoSubs-Mac-Intel.pkg
🐧 Linux (Debian/Ubuntu)	AutoSubs-linux-x86_64.deb
🐧 Linux (Fedora/openSUSE)	AutoSubs-linux-x86_64.rpm

macOS Homebrew

macOS users can also install AutoSubs with Homebrew:

brew install --cask auto-subs

Linux install

Debian/Ubuntu (.deb):

wget https://github.com/tmoroney/auto-subs/releases/latest/download/AutoSubs-linux-x86_64.deb
sudo apt install ./AutoSubs-linux-x86_64.deb

Fedora/openSUSE (.rpm): Download AutoSubs-linux-x86_64.rpm and open it with your package manager.

Quick Start

Standalone Mode

Launch AutoSubs and select an audio or video file.
Pick your model and language/translation options.
Click Transcribe. Edit speakers and subtitles as needed.
Export as SRT, text, or copy to clipboard.
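For readers unfamiliar with the SRT format, here is a minimal sketch of how timed segments map to SRT text. It illustrates the general SubRip layout (index, `HH:MM:SS,mmm --> HH:MM:SS,mmm`, text, blank line), not AutoSubs' own export code. The function names and sample segments are hypothetical.

```python
def format_timestamp(seconds: float) -> str:
    """Render seconds as an SRT timestamp: HH:MM:SS,mmm."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(segments):
    """Build SRT text from (start_seconds, end_seconds, text) tuples."""
    blocks = []
    for index, (start, end, text) in enumerate(segments, start=1):
        timing = f"{format_timestamp(start)} --> {format_timestamp(end)}"
        blocks.append(f"{index}\n{timing}\n{text}\n")
    # A blank line separates consecutive entries.
    return "\n".join(blocks)


# Hypothetical segments, for illustration only.
print(to_srt([(0.0, 2.5, "Hello there."), (2.5, 5.0, "Welcome back.")]))
```

Each entry is numbered from 1, and the millisecond separator is a comma rather than a period.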

DaVinci Resolve Mode

Open DaVinci Resolve → Workspace → Scripts → AutoSubs.
Select your timeline/audio source and settings.
Click Transcribe. Edit speakers and subtitles as needed.
Send styled subtitles back to Resolve.

[!WARNING] The Mac App Store version of DaVinci Resolve is not supported. Download DaVinci Resolve from blackmagicdesign.com instead.

Adobe Premiere Pro / After Effects Mode

Launch AutoSubs and open Premiere Pro or After Effects (the CEP extension loads automatically).
Select the Adobe integration in AutoSubs to export timeline audio for transcription, or to import generated subtitles into your project.
In Premiere Pro, subtitles are imported as caption tracks; in After Effects, SRT entries are created as text layers.
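To make "SRT entries" concrete, the sketch below reads SRT text into timed cues, which is roughly the information a text layer needs (start, end, and text). It is an illustration only and does not reproduce the CEP extension's code. `Cue` and `parse_srt` are hypothetical names.

```python
import re
from dataclasses import dataclass

# Matches "HH:MM:SS,mmm --> HH:MM:SS,mmm" (a period is tolerated as separator).
TIMING = re.compile(
    r"(\d+):(\d{2}):(\d{2})[,.](\d{3})\s*-->\s*(\d+):(\d{2}):(\d{2})[,.](\d{3})"
)


@dataclass
class Cue:
    start: float  # seconds
    end: float    # seconds
    text: str


def _seconds(h, m, s, ms):
    return int(h) * 3600 + int(m) * 60 + int(s) + int(ms) / 1000


def parse_srt(srt_text):
    """Parse SRT text into a list of Cue objects."""
    cues = []
    normalized = srt_text.replace("\r\n", "\n").strip()
    # Entries are separated by one or more blank lines.
    for block in re.split(r"\n\s*\n", normalized):
        lines = block.split("\n")
        for i, line in enumerate(lines):
            match = TIMING.search(line)
            if match:
                g = match.groups()
                text = "\n".join(lines[i + 1:]).strip()
                cues.append(Cue(_seconds(*g[:4]), _seconds(*g[4:]), text))
                break
    return cues


# Hypothetical input, for illustration only.
sample = "1\n00:00:01,000 --> 00:00:02,500\nHello there.\n"
for cue in parse_srt(sample):
    print(cue.start, cue.end, cue.text)
```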

Command Line Interface

For command-line usage, see the CLI Guide with complete reference, examples, and troubleshooting.

Documentation

CLI Guide - Command-line interface reference
Contributing Guide - Development setup and contribution workflow
AutoSubs-App README - Technical architecture and code organization
Resolve Integration - DaVinci Resolve integration architecture and development
Adobe Extension - Adobe Premiere Pro/After Effects integration details

[!TIP] I highly recommend checking out DeepWiki for asking questions and understanding the codebase.

Integrations

AutoSubs can run as a standalone subtitle generator, connect directly to DaVinci Resolve, or communicate with Adobe Premiere Pro and After Effects through the bundled CEP extension.

[Screenshots: select a preset style, or create your own]

What's New in v3.5

Transcription: Voice Activity Detection, multiple models (Whisper/Parakeet/Moonshine), improved speaker diarization, and built-in translation.

Editing & UI: Free-text subtitle editing with auto-timing, transcript history, 6 new UI languages, and custom titlebar.

DaVinci Resolve: Animated caption macro with per-word highlighting, preset system, marker-based word timing, and instant conflict detection.

Bug Fixes (v3.5.1): Formatting improvements, Resolve export corrections, Model Manager recovery, and Linux stability fixes.

Supported Models

AutoSubs ships with several local transcription model families. All run fully on-device — nothing is sent to the cloud. Models are downloaded on demand from the in-app Model Manager.

Accuracy is a relative 1–4 rating within AutoSubs (higher is better). Sizes and RAM figures are approximate.

Whisper

OpenAI's Whisper, via whisper-rs (GGML). Each size is available in a multilingual variant and an .en English-only variant (the .en models are slightly more accurate on English audio).

Model	Size	RAM	Languages	Accuracy
tiny / tiny.en	80 MB	1 GB	Multilingual / English	★
base / base.en	150 MB	1 GB	Multilingual / English	★
small / small.en	480 MB	2 GB	Multilingual / English	★★
medium / medium.en	1.5 GB	5 GB	Multilingual / English	★★★
large-v3-turbo	1.6 GB	6 GB	Multilingual	★★★
large-v3	3.1 GB	10 GB	Multilingual	★★★★
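As a rough guide to reading the table, the sketch below picks the highest-rated Whisper model that fits a given RAM budget. The RAM and accuracy values are copied from the table above. The `pick_model` helper and its tie-break (prefer the lower RAM figure among equally rated models) are assumptions for illustration, not part of AutoSubs.

```python
from typing import Optional

# (model, approx. RAM in GB, accuracy rating 1-4), copied from the table above.
WHISPER_MODELS = [
    ("tiny", 1, 1),
    ("base", 1, 1),
    ("small", 2, 2),
    ("medium", 5, 3),
    ("large-v3-turbo", 6, 3),
    ("large-v3", 10, 4),
]


def pick_model(available_ram_gb: float) -> Optional[str]:
    """Return the best-rated model whose RAM figure fits, or None."""
    fitting = [m for m in WHISPER_MODELS if m[1] <= available_ram_gb]
    if not fitting:
        return None
    # Highest accuracy first; among equals, prefer the lower RAM figure.
    name, _, _ = max(fitting, key=lambda m: (m[2], -m[1]))
    return name


print(pick_model(8))   # budget of 8 GB
print(pick_model(16))  # budget of 16 GB
```

Accuracy ratings are coarse, so two models with the same rating can still differ in speed and quality. Treat the result as a starting point rather than a recommendation.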

Moonshine

Useful Sensors' Moonshine, via ONNX Runtime. The tiny English model is quantized; the language-specific tiny variants and the base model are float-precision.

Model	Size	RAM	Language	Accuracy
moonshine-tiny	60 MB	1 GB	English	★
moonshine-tiny-ar	120 MB	1 GB	Arabic	★★★
moonshine-tiny-zh	120 MB	1 GB	Chinese	★★★
moonshine-tiny-ja	120 MB	1 GB	Japanese	★★★
moonshine-tiny-ko	120 MB	1 GB	Korean	★★★
moonshine-tiny-uk	120 MB	1 GB	Ukrainian	★★
moonshine-tiny-vi	120 MB	1 GB	Vietnamese	★★★
moonshine-base	200 MB	1 GB	English	★★

Parakeet

NVIDIA's Parakeet-TDT-0.6B-v3 (int8 ONNX). Fast and accurate, with support for 25 European languages, including Russian and Ukrainian.

Model	Size	RAM	Languages	Accuracy
parakeet	700 MB	2 GB	25 European languages (incl. Russian and Ukrainian)	★★★★

SenseVoice

Alibaba's SenseVoice (int8 ONNX). Compact and well-suited to CJK audio.

Model	Size	RAM	Languages	Accuracy
sense-voice	230 MB	1 GB	Chinese, English, Japanese, Korean, Cantonese	★★★

Canary

NVIDIA's Canary-1B-v2 (int8 ONNX). A multilingual encoder-decoder model that also supports native translation.

Model	Size	RAM	Languages	Accuracy
canary	1 GB	3 GB	25 European languages (incl. Russian and Ukrainian)	★★★★

Cohere

Cohere Transcribe (int4 ONNX). The highest-accuracy option for a focused set of 14 widely spoken languages.

Model	Size	RAM	Languages	Accuracy
cohere	2 GB	4 GB	Arabic, German, Greek, English, Spanish, French, Italian, Japanese, Korean, Dutch, Polish, Portuguese, Vietnamese, Chinese	★★★★

Diarization & VAD

In addition to transcription models, AutoSubs downloads a speaker diarization model (~40 MB, user-selectable from the Model Manager) and a Silero VAD model (auto-downloaded for voice activity detection during transcription).

Contributing

PRs are welcome! See CONTRIBUTING.md for how to get started, including the dev setup and a full codebase walkthrough via AutoSubs DeepWiki.

For detailed information about the DaVinci Resolve integration architecture, Lua server, Fusion macro system, and development workflow, see Resolve-Integration/README.md.

Acknowledgments

AutoSubs is built on top of excellent open-source projects:

whisper-rs - Rust bindings for the whisper.cpp library
transcribe-rs - ONNX Runtime transcription with Moonshine and Parakeet models
pyannote-rs - Rust implementation of Pyannote for speaker diarization (integrated into app code for improvements)