## Updates
- [April 2026] ⚙️ Fine-tuning scripts released: SFT for supervised fine-tuning and RL for reinforcement learning-based post-training.
- [March 2026] 🏔️ Alpamayo 1.5 has been released! We recommend all users check out the new version for improved performance, new features, and continued support! 🚀
- [January 2026] Following the release of NVIDIA Alpamayo at CES 2026, Alpamayo-R1 has been renamed to Alpamayo 1.
📖 Please read the HuggingFace Model Card first! The model card contains comprehensive details on model architecture, inputs/outputs, licensing, and tested hardware configurations. This GitHub README focuses on setup, usage, and frequently asked questions.
## Requirements
| Requirement | Specification |
|---|---|
| Python | 3.12.x (see pyproject.toml) |
| GPU | NVIDIA GPU with ≥24 GB VRAM (e.g., RTX 3090, RTX 4090, A5000, H100) |
| OS | Linux (tested); other platforms unverified |
⚠️ Note: GPUs with less than 24 GB VRAM will likely encounter CUDA out-of-memory errors.
## Installation
### 1. Install uv (if not already installed)
```bash
curl -LsSf https://astral.sh/uv/install.sh | sh
export PATH="$HOME/.local/bin:$PATH"
```
### 2. Set up the environment
```bash
uv venv ar1_venv
source ar1_venv/bin/activate
uv sync --active
```
### 3. Authenticate with HuggingFace
The model requires access to gated resources. Request access here:
Then authenticate using the HuggingFace CLI:
```bash
pip install -U huggingface_hub
hf auth login
```
Get your access token at: https://huggingface.co/settings/tokens
💡 Tip: For more details on HuggingFace authentication, see the official documentation.
## Running Inference
### Test script
NOTE: This script downloads both some example data (relatively small) and the model weights (22 GB). The weights download can be slow depending on network bandwidth; for reference, it takes around 2.5 minutes on a 100 MB/s wired connection.
```bash
python src/alpamayo_r1/test_inference.py
```
To obtain more trajectories and reasoning traces, change the `num_traj_samples=1` argument (line 60 of the script) to a higher number.
### Interactive notebook
We provide a notebook with similar inference code at `notebook/inference.ipynb`.
## Relationship with the Paper
Alpamayo 1 implements the architecture described in our paper "Alpamayo-R1: Bridging Reasoning and Action Prediction for Generalizable Autonomous Driving in the Long Tail", including:
| Feature | Paper Description | This Release (v1.0) |
|---|---|---|
| Chain-of-Causation (CoC) reasoning | Hybrid auto-labeling with human in the loop for reasoning traces | ✅ Included |
| Vision-Language-Action architecture | Cosmos-Reason backbone + action expert | ✅ Included |
| Trajectory prediction | 6.4s horizon, 64 waypoints at 10 Hz | ✅ Included |
| SFT fine-tuning (weights) | SFT trained model weights | ✅ Included |
| SFT fine-tuning (code) | Supervised fine-tuning pipeline | ✅ Included |
| RL post-training (weights) | RL post-trained model weights | ❌ Not in this release |
| RL post-training (code) | RL post-training pipeline via Cosmos-RL | ✅ Included |
| Route/navigation conditioning | Explicit navigation or route inputs | ❌ Not in this release |
| Meta-actions/General VQA | High-level behavior and visual question answering | ❌ Not in this release |
This release includes the core model, SFT scripts, and the RL post-training pipeline. RL-trained weights, route conditioning, and meta-actions are candidates for future releases.
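As a quick consistency check on the trajectory format listed above, the three numbers agree with each other: 64 waypoints sampled at 10 Hz span exactly 6.4 seconds.

```python
# Trajectory format sanity check: waypoints / sampling rate = horizon.
num_waypoints = 64
rate_hz = 10
horizon_s = num_waypoints / rate_hz
print(horizon_s)  # 6.4
```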
## Fine-tuning Scripts
| Method | Description | Docs |
|---|---|---|
| SFT | Supervised fine-tuning | SFT guide |
| RL | Reinforcement learning-based post-training via Cosmos-RL | RL guide |
Please refer to the linked guides for compute requirements, step-by-step instructions, and a fine-tuning FAQ.
## Frequently Asked Questions (FAQ)
### Does the 10B model accept navigation/route inputs?
While we have experimented with route conditioning capabilities, the released model does not include this feature. The current release takes multi-camera video and egomotion history as inputs, without explicit navigation or route inputs (e.g., waypoints, turn-by-turn navigation instructions).
### Does the model produce meta-actions or support general VQA?
While we have experimented with meta-action and general VQA capabilities, the released model does not include these features. Alpamayo 1 is designed specifically for trajectory prediction with Chain-of-Causation reasoning, producing trajectory + reasoning trace outputs.
### Was the 10B model post-trained with Reinforcement Learning (RL)?
No. The current 10B model release has not undergone RL post-training. While the paper describes RL stages for improving reasoning quality and action consistency, this release focuses on the supervised learning components. As mentioned above, we may release RL post-trained models in future releases.
### What are the minimum GPU requirements?
You need an NVIDIA GPU with at least 24 GB VRAM for inference. Tested configurations include RTX 3090, A100, and H100. Running on GPUs with less memory (e.g., 16 GB) will likely result in CUDA out-of-memory errors.
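If you want to fail fast before downloading the 22 GB checkpoint, the memory check itself is a one-liner. A minimal sketch; in practice you would pass in the value of `torch.cuda.get_device_properties(0).total_memory` (torch is deliberately left out so the snippet stays self-contained):

```python
def fits_in_vram(total_vram_bytes: int, required_gb: float = 24.0) -> bool:
    """Return True if a GPU with `total_vram_bytes` of memory meets the VRAM requirement."""
    return total_vram_bytes >= required_gb * 1024**3

# An RTX 3090 reports roughly 24 GiB of total memory; a 16 GiB card falls short.
print(fits_in_vram(24 * 1024**3))  # True
print(fits_in_vram(16 * 1024**3))  # False
```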
### Can I use this model in production / commercial applications?
No. The model weights are released under a non-commercial license. This release is intended for research, experimentation, and evaluation purposes only. See the License section and the HuggingFace Model Card for details.
## Project Structure
```
alpamayo/
├── finetune/
│   ├── rl/                             # RL post-training
│   │   ├── models/                     # Model wrappers & Cosmos-RL entry scripts
│   │   ├── rewards/                    # Reward functions
│   │   ├── prefetch/                   # Shared-memory data prefetch server
│   │   ├── toml/                       # Cosmos-RL training configs
│   │   ├── hydra_configs/              # Dataset & preprocessing configs
│   │   └── README.md                   # RL post-training guide
│   └── sft/                            # Supervised fine-tuning
│       ├── configs/                    # Model configs
│       ├── models/                     # Trainable wrappers
│       ├── train_hf.py                 # Training script
│       └── evaluate_hf.py              # Evaluation script
├── notebook/
│   └── inference.ipynb                 # Example notebook
├── src/
│   └── alpamayo_r1/
│       ├── action_space/
│       │   └── ...                     # Action space definitions
│       ├── diffusion/
│       │   └── ...                     # Diffusion model components
│       ├── geometry/
│       │   └── ...                     # Geometry utilities and modules
│       ├── models/
│       │   └── ...                     # Model components and utility functions
│       ├── __init__.py                 # Package marker
│       ├── config.py                   # Model and experiment configuration
│       ├── helper.py                   # Utility functions
│       ├── load_physical_aiavdataset.py  # Dataset loader
│       └── test_inference.py           # Inference test script
├── pyproject.toml                      # Project dependencies
└── uv.lock                             # Locked dependency versions
```
## Troubleshooting
### Flash Attention issues
The model uses Flash Attention 2 by default. If you encounter compatibility issues:
```python
# Use PyTorch's scaled dot-product attention instead
config.attn_implementation = "sdpa"
```
### CUDA out-of-memory errors
If you encounter OOM errors:
- Ensure you have a GPU with at least 24 GB VRAM
- Reduce `num_traj_samples` if generating multiple trajectories
- Close other GPU-intensive applications
## License
- Inference code: Apache License 2.0 - see LICENSE for details.
- Model weights: Non-commercial license - see HuggingFace Model Card for details.
## Disclaimer
Alpamayo 1 is a pre-trained reasoning model designed to accelerate research and development in the autonomous vehicle (AV) domain. It is intended to serve as a foundation for a range of AV-related use cases, from instantiating an end-to-end backbone for autonomous driving to enabling reasoning-based auto-labeling tools. In short, it should be viewed as a building block for developing customized AV applications.
Important notes:
- Alpamayo 1 is provided solely for research, experimentation, and evaluation purposes.
- Alpamayo 1 is not a fully fledged driving stack. Among other limitations, it lacks access to critical real-world sensor inputs, does not incorporate required diverse and redundant safety mechanisms, and has not undergone automotive-grade validation for deployment.
By using this model, you acknowledge that it is a research tool intended to support scientific inquiry, benchmarking, and exploration, not a substitute for a certified AV stack. The developers and contributors disclaim any responsibility or liability for the use of the model or its outputs.
## Citation
If you use Alpamayo 1 in your research, please cite:
```bibtex
@article{nvidia2025alpamayo,
  title   = {{Alpamayo-R1}: Bridging Reasoning and Action Prediction for Generalizable Autonomous Driving in the Long Tail},
  author  = {NVIDIA and Yan Wang and Wenjie Luo and Junjie Bai and Yulong Cao and Tong Che and Ke Chen and Yuxiao Chen and Jenna Diamond and Yifan Ding and Wenhao Ding and Liang Feng and Greg Heinrich and Jack Huang and Peter Karkus and Boyi Li and Pinyi Li and Tsung-Yi Lin and Dongran Liu and Ming-Yu Liu and Langechuan Liu and Zhijian Liu and Jason Lu and Yunxiang Mao and Pavlo Molchanov and Lindsey Pavao and Zhenghao Peng and Mike Ranzinger and Ed Schmerling and Shida Shen and Yunfei Shi and Sarah Tariq and Ran Tian and Tilman Wekel and Xinshuo Weng and Tianjun Xiao and Eric Yang and Xiaodong Yang and Yurong You and Xiaohui Zeng and Wenyuan Zhang and Boris Ivanovic and Marco Pavone},
  journal = {arXiv preprint arXiv:2511.00088},
  year    = {2025},
}
```