Geometric Action Model for Robot Policy Learning

Jisang Han¹* · Seonghu Jeon¹* · Jaewoo Jung¹,² · René Zurbrügg²,³ · Honggyu An¹ · Tifanny Portela²,³ · Marco Hutter² · Marc Pollefeys² · Seungryong Kim¹† · Sunghwan Hong²,³†

¹ KAIST AI · ² ETH Zurich · ³ ETH AI Center

* Equal contribution. † Co-corresponding authors.


GAM (Geometric Action Model) is a language-conditioned robot manipulation policy that adapts a pretrained geometric foundation model into one shared backbone for perception, future prediction, and action decoding. This public release contains the LIBERO and LIBERO-Plus implementation, training configs, standalone rollout scripts, and released checkpoints. The released 1.4B GAM model reaches 97.6% on LIBERO, 85.5% on LIBERO-Plus, and 83.1% on the camera split, with a model-forward latency of 6.9 ms on the CUDA graph inference path.

Repository Layout

Area	Paths
Model components	`src/robot/modeling/`, `src/robot/losses/`
Data and rollout runtime	`src/robot/data/`, `src/robot/evaluation/`, `src/robot/viz/`
Training and evaluation entrypoints	`src/train_robot.py`, `src/eval_libero_unified.py`, `src/gam/training/`, `src/gam/evaluation/`
Configs and runtime	`configs/training/libero_unified/`, `Dockerfile`, `environment.yml`, `requirements.txt`
Setup utilities	`scripts/setup_sources.sh`, `scripts/setup_libero_plus.sh`

Installation

Docker Setup:

docker build -t gam-libero .
docker run --gpus all -it --rm \
  -v /host/gam_workspace/checkpoints:/workspace/gam-libero/checkpoints \
  -v /host/gam_workspace/data:/workspace/gam-libero/data \
  -e WANDB_API_KEY=$WANDB_API_KEY \
  gam-libero

Conda Setup:

conda env create -f environment.yml
conda activate gam-libero
bash scripts/setup_sources.sh
bash scripts/setup_libero_plus.sh --download-assets

venv Setup:

python3.12 -m venv .venv
source .venv/bin/activate
pip install torch==2.5.1 torchvision==0.20.1 \
  --index-url https://download.pytorch.org/whl/cu124
pip install -r requirements.txt
bash scripts/setup_sources.sh
bash scripts/setup_libero_plus.sh --download-assets

Debian/Ubuntu System Packages:

sudo apt-get install \
  libgl1 libglvnd0 libegl1 libgles2 libosmesa6 libglfw3 \
  ffmpeg imagemagick libmagickwand-dev

Runtime Paths:

export DA3_ROOT=/path/to/this_repo
export DA3_BASE_CKPT=$DA3_ROOT/checkpoints/track4world_da3.pth
export GAM_PRETRAINED_CKPT=$DA3_ROOT/checkpoints_hf/3da-libero-gam/pretrained/pretrained-gam.pt
export DA3_LIBERO_SOURCE_DIR=$DA3_ROOT/LIBERO
export DA3_LIBERO_PLUS_DIR=$DA3_ROOT/LIBERO-plus
export PYTHONPATH=$DA3_ROOT/src:$DA3_LIBERO_PLUS_DIR:$DA3_LIBERO_SOURCE_DIR${PYTHONPATH:+:$PYTHONPATH}
export MUJOCO_GL=egl
export PYOPENGL_PLATFORM=egl
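
A quick way to catch a missing export before launching training or evaluation is to check the variables from Python. The following is a minimal sketch, not part of the repository: the function name `check_runtime_env` is hypothetical, and the only inputs it assumes are the variable names and `egl` values listed above.

```python
import os
from pathlib import Path

# Variables exported in the Runtime Paths block above.
PATH_VARS = [
    "DA3_ROOT",
    "DA3_BASE_CKPT",
    "GAM_PRETRAINED_CKPT",
    "DA3_LIBERO_SOURCE_DIR",
    "DA3_LIBERO_PLUS_DIR",
]
GL_VARS = {"MUJOCO_GL": "egl", "PYOPENGL_PLATFORM": "egl"}


def check_runtime_env(env=None):
    """Return a list of problems; an empty list means every check passed."""
    env = os.environ if env is None else env
    problems = []
    for name in PATH_VARS:
        value = env.get(name)
        if not value:
            problems.append(f"{name} is not set")
        elif not Path(value).exists():
            problems.append(f"{name} points to a missing path: {value}")
    for name, expected in GL_VARS.items():
        if env.get(name) != expected:
            problems.append(f"{name} should be {expected!r}, got {env.get(name)!r}")
    return problems
```

Calling `check_runtime_env()` with no arguments inspects the current process environment; passing a plain dict makes the check easy to exercise in isolation.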

Data And Weights

Place data and base weights under $DA3_ROOT. By default, $DA3_ROOT is the repository root:

$DA3_ROOT/
  checkpoints/track4world_da3.pth
  checkpoints_hf/3da-libero-gam/pretrained/pretrained-gam.pt
  data/libero_noop/<suite>/*.hdf5
  data/libero_noop/_stats/
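
The layout above can be verified with a short script before a long run. This is an illustrative sketch only: `summarize_assets` is a hypothetical helper, and it assumes that every subdirectory of `data/libero_noop/` whose name does not start with an underscore is a suite directory.

```python
from pathlib import Path

# Relative paths taken from the layout listing above.
REQUIRED_FILES = [
    "checkpoints/track4world_da3.pth",
    "checkpoints_hf/3da-libero-gam/pretrained/pretrained-gam.pt",
]
DATA_DIR = "data/libero_noop"
STATS_DIR = "data/libero_noop/_stats"


def summarize_assets(root):
    """Return (missing_paths, {suite_name: number_of_hdf5_files})."""
    root = Path(root)
    missing = [rel for rel in REQUIRED_FILES if not (root / rel).is_file()]
    if not (root / STATS_DIR).is_dir():
        missing.append(STATS_DIR)
    suites = {}
    data_dir = root / DATA_DIR
    if data_dir.is_dir():
        for suite_dir in sorted(data_dir.iterdir()):
            # Assumption: names starting with "_" (such as _stats) are not suites.
            if suite_dir.is_dir() and not suite_dir.name.startswith("_"):
                suites[suite_dir.name] = len(list(suite_dir.glob("*.hdf5")))
    else:
        missing.append(DATA_DIR)
    return missing, suites
```

For example, `summarize_assets(os.environ["DA3_ROOT"])` (after `import os`) returns the missing entries together with a per-suite count of `.hdf5` files.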

Download the released training assets:

hf download SeonghuJeon/3da-libero-training-assets \
  --repo-type dataset \
  --local-dir .

Download the released GAM checkpoints, including the pretrained-gam initialization checkpoint:

hf download SeonghuJeon/3da-libero-gam \
  --local-dir checkpoints_hf/3da-libero-gam

To download only the pretrained initialization checkpoint:

hf download SeonghuJeon/3da-libero-gam \
  pretrained/pretrained-gam.pt \
  --local-dir checkpoints_hf/3da-libero-gam

For a single-suite smoke rollout, download only that suite:

hf download SeonghuJeon/3da-libero-gam \
  spatial/gam.pt spatial/config.yaml \
  --local-dir checkpoints_hf/3da-libero-gam

Expected checkpoint layout:

Suite key	LIBERO suite	Checkpoint	Config
`spatial`	`libero_spatial`	`spatial/gam.pt`	`spatial/config.yaml`
`object`	`libero_object`	`object/gam.pt`	`object/config.yaml`
`goal`	`libero_goal`	`goal/gam.pt`	`goal/config.yaml`
`long`	`libero_10`	`long/gam.pt`	`long/config.yaml`

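The pretrained initialization checkpoint is stored at pretrained/pretrained-gam.pt inside the download directory (checkpoints_hf/3da-libero-gam/ in the commands above, which matches $GAM_PRETRAINED_CKPT). To use it for training, set stage_1.ckpt_path to the downloaded path.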
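
The suite table can also be expressed as a small lookup, which is handy when scripting over several suites. This is a sketch under stated assumptions: `resolve_suite` is a hypothetical helper rather than a repository function, and the default `ckpt_root` is the `--local-dir` used in the download commands above.

```python
from pathlib import Path

# Suite key -> LIBERO suite name, as listed in the checkpoint layout table.
SUITES = {
    "spatial": "libero_spatial",
    "object": "libero_object",
    "goal": "libero_goal",
    "long": "libero_10",
}


def resolve_suite(suite_key, ckpt_root="checkpoints_hf/3da-libero-gam"):
    """Map a suite key to its LIBERO suite name, checkpoint path, and config path."""
    if suite_key not in SUITES:
        raise KeyError(f"unknown suite key {suite_key!r}; expected one of {sorted(SUITES)}")
    suite_dir = Path(ckpt_root) / suite_key
    return {
        "libero_suite": SUITES[suite_key],
        "checkpoint": suite_dir / "gam.pt",
        "config": suite_dir / "config.yaml",
    }
```

Note that the key `long` maps to the suite named `libero_10`, so passing the LIBERO suite name where a suite key is expected raises `KeyError`.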

Evaluation

Install LIBERO-Plus assets before the first rollout:

bash scripts/setup_libero_plus.sh --download-assets

Run standalone LIBERO-Plus evaluation from the released HF checkpoint:

GAM_EVAL_GPUS=0,1,2,3 bash scripts/run_hf_gam_libero_plus_eval.sh spatial
GAM_EVAL_GPUS=0,1,2,3 bash scripts/run_hf_gam_libero_plus_eval.sh all

The standalone launcher uses qpos=original for LIBERO-Plus robot initialization and writes summary.json, per_task.csv, and logs under the result directory. See docs/evaluation.md for rollout protocols, parallelism, config flags, and CUDA graph latency measurement.
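
To inspect a finished run programmatically, the two result files can be loaded with the standard library. The sketch below deliberately assumes nothing about the fields inside `summary.json` or the columns of `per_task.csv`, since their schema is documented in docs/evaluation.md rather than here; `load_results` is a hypothetical helper name.

```python
import csv
import json
from pathlib import Path


def load_results(result_dir):
    """Load summary.json and per_task.csv without assuming any field names."""
    result_dir = Path(result_dir)
    with open(result_dir / "summary.json", encoding="utf-8") as f:
        summary = json.load(f)
    with open(result_dir / "per_task.csv", newline="", encoding="utf-8") as f:
        # Each row becomes a dict keyed by whatever header the launcher wrote.
        rows = list(csv.DictReader(f))
    return summary, rows
```

Printing `list(rows[0])` on a real run shows which columns the launcher wrote, which is a reasonable first step before computing any aggregate.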

Training

Single-GPU smoke run:

PYTHONPATH=src:$PYTHONPATH python src/train_robot.py \
  --config configs/training/libero_unified/smoke/gam_chunk2.yaml \
  --single-gpu \
  --set training.max_steps=1

Multi-GPU GAM fine-tuning with DeepSpeed ZeRO-2:

PYTHONPATH=src:$PYTHONPATH \
PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True \
deepspeed --include localhost:0,1,2,3 src/train_robot.py \
  --config configs/training/libero_unified/gam/chunk8_150k_2node.yaml \
  --deepspeed_config configs/training/libero_unified/deepspeed/micro2.json \
  --set stage_1.ckpt_path=$GAM_PRETRAINED_CKPT \
  --wandb \
  --wandb-name gam_libero

See docs/training.md for config keys, CLI flags, W&B resume, optimizer-state resume, DeepSpeed ZeRO-2, compile settings, and in-training closed-loop eval.
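
Both training commands pass dotted overrides such as `--set training.max_steps=1` and `--set stage_1.ckpt_path=...`. The sketch below illustrates the general technique of applying a dotted `key=value` assignment to a nested config dict. It is an assumption about how such overrides typically work, not the parser in `src/train_robot.py`; `apply_override` and `parse_value` are hypothetical names.

```python
import ast


def parse_value(text):
    """Interpret '1', '1e-4', or 'True' as Python literals; otherwise keep the string."""
    try:
        return ast.literal_eval(text)
    except (ValueError, SyntaxError):
        return text


def apply_override(config, assignment):
    """Apply one 'a.b.c=value' assignment to a nested dict in place."""
    dotted_key, sep, raw_value = assignment.partition("=")
    if not sep or not dotted_key:
        raise ValueError(f"expected key=value, got {assignment!r}")
    *parents, leaf = dotted_key.split(".")
    node = config
    for part in parents:
        # Create intermediate sections on demand.
        node = node.setdefault(part, {})
    node[leaf] = parse_value(raw_value)
    return config
```

With this sketch, `apply_override(cfg, "training.max_steps=1")` sets `cfg["training"]["max_steps"]` to the integer `1`, while a path-like value such as a checkpoint location stays a string.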

Acknowledgements

This repository is built on top of GLD: Geometric Latent Diffusion.

We thank the teams behind Track4World, OpenPI, Pi0.5, Cosmos Policy, and OpenVLA-OFT for releasing their research, code, and models to the robotics community.

Citation

@misc{han2026geometricactionmodelrobot,
      title={Geometric Action Model for Robot Policy Learning},
      author={Jisang Han and Seonghu Jeon and Jaewoo Jung and Ren{\'e} Zurbr{\"u}gg and Honggyu An and Tifanny Portela and Marco Hutter and Marc Pollefeys and Seungryong Kim and Sunghwan Hong},
      year={2026},
      eprint={2606.17046},
      archivePrefix={arXiv},
      primaryClass={cs.RO},
      url={https://arxiv.org/abs/2606.17046}
}

License

See LICENSE.