AMP_mjlab
Deployment integration code is in ccrpRepo/wbc_fsm, under MJAmp State.
G1 AMP motion control project built on top of mjlab + rsl_rl.
Key features of this repository:
- A single policy learns both locomotion (walk/run) and recovery (fall-and-get-up)
- AMP discriminator regularizes motion style and priors
- Training and deployment pipelines are consistent, with direct ONNX policy export support
Core Idea
Instead of training separate policies for locomotion and recovery and switching between them, this project learns both capabilities in one unified policy.
Implementation highlights:
- Motion data split:
- Walk/Run data:
src/assets/motions/g1/amp/WalkandRun - Recovery data:
src/assets/motions/g1/amp/Recovery
- Walk/Run data:
- Delayed termination/reset:
- A subset of environments does not reset immediately after termination
- These environments receive a recovery window and reset states sampled from recovery clips
- Unified AMP training:
- One actor-critic + One AMP discriminator
- Velocity tracking, perturbation robustness, and recovery are learned together
This reduces discontinuities caused by policy switching and yields more consistent behavior.
Requirements
- Linux
- Python 3.11 (recommended)
- Working MuJoCo and GPU driver setup
Quick Start
1. Install
conda activate mjlab
cd AMP_mjlab
python -m pip install -e .
2. Apply mjlab Patch (Optional)
If you do not apply this patch, remove history_ordering configuration from the code.
What this patch does:
- It adds an option for how observation history is flattened: by time (
time) or by term (term). - Default mjlab behavior supports only
termordering.
Patch file:
mjlab_patch/mjlab/managers/observation_manager.py
Example command:
cp mjlab_patch/mjlab/managers/observation_manager.py \
/home/crp/miniconda3/envs/mjlab/lib/python3.11/site-packages/mjlab/managers/observation_manager.py
3. List Available Tasks
python scripts/list_envs.py --keyword AMP
Main tasks:
Unitree-G1-AMP-RoughUnitree-G1-AMP-Flat
Training
python scripts/train.py Unitree-G1-AMP-Flat --env.scene.num-envs=4096
Logs are saved by default to:
logs/rsl_rl/g1_amp_locomotion/<time_stamp_run>/
Training Curve Note (Important)
- Around
2witerations (about 20k), the policy often suddenly learns fall-recovery behavior. - As a result, multiple metrics in
logsmay show abrupt jumps. This is expected and not necessarily a training failure.

Evaluation and Visualization
Replay with a trained checkpoint:
python scripts/play.py Unitree-G1-AMP-Rough \
--checkpoint-file logs/rsl_rl/g1_amp_locomotion/<run_dir>/model_<iter>.pt
Note: ONNX export is enabled by default in both training and play workflows.
Motion Data Preparation
CSV-to-NPZ conversion script:
python scripts/csv_to_npz.py --help
Recommended data layout:
- Raw CSV:
motion_data_csv/amp - Converted NPZ:
src/assets/motions/g1/amp/WalkandRunandsrc/assets/motions/g1/amp/Recovery
If valid NPZ files exist in these folders, training config loads them automatically.
Repository Structure
src/tasks/amp_loco: AMP locomotion/recovery task implementationsrc/tasks/amp_loco/config/g1: G1 task registration, env configs, RL configssrc/tasks/amp_loco/mdp: rewards, observations, events, termination logicscripts/train.py: training entry pointscripts/play.py: playback entry pointscripts/csv_to_npz.py: motion data conversion toolmjlab_patch: required local patch for mjlab
Highlights
- One policy unifies walk/run and recovery skills
- AMP + velocity objective jointly optimize style and task performance
- Delayed reset with recovery sampling explicitly improves recovery ability
- End-to-end pipeline supports ONNX export for deployment
Acknowledgements
- Thanks to unitreerobotics/unitree_rl_mjlab for open-sourcing their work and inspiration.
- Thanks to Open-X-Humanoid/TienKung-Lab; the rsl_rl AMP part in this project references their implementation.