Kairos 3.0

💜 Kairos Platform | 🖥️ GitHub | 🤗 Hugging Face | 🤖 ModelScope | 📑 Paper

Kairos 3.0 is a unified cross-embodiment world modeling framework that takes physical laws as its cognitive foundation. Its 4B-parameter architecture, built on a custom hybrid linear attention operator, unifies multimodal understanding, generation, and action prediction, and targets real-time edge deployment. Physics-grounded modeling combined with low-latency inference supports high-precision action prediction and HD generation for both physical and digital embodied AI applications.

🎯 1. Motivation

While scaling laws are emerging in embodied AI, their efficiency is severely bottlenecked by data heterogeneity, poor long-horizon reasoning, and edge-side compute constraints. These hurdles make scaling alone insufficient for reliable interaction and slow the path to industrial-grade general embodied intelligence.

🌟 2. Kairos 3.0 Framework

🌍 Unified World Modeling Framework

Kairos 3.0 uses fundamental physical and causal laws as its cognitive foundation. By integrating real-robot interaction, structured human behavior, and Chain-of-Thought (CoT) data, it breaks heterogeneity barriers and boosts data reuse efficiency. This shifts the paradigm from simple imitation to physics-level deep understanding, enabling robust generalization and long-horizon reasoning at a more efficient model scale.

🔗 Integrated Multimodal Architecture

Kairos 3.0 is designed as a unified end-to-end pipeline for understanding, generating, and predicting the world. Leveraging physical laws and causal CoT, the model does not just "see" an environment but "understands" its underlying logic. This allows precise decomposition of complex tasks, seamless planning, and reliable execution in a single intelligence loop.

⚡ Linear-Time Attention for World Models

Kairos 3.0 introduces the first Hybrid Linear Attention operator designed specifically for world models. By reducing the time complexity of attention from $O(n^2)$ to $O(n)$ in the sequence length $n$, it cuts VRAM and compute overhead while retaining long-sequence capability. This enables the industry’s first real-time on-robot inference for an open-source world model.
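
As a rough illustration of where an $O(n)$ cost can come from, the generic kernelized form of linear attention is sketched below. This is the textbook construction, not necessarily the Kairos operator: the feature map $\phi$ (assumed to keep the dimension $d$), the single-head non-causal setting, and the symbols $S$ and $z$ are assumptions made for this example, and how the hybrid operator mixes linear and standard attention is not specified here.

$$
\begin{aligned}
\text{softmax attention:}\quad & o_i^\top = \sum_{j=1}^{n} \frac{\exp\!\left(q_i^\top k_j / \sqrt{d}\right)}{\sum_{l=1}^{n} \exp\!\left(q_i^\top k_l / \sqrt{d}\right)}\, v_j^\top && \Rightarrow\ O(n^2 d) \\
\text{linear attention:}\quad & o_i^\top = \frac{\phi(q_i)^\top S}{\phi(q_i)^\top z}, \quad S = \sum_{j=1}^{n} \phi(k_j)\, v_j^\top, \quad z = \sum_{j=1}^{n} \phi(k_j) && \Rightarrow\ O(n d^2)
\end{aligned}
$$

Because $S$ and $z$ are computed once and shared by every query $q_i$, the total cost grows linearly with $n$ instead of quadratically.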

✨ 3. Demos

🧠 Physical–causal consistency

Kairos leverages causal CoT and physical laws to transform multimodal inputs into deep task logic. It enables autonomous planning and feasibility analysis, shifting the system from "executing commands" to "understanding intent" for real-world robotic actions.

🎨 Cross-embodiment generalization

Unified Cross-Embodiment Generation: A single "brain" generalizes across single-arm, dual-arm, and dexterous-hand platforms, so world knowledge is shared and transferable between embodiments.

Broad Hardware Support: Native compatibility with Agibot G1, Unitree G1, and Songling PIPER, with zero-shot multi-task generalization that significantly reduces development costs.

🔮 High-efficiency inference

Real-time Edge Performance: Industry-leading inference speed with ultra-low resource consumption. Optimized for low-latency, high-reliability deployment across single or multi-GPU embodied systems.

📦 4. Model Zoo

Download Links	Model Version	Use cases	Highlights
🤗HuggingFace 🤖ModelScope	kairos-4B-480P	480p general pretrained model	480p pretrained model for downstream fine-tuning.
🤗HuggingFace 🤖ModelScope	kairos-4B-robot-480P	Robot manipulation & real-world closed-loop control	Specialized for embodied AI; leading accuracy on PAI-Bench-robot.
🤗HuggingFace 🤖ModelScope	kairos-4B-robot-480P-distillation	On-robot integration, edge computing, low-power efficiency	Ultra-lightweight via distillation; enables real-time inference on embedded/edge devices.
🤗HuggingFace 🤖ModelScope	kairos-4B-720p	HD visual generation & complex physical reasoning	Supports 720P HD output with enhanced fine-grained detail capture.

📈 5. Evaluation

🎯 5.1 Accuracy Benchmarks

Domain	Benchmarks	Kairos-Robot	Cosmos 2.5-2B*	Wan 2.2-5B*	Cosmos 2.5-14B*	Lingbot*
Robot	PAI-Bench-robot	80.03	78.3	78.6	79.4	79.96
	WorldModelBench-robot TI2V	9.08	9.04	8.52	8.94	9.04
	DreamGen Bench(PA/IF)	0.529/0.609	0.418/0.568	0.314/0.543	0.495/0.478	0.466/0.569

Domain	Benchmarks	Kairos 3.0-4B	Cosmos 2.5-2B*	Wan 2.2-5B*	Cosmos 2.5-14B
General	PAI-Bench	80.84	81.0	80.4	81.0
	WorldModelBench	8.94	8.86	8.70	9.02*
	VideoPHY	45.55	44.64	38.85	-

* Results reproduced from open-source model baselines. "robot" refers to results on the robot subset of the corresponding benchmark.

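Kairos models are competitive with or ahead of larger baselines across these benchmarks. In embodied scenarios, Kairos-Robot posts the highest score in every row: 80.03 on PAI-Bench-robot (narrowly ahead of Lingbot at 79.96), 9.08 on WorldModelBench-robot TI2V, and 0.529/0.609 on DreamGen Bench. For general world modeling, Kairos 3.0-4B scores highest on VideoPHY (45.55) and beats Cosmos 2.5-2B and Wan 2.2-5B on WorldModelBench (8.94, behind Cosmos 2.5-14B at 9.02). On PAI-Bench it scores 80.84, above Wan 2.2-5B (80.4) and within 0.2 points of both Cosmos models (81.0), all at a compact 4B scale.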

⚡ 5.2 Deployment

5.2.1 Real-time Inference

GPU	Resolution	Memory (GB)	1 GPU (s)	4 GPUs (s)
NV-A800	480P	23.5	11.7	3.0
NV-RTX5090	480P	13.9	11.4	5.7

* Results based on kairos-4B-robot-480P-distillation.

5.2.2 Benchmark for A800 GPU

Model	Parameters	Memory (GB)	Complexity (PFLOPs)	1 GPU (s)	4 GPUs (s)
Kairos 3.0	4B	23.5	2.3	43.3	9.5
Cosmos 2.5	14B	70.2	156.5 (~70x)	2526.0	687.2
Wan 2.2	5B	23.4	16.6 (~7x)	201.0	85.0
Lingbot	28B	46.1	347.4 (~160x)	5525.0	1436.0

* Evaluation setting: TI2V mode with 720P/5s.
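
The ratios implied by the A800 table can be recomputed directly from its rows. The sketch below does that with `awk`; the numbers are copied from the table, while the helper name `ratio` and the choice of one decimal place are assumptions made for this example.

```shell
#!/usr/bin/env sh
# ratio A B: print A / B rounded to one decimal place.
ratio() {
  awk -v a="$1" -v b="$2" 'BEGIN { printf "%.1f\n", a / b }'
}

# Kairos 3.0 reference values from the A800 table (TI2V, 720P/5s).
kairos_pflops=2.3
kairos_1gpu=43.3
kairos_4gpu=9.5

# Compute cost of each baseline relative to Kairos 3.0.
echo "Complexity vs Kairos: Cosmos 2.5 $(ratio 156.5 "$kairos_pflops")x, Wan 2.2 $(ratio 16.6 "$kairos_pflops")x, Lingbot $(ratio 347.4 "$kairos_pflops")x"

# Single-GPU generation time of each baseline relative to Kairos 3.0.
echo "1-GPU time vs Kairos: Cosmos 2.5 $(ratio 2526.0 "$kairos_1gpu")x, Wan 2.2 $(ratio 201.0 "$kairos_1gpu")x, Lingbot $(ratio 5525.0 "$kairos_1gpu")x"

# Kairos 3.0 speedup when going from 1 GPU to 4 GPUs.
echo "Kairos 1 GPU -> 4 GPUs: $(ratio "$kairos_1gpu" "$kairos_4gpu")x"
```

With the rounded table values, the complexity ratios come out to 68.0x, 7.2x, and 151.0x. The first two agree with the ~70x and ~7x annotations; the ~160x figure for Lingbot is somewhat higher than 151.0x and presumably was computed from unrounded measurements.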

🔧 6. Quick Start

6.1 Environment Installation

# Clone the repository
git clone https://github.com/kairos-agi/kairos-sensenova.git
cd kairos-sensenova

# You can set up the environment in two ways:
# 1) Build container from the Docker image
# 2) Build the environment from requirements with conda or venv

# 1) Docker image:
# Note:
# Please select the Docker image that matches your GPU platform.
# The default image is for A800 / A100, while RTX 5090 requires the -rtx5090 image tag, and METAX C500 requires the -metax tag.

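# Log in to GHCR, then pull the Docker image
# (replace `username` with your own account name; GHCR_TOKEN must hold your access token)
# For A800 / A100
echo "$GHCR_TOKEN" | docker login ghcr.io -u username --password-stdin
docker pull ghcr.io/kairos-agi/kairos-sensenova:v0.0.1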

# For RTX 5090
# docker pull ghcr.io/kairos-agi/kairos-sensenova:v0.0.1-rtx5090

# For METAX C500
# docker pull ghcr.io/kairos-agi/kairos-sensenova:v0.0.1-metax

# Create a container using Docker (A800 / A100 image)
# The current directory is mounted at /workspace; the path is quoted so that
# directories containing spaces still work.
docker run --rm -it \
  --gpus all \
  -v "$(pwd)":/workspace \
  ghcr.io/kairos-agi/kairos-sensenova:v0.0.1 \
  bash

# For RTX 5090
# docker run --rm -it \
#   --gpus all \
#   -v "$(pwd)":/workspace \
#   ghcr.io/kairos-agi/kairos-sensenova:v0.0.1-rtx5090 \
#   bash

# For METAX C500
# docker run --rm -it \
#   --gpus all \
#   -v "$(pwd)":/workspace \
#   ghcr.io/kairos-agi/kairos-sensenova:v0.0.1-metax \
#   bash

# 2) Requirements
# Build a Python environment with python>=3.10, torch>=2.6, and cuda>=12.6.
# Note: METAX C500 is not supported by this setup method. For METAX C500, please use the Docker image only.
# Install the requirements:
pip install -r requirements.txt
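
After installing, it can be worth checking the environment against the minimums stated above (python>=3.10, torch>=2.6, cuda>=12.6). The following is a minimal sketch, not a script shipped with the repository: `version_ge` and `check` are hypothetical helpers, `sort -V` is assumed to be available (GNU coreutils or BusyBox), and a missing interpreter or package is simply reported as version `0`.

```shell
#!/usr/bin/env sh
# version_ge A B: succeed when dotted version A is >= B (uses `sort -V`).
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# check NAME FOUND REQUIRED: print a one-line verdict for a component.
check() {
  if version_ge "$2" "$3"; then
    echo "ok:   $1 $2 (>= $3)"
  else
    echo "FAIL: $1 $2 (need >= $3)"
  fi
}

# Query versions; fall back to 0 when python3 or torch is not installed.
py_ver=$(python3 -c 'import platform; print(platform.python_version())' 2>/dev/null || echo 0)
torch_ver=$(python3 -c 'import torch; print(torch.__version__.split("+")[0])' 2>/dev/null || echo 0)
cuda_ver=$(python3 -c 'import torch; print(torch.version.cuda or 0)' 2>/dev/null || echo 0)

check python "$py_ver" 3.10
check torch "$torch_ver" 2.6
check cuda "$cuda_ver" 12.6
```

Comparing with `sort -V` rather than plain string comparison matters here because `3.9` sorts after `3.10` lexically even though it is the older release.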

6.2 Download Kairos Models

Download with Hugging Face

pip install -U huggingface_hub 

# 4B-480P
hf download kairos-agi/kairos-sensenova-4B-480P-pretrained \
  --local-dir models/Kairos-model/kairos-sensenova-4B-480P-pretrained 

# 4B-720P
hf download kairos-agi/kairos-sensenova-4B-720P \
  --local-dir models/Kairos-model/kairos-sensenova-4B-720P 

# 4B-robot
hf download kairos-agi/kairos-sensenova-robot-4B-480P \
  --local-dir models/Kairos-model/kairos-sensenova-robot-4B-480P

# 4B-robot distilled
hf download kairos-agi/kairos-sensenova-robot-4B-480P-distilled \
  --local-dir models/Kairos-model/kairos-sensenova-robot-4B-480P-distilled

Download with ModelScope

pip install modelscope

# 4B-480P
modelscope download kairos-team/kairos-sensenova-4B-480P-pretrained \
  --local_dir models/Kairos-model/kairos-sensenova-4B-480P-pretrained 

# 4B-720P
modelscope download kairos-team/kairos-sensenova-4B-720P \
  --local_dir models/Kairos-model/kairos-sensenova-4B-720P 

# 4B-robot
modelscope download kairos-team/kairos-sensenova-robot-4B-480P \
  --local_dir models/Kairos-model/kairos-sensenova-robot-4B-480P

# 4B-robot distilled
modelscope download kairos-team/kairos-sensenova-robot-4B-480P-distilled \
  --local_dir models/Kairos-model/kairos-sensenova-robot-4B-480P-distilled

6.3 Run Inference

Note: Please complete Section 6.2 first to download the Kairos model weights.

# Step 1: Download additional dependencies for inference
mkdir -p models/Qwen models/Wan2.1-T2V-14B

# Download Qwen2.5-VL for Text-Encoder
hf download Qwen/Qwen2.5-VL-7B-Instruct-AWQ \
  --local-dir models/Qwen/Qwen2.5-VL-7B-Instruct-AWQ \
  --include "*.safetensors"  
  
# Download Wan2.1-VAE for VAE-Encoder/Decoder
hf download Wan-AI/Wan2.1-T2V-14B \
  --local-dir models/Wan2.1-T2V-14B \
  --include "Wan2.1_VAE.pth"  

# Step 2: Run the examples
# The example JSON files provided here are intended for the
# `kairos-sensenova-robot-4B-480P-distilled` model.
#
# For TI2V and I2V, please use the 480P JSON configs. This distilled model is
# optimized for 480P, and its performance at 720P is not ideal.
#
# For other models and the matching configs, please refer to `docs/QUICKSTART.md`.
# Text2Video
bash examples/inference.sh examples/example_t2v_480P.json
# Text&FirstImage2Video
bash examples/inference.sh examples/example_ti2v_480P.json
# FirstImage2Video
bash examples/inference.sh examples/example_i2v_480P.json
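
Before launching the examples, it can help to confirm that the assets from Section 6.2 and the dependency downloads are where the commands above place them. The sketch below assumes the `--local-dir` paths used in this README and that it is run from the repository root. `missing_paths` is a hypothetical helper, not part of the repository, and the check does not know which paths the example JSON configs actually read, so treat it as a sanity check only.

```shell
#!/usr/bin/env sh
# missing_paths ROOT PATH...: print every PATH that does not exist under ROOT.
missing_paths() {
  root=$1
  shift
  for p in "$@"; do
    [ -e "$root/$p" ] || echo "$p"
  done
}

# Paths taken from the download commands in this README.
missing=$(missing_paths models \
  "Kairos-model/kairos-sensenova-robot-4B-480P-distilled" \
  "Qwen/Qwen2.5-VL-7B-Instruct-AWQ" \
  "Wan2.1-T2V-14B/Wan2.1_VAE.pth")

if [ -n "$missing" ]; then
  printf 'Missing under models/:\n%s\n' "$missing"
else
  echo "All expected model assets found."
fi
```

If a different Kairos checkpoint is used, swap the first path for that model's directory.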

👥 7. About Us

Developed and maintained by the Kairos Team. We specialize in Embodied Intelligence and World Model research, with a mission to build Artificial General Intelligence (AGI) that truly understands the physical world. Our goal is to accelerate the industrialization of embodied technologies and reshape the global landscape of AI competition.

📄 8. License

Kairos is open-sourced under the Apache License 2.0. Feel free to use, modify, and build commercial products on top of it. Check the LICENSE file for the full text.

9. Acknowledgements

We would like to thank the contributors to Qwen-Image, Wan2.1, DiffSynth-Studio and HuggingFace for their open-source research contributions.

⭐ Star us on GitHub if you find Kairos 3.0 helpful!