Unitree G1 - Phase 1: Baseline Locomotion (Frozen Arms)

This checkpoint contains a trained policy for the Unitree G1 humanoid robot that targets stable bipedal locomotion with the arms frozen.

Model Description

This is a PPO-trained locomotion policy for the Unitree G1 29-DOF humanoid robot, trained in NVIDIA Isaac Gym. Phase 1 focuses on establishing baseline walking behavior with arms held in a fixed position, allowing the legs and torso to learn stable locomotion patterns.

Training Framework: Isaac Gym (GPU-accelerated physics simulation)
Algorithm: Proximal Policy Optimization (PPO)
Robot: Unitree G1 (29 DOF humanoid)
Policy Type: Actor-Critic with continuous actions

Training Details

Configuration

DOF: 29 (12 legs + 3 waist + 14 arms)
Active DOF: 15 (legs + waist only, arms frozen)
Parallel Environments: 4096
Training Device: NVIDIA GeForce RTX 4080 SUPER
Total Iterations: 5,000
Training Time: ~1.2 hours
Training Speed: 4,253 iterations/hour (~89,000-94,000 steps/second)

Training Command

python train_rl.py task=g1_curriculum_phase1 headless=true

Hyperparameters

Learning Rate: Default PPO settings
Batch Size: 4096 environments
Horizon Length: 24 steps
Discount Factor (gamma): 0.99
GAE Lambda: 0.95
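
To make the roles of the discount factor and GAE lambda concrete, here is a minimal sketch of generalized advantage estimation over a single rollout, using the values listed above. The function name compute_gae and the toy rewards and values are illustrative assumptions and are not taken from the training repository; episode terminations inside the rollout are ignored to keep the sketch short.

```python
def compute_gae(rewards, values, last_value, gamma=0.99, lam=0.95):
    """Generalized advantage estimation for one environment's rollout.

    rewards, values: per-step lists of equal length.
    last_value: value estimate for the state after the final step.
    Assumes no episode ends inside the rollout (simplification).
    """
    advantages = [0.0] * len(rewards)
    gae = 0.0
    next_value = last_value
    # Walk the rollout backwards, accumulating discounted TD errors.
    for t in reversed(range(len(rewards))):
        delta = rewards[t] + gamma * next_value - values[t]
        gae = delta + gamma * lam * gae
        advantages[t] = gae
        next_value = values[t]
    return advantages


# Toy 3-step rollout with hypothetical numbers.
adv = compute_gae([1.0, 1.0, 1.0], [0.0, 0.0, 0.0], last_value=0.0)
```

With a horizon of 24 steps, the same backward pass would run over 24 entries per environment.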

Reward Design

Alive Reward: 5.0 (primary focus on survival and stability)
Arm Tracking: 0.0 (arms frozen in default position)
Base Height: Maintains upright posture
Orientation: Penalizes tilting
Linear Velocity Tracking: Encourages forward motion
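
As a rough illustration of how such terms are typically combined, the sketch below computes a per-step reward as a weighted sum. Only the alive (5.0) and arm tracking (0.0) weights come from the list above; the other weights, the term names, and the example term values are hypothetical placeholders, and the actual training code may scale or shape the terms differently.

```python
# "alive" and "arm_tracking" weights are from the model card; the remaining
# weights are hypothetical placeholders, not the trained configuration.
REWARD_WEIGHTS = {
    "alive": 5.0,
    "arm_tracking": 0.0,
    "base_height": 1.0,       # hypothetical
    "orientation": 1.0,       # hypothetical
    "lin_vel_tracking": 1.0,  # hypothetical
}


def total_reward(terms, weights=REWARD_WEIGHTS):
    """Weighted sum of per-step reward terms; missing terms count as 0."""
    return sum(weights[name] * terms.get(name, 0.0) for name in weights)


# Hypothetical per-step term values: the arm term is ignored (weight 0.0).
step_terms = {"alive": 1.0, "arm_tracking": 0.7, "orientation": -0.1}
r = total_reward(step_terms)
```

Setting a weight to 0.0, as for arm tracking here, removes that term's influence without changing the rest of the reward code.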

Performance Metrics

Metric	Value
Mean Reward	69.46
Episode Length	836.7 steps (83.7% of the 1000-step maximum)
Alive Reward	3.9894
Noise Std	3.78
Checkpoint Iteration	5000

Training Progress

Initial mean reward: ~0.0 (random policy)
Final mean reward: ~69.46 (steady improvement)
Mean episode length rose to 836.7 steps against a maximum of 1000 steps
Training converged smoothly without reward collapse

Validation

✅ Isaac Gym Playback: Robot walks stably with smooth forward locomotion and frozen arms
✅ No Falls: No instability observed during policy rollout
⏳ MuJoCo Sim2Sim: Validation pending

Usage

Prerequisites

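pip install torch
# Isaac Gym is not distributed through PyPI. Download the Isaac Gym Preview
# package from NVIDIA, then install it from its python directory:
cd isaacgym/python && pip install -e .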

Load and Run Policy

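import torch

# Load the checkpoint onto the CPU so it can be inspected without a GPU
checkpoint = torch.load("model_5000.pt", map_location="cpu")

# Inspect the top-level keys first; the 'model' key below is an assumption
print(list(checkpoint.keys()))

# Extract the policy weights (likely a state_dict rather than a callable network)
policy_state = checkpoint['model']  # Adjust key based on checkpoint structure

# Inference requires rebuilding the actor network, loading these weights,
# and stepping it inside an Isaac Gym environment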

Observation Space

Base linear velocity
Base angular velocity
Projected gravity
Joint positions
Joint velocities
Previous actions
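
The sketch below shows one way these terms could be concatenated into a flat observation vector. The ordering, the per-term sizes (3 values for each base quantity, 29 for each joint quantity), and the absence of scaling or command inputs are all assumptions; the card does not state the actual observation dimension.

```python
NUM_DOF = 29  # from the configuration section


def build_observation(base_lin_vel, base_ang_vel, projected_gravity,
                      joint_pos, joint_vel, prev_actions):
    """Concatenate the listed terms into one flat list of floats."""
    parts = [base_lin_vel, base_ang_vel, projected_gravity,
             joint_pos, joint_vel, prev_actions]
    expected = [3, 3, 3, NUM_DOF, NUM_DOF, NUM_DOF]  # assumed sizes
    for part, size in zip(parts, expected):
        if len(part) != size:
            raise ValueError(f"expected {size} values, got {len(part)}")
    return [float(x) for part in parts for x in part]


# Robot standing upright and at rest: gravity points along -z in the base frame.
obs = build_observation([0.0] * 3, [0.0] * 3, [0.0, 0.0, -1.0],
                        [0.0] * NUM_DOF, [0.0] * NUM_DOF, [0.0] * NUM_DOF)
```

Under these assumptions the vector has 3 + 3 + 3 + 29 + 29 + 29 = 96 entries.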

Action Space

29 continuous actions (normalized joint position targets)
Only 15 DOF are active (legs + waist)
Arms receive zero-gain actions (effectively frozen)
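
One reading of "zero-gain actions" is that the per-joint action scale is zero for the arm joints, so their position targets never leave the default pose. The sketch below illustrates that reading. The joint ordering (legs and waist first, arms last), the action_scale value, and the function name are hypothetical and should be checked against the training configuration.

```python
NUM_DOF = 29
NUM_ACTIVE = 15  # 12 leg joints + 3 waist joints

# Assumed joint order: active joints first, the 14 arm joints last.
ACTIVE_MASK = [1.0] * NUM_ACTIVE + [0.0] * (NUM_DOF - NUM_ACTIVE)


def joint_targets(actions, default_pos, action_scale=0.25):
    """Map normalized actions to joint position targets.

    Arm joints get a zero action gain, so their target stays at the
    default position regardless of the policy output.
    action_scale is a hypothetical value.
    """
    targets = []
    for i in range(NUM_DOF):
        gain = action_scale * ACTIVE_MASK[i]
        targets.append(default_pos[i] + gain * actions[i])
    return targets


# Even a full-scale action on every joint leaves the arm targets unchanged.
targets = joint_targets([1.0] * NUM_DOF, [0.1] * NUM_DOF)
```

In this reading the policy still outputs 29 values, but only the first 15 affect the commanded joint positions.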

Limitations

Arms are frozen in default position (no manipulation capability)
Trained only for forward locomotion on flat terrain
No arm tracking or upper-body control
Sim-to-real transfer not yet validated

Next Steps

This checkpoint serves as the foundation for Phase 2 (Arm Awakening), where arm control is gradually introduced while maintaining locomotion stability.
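
A gradual introduction of arm control could, for example, be expressed as a schedule that ramps an arm action gain from 0.0 (frozen, as in Phase 1) to 1.0. The sketch below is purely illustrative: the linear shape, the function name, and the ramp_start and ramp_iters parameters are assumptions and do not describe the actual Phase 2 design.

```python
def arm_action_gain(iteration, ramp_start=0, ramp_iters=1000):
    """Linear ramp from 0.0 (arms frozen) to 1.0 (full arm control).

    ramp_start and ramp_iters are hypothetical schedule parameters.
    """
    progress = (iteration - ramp_start) / ramp_iters
    return min(1.0, max(0.0, progress))


# Gain at the start, midpoint, and well past the end of the ramp.
gains = [arm_action_gain(i) for i in (0, 500, 2000)]
```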

Citation

If you use this model, please cite:

@misc{unitree_g1_phase1,
  title={Unitree G1 Phase 1: Baseline Locomotion},
  author={PathonAI},
  year={2025},
  howpublished={\url{https://huggingface.co/[your-username]/unitree-g1-phase1}},
}

Training Logs

Run ID: Dec12_16-19-24_
TensorBoard Logs: Available in training repository
Full Training Log: See EXPERIMENTS_LOG.md in repository

License

MIT License - See repository for full license details
