Unitree G1 - Phase 1: Baseline Locomotion (Frozen Arms)

This checkpoint contains a trained policy for the Unitree G1 humanoid robot that targets stable bipedal locomotion with the arms frozen.

Model Description

This is a PPO-trained locomotion policy for the Unitree G1 29-DOF humanoid robot, trained in NVIDIA Isaac Gym. Phase 1 focuses on establishing baseline walking behavior with arms held in a fixed position, allowing the legs and torso to learn stable locomotion patterns.

  • Training Framework: Isaac Gym (GPU-accelerated physics simulation)
  • Algorithm: Proximal Policy Optimization (PPO)
  • Robot: Unitree G1 (29 DOF humanoid)
  • Policy Type: Actor-Critic with continuous actions

Training Details

Configuration

  • DOF: 29 (12 legs + 3 waist + 14 arms)
  • Active DOF: 15 (legs + waist only, arms frozen)
  • Parallel Environments: 4096
  • Training Device: NVIDIA GeForce RTX 4080 SUPER
  • Total Iterations: 5,000
  • Training Time: ~1.2 hours
  • Training Speed: 4,253 iterations/hour (~89,000-94,000 steps/second)
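
The joint split and the reported training time can be cross-checked with a few lines of arithmetic. This is only a bookkeeping sketch: the group names are illustrative labels chosen here, not identifiers from the training code.

```python
# Joint groups as listed in the configuration above.
# The dictionary keys are illustrative labels, not names from the training code.
dof_groups = {"legs": 12, "waist": 3, "arms": 14}
frozen_groups = {"arms"}

total_dof = sum(dof_groups.values())
active_dof = sum(n for name, n in dof_groups.items() if name not in frozen_groups)

# Wall-clock estimate from the reported iteration count and iteration rate.
total_iterations = 5_000
iterations_per_hour = 4_253
training_hours = total_iterations / iterations_per_hour

print(total_dof, active_dof, round(training_hours, 2))
```

The estimate comes out at roughly 1.18 hours, in line with the ~1.2 hours reported.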

Training Command

python train_rl.py task=g1_curriculum_phase1 headless=true

Hyperparameters

  • Learning Rate: Default PPO settings
  • Batch Size: 4096 environments
  • Horizon Length: 24 steps
  • Discount Factor (gamma): 0.99
  • GAE Lambda: 0.95
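
To show how the discount factor and GAE lambda enter the advantage estimate, here is a minimal sketch of textbook Generalized Advantage Estimation for a single environment's rollout. It is not taken from the training repository, and the function name `compute_gae` and its argument layout are assumptions made for illustration. The last lines also compute how many transitions one iteration collects, assuming every environment contributes a full horizon.

```python
def compute_gae(rewards, values, last_value, dones, gamma=0.99, lam=0.95):
    """Textbook GAE for one environment (plain Python lists).

    dones[t] is True when the episode ended after step t, which stops
    bootstrapping from the following value estimate.
    """
    advantages = [0.0] * len(rewards)
    gae = 0.0
    next_value = last_value
    for t in reversed(range(len(rewards))):
        not_done = 0.0 if dones[t] else 1.0
        # One-step TD error, then the exponentially weighted sum of TD errors.
        delta = rewards[t] + gamma * next_value * not_done - values[t]
        gae = delta + gamma * lam * not_done * gae
        advantages[t] = gae
        next_value = values[t]
    returns = [a + v for a, v in zip(advantages, values)]
    return advantages, returns


# Transitions gathered per iteration with the settings listed above.
num_envs, horizon = 4096, 24
samples_per_iteration = num_envs * horizon

# Tiny example: constant reward, zero value estimates, no terminations.
adv, ret = compute_gae(
    rewards=[1.0, 1.0, 1.0],
    values=[0.0, 0.0, 0.0],
    last_value=0.0,
    dones=[False, False, False],
)
```

With lambda below 1, later TD errors are down-weighted by `gamma * lam` per step, which trades some bias for lower variance in the advantage estimate.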

Reward Design

  • Alive Reward: 5.0 (primary focus on survival and stability)
  • Arm Tracking: 0.0 (arms frozen in default position)
  • Base Height: Maintains upright posture
  • Orientation: Penalizes tilting
  • Linear Velocity Tracking: Encourages forward motion
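
The reward terms above are typically combined as a weighted sum per step. The sketch below illustrates that structure only. The weights for the alive and arm-tracking terms come from this card; every other weight, the term names, and the example term values are placeholders, since the card does not list them.

```python
def total_reward(terms, weights):
    """Weighted sum of per-step reward terms; terms without a weight count as 0."""
    return sum(weights.get(name, 0.0) * value for name, value in terms.items())


weights = {
    "alive": 5.0,             # from this model card
    "arm_tracking": 0.0,      # from this model card (arms frozen)
    "base_height": 1.0,       # placeholder, not from the training config
    "orientation": 1.0,       # placeholder, not from the training config
    "lin_vel_tracking": 1.0,  # placeholder, not from the training config
}

# Made-up per-step term values, purely for illustration.
terms = {
    "alive": 1.0,
    "arm_tracking": 0.7,
    "base_height": -0.1,
    "orientation": -0.2,
    "lin_vel_tracking": 0.5,
}

reward = total_reward(terms, weights)
```

Because the arm-tracking weight is 0.0, that term contributes nothing to the sum in Phase 1, whatever its raw value.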

Performance Metrics

| Metric | Value |
|---|---|
| Mean Reward | 69.46 |
| Episode Length | 836.7 steps (about 84% of the 1000-step maximum) |
| Alive Reward | 3.9894 |
| Noise Std | 3.78 |
| Checkpoint Iteration | 5000 |

Training Progress

  • Initial mean reward: ~0.0 (random policy)
  • Final mean reward: 69.46, reached through steady improvement
  • Episode lengths approached the 1000-step maximum (mean of 836.7 steps at the final checkpoint)
  • Training converged smoothly without reward collapse

Validation

  • ✅ Isaac Gym Playback: Robot walks stably with smooth forward locomotion and frozen arms
  • ✅ No Falls: No instability observed during policy rollout
  • ⏳ MuJoCo Sim2Sim: Validation pending

Usage

Prerequisites

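pip install torch
pip install -e isaacgym/python  # Isaac Gym is a separate NVIDIA download; adjust the path to where you unpacked it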

Load and Run Policy

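import torch

# Load the checkpoint onto the CPU so no GPU is needed just to inspect it
checkpoint = torch.load("model_5000.pt", map_location="cpu")

# Extract the policy entry; the 'model' key is a guess, so check
# checkpoint.keys() and adjust it to match the actual checkpoint structure
policy = checkpoint['model']

# Running inference additionally requires rebuilding the actor network
# and setting up the Isaac Gym environment (not shown here)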
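
Since the right key depends on how the checkpoint was saved, it can help to print its structure first. The helper below is a hypothetical utility, not part of the repository. It walks nested dictionaries and reports the shape of anything tensor-like, so it runs on plain Python objects as well as on the result of `torch.load`.

```python
def summarize_checkpoint(obj, prefix="", max_depth=3):
    """Return one line per entry: nested dict keys and tensor-like shapes."""
    lines = []
    if isinstance(obj, dict) and max_depth > 0:
        for key, value in obj.items():
            name = f"{prefix}{key}"
            if hasattr(value, "shape"):
                # Tensors and arrays expose .shape; report it as a tuple.
                lines.append(f"{name}: shape={tuple(value.shape)}")
            elif isinstance(value, dict):
                lines.append(f"{name}: dict with {len(value)} entries")
                lines.extend(summarize_checkpoint(value, name + ".", max_depth - 1))
            else:
                lines.append(f"{name}: {type(value).__name__}")
    return lines


# Typical use (requires PyTorch and the checkpoint file):
#   import torch
#   ckpt = torch.load("model_5000.pt", map_location="cpu")
#   print("\n".join(summarize_checkpoint(ckpt)))
```

The layer shapes in the output also reveal the observation and action dimensions the network expects.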

Observation Space

  • Base linear velocity
  • Base angular velocity
  • Projected gravity
  • Joint positions
  • Joint velocities
  • Previous actions
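
As a rough illustration, the terms above could be concatenated into a single observation vector as follows. The sizes are assumptions: 3 values for each base quantity and 29 values for each joint quantity, giving 96 in total. The actual policy may order or scale the terms differently, or include extra inputs, so check the checkpoint's input layer before relying on this layout. The function name `build_observation` is hypothetical.

```python
def build_observation(base_lin_vel, base_ang_vel, projected_gravity,
                      joint_pos, joint_vel, prev_actions):
    """Concatenate the observation terms in the order listed above."""
    parts = [base_lin_vel, base_ang_vel, projected_gravity,
             joint_pos, joint_vel, prev_actions]
    expected_sizes = [3, 3, 3, 29, 29, 29]  # assumed sizes, see the note above
    for part, size in zip(parts, expected_sizes):
        if len(part) != size:
            raise ValueError(f"expected {size} values, got {len(part)}")
    return [float(x) for part in parts for x in part]


# Robot standing still and upright: gravity points along -z in the base frame.
obs = build_observation(
    [0.0] * 3, [0.0] * 3, [0.0, 0.0, -1.0],
    [0.0] * 29, [0.0] * 29, [0.0] * 29,
)
```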

Action Space

  • 29 continuous actions (normalized joint position targets)
  • Only 15 DOF are active (legs + waist)
  • Arms receive zero-gain actions (effectively frozen)
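
One way to read the action handling above is as a per-joint scale applied around a default pose, with the scale set to zero for the arms. The sketch below follows that reading. The joint ordering (legs, then waist, then arms), the `ACTION_SCALE` value, and the function name are assumptions for illustration and are not taken from the training config.

```python
NUM_DOF = 29
NUM_ACTIVE = 15      # assumed ordering: 12 leg joints, 3 waist joints, then 14 arm joints
ACTION_SCALE = 0.25  # hypothetical value, not from the training config


def actions_to_targets(actions, default_pos, action_scale=ACTION_SCALE):
    """Map normalized actions to joint position targets around a default pose."""
    if len(actions) != NUM_DOF or len(default_pos) != NUM_DOF:
        raise ValueError(f"expected {NUM_DOF} actions and default positions")
    targets = []
    for i, (action, q_default) in enumerate(zip(actions, default_pos)):
        # Arm joints get zero gain, so their target stays at the default pose.
        gain = action_scale if i < NUM_ACTIVE else 0.0
        targets.append(q_default + gain * action)
    return targets


targets = actions_to_targets([1.0] * NUM_DOF, [0.1] * NUM_DOF)
```

In this example the first 15 targets move away from the default pose while the last 14 stay at 0.1, regardless of what the policy outputs for the arms.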

Limitations

  • Arms are frozen in default position (no manipulation capability)
  • Trained only for forward locomotion on flat terrain
  • No arm tracking or upper-body control
  • Sim-to-real transfer not yet validated

Next Steps

This checkpoint serves as the foundation for Phase 2 (Arm Awakening), where arm control is gradually introduced while maintaining locomotion stability.

Citation

If you use this model, please cite:

@misc{unitree_g1_phase1,
  title={Unitree G1 Phase 1: Baseline Locomotion},
  author={PathonAI},
  year={2025},
  howpublished={\url{https://huggingface.co/[your-username]/unitree-g1-phase1}},
}

Training Logs

  • Run ID: Dec12_16-19-24_
  • TensorBoard Logs: Available in training repository
  • Full Training Log: See EXPERIMENTS_LOG.md in repository

License

MIT License - See repository for full license details
