RLinf/RLinf-OpenVLAOFT-GRPO-LIBERO-object
Reinforcement Learning • 8B • Updated • 73
WoVR: World Models as Reliable Simulators for Post-Training VLA Policies with RL
RLinf-Co: Reinforcement Learning-Based Sim-Real Co-Training for VLA Models