πTraffic-R1-3B (Public 0.1) π¦
Traffic-R1 is a foundational LLM built specifically for traffic signal control. This publicly available version, Traffic-R1-3B (Public 0.1), delivers superior zero-shot performance and stable generalization, allowing it to reason like a human traffic expert. π§
This model is a checkpoint based on the research in our paper:
Traffic-R1: Reinforced LLMs Bring Human-Like Reasoning to Traffic Signal Control Systems π https://arxiv.org/abs/2508.02344
Introduction Video
Abstract
Traffic signal control (TSC) is vital for mitigating congestion and sustaining urban mobility. In this paper, we introduce Traffic-R1, a foundation model with human-like reasoning for TSC systems. Our model is developed through self-exploration and iteration of reinforced large language models (LLMs) with expert guidance in a simulated traffic environment. Compared to traditional reinforcement learning (RL) and recent LLM-based methods, Traffic-R1 offers three significant advantages. First, Traffic-R1 delivers zero-shot generalisation, transferring unchanged to new road networks and out-of-distribution incidents by utilizing its internal traffic control policies and human-like reasoning. Second, its 3B-parameter architecture is lightweight enough for real-time inference on mobile-class chips, enabling large-scale edge deployment. Third, Traffic-R1 provides an explainable TSC process and facilitates multi-intersection communication through its self-iteration and a new synchronous communication network. Extensive benchmarks demonstrate that Traffic-R1 sets a new state of the art, outperforming strong baselines and training-intensive RL controllers. In practice, the model now manages signals for more than 55,000 drivers daily, shortening average queues by over 5% and halving operator workload.
Compatibility & Reproducibility π οΈ
This model supports a wide range of deployment methods compatible with the Qwen architecture, including those provided by the transformers library. You can easily use it in a chat mode to interactively discuss traffic-related scenarios.
For more detailed information on deployment, please refer to the official Qwen documentation.
The model is compatible with the signal control evaluation code provided by LLMLight [https://github.com/usail-hkust/LLMTSCS].
You can easily reproduce our results with minor changes to the prompt format.
Quick Reproduction Steps
Clone the Repository:
git clone https://github.com/usail-hkust/LLMTSCSCreate a Related Environment: Follow the instructions in the repository to set up the necessary environment.
Run the Script:
python run_open_LLM_with_vllm.py \ --llm_model LLM_MODEL_NAME_ONLY_FOR_LOG \ --llm_path LLM_PATH (input the location of Traffic-R1's checkpoint) \ --dataset hangzhou \ --traffic_file anon_4_4_hangzhou_real.json \ --proj_name TSCS
Important Note: The format prompt of LLMLight is inconsistent with Traffic-R1. For more stable results, consider making changes to LLMLight's format prompt codes.
A big thanks to these excellent projects! π
Future Releases π
We are working on upgrading base mode Qwen 2.5-> Qwen 3/VL for the latest features.
Important Notice β οΈ
This is an earlier checkpoint and doesn't include all the data samples from our offline pretraining stage. We've done this to address commercial and privacy concerns. We will release updates as the model continues to be upgraded internally. π
Model tree for Season998/Traffic-R1
Base model
Qwen/Qwen2.5-3B