Roni Itkin, Noam Issachar, Yehonatan Keypur, Anpei Chen, Sagie Benaim
4/15/2026
RAD-2: Scaling Reinforcement Learning in a Generator-Discriminator Framework
TL;DR
RAD-2 combines diffusion-based trajectory generation with RL-optimized reranking for motion planning in autonomous driving. The framework reports a 56% reduction in collision rate, attributed to temporally consistent policy optimization and structured feedback mechanisms. Real-world deployment shows improved safety and smoothness in urban traffic scenarios.
- Generator-discriminator framework pairs diffusion models that propose trajectory candidates with RL-based quality assessment that reranks them
- A novel policy optimization method (TCGRPO) and on-policy generator updates reduce the collision rate by 56%
- Bird's Eye View simulation enables efficient large-scale training, with real-world validation
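The generate-then-rerank idea in the summary can be illustrated with a toy sketch. This is not the RAD-2 implementation: the noisy straight-line sampler stands in for the diffusion generator, the hand-written `score` function stands in for the RL-trained discriminator, and every name and constant here (`sample_candidates`, `score`, `plan`, `safe_radius`, the penalty weight) is hypothetical.

```python
import math
import random


def sample_candidates(start, goal, n, rng, steps=10, noise=0.5):
    """Stand-in generator (assumption): noisy straight-line paths, not a diffusion model."""
    candidates = []
    for _ in range(n):
        path = []
        for t in range(steps + 1):
            a = t / steps
            # The sine envelope keeps lateral jitter near zero at both endpoints.
            jitter = rng.gauss(0.0, noise) * math.sin(math.pi * a)
            x = start[0] + a * (goal[0] - start[0])
            y = start[1] + a * (goal[1] - start[1]) + jitter
            path.append((x, y))
        candidates.append(path)
    return candidates


def score(path, obstacles, safe_radius=1.0):
    """Stand-in discriminator (assumption): hand-written score, higher is better."""
    collision_penalty = 0.0
    for x, y in path:
        for ox, oy in obstacles:
            d = math.hypot(x - ox, y - oy)
            if d < safe_radius:
                collision_penalty += safe_radius - d
    # Sum of absolute second differences in y as a crude smoothness term.
    roughness = sum(
        abs(path[i + 1][1] - 2 * path[i][1] + path[i - 1][1])
        for i in range(1, len(path) - 1)
    )
    return -(10.0 * collision_penalty + roughness)


def plan(start, goal, obstacles, n=32, seed=0):
    """Generate n candidates, then return the one the scorer ranks highest."""
    rng = random.Random(seed)
    candidates = sample_candidates(start, goal, n, rng)
    return max(candidates, key=lambda p: score(p, obstacles))


if __name__ == "__main__":
    best = plan(start=(0.0, 0.0), goal=(10.0, 0.0), obstacles=[(5.0, 0.0)])
    print(len(best), "waypoints; score", score(best, [(5.0, 0.0)]))
```

In the framework described above, the scorer would be learned with RL and the generator updated on-policy; this sketch only shows the division of labor between proposing candidates and selecting among them.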
