Roni Itkin, Noam Issachar, Yehonatan Keypur, Anpei Chen, Sagie Benaim
4/15/2026

RAD-2: Scaling Reinforcement Learning in a Generator-Discriminator Framework

TL;DR

RAD-2 combines diffusion-based trajectory generation with RL-optimized reranking for motion planning in autonomous driving. The framework reduces the collision rate by 56% through temporal consistency in policy optimization and structured feedback mechanisms. Real-world deployment confirms improved safety and smoothness in urban traffic scenarios.

  • Generator-discriminator framework pairs diffusion models for trajectory candidates with RL-based quality assessment
  • A novel policy optimization method (TCGRPO) and on-policy generator updates reduce the collision rate by 56%
  • Bird's Eye View simulation enables efficient large-scale training with real-world validation
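
The generate-then-rerank pattern summarized above can be pictured with a toy sketch. This is not the RAD-2 method: the diffusion generator is replaced by randomly bowed straight-line paths, and the RL-trained discriminator by a hand-written score. All names (`sample_candidates`, `score`, `plan`) and parameters are hypothetical and chosen only for illustration.

```python
import math
import random


def sample_candidates(start, goal, num_candidates, steps, noise, rng):
    """Stand-in generator: straight-line paths with a random lateral bow.

    Assumption: a real system would sample from a diffusion model instead.
    """
    candidates = []
    for _ in range(num_candidates):
        offset = rng.uniform(-noise, noise)
        path = []
        for i in range(steps + 1):
            t = i / steps
            x = start[0] + t * (goal[0] - start[0])
            y = start[1] + t * (goal[1] - start[1])
            # The bow peaks mid-path and vanishes at both endpoints.
            y += offset * math.sin(math.pi * t)
            path.append((x, y))
        candidates.append(path)
    return candidates


def score(path, obstacles, safety_radius):
    """Stand-in discriminator: hand-written score, higher is better.

    Assumption: a real system would use a learned, RL-optimized scorer.
    """
    min_clearance = min(
        (math.dist(p, o) for p in path for o in obstacles),
        default=math.inf,
    )
    collision_penalty = 100.0 if min_clearance < safety_radius else 0.0
    length = sum(math.dist(a, b) for a, b in zip(path, path[1:]))
    return -collision_penalty - length


def plan(start, goal, obstacles, num_candidates=32, steps=20,
         noise=3.0, safety_radius=1.0, seed=0):
    """Sample candidate trajectories, then return the highest-scoring one."""
    rng = random.Random(seed)
    candidates = sample_candidates(start, goal, num_candidates, steps, noise, rng)
    return max(candidates, key=lambda p: score(p, obstacles, safety_radius))


if __name__ == "__main__":
    best = plan(start=(0.0, 0.0), goal=(10.0, 0.0), obstacles=[(5.0, 0.0)])
    print(best[len(best) // 2])  # midpoint of the selected trajectory
```

The split keeps the two roles separate: the generator only proposes candidates, and the scorer alone decides which one is executed, so either part can be swapped without touching the other.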

Generated with AI, which can make mistakes.

