SICNav-Diffusion: Safe and Interactive Crowd Navigation with Diffusion Trajectory Predictions

Sepehr Samavi, Anthony Lem, Fumiaki Sato, Sirui Chen, Qiao Gu, Keijiro Yano, Angela P. Schoellig, Florian Shkurti

For interactive robotic crowd navigation, we propose a diffusion-based trajectory prediction model that forecasts the future motion of humans as joint trajectory samples. We then feed these samples to the bilevel SICNav problem, which uses a particle-filter-inspired approach to jointly refine the predictions (lower level) and optimize robot actions that satisfy safety constraints (upper level), yielding a robot plan coupled with the refined predictions. When solving SICNav, the optimizer can explore the trajectory prediction modes generated by the diffusion model and refine the predictions toward the mode that best optimizes the robot objective, while guaranteeing safety (i.e., non-collision). We execute the SICNav solutions in receding-horizon fashion.
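The receding-horizon loop described above can be illustrated with a toy sketch. Everything in it is an assumption made for illustration and is not the authors' implementation: `sample_joint_predictions` replaces the diffusion model with noisy constant-velocity rollouts, and `plan_step` replaces the bilevel SICNav optimization with a brute-force search over a small set of candidate robot velocities and prediction samples. It only shows the overall shape of the loop: sample joint predictions, pick a safe plan coupled with one sample, execute the first action, and replan.

```python
import numpy as np

def sample_joint_predictions(humans, velocities, horizon, dt, k, rng):
    """Hypothetical stand-in for the diffusion model.
    Returns k joint samples with shape (k, horizon, n_humans, 2)."""
    steps = np.arange(1, horizon + 1)[None, :, None, None] * dt
    noise = rng.normal(0.0, 0.2, size=(k, 1) + velocities.shape)
    return humans[None, None] + steps * (velocities[None, None] + noise)

def rollout(robot, action, horizon, dt):
    """Constant-velocity robot rollout with shape (horizon, 2)."""
    steps = np.arange(1, horizon + 1)[:, None] * dt
    return robot + steps * action

def plan_step(robot, goal, samples, actions, dt, min_dist):
    """Brute-force stand-in for the bilevel solve (an assumption, not SICNav).
    Returns the cheapest action that keeps clearance under at least one
    prediction sample, plus that sample's index; (zero action, None) if none."""
    horizon = samples.shape[1]
    best_cost, best_action, best_idx = np.inf, np.zeros(2), None
    for action in actions:
        traj = rollout(robot, action, horizon, dt)
        cost = np.linalg.norm(traj - goal, axis=1).sum()
        for idx, sample in enumerate(samples):
            clearance = np.linalg.norm(sample - traj[:, None, :], axis=-1).min()
            if clearance >= min_dist and cost < best_cost:
                best_cost, best_action, best_idx = cost, action, idx
    return best_action, best_idx

def navigate(robot, goal, humans, velocities, steps=40, dt=0.25, seed=0):
    """Receding-horizon loop: replan every step, execute only the first action."""
    rng = np.random.default_rng(seed)
    angles = np.linspace(0.0, 2.0 * np.pi, 16, endpoint=False)
    unit = np.stack([np.cos(angles), np.sin(angles)], axis=1)
    actions = np.vstack([np.zeros((1, 2)), unit])  # stop, or 1 m/s in 16 directions
    path = [robot.copy()]
    for _ in range(steps):
        samples = sample_joint_predictions(humans, velocities, 8, dt, 10, rng)
        action, _ = plan_step(robot, goal, samples, actions, dt, min_dist=0.5)
        robot = robot + dt * action          # apply the first action only
        humans = humans + dt * velocities    # humans move at constant velocity here
        path.append(robot.copy())
    return np.array(path)
```

Unlike this sketch, the method described here refines the predicted human trajectories inside the optimization rather than only selecting among fixed samples, so the toy should be read as a picture of the control flow, not of the solver.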

To evaluate our approach, we first validate the prediction performance of our proposed diffusion model on a benchmark dataset. To evaluate closed-loop performance, we conduct simulation experiments and in-lab real-robot experiments. We also demonstrate that our approach works in the wild using only on-board sensors and compute.
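This page does not say which metrics are used for the open-loop prediction evaluation. As an assumption, the sketch below uses the average and final displacement errors (ADE and FDE) that are commonly reported on the ETH/UCY benchmark named in the abstract, computed per joint sample over all humans in the scene and then minimized over samples. The function name and array layout are hypothetical.

```python
import numpy as np

def joint_min_ade_fde(samples, truth):
    """Best-of-K joint displacement errors (assumed metric, not from the paper).
    samples: (K, H, N, 2) joint predictions for N humans over H steps.
    truth:   (H, N, 2) ground-truth future positions."""
    err = np.linalg.norm(samples - truth[None], axis=-1)   # (K, H, N)
    ade = err.mean(axis=(1, 2))        # averaged over time and humans, per sample
    fde = err[:, -1, :].mean(axis=1)   # final-step error averaged over humans
    return ade.min(), fde.min()
```

Taking the minimum after averaging over humans scores each sample as a whole scene, which matches the idea of joint trajectory samples; taking the minimum per human instead would give the more common marginal variant.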

Abstract

To navigate crowds without collisions, robots must interact with humans by forecasting their future motion and reacting accordingly. While learning-based prediction models have shown success in generating likely human trajectory predictions, integrating these stochastic models into a robot controller presents several challenges. The controller needs to account for the interactive coupling between planned robot motion and human predictions while ensuring that both predictions and robot actions are safe (i.e., collision-free). To address these challenges, we present a receding-horizon crowd navigation method for single-robot multi-human environments. We first propose a diffusion model to generate joint trajectory predictions for all humans in the scene. We then incorporate these multi-modal predictions into a SICNav Bilevel MPC problem that simultaneously solves for a robot plan (upper level) and acts as a safety filter to refine the predictions for non-collision (lower level). Combining planning and prediction refinement into one bilevel problem ensures that the robot plan and human predictions are coupled. We validate the open-loop trajectory prediction performance of our diffusion model on the commonly used ETH/UCY benchmark and evaluate the closed-loop performance of our robot navigation method in simulation and extensive real-robot experiments, demonstrating safe, efficient, and reactive robot motion.

BibTeX

@article{samavi2025sicnavdiffusion,
      author={Sepehr Samavi and Anthony Lem and Fumiaki Sato and Sirui Chen and Qiao Gu and Keijiro Yano and Angela P. Schoellig and Florian Shkurti},
      journal={IEEE Robotics and Automation Letters (in press)},
      title={SICNav-Diffusion: Safe and Interactive Crowd Navigation with Diffusion Trajectory Predictions},
      year={2025},
      volume={},
      number={},
      pages={},
      url={https://arxiv.org/abs/2503.08858}
}