USC at ICRA 2025

May 23, 2025

USC School of Advanced Computing and USC Viterbi researchers showcase breakthroughs in generative modeling, safety in imitation learning, human-aware planning and more at ICRA 2025, one of the most prestigious gatherings in robotics.


USC School of Advanced Computing researchers are presenting their research this week at the flagship conference of the IEEE Robotics and Automation Society (RAS) in Atlanta, GA. Photo/USC.

From generative models for realistic driving simulations to safety-aware robot learning and human-robot collaboration shaped by human perception, USC School of Advanced Computing researchers are advancing the frontiers of robotics at the 2025 International Conference on Robotics and Automation (ICRA) in Atlanta.

The researchers’ work spans autonomous systems, control, human interaction and biologically inspired robots, tackling real-world challenges with cutting-edge solutions.

Papers presented include DreamDrive, a novel method for generating 4D, 3D-consistent driving scenes from real-world data; SAFE-GIL, a safety-focused behavior cloning approach that prepares robots for high-stakes environments by simulating test-time errors during training; and an innovative framework that accounts for human field of view in collaborative planning. USC faculty also chaired or co-chaired six sessions on topics from marine robotics to imitation learning and bimanual manipulation.

The conference also recognized USC’s leadership in the field: Professor Maja Matarić received the MassRobotics Medal, a prestigious award sponsored by Amazon Robotics, for her groundbreaking contributions to socially assistive robotics. Her pioneering work focuses on creating robots that empower individuals through personalized support and motivation.

The USC School of Advanced Computing — comprising the Thomas Lord Department of Computer Science and the Ming Hsieh Department of Electrical and Computer Engineering — is a unit of the USC Viterbi School of Engineering.

TECHNICAL PROGRAM

Day 1

DreamDrive: Generative 4D Scene Modeling from Street View Images

Jiageng Mao, Boyi Li, Boris Ivanovic, Yuxiao Chen, Yan Wang, Yurong You, Chaowei Xiao, Danfei Xu, Marco Pavone, Yue Wang

Keywords: Computer Vision for Automation, Autonomous Vehicle Navigation, Virtual Reality and Interfaces

Abstract: Synthesizing photo-realistic visual observations from an ego vehicle’s driving trajectory is a critical step towards scalable training of self-driving models. Reconstruction-based methods create 3D scenes from driving logs and synthesize geometry-consistent driving videos through neural rendering, but their dependence on costly object annotations limits their ability to generalize to in-the-wild driving scenarios. On the other hand, generative models can synthesize action-conditioned driving videos in a more generalizable way but often struggle with maintaining 3D visual consistency. In this paper, we present DreamDrive, a 4D spatial-temporal scene generation approach that combines the merits of generation and reconstruction, to synthesize generalizable 4D driving scenes and dynamic driving videos with 3D consistency. Specifically, we leverage the generative power of video diffusion models to synthesize a sequence of visual references and further elevate them to 4D with a novel hybrid Gaussian representation. Given a driving trajectory, we then render 3D-consistent driving videos via Gaussian splatting. The use of generative priors allows our method to produce high-quality 4D scenes from in-the-wild driving data, while neural rendering ensures 3D-consistent video generation from the 4D scenes. Extensive experiments on nuScenes and in-the-wild driving data demonstrate that DreamDrive can generate controllable and generalizable 4D driving scenes, synthesize novel views of driving videos with high fidelity and 3D consistency, decompose static and dynamic elements in a self-supervised manner, and enhance perception and planning tasks for autonomous driving.

 ReMEmbR: Building and Reasoning Over Long-Horizon Spatio-Temporal Memory for Robot Navigation

Abrar Anwar, John Bradford Welsh, Joydeep Biswas, Soha Pouya, Yan Chang

Keywords: AI-Enabled Robotics, Semantic Scene Understanding, Vision-Based Navigation

Abstract:
Navigating and understanding complex environments over extended periods of time is a significant challenge for robots. People interacting with the robot may want to ask questions like where something happened, when it occurred, or how long ago it took place, which would require the robot to reason over a long history of its deployment. To address this problem, we introduce a Retrieval-augmented Memory for Embodied Robots, or ReMEmbR, a system designed for long-horizon video question answering for robot navigation. To evaluate ReMEmbR, we introduce the NaVQA dataset, where we annotate spatial, temporal, and descriptive questions on long-horizon robot navigation videos. ReMEmbR employs a structured approach involving a memory-building phase and a querying phase, leveraging temporal information, spatial information, and images to efficiently handle continuously growing robot histories. Our experiments demonstrate that ReMEmbR outperforms LLM and VLM baselines, allowing ReMEmbR to achieve effective long-horizon reasoning with low latency. Additionally, we deploy ReMEmbR on a robot and show that our approach can handle diverse queries. The dataset, code, videos, and other material can be found at the following link: https://nvidia-ai-iot.github.io/remembr.
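
As a rough illustration of the memory-then-query pattern described above, here is a minimal sketch: a list of timestamped, geo-tagged captions filtered by time window and location. It is illustrative only, not ReMEmbR's implementation, and all names below are hypothetical.

```python
from dataclasses import dataclass
from math import hypot

@dataclass
class MemoryEntry:
    t: float          # seconds since deployment start
    x: float          # robot position (m)
    y: float
    caption: str      # short description of what was observed

class SpatioTemporalMemory:
    """Toy long-horizon memory: append observations, query by time and place."""
    def __init__(self):
        self.entries = []

    def add(self, t, x, y, caption):
        self.entries.append(MemoryEntry(t, x, y, caption))

    def query(self, t_min=None, t_max=None, near=None, radius=5.0):
        hits = []
        for e in self.entries:
            if t_min is not None and e.t < t_min:
                continue
            if t_max is not None and e.t > t_max:
                continue
            if near is not None and hypot(e.x - near[0], e.y - near[1]) > radius:
                continue
            hits.append(e)
        return hits

if __name__ == "__main__":
    mem = SpatioTemporalMemory()
    mem.add(12.0, 3.1, -0.4, "saw a delivery cart near the elevator")
    mem.add(640.0, 18.5, 2.2, "whiteboard with meeting notes in the hallway")
    # "What did you see near the elevator in the first minute?"
    for e in mem.query(t_max=60.0, near=(3.0, 0.0), radius=3.0):
        print(f"t={e.t:.0f}s @({e.x:.1f},{e.y:.1f}): {e.caption}")
```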

 SMART: Advancing Scalable Map Priors for Driving Topology Reasoning

Junjie Ye, David Paz, Hengyuan Zhang, Yuliang Guo, Xinyu Huang, Henrik Iskov Christensen, Yue Wang, Liu Ren

Keywords: Mapping, Computer Vision for Transportation

Abstract:
Topology reasoning is crucial for autonomous driving as it enables comprehensive understanding of connectivity and relationships between lanes and traffic elements. While recent approaches have shown success in perceiving driving topology using vehicle-mounted sensors, their scalability is hindered by the reliance on training data captured by consistent sensor configurations. We identify that the key factor in scalable lane perception and topology reasoning is the elimination of this sensor-dependent feature. To address this, we propose SMART, a scalable solution that leverages easily available standard-definition (SD) and satellite maps to learn a map prior model, supervised by large-scale geo-referenced high-definition (HD) maps independent of sensor settings. Owing to scaled training, SMART alone achieves superior offline lane topology understanding using only SD and satellite inputs. Extensive experiments further demonstrate that SMART can be seamlessly integrated into any online topology reasoning method, yielding significant improvements of up to 28% on the OpenLane-V2 benchmark. Project page: https://jay-ye.github.io/smart.

 SAFE-GIL: SAFEty Guided Imitation Learning for Robotic Systems

Yusuf Umut Ciftci, Darren Chiu, Zeyuan Feng, Gaurav Sukhatme, Somil Bansal

Keywords: Robot Safety, Machine Learning for Robot Control, Imitation Learning

Abstract:
Behavior cloning (BC) is a widely used approach in imitation learning, where a robot learns a control policy by observing an expert supervisor. However, the learned policy can make errors and might lead to safety violations, which limits its utility in safety-critical robotics applications. While prior works have tried improving a BC policy via additional real or synthetic action labels, adversarial training, or runtime filtering, none of them explicitly focus on reducing the BC policy’s safety violations during training time. We propose SAFE-GIL, a design-time method to learn safety-aware behavior cloning policies. SAFE-GIL deliberately injects adversarial disturbance into the system during data collection to guide the expert towards safety-critical states. This disturbance injection simulates potential policy errors that the system might encounter at test time. By ensuring that training more closely replicates expert behavior in safety-critical states, our approach results in safer policies despite policy errors at test time. We further develop a reachability-based method to compute this adversarial disturbance. We compare SAFE-GIL with various behavior cloning techniques and online safety-filtering methods in three domains: autonomous ground navigation, aircraft taxiing, and aerial navigation on a quadrotor testbed. Our method demonstrates a significant reduction in safety failures, particularly in low data regimes where the likelihood of learning errors, and therefore safety violations, is higher. See our website here: https://y-u-c.github.io/safegil/
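
The data-collection idea at the heart of this approach, perturbing the expert during demonstrations so the dataset covers near-failure states and the expert's recoveries, can be sketched schematically. The sketch below is not the authors' reachability-based computation; env, expert_action, and worst_case_disturbance are hypothetical stand-ins.

```python
def collect_safety_guided_demos(env, expert_action, worst_case_disturbance,
                                n_episodes=10, horizon=200):
    """Behavior-cloning data collection with injected disturbance (schematic).

    The disturbance nudges the expert toward safety-critical states, so the
    dataset records how the expert acts in (and recovers from) those states.
    env, expert_action, and worst_case_disturbance are hypothetical interfaces.
    """
    dataset = []
    for _ in range(n_episodes):
        state = env.reset()
        for _ in range(horizon):
            action = expert_action(state)              # expert's action in the (possibly disturbed) state
            dataset.append((state.copy(), action))     # store (state, expert label) for BC
            disturbance = worst_case_disturbance(state)  # e.g., from a reachability analysis
            state, done = env.step(action + disturbance)  # disturbance perturbs execution
            if done:
                break
    return dataset
```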


Session: Marine Robotics 2

  • Chair: Gaurav Sukhatme, University of Southern California
  • Co-Chair: Brendan Englot

Day 2

A Novel Telelocomotion Framework with CoM Estimation for Scalable Locomotion on Humanoid Robots

An-Chi He, Junheng Li, Jungsoo Park, Omar Kolt, Benjamin Beiter, Alexander Leonessa, Quan Nguyen, Kaveh Akbari Hamed

Keywords: Telerobotics and Teleoperation, Haptics and Haptic Interfaces, Humanoid and Bipedal Locomotion

Abstract: Teleoperated humanoid robot systems have made substantial advancements in recent years, offering a physical avatar that harnesses human skills and decision-making while safeguarding users from hazardous environments. However, current telelocomotion interfaces often fail to accurately represent the robot’s environment, limiting the user’s ability to effectively navigate the robot through unstructured terrain. This paper presents an initial telelocomotion framework that integrates the ForceBot locomotion interface with the small-sized humanoid robot, HECTOR V2. The framework utilizes ForceBot to simulate walking motion and estimate the user’s center of mass trajectory, which serves as a tracking reference for the robot. On the robot side, a model predictive control approach, based on a reduced-order single rigid body model, is employed to track the user’s scaled trajectory. We present experimental results on ForceBot’s trajectory estimation and the robot’s tracking performance, demonstrating the feasibility of this approach.

 Integrating Field of View in Human-Aware Collaborative Planning

Ya-Chuan Hsu, Michael Defranco, Rutvik Rakeshbhai Patel, Stefanos Nikolaidis

Keywords: Human-Robot Collaboration, Planning under Uncertainty, Human-Aware Motion Planning

Abstract: In human-robot collaboration (HRC), it is crucial for robot agents to consider humans’ knowledge of their surroundings. In reality, humans possess a narrow field of view (FOV), limiting their perception. However, research on HRC often overlooks this aspect and presumes an omniscient human collaborator. Our study addresses the challenge of adapting to the evolving subtask intent of humans while accounting for their limited FOV. We integrate FOV within the human-aware probabilistic planning framework. To handle the large state spaces that arise from modeling FOV, we propose a hierarchical online planner that efficiently finds approximate solutions while enabling the robot to explore low-level action trajectories that enter the human FOV, influencing their intended subtask. Through a user study in our adapted cooking domain, we demonstrate that our FOV-aware planner reduces human interruptions and redundant actions during collaboration by adapting to human perception limitations. We extend these findings to a virtual reality kitchen environment, where we observe similar collaborative behaviors.
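
A basic ingredient in any FOV-aware planner is a visibility test: does a given point fall inside the human's view cone? A minimal 2D version follows; it is illustrative only, and the angle and range parameters are assumptions rather than values from the paper.

```python
import math

def in_field_of_view(human_xy, human_heading, point_xy,
                     fov_deg=120.0, max_range=3.0):
    """Return True if point_xy lies within the human's 2D view cone."""
    dx = point_xy[0] - human_xy[0]
    dy = point_xy[1] - human_xy[1]
    dist = math.hypot(dx, dy)
    if dist > max_range:
        return False
    bearing = math.atan2(dy, dx) - human_heading
    bearing = math.atan2(math.sin(bearing), math.cos(bearing))  # wrap to [-pi, pi]
    return abs(bearing) <= math.radians(fov_deg) / 2.0

# Example: a robot directly behind the human is not visible.
print(in_field_of_view((0, 0), 0.0, (1.0, 0.2)))   # True
print(in_field_of_view((0, 0), 0.0, (-1.0, 0.0)))  # False
```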


Session: Award Finalists 7

  • Chair: Gaurav Sukhatme, University of Southern California
  • Co-Chair: Kaspar Althoefer, Queen Mary University of London

Session: Bimanual Manipulation 2

  • Chair: Tamim Asfour, Karlsruhe Institute of Technology (KIT)
  • Co-Chair: Satyandra K. Gupta, University of Southern California

Force-Conditioned Diffusion Policies for Compliant Sheet Separation Tasks in Bimanual Robotic Cells

Rishabh Shukla, Raj Talan, Samrudh Moode, Neel Dhanaraj, Jeon Ho Kang, Satyandra K. Gupta

Keywords: Learning from Demonstration, Bimanual Manipulation, Disassembly

Abstract: Disassembly is a critical challenge in maintenance and service tasks, particularly in high-precision operations such as electric vehicle (EV) battery recycling. Tasks like prying open sealed battery covers require precise manipulation and controlled force application. In our approach, we collect human demonstrations using a motion capture system, enabling the robot to learn from human-expert disassembly strategies. These demonstrations train a bimanual robotic system in which one arm exerts force with a specialized tool while the other manipulates and removes sealed components. Our method builds on a diffusion-based policy and integrates real-time force sensing to adapt its actions as contact conditions change. We decompose the demonstrations into distinct sub-tasks and apply data augmentation, thereby reducing the number of demonstrations needed and mitigating potential task failures. Our results show that the proposed method, even with a small dataset, achieves a high task success rate and efficiency compared to a standard diffusion technique. We demonstrate in a real-world application that the bimanual system effectively executes chiseling and peeling actions to separate a bonded sheet from a substrate.

SurfaceAug: Toward Versatile, Multimodally Consistent Ground Truth Sampling

Ryan Rubel, Nathan Clark, Andrew Dudash

Keywords: Computer Vision for Transportation, Object Detection, Segmentation and Categorization, AI-Enabled Robotics

Abstract: Despite recent advances in both model architectures and data augmentation, multimodal object detectors still barely outperform their LiDAR-only counterparts. This shortcoming has been attributed to a lack of sufficiently powerful multimodal data augmentation. To address this, we present SurfaceAug, a novel ground truth sampling algorithm. SurfaceAug pastes objects by resampling both images and point clouds, enabling object-level transformations in both modalities. We evaluate our algorithm by training a multimodal detector on KITTI and compare its performance to previous works. We show experimentally that SurfaceAug demonstrates promising improvements on car detection tasks.
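
For context, ground-truth sampling in its common LiDAR-only form copies annotated objects from a database into training scenes; SurfaceAug's contribution is keeping the image and point cloud consistent while doing so. The toy point-cloud-only sketch below shows just the basic pasting step and is not the paper's algorithm.

```python
import numpy as np

def gt_sample_paste(scene_points, scene_boxes, object_db, rng, n_paste=3):
    """Paste sampled object point clouds and their 3D boxes into a scene (toy version).

    object_db: list of dicts with 'points' (N,3) and 'box' (7,) = [x,y,z,l,w,h,yaw].
    This sketch skips the collision checks and image resampling a real pipeline needs.
    """
    points = [scene_points]
    boxes = list(scene_boxes)
    chosen = rng.choice(len(object_db), size=min(n_paste, len(object_db)), replace=False)
    for i in chosen:
        obj = object_db[i]
        shift = rng.uniform(-20.0, 20.0, size=2)   # random planar placement
        pts, box = obj["points"].copy(), obj["box"].copy()
        pts[:, :2] += shift                        # move the object's points...
        box[:2] += shift                           # ...and its box by the same offset
        points.append(pts)
        boxes.append(box)
    return np.concatenate(points, axis=0), np.stack(boxes)
```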

DREAM: Decentralized Real-Time Asynchronous Probabilistic Trajectory Planning for Collision-Free Multi-Robot Navigation in Cluttered Environments

Baskın Şenbaşlar, Gaurav Sukhatme

Keywords: Collision Avoidance, Multi-Robot Systems, Motion and Path Planning, Probabilistic Trajectory Planning

Abstract: Collision-free navigation in cluttered environments with static and dynamic obstacles is essential for many multi-robot tasks. Dynamic obstacles may also be interactive, i.e., their behavior varies based on the behavior of other entities. We propose a novel representation for the interactive behavior of dynamic obstacles and a decentralized real-time multi-robot trajectory planning algorithm allowing inter-robot collision avoidance as well as static and dynamic obstacle avoidance. Our planner simulates the behavior of dynamic obstacles, accounting for interactivity. We account for the perception inaccuracy of static obstacles and the prediction inaccuracy of dynamic obstacles. We handle asynchronous planning between teammates and message delays, drops, and re-orderings. We evaluate our algorithm in simulations using 25,400 random cases and compare it against three state-of-the-art baselines using 2,100 random cases. Compared to the best baseline, our algorithm achieves up to 1.68x the success rate in as little as 0.28x the time in single-robot cases, and up to 2.15x the success rate in as little as 0.36x the time in multi-robot cases. We implement our planner on real quadrotors to show its real-world applicability.

 Variable-Frequency Model Learning and Predictive Control for Jumping Maneuvers on Legged Robots

Chuong Nguyen, Abdullah Altawaitan, Thai Duong, Nikolay Atanasov, Quan Nguyen

Keywords: Legged Robots, Model Learning for Control

Abstract: Achieving both target accuracy and robustness in dynamic maneuvers with long flight phases, such as high or long jumps, has been a significant challenge for legged robots. To address this challenge, we propose a novel learning-based control approach consisting of model learning and model predictive control (MPC) utilizing a variable-frequency scheme. Compared to existing MPC techniques, we learn a model directly from experiments, accounting not only for leg dynamics but also for modeling errors and unknown dynamics mismatch in hardware and during contact. Additionally, learning the model with variable frequency allows us to cover the entire flight phase and final jumping target, enhancing the prediction accuracy of the jumping trajectory. Using the learned model, we also design a variable-frequency scheme to effectively leverage different jumping phases and track the target accurately. In a total of 92 jumps on Unitree A1 robot hardware, we verify that our approach outperforms other MPCs using a fixed frequency or a nominal model, reducing the jumping distance error by 2 to 8 times. We also achieve jumping distance errors of less than 3 percent during continuous jumping on uneven terrain with randomly placed perturbations of random heights (up to 4 cm, or 27 percent of the robot’s standing height). Our approach obtains distance errors of 1 cm to 2 cm on 34 single and continuous jumps with different jumping targets and model uncertainties. Code is available at https://github.com/DRCL-USC/Learning_MPC_Jumping.

Mastering Agile Jumping Skills from Simple Practices with Iterative Learning Control

Chuong Nguyen, Lingfan Bao, Quan Nguyen

Keywords: Legged Robots, Learning from Experience, Model Learning for Control

Abstract: Achieving precise target jumping with legged robots poses a significant challenge due to the long flight phase and the uncertainties inherent in contact dynamics and hardware. Forcefully attempting these agile motions on hardware could result in severe failures and potential damage. Motivated by this challenge, we propose an Iterative Learning Control (ILC) approach to learn and refine jumping skills from easy to difficult, instead of directly learning these challenging tasks. We verify that learning from simplicity can enhance safety and target jumping accuracy over trials. Compared to other ILC approaches for legged locomotion, our method can tackle the problem of a long flight phase where control input is not available. In addition, our approach allows the robot to apply what it learns from a simple jumping task to accomplish more challenging tasks within a few trials directly in hardware, instead of learning from scratch. We validate the method through extensive experiments on the A1 model and hardware for various tasks. Starting from a small jump (e.g., a 40 cm forward jump), our learning approach empowers the robot to accomplish a variety of challenging targets, including jumping onto a 20 cm high box, leaping to a greater distance of up to 60 cm, and performing jumps while carrying an unknown 2 kg payload. Our framework allows the robot to reach the desired position and orientation targets with approximate errors of 1 cm and 1 degree within a few trials.
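
Textbook iterative learning control refines a feedforward input across repeated trials of the same task via an update of the form u_{k+1}(t) = u_k(t) + L * e_k(t+1). The toy scalar example below illustrates that update rule only; it is not the paper's jumping controller.

```python
import numpy as np

# Toy plant: x[t+1] = a*x[t] + b*u[t]; refine a feedforward input over repeated trials.
a, b, T = 0.9, 0.5, 50
reference = np.sin(np.linspace(0.0, 2.0 * np.pi, T))

def rollout(u):
    x = np.zeros(T)
    for t in range(T - 1):
        x[t + 1] = a * x[t] + b * u[t]
    return x

u, L = np.zeros(T), 0.8                      # feedforward input and learning gain
for trial in range(20):
    error = reference - rollout(u)           # tracking error from this trial
    u[:-1] += L * error[1:]                  # P-type ILC: u_{k+1}(t) = u_k(t) + L*e_k(t+1)
print("tracking RMSE after 20 trials:", np.sqrt(np.mean((reference - rollout(u)) ** 2)))
```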

High Accuracy Aerial Maneuvers on Legged Robots Using Variational Integrator Discretized Trajectory Optimization

Scott Beck, Chuong Nguyen, Thai Duong, Nikolay Atanasov, Quan Nguyen

Keywords: Legged Robots, Optimization and Optimal Control

Abstract: Performing acrobatic maneuvers involving long aerial phases, such as precise dives or multiple backflips from significant heights, remains an open challenge in legged robot autonomy. Such aggressive motions often require accurate state predictions over long horizons with multiple contacts and extended flight phases. Most existing trajectory optimization (TO) methods rely on Euler or Runge-Kutta integration, which can accumulate significant prediction errors over long planning horizons. In this work, we propose a novel whole-body TO method using variational integration (VI) and full-body nonlinear dynamics for long-flight aggressive maneuvers. Compared to traditional Euler-based TO, our approach using VI preserves the energy and momentum properties of the continuous-time system and reduces the error between predicted and executed trajectories by factors of 2 to 10 while achieving similar planning time. We successfully demonstrate long-flight triple backflips on a quadruped A1 robot model and backflips on a bipedal HECTOR robot model for various heights and distances, achieving landing angle errors of only a few degrees. In contrast, TO with Euler integration fails to achieve accurate landings in equivalent circumstances, e.g., with landing angle errors greater than 90° for triple backflips. We provide an open-source implementation of our VI-discretized TO to support further research on accurate dynamic maneuvers for multi-rigid-body robot systems with contact: https://github.com/DRCL-USC/VI_discretized_TO
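
The benefit of a variational discretization is easy to see on a toy system: the Störmer–Verlet scheme, a simple variational integrator, keeps a pendulum's energy bounded where explicit Euler's drifts. The sketch below is a generic illustration, not the paper's whole-body trajectory optimizer.

```python
import numpy as np

# Simple pendulum: q'' = -(g/l) * sin(q). Compare explicit Euler vs. Stormer-Verlet.
g, l, dt, steps = 9.81, 1.0, 0.01, 5000

def accel(q):
    return -(g / l) * np.sin(q)

def energy(q, v):
    # Unit mass: kinetic + potential energy of the pendulum bob.
    return 0.5 * (l * v) ** 2 + g * l * (1.0 - np.cos(q))

q_e, v_e = 1.0, 0.0   # explicit Euler state
q_v, v_v = 1.0, 0.0   # Stormer-Verlet state (a variational integrator)
for _ in range(steps):
    # Explicit Euler step (both updates use the old state).
    q_e, v_e = q_e + dt * v_e, v_e + dt * accel(q_e)
    # Verlet step (kick-drift-kick).
    v_half = v_v + 0.5 * dt * accel(q_v)
    q_v = q_v + dt * v_half
    v_v = v_half + 0.5 * dt * accel(q_v)

E0 = energy(1.0, 0.0)
print("Euler energy drift: ", energy(q_e, v_e) - E0)   # grows with time
print("Verlet energy drift:", energy(q_v, v_v) - E0)   # stays small and bounded
```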

Coordinating Spinal and Limb Dynamics for Enhanced Sprawling Robot Mobility

Merve Atasever, Ali Okhovat, Azhang Nazaripouya, John Nisbet, Omer Kurkutlu, Jyotirmoy Deshmukh, Yasemin Ozkan-Aydin

Keywords: Legged Robots, Machine Learning for Robot Control, Deep Learning Methods

Abstract: Salamanders, with their ability to switch between walking and swimming, showcase how spinal flexibility enhances locomotion. Their undulating body motion helps them move over uneven terrain and adapt to unpredictable environments. Inspired by this, we explore control strategies for a salamander-like robot with two configurations: one with a fixed spine and one with an active, flexible spine. We compare biologically inspired gaits and learning-based approaches under different scenarios to see how well each performs. Our findings show that combining models like the Hildebrand gait with deep reinforcement learning (DRL) leads to more robust and efficient movement. Building on this, we developed a modular Hopf oscillator-based central pattern generator (CPG) framework, which successfully generates coordinated locomotion across multiple limbs. This work is part of our ongoing effort to merge the adaptability of DRL with the rhythmic stability of CPGs for better performance in real-world conditions.
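
For background, a Hopf oscillator is a standard CPG building block: it settles onto a limit cycle of tunable amplitude and frequency, so a limb's rhythm recovers after perturbation. The minimal single-oscillator integration below is illustrative only; the robot's framework couples several such oscillators across limbs.

```python
import numpy as np

def hopf_step(x, y, dt, mu=1.0, omega=2.0 * np.pi):
    """One Euler step of a Hopf oscillator with limit-cycle radius sqrt(mu)."""
    r2 = x * x + y * y
    dx = (mu - r2) * x - omega * y
    dy = (mu - r2) * y + omega * x
    return x + dt * dx, y + dt * dy

x, y, dt = 0.1, 0.0, 0.001      # start well off the limit cycle
trajectory = []
for _ in range(5000):           # 5 seconds of simulated time
    x, y = hopf_step(x, y, dt)
    trajectory.append(x)        # x could drive a joint angle setpoint
print("steady-state amplitude ~", max(trajectory[-1000:]))  # ~ sqrt(mu) = 1.0
```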

Day 3


Session: Planning and Control for Legged Robots 3

  • Chair: Feifei Qian, University of Southern California
  • Co-Chair: Luca Marchionni, PAL Robotics SL

Obstacle-Aided Trajectory Control of a Quadrupedal Robot through Sequential Gait Composition

Haodi Hu, Feifei Qian

Keywords: Legged Robots, Biologically-Inspired Robots, Dynamics, Rough Terrain Locomotion

Abstract: Modeling and controlling legged robot locomotion on terrains with densely distributed large rocks and boulders are fundamentally challenging. Unlike traditional methods, which often consider these rocks and boulders as obstacles and attempt to find a clear path to circumvent them, in this study we aim to develop methods for robots to actively utilize interaction forces with these “obstacles” for locomotion and navigation. To do so, we studied the locomotion of a quadrupedal robot as it traversed a simplified obstacle field, and discovered that with different gaits, the robot could passively converge to distinct orientations. A compositional return map explained this observed passive convergence, and enabled theoretical prediction of the steady-state orientation angles for any given quadrupedal gait. We experimentally demonstrated that with these predictions, a legged robot could effectively generate trajectories of a desired shape amongst large, slippery obstacles, simply by switching between different gaits. Our study offers a novel method for robots to exploit traditionally-considered “obstacles” to achieve agile movements on challenging terrains.
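
The passive-convergence result can be pictured with a toy one-dimensional return map: if each obstacle interaction under a given gait pulls the heading toward that gait's fixed point, repeated interactions settle the robot at a gait-specific orientation. The map below is hypothetical, not the dynamics measured in the paper.

```python
import math

def return_map(theta, gait="bound"):
    """Toy per-cycle heading update; each gait contracts toward its own fixed point."""
    fixed_point = {"bound": math.radians(30), "trot": math.radians(-20)}[gait]
    return theta + 0.4 * (fixed_point - theta)   # contraction factor 0.6 per cycle

theta = math.radians(75)
for cycle in range(15):
    theta = return_map(theta, gait="bound")
print("steady-state heading (deg):", round(math.degrees(theta), 1))  # -> ~30
# Switching gaits mid-run would steer the heading toward a different fixed point.
```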

Adapting Gait Frequency for Posture-Regulating Humanoid Push-Recovery Via Hierarchical Model Predictive Control

Junheng Li, Zhanhao Le, Junchao Ma, Quan Nguyen

Keywords: Humanoid and Bipedal Locomotion, Optimization and Optimal Control, Whole-Body Motion Planning and Control

Abstract: Current humanoid push-recovery strategies often use whole-body motion, yet they tend to overlook posture regulation. For instance, in manipulation tasks, the upper body may need to stay upright and have minimal recovery displacement. This paper introduces a novel approach to enhancing humanoid push-recovery performance under unknown disturbances and regulating body posture by tailoring the recovery stepping strategy. We propose a hierarchical-MPC-based scheme that analyzes and detects instability in the prediction window and quickly recovers by adapting gait frequency. Our approach integrates a high-level nonlinear MPC, a posture-aware gait frequency adaptation planner, and a low-level convex locomotion MPC. The planners predict center of mass (CoM) state trajectories that can be assessed for precursors of potential instability and posture deviation. In simulation, we demonstrate a 131% average improvement in maximum recoverable impulse compared with baseline approaches. In hardware experiments, a 125 ms advancement in recovery stepping timing/reflex has been observed with the proposed approach. We also demonstrate improved push-recovery performance with body attitude change kept under 0.2 rad.

System-Level Safety Monitoring and Recovery for Perception Failures in Autonomous Vehicles

Kaustav Chakraborty, Zeyuan Feng, Sushant Veer, Apoorva Sharma, Boris Ivanovic, Marco Pavone, Somil Bansal

Keywords: Intelligent Transportation Systems, Failure Detection and Recovery, Autonomous Vehicle Navigation

Abstract: The safety-critical nature of autonomous vehicle (AV) operation necessitates development of task-relevant algorithms that can reason about safety at the system level and not just at the component level. To reason about the impact of a perception failure on the entire system performance, such task-relevant algorithms must contend with various challenges: complexity of AV stacks, high uncertainty in the operating environments, and the need for real-time performance. To overcome these challenges, in this work, we introduce a Q-network called SPARQ (abbreviation for Safety evaluation for Perception And Recovery Q-network) that evaluates the safety of a plan generated by a planning algorithm, accounting for perception failures that the planning process may have overlooked. This Q-network can be queried during system runtime to assess whether a proposed plan is safe for execution or poses potential safety risks. If a violation is detected, the network can then recommend a corrective plan while accounting for the perceptual failure. We validate our algorithm using the NuPlan-Vegas dataset, demonstrating its ability to handle cases where a perception failure compromises a proposed plan, while the corrective plan remains safe. We observe an overall accuracy and recall of 90% while sustaining a frequency of 42 Hz on the unseen testing dataset. We compare our performance to a popular reachability-based baseline and analyze some interesting properties of our approach in improving the safety of an AV pipeline.


Session: Big Data

  • Chair: Danfei Xu, Georgia Institute of Technology
  • Co-Chair: Guangyao Shi, University of Southern California

PlanarNeRF: Online Learning of Planar Primitives with Neural Radiance Fields

Zheng Chen, Qingan Yan, Huangying Zhan, Changjiang Cai, Xiangyu Xu, Yuzhong Huang, Weihan Wang, Ziyue Feng, Yi Xu, Lantao Liu

Keywords: RGB-D Perception, Recognition

Abstract: Identifying spatially complete planar primitives from visual data is a crucial task in computer vision. Prior methods are largely restricted to either 2D segment recovery or simplifying 3D structures, even with extensive plane annotations. We present PlanarNeRF, a novel framework capable of detecting dense 3D planes through online learning. Drawing upon the neural field representation, PlanarNeRF brings three major contributions. First, it enhances 3D plane detection with concurrent appearance and geometry knowledge. Second, a lightweight plane fitting module is used to estimate plane parameters. Third, a novel global memory bank structure with an update mechanism is introduced, ensuring consistent cross-frame correspondence. The flexible architecture of PlanarNeRF allows it to function in both 2D-supervised and self-supervised solutions, in each of which it can effectively learn from sparse training signals, significantly improving training efficiency. Through extensive experiments, we demonstrate the effectiveness of PlanarNeRF in various real-world scenarios and remarkable improvement in 3D plane detection over existing works.
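
The lightweight plane-fitting step mentioned above is, at its core, a least-squares fit of a plane to 3D points, commonly computed from an SVD of the centered points. The standalone sketch below shows only that generic computation, not PlanarNeRF's module.

```python
import numpy as np

def fit_plane(points):
    """Least-squares plane fit. Returns (unit normal n, offset d) with n·p + d = 0."""
    centroid = points.mean(axis=0)
    # The normal is the right singular vector with the smallest singular value.
    _, _, vt = np.linalg.svd(points - centroid, full_matrices=False)
    normal = vt[-1]
    d = -normal.dot(centroid)
    return normal, d

# Noisy samples from the plane z = 0.5
rng = np.random.default_rng(0)
pts = rng.uniform(-1, 1, size=(200, 3))
pts[:, 2] = 0.5 + 0.01 * rng.standard_normal(200)
n, d = fit_plane(pts)
print(np.round(n, 3), round(d, 3))   # ~ [0, 0, ±1], d ~ ∓0.5
```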

An Addendum to NeBula: Towar

Numerous authors, including Robert Trybula and Lillian Clark

Keywords: Field Robots, Multi-Robot Systems, Software-Hardware Integration for Robot Systems

Abstract: This article presents an appendix to the original NeBula autonomy solution developed by the Team Collaborative SubTerranean Autonomous Robots (CoSTAR), participating in the DARPA Subterranean Challenge. Specifically, this article presents extensions to NeBula’s hardware, software, and algorithmic components that focus on increasing the range and scale of the exploration environment. From the algorithmic perspective, we discuss the following extensions to the original NeBula framework: 1) large-scale geometric and semantic environment mapping; 2) an adaptive positioning system; 3) probabilistic traversability analysis and local planning; 4) large-scale partially observable Markov decision process (POMDP)-based global motion planning and exploration behavior; 5) large-scale networking and decentralized reasoning; 6) communication-aware mission planning; and 7) multimodal ground–aerial exploration solutions. We demonstrate the application and deployment of the presented systems and solutions in various large-scale underground environments, including limestone mine exploration scenarios as well as deployment in the DARPA Subterranean challenge.

Multi-Agent Inverse Q-Learning from Demonstrations

Nathaniel Haynam, Adam Khoja, Dhruv Kumar, Vivek Myers, Erdem Bıyık

Keywords: Multi-Robot Systems, Imitation Learning

Abstract: When reward functions are hand-designed, deep reinforcement learning algorithms often suffer from reward misspecification, causing them to learn suboptimal policies. In the single-agent case, Inverse Reinforcement Learning (IRL) techniques attempt to address this issue by inferring the reward function from expert demonstrations. However, in multi-agent problems, misalignment between the learned and true objectives is exacerbated due to increased environment non-stationarity and variance that scale with multiple agents. As such, in multi-agent general-sum games, multi-agent IRL algorithms have difficulty balancing cooperative and competitive objectives. To address these issues, we propose Multi-Agent Marginal Q-Learning from Demonstrations (MAMQL), a novel sample-efficient framework for multi-agent IRL. For each agent, MAMQL learns a critic marginalized over the other agents’ policies, allowing for a well-motivated use of Boltzmann policies in the multi-agent context. We identify a connection between optimal marginalized critics and single-agent soft-Q IRL, allowing us to apply a direct, simple optimization criterion from the single-agent domain. Across our experiments on three different simulated domains, MAMQL significantly outperforms previous multi-agent methods in average reward, sample efficiency, and reward recovery, often by more than 2-5x. We make our code available at https://sites.google.com/view/mamql.
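
The Boltzmann (softmax) policy referenced above maps Q-values to action probabilities through a temperature parameter. The sketch below is the standard construction, not the marginalized multi-agent critic itself.

```python
import numpy as np

def boltzmann_policy(q_values, temperature=1.0):
    """Action distribution pi(a|s) proportional to exp(Q(s,a)/T), computed stably."""
    logits = np.asarray(q_values, dtype=float) / temperature
    logits -= logits.max()                 # subtract max for numerical stability
    probs = np.exp(logits)
    return probs / probs.sum()

q = [1.0, 2.0, 0.5]
print(boltzmann_policy(q, temperature=1.0))   # favors the second action
print(boltzmann_policy(q, temperature=0.1))   # nearly greedy
print(boltzmann_policy(q, temperature=10.0))  # nearly uniform
```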

MatchMaker: Automated Asset Generation for Robotic Assembly

Yian Wang, Bingjie Tang, Chuang Gan, Dieter Fox, Kaichun Mo, Yashraj Narang, Iretiayo Akinola

Keywords: Assembly, AI-Enabled Robotics, Computer Vision for Manufacturing

Abstract: Robotic assembly remains a significant challenge due to complexities in visual perception, functional grasping, contact-rich manipulation, and performing high-precision tasks. Simulation-based learning and sim-to-real transfer have led to recent success in solving assembly tasks in the presence of object pose variation, perception noise, and control error; however, the development of a generalist (i.e., multi-task) agent for a broad range of assembly tasks has been limited by the need to manually curate assembly assets, which greatly constrains the number and diversity of assembly problems that can be used for policy learning. Inspired by the recent success of using generative AI to scale up robot learning, we propose MatchMaker, a pipeline to automatically generate diverse, simulation-compatible assembly asset pairs to facilitate learning assembly skills. Specifically, MatchMaker can 1) take a simulation-incompatible, interpenetrating asset pair as input and automatically convert it into a simulation-compatible, interpenetration-free pair, 2) take an arbitrary single asset as input and generate a geometrically-mating asset to create an asset pair, and 3) automatically erode contact surfaces from (1) or (2) according to a user-specified clearance parameter to generate realistic parts.


Session: Imitation Learning 3

  • Chair: Jens Kober, TU Delft
  • Co-Chair: Erdem Bıyık, University of Southern California

MILE: Model-Based Intervention Learning

Yigit Korkmaz, Erdem Bıyık

Keywords: Imitation Learning, AI-Based Methods, Human Factors and Human-in-the-Loop

Abstract: Imitation learning techniques have been shown to be highly effective in real-world control scenarios, such as robotics. However, these approaches not only suffer from compounding error issues but also require human experts to provide complete trajectories. Although there exist interactive methods where an expert oversees the robot and intervenes if needed, these extensions usually only utilize the data collected during intervention periods and ignore the feedback signal hidden in non-intervention timesteps. In this work, we create a model to formulate how the interventions occur in such cases, and show that it is possible to learn a policy with just a handful of expert interventions. Our key insight is that it is possible to get crucial information about the quality of the current state and the optimality of the chosen action from expert feedback, regardless of the presence or the absence of intervention. We evaluate our method on various discrete and continuous simulation environments, a real-world robotic manipulation task, as well as a human subject study. Videos and the code can be found at https://liralab.usc.edu/mile.

 A Bio-Inspired Sand-Rolling Robot: Effect of Body Shape on Sand Rolling Performance

Xingjue Liao, Wenhao Liu, Hao Wu, Feifei Qian

Keywords: Biologically-Inspired Robots, Biomimetics, Passive Walking

Abstract: The capability of effectively moving on complex terrains such as sand and gravel can empower our robots to robustly operate in outdoor environments, and assist with critical tasks such as environment monitoring, search-and-rescue, and supply delivery. Inspired by the Mount Lyell salamander’s ability to curl its body into a loop and effectively roll down hill slopes, in this study we develop a sand-rolling robot and investigate how its locomotion performance is governed by the shape of its body. We experimentally tested three different body shapes: Hexagon, Quadrilateral, and Triangle. We found that Hexagon and Triangle can achieve a faster rolling speed on sand, but exhibited more frequent failures of getting stuck. Analysis of the interaction between robot and sand revealed the failure mechanism: the deformation of the sand produced a local “sand incline” underneath robot contact segments, increasing the effective region of supporting polygon (ERSP) and preventing the robot from shifting its center of mass (CoM) outside the ERSP to produce sustainable rolling. Based on this mechanism, a highly-simplified model successfully captured the critical body pitch for each rolling shape to produce sustained rolling on sand, and informed design adaptations that mitigated the locomotion failures and improved robot speed by more than 200%. Our results provide insights into how locomotors can utilize different morphological features to achieve robust rolling motion across deformable substrates.

 Jointly Assigning Processes to Machines and Generating Plans for Autonomous Mobile Robots in a Smart Factory

Christopher Leet, Aidan Sciortino, Sven Koenig

Keywords: Path Planning for Multiple Mobile Robots or Agents, Multi-Robot Systems, Industrial Robots

Abstract: A modern smart factory runs a manufacturing procedure using a collection of programmable machines. Typically, materials are ferried between these machines using a team of mobile robots. To embed a manufacturing procedure in a smart factory, a factory operator must a) assign its processes to the smart factory’s machines and b) determine how agents should carry materials between machines. A good embedding maximizes the smart factory’s throughput: the rate at which it outputs products. Existing smart factory management systems solve the aforementioned problems sequentially, limiting the throughput that they can achieve. In this paper we introduce ACES, the Anytime Cyclic Embedding Solver, the first solver that jointly optimizes the assignment of processes to machines and the assignment of paths to agents. We evaluate ACES and show that it can scale to real industrial scenarios.


Note: Every effort was made to include all USC Viterbi-affiliated papers at ICRA 2025. If you believe your work was inadvertently left out, please let us know at cscomms@usc.edu so we can update the list.


Published on May 23rd, 2025

Last updated on May 27th, 2025