For robots to function as home assistants or caregivers, they must solve sequential multi-object manipulation tasks that demand spatial-temporal reasoning over high-dimensional sensor data. A key challenge in this domain is connecting sensory input to a unified representation that supports such reasoning across a sequence of manipulation actions. To address this challenge, this talk discusses how to develop a latent representation that integrates spatial-temporal reasoning with sensory data. This representation captures both the geometric and symbolic effects of actions within a shared latent space, enabling robots to perform complex, long-horizon manipulation tasks in real-world environments.
Yixuan Huang is an incoming postdoctoral scholar at Princeton University, where he will work with Prof. Tom Silver. He completed his Ph.D. at the University of Utah, advised by Prof. Tucker Hermans. During his Ph.D., he was a visiting student researcher at Stanford University, working with Prof. Jeannette Bohg. His research focuses on the intersection of robot learning, planning, and manipulation. He was selected for the ICRA 2025 Doctoral Consortium, and his work was a Best Paper Award finalist at ISMR 2021.