Modeling, Understanding, and Interacting with the 3D World
Date: 2026-06-22

Lecture topic: Modeling, Understanding, and Interacting with the 3D World

Lecture time: Thursday, June 25, 2026, 13:30-15:30

Venue: Conference Room B2005, Interdisciplinary Building No. 2 (交叉二号楼), Jiangwan Campus, Fudan University

Contact: Chen Tao (陈涛)

Abstract: The rapid rise of large language models has brought AI into people’s daily lives and is reshaping many aspects of society. It is increasingly recognized that AI’s success in the digital domain must be extended to the real 3D world, ultimately enabling robotic AI systems to live and work in physical environments. Achieving this goal requires models that can effectively model, understand, and interact with the 3D world. In this talk, I will present our recent research spanning 3D object generation, dynamic scene understanding, geometric and spatial reasoning, world models, and controllable action policies. In particular, I will introduce Stream3D, a scalable framework for streaming and consistent 3D generation from sparse observations; PAGE-4D, a dynamic-aware 4D reconstruction model that jointly estimates geometry and camera motion in dynamic scenes; GeoWorld, a geometry-grounded world modeling framework that improves spatial reasoning and physical consistency in vision-language models; GEM-4D, a geometry-enhanced world model that aligns generative dynamics with structured geometric representations for robotic manipulation; and SteerAct, a controllable action policy approach that steers vision-language-action models and world action models toward better generalization. Together, these works highlight a pathway toward robotic AI systems that can robustly perceive, predict, and act in the real world.

Speaker bio: Dr. Mengyu Wang (name in Chinese: 王孟渝) is an Associate Professor with appointments at Harvard Medical School, the Kempner Institute for the Study of Natural and Artificial Intelligence at Harvard University, the Harvard Data Science Initiative, and the Broad Institute of MIT and Harvard. Dr. Wang's research interests span generative AI for computer vision, multimodal large language model behaviors and agents, AI for robotics, AI for genomics, and various other AI applications in medicine.