The School of Computing and Data Science (https://www.cds.hku.hk/) was established by the University of Hong Kong on 1 July 2024, comprising the Department of Computer Science, the Department of Statistics and Actuarial Science, and the Department of AI and Data Science.

Abstract

The rapid rise of large language models has brought AI into people’s daily lives and is reshaping many aspects of society. It is increasingly recognized that AI’s success in the digital domain must be extended to the real 3D world, ultimately enabling robotic AI systems to live and work in physical environments. Achieving this goal requires models that can effectively represent, understand, and interact with the 3D world. In this talk, I will present our recent research spanning 3D object generation, dynamic scene understanding, geometric and spatial reasoning, world models, and active vision systems. In particular, I will introduce Stream3D, a scalable framework for streaming and consistent 3D generation from sparse observations; PAGE-4D, a dynamic-aware 4D reconstruction model that jointly estimates geometry and camera motion in dynamic scenes; GeoWorld, a geometry-grounded world modeling framework that improves spatial reasoning and physical consistency in vision-language models; GEM, a geometry-enhanced world model that aligns generative dynamics with structured geometric representations for robotic manipulation; and an active vision system that enables robots to actively perceive the world, improve scene understanding, and increase manipulation success through closed-loop interaction. Together, these works highlight a pathway toward robotic AI systems that can robustly perceive, predict, and act in the real world.

About the speaker

Prof. Mengyu Wang is an Associate Professor with appointments at Harvard Medical School, the Kempner Institute for the Study of Natural and Artificial Intelligence at Harvard University, the Harvard Data Science Initiative, and the Broad Institute of MIT and Harvard. Prof. Wang's research interests span generative AI for computer vision, multimodal large language model behaviors and agents, AI for robotics, AI for genomics, and various other AI applications in medicine.

 

Division of Computer Science,
School of Computing and Data Science

Rm 207 Chow Yei Ching Building
The University of Hong Kong
Pokfulam Road, Hong Kong

Email: csenq@hku.hk
Telephone: 3917 3146
