Past Seminars and Events
June 10, 2026	Title: Semiparametric Distribution Learning Via Quantile Regression Time: 04:00pm Venue: Room 301, Run Run Shaw Building Speaker(s): Prof. Huixia Judy Wang Remark(s): Abstract Modern data analysis increasingly requires learning not only average trends, but also heterogeneity, uncertainty, tail behavior, and how information can be fused across heterogeneous data sources. In this talk, I will discuss how the quantile regression process provides a flexible semiparametric approach to these problems by learning conditional distributions without imposing strong parametric assumptions on their shape. I will highlight its role in several modern statistical problems, including multiple imputation, Bayesian inference, extreme quantile analysis, and conformal prediction, where quantile processes can help construct density-based nonconformity scores and prediction regions under complex error distributions. I will also discuss rank-based data integration motivated by the fusion of multiple epigenetic clocks for assessing biological aging. Together, these examples illustrate how quantile-based thinking can move beyond mean-centered modeling toward a richer and more robust understanding of variation, uncertainty, and individualized prediction. About the speaker Huixia (Judy) Wang is the William Marsh Trustee Professor in Data Science and Chair of the Statistics Department at Rice University. She previously held faculty positions at The George Washington University and North Carolina State University and served as a Program Director at the National Science Foundation from 2018 to 2022. Her research spans statistical learning, uncertainty quantification, high-dimensional inference, quantile regression, extreme value theory and applications, and spatial data analysis.
She is a Fellow of the American Statistical Association and the Institute of Mathematical Statistics, an elected member of the International Statistical Institute, and currently serves as Co-Editor of Statistica Sinica.
June 03, 2026	Title: Stochastic models based on Hawkes and marked Hawkes processes and their applications in insurance Time: 11:00am Venue: Room 301, 3/F, Run Run Shaw Building Speaker(s): Prof. Anatoliy Swishchuk Remark(s): Abstract This talk is devoted to the study of new stochastic models for risk processes based on Hawkes and marked Hawkes processes and their applications in insurance. We first introduce those models and outline some of their properties. Then we will present two applications of those models in insurance: the solution of the Merton optimization problem and finding ruin probabilities. Numerical examples will be presented as well. About the speaker
May 21, 2026	Title: Toward Real-World Autonomous Learning: Adaptive Control, Safe Planning, and On-Device Foundation Models Time: 02:00pm Venue: CB308, 3/F, Chow Yei Ching Building, HKU Speaker(s): Dr. Ma Hao Remark(s): Abstract Recent progress in vision-language-action models has made embodied intelligence increasingly promising, but current robotic demonstrations still expose several system-level bottlenecks, including execution mismatch, inference latency, and limited safety integration at the planning level. In this talk, I will present my research toward autonomous learning in real-world robotics through the joint lens of control, learning, and optimization. I will first introduce a model-based online learning framework for adaptive control, with rigorous convergence guarantees and successful evaluation on a pneumatic table-tennis robot, a soft robotic system, and a heavy-duty excavator. I will then discuss constraint-aware generative planning through a diffusion-based planner for obstacle avoidance in autonomous racing, where constraints are incorporated directly into the planning process. Finally, I will present my work on efficient inference of large foundation models on edge devices under memory and compute constraints, aiming to make large-model capabilities practical for real robotic deployment. Together, these directions form a system-level framework for embodied intelligence that is adaptive, safe, and deployable, and I will conclude by discussing future opportunities in vision-based and multimodal robot learning for contact-rich manipulation. About the speaker Hao Ma is currently a Postdoctoral Researcher at ETH Zurich and a Scientific Researcher at the Max Planck Institute for Intelligent Systems. 
He received his Bachelor’s degree in Energy and Power Engineering from Jilin University in 2017, his Master’s degree in Automotive Engineering from the Technical University of Munich (2019 to 2021), and his Doctorate in Dynamic Systems and Control from ETH Zurich (2022 to 2025). During his Ph.D., he was affiliated with both ETH Zurich and the Max Planck Institute for Intelligent Systems through the highly competitive Max Planck-ETH Center for Learning Systems Fellowship. His research lies at the intersection of control theory and machine learning, with a focus on enabling robots to learn autonomously in the real world. His current interests include vision-based and multimodal robot learning, contact-rich manipulation, and on-device intelligence, with an emphasis on system-level solutions for real-world robotic autonomy.
May 18, 2026	Title: Beyond LLMs: Architecting the systems backbone for semantic engines and agents Time: 03:00pm Venue: HW312, Haking Wong Building, HKU Speaker(s): Dr. Fatma Özcan Remark(s): Abstract Large Language Models (LLMs) are redefining analysis across structured and unstructured data, leading to the emergence of two primary architectural paradigms: AI or semantic engines, and data agents. Despite distinct approaches, both architectures encounter pivotal challenges, particularly in optimizing AI operators, agentic pipelines, natural language data interfaces, and AI-powered search. Centrally, embeddings and similarity search are key building blocks. This talk first addresses optimization for semantic operators, presenting an extensive evaluation of proxy models for AI query approximation. The findings demonstrate a greater than 100x cost and latency reduction for semantic filtering (AI.IF) and significant gains for semantic ranking (AI.RANK). Next, the talk examines Filtered Vector Search (FVS), a key component for semantic search and Generative AI (GenAI) applications in modern database systems. A central insight is that optimal algorithm selection is not determined solely by distance-metric computation costs; rather, system-level overheads play a substantial and decisive role. Finally, the talk highlights the discovery of relevant data sources as a major bottleneck and introduces a metadata reasoner agent to address this challenge. About the speaker Fatma Özcan is a Principal Engineer at Systems Research@Google. Her current research focuses on GenAI and data management, vector search, platforms and infrastructure for large-scale data analysis, and natural language interfaces to data. Dr. Özcan received her PhD in computer science from the University of Maryland, College Park, and her BSc in computer engineering from METU, Ankara.
Before joining Google, she was a Distinguished Research Staff Member and a senior manager at IBM Almaden Research Center. She has over 24 years of experience in industrial research, and has delivered core technologies into various IBM and Google products. She is the co-author of the book "Heterogeneous Agent Systems", and co-author of several conference papers and patents. She is an ACM Fellow and serves on the CRA board of directors, and she is the co-chair of CRA-Industry. She received the VLDB Women in Database Research Award in 2022.
May 15, 2026	Title: Towards Trustworthy Medical Intelligence Time: 10:30am Venue: Innovation Wing Two, G/F, Run Run Shaw Building Speaker(s): Dr. Huazhu Fu Remark(s): Abstract Artificial intelligence (AI) has shown transformative potential in healthcare, particularly in medical imaging and clinical decision support. However, real-world deployment of AI systems remains hindered by two fundamental challenges: lack of trustworthiness and limited clinical usability. In this talk, I will discuss recent advances aimed at bridging these gaps. First, I will introduce methodologies for uncertainty quantification and out-of-distribution detection, enabling AI models to recognize when their predictions may be unreliable, a critical feature for patient safety. Second, I will present GlobeReady, a training-free AI platform designed for fundus disease diagnosis that operates robustly across diverse populations and clinical environments without the need for retraining or technical intervention. Together, these efforts demonstrate a pathway toward developing AI systems that are not only technically robust but also aligned with the needs and workflows of frontline healthcare professionals. About the speaker Dr. Huazhu Fu is a Principal Scientist at the Institute of High Performance Computing (IHPC), Agency for Science, Technology and Research (A*STAR), Singapore. His research focuses on AI for Healthcare and Trustworthy AI. He has authored over 300 publications in leading venues, with more than 40,000 citations on Google Scholar and an h-index exceeding 90. He has been recognized as a Clarivate ‘Highly Cited Researcher’ and included in the ‘World's Top 2% Scientists’ list by Stanford. He serves as an Associate Editor for IEEE Transactions on Medical Imaging (TMI), IEEE Transactions on Neural Networks and Learning Systems (TNNLS), and IEEE Journal of Biomedical and Health Informatics (JBHI). He is a Fellow of IET.
May 12, 2026	Title: Anti-concentration inequalities for the difference of maxima of Gaussian random vectors Time: 10:30am Venue: Room 301, Run Run Shaw Building Speaker(s): Prof. Shuting Shen Remark(s): Abstract We derive novel anti-concentration bounds for the difference between the maximal values of two Gaussian random vectors across various settings. Our bounds are dimension-free, scaling with the dimension of the Gaussian vectors only through the smaller expected maximum of the Gaussian subvectors. In addition, our bounds hold under degenerate covariance structures, which previous results do not cover. Moreover, we show that our conditions are sharp under the homogeneous component-wise variance setting, while we only impose some mild assumptions on the covariance structures under the heterogeneous variance setting. We apply the new anti-concentration bounds to derive the central limit theorem for the maximizers of discrete empirical processes. Finally, we back up our theoretical findings with comprehensive numerical studies. About the speaker Shen Shuting is an Assistant Professor of Statistics & Data Science at the National University of Singapore. Before joining NUS, she was a postdoctoral fellow at the Fuqua School of Business and the Department of Biostatistics & Bioinformatics at Duke University, jointly supervised by Dr. Alexandre Belloni and Dr. Ethan X. Fang. Prior to her postdoctoral position, she obtained her PhD in Biostatistics from Harvard University in 2023, where she was jointly supervised by Dr. Xihong Lin and Dr. Junwei Lu. She earned a B.A. and a B.S. in Mathematics (dual) from Peking University in 2018. Her research interests primarily include large-scale inference, combinatorial inference, choice model asymptotics, operations research theories, applied probability, and distributed computing.
May 11, 2026	Title: From cross-modal alignment to hierarchical sharing: statistical foundations of contrastive learning for multimodal data Time: 02:30pm Venue: Room 301, Run Run Shaw Building Speaker(s): Prof. Doudou Zhou Remark(s): Abstract Multimodal data are increasingly common in modern biomedical and machine learning applications, yet learning useful representations from heterogeneous modalities remains challenging. A central issue is that different modalities may contain complementary information, but the extent and pattern of information sharing can vary substantially across modalities. In this talk, I will present two recent works that develop statistical foundations for contrastive learning in multimodal settings. The first focuses on electronic health records and studies how structured clinical codes and unstructured clinical notes can be jointly embedded through a multimodal contrastive framework. This approach connects the contrastive objective to a pointwise mutual information matrix, yielding an interpretable and privacy-preserving algorithm based on summary-level co-occurrence information. The second work moves beyond the conventional shared-versus-private decomposition and introduces a hierarchical framework that learns globally shared, partially shared, and modality-specific representations within a unified model. I will discuss the key modeling ideas, identifiability results, recovery guarantees, and implications for downstream prediction. Together, these works highlight how principled statistical modeling can improve both the interpretability and effectiveness of multimodal representation learning. About the speaker Doudou Zhou is an Assistant Professor of Statistics & Data Science at the National University of Singapore.
His research lies at the intersection of statistics, machine learning, and artificial intelligence, with a focus on statistical learning theory, multimodal data integration, electronic health records, and the evaluation of large language models. He develops principled methods for learning from noisy, heterogeneous, and partially observed data, with applications in biomedicine and modern AI systems.
	Title: Generative AI for Drug Discovery: From High-Resolution Proteomics to Autonomous Scientific Workflows Time: 10:00am Venue: CB308, 3/F, Chow Yei Ching Building, HKU (Zoom broadcasting) Speaker(s): Prof. Sun Siqi Remark(s): Abstract The integration of generative AI into drug discovery is moving beyond simple structure prediction toward a more comprehensive and autonomous pipeline. In this talk, I will focus on our recent efforts to accelerate AI-driven drug discovery (AIDD) through a multi-layered approach. I will first present our work on de novo protein and peptide sequencing, which enables the high-resolution data acquisition necessary for identifying novel targets. I will then delve into our core research on biomolecular structure prediction, discussing how we optimize these models for the specific challenges of therapeutic design. Finally, I will briefly explore how these generative tools are setting the stage for agentic science, where autonomous systems begin to orchestrate complex discovery workflows. About the speaker Siqi Sun is an associate professor at Fudan University and a researcher at the Shanghai AI Lab. He previously served as a researcher at Microsoft Research, Redmond. He holds a PhD from the Toyota Technological Institute at Chicago (TTIC) and a bachelor's degree in Mathematics from Fudan University. His research focuses on AI for science, specifically developing generative models and standardized benchmarks for proteomics and structural biology.
May 07, 2026	Title: Choosing the right stochastic block model Time: 04:00pm Venue: CB 328 Speaker(s): Dr. Max Jerdee Remark(s): Abstract Many types of stochastic block models (SBMs) have been proposed and used to model community structure in networks. In a social network, for example, these methods can reveal tightly-knit friend groups. Across the literature, these models variously appear in canonical and microcanonical, degree-corrected and non-degree-corrected, assortative and non-assortative forms. When applied to the same network, variants of the model often yield markedly different groupings of nodes and so produce competing interpretations and predictions. We introduce a parametric model that directly generalizes many of these forms, allowing us, for instance, to interpolate between a non-degree-corrected and a degree-corrected SBM. We discuss how the posterior distribution of the parameter that bridges these models not only reveals which endpoint better represents the network, but also itself measures something meaningful about the network, in this case the inequality of degrees within communities. While individual SBMs can identify interpretable groups of nodes under restricted assumptions, we demonstrate that in an unsupervised, purely data-driven sense (model evidence and predictive power), our generalized model routinely adjudicates between and outperforms existing SBM variants on real-world networks. This unified picture allows us to precisely identify the assumptions latent within each of these models and select between them as appropriate for empirical networks. About the speaker Max is an Omidyar Postdoctoral Fellow at the Santa Fe Institute where he works on various problems in math, physics, and statistics related to network science. He aims to understand the mechanisms driving the formation of observed network structures and to explore the fundamental limits of what such methods can reveal. Max holds a B.A. in Physics from Princeton University and a Ph.D. in Physics from the University of Michigan.
April 29, 2026	Title: When Quantum Causal Structures Diverge from their Classical Counterparts Time: 11:00am Venue: CB 308 Speaker(s): Dr. Elie Wolfe Remark(s): Abstract In classical causal modelling it is conventional to group together “indistinguishable” scenarios; that is, to use a single graphical model to represent all the different latent-variable structures that generate the same operationally testable predictions. Equivalence rules which hold in the classical setting, however, can break down in the quantum setting. I will discuss my group’s recent work regarding causal scenarios with intermediate latent variables, where different quantum structures can be distinguished in ways that have no classical analogue. I will summarize prior work establishing that replacing classical hidden common causes by quantum systems often broadens the set of correlations admitting causal explanation. I will then highlight that such “causal quantum-ization” fundamentally reorganizes the landscape of which causal structures are operationally distinguishable. To capture these new distinctions, we will leverage tools such as monogamy of nonlocal correlations and semidefinite-programming hierarchies. The talk will summarize arXiv:2412.10238, will introduce (unpublished!) results regarding the (astonishing!) causal utility of quantum secret sharing codes, and will conclude with some (tantalizing!) open questions. About the speaker Elie Wolfe is a Research Scientist at the Perimeter Institute for Theoretical Physics. His research lies at the intersection of quantum foundations, information, and causality. He studies diverse topics such as causal modelling, quantum networks, and contextuality, all through the unifying theme of distinguishing classical, quantum, and post-quantum operational theories.
