Abstract
Deploying foundation model services is crucial to contemporary AI applications. We focus on deploying such services in heterogeneous, potentially decentralized settings to mitigate the substantial costs typically associated with centralized data centers. Our work relies on carefully designed scheduling algorithms and integrated system optimizations to make full use of heterogeneous computational power across a comprehensive range of serving paradigms, including data preparation pipelines, large-scale pretraining, reinforcement learning-based alignment, and agentic inference deployment.
About the speaker
Binhang YUAN has been an Assistant Professor in the Department of Computer Science and Engineering (CSE) at the Hong Kong University of Science and Technology (HKUST) since 2023. He received his Ph.D. and master's degrees from Rice University and his bachelor's degree from Fudan University. Before joining HKUST, he was a postdoctoral researcher at the Swiss Federal Institute of Technology Zurich (ETH Zurich). His main research interests are in distributed, decentralized, and heterogeneous machine learning systems for foundation models.
