Harold E. Smalley Early Career Professor and Assistant Professor
Education
- Ph.D. Statistics (2019), Harvard University
- A.M. Statistics (2016), Harvard University
- B.Sc. Actuarial Science (2014), University of Hong Kong
Dr. Yang's research focuses on time series analysis, developing both theory and methods at the intersection of statistical modeling and modern deep learning:
Physics-Informed Machine Learning for Dynamical Systems. Dr. Yang develops Gaussian process methods that embed the structure of ordinary differential equations directly into statistical inference. This line of work enables principled parameter estimation, uncertainty quantification, and change-point detection in dynamical systems from noisy and sparse data. Key contributions include the MAGI framework for ODE inference (PNAS 2021), extensions to PDE systems (SIAM/ASA JUQ 2024), and accompanying open-source software on CRAN and GitHub.
Attention Mechanisms and Transformers for Time Series. Dr. Yang's group designs transformer architectures tailored to the structure of time series data, bridging classical autoregressive models with modern attention mechanisms. Recent contributions include methods that align linear attention with VAR models, construct auxiliary time series as exogenous variables for multivariate forecasting, and develop zero-sum linear attention for efficient transformers. This work has been published at NeurIPS (Spotlight), ICML, and ICLR.
Applications in Infectious Disease Forecasting. These methodological advances are applied to real-time forecasting of influenza, COVID-19, dengue, and other infectious diseases, integrating internet search data, electronic health records, and epidemiological surveillance. Dr. Yang's forecasting models have been featured on the CDC FluSight website and covered by outlets including CNN, Voice of America, and Ars Technica.
Dr. Yang teaches courses in regression, forecasting, and multivariate data analysis at both the undergraduate and graduate levels, including ISYE 4031 (Regression and Forecasting) and ISYE 7405 (Multivariate Data Analysis). He also developed a graduate special topics course on data-driven infectious disease prediction.