Saurabh Sinha

Wallace H. Coulter Distinguished Chair and Professor


Contact

  • Saurabh Sinha Google Scholar

Education

  • Ph.D. Computer Science and Engineering (2002), University of Washington
  • M.S. Computer Science and Engineering (2000), University of Washington
  • B.Tech. Computer Science and Engineering (1997), Indian Institute of Technology Kanpur

About

My research program addresses analytical challenges related to gene expression and its regulation, using diverse methods from statistics and machine learning. I have been a PI since 2005, first at the University of Illinois at Urbana-Champaign (Dept. of Computer Science, 2005-22), and then at Georgia Tech (since 2022), where I have a joint faculty appointment in ISyE and BME. We develop computational tools to identify regulators of gene expression and decipher how their influence is encoded in the genome, by analyzing transcriptomics data along with other omics data. Most of these tools are specialized for single-cell data, e.g., the SERGIO tool for simulating scRNA-seq data and the CIMLA tool for discovering inter-condition changes in gene regulation using robust causal inference approaches. We use our tools to uncover regulatory mechanisms underlying diverse phenotypes, including embryonic development, social behavior, and cancer drug response.

Over the last 4-5 years, our lab’s attention has increasingly turned to analytical challenges in “Spatial Transcriptomics” (ST) technology, which is revolutionizing the measurement of gene expression and thus promises exciting new avenues for our research. We are also developing methods for analyzing other modalities of spatial data, such as spatial metabolomics and microscopy data.

I have also served in leadership positions in major federally funded centers. From 2014 to 2019, I served as Co-Director and Research Lead of an NIH-funded Center of Excellence (BD2K) at UIUC. This Center created “KnowEnG”, a Cloud-based toolbox implementing novel algorithms for network-guided analysis of multi-omics data, and trained ~50 personnel in bioinformatics. As a co-PI and a Thrust lead of the NSF AI Institute for Chemical Synthesis, I recently led the development of machine learning approaches to optimize biosynthesis or chemical synthesis strategies. I also serve as a Thrust lead on the NSF Engineering Research Center “CMaT: Center for Manufacturing Technologies” led by Georgia Tech.

Research

Focus areas of our future work will include (1) multi-omics, where we develop rigorous analytical approaches to combine multiple types of molecular data (e.g., genomics, transcriptomics, epigenomics, and metabolomics) to infer a coherent picture of the underlying cellular biology, and (2) spatial omics, where we analyze transcriptomics and other omics data at sub-cellular resolution to understand the dynamic processes shaping the spatial distribution of molecules. Research into these topics will aim to understand the changes accompanying a biological process, such as disease progression or behavioral responses, and how the DNA encodes the program for such changes. We will use methods of machine learning and deep learning as well as probabilistic models and biophysical models, separately and in combination, to tackle these challenging problems.

Representative Publications

Dibaeinia, P., Ojha, A., & Sinha, S. (2025). Interpretable AI for inference of causal molecular relationships from omics data. Science Advances, 11(7), eadk0837.

Kumar, A., Schrader, A. W., Aggarwal, B., Boroojeny, A. E., Asadian, M., Lee, J., ... & Sinha, S. (2024). Intracellular spatial transcriptomic analysis toolkit (InSTAnT). Nature Communications, 15(1), 7794.

Ghaffari, S., Bouchonville, K. J., Saleh, E., Schmidt, R. E., Offer, S. M., & Sinha, S. (2023). BEDwARS: a robust Bayesian approach to bulk gene expression deconvolution with noisy reference signatures. Genome Biology, 24(1), 178.

Ghaffari, S., Saleh, E., Schwing, A. G., Wang, Y., Burke, M. D., & Sinha, S. (2024). Robust model-based optimization for challenging fitness landscapes. ICLR 2024.

Dibaeinia, P., & Sinha, S. (2021). Deciphering enhancer sequence using thermodynamics-based models and convolutional neural networks. Nucleic Acids Research, 49(18), 10309-10327.

Ghaffari, S., Hanson, C., Schmidt, R. E., Bouchonville, K. J., Offer, S. M., & Sinha, S. (2021). An integrated multi-omics approach to identify regulatory mechanisms in cancer metastatic processes. Genome Biology, 22(1), 19.

Dibaeinia, P., & Sinha, S. (2020). SERGIO: a single-cell expression simulator guided by gene regulatory networks. Cell Systems, 11(3), 252-271.

Blatti III, C., Emad, A., Berry, M. J., Gatzke, L., Epstein, M., Lanier, D., ... & Sinha, S. (2020). Knowledge-guided analysis of "omics" data using the KnowEnG cloud platform. PLoS Biology, 18(1), e3000583.