Yifei Huang

Assistant Professor of Biology

514A Wartik
University Park, PA 16802
Phone: (814) 863-6829


Education

  1. Ph.D., McMaster University, 2014
  2. M.Sc., Beijing Normal University, 2009
  3. B.Sc., Zhengzhou University, 2006

Postdoc Training

  1. Cold Spring Harbor Laboratory, 2015-2018
  2. University of British Columbia, 2014-2015

Research Interests

Millions of genetic variants have been identified in human genomes, and the catalog of genetic variation is still expanding rapidly as sequencing costs continue to fall. Understanding the functional, clinical, and evolutionary significance of genetic variants has become a central question in biology and precision medicine. However, it is very challenging to distinguish important variants from neutral ones. As a result, many genetic variants in patients’ genomes are classified as “variants of uncertain significance”, which is a major hurdle for both basic research and medical practice.

I am interested in addressing the problem of “variants of uncertain significance” by unifying evolutionary biology and machine learning. My research is motivated by the insight that evolution operates like a high-throughput mutagenesis experiment: deleterious mutations are quickly purged from populations by natural selection, which in turn leaves detectable marks on human genomic sequences. I have developed multiple machine learning and statistical frameworks to identify the signatures of deleterious variants from population and functional genomic data. These computational methods have not only provided useful insights into human evolution but have also been applied to prioritize causal variants associated with human genetic disorders.

Selected Publications

Huang YF, Siepel A (2018) Estimation of allele-specific fitness effects across human protein-coding sequences and implications for disease. bioRxiv doi:

Fang H, Huang YF, Radhakrishnan A, Siepel A, Lyon GJ, Schatz MC (2018) Scikit-ribo enables accurate estimation and robust modeling of translation dynamics at codon resolution. Cell Systems 6:180-191.e4

Ramani R, Krumholz K, Huang YF, Siepel A (2018) PhastWeb: a web interface for evolutionary conservation scoring of multiple sequence alignments using phastCons and phyloP. Bioinformatics, In Press

Huang YF, Siepel A (2017) Fast, scalable prediction of deleterious noncoding variants from functional and population genomic data. Nature Genetics 49:618-624

Dukler N*, Gulko B*, Huang YF*, Siepel A (2017) Is a super-enhancer greater than the sum of its parts? Nature Genetics 49:2-3

Dukler N, Booth GT, Huang YF, Tippens N, Waters CT, Danko CG, Lis JT, Siepel A (2017) Nascent RNA sequencing reveals a dynamic global transcriptional response at genes and enhancers to the natural medicinal compound celastrol. Genome Research 27:1816-1829

Huang YF, Golding GB (2015) FuncPatch: a web server for the fast Bayesian inference of conserved functional patches in protein 3D structures. Bioinformatics 31:523-531

Huang YF, Golding GB (2014) Phylogenetic Gaussian process model for the inference of functionally important regions in protein tertiary structures. PLoS Computational Biology 10:e1003429

Huang YF, Golding GB (2012) Inferring sequence regions under functional divergence in duplicate genes. Bioinformatics 28:176-183