I'm interested in how retroviruses colonize their host genomes. Despite their association with disease, retroviruses are active modulators of their host genomes, because integration into the host genome is a key part of the retrovirus life cycle. A group of mule deer retroviruses called Cervid Endogenous Retrovirus (CrERV) is actively colonizing the mule deer genome. This unique system gives us the opportunity to study colonization by a young retrovirus in action in a free-ranging natural population. I am generating a draft reference genome of mule deer, which is the first Cervidae draft genome, to address questions related to CrERV colonization. Do CrERVs preferentially insert into certain genomic locations? What determines their insertion site preference? Which subgroups of CrERV are the most frequent in the mule deer population? What factors govern the rate at which CrERVs convert to solo LTRs? I hope that addressing these questions will help us understand more general properties of (endogenous) retroviruses, particularly at the early stage of genome colonization.
Using similar approaches, I'm also studying retrovirus dynamics in a human leukemia called Large Granular Lymphocyte (LGL) leukemia. Since retroviral peptide signals have been detected in the genomes of LGL leukemia patients and their spouses, we are investigating LGL leukemia genomes for a possible retroviral etiology. Because a causative retrovirus could be present either as a clonal insertion or only as a rare one, we designed a workflow to detect both clonal and rare retrovirus insertions in the sequenced genome and transcriptome of LGL leukemia. One pipeline detects retrovirus-sized insertions and will therefore find a clonal retrovirus insertion. A second pipeline, inspired by metagenomic analysis, detects rare retroviruses among the raw sequences that cannot be mapped to the reference genome or assembled.
Before becoming a postdoc and studying the "birth" of endogenous retroviruses, I worked on the "death" of another group of transposable elements called Long Interspersed Element-1 (L1). L1s are prevalent in all mammals but lost their activity in megabats some twenty million years ago. I reconstructed the evolutionary history of L1s in megabats and found that, unlike in most other mammals, L1s persisted as multiple distinct lineages in megabats before their extinction. The last wave of L1 expansion in megabats, right before their "death", was the strongest wave in the detectable L1 evolutionary history. I reconstructed the last active L1 lineage and showed that it can actively transpose in tissue culture assays.