Viral phylogenies are commonly built from selected open reading frames (ORFs) or genes and ignore recent discoveries about viral genome complexity from functional genomics (omics) studies of virus-infected cells. These omics studies use RNA-seq, Ribo-seq, SHAPE-seq or other sequencing-based assays and have vastly extended our knowledge of viral genomes by detecting numerous novel functional sequence elements (FSEs). However, these studies commonly ignore one fundamental question: Are these novel FSEs conserved during virus evolution and thus likely to play an important role in the virus life cycle?
FSEs identified in omics studies include short ORFs (sORFs) with <100 nucleotides, e.g., upstream ORFs (uORFs) within 5’ untranslated regions (UTRs) of other ORFs, or alternative proteins generated from the same locus through programmed ribosomal frameshifting or alternative splicing (Fig. A02.1). In addition, novel viral non-coding RNAs such as circular RNAs (circRNAs) and microRNAs (miRNAs) have been discovered. Furthermore, binding sites of host RNA- and DNA-binding proteins in viral DNA or RNA can now be determined at large scale. These FSEs cannot be predicted from sequence alone, and some FSEs have to form specific RNA structures to be functional.
To date, no standardised, comprehensive tool is available to detect different types of viral FSEs from omics data and analyse their conservation; existing phylogenetics approaches focus only on protein-coding genes. In this project, we will close this gap by developing tools to identify FSEs that are conserved in sequence and/or structure for (1) reconstructing their evolutionary histories; (2) incorporating them into robust virus phylogenies; and (3) predicting potential functional roles. As recombination is an important evolutionary process that affects many viruses, we will implement a method for recombination-aware reconstruction of phylogenies. We will therefore contribute to central goals G1, G2, and G3 of the CRC VirusREvolution. Our tools will initially be developed for SARS-CoV-2, vibriophage N4, HBV, and HSV-1 and will be generalised to other viruses in subsequent funding phases. Here, inclusion of ancient HBV and HSV-1 genomes and recombination events will also enable us to describe the evolutionary histories of viruses spanning several thousand years. Genome annotations extended with conserved FSEs will be incorporated into VirJenDB within NFDI4Microbiota.
Project Overview
Our project is based on the hypothesis that integrating omics-based FSE detection with phylogenetic analysis will allow us to both uncover the functional importance of individual FSEs and improve our understanding of virus evolution. For this purpose, we will pursue the following objectives: (1) Develop standardised pipelines for identifying novel FSEs in viral genomes from omics data; (2) Characterise their evolutionary histories and conservation; (3) Incorporate FSEs into robust virus phylogenies; (4) Identify sequence and RNA structure constraints for both known and novel FSEs and their link to FSE function. Here, we will focus on a wide range of FSEs, including UTRs, translated (s)ORFs, PRF elements, IRES, miRNAs, circRNAs, (alternative) splicing and polyadenylation, and binding sites of RBPs and DNA-binding proteins (for DNA viruses and phages) (Fig. A02.1, Tab. A02.1). Our tools will initially be developed for SARS-CoV-2, vibriophage N4, HBV, and HSV-1 and will be generalised to other viruses in subsequent funding periods. By including ancient HBV and HSV-1 genomes, we aim to describe the evolutionary histories of viruses and their FSEs spanning several thousand years. Similarly, the rapid evolution and divergence of SARS-CoV-2 in the last 5 years and the wealth of available data on variant genomes provide a unique opportunity to study the evolution of FSEs in the context of host immune evasion. By modelling recombination in SARS-CoV-2 and HBV, we will investigate the impact of recombination on FSE evolution.
- Tool to be developed: Tool for identifying conserved FSEs from omics data
- Tool to be developed: Tool for phylogeny reconstruction and conservation analysis for viral FSEs
Hypothesis enabled by the proposed tools: The tools will allow us to investigate whether and how different FSEs are conserved during virus evolution and how this conservation relates to the overall evolution of these viruses.
Overarching CRC goals: Our project develops standardised tools to detect conserved FSEs from multi-omics data and reconstructs structure- and recombination-aware phylogenies that extend beyond protein-coding genes (G1, G2). By comparing FSE conservation across several viruses, the project derives generalisable rules linking viral genome architecture to virus-host interaction and evolutionary trajectories (G3).
- WP 1: Identification of functional sequence elements (Friedel)
- WP 2: Evolutionary history of functional sequence elements (Kühnert)
- WP 3: Quantifying conservation of functional sequence elements (Kühnert)
- WP 4: Evaluating functional roles and impact of sequence variation for FSEs (Friedel)
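As a purely illustrative sketch of what quantifying sequence conservation of an FSE (WP 3) could look like, the following Python snippet scores an FSE interval in a multiple sequence alignment by mean per-column conservation (one minus normalised Shannon entropy). This is an assumption for illustration, not the project's method: the function names `column_conservation` and `fse_conservation`, the entropy-based score, and the toy alignment are all hypothetical, and structure conservation and recombination are not considered here.

```python
from collections import Counter
from math import log2


def column_conservation(column):
    """Return 1 - normalised Shannon entropy of one alignment column.

    Hypothetical scoring choice: only A/C/G/T are counted (U is treated
    as T); gaps and ambiguity codes are ignored. 1.0 means fully
    conserved, 0.0 means all four bases are equally frequent.
    """
    bases = [b for b in column.upper().replace("U", "T") if b in "ACGT"]
    if not bases:
        return 0.0  # column with gaps/ambiguity codes only
    n = len(bases)
    entropy = -sum((c / n) * log2(c / n) for c in Counter(bases).values())
    return 1.0 - entropy / 2.0  # maximum entropy for 4 bases is 2 bits


def fse_conservation(alignment, start, end):
    """Mean column conservation over alignment columns [start, end).

    `alignment` is a list of equal-length aligned sequences; `start` and
    `end` are 0-based alignment coordinates of the FSE (assumed known).
    """
    if not alignment:
        raise ValueError("alignment is empty")
    width = len(alignment[0])
    if any(len(seq) != width for seq in alignment):
        raise ValueError("sequences must have equal aligned length")
    if not 0 <= start < end <= width:
        raise ValueError("FSE interval lies outside the alignment")
    scores = [
        column_conservation("".join(seq[i] for seq in alignment))
        for i in range(start, end)
    ]
    return sum(scores) / len(scores)


# Toy alignment (made up): the first four columns are identical in all
# genomes, the last column differs in every genome.
toy_alignment = ["ATGCC", "ATGCA", "ATGCG", "ATGCT"]
conserved_part = fse_conservation(toy_alignment, 0, 4)
whole_region = fse_conservation(toy_alignment, 0, 5)
```

In practice, such a score would be computed per FSE on real genome alignments and compared against flanking or background regions; the entropy-based measure is only one of several possible choices.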
Team
Prof. Dr. Caroline Friedel
Principal Investigator
Dr. Denise Kühnert
Principal Investigator
Dr. Jens-Uwe Ulrich
Postdoc
Project- and subject-related list of publications
2024
Recombination-aware phylogenetic analysis sheds light on the evolutionary origin of SARS-CoV-2 Journal Article
In: Scientific Reports, vol. 14, no. 1, pp. 541, 2024.
2023
The HSV-1 ICP22 protein selectively impairs histone repositioning upon Pol II transcription downstream of genes Journal Article
In: Nature Communications, vol. 14, no. 1, pp. 4591, 2023.
2022
The source of the Black Death in fourteenth-century central Eurasia Journal Article
In: Nature, vol. 606, no. 7915, pp. 718–724, 2022.
2021
Ten millennia of hepatitis B virus evolution Journal Article
In: Science, vol. 374, no. 6564, pp. 182–188, 2021.
Rapid incidence estimation from SARS-CoV-2 genomes reveals decreased case detection in Europe during summer 2020 Journal Article
In: Nature Communications, vol. 12, no. 1, pp. 6009, 2021.
2020
Integrative functional genomics decodes herpes simplex virus 1 Journal Article
In: Nature Communications, vol. 11, no. 1, pp. 2038, 2020.
2018
Quantifying the fitness cost of HIV-1 drug resistance mutations through phylodynamics Journal Article
In: PLoS Pathogens, vol. 14, no. 2, pp. e1006895, 2018.
HSV-1-induced disruption of transcription termination resembles a cellular stress response but selectively increases chromatin accessibility downstream of genes Journal Article
In: PLoS Pathogens, vol. 14, no. 3, pp. e1006954, 2018.
2017
Prediction of poly(A) sites by poly(A) read mapping Journal Article
In: PLoS One, vol. 12, no. 1, pp. e0170914, 2017.
2015
Widespread disruption of host transcription termination in HSV-1 infection Journal Article
In: Nature Communications, vol. 6, no. 1, pp. 7126, 2015.