A02: Phylogeny of functional sequence elements in virus genomes 

Projects of the CRC 1768

A02: Do the shape and size of quasispecies reflect the host range of viruses?

Viral phylogenies are commonly built from selected open reading frames (ORFs) or genes and ignore recent discoveries on viral genome complexity from functional genomics (omics) studies of virus-infected cells. These omics studies use RNA-seq, Ribo-seq, SHAPE-seq, or other sequencing-based assays and have vastly extended our knowledge of viral genomes by detecting numerous novel functional sequence elements (FSEs). However, these studies commonly ignore one fundamental question: are these novel FSEs conserved during virus evolution and thus likely to play an important role in the virus life cycle?
FSEs identified in omics studies include short ORFs (sORFs) with <100 nucleotides, e.g., upstream ORFs (uORFs) within 5' untranslated regions (UTRs) of other ORFs, and alternative proteins generated from the same locus through programmed ribosomal frameshifting (PRF) or alternative splicing. In addition, novel viral non-coding RNAs such as circular RNAs (circRNAs) and microRNAs (miRNAs) have been discovered. Furthermore, binding sites of host RNA- and DNA-binding proteins in viral DNA or RNA can now be determined at large scale. These FSEs cannot be predicted from sequence alone, and some have to form specific RNA structures to be functional. To date, no standardised, comprehensive tool is available to detect different types of viral FSEs from omics data and analyse their conservation; existing phylogenetic approaches focus only on protein-coding genes.
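To make the notion of a short ORF concrete, the following is a minimal sketch, not a tool of this project, that lists forward-strand candidate sORFs in a nucleotide sequence. The function name `find_short_orfs`, the ATG-only start codon rule, and the choice to count the stop codon towards the length threshold are illustrative assumptions. As noted above, a sequence scan can only propose candidates; whether a candidate is actually translated has to be established from omics data such as Ribo-seq.

```python
from typing import List, Tuple

STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_short_orfs(seq: str, max_len_nt: int = 100) -> List[Tuple[int, int, int]]:
    """List candidate short ORFs on the forward strand.

    Returns (start, end, frame) tuples with 0-based start (inclusive) and
    end (exclusive, stop codon included). Only ORFs shorter than
    max_len_nt nucleotides are reported. Assumptions: ATG is the only
    start codon, and the first ATG after a stop codon opens the ORF.
    """
    seq = seq.upper().replace("U", "T")  # accept RNA or DNA input
    orfs = []
    for frame in range(3):
        start = None  # position of the currently open start codon, if any
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                end = i + 3
                if end - start < max_len_nt:
                    orfs.append((start, end, frame))
                start = None  # close the ORF and look for the next start
    return sorted(orfs)


if __name__ == "__main__":
    # Toy sequence with one ORF: ATG AAA TAG starting at position 2.
    print(find_short_orfs("CCATGAAATAGGG"))
```

In practice, such candidates would be intersected with ribosome footprint coverage before being treated as FSEs.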

Tools developed in this project combine the detection of previously unknown FSEs from omics data with phylogenetic analyses that characterise the evolution of these FSEs, with the aim of obtaining improved, robust virus phylogenies.

In this project, we will close this gap by developing tools to identify FSEs that are conserved in sequence and/or structure for (1) reconstructing their evolutionary histories; (2) incorporating them into robust virus phylogenies; and (3) predicting potential functional roles. As recombination is an important evolutionary process that affects many viruses, we will implement a method for recombination-aware reconstruction of phylogenies. We will therefore contribute to central goals G1, G2, and G3 of the CRC VirusREvolution. Our tools will initially be developed for SARS-CoV-2, vibriophage N4, HBV, and HSV-1 and will be generalised to other viruses in subsequent funding phases. Here, inclusion of ancient HBV and HSV-1 genomes and recombination events will also enable us to describe the evolutionary histories of viruses spanning several thousand years. Genome annotations extended with conserved FSEs will be incorporated into VirJenDB within NFDI4Microbiota.

Project Overview

Our project is based on the hypothesis that the integration of omics-based FSE detection with phylogenetic analysis will allow us to both uncover the functional importance of individual FSEs and improve our understanding of virus evolution. For this purpose, we will pursue the following objectives: (1) develop standardised pipelines for identifying novel FSEs in viral genomes from omics data; (2) characterise their evolutionary histories and conservation; (3) incorporate FSEs into robust virus phylogenies; and (4) identify sequence and RNA structure constraints for both known and novel FSEs and their link to FSE function. Here, we will focus on a wide range of FSEs, including UTRs, translated (s)ORFs, PRF elements, internal ribosome entry sites (IRES), miRNAs, circRNAs, (alternative) splicing and polyadenylation sites, and binding sites of RNA-binding proteins (RBPs) and DNA-binding proteins (for DNA viruses and phages). Our tools will initially be developed for SARS-CoV-2, vibriophage N4, HBV, and HSV-1 and will be generalised to other viruses in subsequent funding periods. By including ancient HBV and HSV-1 genomes, we aim to describe the evolutionary histories of viruses and their FSEs spanning several thousand years. Similarly, the rapid evolution and divergence of SARS-CoV-2 in the last 5 years and the wealth of available data on variant genomes provide a unique opportunity to study the evolution of FSEs in the context of host immune evasion. By modelling recombination in SARS-CoV-2 and HBV, we will investigate the impact of recombination on FSE evolution.

  • Tool to be developed: Tool for identifying conserved FSEs from omics data
  • Tool to be developed: Tool for phylogeny reconstruction and conservation analysis for viral FSEs

Hypothesis enabled by the proposed tool: This will allow the investigation of whether and how different FSEs are conserved during virus evolution and how this links to the general evolution of these viruses.
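As a rough illustration of what "conserved" can mean at the sequence level, the sketch below scores each column of a nucleotide alignment by Shannon entropy and averages the scores over an FSE interval. This is a deliberately simple stand-in and not the method of this project: it ignores the phylogeny, recombination, and RNA structure, all of which the project tools are meant to account for. The function names `conservation_scores` and `mean_conservation` and the scaling to the range 0 to 1 are assumptions made for the example.

```python
import math
from collections import Counter
from typing import List


def conservation_scores(alignment: List[str]) -> List[float]:
    """Per-column conservation for aligned nucleotide sequences.

    Score = 1 - H / 2, where H is the Shannon entropy in bits over
    A/C/G/T (2 bits is the maximum for four bases). Gaps and ambiguous
    characters are ignored; columns without any A/C/G/T score 0.0.
    """
    if not alignment:
        return []
    rows = [s.upper().replace("U", "T") for s in alignment]
    length = len(rows[0])
    if any(len(r) != length for r in rows):
        raise ValueError("all aligned sequences must have the same length")
    scores = []
    for column in zip(*rows):
        bases = [b for b in column if b in "ACGT"]
        if not bases:
            scores.append(0.0)
            continue
        n = len(bases)
        entropy = -sum((c / n) * math.log2(c / n) for c in Counter(bases).values())
        scores.append(1.0 - entropy / 2.0)
    return scores


def mean_conservation(scores: List[float], start: int, end: int) -> float:
    """Average score over alignment columns [start, end) of one element."""
    window = scores[start:end]
    if not window:
        raise ValueError("empty interval")
    return sum(window) / len(window)


if __name__ == "__main__":
    toy_alignment = ["ACGT", "ACGA", "ACCA", "AC-T"]  # hypothetical data
    scores = conservation_scores(toy_alignment)
    print(scores)
    print(mean_conservation(scores, 0, 2))
```

A column-wise score like this treats all genomes as independent samples, which is exactly the simplification that phylogeny-aware conservation analysis is intended to avoid.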

Overarching CRC goals: Our project develops and applies a quasispecies Sequence Variation Graph (qs-SVG) toolkit to capture, annotate, and quantify intra-population viral diversity from long/short-read and metagenomic data, enabling rapid characterization of emerging viruses and their mutational mechanisms (G1). By deploying qs-SVG across human viruses and environmental phages, the project dissects how ecological context and host diversity shape quasispecies structure, deriving generalisable rules and experimentally testable trajectories of host-range evolution (G2, G3).

Work Packages (WP):

  • WP 1: Identification of functional sequence elements (Friedel)
  • WP 2: Evolutionary history of functional sequence elements (Kühnert)
  • WP 3: Quantifying conservation of functional sequence elements (Kühnert)
  • WP 4: Evaluating functional roles and impact of sequence variation for FSEs (Friedel)

Team Members

Prof. Dr. Caroline Friedel

Project Leader

Dr. Denise Kühnert

Project Leader

Dr. Jens-Uwe Ulrich

PostDoc

PhD A02 2

PhD Student

PhD A02 2

PhD Student

2025

Daodu, Richard Olumide; Awotoro, Ebenezer; Ulrich, Jens-Uwe; Kühnert, Denise

CLASV: Rapid Lassa virus lineage assignment with random forest. Journal Article

In: PLoS Negl Trop Dis, vol. 19, iss. 9, pp. e0013512, 2025, ISSN: 1935-2735.

2024

Gomez, Luis Roger Esquivel; Weber, Ariane; Kocher, Arthur; Kühnert, Denise

Recombination-aware phylogenetic analysis sheds light on the evolutionary origin of SARS-CoV-2. Journal Article

In: Sci. Rep., vol. 14, iss. 1, pp. 541, 2024, ISSN: 2045-2322.

2023

Djakovic, Lara; Hennig, Thomas; Reinisch, Katharina; Milić, Andrea; Whisnant, Adam W; Wolf, Katharina; Weiß, Elena; Haas, Tobias; Grothey, Arnhild; Jürges, Christopher S; Kluge, Michael; Wolf, Elmar; Erhard, Florian; Friedel, Caroline C; Dölken, Lars

The HSV-1 ICP22 protein selectively impairs histone repositioning upon Pol II transcription downstream of genes. Journal Article

In: Nat Commun, vol. 14, iss. 1, pp. 4591, 2023, ISSN: 2041-1723.

2021

Smith, Maureen Rebecca; Trofimova, Maria; Weber, Ariane; Duport, Yannick; Kühnert, Denise; Kleist, Max

Rapid incidence estimation from SARS-CoV-2 genomes reveals decreased case detection in Europe during summer 2020. Journal Article

In: Nat Commun, vol. 12, iss. 1, pp. 6009, 2021, ISSN: 2041-1723.

Kocher, Arthur; Papac, Luka; Barquera, Rodrigo; Key, Felix M; Spyrou, Maria A; Hübler, Ron; Rohrlach, Adam B; Aron, Franziska; Stahl, Raphaela; Wissgott, Antje; Bömmel, Florian; Pfefferkorn, Maria; Mittnik, Alissa; Villalba-Mouco, Vanessa; Hansen, Svend; Kitov, Egor P; Dobeš, Miroslav; Ernée, Michal; Meller, Harald; Alt, Kurt W; Prüfer, Kay; Warinner, Christina; Schiffels, Stephan; Stockhammer, Philipp W; Bos, Kirsten; Posth, Cosimo; Herbig, Alexander; Haak, Wolfgang; Krause, Johannes; Kühnert, Denise

Ten millennia of hepatitis B virus evolution. Journal Article

In: Science, vol. 374, iss. 6564, pp. 182–188, 2021, ISSN: 1095-9203.

2020

Whisnant, Adam W; Jürges, Christopher S; Hennig, Thomas; Wyler, Emanuel; Prusty, Bhupesh; Rutkowski, Andrzej J; L'Hernault, Anne; Djakovic, Lara; Göbel, Margarete; Döring, Kristina; Menegatti, Jennifer; Antrobus, Robin; Matheson, Nicholas J; Künzig, Florian W H; Mastrobuoni, Guido; Bielow, Chris; Kempa, Stefan; Liang, Chunguang; Dandekar, Thomas; Zimmer, Ralf; Landthaler, Markus; Grässer, Friedrich; Lehner, Paul J; Friedel, Caroline C; Erhard, Florian; Dölken, Lars

Integrative functional genomics decodes herpes simplex virus 1. Journal Article

In: Nat Commun, vol. 11, iss. 1, pp. 2038, 2020.

2018

Kühnert, Denise; Kouyos, Roger; Shirreff, George; Pečerska, Jūlija; Scherrer, Alexandra U; Böni, Jürg; Yerly, Sabine; Klimkait, Thomas; Aubert, Vincent; Günthard, Huldrych F; Stadler, Tanja; Bonhoeffer, Sebastian; Swiss HIV Cohort Study

Quantifying the fitness cost of HIV-1 drug resistance mutations through phylodynamics. Journal Article

In: PLoS Pathog, vol. 14, iss. 2, pp. e1006895, 2018, ISSN: 1553-7374.

Hennig, Thomas; Michalski, Marco; Rutkowski, Andrzej J; Djakovic, Lara; Whisnant, Adam W; Friedl, Marie-Sophie; Jha, Bhaskar Anand; Baptista, Marisa A P; L'Hernault, Anne; Erhard, Florian; Dölken, Lars; Friedel, Caroline C

HSV-1-induced disruption of transcription termination resembles a cellular stress response but selectively increases chromatin accessibility downstream of genes. Journal Article

In: PLoS Pathog, vol. 14, iss. 3, pp. e1006954, 2018, ISSN: 1553-7374.

2017

Bonfert, Thomas; Friedel, Caroline C

Prediction of poly(A) sites by poly(A) read mapping. Journal Article

In: PLoS One, vol. 12, iss. 1, pp. e0170914, 2017, ISSN: 1932-6203.

2015

Rutkowski, Andrzej J; Erhard, Florian; L'Hernault, Anne; Bonfert, Thomas; Schilhabel, Markus; Crump, Colin; Rosenstiel, Philip; Efstathiou, Stacey; Zimmer, Ralf; Friedel, Caroline C; Dölken, Lars

Widespread disruption of host transcription termination in HSV-1 infection. Journal Article

In: Nat Commun, vol. 6, pp. 7126, 2015, ISSN: 2041-1723.
