
Project Area A
Projects of the CRC 1768
Tools for nucleotide sequences and regulation
Even the annotation of virus genomes cannot be performed adequately with off-the-shelf tools because the underlying databases are largely incomplete. Bioinformatic analysis of the host response is established at the transcriptome level, but the signalling pathways of individual viruses still have to be revealed. During the COVID-19 pandemic, efforts were made to predict virus infection severity from sequence data, motivated by the vast amount of training data available. Yet these efforts were largely futile, especially for machine learning approaches. Beyond the threat of virus infections, bacteriophages may serve as medical treatments: the “One Health” approach emphasises the need for scientific, economic, policy, and legal measures to promote their use, as demonstrated by vibriophage N4. A potential explanation for the unsatisfactory performance of current bioinformatic tools is that developers tried to leapfrog virus understanding: machine learning models were trained to jump directly from sequence to diagnosis. As part of this CRC, we will develop computational methods that approach virus sequence data at a nuanced and “more understanding” pace.
Project A01 will develop a bioinformatic tool to monitor virus development using deep sequencing data, enabling the quasispecies reconstruction of viruses through haplotype analysis. This approach will also lead to a fundamental understanding of the host range of viruses in correlation with their quasispecies size. The data structure underlying this tool may also contain novel annotation features beyond single-gene information, such as transcription start sites (e.g. of vibriophage N4) or RNA secondary structures (e.g. of SARS-CoV-2), which can be used directly in project A02 for a more accurate phylogenetic tree reconstruction tool for virus quasispecies or ancient viruses (> 2 000 years old). A03 will provide computational methods for the analysis of transcriptomic data of the heavily infected host cell. Here, data analysis is complicated by the fact that we can only analyse the mixture of virus and host transcriptomes, requiring in silico deconvolution of the data. Finally, A04 will blend in vitro experiments and machine learning, enabling a neural network to choose the direction of RNA-driven antiviral wet-lab experiments.
Do viruses exploit their quasispecies for host range evolution?
Viruses exist as dynamic populations of closely related virus genomes arising from mutations, known as quasispecies. We hypothesise that viruses use their quasispecies to expand their evolutionary potential, making them critical for adaptation to new hosts and for resistance to host defences or immunity. Yet, the evolutionary trajectories of viruses cannot be fully understood without considering their ecological context. Host range and environmental conditions act as powerful filters and drivers of virus diversification, raising fundamental research questions at the interface of virus ecology and evolution: How do host interactions and environmental factors shape the emergence, stability, and adaptability of virus quasispecies? If the genetic diversity of the quasispecies reflects the evolutionary potential and ecological interactions of the virus, by which molecular mechanisms do viruses exploit their quasispecies for host range evolution? To address these fundamental questions at the core of our project, we will develop and apply a novel suite of computational tools based on Sequence Variation Graphs (SVGs). SVGs are increasingly utilised for population structure analysis in higher organisms, but their application in virology is limited due to the high mutation rates and genomic diversity of viruses. Nevertheless, they offer potential for analysing data from both genomic and metagenomic samples. In work package WP 1, we will build a quasispecies Sequence Variation Graph (qs-SVG) toolkit that can store sequencing data of virus populations, and we will use and further improve the tool in the remaining work packages. Once the tool is built, we will first apply it to an ideal case where abundant data is available, i.e., SARS-CoV-2 and influenza viruses, before taking on a more challenging case, i.e., bacteriophages found in environmental metagenomes.
By combining these two study systems in our project, we will be able to test different functionalities of the qs-SVG toolkit and develop an optimal bioinformatic solution. In WP 2, we will exploit new and existing data on the quasispecies of human viruses and bacteriophages with broad and narrow host ranges, to test the specific hypothesis that viruses with a broad host range also have a large quasispecies. In WP 3, we will investigate under what conditions viruses evolve their quasispecies. We will examine the relationship between quasispecies, host diversity, and the environment, using both in vitro data from isolates, and in situ data by screening metagenomic data sets derived from environmental samples. Finally, in WP 4 we will focus on the underlying mutational mechanisms. How does quasispecies sequence variation arise? Can viruses exploit mutation to expand their host range, and what molecular mechanisms enable this?
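To illustrate the kind of data structure meant in WP 1 and the "quasispecies size" comparison in WP 2, the following is a minimal Python sketch, not the planned qs-SVG toolkit. It assumes that haplotypes have already been reconstructed, aligned, and trimmed to equal length. The class name `ToyVariationGraph`, its methods, and the example sequences and read counts are hypothetical. Nodes are (position, base) pairs, edges record read support, and the Shannon entropy of haplotype frequencies is used as one possible proxy for quasispecies size.

```python
from collections import Counter, defaultdict
from math import log


class ToyVariationGraph:
    """Hypothetical toy graph: nodes are (position, base), edges join adjacent positions."""

    def __init__(self):
        self.edge_support = Counter()  # (node_from, node_to) -> supporting read count
        self.haplotypes = Counter()    # haplotype sequence -> read count
        self.length = None             # common length of all aligned haplotypes

    def add_haplotype(self, seq, count=1):
        """Thread one aligned haplotype through the graph with its read count."""
        if self.length is None:
            self.length = len(seq)
        elif len(seq) != self.length:
            raise ValueError("haplotypes must be aligned to equal length")
        nodes = [(i, base) for i, base in enumerate(seq)]
        for a, b in zip(nodes, nodes[1:]):
            self.edge_support[(a, b)] += count
        self.haplotypes[seq] += count

    def variable_sites(self):
        """Return positions at which at least two different bases are observed."""
        bases = defaultdict(set)
        for seq in self.haplotypes:
            for i, base in enumerate(seq):
                bases[i].add(base)
        return sorted(i for i, seen in bases.items() if len(seen) > 1)

    def shannon_entropy(self):
        """Shannon entropy (natural log) of the haplotype frequency distribution."""
        total = sum(self.haplotypes.values())
        return -sum((c / total) * log(c / total) for c in self.haplotypes.values())


# Made-up example population: three haplotypes with illustrative read counts.
g = ToyVariationGraph()
g.add_haplotype("ACGT", 6)
g.add_haplotype("ACTT", 3)
g.add_haplotype("GCGT", 1)

print(g.variable_sites())             # positions where the haplotypes differ
print(round(g.shannon_entropy(), 3))  # diversity proxy for this toy population
```

In this sketch, a higher entropy indicates a population whose reads are spread more evenly over more haplotypes. Other diversity measures could be substituted without changing the graph representation.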
Project Leaders
Prof. Dr. Bas E. Dutilh
Institute of Biodiversity, Ecology,
and Evolution, Friedrich Schiller University Jena
Prof. Dr. Kirsten Küsel
Institute of Biodiversity, Ecology,
and Evolution, Friedrich Schiller University Jena
Phylogeny of functional sequence elements in virus genomes
Project Leaders
Prof. Dr. Caroline Friedel
Institute for Informatics,
Ludwig-Maximilians-University Munich
Dr. Denise Kühnert
Centre for Artificial Intelligence in Public Health,
Robert Koch Institute
Detecting time-resolved and virulence-associated host responses to virus infection
Project Leaders
Prof. Dr. Steve Hoffmann
Faculty of Biological Sciences,
Friedrich Schiller University Jena,
Leibniz Institute on Aging, Fritz Lipmann Institute
Prof. Dr. Friedemann Weber
Institute of Virology,
Veterinary Medicine,
Justus Liebig University Giessen
Harnessing synthetic small RNAs to probe, decode, and optimise phage-host interactions
Project Leaders
Prof. Dr. Manja Marz
Institute of Computer Science,
Friedrich Schiller University Jena
Prof. Dr. Kai Papenfort
Institute for Microbiology