Project Area B

Projects of the CRC 1768

Tools for Virus Interaction and Structure

This project area develops bioinformatic tools that focus on molecular structures, functions, and interactions rather than on nucleotide sequence-level data. Understanding virus infection requires an appreciation of how closely the pathomechanism of a virus is connected with the host cell, which makes it important to comprehend both virus structure and host cell response. This project area will complement the genomic and transcriptomic data from Project Area A with information on molecular structures, functions, and interactions. Here, we will focus on developing tools for four types of structures to gain additional insights into viruses and virus infections: first, a tool for chemical structures and small molecules; second, a tool for RNA secondary structures interacting with immune proteins; third, a tool for protein 3D structures; and fourth, a tool for computational structures that allows the surveillance and modelling of viruses and their evolution. Only a combined view that includes different omics data (RNA secondary structures in transcripts, metabolites, and proteins) can give us a full understanding of the virus and its host. Yet, “subtle changes” in an RNA or DNA sequence or in a metabolite or protein structure can result in dramatic changes in the structural phenotype.
Examples include an RNA molecule that folds incorrectly, or a tiny modification of a small molecule that substantially influences its binding to proteins. For example, the addition of a single carbon atom (chemically speaking, of a methylene group) can change the perceived smell of a metabolite (chemically speaking, the binding of a small molecule to an olfactory sensory neuron) from “vanilla” to “stool”. Metabolites form the building blocks of all life on Earth, and an analysis of a biological system that ignores metabolites is necessarily incomplete and restricted in its explanatory power. Virus infections may modify the metabolism of an organism well beyond primary, well-described metabolites; to this end, our methods will put particular focus on such “secondary metabolites” (B01). Virus RNAs trigger innate immunity through specific sequence and structural features, and analyses that overlook these determinants remain incomplete. This project will identify the RNA motifs sensed by different immune receptors and clarify how their combined action shapes antiviral responses, enabling us to test whether innate immunity arises from coordinated pathway activation rather than a single trigger (B02). The COVID-19 pandemic demonstrated not only the importance of vaccination but, even more so, the importance of being able to develop new vaccines as quickly as possible: a vaccine arriving a month later may cost thousands of human lives. We will develop computational tools to identify conserved regions of virus surface proteins, and we will try to improve recognition by the immune system to speed the development of an effective vaccine (B03). In a similar direction, surveillance of the genetic changes in a virus quasispecies is of utmost importance to allow for a timely response to novel emerging viruses. This was again vividly demonstrated during the COVID-19 pandemic, as new virus variants emerged with new abilities to evade immunity in the population.
Hence, tools for monitoring these changes, and for simulating virus-host interactions to identify potentially dangerous changes before they actually occur in nature, are of the highest interest (B04). Overall, Project Area B contributes to all research goals: G1 Tools for the systematic characterisation of novel viruses, G2 Tools for precise and comparative analysis of virus evolution, G3 Tools for the generalisation and quantification of virus-host interactions, and G4 Tools for the early assessment and prediction of virus infection potential.

Chemical mediators of virus infection

Metabolites are small molecules that participate in, and arise from, cellular metabolism. They span all chemical classes including sugars, amino acids, nucleotides, lipids, oxylipins and oxidised lipids, and many more. Metabolites show extensive structural diversity that is not dictated by polymeric templates. Beyond endogenously synthesised compounds, the metabolome also includes exogenous molecules and their biotransformation products. These metabolites can mediate interactions between cells and organisms of all phyla. The diversity, dynamic production, and turnover of metabolites make their analysis highly challenging. Because metabolites are key to understanding biological processes, including resistance to infection, their study is highly rewarding.
Virus infections are associated with substantial rewiring of the host metabolome. For example, viruses incorporate host metabolites into their own structures, thus repurposing compounds for virogenesis. In addition, viruses co-opt host lipids for replication, and lipid droplets may support persistent propagation of virus infection. The host may defend itself against virus infection using a wide range of metabolites, including small cyclic nucleotides or modified nucleotides. In addition, (modified) lipids and other metabolites are often produced by an organism as part of the immune response. During these processes, infected cells and organisms can substantially alter their physiology and ecological function within a community.
In this project, we will establish an experimental and computational platform for untargeted metabolomics that allows us to monitor the changes in metabolism induced by a virus infection (Fig. B01.1). We will develop experimental and computational methods that cover a broad range of small molecules, with a special focus on (modified) lipids and cyclic nucleotides produced in response to virus infections. These compound classes are central to virus infection processes, but notoriously difficult to investigate using current computational approaches. We will optimise analytical and computational methods side by side. Our computational methods will be made available via the well-established and frequently used SIRIUS platform from the Böcker lab, and will also be integrated into the joint computational platform of the CRC VirusREvolution. We will use our platform to unravel intrinsic and induced metabolomic properties of virus infections and to monitor the associated, even subtle, changes in metabolism over time.
Our metabolomics approaches will enable the ecological and pathological monitoring of virus infections. The emerging metabolic patterns will be linked to the transcriptomics and imaging platforms of this CRC VirusREvolution (G1). Bioassay-guided, functional verification of dysregulated metabolites and pathways will be carried out together with Z03 (Fröhlich/Höppener/Reiche). The combination of the ecometabolomics approach with advanced computational methods, both existing and newly developed as part of this project, will enable the annotation of an unprecedented diversity of metabolites relevant in these interactions (Fig. B01.1). The resulting annotations will open new perspectives on virus mechanisms beyond primary metabolism. All data and results from the project will be collected and made available through the VirJenDB by NFDI4Microbiota, see Z02 (Barth/Cassman/Gerlach/König-Ries).
Studying the metabolomic response to a virus infection is not as well established as studying virus genomes. Consequently, part of this project will be to establish protocols, on both the experimental and the computational side, for carrying out metabolomic analysis. We will publish the established experimental and computational protocols to benefit the community. Our experimental and computational framework will be open to the investigation of all infection systems within this CRC, including phage infections. We will need the emerging large body of data to optimise experimental protocols and to get started with computational methods development.

Project Leaders

Prof. Dr. Sebastian Böcker

Institute for Computer Science,

Friedrich Schiller University Jena

Prof. Dr. Georg Pohnert

Institute for Inorganic and Analytical Chemistry (IAAC),

Friedrich Schiller University Jena

RNA determinants of the antiviral innate immune response

Virus RNAs trigger the innate immune response, which not only distinguishes foreign RNA from the cell’s own diverse complement of RNA molecules, but also varies among virus pathogens. Macrophages act as the first line of defence against invading viruses and are pivotal for the innate immune response, crucially regulating the entire inflammatory process during virus infections. Single- or double-stranded RNA viruses are recognised by several classes of RNA-binding proteins, in particular the RNA-dependent toll-like receptors (TLR) TLR-7 and TLR-8, the RIG-I-like receptors (RLR) RIG-I, MDA5, LGP2, and their homologues, and the double-strand sensing protein kinase R (PKR). While individual RNA ligands of these sensory proteins have been well studied, a comprehensive and comparative understanding of their evolutionary and structural diversity across virus groups remains lacking.
In our project, we therefore aim to systematically determine the key features of both RNA sequence and RNA structure required for binding to these sensory proteins and for the subsequent activation of the innate immune response. We aim to make this connection actionable by developing a comprehensive predictive toolkit, RNAinnate, designed to predict the tempo and mode of activation of the innate immune system from the virus RNA sequence. In detail, we will first investigate the specificity of RNA-protein binding for each purified RNA sensor in vitro, using randomised synthesised RNA libraries and employing cross-linking and immunoprecipitation sequencing methods (CLIP-seq). We will then transfect cells with distinct RNA sequences to validate our findings on the purified RNA sensors using electroporation- and DharmaFECT-based transfection. We will establish a transfection protocol using human lung epithelial cells (A549 cells and Calu-3 cells) and subsequently apply this technique to human primary macrophages. We will thoroughly examine the innate immune response, in particular the subsequent phosphorylation of downstream kinases of the specific RNA-binding proteins in the lung epithelial cells and in the macrophages. Afterwards, we will infect lung epithelial cells and human macrophages with SARS-CoV-2 to investigate virus-specific RNA-protein binding. Ultimately, we will infect cells in parallel with intact SARS-CoV-2, Influenza A virus (IAV), and respiratory syncytial virus (RSV) to map the previously identified triggering elements within their natural genomic contexts.
In order to determine characteristics of binding RNAs, we will evaluate enrichment and depletion of features such as secondary structure elements and local sequence motifs, as well as assess the distribution of folding energies. Moreover, we will employ unsupervised clustering techniques on these features, and combine the results of the different methods to extract descriptors of binding patterns in the form of covariance models and Bayesian descriptors similar to Dimont. Besides the determination of the RNA sequences, we will analyse the transcriptomic, metabolomic, and lipidomic changes of human primary macrophages related to the specific RNA sensors that occur following distinct RNA recognition. Finally, we will uncover evidence of selection pressure that removes or attenuates the individual RNA trigger elements for the innate immune system, providing insights into the mechanisms of RNA genome evolution. To achieve the latter goal, scalable, high-quality alignments of virus genomes are required that can combine inter-species comparison with information of strain-level variations. As no such tool exists, we will fill this gap with VirAligner, using novel combinations of existing approaches in comparative sequence analysis. The overview of the key tasks of this proposal is shown in Fig. B02.1.
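The enrichment and depletion analysis of local sequence motifs mentioned above can be illustrated with a minimal sketch. It is not the RNAinnate method; it only assumes that sensor-bound and background RNA sequences are available as plain strings, and it compares k-mer frequencies between the two sets. The function names, the pseudocount, and the toy sequences are all hypothetical.

```python
from collections import Counter
from math import log2


def kmer_counts(seqs, k):
    """Count every overlapping k-mer across a list of RNA sequences."""
    counts = Counter()
    for seq in seqs:
        for i in range(len(seq) - k + 1):
            counts[seq[i:i + k]] += 1
    return counts


def log2_enrichment(bound, background, k=3, pseudocount=1.0):
    """log2 ratio of k-mer frequencies in bound vs. background sequences.

    Positive values indicate enrichment among bound sequences,
    negative values indicate depletion. The pseudocount (an assumption
    of this sketch) avoids division by zero for unseen k-mers.
    """
    fg = kmer_counts(bound, k)
    bg = kmer_counts(background, k)
    kmers = set(fg) | set(bg)
    fg_total = sum(fg.values()) + pseudocount * len(kmers)
    bg_total = sum(bg.values()) + pseudocount * len(kmers)
    return {
        kmer: log2(((fg[kmer] + pseudocount) / fg_total)
                   / ((bg[kmer] + pseudocount) / bg_total))
        for kmer in kmers
    }


# Toy, made-up sequences purely for illustration (not experimental data).
bound = ["GGUUUGG", "AUUUGCA", "CCUUUAC"]
background = ["GGACAGG", "AUGCGCA", "CCAGCAC"]

scores = log2_enrichment(bound, background)
most_enriched = max(scores, key=scores.get)
most_depleted = min(scores, key=scores.get)
```

In a real analysis, such scores would need a statistical test and multiple-testing correction, and structural features would be evaluated alongside sequence motifs.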

Project Leaders

Prof. Dr. Peter F. Stadler

Institute of Computer Science,

Leipzig University

Dr. Paul M. Jordan

Institute of Pharmacy,

Friedrich Schiller University Jena

Uncovering virus glycoprotein conformational dynamics for rational vaccine design

Structure-based prediction and design have made tremendous progress over the last thirty years with Rosetta and, especially, since 2021, when AlphaFold2 was released. This progress towards the holy grail of computational structural biology was recognised with the 2024 Nobel Prize in Chemistry, awarded for computational protein design and protein structure prediction. The accurate prediction of the protein structure is the first step towards an in silico-first strategy to design vaccines and antibodies. Structure-based modelling will also provide estimates for host-receptor interactions, such as antibody-antigen complexes. With this information, the missing link between sequence and function can be established. Here, we are developing methods to predict structures of virus glycoproteins that will allow us to assess emerging virus variants. With structure-based methods, we will investigate the impact of the observed mutations in virus glycoproteins on the structure and the respective conformational states. This is critical for understanding the conformational space that mediates the fusion process. One major obstacle to overcome here is that AI tools for structure prediction lack the predictive power to provide pre- and post-fusion receptor states. We will subsequently predict the effect these mutations have on the structure. Together with the consortium partners, we will investigate whether our methods predict the effect and the respective function of the virus glycoprotein (variants). The major virus-host interaction we will study is the interaction with the immune system, with the spike protein being the major target of the humoral response to SARS-CoV-2. Antibodies, as major determinants of this immune response, are useful research tools and therapeutics but are challenging for structure prediction and design. Moreover, antigen-antibody interactions are inherently hard to predict and evaluate, even with emerging AI tools, due to the lack of data and the complexity of the molecular interaction.
Here, we will overcome these limitations and develop a new tool termed ANNtibody that takes atomic and electron density calculations into account. Thus far, such calculations could not be employed for systems with more than a few dozen atoms, but training AI models on electron density data or on Density Functional Theory (DFT) calculations circumvents these resource-limited steps. Data from NFDI4Chem and its associated repositories will be used to benchmark these methods.
With this increase in resolution, we hypothesise that our method will capture the complex interaction network in the antibody-antigen interface more accurately. These calculations will be used to predict antibody-antigen interactions, which we will challenge with the experimental design of antibodies for emerging SARS-CoV-2 variants using antibody interactions with the highly variable receptor-binding domain. With our method, we will update these antibody sequences and test experimentally whether we can overcome virus escape. Subsequently, we will design epitope-focused immunogens based on SARS-CoV-2 epitopes that will elicit broadly neutralising antibody populations. All experimentally obtained data will be used to refine the developed methods. Data provided by C02 (Deckert/Deinhardt-Emmer) on virus-host interactions, by A02 (Friedel/Kühnert) and NFDI4Microbiota on virus sequences, and by B04 (Dittrich/McHardy) on surveillance will be essential for training and optimising our tool. These datasets will be integrated with our structural features, enabling our tool to generate predictions that will directly inform and support our partner projects. Altogether, we will generate computational structure-based tools that will help address goals G1 and G3, providing insight into the host-receptor interaction. These tools will allow us to respond rapidly to emerging virus infections and can be tuned for vaccine design, probing our ability to generate a broadly neutralising vaccine.
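Epitope-focused design depends on knowing which positions of a surface protein vary across variants and which stay conserved. The following is a minimal sequence-level sketch of that idea, not the structure-based approach of this project: it assumes a pre-computed multiple sequence alignment given as equal-length strings and uses per-column Shannon entropy as a stand-in conservation measure. The function names, window size, threshold, and toy alignment are hypothetical.

```python
from collections import Counter
from math import log2


def column_entropy(column):
    """Shannon entropy (bits) of the residues in one alignment column.

    Gap characters ('-') are ignored; a gap-only column scores 0.0,
    which is a simplification of this sketch.
    """
    residues = [r for r in column if r != "-"]
    if not residues:
        return 0.0
    total = len(residues)
    counts = Counter(residues).values()
    return sum(-(n / total) * log2(n / total) for n in counts)


def conserved_windows(alignment, window=3, max_entropy=0.0):
    """Start indices of windows whose mean column entropy is <= max_entropy."""
    columns = list(zip(*alignment))
    entropies = [column_entropy(col) for col in columns]
    starts = []
    for i in range(len(entropies) - window + 1):
        mean = sum(entropies[i:i + window]) / window
        if mean <= max_entropy:
            starts.append(i)
    return starts


# Toy, made-up alignment: only the fourth column varies.
alignment = ["MKTAYLC", "MKTGYLC", "MKTSYLC", "MKTAYLC"]
windows = conserved_windows(alignment, window=3, max_entropy=0.0)
variable_column_entropy = column_entropy("AGSA")
```

Sequence conservation alone does not show whether a region is surface-exposed or accessible to antibodies, which is where structural information becomes necessary.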

Project Leaders

Prof. Dr. Jens Meiler

Institute for Drug Discovery,

Leipzig University Medical Faculty

JProf. Dr. Clara Schoeder

Institute for Drug Discovery,

Leipzig University Medical Faculty

Linking macroscopic evolution with molecular processes for rapidly evolving virus pathogens via data-driven inference and simulations

Virus pathogens such as SARS-CoV-2 and human Influenza A viruses are single-stranded RNA viruses with substantial capacity to mutate and to adapt to the human host for more efficient replication and spread. A multitude of factors affect the evolutionary patterns left in their genomes, such as adaptation to changing host immunity or for more efficient replication, phylogenetic spread, as well as uncharacterised processes on the cellular level. Continuous changes in the surface antigens of these viruses allow them to evade host immunity developed either through prior infection with previous strains or through vaccination. This capacity of a virus, known as immune escape, facilitates the reinfection of individuals. Consequently, vaccines protecting against such viruses need to be frequently updated to maintain their effectiveness against circulating variants. We hypothesise that our understanding of the complex interplay of these various processes from large-scale virus genome data can be improved by careful analysis and deconvolution with tailor-made computational techniques. This improved understanding will make virus evolution more predictable on the population level and facilitate the early identification of future emerging, antigenically altered variants of concern for public health.

We have recently developed techniques that allowed us to predict the emergence of relevant variants of SARS-CoV-2, as reported by the World Health Organization (WHO), well before this classification and before they reached their maximal abundances. We are also able to identify lineages with substantial antigenic alterations, which can inform considerations regarding vaccine strain updates. In this project, we will combine data-driven analytics of population-level virus diversity with molecular modelling across scales to link macroscopic virus evolution on a population level to molecular processes within the cell. By combining data-driven surveillance and simulation, we will be able to study evolutionary and epidemiological phenomena in both data and models, see Fig. B04.1. These include:

(1) Developing approaches for early detection and further characterisation of antigenically or otherwise phenotypically altered lineages identified by the WHO as Variants of Concern (VOCs) via virus genomic surveillance (G3). Early detection methods for identifying antigenically altered lineages classified by the WHO as concerning, of interest, or under monitoring have recently been developed in the McHardy lab. We will extend this approach to predict combinations of amino acid changes driving future predominant lineages, enabling earlier detection of potential VOCs than current methods.

(2) Developing a multi-scale simulation platform consisting of (i) a micro level, simulating virus replication within a cell; and (ii) a macro level, simulating virus evolution. At the micro level, we will develop a new type of rule-based description language, including RNA and dynamical compartments. The rule-based replication cycle model will allow us to trace a mutation through the replication cycle. This trace helps to clarify the putative effects of the mutation on the dynamics of the replication cycle, and also to explain how the mutation could affect the fitness of the virus.

(3) Combining the results from all work packages to disentangle the contributions of genetic drift, antigenic drift, and currently uncharacterised processes to the genetic diversity of circulating lineages and to study the evolutionary role of the “not-yet-explained” mutations that influence the replication cycle of the virus within the host cell.
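To make the idea of a rule-based replication cycle model more concrete, the following minimal sketch runs a standard Gillespie-style stochastic simulation over a handful of rules. It is not the description language proposed in this project, which is to include RNA and dynamical compartments; the species, rules, and rate constants are invented for illustration, and a "mutation" is modelled simply as a change in one rate constant. The count of how often each rule fired serves as a crude trace.

```python
import random


def make_rules(replication_rate=1.0):
    """Rules as (name, reactants, products, rate constant); all values hypothetical."""
    return [
        ("replicate", {"genome": 1}, {"genome": 2}, replication_rate),
        ("translate", {"genome": 1}, {"genome": 1, "protein": 1}, 2.0),
        ("package", {"genome": 1, "protein": 1}, {"virion": 1}, 0.05),
        ("degrade", {"genome": 1}, {}, 0.2),
    ]


def propensity(state, reactants, rate):
    """Mass-action propensity; valid here because all stoichiometries are 1."""
    value = rate
    for species in reactants:
        value *= state[species]
    return value


def simulate(rules, state, t_end, seed=0, max_events=10000):
    """Gillespie simulation; returns the final state and per-rule firing counts."""
    rng = random.Random(seed)
    state = dict(state)
    fired = {name: 0 for name, _, _, _ in rules}
    t = 0.0
    for _ in range(max_events):
        props = [propensity(state, reactants, rate)
                 for _, reactants, _, rate in rules]
        total = sum(props)
        if total == 0:
            break  # no rule can fire any more
        dt = rng.expovariate(total)
        if t + dt > t_end:
            break
        t += dt
        name, reactants, products, _ = rng.choices(rules, weights=props)[0]
        for species, n in reactants.items():
            state[species] -= n
        for species, n in products.items():
            state[species] = state.get(species, 0) + n
        fired[name] += 1
    return state, fired


start = {"genome": 1, "protein": 0, "virion": 0}
# Wild type versus a hypothetical mutant with faster genome replication.
wild_state, wild_trace = simulate(make_rules(1.0), start, t_end=3.0, seed=1)
mutant_state, mutant_trace = simulate(make_rules(1.5), start, t_end=3.0, seed=1)
```

Because the simulation is stochastic, a meaningful comparison of wild type and mutant would average over many seeds rather than rely on a single run.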

Project Leaders

Prof. Dr. Peter Dittrich

Institute for Computer Science,

Friedrich Schiller University Jena

Prof. Dr. Alice C. McHardy

Helmholtz Centre for Infection Research,

Department for Computational Biology for Infection Research