There is a critical need for bioinformatic and photonic tools to comprehensively characterize viruses, predict their pathogenic potential, uncover virus tropism, and study their fast adaptation and diversity. The COVID-19 pandemic, a good example of real-time evolution, has highlighted our deficiency in fundamental data and tools related to viral genomes, host transcriptomes, virus phylogenetics, morphology, and surveillance, emphasizing the immediate need for improved preparedness in the face of future viral outbreaks.


Tools for virus genome identification and assembly of quasispecies.

Inferring viral genetic diversity in mixed samples from deep coverage sequencing data remains a major challenge. Effective viral haplotype reconstruction tools require unique haplotypes, adequate read length, and sufficient coverage. Therefore, there is an urgent need for a unified workflow that streamlines the various processing steps involved in viral diversity studies and facilitates the daily work of clinicians and virologists. In addition, targeted hypermutation processes result in substantial genomic variation, but the computational tools required to analyze these mutation patterns within quasispecies remain lacking.
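As a minimal illustration of what "genetic diversity in mixed samples" means at the data level, the following sketch computes the per-site Shannon entropy of a toy read pileup and reports the variable positions. The function names, the entropy threshold, and the reads are hypothetical and chosen only for illustration; real haplotype reconstruction additionally has to handle alignment, sequencing errors, and linkage between sites.

```python
import math
from collections import Counter


def site_entropy(column):
    """Shannon entropy (bits) of the bases observed at one alignment column."""
    counts = Counter(b for b in column if b in "ACGT")
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def variable_sites(reads, min_entropy=0.5):
    """Return (position, entropy) for columns with entropy >= min_entropy.

    Assumes the reads are already aligned; min_entropy is an arbitrary
    illustrative threshold, not a recommended value.
    """
    length = min(len(r) for r in reads)
    result = []
    for pos in range(length):
        h = site_entropy(r[pos] for r in reads)
        if h >= min_entropy:
            result.append((pos, h))
    return result


# Toy pileup: hypothetical, already aligned reads of equal length.
reads = ["ACGTAC", "ACGTAC", "ACATAC", "ACATAC"]
sites = variable_sites(reads)
```

In this toy pileup only the third column mixes two bases in equal proportion, so it is the only site reported.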

Tools for virus genome annotation.

Annotation of viral genes may appear straightforward due to the relatively low gene content of viruses. However, this task is significantly complicated by a variety of features common to viruses or virus families, including polycistronic mRNAs, overlapping genes, splicing, internal ribosome entry sites, and multiple AUG triplets upstream of the open reading frame. Regrettably, there is currently no universally accepted standard pipeline for annotating viral genomes and existing annotation pipelines are generally tailored to specific virus families.
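To make the problem of overlapping genes concrete, the sketch below scans the three forward reading frames of a toy sequence for ATG-initiated open reading frames and reports pairs that overlap in different frames. It is a simplified illustration under stated assumptions: the sequence is invented, the reverse strand, splicing, and alternative start codons are ignored, and the function names are hypothetical rather than part of any existing annotation pipeline.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq, min_codons=3):
    """Return (start, end, frame) for ATG-initiated ORFs on the forward strand.

    start is inclusive; end is exclusive and includes the stop codon.
    Only the first ATG of each open stretch is reported.
    """
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                if (i + 3 - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, frame))
                start = None
    return orfs


def overlapping_pairs(orfs):
    """Return pairs of ORFs from different frames whose coordinates overlap."""
    pairs = []
    for a in range(len(orfs)):
        for b in range(a + 1, len(orfs)):
            s1, e1, f1 = orfs[a]
            s2, e2, f2 = orfs[b]
            if f1 != f2 and s1 < e2 and s2 < e1:
                pairs.append((orfs[a], orfs[b]))
    return pairs


# Invented toy sequence with one ORF in frame 0 and one in frame 1.
seq = "ATGCATGCCAAATAAGTGA"
orfs = find_orfs(seq)
pairs = overlapping_pairs(orfs)
```

Even in this 19-nucleotide example, most of the sequence is coding in two frames at once, which a one-gene-per-locus annotation model cannot represent.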

Tools for virus genome alignments.

The rapid development of high-throughput technologies has opened opportunities to address entirely new questions, approaches, and research topics in virology. Unfortunately, existing multiple sequence alignment (MSA) tools, developed primarily for bacteria and higher eukaryotes, encounter challenges when applied to viral data due to the unique characteristics of viruses, such as their small genome size and considerable diversity within individual populations. In addition, the structure of an RNA is often considered more important than its sequence. Consequently, it is insufficient to rely only on sequence information within an MSA; it is important to also consider potential RNA-RNA interactions. While structure-guided alignments are invaluable, they involve significant computational costs and are generally limited to short sequences.
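The point that structure can be conserved where sequence is not can be illustrated with a small sketch. Given a toy RNA alignment, it compares the sequence identity of a column with the fraction of sequences in which two columns can form a base pair. The alignment and function names are hypothetical, and the check is only a covariation indicator under the assumption of canonical and G-U wobble pairs; it is not a structure-guided aligner.

```python
# Watson-Crick pairs plus the G-U wobble pair.
CANONICAL_PAIRS = {
    ("A", "U"), ("U", "A"),
    ("G", "C"), ("C", "G"),
    ("G", "U"), ("U", "G"),
}


def pairing_support(alignment, i, j):
    """Fraction of sequences whose bases in columns i and j can form a pair."""
    paired = sum((seq[i], seq[j]) in CANONICAL_PAIRS for seq in alignment)
    return paired / len(alignment)


def column_identity(alignment, i):
    """Fraction of sequences sharing the most common base in column i."""
    column = [seq[i] for seq in alignment]
    return max(column.count(b) for b in set(column)) / len(column)


# Invented toy alignment: columns 0 and 6 vary but always remain complementary.
alignment = ["GCAAAGC", "ACAAAGU", "CCAAAGG", "UCAAAGA"]
support = pairing_support(alignment, 0, 6)
identity = column_identity(alignment, 0)
```

Here the first column has no majority base at all, yet every sequence can still pair columns 0 and 6, which is the kind of signal a purely sequence-based score would miss.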

Tools for virus phylogeny.

Virus phylogeny studies employ almost exclusively general-purpose tools that have not been tailored to the unique features of viral genomes. Available tools have significant scalability issues when confronted with viruses that have millions of sequenced isolates. At best, they provide only a rough approximation of the complexity within viral populations and have difficulty coping with the exceptionally large genetic distances that occur in viral phylogeny. In addition, mechanisms that are rare exceptions in cellular systems, and can safely be ignored in phylogenetic analyses there, are far more abundant and non-negligible in viruses. These mechanisms include frameshifts, codon skipping, ribosomal shunting, leaky scanning, superimposed reading frames, and the influence of RNA secondary structure.
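One reason large genetic distances are hard to handle can be shown with the classical Jukes-Cantor (JC69) correction, used here purely as a simple, well-known example and not as the model any particular tool employs. The corrected distance grows without bound as the observed proportion of differing sites approaches 0.75, so small errors in highly divergent comparisons translate into very large errors in the estimated distance. The function names and sequences below are hypothetical.

```python
import math


def p_distance(seq_a, seq_b):
    """Proportion of differing sites between two aligned, gap-free sequences."""
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be aligned and non-empty")
    mismatches = sum(a != b for a, b in zip(seq_a, seq_b))
    return mismatches / len(seq_a)


def jc69_distance(p):
    """Jukes-Cantor distance d = -3/4 * ln(1 - 4p/3).

    Returns infinity once p reaches the saturation limit of 0.75.
    """
    if p >= 0.75:
        return math.inf
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


# Invented toy sequences differing at 3 of 10 sites.
p = p_distance("ACGTACGTAC", "ACGAACTTAG")
d = jc69_distance(p)
```

For moderately divergent sequences the correction is small, but near saturation the estimate becomes unstable, which is the regime many viral comparisons fall into.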

Tools for host response: Transcripts.

Our understanding of the transcriptome responses within the direct host cells of viruses remains somewhat limited. To identify shared and unique molecular mechanisms during the initial stages of infection, it becomes essential to examine a broader spectrum of viruses within their primary host cells. Nonetheless, effectively distinguishing between transcriptomic variations attributable to the virus and those stemming from the host is a non-trivial task, necessitating further advancements in statistics and computational methods.

Tools for host response: Proteins.

The cellular phenotype during infection depends on the factors and conditions under which the virus affects its host. Single-cell mass spectrometry (scMS) allows the quantification of approximately 1,000 proteins per cell and can detect heterogeneity and cell-specific proteins, which can serve as an initial basis for further investigation. A significant advance in viral infection research would be an approach that not only comprehensively characterizes the proteome, but also captures the entirety of the cellular response, providing a holistic view of how the host responds to viral infection.

Tools for host response: Metabolites.

Metabolites, as intermediate or end products of metabolism, are an important indicator of the cellular host response to viral infection. To the best of our knowledge, all previous studies have restricted their attention to primary metabolites and a few structurally restricted lipid classes.

Tools for optimizing antiviral strategies.

While there are numerous instances of successful antiviral development, current approaches face limitations, primarily due to the structural complexity of viral proteins. To strike a balance between this complexity and computational efficiency, modeling techniques rely on various approximations. One common approach is the design of sequences on a single, unchanging protein backbone. While these approximations facilitate the computation of molecular structures within manageable timeframes, the efficacy of design outcomes is modest. Consequently, a significant amount of in vitro screening and validation is necessary to create antiviral agents with the desired properties.

Tools for surveillance.

In response to the SARS-CoV-2 pandemic, numerous countries have established systematic genomic surveillance initiatives, resulting in an unprecedented increase in genomic data, with tens of thousands of sequences being deposited every day. There is a need for automated procedures to identify emergent viral lineages with altered phenotypes and to detect de novo lineages and amino acid changes under selection, together with statistical evaluations to interpret the data. While methods for detecting positively selected lineages and antigenic variants have been developed, they need to be improved for rapid real-time analyses and extended to process the extremely large datasets that are currently being generated. Further, we urgently need to generalize these methods to other viruses, in particular potentially emerging future pathogens.
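A very simple form of automated lineage screening can be sketched as follows: for each lineage, fit a straight line to the log-odds of its daily frequency and flag lineages whose slope exceeds a threshold. This is only an illustrative heuristic under stated assumptions, not one of the published selection-detection methods referred to above; the lineage names, counts, pseudocount, and threshold are all hypothetical.

```python
import math


def logit_growth_rate(lineage_counts, total_counts):
    """Least-squares slope of log-odds lineage frequency against day index.

    A pseudocount of 0.5 keeps the log-odds finite when a count is 0
    or equals the daily total. Requires at least two days of data.
    """
    days = list(range(len(lineage_counts)))
    logits = [
        math.log((k + 0.5) / (n - k + 0.5))
        for k, n in zip(lineage_counts, total_counts)
    ]
    mean_x = sum(days) / len(days)
    mean_y = sum(logits) / len(logits)
    sxx = sum((x - mean_x) ** 2 for x in days)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(days, logits))
    return sxy / sxx


def flag_growing(counts_by_lineage, total_counts, threshold=0.05):
    """Return sorted names of lineages whose daily log-odds slope exceeds threshold."""
    return sorted(
        name
        for name, counts in counts_by_lineage.items()
        if logit_growth_rate(counts, total_counts) > threshold
    )


# Hypothetical daily sequence counts over five days.
totals = [100, 100, 100, 100, 100]
counts = {
    "X.1": [1, 2, 4, 8, 16],      # rising share
    "Y.2": [50, 48, 45, 40, 30],  # falling share
}
flagged = flag_growing(counts, totals)
```

A production system would additionally need uncertainty estimates, correction for sampling bias, and an implementation that scales to the data volumes described above.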

Tools for analyzing virus morphology.

Virus morphology is ideally visualized directly using microscopy techniques. However, these techniques come with limitations and challenges, such as sophisticated fixation and labeling protocols, the danger of phototoxic effects, and increased background signals.

Tools for analysing virus entry and trafficking.

Virus infection begins with cellular entry and intracellular trafficking, processes that vary by virus and host cell and critically determine infectivity and genome release, typically via membrane fusion or endocytic pathways. Understanding these mechanisms is essential but experimentally challenging due to nanoscale spatial features (<200 nm), rapid spatiotemporal dynamics, and difficulties in labelling without artefacts, requiring carefully tailored experimental designs and controls. A broad photonic toolbox—including live-cell optical microscopy, super-resolution microscopy, electron and X-ray microscopy, tip-based methods (AFM, SNOM), and label-free Raman-based techniques—provides complementary insights into virus uptake and virus–host interactions. However, current methods are complex and not yet suited for rapid, predictive determination of entry mechanisms. To address this gap, coordinated efforts within the CRC VirusREvolution focus on optimizing experimental design and developing correlative multimodal imaging and automated data analysis pipelines to ultimately link virus entry mechanisms with infectivity in a predictive manner.

Tools for analysing image data on virus dynamics and interactions.

Automated, AI-based image analysis is essential for quantitative studies of virus–host interactions, as manual analysis is biased and unsuitable for the nanoscale, low-signal, and highly heterogeneous nature of virus imaging. Machine-learning tools for segmentation, tracking, and denoising enable robust quantification of virus dynamics, while platforms such as JIPipe integrate these methods into reproducible, FAIR-compliant workflows. Further development of such integrated tools within the CRC aims to enable objective characterization of virus dynamics and accelerate virus research.

Tools for visualising virus–host response.

Exposure to viruses, even without productive infection, triggers complex biochemical, structural, and metabolic host-cell responses that influence infection outcomes and immune control. Label-free photonic techniques such as Raman, IR, OPTIR, coherent Raman imaging, and advanced fluorescence methods enable high-resolution, non-destructive analysis of these responses by capturing metabolic states, molecular redistribution, and redox balance in living cells. When combined with AI-assisted data analysis and correlative multimodal imaging, these approaches allow single-cell quantification of heterogeneous host responses, a key focus of ongoing efforts within the CRC VirusREvolution.

Overall significance of photonic tools within the CRC VirusREvolution.

In summary, the photonic tools developed within this CRC enable quantitative, multiscale linkage of virus morphology, entry mechanisms, and host-cell responses, providing an integrated structural, chemical, and functional view of virus–host interactions beyond the reach of molecular or omics approaches alone. By combining super-resolution, vibrational, and correlative multimodal imaging with AI-based data analysis, the CRC establishes a platform for rapid and predictive characterisation of emerging viruses, positioning photonic technologies as a central methodological pillar of the CRC VirusREvolution.