Using bioinformatics to hunt SARS-CoV-2, its variants & its origins – a practical guide

This Practical Guide outlines basic bioinformatics approaches for exploring the SARS-CoV-2 genome and the proteins it encodes, focusing on the spike protein, which is exposed on the surface of the viral particle. It explores how bioinformatics can be used to study a new virus: its genome, its proteins, its origins and its evolution.

Specifically, this Guide introduces a range of bioinformatics tools for comparing and analysing nucleotide and protein sequences. After reading the Guide and completing the exercises, you will be able to:
- discover SARS-CoV-2 genome(s) available in a public nucleotide repository;
- compare SARS-CoV-2 genome sequences, look for their differences (mutations) and identify the variants;
- translate the spike gene into its encoded protein sequence;
- discover the 3D structure of the spike protein;
- understand the impact of mutations on infectivity and immune responses; and
- infer the origin of SARS-CoV-2 by comparing coronavirus spike protein sequences from different animal origins.
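
Two of the tasks listed above, translating a gene into its protein sequence and spotting differences between sequences, can be sketched in a few lines of Python. This is only an illustrative sketch and not the Guide's own method, which may rely on web-based tools: the function names `translate` and `find_substitutions` are hypothetical, the input sequences are short made-up toy examples rather than real SARS-CoV-2 data, and the two protein sequences are assumed to be already aligned and of equal length.

```python
from itertools import product

# Standard genetic code, with codons ordered by the bases T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate a coding sequence into one-letter amino acids.

    Translation stops at the first stop codon. Codons containing
    unexpected characters are reported as 'X'.
    """
    dna = dna.upper().replace("U", "T")  # also accept RNA input
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE.get(dna[i:i + 3], "X")
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


def find_substitutions(reference, variant):
    """List substitutions between two aligned, equal-length proteins.

    Each substitution is written as reference residue, 1-based
    position, variant residue (for example 'D3G').
    """
    if len(reference) != len(variant):
        raise ValueError("sequences must be aligned and of equal length")
    return [
        f"{ref}{pos}{var}"
        for pos, (ref, var) in enumerate(zip(reference, variant), start=1)
        if ref != var
    ]


if __name__ == "__main__":
    # Toy sequences for illustration only; not real viral data.
    reference_protein = translate("ATGTTTGATTAA")
    variant_protein = translate("ATGTTTGGTTAA")
    print(reference_protein)  # MFD
    print(variant_protein)    # MFG
    print(find_substitutions(reference_protein, variant_protein))  # ['D3G']
```

Real genome comparisons also have to handle insertions and deletions, which requires a proper sequence alignment step before positions can be compared in this way.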

DOI: https://doi.org/10.7490/f1000research.1118746.1

Licence: Creative Commons Attribution-ShareAlike 4.0 International

Keywords: Bioinformatics for schools, basic bioinformatics, SARS-CoV-2 pandemic, genome analysis, protein sequence analysis, protein structure analysis, virus variants, spike protein, training material


Additional information

Target audience: Trainers, training instructors, training designers, PhD students, post-docs, life science researchers

Resource type: Training materials

Authors: Marie-Claude Blatter, Philippe Lemercier, Teresa Attwood, The GOBLET Foundation