Example “Variant Calling”

TITLE: One-day variant calling module

## AUTHOR: Chiara Batini
## DESCRIPTION:
Comparing genome variation among populations or looking for disease genotypes is a complex task. Finding the variants that underlie these differences is the first step in understanding population structure or genotype-phenotype relations. This course shows participants an entire workflow, from the alignment of NGS data to a reference genome through to variant calling. Participants will be taught the basic conceptual aspects of the methods and will be guided through their practical application to a test dataset.
## AIMS:
The aim of this course is to illustrate the workflow needed to detect genomic variation from NGS data. It will teach students about the best practices for alignment and variant calling and allow them to maximise the potential of the data and identify its limitations. By the end of the day participants should have the basic skills to analyse their own data or to communicate with a bioinformatician.
TARGET AUDIENCE: Wet lab biologists who plan to analyse their own data or to interact with a bioinformatician
## PREREQUISITES:
* Unix command line
* basic fastq QC
* genetics
* sequencing technologies

LEARNING OBJECTIVES:

  • To define the standard file formats for representing sequence, alignment and variant data (fastq, bam, sam, vcf, bed)
  • To recognise the major steps in the computational analysis workflow and which software tools can be used at each stage
  • To visualise alignments and variants using a genome viewer
  • To apply filters to a list of variants
  • To evaluate the quality of alignments and variants
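
As an illustration of the variant-filtering objective, the following is a minimal Python sketch that keeps only the vcf records whose QUAL value reaches a threshold. It is not taken from the course handbook: the function name `filter_vcf_lines`, the threshold of 30 and the example records are all assumptions made for illustration. It relies only on the fixed column order of the vcf format (CHROM, POS, ID, REF, ALT, QUAL, FILTER, INFO).

```python
from typing import Iterable, Iterator


def filter_vcf_lines(lines: Iterable[str], min_qual: float = 30.0) -> Iterator[str]:
    """Yield header lines unchanged and records with QUAL >= min_qual.

    min_qual=30.0 is an illustrative default, not a course recommendation.
    """
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # skip empty lines
        if line.startswith("#"):
            yield line  # meta-information and column header lines
            continue
        fields = line.split("\t")
        qual = fields[5]  # sixth column of a vcf record is QUAL
        # "." marks a missing QUAL value; such records are dropped here
        if qual != "." and float(qual) >= min_qual:
            yield line


# Made-up records, for illustration only
example = [
    "##fileformat=VCFv4.1",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "chr1\t1000\t.\tA\tG\t50.0\t.\tDP=20",
    "chr1\t2000\t.\tC\tT\t12.5\t.\tDP=4",
]
kept = list(filter_vcf_lines(example))
```

In this sketch `kept` holds the two header lines plus the record at position 1000. In the practical itself, filtering would more likely be done with a dedicated tool such as vcftools.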

CONTENT:

Slides:

Going through the course usually takes the whole day, alternating between slides, explanations and practicals. The indicative length of each sub-section is:
* align reads to a reference genome: 1 hour
* bam manipulation: 1.5 hours
* bam visualisation and QC: 1 hour
* variant calling and filtering: 1.5 hours
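
The handbook contains the actual commands used in the practicals. Purely as orientation, the four sub-sections above could map onto a command sequence like the one sketched below in Python. This is an assumed outline, not the course's pipeline: the helper `build_pipeline`, the file names, the choice of the GATK `HaplotypeCaller` walker, the jar file locations and the `--minQ 30` threshold are all hypothetical. The sketch only builds the command lines; it does not run them.

```python
from typing import Dict, List


def build_pipeline(ref: str, fq1: str, fq2: str, sample: str) -> Dict[str, List[str]]:
    """Return one illustrative command per workflow stage (nothing is executed)."""
    sam = f"{sample}.sam"            # bwa mem writes SAM to stdout; redirect it here
    sorted_bam = f"{sample}.sorted.bam"
    dedup_bam = f"{sample}.dedup.bam"
    raw_vcf = f"{sample}.raw.vcf"
    # Read group passed to bwa mem; GATK needs a sample name (SM) in the bam header
    read_group = f"@RG\\tID:{sample}\\tSM:{sample}"
    return {
        # 1. align reads to a reference genome
        "index_ref": ["bwa", "index", ref],
        "align": ["bwa", "mem", "-R", read_group, ref, fq1, fq2],
        # 2. bam manipulation
        "sort": ["samtools", "sort", "-o", sorted_bam, "-T", f"{sample}.tmp", sam],
        "mark_duplicates": [
            "java", "-jar", "MarkDuplicates.jar",  # jar path is an assumption
            f"INPUT={sorted_bam}", f"OUTPUT={dedup_bam}",
            f"METRICS_FILE={sample}.metrics.txt",
        ],
        "index_bam": ["samtools", "index", dedup_bam],
        # 3. bam visualisation and QC (visualisation itself is done in a viewer)
        "qc": ["samtools", "flagstat", dedup_bam],
        # 4. variant calling and filtering
        "call": [
            "java", "-jar", "GenomeAnalysisTK.jar",  # jar path is an assumption
            "-T", "HaplotypeCaller", "-R", ref, "-I", dedup_bam, "-o", raw_vcf,
        ],
        "filter": [
            "vcftools", "--vcf", raw_vcf, "--minQ", "30",
            "--recode", "--out", f"{sample}.filtered",
        ],
    }


# Hypothetical file names, for illustration only
commands = build_pipeline("ref.fa", "reads_1.fastq", "reads_2.fastq", "sample1")
for stage, cmd in commands.items():
    print(f"{stage}: {' '.join(cmd)}")
```

The sketch leaves out the preparation steps that GATK also requires for the reference (a `.fai` index and a sequence dictionary), and it does not show redirecting the `bwa mem` output into the sam file.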

Handbook:

The handbook contains the practical exercises that are explained in the slides. Participants are asked to work through it step by step on their own, with support from the tutors when needed.

Software needed:

bwa 0.7.5a, samtools 1.1, picard 1.93, GATK 3.2-2, vcftools 0.1.12a, tablet 1.13.07.31

IT requirements:

1 CPU, 2 GB RAM

Datasets:

The dataset is provided as a zip file in this directory.

Stability of the content

TBD

Literature references

See slides.


Activity log