Exercises for the course RNA-seq data analysis with Chipster

Keywords

FASTQ, QC, Pre-processing, Alignment, BAM, Expression-estimation, Feature-summarisation, Differential-expression, Statistical-model, Exploratory-analysis

Type

  • Practical

Description

This practical covers the whole RNA-seq data analysis pipeline, from quality control of raw reads to differential expression analysis, using the free Chipster software. The material was updated in December 2015.

Aims

  • Performing RNA-seq analysis
  • Recognizing and troubleshooting issues with the data

Prerequisites

  • As the exercises use the user-friendly Chipster software, no command-line or R experience is required.

Target audience

  • The course is suitable for any researcher interested in learning RNA-seq data analysis.

Learning objectives

  • Applying FastQC quality control software and interpreting the output
  • Performing preprocessing with Trimmomatic software
  • Producing alignment with TopHat2
  • Interpreting the aligner output
  • Visualising alignments with the Chipster genome browser
  • Applying RSeQC software for alignment-level QC and interpreting the output
  • Producing a table of read counts with HTSeq software
  • Identifying confounding effects with PCA and MDS plots and taking necessary action
  • Performing DE analysis with edgeR and DESeq2 and interpreting the output
  • Understanding and performing multifactor analysis
  • Comparing gene lists with a Venn diagram
  • Producing plots with DESeq2: normalized counts for a gene, dispersion plot, MA plot, p-value distribution
  • Operating Chipster software

Materials

  • Practicals on RNA-seq data analysis

Data

The datasets for the exercises are available on the Chipster server as example sessions. Two datasets are used:

  • Raw reads from human hESC1 and GM12878 cells produced by the ENCODE project.
  • Table of per-gene read counts from an experiment by Brooks et al., which studied the effect of RNAi knockdown of the splicing factor Pasilla in Drosophila melanogaster. The read counts were obtained from the pasilla Bioconductor package.

Timing

The lecture and practicals can be performed in one day.

Content stability

The content is updated approximately every 3 months.

Technical requirements

  • Chipster software v3.6.3 or later

Literature references

  • Suitable reading includes the book RNA-seq Data Analysis: A Practical Approach

Authors: Eija Korpelainen @eija, ekorpelainen@gmail.com

