RNA-Seq Analysis with Biocluster and R

Keywords

RNA-Seq, Alignment, Annotation, BAM, Differential-expression, Exploratory-analysis, Expression-estimation, FASTA, FASTQ, Feature-summarisation, Pre-processing, QC, Statistical-model

Authors

Top | Keywords | Authors | Description | Aims | Prerequisites | Target audience | Learning objectives | Materials | Data | Technical requirements | Literature references

Description

Sequencing of RNA (RNA-Seq) is the latest method for assessing global gene expression, and it
can be applied to any species. This is a 4-day workshop for biologists who are doing, or plan
to do, RNA-Seq experiments and would like to learn how to analyze their data. It will cover
experimental design, evaluation of sequencing quality, genome alignment, gene count
extraction, differential expression analysis and downstream data mining. It will also include
discussions of what to do if your species lacks gene models and/or a genome.

Aims

This workshop aims to give biologists the information and skills they need to analyze RNA-Seq
data, including practical experience using command-line tools and R packages.

Prerequisites

  • Basic UNIX and how to submit jobs to a computing cluster
  • Basic knowledge of R

Target audience

  • Graduate students, postdocs and beginning faculty

Learning objectives

  • Be able to describe the steps in a typical RNA-Seq workflow.
  • Be able to follow and modify UNIX scripts for QC, trimming, alignment and count generation to work with other samples.
  • Be able to follow and modify R scripts for QC, normalization, statistical analysis and data mining.

Materials

  • Day 1 - Experimental design, QC and trimming
  • Day 2 - Alignment, counts, QC and normalization
  • Day 3 - Statistical Analysis and Venn diagrams
  • Day 4 - Heatmaps, WGCNA and Annotation

Data

Description

Saccharomyces cerevisiae genome and gene models from Ensembl R64-1-1, plus index files for alignment with STAR. Also included are fastq files from Illumina single-end, reverse-stranded, 100 bp sequencing. The samples form a 2x2 factorial design with 3 replicates per group, but the actual treatment details have been withheld. Replicate groups are listed in the targets.txt file in the Day 3 materials.
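
The design arithmetic can be checked with a short sketch: a 2x2 factorial with 3 replicates per group gives 12 samples, which matches the 12 fastq files in the tarball. Because the treatment details are withheld, the factor names and levels below (factor_a, factor_b, A1, B1 and so on) are placeholders, and the row layout is an assumption rather than the actual format of targets.txt.

```python
from itertools import product

# Placeholder factor levels: the real treatments are withheld in the workshop data.
FACTOR_A = ["A1", "A2"]
FACTOR_B = ["B1", "B2"]
REPLICATES = 3


def build_design(levels_a, levels_b, n_reps):
    """Return one row per sample for a full factorial design."""
    rows = []
    for a, b in product(levels_a, levels_b):
        for rep in range(1, n_reps + 1):
            rows.append(
                {"group": f"{a}.{b}", "factor_a": a, "factor_b": b, "replicate": rep}
            )
    return rows


design = build_design(FACTOR_A, FACTOR_B, REPLICATES)

if __name__ == "__main__":
    # Print a tab-separated table: one line per sample.
    for row in design:
        print(row["group"], row["replicate"], sep="\t")
    print(len(design), "samples")
```

Each of the four groups corresponds to one combination of the two factors, and the replicate groups in targets.txt play the same role as the hypothetical "group" column here.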

Availability

  • tarball of yeast genome, gene models, STAR index files and 12 fastq files
  • list of files in tarball
  • md5sum for the tarball
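
The md5sum can be used to confirm that the tarball downloaded intact. The following is a minimal sketch using Python's standard hashlib module; the function names are illustrative, and it assumes the checksum is supplied either as a bare hex digest or as a line in the usual md5sum output form (digest followed by the file name).

```python
import hashlib


def md5_of_file(path, chunk_size=1024 * 1024):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large tarballs are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, md5sum_line):
    """Return True if the file's MD5 digest equals the published value.

    md5sum_line may be a bare digest or a full 'digest  filename' line.
    """
    expected = md5sum_line.split()[0].lower()
    return md5_of_file(path) == expected
```

A call would look like matches_checksum("workshop_data.tar.gz", published_line), where both the file name and published_line are hypothetical stand-ins for the downloaded tarball and the contents of the md5sum file.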

Technical requirements

A UNIX server with FastQC >= 0.11.2, Trimmomatic >= 0.30, STAR >= 2.4.0i, and subread >= 1.4.6-p1. IGV >= 2.3 and R >= 3.1.3 are also required and can run on any OS.
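
As a rough illustration of checking an installation against these minimums, the sketch below compares version strings numerically. It is a simplification under stated assumptions: only the first dotted-number run in a string is compared, so patch suffixes such as the "i" in 2.4.0i or "-p1" in 1.4.6-p1 are ignored, and the installed version strings must be gathered separately (the function names here are illustrative).

```python
import re

# Minimum versions as listed in the technical requirements.
MINIMUM_VERSIONS = {
    "FastQC": "0.11.2",
    "Trimmomatic": "0.30",
    "STAR": "2.4.0i",
    "subread": "1.4.6-p1",
    "IGV": "2.3",
    "R": "3.1.3",
}


def numeric_version(version):
    """Return the first dotted-number run in a version string as a tuple of ints.

    Letter and patch suffixes (for example 'i' or '-p1') are ignored.
    """
    match = re.search(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"no numeric version found in {version!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def meets_minimum(tool, installed_version):
    """Compare an installed version string against the listed minimum."""
    required = numeric_version(MINIMUM_VERSIONS[tool])
    return numeric_version(installed_version) >= required
```

Comparing tuples of integers rather than raw strings avoids ordering mistakes such as treating 2.16 as older than 2.3.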

Literature references

Authors: Jenny Drnevich @jenny, Radhika Khetani @radhika, Jessica Kirkpatrick krkptrc2@illinois.edu

Scientific topics: RNA-Seq
