RNA-Seq Analysis with Biocluster and R

TeSS has been unable to access this material's URL since 31 May 2022 - the page may have been moved.

Keywords

RNA-Seq, Alignment, Annotation, BAM, Differential-expression, Exploratory-analysis, Expression-estimation, FASTA, FASTQ, Feature-summarisation, Pre-processing, QC, Statistical-model

Authors

Jenny Drnevich @jenny
Radhika Khetani @radhika
Jessica Kirkpatrick krkptrc2@illinois.edu

Description

Sequencing of RNA (RNA-Seq) is now a standard method for assessing global gene expression, in
part because it can be applied to any species. This is a 4-day workshop for biologists who are
running or planning RNA-Seq experiments and would like to learn how to analyze their data. It
covers experimental design, evaluation of sequencing quality, genome alignment, gene count
extraction, differential expression analysis and downstream data mining. It also includes
discussion of what to do if your species lacks gene models and/or a genome.

Aims

This workshop aims to give biologists the information and skills they need to analyze RNA-Seq
data, including practical experience using command-line tools and R packages.

Prerequisites

Basic UNIX skills and familiarity with submitting jobs to a computing cluster
Basic knowledge of R

Target audience

Graduate students, postdocs and beginning faculty

Learning objectives

Be able to describe the steps in a typical RNA-Seq workflow.
Be able to follow and modify UNIX scripts for QC, trimming, alignment and count generation to work with other samples.
Be able to follow and modify R scripts for QC, normalization, statistical analysis and data mining.

Materials

Day 1 - Experimental design, QC and trimming
Day 2 - Alignment, counts, QC and normalization
Day 3 - Statistical analysis and Venn diagrams
Day 4 - Heatmaps, WGCNA and annotation

Data

Description
Saccharomyces cerevisiae genome and gene models from Ensembl R64-1-1, plus index files for alignment with STAR. Also included are FASTQ files from Illumina single-end, reverse-stranded sequencing with 100 bp reads. The samples form a 2x2 factorial design with 3 replicates per group, but the actual treatment details have been withheld. Replicate groups are listed in the targets.txt file in the Day 3 materials.
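
To make the sample layout concrete, the sketch below enumerates a 2x2 factorial design with three replicates per group, which gives the 12 samples in the data set. The factor names and levels (`factorA`, `factorB`, `a1`, `b1` and so on) and the sample naming scheme are placeholders invented for illustration, since the real treatments are withheld. The targets.txt file remains the authoritative source for the replicate groups.

```python
from itertools import product

# Hypothetical factor names and levels; the real treatment details are withheld.
FACTORS = {"factorA": ["a1", "a2"], "factorB": ["b1", "b2"]}
REPLICATES = 3


def build_design(factors, replicates):
    """Return one row (dict) per sample for a full factorial design."""
    names = list(factors)
    rows = []
    # Every combination of factor levels defines one replicate group.
    for levels in product(*(factors[name] for name in names)):
        group = "_".join(levels)
        for rep in range(1, replicates + 1):
            row = dict(zip(names, levels))
            row["group"] = group
            row["replicate"] = rep
            row["sample"] = f"{group}_rep{rep}"  # made-up naming scheme
            rows.append(row)
    return rows


design = build_design(FACTORS, REPLICATES)
for row in design:
    print(row["sample"], row["factorA"], row["factorB"], sep="\t")
```

Each of the four groups appears three times, so a table of this shape can be compared against targets.txt to confirm that every FASTQ file is assigned to exactly one group.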

Availability

tarball of yeast genome, gene models, STAR index files and 12 fastq files
list of files in tarball
md5sum for the tarball

Technical requirements

UNIX server with FastQC >= 0.11.2, Trimmomatic >= 0.30, STAR >= 2.4.0i, and subread >= 1.4.6-p1. Also IGV >= 2.3 and R >= 3.1.3 on any OS.

Scientific topics: RNA-Seq
