Bioinformatics and Biomathematics Training Hub
The Bioinformatics and Biomathematics Training Hub (BBTH) is a BBSRC-funded collaborative project to coordinate the development and sharing of training materials and expertise across the UK’s National Institutes of Bioscience (NIB — http://www.nib.ac.uk/) for the increasingly important areas of bioinformatics and biomathematics.
Guided by our NIB partners, researcher surveys, and interactions with interested bodies such as Elixir-UK, we will work to ensure that best use is made of existing assets and capabilities, that best practice is shared across the NIB, and that redundant effort is minimised going forwards.
Five PowerPoint presentations and an Excel worksheet for a Train the Trainer course developed by Chris Taylor while at the Earlham Institute, working for the National Institutes of Bioscience's Bioinformatics and Biomathematics Training Hub. The five slide decks cover pedagogical theory, trainer...
Scientific topics: Bioinformatics, Computational biology
Keywords: training, pedagogy
Overview
R is a programming language and associated environment developed for statistical computing and data analysis. It provides many powerful tools for statistics, data visualisation and bioinformatics.
Learning outcomes
- What R is suitable for.
- How to use R...
Scientific topics: Software engineering, Statistics and probability
Keywords: John Innes Centre, JIC
Audience
The intended audience includes any student, postdoc or RA who has an interest in bioinformatics and who intends to conduct RNA-Seq analysis on a Galaxy platform.
Overview
RNA-Seq is an immensely powerful technique that allows us to examine the presence and...
Scientific topics: RNA, RNA-Seq
Keywords: John Innes Centre, JIC
Overview
Perl is a programming language for getting your job done. It is designed to make the easy jobs easy, without making the hard jobs impossible. Perl is also well known for BioPerl, a collection of modules which greatly simplify complicated bioinformatics tasks.
Learning...
Scientific topics: Software engineering
Keywords: perl, John Innes Centre, JIC
Overview
The Linux operating system underlies the HPC cluster at the NBI, and other clusters throughout the world. Unlike Windows and OS X, we usually interact with these systems through the command line. The course will teach the basics of Linux use.
Learning...
Keywords: John Innes Centre, JIC
R is an open source programming/scripting language.
- Inspired by the programming language S.
- Useful for statistics and data science.
- Comparable or superior to commercial alternatives (over 7,000 user-contributed packages at this time).
- Widely used both in academia and industry.
- Available on...
Scientific topics: Genomics
Keywords: IBERS, Institute of Biological, Environmental and Rural Sciences
UNIX is the most commonly used operating system in bioinformatics. Increasingly, the raw output of biological research exists as in silico data, usually in the form of large text files. Unix is particularly suited to work with such files and has several powerful (and flexible) commands that...
Keywords: IBERS, Institute of Biological, Environmental and Rural Sciences
Introduction to Galaxy
- Introduction to Read Mapping
- Hands-on experience:
- Upload data to the Galaxy repository
- Perform bioinformatics analysis with Galaxy
- Save data
- Visualize the results
Keywords: Galaxy, IBERS, Institute of Biological, Environmental and Rural Sciences
Sequence processing Issues
General sequence quality
- 5’ base bias
- 3’ quality decline
Library diversity
- Low complexity sequence
- Over-amplification
Contamination
- Intrinsic to sample?
- Cross contamination? *...
Scientific topics: Quality affairs
Keywords: IBERS, Institute of Biological, Environmental and Rural Sciences
Introduction
What Are Mixed Models?
Potential Advantages of Mixed Models
Historical Perspective
Example 1: Biological & Technical Replicates
Example 2: Rabbit Inspiration Time – Sources of Variation
Non-Normal Data
Example 3: Non-normal data
9....
Keywords: Mixed models, Roslin Institute
Introduction
Sample Size Calculation – Basic Formula
Example 1
Calculating In Terms of ‘Difference to Detect’ & Power From Sample Size
Example 2
Different Numbers Per Group
Different Variances for Groups
Binary Data
Allowing for...
Keywords: Sample size, Roslin Institute
Introduction
Introduction to Data Summary & Types of data
Continuous Data – Measures of Location
Continuous Data – Measures of Spread & Accuracy
Continuous Data with Skewed or Odd Distributions – Measures of Spread
Choosing Appropriate Summary Statistics...
Keywords: Summarising data, Roslin Institute
A Brief Introduction to R
Very Basic Functions
Demo Data - Rivers
Demo Data - Rivers (Cont.)
Demo Data - Orange
Demo Data - Orange (Cont.)
Demo data - Anderson's Iris Data
Demo Data - US Judge Rating
Summary
Keywords: R language, Roslin Institute
Introduction
Achieving Statistical Significance
Achieving Generalisable Results
Avoiding Bias
Questions to Address When Planning
Examples of Experimental Designs
Randomised Block Design
Crossover & Repeated Measures Designs
Pilot & Proof...
Keywords: Experimental design, Roslin Institute
Introduction
Many Views of A Pathway (Lecture)
Pathway Models & Pathway Types (Lecture)
Sources of ‘Pathway Data’ (Lecture)
Issues (Lecture)
SBGN (Lecture)
mEPN (Lecture)
Practical Session: Pathway to Pathway Construction - yEd (1)
Practical...
Keywords: Graphical modelling, Computational modelling, Roslin Institute
R is a popular language and environment that allows powerful and fast manipulation of data, offering many statistical and graphical options. This course aims to introduce R as a tool for statistics and graphics, with the main aim being to become comfortable with the R environment. It will focus...
Scientific topics: Software engineering
Keywords: R programming, Babraham Institute
1. Introduction
Data summary
Demo – Data Summary
Demo – Data Summary cont.
Data summary using PROC TABULATE and PROC GPLOT
Statistical analysis
Example analysis
Example analysis cont.1
Example analysis cont.2
Reading and manipulating...
Keywords: SAS, Roslin Institute
An increasing amount of bioinformatics work is done in a command-line Unix environment. Most large-scale processing applications are written for Unix, and most large-scale compute environments are also based on it.
This course provides an introduction to the concepts of Unix and provides a...
Keywords: Unix, Babraham Institute
Phylogenetics & Phylogeography Practical - Overview
Phylogenetics & Phylogeography Practical 1
Phylogenetics & Phylogeography Practical 2 - Part 1
Phylogenetics & Phylogeography Practical 2 - Part 2
Phylogenetics & Phylogeography Practical 3
6....
Keywords: Phylogenetics, Phylogeography, Roslin Institute
This course is a comprehensive guide to using R's built-in plotting functionality to construct everything from customised simple plots to complex multi-layered figures. It follows on from the material in our introductory R course and participants are expected to have a basic understanding...
Keywords: Graph plotting, R Programming, Babraham Institute
ggplot2 is the most popular plotting extension to R and replicates many of the graph types found in the core plotting libraries. This course provides an introduction to the ggplot2 library and gives a practical guide to using it to create different types of graphs.
**Course...
Keywords: ggPlot, Babraham Institute
GraphPad Prism is a powerful and friendly package which allows you to plot and analyse your data. This course acts not only as an introduction to Prism, but also goes through the basic statistical knowledge which should allow you to make the most of your data.
Course Content
*...
Keywords: GraphPad Prism, Babraham Institute
This course provides a practical guide to producing figures for use in reports and publications. It is a wide ranging course which looks at how to design figures to clearly and fairly represent your data, the practical aspects of graph creation, the allowable manipulation of bitmap images and...
Scientific topics: Data visualisation
Keywords: Figure design, Babraham Institute
Statistics are an important part of most modern studies and being able to effectively use a statistics package can help you to understand your results. This course provides an introduction to statistics illustrated through the use of the friendly SPSS package.
Course Content
*...
Scientific topics: Statistics and probability
Keywords: Babraham Institute
Power analysis is a method to estimate the sample sizes needed to detect statistical effects with high probability. This course provides an introduction to the basic principles of power analysis and goes through the variables for power calculation. It also addresses the issues associated with...
Keywords: Power analysis, Sample size, Babraham Institute
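As a rough illustration of the kind of calculation power analysis involves (not taken from the course materials), the sketch below estimates the sample size per group for a two-sided comparison of two means using the standard normal approximation. The function name `sample_size_per_group` and the example values (effect of 0.5, standard deviation of 1.0, 5% significance, 80% power) are assumptions chosen for illustration only.

```python
import math
from statistics import NormalDist


def sample_size_per_group(effect, sd, alpha=0.05, power=0.80):
    """Approximate n per group for a two-sided, two-sample comparison of means.

    Uses the normal approximation n = 2 * ((z_(1-alpha/2) + z_power) * sd / effect)^2.
    `effect` is the difference in means to detect; `sd` is the assumed common
    standard deviation. Both are hypothetical inputs supplied by the user.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for the test
    z_power = NormalDist().inv_cdf(power)          # quantile for the desired power
    n = 2 * ((z_alpha + z_power) * sd / effect) ** 2
    return math.ceil(n)                            # round up to whole subjects


if __name__ == "__main__":
    # Detect a difference of half a standard deviation at 5% significance, 80% power.
    print(sample_size_per_group(effect=0.5, sd=1.0))  # prints 63
```

Because this uses the normal rather than the t distribution, dedicated power-analysis software will typically report a slightly larger number for small samples.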
Phylogenetic analysis of pathogens (lecture - part1)
Phylogenetic analysis of pathogens (lecture - part2)
Phylogenetic analysis of pathogens (lecture - part3)
Phylogenetic analysis of pathogens (lecture - part4)
Phylogenetic analysis of pathogens (lecture - part5)
6....
Keywords: Phylogenetic analysis, Roslin Institute
Phylogenetics & Phylogeography (lecture-part 1)
Phylogenetics & Phylogeography (lecture-part 2)
Phylogenetics & Phylogeography (lecture-part 3)
Phylogenetics & Phylogeography (lecture-part 4)
Phylogenetics & Phylogeography (lecture-part 5)
6....
Keywords: Phylogenetics, Roslin Institute
Welcome to the IFR Dockerised version of the Welsh Genepark's
Introduction to Command-line NGS Analysis course!
The original materials belong to Welsh Genepark and the original
materials and software can be found here.
This...
Keywords: NGS, Command line, IFR, Institute of Food Research, Quadram Institute
Welcome to the IFR Dockerised version of the Welsh Genepark's
Introduction to Command-line NGS Analysis course!
The original materials belong to Welsh Genepark and the original
materials and software can be found here.
This...
Keywords: NGS, IFR, Institute of Food Research, Quadram Institute
This is the EBI RNA-seq course bundled up in a Docker container, using LXDE via TightVNC to provide a graphical environment. The material has been repackaged to use an IPython Notebook as the learning environment, which can be annotated by the learner.
Keywords: RNA-Seq, IFR, Institute of Food Research, Quadram Institute
This course follows on from the introductory course. It goes into more detail on practical guides to filtering and combining complex data sets. It also looks at other core R concepts such as looping with apply statements and using packages. Finally, it looks at how to document your R analyses and...
Keywords: R programming, Babraham Institute
SeqMonk is a program which can analyse large data sets of mapped genomic positions. It is most commonly used to work with data coming from high-throughput sequencing pipelines.
The program allows you to view your reads against an...
Scientific topics: Sequence alignment
Keywords: Babraham Institute
Introduction
Hypothesis Testing
Choosing Between Parametric & Non-Parametric Tests
T-Test for Comparing Two Groups
T-Test – Interpreting Software Output
Paired T-Test
Mann-Whitney U-Test to Compare Two Groups When Data Are Not Normally Distributed
8....
Keywords: Statistical tests, Roslin Institute
Overview
We make extensive use of both light and electron microscopy in our research. The acquisition of images by a microscope is only the beginning of the process of extracting meaningful and useful information from those images. As well as extracting information, it's...
Scientific topics: Imaging, Bioimaging
Keywords: John Innes Centre, JIC
Many experimental designs end up producing lists of hits, usually based around genes or transcripts. Sometimes these lists are small enough that they can be examined individually, but often it is useful to do a more structured functional analysis to try to automatically determine any interesting...
Keywords: Functional analysis, Gene lists, Babraham Institute
Introduction: What are Virtual Machines?
- We introduce the concept of the virtual machine, what they can be used for and the advantages and disadvantages of using them.
Why use a Virtual machine and what is it?
Fetching and installing the VM software
Getting Virtualbox Book
*...
Keywords: VM, IFR, Institute of Food Research, Quadram Institute, Bioinformatics
Introduction
Many Views of A Pathway (Lecture)
Pathway Models & Pathway Types (Lecture)
Sources of ‘Pathway Data’ (Lecture)
Issues (Lecture)
SBGN (Lecture)
mEPN (Lecture)
Practical Session: Pathway to Pathway Construction - yEd (1)
Practical...
Keywords: Graphical modelling, Roslin Institute
Overview
Galaxy is an open, web-based platform for data intensive biomedical research. It offers an accessible, reproducible and transparent computational workbench for the biologist. This workbench is very useful in automating repeated analysis steps in the form of workflows. The...
Scientific topics: Workflows, High-throughput sequencing, Nucleic acid sequence analysis, Bioinformatics
Keywords: John Innes Centre, JIC
Introduction
Potential Analysis Approaches
Mixed Model Analysis
Covariance Pattern Model
Example – Covariance Pattern Model
Significance Testing for Fixed Effects
Model Checking
Random Coefficients (Slopes) Models
Example – Random Coefficients...
Keywords: Data analysis, Repeated measures, Roslin Institute
1. Introduction to data analysis - course overview (lecture)
2. Introduction to networks (lecture)
BioLayout Express3D - background and benefits (lecture)
BioLayout Express3D – getting started 1 (Practical Session)
BioLayout Express3D – getting started 2 (Practical...
Keywords: Gene expression analysis, Roslin Institute
Overview
Chromatin Immunoprecipitation (ChIP) allows us to identify DNA sequences bound by particular proteins. This provides a powerful way to investigate chromatin-associated proteins such as transcription factors. Recent advances in high-throughput sequencing allow ChIP data...
Scientific topics: ChIP-seq
Keywords: Roslin Institute
What is HPC
- Our HPC(s)
- Submitting a job to the HPC
- Monitoring your running job
- When your job has finished
- Checking how the resources on the HPC are being used
- Submitting multiple jobs
- Best practice
Keywords: IBERS, Institute of Biological, Environmental and Rural Sciences
Introduction
One-Way Analysis of Variance (ANOVA) Recap
Two-Way ANOVA - Assessing Two Effects in Same Model
Two-Way ANOVA - Allowing for Structure in The Data
Paired T-Test Using Two-Way ANOVA & ANOVA With More Effects
Regression Recap
General Linear...
Keywords: Statistical modelling, Roslin Institute
For a long time Perl has been a popular language among those programming for the first time. Although it is a powerful language, many of its features make it especially suited to first-time programmers, as they reduce the complexity found in many other languages. Perl is also one of the world's...
Keywords: perl, Babraham Institute
Molecular epidemiology - practical 1
Molecular epidemiology - practical 1 review
Molecular epidemiology - practical 2 part 1
Molecular epidemiology - practical 2 part 2
Molecular epidemiology - practical 2 review
Molecular epidemiology - practical 3 part 1
7....
Keywords: Molecular epidemiology, Roslin Institute
Overview
Python is a flexible programming language that is becoming increasingly popular for scientific computing. The course is split into 12 modules and runs over two half days. At the end of each module there are a number of exercises to help solidify the learning. By the end of...
Scientific topics: Software engineering
Keywords: Python for Biologists, John Innes Centre, JIC
This course looks at the different ways in which sequencing-based studies can fail and the options for visualisation and QC which allow you to identify and diagnose these failures at an early stage. It is designed to be of use to anyone who is using sequencing as part of their research, not just...
Scientific topics: Quality affairs
Keywords: Babraham Institute
Complete course materials from the 2015 Python course.
Keywords: Earlham Institute
Course materials from the data QC and preparation course.
Keywords: Earlham Institute
Course materials from the 2015 De Novo course.
Keywords: Earlham Institute
An LXDE RStudio environment with pre-loaded Bioconductor tools for RNA-Seq differential expression (DE) analysis and visualisation.
Keywords: RNA-Seq, IFR, Institute of Food Research, Quadram Institute
Complete course materials from the Ensembl Genomes course.
Keywords: Earlham Institute
Complete course materials from the 2014 pathways to networks course.
Scientific topics: Molecular interactions, pathways and networks
Keywords: Earlham Institute
Complete course materials from the 2014 GBS course.
Scientific topics: Genotype and phenotype
Keywords: Earlham Institute
Complete course materials from the 2014 Summer School.
Keywords: Earlham Institute
Complete course materials from the 2015 GBS course.
Keywords: Earlham Institute
Complete course materials from the 2014 NGS course.
Keywords: next generation sequencing, Earlham Institute
Complete course materials from the 2013 Python course.
Keywords: Python, Earlham Institute
Course materials from the 2015 de novo course.
Scientific topics: Sequencing
Keywords: De Novo, Earlham Institute
Course materials from the 2015 De Novo course.
Keywords: Earlham Institute
Course materials from the 2015 de novo course.
Keywords: Earlham Institute
Complete course materials from the 2014 SWC course.
Keywords: Earlham Institute
