Date: 8 - 11 July 2019


This workshop will focus on the core steps involved in calling germline short variants, somatic short variants, and copy number alterations with the Broad’s Genome Analysis Toolkit (GATK), using “Best Practices” developed by the GATK methods development team. A team of methods developers and instructors from the Data Sciences Platform at Broad will give talks explaining the rationale, theory, and real-world applications of the GATK Best Practices. You will learn why each step is essential to the variant-calling process, what key operations are performed on the data at each step, and how to use the GATK tools to get the most accurate and reliable results out of your dataset. If you are an experienced GATK user, you will gain a deeper understanding of how the GATK works under the hood and how to improve your results further, especially with respect to the latest innovations.

  • Day 1: Introduction and Overview. The first day of the workshop gives a high-level overview of various topics in the morning, and in the afternoon we show how these concepts apply to a case study. The case study is tailored to the audience, based on their answers to our pre-workshop survey.

  • Day 2: Germline Short Variant Discovery. Today we dive deep into the tools that make up the GATK Best Practices Pipeline. In the morning we discuss variant discovery, and in the afternoon we look at refinement and filtering. You will have the opportunity in both the morning and the afternoon to get hands-on with these tools and run them yourself.

  • Day 3: Somatic Variant Discovery. Today we cover somatic variant discovery in more depth. In the morning we primarily focus on calling short variants with Mutect2, and in the afternoon we look at copy number alterations. Both sections have a paired hands-on activity.

  • Day 4: Pipelining. Over the first three days, you will have learned a lot about the different pipelines and tools that you can use in GATK. Today we learn how those pipelines are written in a language called WDL. In the afternoon we cover other topics useful for working on the cloud, including Docker and BigQuery.

Please note that this workshop is focused on human data analysis. Most of the material presented applies equally to non-human data, and we will address some questions about the adaptations needed for analysis of non-human data, but we will not go into much detail on those points.

The hands-on GATK tutorials in this workshop will be conducted on Terra, a new platform developed at Broad in collaboration with Verily Life Sciences for accessing data, running analysis tools and collaborating securely and seamlessly.

The training room is located on the first floor and there is currently no wheelchair or level access available to this level.

Please note that if you are not eligible for a University of Cambridge Raven account you will need to book or register interest by linking here.

Keywords: HDRUK

Venue: Craik-Marshall Building

City: Cambridge

Country: United Kingdom

Postcode: CB2 3AR

Organizer: University of Cambridge

Host institutions: University of Cambridge Bioinformatics Training

Target audience: The course is aimed primarily at mid-career scientists, especially those whose formal education likely included statistics but who have perhaps not put this into practice since; graduate students, postdocs and staff members from the University of Cambridge, Institutions and other external Institutions or individuals

Event types:

  • Workshops and courses

Scientific topics: Bioinformatics, Data mining, Data visualisation, Genomics

