Date: 16 - 18 May 2023

This three-day course will introduce concepts vital to biocuration, including metadata and ontologies, the role of identifiers, and extracting data from literature. 

The course focuses on the practical skills biocurators need, including programming to handle large biological datasets and querying databases. Participants will apply these skills in a real-life biocuration exercise during the group projects.

There will also be an opportunity to network with biocurators and learn more about their roles and data resources.

Group projects

A major element of this course is a group project, in which participants are placed in small groups to work together on a challenge set by trainers from EMBL-EBI and external institutes. This allows participants to explore the role of biocurators and gives them hands-on experience in biocuration. The group work culminates in a presentation session involving all participants on the final day of the course, giving an opportunity for wider discussion across the whole course group.

Groups are mentored and supported by the trainers who set the initial challenge, but the groups are responsible for driving their projects forward, with all members expected to take an active role. Groups are organised before the course.

Basic outlines of the projects on offer this year are given below. In your registration you must indicate your first and second choice of project. Not all projects may be offered; the final decision on which projects run will depend on the number of applicants per project.

Group project: Curating a protein complex

In this project, a new protein complex has been published and it is your role to find out as much information as possible about the complex. Using literature and bioinformatics resources, you will provide an overview of the protein complex, including a full list of the complex members, a description of the role of each protein and an overview of the role of the complex. You will use resources such as UniProt, PDB and Europe PMC.

Project mentor: Michele Magrane

Group project: Exploring data and metadata from a genome-wide association study

The GWAS Catalog is a richly annotated database of human genome-wide association studies, which analyse associations between genetic variants and a disease or other trait of interest in a sample of individuals from a particular population. In this mini project you will examine a GWAS publication in detail to extract information about the traits and samples under investigation, and decide how to represent these metadata using standardised vocabularies and ontologies. You will also use simple command line tools to look at a GWAS dataset and explore the role of standard formats and quality control in biocuration.

Project mentor: Elliot Sollis

Group project: Using alignments to improve Rfam families: a case of curating non-coding RNA families

Rfam is a database of non-coding RNA (ncRNA) families. Each family consists of a sequence alignment, metadata describing the family and an Infernal model, and is built by curating alignments from publications or user-submitted alignments. In this project we will demonstrate the process of updating an existing family with an improved alignment. We will review the reports generated during curation and, in the process, examine the phylogenetic distribution, sequence alignment and model results to determine how the family should be updated to reflect the improved alignment. Additionally, we will show how to connect this information to resources like Wikipedia.

Project mentor: Nancy Ontiveros

Contact: Sophie Spencer - sophie@ebi.ac.uk

Keywords: Biocuration, Data management, Data handling, Biocurators, ELIXIR Converge

Venue: European Bioinformatics Institute, Hinxton

Region: Cambridge

Country: United Kingdom

Postcode: CB10 1SD

Organizer: European Bioinformatics Institute (EBI)

Host institutions: European Bioinformatics Institute

Capacity: 30

Event types:

  • Workshops and courses

Scientific topics: Ontology; Data submission, annotation, and curation; Data management

