A Critical Guide to the neXtProt knowledgebase: querying using SPARQL
Scientific topics: Database management
Keywords: Human protein database, Introduction bioinformatics, Introduction nextprot, Nextprot data model, Rdf triples, Semantic triples, Sparql queries, Sparql syntax, Training material
https://www.mygoblet.org/training-portal/materials/critical-guide-nextprot-knowledgebase-querying-using-sparql
http://tess.elixir-uk.org/materials/a-critical-guide-to-the-nextprot-knowledgebase-querying-using-sparql
This Critical Guide in the Introduction to Bioinformatics series briefly outlines how to explore the neXtProt human protein database using SPARQL. While text indexation has made database contents more accessible, being able to combine search criteria for specific content permits more powerful querying, and provides a means to mine the information stored in databases. This Guide illustrates the use of the SPARQL semantic query language to interrogate neXtProt and other databases that provide SPARQL endpoints.
Specifically, the Guide introduces the concept of database ‘semantic triples’, and examines features of the neXtProt data model. On reading this Guide, and completing the exercises, users will be able to: i) identify key entities within the neXtProt data model; ii) explain what these entities represent, what information they contain and what the information is used for; iii) identify key SPARQL syntax elements; iv) understand SPARQL tutorial examples; and v) write a SPARQL query to retrieve entries matching specific criteria.
Terri Attwood
Beginners
2019-06-06
A Critical Guide to the PDB
Scientific topics: Database management
Keywords: Introduction bioinformatics, Introduction pdb, Protein structure analysis, Protein structure databases, Protein structures
https://www.mygoblet.org/training-portal/materials/critical-guide-pdb
http://tess.elixir-uk.org/materials/a-critical-guide-to-the-pdb
This Critical Guide in the Introduction to Bioinformatics series provides a brief outline of the Protein Data Bank – the PDB – the world’s primary repository of biological macromolecular structures. The rationale for creating the resource and the kinds of information it provides are discussed, and issues relating to its evolution and growth are explored.
Specifically, this Guide introduces the principal features of the PDB, the nature (and quality) of its contents and how these may be interrogated. On reading this Guide, users will be able to: i) explain some of the ways in which knowledge of protein structures is useful; ii) identify the constituent databases of the wwPDB; iii) explain key features of the RCSB PDB in terms of its data distribution, growth and redundancy statistics; iv) search the PDB using simple and advanced keywords and full sequences, and analyse differences between them; and v) explain various structural quality criteria, and infer the quality of individual PDB entries.
Terri Attwood
Beginners
2018-09-08
A Critical Guide to InterPro
Scientific topics: Database management
Keywords: Introduction bioinformatics, Introduction interpro, Protein family classification, Protein family databases, Protein family hierarchies, Protein function annotation, Protein sequence analysis
https://www.mygoblet.org/training-portal/materials/critical-guide-interpro
http://tess.elixir-uk.org/materials/a-critical-guide-to-interpro
This Critical Guide in the Introduction to Bioinformatics series provides an introduction to the InterPro database, the largest, most comprehensive, integrated protein family database in the world. The rationale for creating the resource, the nature of its contributing databases and the kinds of information they provide are discussed, and the role of InterPro in protein classification and function-annotation projects is outlined.
Specifically, this Guide introduces the principal components of the InterPro database, the differences between them, and how their integration creates a resource whose diagnostic power is greater than the sum of its parts. On reading this Guide, users will be able to: i) explain how protein family databases are used to help annotate uncharacterised protein sequences; ii) identify InterPro’s constituent data resources and explain the main methods that underpin them; iii) search InterPro using keywords and full sequences; iv) analyse and interpret search results in terms of protein family hierarchies, their structural domains and functional features; and v) track the provenance of InterPro’s annotations.
Terri Attwood
Beginners
2018-09-08
A Critical Guide to the UniProtKB Flat-file Format
Scientific topics: Database management
Keywords: Flat file databases, Flat files, Introduction bioinformatics, Uniprotkb flat file format
https://www.mygoblet.org/training-portal/materials/critical-guide-uniprotkb-flat-file-format
http://tess.elixir-uk.org/materials/a-critical-guide-to-the-uniprotkb-flat-file-format
This Critical Guide briefly presents the need for biological databases and for a standard format for storing and organising biological data. Web-based interfaces have made databases more user-friendly, but knowledge of the underlying file format offers a deeper understanding of how to navigate and mine the information they contain, so that humans and machines can get the most out of them. This Guide explores the file format that underpins one of today’s most popular protein sequence databases – UniProtKB.
Specifically, this Guide introduces the concept of database ‘flat-files’, and examines features of the UniProtKB flat-file format. On reading this Guide, users will be able to: i) identify key fields within UniProtKB/Swiss-Prot and UniProtKB/TrEMBL flat-files; ii) explain what these fields mean, what information they contain and what the information is used for; iii) analyse the information in different fields and infer structural and functional features of a sequence; iv) examine and investigate the provenance of annotations; and v) compare annotations at different time-points and evaluate the likely impact of annotation changes.
Terri Attwood
Beginners
2018-09-08
A Critical Guide to UniProtKB
Scientific topics: Database management
Keywords: Introduction bioinformatics, Introduction uniprot, Protein sequence databases, Uniprot knowledgebase
https://www.mygoblet.org/training-portal/materials/critical-guide-uniprotkb
http://tess.elixir-uk.org/materials/a-critical-guide-to-uniprotkb
This Critical Guide in the Introduction to Bioinformatics series provides a brief outline of the UniProt protein sequence database, with a particular focus on the UniProt Knowledgebase – UniProtKB. The rationale for creating the resource, its contributing databases and the kinds of information they provide are discussed, and issues behind the quality of their annotations are explored.
Specifically, this Guide introduces the principal components of the UniProt Knowledgebase, and the differences between them. On reading this Guide, users will be able to: i) identify and explain the characteristic features of UniProtKB/Swiss-Prot and UniProtKB/TrEMBL entries; ii) distinguish annotations that are computed, and hence not experimentally validated; iii) search UniProtKB using keywords, full sequences and peptides, and interpret the results; iv) analyse and track the provenance of annotations; and v) infer which annotations are likely to be accurate and which erroneous.
Terri Attwood
Beginners
2018-09-08
A Critical Guide to BLAST
Keywords: Introduction bioinformatics, Introduction blast, Sequence database searching, Sequence similarity searching
https://www.mygoblet.org/training-portal/materials/critical-guide-blast
http://tess.elixir-uk.org/materials/a-critical-guide-to-blast
This Critical Guide in the Introduction to Bioinformatics series provides an overview of the BLAST similarity search tool, briefly examining the underlying algorithm and its rise to popularity. Several Web-based and stand-alone implementations are reviewed, and key features of typical search results are discussed.
Specifically, this Guide introduces concepts and theories that underpin the BLAST search tool, and examines features of search outputs important for understanding and interpreting BLAST results. On reading this Guide, users will be able to: i) search a variety of Web-based sequence databases with different query sequences, and alter search parameters; ii) explain a range of typical search parameters, and the likely impacts on search outputs of changing them; iii) analyse the information conveyed in search outputs and infer the significance of reported matches; iv) examine and investigate the annotations of reported matches, and their provenance; and v) compare the outputs of different BLAST implementations and evaluate the implications of any differences.
Terri Attwood
Beginners
2018-09-08
A Critical Guide to Unix
Keywords: Command line, Introduction bioinformatics, Introduction unix, Unix commands, Unix operating system
https://www.mygoblet.org/training-portal/materials/critical-guide-unix
http://tess.elixir-uk.org/materials/a-critical-guide-to-unix
This Critical Guide in the Introduction to Bioinformatics series briefly introduces the Unix Operating System, and provides a subset of some of the most helpful and commonly used commands, including those that allow various types of search, navigation and file manipulation. Several keystroke short-cuts are also explained, which help to make the routine use of Unix commands more efficient.
Specifically, this Guide showcases some of the simplest, most frequently used commands to help new users to understand and gain confidence in using the Unix Operating System. On reading the Guide, users will be able: i) to exploit a range of commands: to manipulate files, directories and processes; to navigate directory structures and explore their contents; to search for files, and search and compare file contents; and to direct command outputs into files or into other commands; and ii) to explain what many simple commands mean and how they’re used.
Terri Attwood
Beginners
2018-09-08
Filename Expansion
Filename Expansion from https://github.com/swcarpentry/bc/tree/gh-pages/intermediate/shell/05-expansion.md.
https://github.com/swcarpentry/bc/tree/gh-pages/intermediate/shell/05-expansion.md
http://tess.elixir-uk.org/materials/filename-expansion
Shell
Intermediate
Job Control
Job Control from https://github.com/swcarpentry/bc/tree/gh-pages/intermediate/shell/04-job.md.
https://github.com/swcarpentry/bc/tree/gh-pages/intermediate/shell/04-job.md
http://tess.elixir-uk.org/materials/job-control
Shell
Intermediate
Instructor's Guide
Instructor's Guide from https://github.com/swcarpentry/bc/tree/gh-pages/intermediate/shell/guide.md.
https://github.com/swcarpentry/bc/tree/gh-pages/intermediate/shell/guide.md
http://tess.elixir-uk.org/materials/instructor-s-guide
Shell
Intermediate
Variables
Variables from https://github.com/swcarpentry/bc/tree/gh-pages/intermediate/shell/03-var.md.
https://github.com/swcarpentry/bc/tree/gh-pages/intermediate/shell/03-var.md
http://tess.elixir-uk.org/materials/variables
Shell
Intermediate
Loops
Loops from https://github.com/swcarpentry/bc/tree/gh-pages/novice/shell/04-loop.md.
https://github.com/swcarpentry/bc/tree/gh-pages/novice/shell/04-loop.md
http://tess.elixir-uk.org/materials/loops
Shell
Novice
Creating Things
Creating Things from https://github.com/swcarpentry/bc/tree/gh-pages/novice/shell/02-create.md.
https://github.com/swcarpentry/bc/tree/gh-pages/novice/shell/02-create.md
http://tess.elixir-uk.org/materials/creating-things
Shell
Novice
Shell Scripts
Shell Scripts from https://github.com/swcarpentry/bc/tree/gh-pages/novice/shell/05-script.md.
https://github.com/swcarpentry/bc/tree/gh-pages/novice/shell/05-script.md
http://tess.elixir-uk.org/materials/shell-scripts
Shell
Novice
Pipes and Filters
Pipes and Filters from https://github.com/swcarpentry/bc/tree/gh-pages/novice/shell/03-pipefilter.md.
https://github.com/swcarpentry/bc/tree/gh-pages/novice/shell/03-pipefilter.md
http://tess.elixir-uk.org/materials/pipes-and-filters
Shell
Novice
Introducing the Shell
Introducing the Shell from https://github.com/swcarpentry/bc/tree/gh-pages/novice/shell/00-intro.md.
https://github.com/swcarpentry/bc/tree/gh-pages/novice/shell/00-intro.md
http://tess.elixir-uk.org/materials/introducing-the-shell
Shell
Novice
Finding Things
Finding Things from https://github.com/swcarpentry/bc/tree/gh-pages/novice/shell/06-find.md.
https://github.com/swcarpentry/bc/tree/gh-pages/novice/shell/06-find.md
http://tess.elixir-uk.org/materials/finding-things
Shell
Novice
Files and Directories
Files and Directories from https://github.com/swcarpentry/bc/tree/gh-pages/novice/shell/01-filedir.md.
https://github.com/swcarpentry/bc/tree/gh-pages/novice/shell/01-filedir.md
http://tess.elixir-uk.org/materials/files-and-directories
Shell
Novice
Permissions
Permissions from https://github.com/swcarpentry/bc/tree/gh-pages/intermediate/shell/01-perm.md.
https://github.com/swcarpentry/bc/tree/gh-pages/intermediate/shell/01-perm.md
http://tess.elixir-uk.org/materials/permissions
Shell
Intermediate
Working Remotely
Working Remotely from https://github.com/swcarpentry/bc/tree/gh-pages/intermediate/shell/02-ssh.md.
https://github.com/swcarpentry/bc/tree/gh-pages/intermediate/shell/02-ssh.md
http://tess.elixir-uk.org/materials/working-remotely
Shell
Intermediate
STAO 2014 Understanding a genetic disease thanks to Bioinformatics
Keywords: Blast, Genome browsing, Introduction bioinformatics, Protein structure visualisation, Variant detection
https://www.mygoblet.org/training-portal/materials/stao-2014-understanding-genetic-disease-thanks-bioinformatics
http://tess.elixir-uk.org/materials/stao-2014-understanding-a-genetic-disease-thanks-to-bioinformatics
This workshop introduces several bioinformatics tools and databases (genome browser, alignment tool, BLAST, dbSNP, UniProtKB, PDB) in the context of the discovery of a rare variant that leads to the production of a non-functional insulin in a Norwegian family. Additional documents are available: http://education.expasy.org/cours/Toronto
Marie-Claude Blatter
high school
2014-11-10
2017-10-09
Clinical Bioinformatics I - Tutor notes
Keywords: Clinical bioinformatics, Genomics, Introduction bioinformatics
https://www.mygoblet.org/training-portal/materials/clinical-bioinformatics-i-tutor-notes
http://tess.elixir-uk.org/materials/clinical-bioinformatics-i-tutor-notes
Clinical Bioinformatics I is a 10-credit module of the new MSc in Clinical Bioinformatics, delivered by Nowgen/NGRL and the University of Manchester in the UK. Clinical Bioinformatics is one of the streams of the NHS Scientific Training Programme (STP). The programme intersperses work-placement-based training with academic teaching.
The tutor notes describe the content taught on each day of the 5-day course, and the approach used to teach the materials. Each day is divided into two parts: a series of lectures in the morning introduces the topics, and the afternoon sessions present a series of case studies for the students to work through in a 'problem-based learning' (PBL) approach to reinforce the lecture content.
Jan Taylor
Life Science Researchers
healthcare professionals
postgrad
2013-11-28
2017-10-09
Introduction to Bioinformatics
Keywords: Introduction bioinformatics
https://www.mygoblet.org/training-portal/materials/introduction-bioinformatics
http://tess.elixir-uk.org/materials/introduction-to-bioinformatics
An introduction to bioinformatics for bench biologists, delivered as part of the EMBL Australia Masterclass on Protein Sequence Analysis (http://oz-masterclass.wikispaces.com/). It focuses on using UniProt to explore different reasons why information inferred by "direct assay" and "prediction" could be wrong, and what we can do to spot such errors.
Aidan Budd
Bench biologists
2013-10-23
2017-10-09
eBioKit
Keywords: Advanced bioinformatics training, Introduction bioinformatics
https://www.mygoblet.org/training-portal/materials/ebiokit
http://tess.elixir-uk.org/materials/ebiokit
Extensive teaching experience gained by conducting bioinformatics training courses in Kenya, Uganda, Mauritius (UoM), Sri Lanka, Sweden, Chile and Zimbabwe showed that it was difficult to teach and demonstrate several bioinformatics resources successfully, often because of slow Internet access. For that reason, a bioinformatics platform, eBioKit, was engineered to ease the administrative burden of regularly updating large databases and installing software. The platform hosts more than 300 bioinformatics applications (EMBOSS, Galaxy, BLAST, RSAT, etc.) and the most relevant databases (Ensembl, UniProt, OMIM, PDB, etc.) locally, which solves the problems related to network speed and software installation. Version 2 of this system has been successfully tested in real-world situations, both for capacity building and research, in Kenya at ILRI, the Biosciences eastern and central Africa (BecA), the KEMRI Wellcome Trust Research Programme (KWTRP) and the International Centre of Insect Physiology and Ecology (ICIPE). The system has also been deployed in the SANBio bioinformatics network in Southern Africa (10 countries) and in Sri Lanka (Version 1), and has recently been adopted by the H3ABioNet African Bioinformatics Network as a platform to provide bioinformatics training and bioinformatics computing services at nodes in the network (http://www.h3abionet.org).
Erik Bongcam-Rudloff
Bachelor students
Bench biologists
Life Science Researchers
PhD students
2013-10-03
2017-10-09