Validate Graph Data with SHACL

"Validate Graph Data with SHACL" is a training developed within the Swiss Personalized Health Network (SPHN) initiative. It is part of a series of trainings on the SPHN Interoperability Framework, which is developed by the SPHN Data Coordination Center (DCC). The framework aims to facilitate collaborative research by providing a decentralized infrastructure, sustained by a strong semantic layer (the SPHN Dataset) and RDF-based graph technology, for the exchange and storage of data.

Ensuring data quality is essential for high-quality research. To support quality control and compliance with the SPHN data specifications at the data provider level, the DCC offers a set of Shapes Constraint Language (SHACL) rules that can be executed on the data graphs before they are shared with researchers. The DCC SHACLer tool enables quality control of project-specific data specifications by automatically generating SHACL rules from the project's data schema. This also makes it easier to review new project specifications and ensures harmonized data delivery from all sites. This training offers an insight into SHACL, a W3C Recommendation for validating RDF graphs against a set of conditions.
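
To give a feel for what "validating a graph against a set of conditions" means, the sketch below mimics a single cardinality constraint in plain Python. It is a toy illustration, not a SHACL engine and not part of the SPHN tooling: triples are plain tuples, the shape is a dictionary, and every name in it (ex:Patient, ex:hasBirthDate, patient_shape, validate) is hypothetical. Real SHACL shapes are themselves written in RDF and evaluated by a SHACL processor such as the one built into GraphDB.

```python
# Toy illustration only: plain tuples stand in for RDF triples, and a dict
# stands in for a SHACL node shape. All names below are hypothetical.

RDF_TYPE = "rdf:type"

# Data graph: one patient with a birth date, one without.
data_graph = {
    ("ex:patient1", RDF_TYPE, "ex:Patient"),
    ("ex:patient1", "ex:hasBirthDate", "1980-04-12"),
    ("ex:patient2", RDF_TYPE, "ex:Patient"),
}

# Shape: every ex:Patient must have exactly one ex:hasBirthDate value.
patient_shape = {
    "target_class": "ex:Patient",
    "path": "ex:hasBirthDate",
    "min_count": 1,
    "max_count": 1,
}


def validate(graph, shape):
    """Return (conforms, violations) for one cardinality constraint."""
    # Focus nodes are all instances of the shape's target class.
    focus_nodes = sorted(
        s for (s, p, o) in graph
        if p == RDF_TYPE and o == shape["target_class"]
    )
    violations = []
    for node in focus_nodes:
        # Collect the values reachable from the focus node via the path.
        values = [o for (s, p, o) in graph if s == node and p == shape["path"]]
        if len(values) < shape["min_count"]:
            violations.append((node, shape["path"], "MinCount"))
        elif len(values) > shape["max_count"]:
            violations.append((node, shape["path"], "MaxCount"))
    return (not violations, violations)


conforms, violations = validate(data_graph, patient_shape)
print("Conforms:", conforms)
for focus_node, path, constraint in violations:
    print(f"Violation: {focus_node} fails {constraint} on {path}")
```

The returned pair loosely mirrors the structure of a SHACL validation report, which states whether the data graph conforms and, for each violation, names the focus node, the property path, and the constraint that failed.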

Prerequisites:

To follow this training, a good knowledge of RDF is required.

After the training you will be able to:

  • Validate an RDF data graph against a set of constraints expressed in SHACL.
  • Understand the possibilities and limitations of validation with SHACL.
  • Perform SPHN SHACL validation in GraphDB, and interpret a SHACL Validation Report.
  • Use and understand the SHACLer tool.

Resources:

All resources are available on the training's GitLab space.

Licence: Creative Commons Attribution Share Alike 4.0 International

Keywords: Clinical data, SHACL, Data validation, RDF, Knowledge graph, GraphDB, RDF graph validation


Additional information

Target audience: Research Scientists, Data Managers, Biomedical Researchers, Bioinformaticians, Data Scientists

Resource type: Video, Training materials, E-learning

Status: Archived

Authors: Personalized Health Informatics Group, Philip Krauss, Martin Zablocki

Contributors: Sabine Österle, Vasundra Touré

Scientific topics: Medical informatics, FAIR data, Data management, Computer science

Operations: Validation, Data handling