
biolink-model

Details

GitHub biolink/biolink-model
Language Python
Description Schema and generated objects for biolink data model and upper ontology

Dependencies

External Dependencies

Package Version
python ^3.9
curies ^0.7.4
prefixmaps ^0.2.0
pyyaml ^6.0.1
stringcase ^1.2.0
pytest ^7.3.1
beautifulsoup4 >=4.0.0

Documentation


Biolink Model: https://w3id.org/biolink/biolink-model.yaml

Quickstart docs:

For a good overview of the biolink-model, watch Chris Mungall's talk at ICBO 2020.

Refer to the following resources for a quick introduction to the Biolink Model:

  • Introduction to the Biolink Datamodel
  • Biolink Model - A community driven data model for life sciences (Biocuration 2020)
      • Slides: https://bit.ly/biolink-model-workshop-biocuration-2020
      • Video: https://www.youtube.com/watch?v=RE1hFm8lvJA

See also Biolink Model Guidelines for help understanding, curating, and working with the model.

Introduction

The purpose of the Biolink Model is to provide a high-level data model of biological entities (genes, diseases, phenotypes, pathways, individuals, substances, etc.), their properties and relationships, and the ways in which they can be associated.

The representation is independent of storage technology or metamodel (Solr documents, Neo4j/property graphs, RDF/OWL, JSON, CSVs, etc.). Mappings to each of these are provided.
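To illustrate what storage independence means in practice, here is a minimal sketch that writes one association in three of the forms listed above. The identifiers `EX:gene1` and `EX:disease1`, the `EX` prefix, and the helper functions are hypothetical and are not the mappings the project itself provides; `biolink:related_to` is used only as an example predicate.

```python
import csv
import io
import json

# Hypothetical prefix map for this example; EX is a placeholder namespace.
PREFIXES = {
    "biolink": "https://w3id.org/biolink/vocab/",
    "EX": "http://example.org/",
}

# One association, independent of any storage format.
edge = {
    "subject": "EX:gene1",
    "predicate": "biolink:related_to",
    "object": "EX:disease1",
}


def expand(curie):
    """Expand a CURIE such as 'EX:gene1' to a full IRI using PREFIXES."""
    prefix, local_id = curie.split(":", 1)
    return PREFIXES[prefix] + local_id


def to_json(edge):
    """Serialize the edge as a JSON document."""
    return json.dumps(edge, sort_keys=True)


def to_csv_row(edge):
    """Serialize the edge as a single CSV row (subject, predicate, object)."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow([edge["subject"], edge["predicate"], edge["object"]])
    return buffer.getvalue()


def to_ntriple(edge):
    """Serialize the edge as one RDF statement in N-Triples syntax."""
    parts = (expand(edge[key]) for key in ("subject", "predicate", "object"))
    return " ".join(f"<{iri}>" for iri in parts) + " ."
```

The same `edge` dictionary feeds all three serializers, so the choice of storage format does not affect how the association itself is modeled.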

The specification of the Biolink Model is a single YAML file built using LinkML. The basic elements of the YAML are:

  • Class Definitions: definitions of upper-level classes representing both 'named thing' and 'association'
  • Slot Definitions: definitions of slots (also known as properties) that relate members of these classes to other classes or data types. Slots collectively cover predicates, node properties, and edge properties
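
As a rough sketch of how class and slot definitions fit together, the example below uses a small Python dictionary shaped like a parsed schema file. The fragment is illustrative only: its entries are simplified and not copied from any release of biolink-model.yaml, and the helpers `ancestors` and `slots_for` are hypothetical names introduced here.

```python
# Illustrative fragment only, shaped like a parsed schema file.
# The entries are simplified and not copied from biolink-model.yaml.
schema = {
    "classes": {
        "named thing": {"description": "root class for entities"},
        "biological entity": {"is_a": "named thing"},
        "gene": {"is_a": "biological entity"},
        "association": {"description": "a typed link between two things"},
    },
    "slots": {
        # A predicate: relates one named thing to another.
        "related to": {"domain": "named thing", "range": "named thing"},
        # A node property: relates a named thing to a data type.
        "name": {"domain": "named thing", "range": "string"},
    },
}


def ancestors(class_name, classes):
    """Follow is_a links upward and return the chain of parent classes."""
    chain = []
    parent = classes[class_name].get("is_a")
    while parent is not None:
        chain.append(parent)
        parent = classes[parent].get("is_a")
    return chain


def slots_for(class_name, schema):
    """Return the slots whose domain is the class or one of its ancestors."""
    lineage = {class_name, *ancestors(class_name, schema["classes"])}
    return sorted(
        slot_name
        for slot_name, slot in schema["slots"].items()
        if slot.get("domain") in lineage
    )
```

In this sketch, `gene` inherits both slots through its `is_a` chain, while `association` has no slots because none declare it (or a parent of it) as their domain.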

The model itself is being used in the following projects:

  • NCATS Biomedical Data Translator
  • Monarch Initiative
  • KG-COVID-19
  • KG Microbe
  • Illuminating the Druggable Genome

Organization

The main source of truth is biolink-model.yaml. This is a YAML file that is intended to be relatively simple to view and edit in its native form.

The YAML definition is currently used to derive:

Unni DR, Moxon SAT, Bada M, Brush M, Bruskiewich R, Caufield JH, Clemons PA, Dancik V, Dumontier M, Fecho K, Glusman G, Hadlock JJ, Harris NL, Joshi A, Putman T, Qin G, Ramsey SA, Shefchek KA, Solbrig H, Soman K, Thessen AE, Haendel MA, Bizon C, Mungall CJ, The Biomedical Data Translator Consortium (2022). Biolink Model: A universal schema for knowledge graphs in clinical, biomedical, and translational science. Clin Transl Sci. Wiley; 2022 Jun 6; https://onlinelibrary.wiley.com/doi/10.1111/cts.13302