biolink-model
Details
GitHub | biolink/biolink-model |
Language | Python |
Description | Schema and generated objects for biolink data model and upper ontology |
Dependencies
External Dependencies
Package | Version |
---|---|
python | ^3.9 |
curies | ^0.7.4 |
prefixmaps | ^0.2.3 |
pyyaml | ^6.0.1 |
stringcase | ^1.2.0 |
pytest | ^8.1.1 |
beautifulsoup4 | >=4.0.0 |
yamllint | ^1.35.1 |
path | ^17.0.0 |
linkml-runtime | ^1.8.3 |
Documentation
Biolink Model
Biolink Model: https://w3id.org/biolink/biolink-model.yaml
Quickstart docs:
For a good overview of the biolink-model, watch Chris Mungall's talk at ICBO 2020.
- Browse the model: https://biolink.github.io/biolink-model
  - named thing
  - association
  - predicate
Refer to the following resources for a quick introduction to the Biolink Model:
- Introduction to the Biolink Datamodel
- Biolink Model - A community driven data model for life sciences (Biocuration 2020)
  - Slides: https://bit.ly/biolink-model-workshop-biocuration-2020
  - Video: https://www.youtube.com/watch?v=RE1hFm8lvJA
See also the Biolink Model Documentation for help in understanding, curating, and working with the model.
Introduction
The purpose of the Biolink Model is to provide a high-level datamodel of biological entities (genes, diseases, phenotypes, pathways, individuals, substances, etc.), their properties, and their relationships, and to enumerate the ways in which they can be associated.
The representation is independent of storage technology or metamodel (Solr documents, neo4j/property graphs, RDF/OWL, JSON, CSVs, etc.). Mappings to each of these are provided.
The specification of the Biolink Model is a single YAML file built using LinkML. The basic elements of the YAML are:
- Class Definitions: definitions of upper-level classes representing both 'named thing' and 'association'
- Slot Definitions: definitions of slots (aka properties) that can be used to relate members of these classes to other classes or data types. Slots collectively refer to predicates, node properties, and edge properties.
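To make the class/slot split concrete, here is a minimal sketch of how such a schema looks once parsed into Python data. The dictionary below is hand-written for illustration and greatly simplified. It is not copied from biolink-model.yaml, and the helper functions `ancestors` and `inherited_slots` are hypothetical names, not part of any Biolink or LinkML API.

```python
# Hand-written fragment mirroring the shape a YAML parser would return for
# a schema with class and slot definitions. Names and fields here are
# illustrative assumptions, not the real Biolink Model definitions.
schema = {
    "classes": {
        "entity": {"slots": ["id"]},
        "named thing": {"is_a": "entity", "slots": ["name"]},
        "gene": {"is_a": "named thing"},
        "association": {
            "is_a": "entity",
            "slots": ["subject", "predicate", "object"],
        },
    },
    "slots": {
        "id": {"range": "string"},
        "name": {"range": "string"},
        "subject": {"range": "named thing"},
        "predicate": {"range": "string"},
        "object": {"range": "named thing"},
    },
}


def ancestors(schema, class_name):
    """Return class_name followed by its is_a parents, nearest first."""
    chain = []
    current = class_name
    while current is not None:
        chain.append(current)
        current = schema["classes"][current].get("is_a")
    return chain


def inherited_slots(schema, class_name):
    """Collect slots declared on the class and on all of its ancestors."""
    slots = []
    # Walk from the root down so inherited slots come first.
    for name in reversed(ancestors(schema, class_name)):
        for slot in schema["classes"][name].get("slots", []):
            if slot not in slots:
                slots.append(slot)
    return slots


print(ancestors(schema, "gene"))
print(inherited_slots(schema, "association"))
```

In this toy fragment, a node class such as `gene` inherits its slots through the `is_a` chain, while `association` carries the subject/predicate/object slots that describe an edge. For real work, a schema-aware library is a better fit than walking dictionaries by hand.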
The model itself is being used in the following projects:
- NCATS Biomedical Data Translator
- Monarch Initiative
- KG-COVID-19
- KG Microbe
- Illuminating the Druggable Genome
Organization
The main source of truth is biolink-model.yaml. This is a YAML file that is intended to be relatively simple to view and edit in its native form.
The YAML definition is currently used to derive:
- JSON Schema
- Python dataclasses
- ProtoBuf definitions
- GraphQL
- OWL
- RDF Shape Expressions
- JSON-LD context
- SHACL Shapes
- ShEx Shapes
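As a rough illustration of what deriving an artifact from a single definition means, the sketch below turns a toy class definition into a Python dataclass using only the standard library. This is not how the LinkML generators work and does not reproduce their output. The `gene_definition` dictionary, its `required` key, and the `to_dataclass` helper are all assumptions made up for this example.

```python
from dataclasses import field, fields, make_dataclass
from typing import Optional

# Hypothetical, hand-written class definition in the shape a parsed schema
# entry might have; not copied from biolink-model.yaml.
gene_definition = {
    "name": "gene",
    "slots": ["id", "name", "symbol"],
    "required": ["id"],
}


def to_dataclass(definition):
    """Build a dataclass with one string field per slot (toy derivation)."""
    class_name = definition["name"].title().replace(" ", "")
    required = set(definition.get("required", []))
    # Required fields go first: a dataclass rejects a field without a
    # default that follows a field with one. sorted() is stable, so the
    # original slot order is kept within each group.
    ordered = sorted(definition["slots"], key=lambda s: s not in required)
    spec = []
    for slot in ordered:
        if slot in required:
            spec.append((slot, str))
        else:
            spec.append((slot, Optional[str], field(default=None)))
    return make_dataclass(class_name, spec)


Gene = to_dataclass(gene_definition)
# Example values only; any identifier and symbol would do here.
example = Gene(id="HGNC:1100", symbol="BRCA1")
print([f.name for f in fields(Gene)])
print(example)
```

The same definition could feed other emitters (for instance one that writes a JSON Schema fragment), which is the point of keeping a single source of truth.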
Citing Biolink Model
Unni DR, Moxon SAT, Bada M, Brush M, Bruskiewich R, Caufield JH, Clemons PA, Dancik V, Dumontier M, Fecho K, Glusman G, Hadlock JJ, Harris NL, Joshi A, Putman T, Qin G, Ramsey SA, Shefchek KA, Solbrig H, Soman K, Thessen AE, Haendel MA, Bizon C, Mungall CJ, The Biomedical Data Translator Consortium (2022). Biolink Model: A universal schema for knowledge graphs in clinical, biomedical, and translational science. Clin Transl Sci. Wiley; 2022 Jun 6; https://onlinelibrary.wiley.com/doi/10.1111/cts.13302