Monarch Ingest

Overview

Monarch Ingest generates KGX-formatted files that conform to the Biolink Model from a wide variety of biomedical data sources.

The final output of the Monarch Ingest process is the Monarch KG (knowledge graph).
The latest version is available at data.monarchinitiative.org.

See also the folder monarch-kg-dev/latest

Monarch Ingest is managed with uv, which creates and manages its own virtual environment.

Installation

monarch-ingest is a Python 3.10+ package, managed with uv.

  1. Install uv, if you don't already have it:

    curl -LsSf https://astral.sh/uv/install.sh | sh
    

  2. Clone the repo:

    git clone git@github.com:monarch-initiative/monarch-ingest
    

  3. Install monarch-ingest:

    cd monarch-ingest
    uv sync
    

  4. (Optional) Activate the virtual environment:

    # This step removes the need to prefix all commands with `uv run`
    source .venv/bin/activate
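
To confirm that the command-line tools used in the steps above are available on your PATH, a small check along the following lines may help. This is a minimal sketch: `missing_tools` is a hypothetical helper written for this example and is not part of monarch-ingest or uv.

```shell
#!/bin/sh
# Print the name of every listed command that is not found on PATH.
# missing_tools is a hypothetical helper written for this example only.
missing_tools() {
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "$tool"
        fi
    done
}

# The installation steps above rely on curl, git, and uv.
missing="$(missing_tools curl git uv)"
if [ -n "$missing" ]; then
    # $missing is left unquoted on purpose so the names print on one line.
    echo "Missing tools:" $missing
else
    echo "All required tools found."
fi
```

If a tool is reported missing, revisit the corresponding installation step before continuing.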
    

Usage

For a detailed tutorial on ingests and how to make one, see the Create an Ingest tab.

CLI usage is described in the CLI tab, and is also available by running ingest --help.

Run the whole pipeline!

  • Download the source data:

    ingest download --all
    

  • Run all transforms:

    ingest transform --all
    

  • Merge all transformed output into a tar.gz archive containing one node file and one edge file:

    ingest merge
    

  • Upload the results to the Monarch Ingest Google bucket:

    ingest release
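
The four steps above can also be chained in a single script that stops at the first failure. The following is a minimal sketch, not an official part of monarch-ingest: it uses only the commands listed above, prefixed with `uv run` for the case where the virtual environment has not been activated. The `DRY_RUN` variable and the `run_step` and `run_pipeline` functions are hypothetical names defined by this script, not ingest options.

```shell
#!/bin/sh
# Sketch: run the pipeline stages in order and stop at the first failure.
set -eu

# DRY_RUN is a hypothetical switch defined by this script, not an ingest option.
# With DRY_RUN=1 (the default here) each command is printed instead of executed.
DRY_RUN="${DRY_RUN:-1}"

# Print or execute a single command, depending on DRY_RUN.
run_step() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# Download, transform, merge, and release, in that order.
run_pipeline() {
    run_step uv run ingest download --all
    run_step uv run ingest transform --all
    run_step uv run ingest merge
    run_step uv run ingest release
}

run_pipeline
```

Setting `DRY_RUN=0` in the environment makes the script execute the commands instead of printing them. If the virtual environment is already activated, the `uv run` prefix can be dropped.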