vignettes/examples/transitive_rollups.Rmd
transitive_rollups.Rmd
First, let’s load some required libraries, and instantiate a
session-cached monarch_engine()
for querying.
Knowledge graphs frequently incorporate ontologies, which include
complex hierarchies of classes and sub-classes. Let’s visualize a couple
of levels of this hierarchy for the phenotype
"leg phenotype"
(removing "leg phenotype"
as
it will make the next examples clearer).
phenos <- e |>
fetch_nodes(name == "leg phenotype") |>
expand_n(predicates = "biolink:subclass_of",
categories = "biolink:PhenotypicFeature",
direction = "out",
n = 4) |>
activate(nodes) |>
filter(name != "leg phenotype")
plot(phenos)
It is not uncommon for data like this to come with additional
information; if these were a set of disease diagnoses, we might have
patient counts associated with each. Since patients receive diagnoses of
varying specificity, there may be counts on any subtype.
Hypothesizing these phenotypes as diagnoses, we’ll simulate some count information, plotting it in the node labels:
set.seed(42)
num_nodes <- nrow(nodes(phenos))
phenos_counted <- phenos |>
activate(nodes) |>
mutate(count = rpois(num_nodes, lambda = 5))
plot(phenos_counted,
node_label = paste(name, " || count: ", count))
A “rollup” might thus ask, how many patients are associated with each phenotype, if we include all of it’s descendants? For example, “lower limb segment phenotype” (8 patients) is a subclass of “limb segment phenotype” (4 patients), so the total number of “limb segment phenotype” patients includes both (12 patients).
The roll_up()
function allows us to compute this
information. It is designed to work with dplyr
’s
mutate()
on node data: we provide the column specifying
information to aggregate, a function to apply over the values (amongst
all descendants), and whether each node should include its own value in
the aggregation.
phenos_counted_rolled <- phenos_counted |>
activate(nodes) |>
mutate(total = roll_up(count, fun = sum, include_self = TRUE))
plot(phenos_counted_rolled,
node_label = paste0(name,
" || count: ", count,
" || total: ", total))
The corresponding roll_down()
aggregates in the opposite direction (not shown).
When performing a rollup, each node receives the specified column,
indexed to include only its descendants (and itself, if
include_self
is set). This is then passed to the
aggregating function fun
.
To see how this can be useful, we’ll start by introducing another
function, transfer()
. Much like roll_up()
,
this function is designed to be used with mutate()
on node
data; its purpose is to transfer information across edges, usually from
nodes of one kind to another. We’ll start by fetching all of the
subtypes of Niemann-Pick disease, and all known causal genes.
npc_genes <- e |>
fetch_nodes(name == "Niemann-Pick disease") |>
descendants() |>
expand(predicates = "biolink:causes")
plot(npc_genes)
Now, we might wish for disease nodes to have an attribute reflecting
their causal genes. This information is captured in the graph, but not
as a part of the nodes. The
transfer()
function ‘pulls’
information across edges:
npc_genes_causal <- npc_genes |>
activate(nodes) |>
mutate(caused_by = transfer(name, over = "biolink:causes", direction = "out"))
plot(npc_genes_causal,
node_label = paste0(name, " || caused by: ", caused_by))
npc_genes_causal
Graph with 17 nodes and 17 edges. Expand sections below for details.
Showing 17 of 17 nodes:
id | pcategory | name | symbol | in_taxon_label | description | synonym (list) | category (list) | iri | xref (list) | namespace | provided_by (list) | full_name | in_taxon (list) | type (list) | caused_by |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
“MONDO:0001982” | “biolink:Disease” | “Niemann-Pick disease” | NA | NA | “A group of inherited, severe metabolic disorders in which sphingomyelin accumulates in lysosomes in cells. The lysosomes normally transport material through and out of the cell.” | c(“Niemann-Pick disease with cholesterol esterification block”, “Niemann-Pick disease, subacute juvenile form”, “lipoid histiocytosis”, “lipoid histiocytosis (classical phosphatide)”, “sphingomyelin lipidosis”, “sphingomyelin/cholesterol lipidosis”, “sphingomyelinase deficiency disease”, “type A Niemann-Pick disease”) | c(“biolink:BiologicalEntity”, “biolink:Disease”, “biolink:DiseaseOrPhenotypicFeature”, “biolink:Entity”, “biolink:NamedThing”, “biolink:ThingWithTaxon”) | “http://purl.obolibrary.org/obo/MONDO_0001982” | c(“DOID:14504”, “EFO:1001380”, “GARD:13334”, “MEDGEN:10348”, “MESH:D009542”, “NANDO:2200561”, “NCIT:C61269”, “SCTID:58459009”, “UMLS:C0028064”, “icd11.foundation:398872780”) | “MONDO” | “phenio_nodes” | NA | NA | NA | NA |
“MONDO:0018982” | “biolink:Disease” | “Niemann-Pick disease type C” | NA | NA | “NPC is a complex lipid storage disease mainly characterized by the accumulation of unesterified cholesterol in the late endosomal/lysosomal compartment.” | c(“NPC”, “Niemann Pick Disease Type C”) | c(“biolink:BiologicalEntity”, “biolink:Disease”, “biolink:DiseaseOrPhenotypicFeature”, “biolink:Entity”, “biolink:NamedThing”, “biolink:ThingWithTaxon”) | “http://purl.obolibrary.org/obo/MONDO_0018982” | c(“GARD:7207”, “MEDGEN:67399”, “MESH:D052556”, “NANDO:1200063”, “NORD:1509”, “Orphanet:646”, “SCTID:66751000”, “UMLS:C0220756”, “icd11.foundation:812702125”) | “MONDO” | “phenio_nodes” | NA | NA | NA | NA |
“MONDO:0011873” | “biolink:Disease” | “Niemann-Pick disease, type C2” | NA | NA | “Niemann-Pick disease type C2 is a rare metabolic condition that affects many different parts of the body. Although signs and symptoms can develop at any age (infancy through adulthood), most affected people develop features of the condition during childhood. Neimann-Pick disease type C2 may be characterized by ataxia (difficulty coordinating movements), vertical supranuclear gaze palsy (inability to move the eyes vertically), poor muscle tone, hepatosplenomegaly (enlarged liver and spleen), interstitial lung disease, intellectual decline, seizures, speech problems, and difficulty swallowing. Niemann-Pick disease type C2 is caused by changes (mutations) in the NPC2 gene and is inherited in an autosomal recessive manner. There is, unfortunately, no cure for Niemann-Pick disease type C2. Treatment is based on the signs and symptoms present in each person.” | c(“NPC2”, “Niemann-PICK disease, type C2”, “Niemann-Pick disease type C2”, “Niemann-Pick disease, type C2”, “type C2 Niemann-Pick disease”) | c(“biolink:BiologicalEntity”, “biolink:Disease”, “biolink:DiseaseOrPhenotypicFeature”, “biolink:Entity”, “biolink:NamedThing”, “biolink:ThingWithTaxon”) | “http://purl.obolibrary.org/obo/MONDO_0011873” | c(“DOID:0070114”, “GARD:3992”, “MEDGEN:335942”, “MESH:C536119”, “NCIT:C126865”, “OMIM:607625”, “UMLS:C1843366”) | “MONDO” | “phenio_nodes” | NA | NA | NA | “NPC2” |
“MONDO:0100464” | “biolink:Disease” | “acid sphingomyelinase deficiency” | NA | NA | “An autosomal recessive lysosomal disease caused by biallelic loss of function variants in the SMPD1 gene. Clinical symptoms in affected individuals occur along a continuum. At the severe end of the spectrum are individuals historically diagnosed with Niemann-Pick disease type A (the neurovisceral form), which is characterized by hepatosplenomegaly with rapid neurological deterioration leading to death in the first few years of life. At the milder end of the spectrum are individuals historically diagnosed with Niemann-Pick disease type B, a later-onset, chronic visceral form, characterized by progressive visceral organ symptoms including hepatosplenomegaly and pulmonary insufficiency, and survival into adulthood. In addition, some affected individuals present with an intermediate phenotype, Niemann-Pick disease type A/B.” | NA | c(“biolink:BiologicalEntity”, “biolink:Disease”, “biolink:DiseaseOrPhenotypicFeature”, “biolink:Entity”, “biolink:NamedThing”, “biolink:ThingWithTaxon”) | “http://purl.obolibrary.org/obo/MONDO_0100464” | c(“MEDGEN:1800807”, “UMLS:C5243927”) | “MONDO” | “phenio_nodes” | NA | NA | NA | NA |
“MONDO:0011871” | “biolink:Disease” | “Niemann-Pick disease type B” | NA | NA | “Niemann-Pick disease type B is a mild subtype of Niemann-Pick disease, an autosomal recessive lysosomal disease, and is characterized clinically by onset in childhood with hepatosplenomegaly, growth retardation, and lung disorders such as infections and dyspnea” | c(“Niemann Pick disease type B”, “Niemann-PICK disease, type B”, “Niemann-Pick disease, Intermediate, with visceral involvement and rapid progression”, “Niemann-Pick disease, type E”, “Niemann-Pick disease, type F”, “type B Niemann-Pick disease”) | c(“biolink:BiologicalEntity”, “biolink:Disease”, “biolink:DiseaseOrPhenotypicFeature”, “biolink:Entity”, “biolink:NamedThing”, “biolink:ThingWithTaxon”) | “http://purl.obolibrary.org/obo/MONDO_0011871” | c(“DOID:0070112”, “GARD:10729”, “ICD10CM:E75.241”, “MEDGEN:78651”, “MESH:D052537”, “NANDO:1200062”, “NANDO:2201207”, “NCIT:C126866”, “OMIM:607616”, “Orphanet:77293”, “SCTID:39390005”, “UMLS:C0268243”, “icd11.foundation:327269975”) | “MONDO” | “phenio_nodes” | NA | NA | NA | “SMPD1” |
“MONDO:0009757” | “biolink:Disease” | “Niemann-Pick disease, type C1” | NA | NA | “Type C Niemann-Pick disease associated with a mutation in the gene NPC1, encoding Niemann-Pick C1 protein.” | c(“NPC1”, “Niemann-PICK disease, type C1”, “Niemann-Pick disease type C1”, “Niemann-Pick disease with cholesterol esterification block”, “Niemann-Pick disease without sphingomyelinase deficiency”, “Niemann-Pick disease, chronic neuronopathic form”, “Niemann-Pick disease, nova Scotian type”, “Niemann-Pick disease, subacute juvenile form”, “Niemann-Pick disease, type C”, “Niemann-Pick disease, type C1”, “Niemann-Pick disease, type D”, “neurovisceral storage disease with vertical supranuclear ophthalmoplegia”, “type C1 Niemann-Pick disease”) | c(“biolink:BiologicalEntity”, “biolink:Disease”, “biolink:DiseaseOrPhenotypicFeature”, “biolink:Entity”, “biolink:NamedThing”, “biolink:ThingWithTaxon”) | “http://purl.obolibrary.org/obo/MONDO_0009757” | c(“DOID:0070113”, “MEDGEN:465922”, “NCIT:C126864”, “OMIM:257220”, “SCTID:18927009”, “SCTID:67855008”, “UMLS:C3179455”) | “MONDO” | “phenio_nodes” | NA | NA | NA | “NPC1” |
“MONDO:0009756” | “biolink:Disease” | “Niemann-Pick disease type A” | NA | NA | “Niemann-Pick disease type A is a very severe subtype of Niemann-Pick disease, an autosomal recessive lysosomal disease, and is characterized clinically by onset in infancy or early childhood with failure to thrive, hepatosplenomegaly, and rapidly progressive neurodegenerative disorders.” | c(“Niemann-PICK disease, type A”, “Niemann-Pick disease, Intermediate, protracted neurovisceral”, “sphingomyelin lipidosis”, “sphingomyelinase deficiency”) | c(“biolink:BiologicalEntity”, “biolink:Disease”, “biolink:DiseaseOrPhenotypicFeature”, “biolink:Entity”, “biolink:NamedThing”, “biolink:ThingWithTaxon”) | “http://purl.obolibrary.org/obo/MONDO_0009756” | c(“DOID:0070111”, “GARD:7206”, “MEDGEN:78650”, “MESH:D052536”, “NANDO:1200061”, “NANDO:2201206”, “NCIT:C126561”, “OMIM:257200”, “Orphanet:77292”, “SCTID:52165006”, “UMLS:C0268242”, “icd11.foundation:530611243”) | “MONDO” | “phenio_nodes” | NA | NA | NA | “SMPD1” |
“MONDO:0016306” | “biolink:Disease” | “Niemann-Pick disease type C, severe perinatal form” | NA | NA | NA | NA | c(“biolink:BiologicalEntity”, “biolink:Disease”, “biolink:DiseaseOrPhenotypicFeature”, “biolink:Entity”, “biolink:NamedThing”, “biolink:ThingWithTaxon”) | “http://purl.obolibrary.org/obo/MONDO_0016306” | c(“GARD:20504”, “MEDGEN:1842349”, “Orphanet:216972”, “UMLS:C5680866”) | “MONDO” | “phenio_nodes” | NA | NA | NA | NA |
“MONDO:0016307” | “biolink:Disease” | “Niemann-Pick disease type C, severe early infantile neurologic onset” | NA | NA | NA | NA | c(“biolink:BiologicalEntity”, “biolink:Disease”, “biolink:DiseaseOrPhenotypicFeature”, “biolink:Entity”, “biolink:NamedThing”, “biolink:ThingWithTaxon”) | “http://purl.obolibrary.org/obo/MONDO_0016307” | c(“GARD:20505”, “MEDGEN:1842400”, “Orphanet:216975”, “UMLS:C5680868”, “icd11.foundation:587642791”) | “MONDO” | “phenio_nodes” | NA | NA | NA | NA |
“MONDO:0016308” | “biolink:Disease” | “Niemann-Pick disease type C, late infantile neurologic onset” | NA | NA | NA | NA | c(“biolink:BiologicalEntity”, “biolink:Disease”, “biolink:DiseaseOrPhenotypicFeature”, “biolink:Entity”, “biolink:NamedThing”, “biolink:ThingWithTaxon”) | “http://purl.obolibrary.org/obo/MONDO_0016308” | c(“GARD:20506”, “MEDGEN:1843353”, “Orphanet:216978”, “UMLS:C5680867”, “icd11.foundation:2075382821”) | “MONDO” | “phenio_nodes” | NA | NA | NA | NA |
“MONDO:0016310” | “biolink:Disease” | “Niemann-Pick disease type C, adult neurologic onset” | NA | NA | NA | NA | c(“biolink:BiologicalEntity”, “biolink:Disease”, “biolink:DiseaseOrPhenotypicFeature”, “biolink:Entity”, “biolink:NamedThing”, “biolink:ThingWithTaxon”) | “http://purl.obolibrary.org/obo/MONDO_0016310” | c(“GARD:20508”, “MEDGEN:1826101”, “NANDO:1200065”, “NANDO:2201209”, “Orphanet:216986”, “UMLS:C5680869”, “icd11.foundation:77127214”) | “MONDO” | “phenio_nodes” | NA | NA | NA | NA |
“MONDO:0016309” | “biolink:Disease” | “Niemann-Pick disease type C, juvenile neurologic onset” | NA | NA | NA | “Niemann-Pick disease type C, classic form” | c(“biolink:BiologicalEntity”, “biolink:Disease”, “biolink:DiseaseOrPhenotypicFeature”, “biolink:Entity”, “biolink:NamedThing”, “biolink:ThingWithTaxon”) | “http://purl.obolibrary.org/obo/MONDO_0016309” | c(“GARD:20507”, “MEDGEN:1842257”, “Orphanet:216981”, “UMLS:C5679813”, “icd11.foundation:2006062681”) | “MONDO” | “phenio_nodes” | NA | NA | NA | NA |
“MONDO:0020384” | “biolink:Disease” | “Niemann-Pick disease type E” | NA | NA | “Niemann-Pick disease, type E is a poorly defined adult-onset and non-neuronopathic form of Niemann-Pick disease.” | NA | c(“biolink:BiologicalEntity”, “biolink:Disease”, “biolink:DiseaseOrPhenotypicFeature”, “biolink:Entity”, “biolink:NamedThing”, “biolink:ThingWithTaxon”) | “http://purl.obolibrary.org/obo/MONDO_0020384” | c(“MEDGEN:82781”, “Orphanet:99022”, “SCTID:73399005”, “UMLS:C0268248”) | “MONDO” | “phenio_nodes” | NA | NA | NA | NA |
“MONDO:0850058” | “biolink:Disease” | “chronic neurovisceral acid sphingomyelinase deficiency” | NA | NA | NA | NA | c(“biolink:BiologicalEntity”, “biolink:Disease”, “biolink:DiseaseOrPhenotypicFeature”, “biolink:Entity”, “biolink:NamedThing”, “biolink:ThingWithTaxon”) | “http://purl.obolibrary.org/obo/MONDO_0850058” | c(“GARD:22456”, “MEDGEN:1842316”, “Orphanet:618891”, “UMLS:C5539139”) | “MONDO” | “phenio_nodes” | NA | NA | NA | NA |
“HGNC:14537” | “biolink:Gene” | “NPC2” | “NPC2” | “Homo sapiens” | NA | c(“EDDM1”, “HE1”, “NP-C2”, “Niemann-Pick disease, type C2”, “epididymal protein 1”) | c(“biolink:BiologicalEntity”, “biolink:ChemicalEntityOrGeneOrGeneProduct”, “biolink:Entity”, “biolink:Gene”, “biolink:GeneOrGeneProduct”, “biolink:GenomicEntity”, “biolink:MacromolecularMachineMixin”, “biolink:NamedThing”, “biolink:OntologyClass”, “biolink:PhysicalEssence”, “biolink:PhysicalEssenceOrOccurrent”, “biolink:ThingWithTaxon”) | NA | c(“ENSEMBL:ENSG00000119655”, “OMIM:601015”) | “HGNC” | “hgnc_gene_nodes” | “NPC intracellular cholesterol transporter 2” | “NCBITaxon:9606” | “SO:0001217” | NA |
“HGNC:7897” | “biolink:Gene” | “NPC1” | “NPC1” | “Homo sapiens” | NA | c(“Niemann-Pick disease, type C1”, “SLC65A1”) | c(“biolink:BiologicalEntity”, “biolink:ChemicalEntityOrGeneOrGeneProduct”, “biolink:Entity”, “biolink:Gene”, “biolink:GeneOrGeneProduct”, “biolink:GenomicEntity”, “biolink:MacromolecularMachineMixin”, “biolink:NamedThing”, “biolink:OntologyClass”, “biolink:PhysicalEssence”, “biolink:PhysicalEssenceOrOccurrent”, “biolink:ThingWithTaxon”) | NA | c(“ENSEMBL:ENSG00000141458”, “OMIM:607623”) | “HGNC” | “hgnc_gene_nodes” | “NPC intracellular cholesterol transporter 1” | “NCBITaxon:9606” | “SO:0001217” | NA |
“HGNC:11120” | “biolink:Gene” | “SMPD1” | “SMPD1” | “Homo sapiens” | NA | c(“ASM”, “Niemann-Pick type A/B”, “acid sphingomyelinase”, “sphingomyelin phosphodiesterase 1, acid lysosomal”) | c(“biolink:BiologicalEntity”, “biolink:ChemicalEntityOrGeneOrGeneProduct”, “biolink:Entity”, “biolink:Gene”, “biolink:GeneOrGeneProduct”, “biolink:GenomicEntity”, “biolink:MacromolecularMachineMixin”, “biolink:NamedThing”, “biolink:OntologyClass”, “biolink:PhysicalEssence”, “biolink:PhysicalEssenceOrOccurrent”, “biolink:ThingWithTaxon”) | NA | c(“ENSEMBL:ENSG00000166311”, “OMIM:607608”) | “HGNC” | “hgnc_gene_nodes” | “sphingomyelin phosphodiesterase 1” | “NCBITaxon:9606” | “SO:0001217” | NA |
Showing 17 of 17 edges:
from | to | subject | predicate | object | primary_knowledge_source | agent_type | knowledge_level | knowledge_source | aggregator_knowledge_source (list) | provided_by (list) | id | category (list) | original_subject | original_object |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
3 | 2 | “MONDO:0011873” | “biolink:subclass_of” | “MONDO:0018982” | “infores:mondo” | “not_provided” | “not_provided” | “monarch-kg_edges.jsonl” | c(“infores:monarchinitiative”, “infores:phenio”) | “phenio_edges” | “urn:uuid:d84e5ef9-9f3a-48ad-b02e-8bcb3c5c1d66” | c(“biolink:Association”, “biolink:Entity”) | NA | NA |
5 | 4 | “MONDO:0011871” | “biolink:subclass_of” | “MONDO:0100464” | “infores:mondo” | “not_provided” | “not_provided” | “monarch-kg_edges.jsonl” | c(“infores:monarchinitiative”, “infores:phenio”) | “phenio_edges” | “urn:uuid:1139cc43-14fb-40be-888a-5db87084ac41” | c(“biolink:Association”, “biolink:Entity”) | NA | NA |
6 | 2 | “MONDO:0009757” | “biolink:subclass_of” | “MONDO:0018982” | “infores:mondo” | “not_provided” | “not_provided” | “monarch-kg_edges.jsonl” | c(“infores:monarchinitiative”, “infores:phenio”) | “phenio_edges” | “urn:uuid:bc968cd7-5df4-4f0a-a1d0-cd9ab1024c4f” | c(“biolink:Association”, “biolink:Entity”) | NA | NA |
7 | 4 | “MONDO:0009756” | “biolink:subclass_of” | “MONDO:0100464” | “infores:mondo” | “not_provided” | “not_provided” | “monarch-kg_edges.jsonl” | c(“infores:monarchinitiative”, “infores:phenio”) | “phenio_edges” | “urn:uuid:d4851de8-b18c-4957-8c8d-d592732c2e2d” | c(“biolink:Association”, “biolink:Entity”) | NA | NA |
8 | 2 | “MONDO:0016306” | “biolink:subclass_of” | “MONDO:0018982” | “infores:mondo” | “not_provided” | “not_provided” | “monarch-kg_edges.jsonl” | c(“infores:monarchinitiative”, “infores:phenio”) | “phenio_edges” | “urn:uuid:6b6a820e-1cff-48c3-ad38-e8ead9291e11” | c(“biolink:Association”, “biolink:Entity”) | NA | NA |
9 | 2 | “MONDO:0016307” | “biolink:subclass_of” | “MONDO:0018982” | “infores:mondo” | “not_provided” | “not_provided” | “monarch-kg_edges.jsonl” | c(“infores:monarchinitiative”, “infores:phenio”) | “phenio_edges” | “urn:uuid:2a8dfadf-942b-40d1-872f-ea8743e643bb” | c(“biolink:Association”, “biolink:Entity”) | NA | NA |
10 | 2 | “MONDO:0016308” | “biolink:subclass_of” | “MONDO:0018982” | “infores:mondo” | “not_provided” | “not_provided” | “monarch-kg_edges.jsonl” | c(“infores:monarchinitiative”, “infores:phenio”) | “phenio_edges” | “urn:uuid:60bf219e-359c-47d8-9625-7548f3fa29de” | c(“biolink:Association”, “biolink:Entity”) | NA | NA |
11 | 2 | “MONDO:0016310” | “biolink:subclass_of” | “MONDO:0018982” | “infores:mondo” | “not_provided” | “not_provided” | “monarch-kg_edges.jsonl” | c(“infores:monarchinitiative”, “infores:phenio”) | “phenio_edges” | “urn:uuid:3ee3d8a0-8c25-4f72-b655-85353e04a57b” | c(“biolink:Association”, “biolink:Entity”) | NA | NA |
12 | 2 | “MONDO:0016309” | “biolink:subclass_of” | “MONDO:0018982” | “infores:mondo” | “not_provided” | “not_provided” | “monarch-kg_edges.jsonl” | c(“infores:monarchinitiative”, “infores:phenio”) | “phenio_edges” | “urn:uuid:73fa49c2-2b74-4197-a2cb-3ad785d43498” | c(“biolink:Association”, “biolink:Entity”) | NA | NA |
2 | 1 | “MONDO:0018982” | “biolink:subclass_of” | “MONDO:0001982” | “infores:mondo” | “not_provided” | “not_provided” | “monarch-kg_edges.jsonl” | c(“infores:monarchinitiative”, “infores:phenio”) | “phenio_edges” | “urn:uuid:2955ef59-2e0c-4857-8a3b-619e37e283f5” | c(“biolink:Association”, “biolink:Entity”) | NA | NA |
13 | 1 | “MONDO:0020384” | “biolink:subclass_of” | “MONDO:0001982” | “infores:mondo” | “not_provided” | “not_provided” | “monarch-kg_edges.jsonl” | c(“infores:monarchinitiative”, “infores:phenio”) | “phenio_edges” | “urn:uuid:c68dd71c-5986-499e-a898-2191b1b9060a” | c(“biolink:Association”, “biolink:Entity”) | NA | NA |
4 | 1 | “MONDO:0100464” | “biolink:subclass_of” | “MONDO:0001982” | “infores:mondo” | “not_provided” | “not_provided” | “monarch-kg_edges.jsonl” | c(“infores:monarchinitiative”, “infores:phenio”) | “phenio_edges” | “urn:uuid:07d2ac87-52da-4628-bf0a-ca8dd0ef173c” | c(“biolink:Association”, “biolink:Entity”) | NA | NA |
14 | 1 | “MONDO:0850058” | “biolink:subclass_of” | “MONDO:0001982” | “infores:mondo” | “not_provided” | “not_provided” | “monarch-kg_edges.jsonl” | c(“infores:monarchinitiative”, “infores:phenio”) | “phenio_edges” | “urn:uuid:69db7c8b-0592-4e33-9a9d-8cdd3f1fb3de” | c(“biolink:Association”, “biolink:Entity”) | NA | NA |
15 | 3 | “HGNC:14537” | “biolink:causes” | “MONDO:0011873” | “infores:omim” | “manual_agent” | “knowledge_assertion” | “monarch-kg_edges.jsonl” | c(“infores:medgen”, “infores:monarchinitiative”) | “hpoa_gene_to_disease_edges” | “uuid:a6261aaa-8b41-11ef-b621-6045bdbae67e” | c(“biolink:Association”, “biolink:CausalGeneToDiseaseAssociation”, “biolink:Entity”, “biolink:EntityToDiseaseAssociationMixin”, “biolink:EntityToFeatureOrDiseaseQualifiersMixin”, “biolink:EntityToPhenotypicFeatureAssociationMixin”, “biolink:FrequencyQualifierMixin”, “biolink:FrequencyQuantifier”, “biolink:GeneToDiseaseAssociation”, “biolink:GeneToDiseaseOrPhenotypicFeatureAssociation”, “biolink:GeneToEntityAssociationMixin”, “biolink:RelationshipQuantifier”) | “NCBIGene:10577” | “OMIM:607625” |
16 | 6 | “HGNC:7897” | “biolink:causes” | “MONDO:0009757” | “infores:omim” | “manual_agent” | “knowledge_assertion” | “monarch-kg_edges.jsonl” | c(“infores:medgen”, “infores:monarchinitiative”) | “hpoa_gene_to_disease_edges” | “uuid:a63c3eec-8b41-11ef-b621-6045bdbae67e” | c(“biolink:Association”, “biolink:CausalGeneToDiseaseAssociation”, “biolink:Entity”, “biolink:EntityToDiseaseAssociationMixin”, “biolink:EntityToFeatureOrDiseaseQualifiersMixin”, “biolink:EntityToPhenotypicFeatureAssociationMixin”, “biolink:FrequencyQualifierMixin”, “biolink:FrequencyQuantifier”, “biolink:GeneToDiseaseAssociation”, “biolink:GeneToDiseaseOrPhenotypicFeatureAssociation”, “biolink:GeneToEntityAssociationMixin”, “biolink:RelationshipQuantifier”) | “NCBIGene:4864” | “OMIM:257220” |
17 | 5 | “HGNC:11120” | “biolink:causes” | “MONDO:0011871” | “infores:omim” | “manual_agent” | “knowledge_assertion” | “monarch-kg_edges.jsonl” | c(“infores:medgen”, “infores:monarchinitiative”) | “hpoa_gene_to_disease_edges” | “uuid:a6261ac5-8b41-11ef-b621-6045bdbae67e” | c(“biolink:Association”, “biolink:CausalGeneToDiseaseAssociation”, “biolink:Entity”, “biolink:EntityToDiseaseAssociationMixin”, “biolink:EntityToFeatureOrDiseaseQualifiersMixin”, “biolink:EntityToPhenotypicFeatureAssociationMixin”, “biolink:FrequencyQualifierMixin”, “biolink:FrequencyQuantifier”, “biolink:GeneToDiseaseAssociation”, “biolink:GeneToDiseaseOrPhenotypicFeatureAssociation”, “biolink:GeneToEntityAssociationMixin”, “biolink:RelationshipQuantifier”) | “NCBIGene:6609” | “OMIM:607616” |
17 | 7 | “HGNC:11120” | “biolink:causes” | “MONDO:0009756” | “infores:omim” | “manual_agent” | “knowledge_assertion” | “monarch-kg_edges.jsonl” | c(“infores:medgen”, “infores:monarchinitiative”) | “hpoa_gene_to_disease_edges” | “uuid:a63c3f7e-8b41-11ef-b621-6045bdbae67e” | c(“biolink:Association”, “biolink:CausalGeneToDiseaseAssociation”, “biolink:Entity”, “biolink:EntityToDiseaseAssociationMixin”, “biolink:EntityToFeatureOrDiseaseQualifiersMixin”, “biolink:EntityToPhenotypicFeatureAssociationMixin”, “biolink:FrequencyQualifierMixin”, “biolink:FrequencyQuantifier”, “biolink:GeneToDiseaseAssociation”, “biolink:GeneToDiseaseOrPhenotypicFeatureAssociation”, “biolink:GeneToEntityAssociationMixin”, “biolink:RelationshipQuantifier”) | “NCBIGene:6609” | “OMIM:257200” |
Here, transfer is moving information ‘over’ (or across)
"biolink:causes"
edges in an outward direction, along the
direction of the edge. The transferred information is being drawn from
source nodes’ name
, resulting in a new
caused_by
column in the node table.
## # A tibble: 17 × 2
## name caused_by
## <chr> <chr>
## 1 Niemann-Pick disease NA
## 2 Niemann-Pick disease type C NA
## 3 Niemann-Pick disease, type C2 NPC2
## 4 acid sphingomyelinase deficiency NA
## 5 Niemann-Pick disease type B SMPD1
## 6 Niemann-Pick disease, type C1 NPC1
## 7 Niemann-Pick disease type A SMPD1
## 8 Niemann-Pick disease type C, severe perinatal form NA
## 9 Niemann-Pick disease type C, severe early infantile neurologic onset NA
## 10 Niemann-Pick disease type C, late infantile neurologic onset NA
## 11 Niemann-Pick disease type C, adult neurologic onset NA
## 12 Niemann-Pick disease type C, juvenile neurologic onset NA
## 13 Niemann-Pick disease type E NA
## 14 chronic neurovisceral acid sphingomyelinase deficiency NA
## 15 NPC2 NA
## 16 NPC1 NA
## 17 SMPD1 NA
In cases where a transfer would result in multiple values being collected at the destination node, the result will be a list column.
To finish this example, we use roll_up()
to collect, for
each diseases, the set of genes that cause it or any of its
subtypes.
npc_genes_causal_rolled <- npc_genes_causal |>
activate(nodes) |>
mutate(any_caused_by = roll_up(caused_by,
fun = unique,
include_self = TRUE,
predicates = "biolink:subclass_of"))
plot(npc_genes_causal_rolled,
node_label = paste0(name, " || any caused by: ", any_caused_by))
npc_genes_causal_rolled
Graph with 17 nodes and 17 edges. Expand sections below for details.
Showing 17 of 17 nodes:
id | pcategory | name | symbol | in_taxon_label | description | synonym (list) | category (list) | iri | xref (list) | namespace | provided_by (list) | full_name | in_taxon (list) | type (list) | caused_by | any_caused_by (list) |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
“MONDO:0001982” | “biolink:Disease” | “Niemann-Pick disease” | NA | NA | “A group of inherited, severe metabolic disorders in which sphingomyelin accumulates in lysosomes in cells. The lysosomes normally transport material through and out of the cell.” | c(“Niemann-Pick disease with cholesterol esterification block”, “Niemann-Pick disease, subacute juvenile form”, “lipoid histiocytosis”, “lipoid histiocytosis (classical phosphatide)”, “sphingomyelin lipidosis”, “sphingomyelin/cholesterol lipidosis”, “sphingomyelinase deficiency disease”, “type A Niemann-Pick disease”) | c(“biolink:BiologicalEntity”, “biolink:Disease”, “biolink:DiseaseOrPhenotypicFeature”, “biolink:Entity”, “biolink:NamedThing”, “biolink:ThingWithTaxon”) | “http://purl.obolibrary.org/obo/MONDO_0001982” | c(“DOID:14504”, “EFO:1001380”, “GARD:13334”, “MEDGEN:10348”, “MESH:D009542”, “NANDO:2200561”, “NCIT:C61269”, “SCTID:58459009”, “UMLS:C0028064”, “icd11.foundation:398872780”) | “MONDO” | “phenio_nodes” | NA | NA | NA | NA | c(NA, “NPC2”, “NPC1”, “SMPD1”) |
“MONDO:0018982” | “biolink:Disease” | “Niemann-Pick disease type C” | NA | NA | “NPC is a complex lipid storage disease mainly characterized by the accumulation of unesterified cholesterol in the late endosomal/lysosomal compartment.” | c(“NPC”, “Niemann Pick Disease Type C”) | c(“biolink:BiologicalEntity”, “biolink:Disease”, “biolink:DiseaseOrPhenotypicFeature”, “biolink:Entity”, “biolink:NamedThing”, “biolink:ThingWithTaxon”) | “http://purl.obolibrary.org/obo/MONDO_0018982” | c(“GARD:7207”, “MEDGEN:67399”, “MESH:D052556”, “NANDO:1200063”, “NORD:1509”, “Orphanet:646”, “SCTID:66751000”, “UMLS:C0220756”, “icd11.foundation:812702125”) | “MONDO” | “phenio_nodes” | NA | NA | NA | NA | c(NA, “NPC2”, “NPC1”) |
“MONDO:0011873” | “biolink:Disease” | “Niemann-Pick disease, type C2” | NA | NA | “Niemann-Pick disease type C2 is a rare metabolic condition that affects many different parts of the body. Although signs and symptoms can develop at any age (infancy through adulthood), most affected people develop features of the condition during childhood. Neimann-Pick disease type C2 may be characterized by ataxia (difficulty coordinating movements), vertical supranuclear gaze palsy (inability to move the eyes vertically), poor muscle tone, hepatosplenomegaly (enlarged liver and spleen), interstitial lung disease, intellectual decline, seizures, speech problems, and difficulty swallowing. Niemann-Pick disease type C2 is caused by changes (mutations) in the NPC2 gene and is inherited in an autosomal recessive manner. There is, unfortunately, no cure for Niemann-Pick disease type C2. Treatment is based on the signs and symptoms present in each person.” | c(“NPC2”, “Niemann-PICK disease, type C2”, “Niemann-Pick disease type C2”, “Niemann-Pick disease, type C2”, “type C2 Niemann-Pick disease”) | c(“biolink:BiologicalEntity”, “biolink:Disease”, “biolink:DiseaseOrPhenotypicFeature”, “biolink:Entity”, “biolink:NamedThing”, “biolink:ThingWithTaxon”) | “http://purl.obolibrary.org/obo/MONDO_0011873” | c(“DOID:0070114”, “GARD:3992”, “MEDGEN:335942”, “MESH:C536119”, “NCIT:C126865”, “OMIM:607625”, “UMLS:C1843366”) | “MONDO” | “phenio_nodes” | NA | NA | NA | “NPC2” | “NPC2” |
“MONDO:0100464” | “biolink:Disease” | “acid sphingomyelinase deficiency” | NA | NA | “An autosomal recessive lysosomal disease caused by biallelic loss of function variants in the SMPD1 gene. Clinical symptoms in affected individuals occur along a continuum. At the severe end of the spectrum are individuals historically diagnosed with Niemann-Pick disease type A (the neurovisceral form), which is characterized by hepatosplenomegaly with rapid neurological deterioration leading to death in the first few years of life. At the milder end of the spectrum are individuals historically diagnosed with Niemann-Pick disease type B, a later-onset, chronic visceral form, characterized by progressive visceral organ symptoms including hepatosplenomegaly and pulmonary insufficiency, and survival into adulthood. In addition, some affected individuals present with an intermediate phenotype, Niemann-Pick disease type A/B.” | NA | c(“biolink:BiologicalEntity”, “biolink:Disease”, “biolink:DiseaseOrPhenotypicFeature”, “biolink:Entity”, “biolink:NamedThing”, “biolink:ThingWithTaxon”) | “http://purl.obolibrary.org/obo/MONDO_0100464” | c(“MEDGEN:1800807”, “UMLS:C5243927”) | “MONDO” | “phenio_nodes” | NA | NA | NA | NA | c(NA, “SMPD1”) |
“MONDO:0011871” | “biolink:Disease” | “Niemann-Pick disease type B” | NA | NA | “Niemann-Pick disease type B is a mild subtype of Niemann-Pick disease, an autosomal recessive lysosomal disease, and is characterized clinically by onset in childhood with hepatosplenomegaly, growth retardation, and lung disorders such as infections and dyspnea” | c(“Niemann Pick disease type B”, “Niemann-PICK disease, type B”, “Niemann-Pick disease, Intermediate, with visceral involvement and rapid progression”, “Niemann-Pick disease, type E”, “Niemann-Pick disease, type F”, “type B Niemann-Pick disease”) | c(“biolink:BiologicalEntity”, “biolink:Disease”, “biolink:DiseaseOrPhenotypicFeature”, “biolink:Entity”, “biolink:NamedThing”, “biolink:ThingWithTaxon”) | “http://purl.obolibrary.org/obo/MONDO_0011871” | c(“DOID:0070112”, “GARD:10729”, “ICD10CM:E75.241”, “MEDGEN:78651”, “MESH:D052537”, “NANDO:1200062”, “NANDO:2201207”, “NCIT:C126866”, “OMIM:607616”, “Orphanet:77293”, “SCTID:39390005”, “UMLS:C0268243”, “icd11.foundation:327269975”) | “MONDO” | “phenio_nodes” | NA | NA | NA | “SMPD1” | “SMPD1” |
“MONDO:0009757” | “biolink:Disease” | “Niemann-Pick disease, type C1” | NA | NA | “Type C Niemann-Pick disease associated with a mutation in the gene NPC1, encoding Niemann-Pick C1 protein.” | c(“NPC1”, “Niemann-PICK disease, type C1”, “Niemann-Pick disease type C1”, “Niemann-Pick disease with cholesterol esterification block”, “Niemann-Pick disease without sphingomyelinase deficiency”, “Niemann-Pick disease, chronic neuronopathic form”, “Niemann-Pick disease, nova Scotian type”, “Niemann-Pick disease, subacute juvenile form”, “Niemann-Pick disease, type C”, “Niemann-Pick disease, type C1”, “Niemann-Pick disease, type D”, “neurovisceral storage disease with vertical supranuclear ophthalmoplegia”, “type C1 Niemann-Pick disease”) | c(“biolink:BiologicalEntity”, “biolink:Disease”, “biolink:DiseaseOrPhenotypicFeature”, “biolink:Entity”, “biolink:NamedThing”, “biolink:ThingWithTaxon”) | “http://purl.obolibrary.org/obo/MONDO_0009757” | c(“DOID:0070113”, “MEDGEN:465922”, “NCIT:C126864”, “OMIM:257220”, “SCTID:18927009”, “SCTID:67855008”, “UMLS:C3179455”) | “MONDO” | “phenio_nodes” | NA | NA | NA | “NPC1” | “NPC1” |
“MONDO:0009756” | “biolink:Disease” | “Niemann-Pick disease type A” | NA | NA | “Niemann-Pick disease type A is a very severe subtype of Niemann-Pick disease, an autosomal recessive lysosomal disease, and is characterized clinically by onset in infancy or early childhood with failure to thrive, hepatosplenomegaly, and rapidly progressive neurodegenerative disorders.” | c(“Niemann-PICK disease, type A”, “Niemann-Pick disease, Intermediate, protracted neurovisceral”, “sphingomyelin lipidosis”, “sphingomyelinase deficiency”) | c(“biolink:BiologicalEntity”, “biolink:Disease”, “biolink:DiseaseOrPhenotypicFeature”, “biolink:Entity”, “biolink:NamedThing”, “biolink:ThingWithTaxon”) | “http://purl.obolibrary.org/obo/MONDO_0009756” | c(“DOID:0070111”, “GARD:7206”, “MEDGEN:78650”, “MESH:D052536”, “NANDO:1200061”, “NANDO:2201206”, “NCIT:C126561”, “OMIM:257200”, “Orphanet:77292”, “SCTID:52165006”, “UMLS:C0268242”, “icd11.foundation:530611243”) | “MONDO” | “phenio_nodes” | NA | NA | NA | “SMPD1” | “SMPD1” |
“MONDO:0016306” | “biolink:Disease” | “Niemann-Pick disease type C, severe perinatal form” | NA | NA | NA | NA | c(“biolink:BiologicalEntity”, “biolink:Disease”, “biolink:DiseaseOrPhenotypicFeature”, “biolink:Entity”, “biolink:NamedThing”, “biolink:ThingWithTaxon”) | “http://purl.obolibrary.org/obo/MONDO_0016306” | c(“GARD:20504”, “MEDGEN:1842349”, “Orphanet:216972”, “UMLS:C5680866”) | “MONDO” | “phenio_nodes” | NA | NA | NA | NA | NA |
“MONDO:0016307” | “biolink:Disease” | “Niemann-Pick disease type C, severe early infantile neurologic onset” | NA | NA | NA | NA | c(“biolink:BiologicalEntity”, “biolink:Disease”, “biolink:DiseaseOrPhenotypicFeature”, “biolink:Entity”, “biolink:NamedThing”, “biolink:ThingWithTaxon”) | “http://purl.obolibrary.org/obo/MONDO_0016307” | c(“GARD:20505”, “MEDGEN:1842400”, “Orphanet:216975”, “UMLS:C5680868”, “icd11.foundation:587642791”) | “MONDO” | “phenio_nodes” | NA | NA | NA | NA | NA |
“MONDO:0016308” | “biolink:Disease” | “Niemann-Pick disease type C, late infantile neurologic onset” | NA | NA | NA | NA | c(“biolink:BiologicalEntity”, “biolink:Disease”, “biolink:DiseaseOrPhenotypicFeature”, “biolink:Entity”, “biolink:NamedThing”, “biolink:ThingWithTaxon”) | “http://purl.obolibrary.org/obo/MONDO_0016308” | c(“GARD:20506”, “MEDGEN:1843353”, “Orphanet:216978”, “UMLS:C5680867”, “icd11.foundation:2075382821”) | “MONDO” | “phenio_nodes” | NA | NA | NA | NA | NA |
“MONDO:0016310” | “biolink:Disease” | “Niemann-Pick disease type C, adult neurologic onset” | NA | NA | NA | NA | c(“biolink:BiologicalEntity”, “biolink:Disease”, “biolink:DiseaseOrPhenotypicFeature”, “biolink:Entity”, “biolink:NamedThing”, “biolink:ThingWithTaxon”) | “http://purl.obolibrary.org/obo/MONDO_0016310” | c(“GARD:20508”, “MEDGEN:1826101”, “NANDO:1200065”, “NANDO:2201209”, “Orphanet:216986”, “UMLS:C5680869”, “icd11.foundation:77127214”) | “MONDO” | “phenio_nodes” | NA | NA | NA | NA | NA |
“MONDO:0016309” | “biolink:Disease” | “Niemann-Pick disease type C, juvenile neurologic onset” | NA | NA | NA | “Niemann-Pick disease type C, classic form” | c(“biolink:BiologicalEntity”, “biolink:Disease”, “biolink:DiseaseOrPhenotypicFeature”, “biolink:Entity”, “biolink:NamedThing”, “biolink:ThingWithTaxon”) | “http://purl.obolibrary.org/obo/MONDO_0016309” | c(“GARD:20507”, “MEDGEN:1842257”, “Orphanet:216981”, “UMLS:C5679813”, “icd11.foundation:2006062681”) | “MONDO” | “phenio_nodes” | NA | NA | NA | NA | NA |
“MONDO:0020384” | “biolink:Disease” | “Niemann-Pick disease type E” | NA | NA | “Niemann-Pick disease, type E is a poorly defined adult-onset and non-neuronopathic form of Niemann-Pick disease.” | NA | c(“biolink:BiologicalEntity”, “biolink:Disease”, “biolink:DiseaseOrPhenotypicFeature”, “biolink:Entity”, “biolink:NamedThing”, “biolink:ThingWithTaxon”) | “http://purl.obolibrary.org/obo/MONDO_0020384” | c(“MEDGEN:82781”, “Orphanet:99022”, “SCTID:73399005”, “UMLS:C0268248”) | “MONDO” | “phenio_nodes” | NA | NA | NA | NA | NA |
“MONDO:0850058” | “biolink:Disease” | “chronic neurovisceral acid sphingomyelinase deficiency” | NA | NA | NA | NA | c(“biolink:BiologicalEntity”, “biolink:Disease”, “biolink:DiseaseOrPhenotypicFeature”, “biolink:Entity”, “biolink:NamedThing”, “biolink:ThingWithTaxon”) | “http://purl.obolibrary.org/obo/MONDO_0850058” | c(“GARD:22456”, “MEDGEN:1842316”, “Orphanet:618891”, “UMLS:C5539139”) | “MONDO” | “phenio_nodes” | NA | NA | NA | NA | NA |
“HGNC:14537” | “biolink:Gene” | “NPC2” | “NPC2” | “Homo sapiens” | NA | c(“EDDM1”, “HE1”, “NP-C2”, “Niemann-Pick disease, type C2”, “epididymal protein 1”) | c(“biolink:BiologicalEntity”, “biolink:ChemicalEntityOrGeneOrGeneProduct”, “biolink:Entity”, “biolink:Gene”, “biolink:GeneOrGeneProduct”, “biolink:GenomicEntity”, “biolink:MacromolecularMachineMixin”, “biolink:NamedThing”, “biolink:OntologyClass”, “biolink:PhysicalEssence”, “biolink:PhysicalEssenceOrOccurrent”, “biolink:ThingWithTaxon”) | NA | c(“ENSEMBL:ENSG00000119655”, “OMIM:601015”) | “HGNC” | “hgnc_gene_nodes” | “NPC intracellular cholesterol transporter 2” | “NCBITaxon:9606” | “SO:0001217” | NA | NA |
“HGNC:7897” | “biolink:Gene” | “NPC1” | “NPC1” | “Homo sapiens” | NA | c(“Niemann-Pick disease, type C1”, “SLC65A1”) | c(“biolink:BiologicalEntity”, “biolink:ChemicalEntityOrGeneOrGeneProduct”, “biolink:Entity”, “biolink:Gene”, “biolink:GeneOrGeneProduct”, “biolink:GenomicEntity”, “biolink:MacromolecularMachineMixin”, “biolink:NamedThing”, “biolink:OntologyClass”, “biolink:PhysicalEssence”, “biolink:PhysicalEssenceOrOccurrent”, “biolink:ThingWithTaxon”) | NA | c(“ENSEMBL:ENSG00000141458”, “OMIM:607623”) | “HGNC” | “hgnc_gene_nodes” | “NPC intracellular cholesterol transporter 1” | “NCBITaxon:9606” | “SO:0001217” | NA | NA |
“HGNC:11120” | “biolink:Gene” | “SMPD1” | “SMPD1” | “Homo sapiens” | NA | c(“ASM”, “Niemann-Pick type A/B”, “acid sphingomyelinase”, “sphingomyelin phosphodiesterase 1, acid lysosomal”) | c(“biolink:BiologicalEntity”, “biolink:ChemicalEntityOrGeneOrGeneProduct”, “biolink:Entity”, “biolink:Gene”, “biolink:GeneOrGeneProduct”, “biolink:GenomicEntity”, “biolink:MacromolecularMachineMixin”, “biolink:NamedThing”, “biolink:OntologyClass”, “biolink:PhysicalEssence”, “biolink:PhysicalEssenceOrOccurrent”, “biolink:ThingWithTaxon”) | NA | c(“ENSEMBL:ENSG00000166311”, “OMIM:607608”) | “HGNC” | “hgnc_gene_nodes” | “sphingomyelin phosphodiesterase 1” | “NCBITaxon:9606” | “SO:0001217” | NA | NA |
Showing 17 of 17 edges:
from | to | subject | predicate | object | primary_knowledge_source | agent_type | knowledge_level | knowledge_source | aggregator_knowledge_source (list) | provided_by (list) | id | category (list) | original_subject | original_object |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
3 | 2 | “MONDO:0011873” | “biolink:subclass_of” | “MONDO:0018982” | “infores:mondo” | “not_provided” | “not_provided” | “monarch-kg_edges.jsonl” | c(“infores:monarchinitiative”, “infores:phenio”) | “phenio_edges” | “urn:uuid:d84e5ef9-9f3a-48ad-b02e-8bcb3c5c1d66” | c(“biolink:Association”, “biolink:Entity”) | NA | NA |
5 | 4 | “MONDO:0011871” | “biolink:subclass_of” | “MONDO:0100464” | “infores:mondo” | “not_provided” | “not_provided” | “monarch-kg_edges.jsonl” | c(“infores:monarchinitiative”, “infores:phenio”) | “phenio_edges” | “urn:uuid:1139cc43-14fb-40be-888a-5db87084ac41” | c(“biolink:Association”, “biolink:Entity”) | NA | NA |
6 | 2 | “MONDO:0009757” | “biolink:subclass_of” | “MONDO:0018982” | “infores:mondo” | “not_provided” | “not_provided” | “monarch-kg_edges.jsonl” | c(“infores:monarchinitiative”, “infores:phenio”) | “phenio_edges” | “urn:uuid:bc968cd7-5df4-4f0a-a1d0-cd9ab1024c4f” | c(“biolink:Association”, “biolink:Entity”) | NA | NA |
7 | 4 | “MONDO:0009756” | “biolink:subclass_of” | “MONDO:0100464” | “infores:mondo” | “not_provided” | “not_provided” | “monarch-kg_edges.jsonl” | c(“infores:monarchinitiative”, “infores:phenio”) | “phenio_edges” | “urn:uuid:d4851de8-b18c-4957-8c8d-d592732c2e2d” | c(“biolink:Association”, “biolink:Entity”) | NA | NA |
8 | 2 | “MONDO:0016306” | “biolink:subclass_of” | “MONDO:0018982” | “infores:mondo” | “not_provided” | “not_provided” | “monarch-kg_edges.jsonl” | c(“infores:monarchinitiative”, “infores:phenio”) | “phenio_edges” | “urn:uuid:6b6a820e-1cff-48c3-ad38-e8ead9291e11” | c(“biolink:Association”, “biolink:Entity”) | NA | NA |
9 | 2 | “MONDO:0016307” | “biolink:subclass_of” | “MONDO:0018982” | “infores:mondo” | “not_provided” | “not_provided” | “monarch-kg_edges.jsonl” | c(“infores:monarchinitiative”, “infores:phenio”) | “phenio_edges” | “urn:uuid:2a8dfadf-942b-40d1-872f-ea8743e643bb” | c(“biolink:Association”, “biolink:Entity”) | NA | NA |
10 | 2 | “MONDO:0016308” | “biolink:subclass_of” | “MONDO:0018982” | “infores:mondo” | “not_provided” | “not_provided” | “monarch-kg_edges.jsonl” | c(“infores:monarchinitiative”, “infores:phenio”) | “phenio_edges” | “urn:uuid:60bf219e-359c-47d8-9625-7548f3fa29de” | c(“biolink:Association”, “biolink:Entity”) | NA | NA |
11 | 2 | “MONDO:0016310” | “biolink:subclass_of” | “MONDO:0018982” | “infores:mondo” | “not_provided” | “not_provided” | “monarch-kg_edges.jsonl” | c(“infores:monarchinitiative”, “infores:phenio”) | “phenio_edges” | “urn:uuid:3ee3d8a0-8c25-4f72-b655-85353e04a57b” | c(“biolink:Association”, “biolink:Entity”) | NA | NA |
12 | 2 | “MONDO:0016309” | “biolink:subclass_of” | “MONDO:0018982” | “infores:mondo” | “not_provided” | “not_provided” | “monarch-kg_edges.jsonl” | c(“infores:monarchinitiative”, “infores:phenio”) | “phenio_edges” | “urn:uuid:73fa49c2-2b74-4197-a2cb-3ad785d43498” | c(“biolink:Association”, “biolink:Entity”) | NA | NA |
2 | 1 | “MONDO:0018982” | “biolink:subclass_of” | “MONDO:0001982” | “infores:mondo” | “not_provided” | “not_provided” | “monarch-kg_edges.jsonl” | c(“infores:monarchinitiative”, “infores:phenio”) | “phenio_edges” | “urn:uuid:2955ef59-2e0c-4857-8a3b-619e37e283f5” | c(“biolink:Association”, “biolink:Entity”) | NA | NA |
13 | 1 | “MONDO:0020384” | “biolink:subclass_of” | “MONDO:0001982” | “infores:mondo” | “not_provided” | “not_provided” | “monarch-kg_edges.jsonl” | c(“infores:monarchinitiative”, “infores:phenio”) | “phenio_edges” | “urn:uuid:c68dd71c-5986-499e-a898-2191b1b9060a” | c(“biolink:Association”, “biolink:Entity”) | NA | NA |
4 | 1 | “MONDO:0100464” | “biolink:subclass_of” | “MONDO:0001982” | “infores:mondo” | “not_provided” | “not_provided” | “monarch-kg_edges.jsonl” | c(“infores:monarchinitiative”, “infores:phenio”) | “phenio_edges” | “urn:uuid:07d2ac87-52da-4628-bf0a-ca8dd0ef173c” | c(“biolink:Association”, “biolink:Entity”) | NA | NA |
14 | 1 | “MONDO:0850058” | “biolink:subclass_of” | “MONDO:0001982” | “infores:mondo” | “not_provided” | “not_provided” | “monarch-kg_edges.jsonl” | c(“infores:monarchinitiative”, “infores:phenio”) | “phenio_edges” | “urn:uuid:69db7c8b-0592-4e33-9a9d-8cdd3f1fb3de” | c(“biolink:Association”, “biolink:Entity”) | NA | NA |
15 | 3 | “HGNC:14537” | “biolink:causes” | “MONDO:0011873” | “infores:omim” | “manual_agent” | “knowledge_assertion” | “monarch-kg_edges.jsonl” | c(“infores:medgen”, “infores:monarchinitiative”) | “hpoa_gene_to_disease_edges” | “uuid:a6261aaa-8b41-11ef-b621-6045bdbae67e” | c(“biolink:Association”, “biolink:CausalGeneToDiseaseAssociation”, “biolink:Entity”, “biolink:EntityToDiseaseAssociationMixin”, “biolink:EntityToFeatureOrDiseaseQualifiersMixin”, “biolink:EntityToPhenotypicFeatureAssociationMixin”, “biolink:FrequencyQualifierMixin”, “biolink:FrequencyQuantifier”, “biolink:GeneToDiseaseAssociation”, “biolink:GeneToDiseaseOrPhenotypicFeatureAssociation”, “biolink:GeneToEntityAssociationMixin”, “biolink:RelationshipQuantifier”) | “NCBIGene:10577” | “OMIM:607625” |
16 | 6 | “HGNC:7897” | “biolink:causes” | “MONDO:0009757” | “infores:omim” | “manual_agent” | “knowledge_assertion” | “monarch-kg_edges.jsonl” | c(“infores:medgen”, “infores:monarchinitiative”) | “hpoa_gene_to_disease_edges” | “uuid:a63c3eec-8b41-11ef-b621-6045bdbae67e” | c(“biolink:Association”, “biolink:CausalGeneToDiseaseAssociation”, “biolink:Entity”, “biolink:EntityToDiseaseAssociationMixin”, “biolink:EntityToFeatureOrDiseaseQualifiersMixin”, “biolink:EntityToPhenotypicFeatureAssociationMixin”, “biolink:FrequencyQualifierMixin”, “biolink:FrequencyQuantifier”, “biolink:GeneToDiseaseAssociation”, “biolink:GeneToDiseaseOrPhenotypicFeatureAssociation”, “biolink:GeneToEntityAssociationMixin”, “biolink:RelationshipQuantifier”) | “NCBIGene:4864” | “OMIM:257220” |
17 | 5 | “HGNC:11120” | “biolink:causes” | “MONDO:0011871” | “infores:omim” | “manual_agent” | “knowledge_assertion” | “monarch-kg_edges.jsonl” | c(“infores:medgen”, “infores:monarchinitiative”) | “hpoa_gene_to_disease_edges” | “uuid:a6261ac5-8b41-11ef-b621-6045bdbae67e” | c(“biolink:Association”, “biolink:CausalGeneToDiseaseAssociation”, “biolink:Entity”, “biolink:EntityToDiseaseAssociationMixin”, “biolink:EntityToFeatureOrDiseaseQualifiersMixin”, “biolink:EntityToPhenotypicFeatureAssociationMixin”, “biolink:FrequencyQualifierMixin”, “biolink:FrequencyQuantifier”, “biolink:GeneToDiseaseAssociation”, “biolink:GeneToDiseaseOrPhenotypicFeatureAssociation”, “biolink:GeneToEntityAssociationMixin”, “biolink:RelationshipQuantifier”) | “NCBIGene:6609” | “OMIM:607616” |
17 | 7 | “HGNC:11120” | “biolink:causes” | “MONDO:0009756” | “infores:omim” | “manual_agent” | “knowledge_assertion” | “monarch-kg_edges.jsonl” | c(“infores:medgen”, “infores:monarchinitiative”) | “hpoa_gene_to_disease_edges” | “uuid:a63c3f7e-8b41-11ef-b621-6045bdbae67e” | c(“biolink:Association”, “biolink:CausalGeneToDiseaseAssociation”, “biolink:Entity”, “biolink:EntityToDiseaseAssociationMixin”, “biolink:EntityToFeatureOrDiseaseQualifiersMixin”, “biolink:EntityToPhenotypicFeatureAssociationMixin”, “biolink:FrequencyQualifierMixin”, “biolink:FrequencyQuantifier”, “biolink:GeneToDiseaseAssociation”, “biolink:GeneToDiseaseOrPhenotypicFeatureAssociation”, “biolink:GeneToEntityAssociationMixin”, “biolink:RelationshipQuantifier”) | “NCBIGene:6609” | “OMIM:257200” |
The inclusion of NA
values may not be desired (it
signals that at least one of the rolled nodes had a
caused_by
of NA
). We could write an
aggregating function that removes NA
and supply that; this
would also a good use case for purrr
’s
compose()
(fun = compose(unique, na.omit)
).
Let’s return to the patient-count example, using the rolled-up data:
It may be the case that to protect patient privacy (again, pretending
these phenotypes are disease diagnoses associated with patients) we want
to remove nodes that have a
count
less than 6. If we do so
however, we lose connectivity:
censored <- phenos_counted_rolled |>
activate(nodes) |>
filter(!count < 6)
plot(censored,
node_label = paste0(name,
" || count: ", count,
" || total: ", total))
To fix this, we can first compute the
transitive_closure()
of the graph, with respect to an edge predicate we want to treat as
transitive (defaulting to biolink:subclass_of
). We color
edges by primary_knowledge_source
to highlight that newly
created transitive edges are given knowledge source
transitive_<predicate>
, but use the same predicate.
The result is busy, and in general the number of transitive edges can be
in the number of nodes.
phenos_closed <- phenos_counted_rolled |>
transitive_closure(predicate = "biolink:subclass_of")
plot(phenos_closed,
node_label = paste0(name,
" || count: ", count,
" || total: ", total),
edge_color = primary_knowledge_source,
edge_linetype = predicate)
Now we can try our removal:
closed_censored <- phenos_closed |>
activate(nodes) |>
filter(!count < 6)
plot(closed_censored,
node_label = paste0(name,
" || count: ", count,
" || total: ", total),
edge_color = primary_knowledge_source,
edge_linetype = predicate)
The graph retains its connectivity, but has redundant edges. The
transitive_reduction()
function removes these, again
according to a specified transitive predicate (defaulting again to
biolink:subclass_f
).
phenos_final <- closed_censored |>
transitive_reduction(predicate = "biolink:subclass_of")
plot(phenos_final,
node_label = paste0(name,
" || count: ", count,
" || total: ", total),
edge_color = primary_knowledge_source,
edge_linetype = predicate)