Given an optional KG engine (e.g. a file_engine(), neo4j_engine(), or monarch_engine()) and a query tbl_kgx() graph,
fetches additional nodes and edges from the KG, expanding the query graph according to specific criteria.
If the first parameter is an engine, that engine is used; if the first parameter is a query graph,
the most recent engine associated with the graph is used.
expand(
  graph,
  engine = NULL,
  direction = "both",
  predicates = NULL,
  categories = NULL,
  transitive = FALSE,
  drop_unused_query_nodes = FALSE,
  ...
)
graph: A query tbl_kgx() graph to query with.
engine: (Optional) An engine to use for fetching query graph edges. If not provided, the graph's most recent engine is used.
direction: The direction of associations to fetch. Can be "in", "out", or "both". Default is "both".
predicates: A vector of relationship predicates (nodes in graph are subjects in the KG), indicating which edges to consider in the neighborhood. If NULL (default), all edges are considered.
categories: A vector of node categories, indicating which nodes in the larger KG may be fetched. If NULL (default), any node in the larger KG may be fetched.
transitive: If TRUE, include the transitive closure of the neighborhood. Default is FALSE. Useful in combination with predicates like biolink:subclass_of.
drop_unused_query_nodes: If TRUE, remove query nodes from the result, unless they are at the neighborhood boundary, i.e., required for connecting to the result nodes. Default is FALSE.
...: Other parameters passed to methods.
A tbl_kgx() graph.
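The effect of direction can be seen by running the same query with each of its three values. The sketch below is illustrative only: it uses the example KGX file packaged with monarchr, and the variable names (query, outgoing, incoming, both) are made up for this example. It assumes that "out" selects edges on which a query node is the subject and "in" selects edges on which it is the object, which this page does not state explicitly. igraph::ecount() is used on the assumption that a tbl_kgx() graph is also an igraph object, as the tbl_graph printouts suggest.

```r
library(monarchr)

# Example KGX file shipped with monarchr (same file as in the examples).
filename <- system.file("extdata", "eds_marfan_kg.tar.gz", package = "monarchr")

# Start from a single disease node.
query <- file_engine(filename) |>
  fetch_nodes(query_ids = "MONDO:0007525")

# Assumption: "out" follows edges leaving the query node,
# "in" follows edges arriving at it, "both" follows either.
outgoing <- query |> expand(direction = "out")
incoming <- query |> expand(direction = "in")
both     <- query |> expand(direction = "both")

# Compare how many edges each direction returned.
edge_counts <- c(
  out  = igraph::ecount(outgoing),
  "in" = igraph::ecount(incoming),
  both = igraph::ecount(both)
)
print(edge_counts)
```

Restricting direction is mainly useful for asymmetric predicates, where following an edge forwards and backwards answers different questions.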
## Using Monarch (hosted)
phenos <- monarch_engine() |>
  fetch_nodes(query_ids = "MONDO:0007525") |>
  expand(predicates = "biolink:has_phenotype",
         categories = "biolink:PhenotypicFeature")
#> Trying to connect to https://neo4j.monarchinitiative.org
#> Connected to https://neo4j.monarchinitiative.org
#> Fetching; counting matching nodes...
#> total: 1.
#> Fetching; fetched 1 of 1
#> Expanding; counting matching edges...
#> total: 48.
#> Expanding; fetched 48 of 48 edges.
print(phenos)
#> # A tbl_graph: 49 nodes and 48 edges
#> #
#> # A rooted tree
#> #
#> # Node Data: 49 × 10 (active)
#> id pcategory name description synonym category iri xref namespace
#> <chr> <chr> <chr> <chr> <list> <list> <chr> <lis> <chr>
#> 1 MONDO:000… biolink:… Ehle… Arthrochal… <chr> <chr> http… <chr> MONDO
#> 2 HP:0000963 biolink:… Thin… Reduction … <chr> <chr> http… <chr> HP
#> 3 HP:0000974 biolink:… Hype… A conditio… <chr> <chr> http… <chr> HP
#> 4 HP:0001001 biolink:… Abno… NA <chr> <chr> http… <chr> HP
#> 5 HP:0001252 biolink:… Hypo… Hypotonia … <chr> <chr> http… <chr> HP
#> 6 HP:0001373 biolink:… Join… Displaceme… <chr> <chr> http… <chr> HP
#> 7 HP:0001385 biolink:… Hip … The presen… <chr> <chr> http… <chr> HP
#> 8 HP:0001387 biolink:… Join… Joint stif… <chr> <chr> http… <chr> HP
#> 9 HP:0002300 biolink:… Muti… Complete l… <chr> <chr> http… <chr> HP
#> 10 HP:0002381 biolink:… Apha… An acquire… <chr> <chr> http… <chr> HP
#> # ℹ 39 more rows
#> # ℹ 1 more variable: provided_by <list>
#> #
#> # Edge Data: 48 × 23
#> from to subject predicate object primary_knowledge_so…¹ knowledge_level
#> <int> <int> <chr> <chr> <chr> <chr> <chr>
#> 1 1 2 MONDO:000… biolink:… HP:00… infores:orphanet knowledge_asse…
#> 2 1 3 MONDO:000… biolink:… HP:00… infores:omim knowledge_asse…
#> 3 1 4 MONDO:000… biolink:… HP:00… infores:orphanet knowledge_asse…
#> # ℹ 45 more rows
#> # ℹ abbreviated name: ¹primary_knowledge_source
#> # ℹ 16 more variables: negated <lgl>, frequency_qualifier <chr>,
#> # original_subject <chr>, agent_type <chr>, knowledge_source <chr>,
#> # aggregator_knowledge_source <list>, has_evidence <list>,
#> # provided_by <list>, id <chr>, category <list>, has_total <chr>,
#> # has_quotient <chr>, has_count <chr>, has_percentage <chr>, …
## Using example KGX file packaged with monarchr
filename <- system.file("extdata", "eds_marfan_kg.tar.gz", package = "monarchr")
phenos <- file_engine(filename) |>
  fetch_nodes(query_ids = "MONDO:0007525") |>
  expand(predicates = "biolink:has_phenotype",
         categories = "biolink:PhenotypicFeature")
print(phenos)
#> # A tbl_graph: 49 nodes and 48 edges
#> #
#> # A rooted tree
#> #
#> # Node Data: 49 × 16 (active)
#> id pcategory name symbol in_taxon_label description synonym category
#> <chr> <chr> <chr> <chr> <chr> <chr> <list> <list>
#> 1 MONDO:000… biolink:… Ehle… NA NA Arthrochal… <chr> <chr>
#> 2 HP:0000974 biolink:… Hype… NA NA A conditio… <chr> <chr>
#> 3 HP:0001382 biolink:… Join… NA NA The abilit… <chr> <chr>
#> 4 HP:0000023 biolink:… Ingu… NA NA Protrusion… <chr> <chr>
#> 5 HP:0000963 biolink:… Thin… NA NA Reduction … <chr> <chr>
#> 6 HP:0000978 biolink:… Brui… NA NA An ecchymo… <chr> <chr>
#> 7 HP:0001027 biolink:… Soft… NA NA A skin tex… <chr> <chr>
#> 8 HP:0001058 biolink:… Poor… NA NA A reduced … <chr> <chr>
#> 9 HP:0001075 biolink:… Atro… NA NA Scars that… <chr> <chr>
#> 10 HP:0001373 biolink:… Join… NA NA Displaceme… <chr> <chr>
#> # ℹ 39 more rows
#> # ℹ 8 more variables: iri <chr>, xref <list>, namespace <chr>,
#> # provided_by <chr>, in_taxon <chr>, full_name <chr>, type <list>,
#> # has_gene <chr>
#> #
#> # Edge Data: 48 × 25
#> from to subject predicate object primary_knowledge_so…¹ agent_type
#> <int> <int> <chr> <chr> <chr> <chr> <chr>
#> 1 1 8 MONDO:0007525 biolink:ha… HP:00… infores:hpo-annotatio… manual_ag…
#> 2 1 29 MONDO:0007525 biolink:ha… HP:00… infores:hpo-annotatio… manual_ag…
#> 3 1 20 MONDO:0007525 biolink:ha… HP:00… infores:hpo-annotatio… manual_ag…
#> # ℹ 45 more rows
#> # ℹ abbreviated name: ¹primary_knowledge_source
#> # ℹ 18 more variables: knowledge_level <chr>, knowledge_source <chr>,
#> # aggregator_knowledge_source <chr>, provided_by <chr>, id <chr>,
#> # category <chr>, original_object <chr>, original_subject <chr>,
#> # frequency_qualifier <chr>, has_evidence <chr>, has_total <dbl>,
#> # has_quotient <dbl>, has_count <dbl>, has_percentage <dbl>, …
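The transitive argument is described above as useful with predicates like biolink:subclass_of. A minimal sketch of that combination follows, again using the packaged example file. It is not taken from the package's own examples: the variable name ancestors is hypothetical, the choice of a single predicate with direction = "out" is a cautious assumption about what a transitive query needs, and whether the packaged file contains any biolink:subclass_of edges for this node is not confirmed here, so no output is shown.

```r
library(monarchr)

# Example KGX file shipped with monarchr.
filename <- system.file("extdata", "eds_marfan_kg.tar.gz", package = "monarchr")

# Follow biolink:subclass_of edges away from the query node and
# include the transitive closure, i.e. parents, parents of parents, and so on.
# Assumption: one predicate and one explicit direction are supplied.
ancestors <- file_engine(filename) |>
  fetch_nodes(query_ids = "MONDO:0007525") |>
  expand(predicates = "biolink:subclass_of",
         direction = "out",
         transitive = TRUE)

print(ancestors)
```

With transitive = FALSE the same call would be expected to stop after one step along biolink:subclass_of.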
## Using MONDO KGX file (remote) as an example
phenos <- file_engine("https://kghub.io/kg-obo/mondo/2024-03-04/mondo_kgx_tsv.tar.gz") |>
  fetch_nodes(query_ids = "MONDO:0007525") |>
  expand(predicates = "biolink:has_phenotype",
         categories = "biolink:PhenotypicFeature")
print(phenos)
#> # A tbl_graph: 1 nodes and 0 edges
#> #
#> # A rooted tree
#> #
#> # Node Data: 1 × 11 (active)
#> id pcategory name description synonym category xref provided_by iri
#> <chr> <chr> <chr> <chr> <list> <list> <lis> <chr> <chr>
#> 1 MONDO:00… biolink:… Ehle… Arthrochal… <chr> <chr> <chr> mondo.json http…
#> # ℹ 2 more variables: same_as <list>, subsets <list>
#> #
#> # Edge Data: 0 × 5
#> # ℹ 5 variables: from <int>, to <int>, subject <chr>, predicate <chr>,
#> # object <chr>
file.remove("mondo_kgx_tsv.tar.gz") # cleanup - remove the downloaded file
#> [1] TRUE
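The examples above rely on the engine that the query graph remembers. The usage signature also accepts an explicit engine argument, and drop_unused_query_nodes controls whether query nodes are kept in the result. The following sketch shows both on the packaged example file; the variable names (engine, query, phenos_only) are hypothetical, and the printed result is left out because it depends on the data.

```r
library(monarchr)

# Example KGX file shipped with monarchr.
filename <- system.file("extdata", "eds_marfan_kg.tar.gz", package = "monarchr")
engine <- file_engine(filename)

# Build the query graph first, then pass the engine explicitly.
query <- fetch_nodes(engine, query_ids = "MONDO:0007525")

phenos_only <- expand(
  query,
  engine = engine,
  predicates = "biolink:has_phenotype",
  categories = "biolink:PhenotypicFeature",
  # Drop query nodes that are not needed to connect to the result nodes.
  drop_unused_query_nodes = TRUE
)

print(phenos_only)
```

Per the argument description, query nodes at the neighborhood boundary are kept even with drop_unused_query_nodes = TRUE, so a single query node that connects to the fetched phenotypes should remain in the result.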