Given an optional KG engine (e.g. a file_engine(), neo4j_engine(), or monarch_engine()) and a query tbl_kgx() graph, fetches additional nodes and edges from the KG, expanding the query graph according to specific criteria. If the first parameter is an engine, that engine is used; if the first parameter is a query graph, the most recent engine associated with the graph is used.
expand(
  graph,
  engine = NULL,
  direction = "both",
  predicates = NULL,
  categories = NULL,
  transitive = FALSE,
  drop_unused_query_nodes = FALSE,
  ...
)
graph: A query tbl_kgx() graph to query with.
engine: (Optional) An engine to use for fetching query graph edges. If not provided, the graph's most recent engine is used.
direction: The direction of associations to fetch. Can be "in", "out", or "both". Default is "both".
predicates: A vector of relationship predicates (nodes in graph are subjects in the KG), indicating which edges to consider in the neighborhood. If NULL (default), all edges are considered.
categories: A vector of node categories, indicating which nodes in the larger KG may be fetched. If NULL (default), all nodes in the larger KG may be fetched.
transitive: If TRUE, include the transitive closure of the neighborhood. Default is FALSE. Useful in combination with predicates like biolink:subclass_of.
drop_unused_query_nodes: If TRUE, remove query nodes from the result unless they are at the neighborhood boundary, i.e., required for connecting to the result nodes. Default is FALSE.
...: Other parameters passed to methods.
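The interaction of direction, predicates, and categories can be easier to see on a toy graph. The following base-R sketch is not monarchr's implementation; it shows one plausible reading of the argument descriptions above. The function toy_expand(), the node and edge tables, and every identifier in them are invented for illustration.

```r
# Toy knowledge graph: all identifiers below are invented for illustration.
nodes <- data.frame(
  id       = c("D:1", "P:1", "P:2", "G:1", "D:0"),
  category = c("Disease", "Phenotype", "Phenotype", "Gene", "Disease"),
  stringsAsFactors = FALSE
)
edges <- data.frame(
  subject   = c("D:1", "D:1", "G:1", "D:1"),
  predicate = c("has_phenotype", "has_phenotype", "causes", "subclass_of"),
  object    = c("P:1", "P:2", "D:1", "D:0"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: returns the edges a one-hop expansion would keep.
toy_expand <- function(query_ids, direction = "both",
                       predicates = NULL, categories = NULL) {
  out_edge <- edges$subject %in% query_ids  # query node is the subject
  in_edge  <- edges$object %in% query_ids   # query node is the object
  keep <- switch(direction,
    out  = out_edge,
    "in" = in_edge,
    both = out_edge | in_edge,
    stop("direction must be 'in', 'out', or 'both'")
  )
  # NULL means "no restriction", mirroring the documented defaults.
  if (!is.null(predicates)) {
    keep <- keep & edges$predicate %in% predicates
  }
  if (!is.null(categories)) {
    # Category filter applies to the node at the far end of each edge.
    other     <- ifelse(out_edge, edges$object, edges$subject)
    other_cat <- nodes$category[match(other, nodes$id)]
    keep <- keep & other_cat %in% categories
  }
  edges[keep, , drop = FALSE]
}

# Outgoing has_phenotype edges to Phenotype nodes only.
toy_expand("D:1", direction = "out",
           predicates = "has_phenotype", categories = "Phenotype")
```

In this toy, leaving predicates and categories as NULL keeps every edge touching the query node, while each non-NULL argument narrows the result further.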
A tbl_kgx() graph.
## Using Monarch (hosted)
phenos <- monarch_engine() |>
  fetch_nodes(query_ids = "MONDO:0007525") |>
  expand(predicates = "biolink:has_phenotype",
         categories = "biolink:PhenotypicFeature")
#> Trying to connect to https://neo4j.monarchinitiative.org
#> Connected to https://neo4j.monarchinitiative.org
#> Fetching; counting matching nodes...
#> total: 1.
#> Fetching; fetched 1 of 1
#> Expanding; counting matching edges...
#> total: 48.
#> Expanding; fetched 48 of 48 edges.
print(phenos)
#> # A tbl_graph: 49 nodes and 48 edges
#> #
#> # A rooted tree
#> #
#> # Node Data: 49 × 10 (active)
#> id pcategory name description synonym category iri xref namespace
#> <chr> <chr> <chr> <chr> <named> <list> <chr> <nam> <chr>
#> 1 MONDO:000… biolink:… Ehle… Arthrochal… <chr> <chr> http… <chr> MONDO
#> 2 HP:0000963 biolink:… Thin… Reduction … <chr> <chr> http… <chr> HP
#> 3 HP:0000974 biolink:… Hype… A conditio… <chr> <chr> http… <chr> HP
#> 4 HP:0001001 biolink:… Abno… NA <chr> <chr> http… <chr> HP
#> 5 HP:0001252 biolink:… Hypo… Hypotonia … <chr> <chr> http… <chr> HP
#> 6 HP:0001373 biolink:… Join… Displaceme… <chr> <chr> http… <chr> HP
#> 7 HP:0001385 biolink:… Hip … The presen… <chr> <chr> http… <chr> HP
#> 8 HP:0001387 biolink:… Join… Joint stif… <chr> <chr> http… <chr> HP
#> 9 HP:0002300 biolink:… Muti… Complete l… <chr> <chr> http… <chr> HP
#> 10 HP:0002381 biolink:… Apha… An acquire… <chr> <chr> http… <chr> HP
#> # ℹ 39 more rows
#> # ℹ 1 more variable: provided_by <named list>
#> #
#> # Edge Data: 48 × 23
#> from to subject predicate object knowledge_level negated
#> <int> <int> <chr> <chr> <chr> <chr> <lgl>
#> 1 1 2 MONDO:0007525 biolink:has_phenotype HP:00… knowledge_asse… TRUE
#> 2 1 3 MONDO:0007525 biolink:has_phenotype HP:00… knowledge_asse… TRUE
#> 3 1 4 MONDO:0007525 biolink:has_phenotype HP:00… knowledge_asse… TRUE
#> # ℹ 45 more rows
#> # ℹ 16 more variables: primary_knowledge_source <chr>,
#> # frequency_qualifier <chr>, original_subject <chr>, agent_type <chr>,
#> # knowledge_source <chr>, aggregator_knowledge_source <named list>,
#> # has_evidence <named list>, provided_by <named list>, id <chr>,
#> # category <named list>, has_total <chr>, has_quotient <chr>,
#> # has_count <chr>, has_percentage <chr>, publications <named list>, …
## Using example KGX file packaged with monarchr
filename <- system.file("extdata", "eds_marfan_kg.tar.gz", package = "monarchr")
phenos <- file_engine(filename) |>
  fetch_nodes(query_ids = "MONDO:0007525") |>
  expand(predicates = "biolink:has_phenotype",
         categories = "biolink:PhenotypicFeature")
print(phenos)
#> # A tbl_graph: 49 nodes and 48 edges
#> #
#> # A rooted tree
#> #
#> # Node Data: 49 × 16 (active)
#> id pcategory name symbol in_taxon_label description synonym category
#> <chr> <chr> <chr> <chr> <chr> <chr> <list> <list>
#> 1 MONDO:000… biolink:… Ehle… NA NA Arthrochal… <chr> <chr>
#> 2 HP:0000974 biolink:… Hype… NA NA A conditio… <chr> <chr>
#> 3 HP:0001382 biolink:… Join… NA NA The abilit… <chr> <chr>
#> 4 HP:0000023 biolink:… Ingu… NA NA Protrusion… <chr> <chr>
#> 5 HP:0000963 biolink:… Thin… NA NA Reduction … <chr> <chr>
#> 6 HP:0000978 biolink:… Brui… NA NA An ecchymo… <chr> <chr>
#> 7 HP:0001027 biolink:… Soft… NA NA A skin tex… <chr> <chr>
#> 8 HP:0001058 biolink:… Poor… NA NA A reduced … <chr> <chr>
#> 9 HP:0001075 biolink:… Atro… NA NA Scars that… <chr> <chr>
#> 10 HP:0001373 biolink:… Join… NA NA Displaceme… <chr> <chr>
#> # ℹ 39 more rows
#> # ℹ 8 more variables: iri <chr>, xref <list>, namespace <chr>,
#> # provided_by <chr>, in_taxon <chr>, full_name <chr>, type <list>,
#> # has_gene <chr>
#> #
#> # Edge Data: 48 × 25
#> from to subject predicate object agent_type knowledge_level
#> <int> <int> <chr> <chr> <chr> <chr> <chr>
#> 1 1 8 MONDO:0007525 biolink:has_pheno… HP:00… manual_ag… knowledge_asse…
#> 2 1 29 MONDO:0007525 biolink:has_pheno… HP:00… manual_ag… knowledge_asse…
#> 3 1 20 MONDO:0007525 biolink:has_pheno… HP:00… manual_ag… knowledge_asse…
#> # ℹ 45 more rows
#> # ℹ 18 more variables: knowledge_source <chr>,
#> # aggregator_knowledge_source <chr>, primary_knowledge_source <chr>,
#> # provided_by <chr>, id <chr>, category <chr>, original_object <chr>,
#> # original_subject <chr>, frequency_qualifier <chr>, has_evidence <chr>,
#> # has_total <dbl>, has_quotient <dbl>, has_count <dbl>, has_percentage <dbl>,
#> # onset_qualifier <chr>, publications <chr>, qualifiers <chr>, …
## Using MONDO KGX file (remote) as an example
phenos <- file_engine("https://kghub.io/kg-obo/mondo/2024-03-04/mondo_kgx_tsv.tar.gz") |>
  fetch_nodes(query_ids = "MONDO:0007525") |>
  expand(predicates = "biolink:has_phenotype",
         categories = "biolink:PhenotypicFeature")
print(phenos)
#> # A tbl_graph: 1 nodes and 0 edges
#> #
#> # A rooted tree
#> #
#> # Node Data: 1 × 11 (active)
#> id pcategory name description synonym category xref provided_by iri
#> <chr> <chr> <chr> <chr> <list> <list> <lis> <chr> <chr>
#> 1 MONDO:00… biolink:… Ehle… Arthrochal… <chr> <chr> <chr> mondo.json http…
#> # ℹ 2 more variables: same_as <list>, subsets <list>
#> #
#> # Edge Data: 0 × 5
#> # ℹ 5 variables: from <int>, to <int>, subject <chr>, predicate <chr>,
#> # object <chr>
file.remove("mondo_kgx_tsv.tar.gz") # cleanup - remove the downloaded file
#> [1] TRUE
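The examples above rely on the engine remembered by the query graph and use the default direction. As a further sketch, the call below passes the engine explicitly through the engine argument and combines direction = "out" with transitive = TRUE and biolink:subclass_of, the combination the transitive argument is described as useful for. It uses only the arguments listed in the usage and the example file packaged with monarchr; whether that file contains subclass_of edges for this node is an assumption, so no output is shown.

```r
library(monarchr)

# Example KGX file packaged with monarchr (same file as used above).
filename <- system.file("extdata", "eds_marfan_kg.tar.gz", package = "monarchr")
engine <- file_engine(filename)

# Start from a single query node.
query <- fetch_nodes(engine, query_ids = "MONDO:0007525")

# Pass the engine explicitly instead of relying on the graph's most
# recent engine, and follow outgoing subclass_of edges transitively.
ancestors <- expand(query,
                    engine = engine,
                    direction = "out",
                    predicates = "biolink:subclass_of",
                    transitive = TRUE)

print(ancestors)
```

Passing engine explicitly may be helpful when a query graph fetched from one engine should be expanded against another.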