R/neo4j_engine.R
neo4j_engine.Rd
Creates a knowledge graph engine backed by a neo4j database, from a URL and optional username and password. Knowledge graph "engines" are objects that store information about how to connect to a (potentially large) knowledge graph, and can be used to fetch nodes and edges from the database as local graph objects.
neo4j_engine(
  url,
  username = NA,
  password = NA,
  preferences = NULL,
  timeout = 1,
  cache = TRUE,
  ...
)
url: A character string indicating the URL of the neo4j database. If given a vector, each URL is tried in sequence; if one times out (see timeout) or fails, the next is tried.
username: A character string indicating the username for the neo4j database (if needed).
password: A character string indicating the password for the neo4j database (if needed).
preferences: A named list of preferences for the engine.
timeout: Number of seconds to wait before trying the next URL.
cache: Whether to cache query results in memory for the duration of the R session.
...: Additional arguments passed to neo2R::startGraph().
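The "try each URL in sequence" behavior described for url can be pictured with a small base R sketch. This is only an illustration under stated assumptions, not the package's implementation: first_reachable() and fake_connect() are hypothetical names, the example.org URLs are placeholders, and the timeout itself is not modeled (a timed-out attempt is simply represented as an error).

```r
# Hypothetical helper: try each URL in order and return the first one
# for which `connect` succeeds. `connect` stands in for whatever actually
# opens the database connection.
first_reachable <- function(urls, connect) {
  for (u in urls) {
    # A failed or timed-out attempt is modeled as an error, caught here.
    conn <- tryCatch(connect(u), error = function(e) NULL)
    if (!is.null(conn)) {
      return(list(url = u, conn = conn))
    }
  }
  stop("could not connect to any of the supplied URLs")
}

# Stub connector for illustration only: just the backup URL "works".
fake_connect <- function(u) {
  if (u != "https://backup.example.org") stop("timed out")
  paste("connection to", u)
}

res <- first_reachable(
  c("https://primary.example.org", "https://backup.example.org"),
  fake_connect
)
res$url
```

With a real engine, the equivalent call would simply pass the vector of URLs as url and let timeout control how long each attempt waits.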
An object of class neo4j_engine
Engines store preference information specifying how data are fetched and manipulated. For example, while node category is multi-valued (nodes may have multiple categories, for example "biolink:Gene" and "biolink:NamedThing"), typically a single category is used to represent the node in a graph, and is returned as the node's pcategory. A preference list of categories to use for pcategory is stored in the engine's preferences. A default set of preferences is stored in the package for use with KGX (BioLink-compatible) graphs (see https://github.com/biolink/kgx/blob/master/specification/kgx-format.md), but these can be overridden by the user.
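To make the idea of a category preference list concrete, here is a minimal base R sketch of how a single pcategory could be chosen from a node's multi-valued category. It is an assumption-laden illustration rather than the package's code: pick_pcategory() is a hypothetical helper, the ordering of the preference vector is invented for the example, and falling back to the node's first category when nothing matches is an assumption.

```r
# Hypothetical helper: return the first entry of `preferred` that the node
# actually has; fall back to the node's first category if none match
# (the fallback rule is an assumption made for this sketch).
pick_pcategory <- function(categories, preferred) {
  hits <- preferred[preferred %in% categories]
  if (length(hits) > 0) hits[[1]] else categories[[1]]
}

# Illustrative preference order: more specific categories come first.
preferred <- c("biolink:Gene", "biolink:Disease", "biolink:NamedThing")

# A node carrying both a specific and a generic category.
gene_pcat <- pick_pcategory(c("biolink:NamedThing", "biolink:Gene"), preferred)

# A node whose only category is not in the preference list.
other_pcat <- pick_pcategory(c("biolink:Other"), preferred)
```

Placing specific categories ahead of generic ones such as "biolink:NamedThing" is what keeps the generic category from being used whenever a more informative one is available.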
For neo4j_engine()s, preferences are also used to set the node properties to search when using search_nodes(), defaulting to regex-based searches on id, name, and description. (monarch_engine() is a type of neo4j_engine() with the URL set to the Monarch Neo4j instance; it overrides search_nodes() to use the Monarch search API. See monarch_engine() for details.)
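As a rough picture of what a regex-based search over id, name, and description means, the following base R sketch filters a small local table of made-up nodes. It does not reproduce how search_nodes() queries the database: search_local() is a hypothetical helper, the node data are toy values, and case-insensitive matching is an assumption of the sketch.

```r
# Toy node table; the ids, names, and descriptions are invented.
nodes <- data.frame(
  id = c("EX:1", "EX:2", "EX:3"),
  name = c("Marfan syndrome", "fibrillin 1", "cystic fibrosis"),
  description = c(
    "a connective tissue disorder",
    "a gene linked to Marfan syndrome",
    "a disorder affecting the lungs"
  ),
  stringsAsFactors = FALSE
)

# Hypothetical helper: keep rows where `pattern` matches any listed field.
search_local <- function(nodes, pattern,
                         fields = c("id", "name", "description")) {
  per_field <- lapply(fields, function(f) {
    grepl(pattern, nodes[[f]], ignore.case = TRUE)
  })
  # Combine the per-field logical vectors with elementwise OR.
  hit <- Reduce(`|`, per_field)
  nodes[hit, , drop = FALSE]
}

marfan_hits <- search_local(nodes, "marfan")
name_only_hits <- search_local(nodes, "marfan", fields = "name")
```

Restricting fields narrows the search in the same spirit as changing which node properties the engine's preferences list for searching.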
library(tidygraph)
library(dplyr)
engine <- neo4j_engine(url = "https://neo4j.monarchinitiative.org")
#> Trying to connect to https://neo4j.monarchinitiative.org
#> Connected to https://neo4j.monarchinitiative.org
res <- engine |> fetch_nodes(query_ids = c("MONDO:0007522", "MONDO:0007947"))
#> Fetching; counting matching nodes...
#> total: 2.
#> Fetching; fetched 2 of 2
print(res)
#> # A tbl_graph: 2 nodes and 0 edges
#> #
#> # A rooted forest with 2 trees
#> #
#> # Node Data: 2 × 10 (active)
#> id category pcategory name description synonym iri xref namespace
#> <chr> <list> <chr> <chr> <chr> <named> <chr> <nam> <chr>
#> 1 MONDO:0007… <chr> biolink:… Ehle… Ehlers-Dan… <chr> http… <chr> MONDO
#> 2 MONDO:0007… <chr> biolink:… Marf… A disorder… <chr> http… <chr> MONDO
#> # ℹ 1 more variable: provided_by <named list>
#> #
#> # Edge Data: 0 × 5
#> # ℹ 5 variables: from <int>, to <int>, subject <chr>, predicate <chr>,
#> # object <chr>