Creates a knowledge graph engine backed by a Neo4j database, given a URL and an optional username and password. Knowledge graph "engines" are objects that store information about how to connect to a (potentially large) knowledge graph, and can be used to fetch nodes and edges from the database as local graph objects.

neo4j_engine(
  url,
  username = NA,
  password = NA,
  preferences = NULL,
  timeout = 1,
  cache = TRUE,
  ...
)

Arguments

url

A character string indicating the URL of the Neo4j database. If a vector is given, each URL is tried in sequence; if a URL times out (see timeout) or fails, the next one is tried.

username

A character string indicating the username for the neo4j database (if needed).

password

A character string indicating the password for the neo4j database (if needed).

preferences

A named list of preferences for the engine.

timeout

Number of seconds to wait before trying the next URL.

cache

Whether to cache query results in memory for the length of the R session.

...

Additional arguments passed to neo2R::startGraph().
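
The fallback behaviour described for url can be pictured with a small base R sketch. This is not the package's implementation: try_urls(), fake_connect(), and the example.org URLs are hypothetical names introduced only for illustration, and the per-URL timeout is not modelled here.

```r
# Hypothetical helper: try each URL in order and return the first one for
# which `connect` succeeds. `connect` stands in for whatever function
# actually opens a connection; it is an assumption, not a package function.
try_urls <- function(urls, connect) {
  for (u in urls) {
    # Treat any error raised by `connect` as a failed attempt.
    result <- tryCatch(connect(u), error = function(e) NULL)
    if (!is.null(result)) {
      return(list(url = u, connection = result))
    }
  }
  stop("Could not connect to any of the supplied URLs.")
}

# A fake connector that only "succeeds" for the backup URL.
fake_connect <- function(u) {
  if (u != "https://backup.example.org") stop("connection failed")
  paste("connected to", u)
}

found <- try_urls(
  c("https://primary.example.org", "https://backup.example.org"),
  fake_connect
)
found$url
```

With a real engine, the equivalent call would simply pass a character vector of URLs as the url argument.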

Value

An object of class neo4j_engine.

Details

Engines store preference information specifying how data are fetched and manipulated. For example, node category is multi-valued (a node may have several categories, such as "biolink:Gene" and "biolink:NamedThing"), but typically a single category is used to represent the node in a graph; this is returned as the node's pcategory. A preference-ordered list of categories to use for pcategory is stored in the engine's preferences. A default set of preferences is stored in the package for use with KGX (BioLink-compatible) graphs (see https://github.com/biolink/kgx/blob/master/specification/kgx-format.md), but these can be overridden by the user.
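
As a rough illustration of how a preference-ordered category list could yield a single pcategory, consider the base R sketch below. The names category_priority and pick_pcategory(), the ordering shown, and the fallback to the first category are all assumptions made for this example; they are not taken from the package's defaults or internals.

```r
# Hypothetical preference list, most preferred first (not the package defaults).
category_priority <- c("biolink:Disease", "biolink:Gene", "biolink:NamedThing")

# Return the most preferred category found among a node's categories.
# Assumption: if none of the node's categories is in the preference list,
# fall back to the node's first category.
pick_pcategory <- function(categories, priority) {
  hits <- priority[priority %in% categories]
  if (length(hits) > 0) hits[1] else categories[1]
}

gene_pcat <- pick_pcategory(
  c("biolink:NamedThing", "biolink:Gene"),
  category_priority
)
gene_pcat
```

Here the node carries both "biolink:NamedThing" and "biolink:Gene", and the more specific category wins because it appears earlier in the preference list.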

For neo4j_engine()s, preferences are also used to set the node properties that search_nodes() searches, defaulting to regex-based searches on id, name, and description. (The monarch_engine() is a type of neo4j_engine() with the URL set to the Monarch Neo4j instance; it overrides search_nodes() to use the Monarch search API. See monarch_engine() for details.)
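
The idea of a regex search across several node properties can be sketched locally in base R. This only mimics the concept on a toy data frame: the node data, the regex_search() helper, and the search_props vector are hypothetical, and the real search is performed against the database rather than in memory.

```r
# Toy node table with invented identifiers and text.
nodes <- data.frame(
  id = c("EX:001", "EX:002", "EX:003"),
  name = c("alpha syndrome", "beta disease", "gamma trait"),
  description = c("A toy disorder.", "Related to alpha syndrome.", "Unrelated."),
  stringsAsFactors = FALSE
)

# Properties to search, mirroring the defaults named in the text.
search_props <- c("id", "name", "description")

# Keep rows where the pattern matches at least one of the listed properties.
regex_search <- function(nodes, pattern, props) {
  per_prop <- lapply(props, function(p) grepl(pattern, nodes[[p]]))
  keep <- Reduce("|", per_prop)
  nodes[keep, , drop = FALSE]
}

hits <- regex_search(nodes, "alpha", search_props)
hits$id
```

Restricting search_props to "name" alone would return only the first node, which is the kind of behaviour the preferences are described as controlling.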

Examples

library(tidygraph)
library(dplyr)

engine <- neo4j_engine(url = "https://neo4j.monarchinitiative.org")
#> Trying to connect to https://neo4j.monarchinitiative.org
#> Connected to https://neo4j.monarchinitiative.org
res <- engine |> fetch_nodes(query_ids = c("MONDO:0007522", "MONDO:0007947"))
#> Fetching; counting matching nodes... 
#>  total: 2.
#> Fetching; fetched 2 of 2
print(res)
#> # A tbl_graph: 2 nodes and 0 edges
#> #
#> # A rooted forest with 2 trees
#> #
#> # Node Data: 2 × 10 (active)
#>   id          category pcategory name  description synonym iri   xref  namespace
#>   <chr>       <list>   <chr>     <chr> <chr>       <named> <chr> <nam> <chr>    
#> 1 MONDO:0007… <chr>    biolink:… Ehle… Ehlers-Dan… <chr>   http… <chr> MONDO    
#> 2 MONDO:0007… <chr>    biolink:… Marf… A disorder… <chr>   http… <chr> MONDO    
#> # ℹ 1 more variable: provided_by <named list>
#> #
#> # Edge Data: 0 × 5
#> # ℹ 5 variables: from <int>, to <int>, subject <chr>, predicate <chr>,
#> #   object <chr>