Given a KG engine, returns a graph that illustrates the diversity of node categories and edge predicates in the KG, for browsing. The returned graph is guaranteed to contain at least one node of every category and at least one edge of every predicate. No other guarantees are made: the example graph is not the minimal graph that satisfies these criteria, it is not random or even pseudo-random, and it may not be connected.
example_graph(engine, ...)
A tbl_kgx graph
# Using example KGX file packaged with monarchr
filename <- system.file("extdata", "eds_marfan_kg.tar.gz", package = "monarchr")
# builds an example graph from the file-based engine and prints it
g <- file_engine(filename) |> example_graph()
print(g)
#> # A tbl_graph: 12 nodes and 10 edges
#> #
#> # A directed acyclic multigraph with 3 components
#> #
#> # Node Data: 12 × 16 (active)
#> id pcategory name symbol in_taxon_label description synonym category
#> <chr> <chr> <chr> <chr> <chr> <chr> <list> <list>
#> 1 MONDO:000… biolink:… Marf… NA NA A disorder… <chr> <chr>
#> 2 MONDO:002… biolink:… Ehle… NA NA The Ehlers… <chr> <chr>
#> 3 MONDO:000… biolink:… Ehle… NA NA Ehlers-Dan… <chr> <chr>
#> 4 MONDO:001… biolink:… Ehle… NA NA A form of … <chr> <chr>
#> 5 ZFIN:ZDB-… biolink:… b3ga… NA Danio rerio NA <chr> <chr>
#> 6 HGNC:3603 biolink:… FBN1 FBN1 Homo sapiens NA <chr> <chr>
#> 7 HP:0000974 biolink:… Hype… NA NA A conditio… <chr> <chr>
#> 8 HP:0000007 biolink:… Auto… NA NA A mode of … <chr> <chr>
#> 9 MONDO:002… biolink:… rare NA NA A disease … <chr> <chr>
#> 10 CHEBI:508… biolink:… doxy… NA NA Tetracycli… <chr> <chr>
#> 11 CLINVAR:2… biolink:… NM_0… NA Homo sapiens NA <chr> <chr>
#> 12 CLINVAR:2… biolink:… NM_0… NA Homo sapiens NA <chr> <chr>
#> # ℹ 8 more variables: iri <chr>, xref <list>, namespace <chr>,
#> # provided_by <chr>, in_taxon <chr>, full_name <chr>, type <list>,
#> # has_gene <chr>
#> #
#> # Edge Data: 10 × 25
#> from to subject predicate object primary_knowledge_so…¹ agent_type
#> <int> <int> <chr> <chr> <chr> <chr> <chr>
#> 1 11 1 CLINVAR:200198 biolink:a… MONDO… infores:clingen manual_ag…
#> 2 6 1 HGNC:3603 biolink:c… MONDO… infores:omim manual_ag…
#> 3 6 1 HGNC:3603 biolink:g… MONDO… infores:orphanet manual_ag…
#> # ℹ 7 more rows
#> # ℹ abbreviated name: ¹primary_knowledge_source
#> # ℹ 18 more variables: knowledge_level <chr>, knowledge_source <chr>,
#> # aggregator_knowledge_source <chr>, provided_by <chr>, id <chr>,
#> # category <chr>, original_object <chr>, original_subject <chr>,
#> # frequency_qualifier <chr>, has_evidence <chr>, has_total <dbl>,
#> # has_quotient <dbl>, has_count <dbl>, has_percentage <dbl>, …
# same, using the hosted Monarch KG (requires a network connection)
g <- monarch_engine() |> example_graph()
#> Trying to connect to https://neo4j.monarchinitiative.org
#> Connected to https://neo4j.monarchinitiative.org
print(g)
#> # A tbl_graph: 63 nodes and 37 edges
#> #
#> # A directed acyclic multigraph with 27 components
#> #
#> # Node Data: 63 × 16 (active)
#> id category pcategory name symbol in_taxon_label description synonym
#> <chr> <list> <chr> <chr> <chr> <chr> <chr> <list>
#> 1 GO:0006493 <chr> biolink:… prot… NA NA A protein … <chr>
#> 2 RGD:62060 <chr> biolink:… Ogt Ogt Rattus norveg… NA <chr>
#> 3 NCBIGene:… <chr> biolink:… OGT OGT Canis lupus f… O-linked N… <lgl>
#> 4 GO:0000123 <chr> biolink:… hist… NA NA A protein … <chr>
#> 5 RGD:3982 <chr> biolink:… Yy1 Yy1 Rattus norveg… NA <chr>
#> 6 GO:0010467 <chr> biolink:… gene… NA NA The proces… <lgl>
#> 7 MONDO:010… <chr> biolink:… over… NA NA A disease … <chr>
#> 8 CLINVAR:1… <chr> biolink:… NM_0… NA Homo sapiens NA <lgl>
#> 9 OBA:20500… <chr> biolink:… seru… NA NA The amount… <chr>
#> 10 HP:0003542 <chr> biolink:… Incr… NA NA The concen… <chr>
#> # ℹ 53 more rows
#> # ℹ 8 more variables: iri <chr>, xref <list>, namespace <chr>,
#> # provided_by <list>, full_name <chr>, in_taxon <list>, type <list>,
#> # has_gene <list>
#> #
#> # Edge Data: 37 × 22
#> from to subject predicate object primary_knowledge_so…¹ knowledge_level
#> <int> <int> <chr> <chr> <chr> <chr> <chr>
#> 1 2 1 RGD:62060 biolink:a… GO:00… infores:rgd knowledge_asse…
#> 2 2 3 RGD:62060 biolink:o… NCBIG… infores:panther knowledge_asse…
#> 3 2 4 RGD:62060 biolink:p… GO:00… infores:rgd knowledge_asse…
#> # ℹ 34 more rows
#> # ℹ abbreviated name: ¹primary_knowledge_source
#> # ℹ 15 more variables: negated <lgl>, species_context_qualifier <chr>,
#> # agent_type <chr>, knowledge_source <chr>,
#> # aggregator_knowledge_source <list>, has_evidence <list>,
#> # provided_by <list>, id <chr>, category <list>, publications <list>,
#> # original_object <chr>, qualifiers <list>, original_predicate <chr>, …
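To see which categories and predicates an example graph covers, the node and edge tables can be tallied. The following is a minimal sketch, not part of the package's documented examples: it assumes the tidygraph and dplyr packages are installed, that the returned tbl_kgx graph works with the usual tidygraph verbs, and that the `pcategory` column shown in the output above holds each node's primary category. The names `node_counts` and `edge_counts` are illustrative only.

```r
library(monarchr)
library(tidygraph)
library(dplyr)

# Example KGX file packaged with monarchr (same file as in the examples above)
filename <- system.file("extdata", "eds_marfan_kg.tar.gz", package = "monarchr")
g <- file_engine(filename) |> example_graph()

# Tally nodes by pcategory (assumed here to be the primary category)
node_counts <- g |>
  activate(nodes) |>
  as_tibble() |>
  count(pcategory, sort = TRUE)

# Tally edges by predicate
edge_counts <- g |>
  activate(edges) |>
  as_tibble() |>
  count(predicate, sort = TRUE)

print(node_counts)
print(edge_counts)
```

Because nodes can carry several categories in the `category` list column, a tally on `pcategory` alone may list fewer categories than the graph actually covers.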