Summarizes a KGX file-based knowledge graph (KG) engine: total node and edge counts, node counts per category, edge counts per predicate (relationship type), and the available properties. The returned summary object prints a readable console report and also holds this information as data frames. It additionally contains cats, preds, and props entries: lists of the available categories, predicates, and properties, convenient for auto-completion in RStudio.

# S3 method for class 'file_engine'
summary(object, ..., quiet = FALSE)

Arguments

object

A file_engine object

...

Other parameters (not used)

quiet

Logical; if TRUE, suppress printing of the summary. Defaults to FALSE.

Value

A classed list of data frames and named lists.

Details

For a file_engine, the summary also includes node-specific and edge-specific property counts.
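
The cats, preds, and props entries are described as lists intended for auto-completion. As a rough illustration of why a named list helps here, the base R sketch below builds a cats-style list from a small stand-in data frame. The data frame, the variable names cat_counts and cats, and the construction itself are assumptions made for illustration; this is not necessarily how monarchr builds these entries, and the two rows are copied from the example report on this page.

```r
# Stand-in for a "Node category counts" table; two rows copied from the
# example report. This data frame is hypothetical, not produced by monarchr.
cat_counts <- data.frame(
  category = c("biolink:Disease", "biolink:Gene"),
  count = c(110, 23),
  stringsAsFactors = FALSE
)

# Assumed construction of a cats-style entry: a list whose names and
# values are both the category strings, so `$` completion can offer them.
cats <- as.list(setNames(cat_counts$category, cat_counts$category))

# Names containing ":" must be backtick-quoted when accessed with `$`.
cats$`biolink:Gene`

# The available categories are the names of the list.
names(cats)
```

With a list shaped like this, typing `cats$` in an editor that supports completion offers the category names, which avoids retyping long identifiers such as biolink:ChemicalEntityOrGeneOrGeneProduct by hand.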

Examples

# Using example KGX file packaged with monarchr
data(eds_marfan_kg)

# prints a readable summary and returns a classed list of data frames
res <- eds_marfan_kg |> summary()
print(res)
#> 
#> A KGX file-backed knowledge graph engine.
#> Total nodes: 3000
#> Total edges: 7148
#> 
#> Node category counts:
#>                                        category count
#> 1                                biolink:Entity  3000
#> 2                            biolink:NamedThing  3000
#> 3                      biolink:BiologicalEntity  2977
#> 4                        biolink:ThingWithTaxon  2977
#> 5            biolink:DiseaseOrPhenotypicFeature  2846
#> 6                     biolink:PhenotypicFeature  2736
#> 7                       biolink:PhysicalEssence   145
#> 8            biolink:PhysicalEssenceOrOccurrent   145
#> 9                         biolink:GenomicEntity   131
#> 10                        biolink:OntologyClass   131
#> 11                              biolink:Disease   110
#> 12                      biolink:SequenceVariant    81
#> 13    biolink:ChemicalEntityOrGeneOrGeneProduct    37
#> 14                             biolink:Genotype    27
#> 15                                 biolink:Gene    23
#> 16                    biolink:GeneOrGeneProduct    23
#> 17           biolink:MacromolecularMachineMixin    23
#> 18                       biolink:ChemicalEntity    14
#> 19 biolink:ChemicalEntityOrProteinOrPolypeptide    14
#> 20            biolink:ChemicalOrDrugOrTreatment    14
#> 21                      biolink:MolecularEntity    13
#> 
#> Edge type counts:
#>                                          predicate count
#> 1                              biolink:subclass_of  5244
#> 2                            biolink:has_phenotype  1709
#> 3                                   biolink:causes    56
#> 4  biolink:associated_with_increased_likelihood_of    38
#> 5           biolink:gene_associated_with_condition    28
#> 6                                 biolink:model_of    27
#> 7                  biolink:has_mode_of_inheritance    26
#> 8              biolink:genetically_associated_with    11
#> 9                               biolink:related_to     8
#> 10   biolink:treats_or_applied_or_studied_to_treat     1
#> 
#> Node property counts:
#>          property count
#> 16      pcategory  3000
#> 11    provided_by  3000
#> 7        category  3000
#> 1              id  3000
#> 2            name  2999
#> 10      namespace  2992
#> 8             iri  2868
#> 5     description  2668
#> 6         synonym  2320
#> 9            xref  1469
#> 12       in_taxon   120
#> 4  in_taxon_label   120
#> 15       has_gene    69
#> 14           type    33
#> 13      full_name    23
#> 3          symbol    23
#> 
#> Edge property counts:
#>                       property count
#> 13                    category  7148
#> 12                          id  7148
#> 11                 provided_by  7148
#> 10    primary_knowledge_source  7148
#> 9  aggregator_knowledge_source  7148
#> 8             knowledge_source  7148
#> 7              knowledge_level  7148
#> 6                   agent_type  7148
#> 5                       object  7148
#> 4                    predicate  7148
#> 3                      subject  7148
#> 2                           to  7148
#> 1                         from  7148
#> 15            original_subject  1782
#> 17                has_evidence  1729
#> 16         frequency_qualifier  1080
#> 23                publications   537
#> 18                   has_total   463
#> 21              has_percentage   453
#> 20                   has_count   453
#> 19                has_quotient   453
#> 25          original_predicate    81
#> 14             original_object    80
#> 24                  qualifiers    62
#> 22             onset_qualifier    21
#> 
#> For more information about Biolink node (Class) and edge
#> (Association) properties, see https://biolink.github.io/biolink-model/.