OntoGPT is run from the command line. See the full list of commands with:
For a simple example of text completion, and to check that OntoGPT is set up correctly, create a text file containing the following and save it as example.txt:
Why did the squid cross the coral reef?
Then try the following command:
ontogpt complete example.txt
You should get text output like the following:
Perhaps the squid crossed the coral reef for a variety of reasons:
1. Food: Squids are known to feed on small fish and other marine organisms, and there could have been a rich food source on the other side of the reef.
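The setup check above can also be scripted. The following is a minimal sketch, not part of OntoGPT itself: the helper names write_prompt and run_complete are made up for illustration, and the only OntoGPT command used is the `ontogpt complete` call shown above. If the CLI is not on your PATH, the helper returns None rather than failing.

```python
import shutil
import subprocess
from pathlib import Path
from typing import Optional


def write_prompt(path: Path, text: str) -> Path:
    """Write the prompt text to a file (hypothetical helper)."""
    path.write_text(text + "\n", encoding="utf-8")
    return path


def run_complete(path: Path) -> Optional[str]:
    """Run `ontogpt complete <path>` and return its stdout.

    Returns None when the `ontogpt` executable cannot be found, so the
    script degrades gracefully on machines without OntoGPT installed.
    """
    if shutil.which("ontogpt") is None:
        return None
    result = subprocess.run(
        ["ontogpt", "complete", str(path)],
        capture_output=True,
        text=True,
        check=False,
    )
    return result.stdout


if __name__ == "__main__":
    prompt_file = write_prompt(
        Path("example.txt"), "Why did the squid cross the coral reef?"
    )
    output = run_complete(prompt_file)
    if output is None:
        print("ontogpt was not found on PATH; is it installed?")
    else:
        print(output)
```

Because the completion comes from a language model, the text you get back will differ from run to run.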
OntoGPT is intended to be used for information extraction. The following examples show how to accomplish this.
Strategy 1: Knowledge extraction using SPIRES
- You provide an arbitrary data model, describing the structure you want to extract text into. This can be nested (but see limitations below). The predefined templates may be used.
- Provide your preferred annotations for grounding
- OntoGPT will:
  - Generate a prompt
  - Feed the prompt to a language model
  - Parse the results into a dictionary structure
  - Ground the results using a preferred annotator (e.g., an ontology)
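To make the four steps concrete, here is a toy sketch of the same flow. It is not OntoGPT's implementation: every function name, the prompt wording, the stand-in model, and the EX: identifier are assumptions made up for illustration. A real run would call an actual language model and a real annotator.

```python
from typing import Callable, Dict, List


def generate_prompt(fields: List[str], text: str) -> str:
    """Step 1: build a prompt listing the fields of a (flat) data model."""
    lines = ["Extract the following fields from the text, one 'field: value' per line.", ""]
    lines += [f"{name}: <value>" for name in fields]
    lines += ["", "Text:", text]
    return "\n".join(lines)


def parse_response(response: str, fields: List[str]) -> Dict[str, str]:
    """Step 3: parse 'field: value' lines into a dictionary."""
    parsed: Dict[str, str] = {}
    for line in response.splitlines():
        if ":" not in line:
            continue
        key, value = line.split(":", 1)
        if key.strip() in fields:
            parsed[key.strip()] = value.strip()
    return parsed


def ground(parsed: Dict[str, str], annotator: Dict[str, str]) -> Dict[str, str]:
    """Step 4: replace values with identifiers when the annotator knows them."""
    return {key: annotator.get(value.lower(), value) for key, value in parsed.items()}


def extract(text: str, fields: List[str],
            model: Callable[[str], str], annotator: Dict[str, str]) -> Dict[str, str]:
    prompt = generate_prompt(fields, text)   # step 1
    response = model(prompt)                 # step 2: feed the prompt to a model
    return ground(parse_response(response, fields), annotator)  # steps 3 and 4


def fake_model(prompt: str) -> str:
    """Stand-in for a language model; returns a canned answer."""
    return "gene: US3\nprocess: nuclear translocation\n"


# Hypothetical lookup table; EX:0000001 is a made-up identifier.
TOY_ANNOTATOR = {"nuclear translocation": "EX:0000001"}

if __name__ == "__main__":
    print(extract("US3 blocks nuclear translocation.",
                  ["gene", "process"], fake_model, TOY_ANNOTATOR))
```

Values the toy annotator does not recognize are left as plain text, which is one simple way to handle terms that cannot be grounded.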
Consider some text from one of the input files used in the OntoGPT test suite. You can download the raw file from its GitHub page, or copy its contents into another file, say, abstract.txt. An excerpt:
The cGAS/STING-mediated DNA-sensing signaling pathway is crucial for interferon (IFN) production and host antiviral responses
... [snip] ...
The underlying mechanism was the interaction of US3 with β-catenin and its hyperphosphorylation of β-catenin at Thr556 to block its nuclear translocation ... ...
We can extract knowledge from the above text into the GO pathway data model by running the following command:
ontogpt extract -t gocam.GoCamAnnotations -i ~/path/to/abstract.txt
Note: The value accepted by the --template (-t) argument is the base name of one of the LinkML schemas / data models, which can be found in the templates folder.
The output returned from the above command can be optionally redirected into an output file using the
The following is a small part of what the larger schema-compliant output looks like:
- gene1: US3
- gene: HGNC:2514
- gene: HGNC:2514
- gene: HGNC:21367
To use a local model, specify it with the -m option:
ontogpt extract -t drug -i ~/path/to/abstract.txt -m nous-hermes-13b
See the list of all available models with this command:
When specifying a local model for the first time, it will be downloaded to your local system.
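If you run extractions from a script, the commands above can be assembled programmatically. This is a hedged sketch: build_extract_argv and run_to_file are hypothetical helper names, and the only flags used are the -t, -i, and -m options that appear in the commands above. Output is captured by redirecting the process's standard output into a file, which works independently of any OntoGPT option.

```python
import shlex
import subprocess
from pathlib import Path
from typing import List, Optional


def build_extract_argv(template: str, input_path: str,
                       model: Optional[str] = None) -> List[str]:
    """Assemble an `ontogpt extract` command line (hypothetical helper)."""
    argv = ["ontogpt", "extract", "-t", template,
            "-i", str(Path(input_path).expanduser())]
    if model is not None:
        argv += ["-m", model]  # e.g. a local model name
    return argv


def run_to_file(argv: List[str], out_path: str) -> int:
    """Run a command, writing its standard output to out_path.

    Returns the process exit code. Requires the executable named in
    argv[0] to be installed and on PATH.
    """
    with open(out_path, "w", encoding="utf-8") as handle:
        return subprocess.run(argv, stdout=handle, check=False).returncode


if __name__ == "__main__":
    # Print the commands rather than running them, since a real run
    # needs OntoGPT installed and a model available.
    print(shlex.join(build_extract_argv("gocam.GoCamAnnotations", "abstract.txt")))
    print(shlex.join(build_extract_argv("drug", "abstract.txt", model="nous-hermes-13b")))
```

Building the argument list separately from running it keeps the command easy to inspect or log before any model is invoked.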
Strategy 2: Gene Enrichment using SPINDOCTOR
Given a set of genes, OntoGPT can find similarities among them.
ontogpt enrichment -U tests/input/genesets/sensory-ataxia.yaml
The default is to use ontological gene function synopses (via the Alliance API).
- To use narrative/RefSeq summaries, use the
- To run without any gene descriptions, use the
This strategy does not currently support using local models.