ontorunner package
Subpackages
Submodules
ontorunner.oger_module module
Run OGER.
- ontorunner.oger_module.run_oger(content='data/input', termlist='data/terms/DICT.tsv', output='data/output/', output_format='tsv', settings='settings.ini', workers=1, nodes_and_edges='/home/runner/work/ontorunner/ontorunner/data/nodes_and_edges', need_ancestors=False) None
- Run OGER. - Parameters
- content – Input file OR folder containing txt files. 
- termlist – Path to the dictionary (TSV format). 
- output – Path to save the output file. 
- output_format – Output format (default: tsv). 
- settings – Path to the settings file (default: 'settings.ini'). If this is provided, all other arguments are read from this file and are hence optional. Change this file according to project needs. 
- workers – Number of parallel threads (default: 1). 
- nodes_and_edges – Directory containing the KGX nodes and edges TSV files. 
- need_ancestors – Whether ancestors should be present in the output (default: False). 
- Returns
- None. 
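The way a settings file can supply the remaining arguments can be sketched with the standard-library `configparser`. This is only an illustration under stated assumptions: the section name `Main`, the key names, and the helper `resolve_options` are hypothetical, and the actual layout of ontorunner's settings.ini may differ. The defaults are copied from the `run_oger` signature above.

```python
import configparser

# Defaults copied from the run_oger signature above.
DEFAULTS = {
    "content": "data/input",
    "termlist": "data/terms/DICT.tsv",
    "output": "data/output/",
    "output_format": "tsv",
    "workers": "1",
}

# Hypothetical settings content; the real settings.ini layout may differ.
EXAMPLE_INI = """
[Main]
content = corpus/abstracts
workers = 4
"""


def resolve_options(ini_text: str, section: str = "Main") -> dict:
    """Overlay values found in an INI section on top of the defaults."""
    parser = configparser.ConfigParser()
    parser.read_string(ini_text)
    options = dict(DEFAULTS)
    if parser.has_section(section):
        # Only known keys are taken over; unknown keys are ignored.
        for key in DEFAULTS:
            if parser.has_option(section, key):
                options[key] = parser.get(section, key)
    return options


options = resolve_options(EXAMPLE_INI)
print(options["content"], options["workers"], options["termlist"])
```

Values read this way are strings, so a numeric option such as `workers` would still need converting with `int()` before use.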
ontorunner.spacy_module module
Run spaCy.
- ontorunner.spacy_module.explode_df(df: DataFrame) DataFrame
- Explode multiple DataFrames in a single row into multiple rows. - Parameters
- df – Dataframe to be exploded. 
- Returns
- Exploded DataFrame where each row corresponds to a row in the DataFrame. 
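The general idea of "exploding" a row can be illustrated in plain Python, without pandas. The sketch below assumes a row is a dict whose cell may hold a list, and it emits one output row per list element. The helper `explode_rows` and the sample data are hypothetical; whether `explode_df` behaves exactly this way is not stated in this reference.

```python
from typing import Any, Dict, List


def explode_rows(rows: List[Dict[str, Any]], column: str) -> List[Dict[str, Any]]:
    """Return one output row per element of the list stored under `column`."""
    exploded = []
    for row in rows:
        values = row[column]
        if not isinstance(values, list):
            values = [values]  # scalar cells produce exactly one row
        for value in values:
            new_row = dict(row)  # copy so the input rows stay untouched
            new_row[column] = value
            exploded.append(new_row)
    return exploded


rows = [
    {"doc_id": "d1", "term": ["acetate", "sediment"]},
    {"doc_id": "d2", "term": "bacteria"},
]
exploded = explode_rows(rows, "term")
for row in exploded:
    print(row)
```

Note that in this sketch a row with an empty list produces no output rows at all, which may or may not match what a DataFrame-based implementation does.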
 
- ontorunner.spacy_module.export_tsv(df: DataFrame, data_dir: str, fn: str) None
- Export pandas DataFrame object into a TSV file. - Parameters
- df – Pandas DataFrame. 
- data_dir – Destination directory for export. 
- fn – Filename. 
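A TSV export that takes a destination directory and a filename can be sketched with the standard-library `csv` module. This is not ontorunner's implementation: the helper `export_rows_tsv` is hypothetical, it works on a list of dicts rather than a pandas DataFrame, and it returns the written path for convenience, whereas `export_tsv` returns None.

```python
import csv
import tempfile
from pathlib import Path
from typing import Dict, List


def export_rows_tsv(rows: List[Dict[str, str]], data_dir: str, fn: str) -> Path:
    """Write rows to <data_dir>/<fn> as tab-separated values with a header."""
    out_dir = Path(data_dir)
    out_dir.mkdir(parents=True, exist_ok=True)  # create the directory if needed
    out_path = out_dir / fn
    fieldnames = list(rows[0].keys()) if rows else []
    with out_path.open("w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(
            handle, fieldnames=fieldnames, delimiter="\t", lineterminator="\n"
        )
        writer.writeheader()
        writer.writerows(rows)
    return out_path


# Write into a temporary directory and read the result back.
with tempfile.TemporaryDirectory() as tmp:
    path = export_rows_tsv([{"doc_id": "d1", "term": "acetate"}], tmp, "example.tsv")
    written = path.read_text(encoding="utf-8")
print(written)
```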
 
 
- ontorunner.spacy_module.get_knowledge_base_enitities(doc: Doc, onto_ruler_obj: OntoRuler) DataFrame
- Get information from the scispaCy pipeline. - Parameters
- doc – Doc object. 
- onto_ruler_obj – OntoRuler object. 
 
- Returns
- Pandas DataFrame. 
 
- ontorunner.spacy_module.get_token_info(doc: Doc) DataFrame
- Get metadata associated with spans within a document. - Parameters
- doc – Doc object. 
- Returns
- Pandas DataFrame. 
 
- ontorunner.spacy_module.onto_tokenize(doc: Doc, onto_ruler_obj: OntoRuler) Doc
- Set custom span information from the Doc object. - Parameters
- doc – Doc object. 
- onto_ruler_obj – OntoRuler object. 
 
- Returns
- Doc object. 
 
- ontorunner.spacy_module.run_spacy(data_dir: Path = '/home/runner/work/ontorunner/ontorunner/data', settings_file: Path = '/home/runner/work/ontorunner/ontorunner/ontorunner/settings.ini', linker: str = 'umls', to_pickle: bool = True, need_ancestors: bool = False, viz: bool = False) OntoRuler
- Run spaCy with the scispaCy pipeline. - Parameters
- data_dir – Path to the data directory. 
- settings_file – Path to the settings.ini file. 
- linker – Type of scispaCy entity linker to use: umls (default) or mesh. 
- to_pickle – Pickle intermediate files. (True/False) 
- need_ancestors – Include ancestors of annotated terms. (True/False) 
- viz – Include visualizations (png and svg) in output. (True/False) 
 
- Returns
- OntoRuler object. 
 
- ontorunner.spacy_module.run_viz(input_text: str = 'A bacterial isolate, designated strain SZ,was obtained from noncontaminated creek sediment microcosms based on its ability to derive energy from acetate oxidation coupled to tetrachloroethene.', obj: Optional[OntoRuler] = None)
- Text that needs to be annotated. - Parameters
- input_text – Text to be annotated; defaults to DEFAULT_TEXT. 
Module contents
Constants.