# 🛠️ Tools
These lists focus on tools developed and/or used by Monarch members; tools marked ‘True’ in the Internal column were built by Monarch in whole or in part.
## General Purpose Tools
| Name | Description | Repo | Docs | Internal |
|---|---|---|---|---|
| AIO | Artificial Intelligence Ontology | GitHub | arXiv | True |
| Aurelian | Agentic Universal Research Engine for Literature, Integration, Annotation, and Navigation. | GitHub | Docs | True |
| Datasette LLM library | Or llm for short. A CLI utility and Python library for interacting with Large Language Models, both via remote APIs and models that can be installed and run on your own machine. | GitHub | Docs | False |
| LangChain | A framework for developing applications powered by language models. Supports connecting a language model to sources of context and enabling reasoning. | GitHub | Docs | False |
| LiteLLM | A framework for accessing LLMs and their APIs in the OpenAI format, for drop-in replacement and other convenient integrations. | GitHub | Docs | False |
| Logfire | An observability platform and a set of tools for collecting structured logs. For LLMs, this provides a way to track input prompts, parameters, and generated outputs. | GitHub | Docs | False |
| Ollama | A framework for running LLMs locally, with GPU support. | GitHub | Site | False |
| OntoGPT | A tool for linking unstructured data to structured vocabularies with consistent identifiers. Uses SPIRES and TALISMAN methods. | GitHub | Docs | True |
| Ontology Access Toolkit (OAK) | Python library for common ontology operations over a variety of backends. OAK has its own TextAnnotator, but it is very simple. OntoGPT uses OAK for term retrieval, labeling, mapping, etc. | GitHub | Docs | True |
| Phenomics Assistant | An AI chatbot with access to the Monarch Initiative biomedical knowledgebase. See demo at https://phenomics-assistant.streamlit.app/ | GitHub | bioRxiv | True |
| Pydantic.ai | A Python agent framework for working with LLMs. | GitHub | Docs | False |
## Data Preparation and Modeling Tools
| Name | Description | Repo | Docs | Internal |
|---|---|---|---|---|
| LinkML | A modeling language and framework for describing, working with, and validating data in a variety of formats. OntoGPT uses LinkML to define extraction schemas. | GitHub | Docs; draft | True |
| PaperQA | A package for doing high-accuracy retrieval augmented generation (RAG) on PDFs or text files, with a focus on the scientific literature. | GitHub | arXiv | False |
| phenopacket2prompt | A tool for transforming data in the GA4GH Phenopacket standard into LLM-ready prompts. | GitHub | Docs | True |
## Evaluation Tools
| Name | Description | Repo | Docs | Internal |
|---|---|---|---|---|
| DeepEval | An LLM evaluation framework built around unit tests. | GitHub | Docs | False |
| llm-matrix | A tool for running, evaluating, and comparing different language models across a matrix of hyperparameters. It allows systematic testing of models for accuracy, consistency, and performance on specific tasks. | GitHub | | True |
| LangSmith | A framework for building LLM applications, including evaluations. Can be used with or without LangChain. | GitHub | Docs | False |
| Metacoder | A unified interface for command line AI coding assistants (claude code, gemini-cli, codex, goose, qwen-coder). | GitHub | Docs | True |
## Visualization and Interface Building Tools
| Name | Description | Repo | Docs | Internal |
|---|---|---|---|---|
| Gradio | Tools for building an interface for Python projects, including those interfacing with LLMs. | GitHub | Docs | False |
| Streamlit | A framework for building web apps. | GitHub | Docs | False |
## Agentic Coding and Ontology Development Tools
| Name | Description | Repo | Docs | Internal |
|---|---|---|---|---|
| aider | An agentic coding tool capable of working with a variety of LLM APIs and local models. | GitHub | Docs | False |
| Claude Code | An agentic coding tool that lives in your terminal, understands your codebase, and helps you code faster by executing routine tasks, explaining complex code, and handling git workflows, all through natural language commands. | GitHub | Docs | False |
| ODK-AI | A Docker container that extends the ODK image to use Claude Code and other LLM-powered tools with ontologies. It is designed to be executed either interactively or in "headless" mode. | GitHub | Docs | True |
| Goose | An open source AI agent for automating coding tasks. Supports a variety of LLMs. Can be used through an app or a CLI. | GitHub | Docs | False |
| Roo Code | An AI-powered autonomous coding agent that lives in your editor. | GitHub | Docs | False |
| Cherry Studio | A desktop client that supports multiple LLM providers, available on Windows, Mac, and Linux. | GitHub | | False |
| dragon-ai-agent | An automated AI agent specifically designed to assist with ontology curation and maintenance tasks. | GitHub | Docs | True |
| github-ai-integrations | A Copier template for augmenting GitHub repos with AI capabilities. | GitHub | Docs | True |
## Model Context Protocol (MCP) Tools
| Name | Description | Repo | Docs | Internal |
|---|---|---|---|---|
| landuse-mcp | A Model Context Protocol (MCP) server for retrieving land use data for given geographical locations using the National Land Cover Database (NLCD) and other geospatial datasets. | GitHub | | True |
| oak-mcp | A Model Context Protocol (MCP) server to help agents interact with ontologies and the Ontology Access Toolkit (OAK). | GitHub | | True |
| ols-mcp | A Model Context Protocol (MCP) server for retrieving information from the Ontology Lookup Service (OLS). | GitHub | | True |
| artl-mcp | An MCP server for retrieving scientific literature metadata and content using PMIDs, DOIs, and other identifiers. | GitHub | | True |
| fitness-mcp | A FastMCP server for analyzing fitness data from barcoded Agrobacterium mutant libraries grown in mixed cultures across different conditions. | GitHub | | True |
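To use any of the servers above, an MCP client needs to know how to launch it. As an illustrative sketch, here is the `mcpServers` configuration convention used by clients such as Claude Desktop; the `uvx oak-mcp` launch command is an assumption for this example, so check each server's README for its actual command:

```json
{
  "mcpServers": {
    "oak-mcp": {
      "command": "uvx",
      "args": ["oak-mcp"]
    }
  }
}
```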
## Tool-specific Guides
### Accessing Monarch data with LLMs
#### Using LLMs with the Ontology Access Kit
- OAK documentation: https://incatools.github.io/ontology-access-kit/howtos/use-llms
#### OntoGPT
#### CurateGPT
### Guides to Using LLMs for Ontology Curation and Semantic Engineering
- https://ai4curation.github.io/aidocs/ - a growing collection of how-tos and reference guides for curators and maintainers of knowledge bases to integrate AI into their workflows.
- AI-assisted ontology editing workflows, Part 1
- OBO Academy article on Leveraging ChatGPT for ontology curation: https://oboacademy.github.io/obook/lesson/chatgpt-ontology-curation/
- Introduction to developing agentic workflows for semantic engineers
- Tutorial materials here: https://github.com/cmungall/agent-tutorial
- Applications of Agentic AI for the GO
### dragon-ai-agent
- Setup
- https://ai4curation.github.io/aidocs/how-tos/set-up-github-actions/
- Examples of use
- An issue and PR in the MONDO repository
- An issue and PR in GO
### MCPs
- Where to find MCPs?
- MCP servers | Glama
- GitHub - modelcontextprotocol/servers: Model Context Protocol Servers
- MCP Registry Registry
- Security when using MCPs
- The Vulnerable MCP Project
- What 17,845 GitHub Repos Taught Us About Malicious MCP Servers ~ VirusTotal Blog
- Model Context Protocol (MCP): Understanding security risks and controls
## Using Open Models
Some LLMs can be run on local hardware (e.g., your own laptop) rather than through a remote API. This is not possible for the largest models, and even moderately sized models may be slow to produce results, but local use costs less than commercial services and offers greater flexibility in the choice of models.
The Ollama framework is a good place to start.
Models may be retrieved from the popular HuggingFace platform.
Other options:
- The Datasette LLM library has multiple plugins available for running local models. See the plugin directory.
- LangChain is also capable of running local models.
- It can work with Ollama, among other frameworks.
- Example: Running a Hugging Face Large Language Model (LLM) locally on my laptop | Mark Needham
- Video version: Running a Hugging Face LLM on your laptop
- See also: openplayground (a GUI for chat completion with many different LLMs, both open and not)
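As a minimal sketch of local use (assuming Ollama is installed and running on its default port, and that an example model such as `llama3.2` has already been pulled), a prompt can be sent to Ollama's local REST API with only the Python standard library:

```python
import json
import urllib.request

# Ollama's default local endpoint for single-turn generation
OLLAMA_URL = "http://localhost:11434/api/generate"

def build_request(model: str, prompt: str) -> urllib.request.Request:
    """Build a POST request for Ollama's /api/generate endpoint."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

# Example (requires a running Ollama server; the model name is illustrative):
req = build_request("llama3.2", "In one sentence, what is an ontology?")
# with urllib.request.urlopen(req) as resp:
#     print(json.loads(resp.read())["response"])
```

The same request shape works for any model Ollama has pulled; swap the model name for whatever `ollama list` shows locally.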
## Using LBNL CBORG
CBORG is a service from the Berkeley Lab’s IT Division and Science IT staff that provides access to AI models. If you work for LBNL, you may use CBORG. Models may be accessed in three ways:
- An in-browser chat interface (https://chat.cborg.lbl.gov/)
- Over an API
- FAQ: https://cborg.lbl.gov/api_faq/
- Full documentation: https://api.cborg.lbl.gov/
- Through your favorite agentic platform (https://cborg.lbl.gov/tools_ai_101/)
### Get a CBORG API Key
Need a CBORG API key? See this page.
Or, for more detail, follow these instructions:
- Go here: https://chat.cborg.lbl.gov/login
- Login with your LDAP credentials (your LBL username and password)
- Some people report that they had to refresh the page (or go through the sign-in process) a few times before it worked.
- Request a CBORG API key: https://cborg.lbl.gov/api_request/
- You may only see the key immediately after requesting it!
- Save it in a password safe.
- If you lose your key, you can contact CBORG support and ask for a new one
- Email: ScienceIT
- CBORG Google Group: https://chat.google.com/room/AAQAqGsqgfQ?cls=7
Need a supplemental CBORG API key for a specific project with a defined spending limit or timeframe? Use this form.
### Using CBORG Models
The CBORG API is OpenAI-compatible, meaning it handles requests in much the same way as the OpenAI API. Tools and applications designed to work with OpenAI models will generally work with CBORG, with the caveat that all models are different and some have features others lack (e.g., functionality for using tools). So, in the absence of more specific instructions, you may be able to get CBORG working with your chosen software by:
- Specifying a new model or API endpoint as OpenAI-compatible
- Providing the API base (https://api.cborg.lbl.gov) and API key (see above)
- Specifying a model name, like “lbl/cborg-chat:latest”
- See the full list here, though you may have to scroll down to see the specific names to pass to the API
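The steps above can be sketched as a raw request in Python. This is a minimal sketch: it assumes CBORG exposes the standard OpenAI `/v1/chat/completions` route under the API base (check the CBORG API docs for the exact path), and it uses the example model name from above:

```python
import json
import os
import urllib.request

CBORG_BASE = "https://api.cborg.lbl.gov"

def build_chat_request(model: str, prompt: str, api_key: str) -> urllib.request.Request:
    """Build an OpenAI-style chat completion request against the CBORG API."""
    payload = {"model": model, "messages": [{"role": "user", "content": prompt}]}
    return urllib.request.Request(
        f"{CBORG_BASE}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # The API key goes in the standard OpenAI-style Bearer header
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )

# Example (requires a valid CBORG API key, e.g. from an environment variable):
req = build_chat_request(
    "lbl/cborg-chat:latest", "Hello!", os.environ.get("CBORG_API_KEY", "")
)
# with urllib.request.urlopen(req) as resp:
#     print(json.loads(resp.read())["choices"][0]["message"]["content"])
```

Because the format is OpenAI-compatible, the official `openai` Python client can be pointed at the same base URL and key instead of building requests by hand.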
CBORG also provides proxy utilities for accessing their API. The immediate benefit of this is convenience: the proxy can automatically provide your API key along with each request. If you’re using the CBORG API from multiple applications, the proxy can also help to manage all the resulting connections. Find it on GitHub here: https://github.com/lbnl-science-it/cborg-client
### Managing CBORG Usage
View your key budget here: https://api.cborg.lbl.gov/key/manage
Alternatively, use this shell function to get the same information in your terminal: https://gist.github.com/pkalita-lbl/eb9065e03157844ba3130449f0de8433
By default, each user is allocated $50 per month, unless you get additional grant-based funding. (Sierra says this lasts a while.)
Note that the open on-premises models (those with model names preceded by “lbl”) may be used at no monetary cost to you.