End-to-End Ontology Alignment: OBO + OWL + SSSOM
This tutorial aligns two disease ontologies by merging three complementary data sources:
| File | Format | What it provides |
|---|---|---|
| mondo_subset.obo | OBO | MONDO disease hierarchy + xref mappings to ORDO |
| ordo_subset.ofn | OWL Functional Syntax | ORDO rare disease hierarchy + disjointness constraints |
| mondo_ordo_mappings.sssom.tsv | SSSOM | Cross-ontology equivalence candidates with confidence scores |
Input Files
MONDO hierarchy (OBO)
Three diseases under a common root, with xref lines pointing to ORDO for the first two.
When BOOMER converts an OBO file, xref entries become probabilistic EquivalentTo facts
(default probability 0.7), while is_a relations become hard ProperSubClassOf facts.
See the Ontology Conversion docs for the full mapping table
and configuration options.
mondo_subset.obo
format-version: 1.4
ontology: mondo-subset
[Term]
id: MONDO:0000001
name: disease
[Term]
id: MONDO:0001234
name: alpha disease
is_a: MONDO:0000001 ! disease
xref: ORDO:100
[Term]
id: MONDO:0005678
name: beta disease
is_a: MONDO:0000001 ! disease
xref: ORDO:200
[Term]
id: MONDO:0009999
name: gamma disease
is_a: MONDO:0000001 ! disease
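To make the conversion rule concrete, here is a minimal sketch of how xref and is_a lines could be turned into the two kinds of facts. It is a toy reader, not BOOMER's actual converter: the function name `obo_to_facts` and the tuple layout are hypothetical, and the 0.7 default is simply the value quoted above.

```python
# Illustrative only: a toy OBO reader, not BOOMER's obo_to_kb implementation.
OBO_TEXT = """\
[Term]
id: MONDO:0000001
name: disease

[Term]
id: MONDO:0001234
name: alpha disease
is_a: MONDO:0000001 ! disease
xref: ORDO:100
"""


def obo_to_facts(text, xref_prob=0.7):
    """Return (hard_facts, probabilistic_facts) extracted from OBO text."""
    hard, soft = [], []
    current = None  # id of the stanza currently being read
    for raw in text.splitlines():
        line = raw.split(" ! ")[0].strip()  # drop a trailing "! comment"
        if line == "[Term]":
            current = None
        elif line.startswith("id: "):
            current = line[len("id: "):]
        elif line.startswith("is_a: ") and current:
            # is_a becomes a hard subclass fact
            hard.append(("ProperSubClassOf", current, line[len("is_a: "):]))
        elif line.startswith("xref: ") and current:
            # xref becomes a probabilistic equivalence
            soft.append(("EquivalentTo", current, line[len("xref: "):], xref_prob))
    return hard, soft


hard_facts, soft_facts = obo_to_facts(OBO_TEXT)
print(hard_facts)
print(soft_facts)
```

Keeping the hard and probabilistic facts in separate lists mirrors the facts / pfacts split that appears in the merged knowledge base.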
ORDO hierarchy (OWL)
Three rare diseases under a grouping class. Crucially, they are declared pairwise disjoint:
ordo_subset.ofn
Prefix(ORDO:=<http://www.orpha.net/ORDO/Orphanet_>)
Prefix(owl:=<http://www.w3.org/2002/07/owl#>)
Prefix(rdfs:=<http://www.w3.org/2000/01/rdf-schema#>)
Ontology(<http://www.orpha.net/ORDO/ordo-subset>
Declaration(Class(ORDO:100))
Declaration(Class(ORDO:200))
Declaration(Class(ORDO:300))
Declaration(Class(ORDO:999))
SubClassOf(ORDO:100 ORDO:999)
SubClassOf(ORDO:200 ORDO:999)
SubClassOf(ORDO:300 ORDO:999)
DisjointClasses(ORDO:100 ORDO:200 ORDO:300)
AnnotationAssertion(rdfs:label ORDO:100 "Alpha rare disease")
AnnotationAssertion(rdfs:label ORDO:200 "Beta rare disease")
AnnotationAssertion(rdfs:label ORDO:300 "Delta rare disease")
AnnotationAssertion(rdfs:label ORDO:999 "Rare disease grouping")
)
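An n-ary DisjointClasses axiom is equivalent to one disjointness statement per pair of classes. The sketch below shows that expansion; the helper name `expand_disjoint_classes` and the tuple layout are hypothetical and are not taken from BOOMER's code.

```python
from itertools import combinations


def expand_disjoint_classes(classes):
    """Expand an n-ary disjointness axiom into pairwise DisjointWith facts."""
    return [("DisjointWith", a, b) for a, b in combinations(classes, 2)]


pairs = expand_disjoint_classes(["ORDO:100", "ORDO:200", "ORDO:300"])
for fact in pairs:
    print(fact)
```

Three classes yield three pairs, and in general n classes yield n(n-1)/2.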
Cross-ontology mappings (SSSOM)
Four candidate equivalences. Note the conflicting last row — MONDO:0009999 has a weak match (0.3) to ORDO:100, which is already strongly matched to MONDO:0001234:
mondo_ordo_mappings.sssom.tsv
#curie_map:
# MONDO: http://purl.obolibrary.org/obo/MONDO_
# ORDO: http://www.orpha.net/ORDO/Orphanet_
# skos: http://www.w3.org/2004/02/skos/core#
# semapv: https://w3id.org/semapv/vocab/
#mapping_set_id: https://example.org/mondo-ordo-mappings
#mapping_set_description: MONDO to ORDO candidate mappings
subject_id subject_label predicate_id object_id object_label mapping_justification confidence
MONDO:0001234 alpha disease skos:exactMatch ORDO:100 Alpha rare disease semapv:LexicalMatching 0.9
MONDO:0005678 beta disease skos:exactMatch ORDO:200 Beta rare disease semapv:LexicalMatching 0.85
MONDO:0009999 gamma disease skos:exactMatch ORDO:300 Delta rare disease semapv:LexicalMatching 0.6
MONDO:0009999 gamma disease skos:exactMatch ORDO:100 Alpha rare disease semapv:LexicalMatching 0.3
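As a rough illustration of how such a file can be read, the sketch below skips the `#` metadata header, parses the tab-separated rows with the standard library, and keeps each skos:exactMatch row as a candidate equivalence weighted by its confidence. The function name `read_sssom` is hypothetical, only a subset of columns is shown, and this is not BOOMER's sssom_to_kb.

```python
import csv

# Build a small tab-separated sample explicitly so the tabs are unambiguous.
ROWS = [
    ["subject_id", "predicate_id", "object_id", "confidence"],
    ["MONDO:0001234", "skos:exactMatch", "ORDO:100", "0.9"],
    ["MONDO:0009999", "skos:exactMatch", "ORDO:100", "0.3"],
]
SSSOM_TEXT = (
    "#mapping_set_id: https://example.org/mondo-ordo-mappings\n"
    + "\n".join("\t".join(row) for row in ROWS)
    + "\n"
)


def read_sssom(text):
    """Return (subject, object, confidence) for every exactMatch row."""
    # Lines starting with '#' carry the YAML metadata block, not data.
    data_lines = [ln for ln in text.splitlines() if ln and not ln.startswith("#")]
    reader = csv.DictReader(data_lines, delimiter="\t")
    return [
        (r["subject_id"], r["object_id"], float(r["confidence"]))
        for r in reader
        if r["predicate_id"] == "skos:exactMatch"
    ]


candidates = read_sssom(SSSOM_TEXT)
print(candidates)
```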
Merge
Pass all three files directly to merge — formats are auto-detected from extensions.
Each converter auto-generates MemberOfDisjointGroup hard facts as a post-processing step:
every CURIE encountered during conversion is split on : and assigned to a disjoint group
named after its prefix (e.g. all MONDO:* entities go into group MONDO, all ORDO:* into
group ORDO). This encodes the assumption that entities within the same namespace are
distinct — a standard pattern in ontology alignment.
The behavior is controlled by auto_disjoint_groups in
OntologyConverterConfig (default True).
%%bash
uv run python -m boomer.cli merge \
docs/tutorial/ontology-alignment-files/mondo_subset.obo \
docs/tutorial/ontology-alignment-files/ordo_subset.ofn \
docs/tutorial/ontology-alignment-files/mondo_ordo_mappings.sssom.tsv \
-o docs/tutorial/ontology-alignment-files/merged.yaml \
-n "MONDO-ORDO Alignment"
merged.yaml
name: MONDO-ORDO Alignment
facts:
- fact_type: ProperSubClassOf
  sub: MONDO:0001234
  sup: MONDO:0000001
- fact_type: ProperSubClassOf
  sub: MONDO:0005678
  sup: MONDO:0000001
- fact_type: ProperSubClassOf
  sub: MONDO:0009999
  sup: MONDO:0000001
- fact_type: MemberOfDisjointGroup
  sub: ORDO:200
  group: ORDO
- fact_type: MemberOfDisjointGroup
  sub: MONDO:0000001
  group: MONDO
- fact_type: MemberOfDisjointGroup
  sub: MONDO:0005678
  group: MONDO
- fact_type: MemberOfDisjointGroup
  sub: MONDO:0009999
  group: MONDO
- fact_type: MemberOfDisjointGroup
  sub: ORDO:100
  group: ORDO
- fact_type: MemberOfDisjointGroup
  sub: MONDO:0001234
  group: MONDO
- fact_type: ProperSubClassOf
  sub: ORDO:300
  sup: ORDO:999
- fact_type: ProperSubClassOf
  sub: ORDO:200
  sup: ORDO:999
- fact_type: DisjointWith
  sub: ORDO:100
  sibling: ORDO:200
- fact_type: DisjointWith
  sub: ORDO:100
  sibling: ORDO:300
- fact_type: DisjointWith
  sub: ORDO:200
  sibling: ORDO:300
- fact_type: ProperSubClassOf
  sub: ORDO:100
  sup: ORDO:999
- fact_type: MemberOfDisjointGroup
  sub: ORDO:200
  group: ORDO
- fact_type: MemberOfDisjointGroup
  sub: ORDO:999
  group: ORDO
- fact_type: MemberOfDisjointGroup
  sub: ORDO:100
  group: ORDO
- fact_type: MemberOfDisjointGroup
  sub: ORDO:300
  group: ORDO
- fact_type: MemberOfDisjointGroup
  sub: MONDO:0001234
  group: MONDO
- fact_type: MemberOfDisjointGroup
  sub: ORDO:100
  group: ORDO
- fact_type: MemberOfDisjointGroup
  sub: MONDO:0005678
  group: MONDO
- fact_type: MemberOfDisjointGroup
  sub: ORDO:200
  group: ORDO
- fact_type: MemberOfDisjointGroup
  sub: MONDO:0009999
  group: MONDO
- fact_type: MemberOfDisjointGroup
  sub: ORDO:300
  group: ORDO
pfacts:
- fact:
    fact_type: EquivalentTo
    sub: MONDO:0001234
    equivalent: ORDO:100
  prob: 0.9
- fact:
    fact_type: EquivalentTo
    sub: MONDO:0005678
    equivalent: ORDO:200
  prob: 0.85
- fact:
    fact_type: EquivalentTo
    sub: MONDO:0001234
    equivalent: ORDO:100
  prob: 0.7
- fact:
    fact_type: EquivalentTo
    sub: MONDO:0005678
    equivalent: ORDO:200
  prob: 0.7
- fact:
    fact_type: EquivalentTo
    sub: MONDO:0009999
    equivalent: ORDO:300
  prob: 0.6
- fact:
    fact_type: EquivalentTo
    sub: MONDO:0009999
    equivalent: ORDO:100
  prob: 0.3
hypotheses: []
labels:
  MONDO:0000001: disease
  MONDO:0001234: alpha disease
  MONDO:0005678: beta disease
  MONDO:0009999: gamma disease
  ORDO:999: Rare disease grouping
  ORDO:300: Delta rare disease
  ORDO:100: Alpha rare disease
  ORDO:200: Beta rare disease
hyperparams: []
pfacts_entailed: []
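The MemberOfDisjointGroup facts in this listing follow from the prefix rule described earlier: split each CURIE on `:` and use the prefix as the group name. A minimal sketch of that rule follows; the helper name `disjoint_group_facts` is hypothetical, and unlike the listing (where the same fact can appear more than once because each converter emits its own) this version deduplicates its input first.

```python
def disjoint_group_facts(curies):
    """Assign every CURIE to a disjoint group named after its prefix."""
    facts = []
    for curie in sorted(set(curies)):  # deduplicate, then fix the order
        prefix, _, _local_id = curie.partition(":")
        facts.append(("MemberOfDisjointGroup", curie, prefix))
    return facts


group_facts = disjoint_group_facts(
    ["MONDO:0001234", "ORDO:100", "MONDO:0009999", "ORDO:100"]
)
for fact in group_facts:
    print(fact)
```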
Solve
solution.yaml
name: null
number_of_combinations: 37
number_of_satisfiable_combinations: 28
number_of_combinations_explored_including_implicit: 152
number_of_components: null
confidence: 0.5
prior_prob: 0.15743699999999994
posterior_prob: 0.11114287892650197
proportion_of_combinations_explored: 1.0
ground_pfacts: []
solved_pfacts:
- pfact:
    fact:
      fact_type: EquivalentTo
      sub: MONDO:0001234
      equivalent: ORDO:100
    prob: 0.9
  truth_value: true
  posterior_prob: 0.968219477482972
  metadata: null
- pfact:
    fact:
      fact_type: EquivalentTo
      sub: MONDO:0005678
      equivalent: ORDO:200
    prob: 0.85
  truth_value: true
  posterior_prob: 0.9571896919792617
  metadata: null
- pfact:
    fact:
      fact_type: EquivalentTo
      sub: MONDO:0001234
      equivalent: ORDO:100
    prob: 0.7
  truth_value: true
  posterior_prob: 0.968219477482972
  metadata: null
- pfact:
    fact:
      fact_type: EquivalentTo
      sub: MONDO:0005678
      equivalent: ORDO:200
    prob: 0.7
  truth_value: true
  posterior_prob: 0.9571896919792617
  metadata: null
- pfact:
    fact:
      fact_type: EquivalentTo
      sub: MONDO:0009999
      equivalent: ORDO:300
    prob: 0.6
  truth_value: true
  posterior_prob: 0.5972095150960655
  metadata: null
- pfact:
    fact:
      fact_type: EquivalentTo
      sub: MONDO:0009999
      equivalent: ORDO:100
    prob: 0.3
  truth_value: false
  posterior_prob: 0.004650808173223542
  metadata: null
sub_solutions: []
time_started: 1772508731.125963
time_finished: 1772508731.2317069
timed_out: false
time_elapsed: 0.1057438850402832
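As a sanity check on these numbers, the reported prior_prob equals the product of each pfact's probability when it is selected as true and one minus that probability when it is selected as false, which is what one would expect if the pfacts are treated as independent. The sketch below recomputes it from the six solved pfacts. The variable names are illustrative, and the posterior is not reproduced here because it depends on how BOOMER normalizes over the satisfiable combinations.

```python
import math

# (prior probability, selected truth value) for the six solved pfacts
assignment = [
    (0.9, True),    # MONDO:0001234 = ORDO:100 (SSSOM)
    (0.85, True),   # MONDO:0005678 = ORDO:200 (SSSOM)
    (0.7, True),    # MONDO:0001234 = ORDO:100 (xref)
    (0.7, True),    # MONDO:0005678 = ORDO:200 (xref)
    (0.6, True),    # MONDO:0009999 = ORDO:300
    (0.3, False),   # MONDO:0009999 = ORDO:100, rejected
]

# A rejected pfact contributes (1 - p) instead of p.
prior = math.prod(p if truth else 1 - p for p, truth in assignment)
print(round(prior, 6))  # 0.157437
```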
Interpreting the Results
| Mapping | Prior | Posterior | Verdict | Why |
|---|---|---|---|---|
| MONDO:0001234 ≡ ORDO:100 | 0.90 | ~0.97 | Accepted | Reinforced by both xref and SSSOM |
| MONDO:0005678 ≡ ORDO:200 | 0.85 | ~0.96 | Accepted | Reinforced by both xref and SSSOM |
| MONDO:0009999 ≡ ORDO:300 | 0.60 | ~0.60 | Accepted | Moderate match, no competition |
| MONDO:0009999 ≡ ORDO:100 | 0.30 | ~0.005 | Rejected | Crushed by disjointness constraint |
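The key result: the false mapping MONDO:0009999≡ORDO:100 drops from 0.30 to about 0.005.
ORDO:100 is already claimed by MONDO:0001234 with high confidence. If MONDO:0009999 were also equivalent to ORDO:100, the two MONDO terms would be equivalent to each other, which contradicts their membership in the MONDO disjoint group.
It would also force ORDO:100≡ORDO:300 whenever MONDO:0009999≡ORDO:300 holds, which the OWL DisjointClasses axiom forbids.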
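The clash can be reproduced with a toy consistency check: merge the accepted equivalences into classes with union-find, then verify that no two terms declared distinct end up in the same class. This is only a sketch of the idea, not BOOMER's reasoner; the helper names are hypothetical, and the distinct pairs combine the namespace-based MONDO group with the ORDO DisjointClasses axiom.

```python
from itertools import combinations


def find(parent, x):
    """Return the representative of x's equivalence class."""
    parent.setdefault(x, x)
    while parent[x] != x:
        parent[x] = parent[parent[x]]  # path halving
        x = parent[x]
    return x


def is_consistent(equivalences, distinct_pairs):
    """True if no pair declared distinct is forced into the same class."""
    parent = {}
    for a, b in equivalences:
        parent[find(parent, a)] = find(parent, b)
    return all(find(parent, a) != find(parent, b) for a, b in distinct_pairs)


mondo = ["MONDO:0001234", "MONDO:0005678", "MONDO:0009999"]
ordo = ["ORDO:100", "ORDO:200", "ORDO:300"]
distinct = list(combinations(mondo, 2)) + list(combinations(ordo, 2))

accepted = [
    ("MONDO:0001234", "ORDO:100"),
    ("MONDO:0005678", "ORDO:200"),
    ("MONDO:0009999", "ORDO:300"),
]
ok_without = is_consistent(accepted, distinct)
ok_with = is_consistent(accepted + [("MONDO:0009999", "ORDO:100")], distinct)
print(ok_without, ok_with)
```

Adding the fourth mapping merges MONDO:0009999 into the class that already contains MONDO:0001234 and ORDO:100, so the check fails.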
Why do some mappings appear twice in the solution?
MONDO:0001234≡ORDO:100 has two pfacts in the merged KB: one from the OBO xref (prob 0.7)
and one from the SSSOM skos:exactMatch (prob 0.9). Both are reasoned over independently —
the reasoner finds both consistent, and the posterior for each reflects the combined evidence.
You can configure xref probabilities per-prefix via
OntologyConverterConfig.
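When reporting results, it can be convenient to collapse the repeated rows into one entry per mapping. The sketch below groups rows by (sub, equivalent), collecting every prior and keeping the posterior. The dictionary layout and the `collapse` helper are hypothetical, and it assumes duplicates of the same mapping share one posterior, as they do in the solution above.

```python
from collections import defaultdict

# A few rows copied from the solution, flattened into plain dicts.
solved = [
    {"sub": "MONDO:0001234", "equivalent": "ORDO:100",
     "prior": 0.9, "posterior": 0.968219477482972},
    {"sub": "MONDO:0001234", "equivalent": "ORDO:100",
     "prior": 0.7, "posterior": 0.968219477482972},
    {"sub": "MONDO:0009999", "equivalent": "ORDO:100",
     "prior": 0.3, "posterior": 0.004650808173223542},
]


def collapse(rows):
    """Group rows by mapping pair, keeping all priors and one posterior."""
    grouped = defaultdict(lambda: {"priors": [], "posterior": None})
    for row in rows:
        entry = grouped[(row["sub"], row["equivalent"])]
        entry["priors"].append(row["prior"])
        entry["posterior"] = row["posterior"]
    return dict(grouped)


collapsed = collapse(solved)
for pair, info in collapsed.items():
    print(pair, info["priors"], round(info["posterior"], 3))
```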
TSV Export
solution.tsv
# BOOMER Solution TSV Output
#
# Metadata:
# generated_date: 2026-03-02T19:32:11.670343
# combinations: 37
# satisfiable_combinations: 28
# confidence: 0.5
# prior_probability: 0.15743699999999997
# posterior_probability: 0.11114287892650197
# time_elapsed_seconds: 0.09324312210083008
# timed_out: False
#
# Format: fact_type followed by arguments, then truth_value and probabilities
#
fact_type arg1 arg2 arg1_label arg2_label truth_value prior_probability posterior_probability
EquivalentTo MONDO:0001234 ORDO:100 alpha disease Alpha rare disease True 0.9 0.968219477482972
EquivalentTo MONDO:0005678 ORDO:200 beta disease Beta rare disease True 0.85 0.9571896919792622
EquivalentTo MONDO:0001234 ORDO:100 alpha disease Alpha rare disease True 0.7 0.968219477482972
EquivalentTo MONDO:0005678 ORDO:200 beta disease Beta rare disease True 0.7 0.9571896919792622
EquivalentTo MONDO:0009999 ORDO:300 gamma disease Delta rare disease True 0.6 0.5972095150960658
EquivalentTo MONDO:0009999 ORDO:100 gamma disease Alpha rare disease False 0.3 0.004650808173223544
SSSOM Export
SSSOM is the standard format for ontology mappings. Exporting as SSSOM (-O sssom on the solve command) lets you feed BOOMER results directly into mapping pipelines (sssom-py, OAK, etc.).
OBOGraphs Export
OBOGraphs is the standard graph exchange format for ontologies, used by OAK, Monarch, and the broader OBO community. Select it with -O obographs on the solve command.
Python API Equivalent
from pathlib import Path

from boomer.ontology_converter import obo_to_kb, owl_to_kb
from boomer.sssom_converter import sssom_to_kb
from boomer.search import solve, SearchConfig

# Directory holding the three tutorial input files
DIR = Path("docs/tutorial/ontology-alignment-files")

# Convert each source into its own knowledge base
mondo_kb = obo_to_kb(DIR / "mondo_subset.obo")
ordo_kb = owl_to_kb(DIR / "ordo_subset.ofn")
mapping_kb = sssom_to_kb(DIR / "mondo_ordo_mappings.sssom.tsv")

# Fold the ORDO and mapping KBs into the MONDO KB
merged = mondo_kb.extend(
    facts=ordo_kb.facts + mapping_kb.facts,
    pfacts=ordo_kb.pfacts + mapping_kb.pfacts,
    labels={**ordo_kb.labels, **mapping_kb.labels},
)
merged.normalize()

solution = solve(merged, config=SearchConfig(timeout_seconds=60))

# Report the verdict for every candidate equivalence
for sp in solution.solved_pfacts:
    f = sp.pfact.fact
    if f.fact_type == "EquivalentTo":
        verdict = "ACCEPTED" if sp.truth_value else "REJECTED"
        print(f"{verdict}: {f.sub} \u2261 {f.equivalent} "
              f"(prior={sp.pfact.prob:.2f} \u2192 posterior={sp.posterior_prob:.3f})")
Summary
# Merge all sources (formats auto-detected)
pyboomer merge ontology.obo hierarchy.ofn mappings.sssom.tsv -o merged.yaml
# Solve
pyboomer solve merged.yaml -O yaml -o solution.yaml
# Export as TSV
pyboomer solve merged.yaml -O tsv -o solution.tsv
# Export as SSSOM (standard mapping format)
pyboomer solve merged.yaml -O sssom -o solution.sssom.tsv
# Export as OBOGraphs JSON (standard ontology graph format)
pyboomer solve merged.yaml -O obographs -o solution.obographs.json
Structural constraints from ontologies (disjointness, hierarchy) interact with probabilistic evidence from mappings to produce better alignments than either source alone.