
experiment: ROBOT BOT-extracted slim modules for RO + IAO + CCO #26

Merged
Amosk21 merged 2 commits into main from experiment/bot-extracted-imports
Apr 30, 2026

Amosk21 commented Apr 30, 2026

Summary

Replaces the full upstream imports of RO and IAO with ROBOT-extracted slim
modules using --method BOT, a syntactic locality module variant, and adds a
BOT-extracted CCO module where only local stubs existed before. BFO 2020
remains a full local import (matching IAO's own pattern).

| Ontology | Before | After |
|---|---|---|
| BFO | Full local import | Full local import (unchanged) |
| RO | Full upstream release (1.2 MB) | ROBOT BOT-extracted slim (291 KB) |
| IAO | Full upstream release (610 KB) | ROBOT BOT-extracted slim (115 KB) |
| CCO | Local stubs only | ROBOT BOT-extracted slim (421 KB) + bridges |

Why BOT and not full imports

BOT is the OBO Foundry / ODK / ROBOT-recommended default. Gene Ontology, the
OBO Relations Ontology itself, and hundreds of ODK-managed projects all
ship BOT-extracted slim modules for their dependencies. SLME-BOT carries a
formal deductive-conservativity guarantee (Cuenca Grau / Horrocks / Kazakov /
Sattler 2007/2008): for any axiom written only over the seed signature, the
extracted module entails it exactly when the full upstream ontology does.

What changes

  • New files:
    • 03_TECHNICAL_CORE/ontology/imports/iao_bot.owl (115 KB, was 610 KB full)
    • 03_TECHNICAL_CORE/ontology/imports/ro_bot.owl (291 KB, was 1.2 MB full)
    • 03_TECHNICAL_CORE/ontology/imports/cco_bot.owl (421 KB, was stubs only)
    • 03_TECHNICAL_CORE/ontology/imports/seeds/{ro,iao,cco}_seed.txt (the
      explicit seed signature ARCO depends on, version-controlled)
  • Removed files:
    • 03_TECHNICAL_CORE/ontology/imports/iao.owl (full release no longer needed)
    • 03_TECHNICAL_CORE/ontology/imports/ro.owl (full release no longer needed)
  • Pipeline + workflow: run_pipeline.py, test_scenarios.py, and
    robot-validate.yml now load the BOT modules; CI job timeout reduced
    90 -> 30 min (HermiT now ~6.6 min locally vs 35-40 min on full imports).
  • README: "Why full upstream imports" subsection replaced with "Why
    ROBOT BOT slim modules" — five practical reasons grounded in the formal
    SLME guarantee, OBO Foundry / ODK convention, MIREOT being legacy, seed-
    file reproducibility, and operational scaling.
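A minimal sketch of how the three modules might be regenerated from the version-controlled seed files. It only builds the `robot extract` command lines; the `upstream/` directory holding the pinned releases is a hypothetical location, and for CCO the input is assumed to be the already-merged 11-module file. The actual extraction script in the repo may differ.

```python
from pathlib import Path

# Directory taken from the PR description.
IMPORTS_DIR = Path("03_TECHNICAL_CORE/ontology/imports")


def bot_extract_command(name: str, upstream: Path) -> list[str]:
    """Build the argv for one `robot extract --method BOT` run."""
    seed_file = IMPORTS_DIR / "seeds" / f"{name}_seed.txt"
    output = IMPORTS_DIR / f"{name}_bot.owl"
    return [
        "robot", "extract",
        "--method", "BOT",
        "--input", upstream.as_posix(),
        "--term-file", seed_file.as_posix(),
        "--output", output.as_posix(),
    ]


if __name__ == "__main__":
    # "upstream/" is a hypothetical download location for the pinned releases.
    for name in ("ro", "iao", "cco"):
        cmd = bot_extract_command(name, Path("upstream") / f"{name}.owl")
        print(" ".join(cmd))
```

Each command could be passed to `subprocess.run(cmd, check=True)` once ROBOT is on the PATH; keeping the command construction separate makes the seed-to-module mapping easy to review.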

Pinned upstream releases

  • BFO 2020 (ISO/IEC 21838-2:2021): full local file
  • RO release 2025-12-17: BOT-extracted from ro.owl
  • IAO release 2026-03-30: BOT-extracted from iao.owl
  • CCO release v1.7-2024-11-03: BOT-extracted from merged 11 modules
    (last release before v2.0 IRI-namespace migration; preserves ARCO's
    existing cco: IRI namespace)

Local CCO bridging stubs (kept)

Of the six local CCO assertions in ARCO_governance_extension.ttl, two are
still required as bridges:

  • cco:DirectiveICE rdfs:subClassOf iao:0000030
  • cco:DescriptiveICE rdfs:subClassOf iao:0000030

Upstream CCO does not link cco:InformationContentEntity to IAO's
iao:0000030. ARCO's bridges connect the two ICE hierarchies. The other
four CCO declarations (cco:Person rdfs:subClassOf bfo:0000040,
cco:Organization rdfs:subClassOf bfo:0000027, the property declarations)
are now redundant with the BOT module's upstream axioms but cause no
conflict; left in place for in-file readability.

Verification

Local:

  • python 03_TECHNICAL_CORE/scripts/run_pipeline.py: PASS, Sentinel-ID
    classified AnnexIII1aApplicableSystem (entailed). 7744 asserted ->
    27454 post-reasoning, +19710 entailed (was +47888).
  • python 03_TECHNICAL_CORE/scripts/test_scenarios.py: 5/5 PASS. Same
    classifications as full-import state. Both adversarial scenarios
    (equivalency decoy, blank-node ghost) still entail correctly.
  • robot validate-profile --profile DL: PASS — merged ontology is OWL 2
    DL conformant.
  • HermiT: ran in 397 seconds (~6.6 min) on the merged ontology locally.
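The figures above can be cross-checked with a few lines of arithmetic. This sketch recomputes the entailed-triple deltas and the HermiT runtime in minutes; the full-import counts (17266 asserted, 65154 post-reasoning) are taken from the README commit message in this PR.

```python
def entailed(asserted: int, post_reasoning: int) -> int:
    """Triples added by reasoning: post-reasoning count minus asserted count."""
    return post_reasoning - asserted


bot_entailed = entailed(7744, 27454)     # BOT slim modules (this PR)
full_entailed = entailed(17266, 65154)   # previous full-import state
hermit_minutes = 397 / 60                # local HermiT runtime in minutes

if __name__ == "__main__":
    print(f"BOT entailed:  +{bot_entailed}")
    print(f"Full entailed: +{full_entailed}")
    print(f"HermiT:        {hermit_minutes:.1f} min")
```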

CI (this PR): see ROBOT Validation run.

Why this is safe

The BOT extraction has a formal entailment-preservation guarantee for terms
in the seed signature. The seed signature was generated by grepping the ARCO
TTL files for every (bfo|ro|iao|cco): IRI, plus defensive pulls of
ro:0000056 and ro:0000092 (which BOT would pull anyway via inverse-of /
sub-property axioms). The pipeline and scenario regression runs confirm
identical classifications. The DL profile check and the HermiT consistency
run confirm the merged ontology is logically well-formed.
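The seed generation described above can be sketched roughly as follows. This is an illustration, not the script used for the PR: the OBO prefix expansions for bfo:, ro:, and iao: are assumptions (only the cco: namespace is stated in the commit message), and `seed_terms` is a hypothetical helper name.

```python
import re

# Assumed prefix expansions; only the cco: namespace appears in the PR text.
PREFIXES = {
    "bfo": "http://purl.obolibrary.org/obo/BFO_",
    "ro": "http://purl.obolibrary.org/obo/RO_",
    "iao": "http://purl.obolibrary.org/obo/IAO_",
    "cco": "http://www.ontologyrepository.com/CommonCoreOntologies/",
}

# Match prefixed names such as iao:0000030 or cco:Person, but not arco:Foo.
CURIE = re.compile(r"(?<![\w-])(bfo|ro|iao|cco):([A-Za-z0-9_]+)")

# Terms added defensively, as described in the PR.
DEFENSIVE = {"ro:0000056", "ro:0000092"}


def seed_terms(ttl_text: str, prefix: str) -> list[str]:
    """Return the sorted full IRIs for one prefix found in Turtle text."""
    found = {f"{p}:{local}" for p, local in CURIE.findall(ttl_text) if p == prefix}
    found |= {c for c in DEFENSIVE if c.startswith(prefix + ":")}
    return sorted(PREFIXES[prefix] + c.split(":", 1)[1] for c in found)


if __name__ == "__main__":
    sample = """
    @prefix ro: <http://purl.obolibrary.org/obo/RO_> .
    cco:DirectiveICE rdfs:subClassOf iao:0000030 .
    cco:Person rdfs:subClassOf bfo:0000040 .
    """
    for prefix in ("ro", "iao", "cco"):
        print(prefix, seed_terms(sample, prefix))
```

Writing one IRI per line to `{ro,iao,cco}_seed.txt` would give a file in the shape ROBOT's `--term-file` option accepts.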

The full-import experiments leading up to this branch are preserved in git
history at PRs #24 (combined IAO + RO promotion) and #25 (README post-merge
cleanup). They are the empirical evidence base for the BOT decision: full
imports work but cost 35-40 minutes of HermiT time per run, and that cost
was projected to grow to 1-3 hours once CCO was added. BOT modules deliver
identical classifications, carry the formal guarantee, and follow the OBO
Foundry / ODK convention, which simplifies the audit story.

Test plan

  • Local run_pipeline.py PASS
  • Local test_scenarios.py 5/5 PASS
  • Local robot validate-profile --profile DL PASS
  • Local HermiT terminates and produces reasoned graph
  • ROBOT Validation CI passes (Phase 1 DL profile, Phase 2 HermiT, Phase 3 owlrl-vs-HermiT diff)
  • ARCO Demo Run CI passes
  • Spot-check rendered README ontology-versions table and "Why ROBOT BOT slim modules" subsection
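The Phase 3 owlrl-vs-HermiT diff is not shown in this PR. As an illustration only, one simple way to compare two reasoner outputs is to serialize each graph as N-Triples and diff the lines as sets. The function names here are hypothetical, and the sketch assumes both outputs use the same serialization and contain no blank nodes (whose labels differ between runs).

```python
def triple_lines(ntriples: str) -> set[str]:
    """Collect non-empty, non-comment N-Triples lines as a set."""
    return {
        line.strip()
        for line in ntriples.splitlines()
        if line.strip() and not line.lstrip().startswith("#")
    }


def diff_triples(left: str, right: str) -> tuple[set[str], set[str]]:
    """Return (only in left, only in right) for two N-Triples documents."""
    a, b = triple_lines(left), triple_lines(right)
    return a - b, b - a


if __name__ == "__main__":
    owlrl_out = "<urn:a> <urn:p> <urn:b> .\n<urn:a> <urn:p> <urn:c> .\n"
    hermit_out = "<urn:a> <urn:p> <urn:b> .\n<urn:a> <urn:q> <urn:d> .\n"
    only_owlrl, only_hermit = diff_triples(owlrl_out, hermit_out)
    print("only in owlrl: ", sorted(only_owlrl))
    print("only in HermiT:", sorted(only_hermit))
```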

🤖 Generated with Claude Code

Amosk21 and others added 2 commits April 30, 2026 10:51
Replaces the full upstream releases of RO and IAO (and adds CCO instead
of stubs) with ROBOT-extracted slim modules using --method BOT (a syntactic
locality module variant). BFO 2020 stays as a full local import (matches
IAO's own pattern; BFO is small and ISO-standardized).

Why BOT and not full imports:
  - BOT is the OBO Foundry / ODK / ROBOT-recommended default ("when in doubt,
    use this one"). GO, RO's own Makefile, and hundreds of ODK-managed
    projects ship BOT-extracted slims.
  - SLME-BOT carries a formal deductive-conservativity guarantee: the
    extracted module entails exactly the same things as the full upstream
    for terms in the seed signature (Cuenca Grau / Horrocks / Kazakov /
    Sattler 2007/2008).
  - MIREOT is legacy and does NOT preserve property-typing or property-chain
    axioms reliably; ARCO's gate axioms use anonymous inverse property
    expressions on iao:0000136 that would silently drop from the DL fragment
    if iao:0000136 ended up declared only as AnnotationProperty in a slice.
    BOT preserves property typing and characteristic axioms reliably.
  - Full imports of RO + IAO worked but pushed HermiT CI to 35-40 minutes
    and projected to 1-3 hours when CCO was added. BOT slim modules drop
    HermiT CI to ~6.6 minutes locally on the merged ontology.

What's new:
  - 03_TECHNICAL_CORE/ontology/imports/iao_bot.owl  (115 KB, was 610 KB)
  - 03_TECHNICAL_CORE/ontology/imports/ro_bot.owl   (291 KB, was 1.2 MB)
  - 03_TECHNICAL_CORE/ontology/imports/cco_bot.owl  (421 KB, was stubs only)
  - 03_TECHNICAL_CORE/ontology/imports/seeds/{ro,iao,cco}_seed.txt
    (the explicit seed signature ARCO depends on, version-controlled)

What's removed:
  - 03_TECHNICAL_CORE/ontology/imports/iao.owl  (full release no longer needed)
  - 03_TECHNICAL_CORE/ontology/imports/ro.owl   (full release no longer needed)

Pinned upstream releases (provenance):
  - BFO 2020 (ISO/IEC 21838-2:2021): full local file
  - RO release 2025-12-17:           BOT-extracted from ro.owl
  - IAO release 2026-03-30:          BOT-extracted from iao.owl
  - CCO release v1.7-2024-11-03:     BOT-extracted from merged 11 modules
    (last release before v2.0 namespace migration; preserves ARCO's existing
    cco: IRI namespace http://www.ontologyrepository.com/CommonCoreOntologies/)

Verification (this branch, local):
  - Pipeline: PASS, Sentinel-ID classified AnnexIII1aApplicableSystem (entailed).
    7744 asserted -> 27454 post-reasoning, +19710 entailed (was +47888).
  - Five-scenario regression: 5/5 PASS. Same classifications as full-import
    state. Both adversarial scenarios (equivalency decoy, blank-node ghost)
    still entail correctly under BOT-preserved upstream axioms.
  - ROBOT validate-profile DL: PASS (881 KB merged ontology in profile).
  - HermiT reasoning: ~6.6 minutes locally (was ~35-40 min on full RO+IAO+ARCO).

CI workflow:
  - robot-validate.yml merge step now points at the BOT modules.
  - Job timeout reduced 90 -> 30 minutes (HermiT now ~6 min, 5x headroom).

ARCO local CCO stubs in ARCO_governance_extension.ttl are intentionally
preserved. Two of the six (cco:DirectiveICE / cco:DescriptiveICE rdfs:subClassOf
iao:0000030) bridge CCO's information-content-entity hierarchy to IAO's;
upstream CCO does not link these. The other four are now redundant with
the BOT module's upstream axiomatization but cause no conflict; left in
place for now.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Replaces the "Why full upstream imports" subsection with "Why ROBOT BOT slim
modules" framing that matches actual OBO Foundry / ODK convention. Updates
the Foundational ontology versions table to show "ROBOT BOT-extracted slim
module" rather than "Full ontology" for RO / IAO / CCO. Updates the headline
claims for the new state:

- Hero TL;DR bullet 1: "with ROBOT-BOT-extracted slim modules of OBO Relations
  Ontology, Information Artifact Ontology, and Common Core Ontologies — the
  OBO Foundry / ODK standard pattern, version-pinned and reproducible from
  seed files in the repo."
- Certificate sample: ENTAILED TRIPLES ADDED changes from 47888 to 19710.
- Getting started: asserted -> post-reasoning counts change from
  17266 -> 65154 to 7744 -> 27454.

The new "Why ROBOT BOT slim modules" subsection lists five practical reasons:
formal SLME entailment-preservation guarantee, OBO Foundry / ODK convention,
MIREOT being legacy and unsafe for property-typing-sensitive reasoning,
seed-file reproducibility, and operational scaling. The previous "five
reasons full imports are right for ARCO" framing is gone — that argument
turned out not to match what mature BFO-based projects actually do (RO ships
a curated BFO subset, IAO MIREOTs everything except BFO, CCO consumers
typically use the modular structure rather than the full pre-integrated
file). The full-import experiments are preserved in git history at PRs #24
and #25 as the empirical evidence base for the BOT decision.

Documentation only; no code or ontology change.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Amosk21 merged commit 7331a03 into main on Apr 30, 2026
4 checks passed