Microbial Community Mechanisms Knowledge Base
A LinkML-based knowledge base for modeling microbial community structure, function, and ecological interactions with evidence-based validation.
Model specific microbial communities as individual YAML files, combining:
- Rich, expressive YAML for agent consumption
- Validated evidence chains (anti-hallucination)
- Ontology-grounded terms (NCBITaxon, ENVO, CHEBI, GO)
- Causal graphs for ecological interactions
- Faceted browser for scientists
- KG export for integration (via Koza)
Adapted from: Monarch Initiative's dismech
Rich YAML Source
kb/communities/Human_Gut_Healthy_Adult.yaml
↓
Validation Stack
├── Schema validation (linkml-validate)
├── Term validation (linkml-term-validator + OAK)
└── Reference validation (snippets vs PubMed)
↓
Dual Output
├── Koza Transform → KGX Edges (for KG stacks)
└── Browser Export → Faceted Search (for scientists)
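The reference-validation step in the stack above can be pictured as a text-containment check: a quoted snippet passes only if it actually appears in the cited abstract. The following is a minimal sketch of that idea, not the project's validator. The function names, the handling of "..." as an elision, and the sample abstract are all assumptions made for illustration, and the abstract is a placeholder rather than text from any real PubMed record.

```python
import re


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so casing and line breaks do not matter."""
    return re.sub(r"\s+", " ", text).strip().lower()


def snippet_supported(snippet: str, abstract: str) -> bool:
    """Return True if every fragment of the snippet occurs, in order, in the abstract.

    Assumption: a literal "..." in the snippet marks elided text between fragments.
    """
    haystack = normalize(abstract)
    fragments = [normalize(part) for part in snippet.split("...")]
    if not any(fragments):
        return False  # an empty snippet supports nothing
    position = 0
    for fragment in fragments:
        if not fragment:
            continue
        found = haystack.find(fragment, position)
        if found == -1:
            return False
        position = found + len(fragment)
    return True


# Placeholder abstract, invented for this example only.
abstract = "Taxon A represents more than 5% of\nthe community in healthy samples."

print(snippet_supported("Taxon A represents more than 5%...", abstract))
print(snippet_supported("Taxon B dominates the community", abstract))
```

A real check would fetch the abstract for the cited PMID before comparing; that lookup is deliberately left out so the sketch stays self-contained.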
# Clone repository
git clone https://github.com/CultureBotAI/CommunityMech.git
cd CommunityMech
# Install dependencies (once implemented)
just install
# Or with uv
uv sync --group dev
# Schema validation
just validate kb/communities/Human_Gut_Healthy_Adult.yaml
# Reference validation (prevent hallucination)
just validate-references kb/communities/Human_Gut_Healthy_Adult.yaml
# Term validation (check ontology terms)
just validate-terms
# Export to KG (Koza transform)
just kgx-export
# Generate faceted browser
just gen-browser
# Generate HTML pages
just gen-html-all
Note: The commands above are planned but not yet implemented. See COMMUNITY_MECH_PLAN.md for the development roadmap.
CommunityMech/
├── src/
│   └── communitymech/
│       ├── schema/
│       │   └── communitymech.yaml    # LinkML schema
│       ├── datamodel/                # Generated Python models
│       ├── export/
│       │   ├── kgx_export.py         # Koza transform to KG
│       │   └── browser_export.py     # Faceted browser export
│       ├── render.py                 # HTML page generator
│       └── templates/                # Jinja2 templates
├── kb/
│   └── communities/
│       ├── Human_Gut_Healthy_Adult.yaml
│       ├── Human_Gut_IBD_UC.yaml
│       ├── Soil_Grassland_Temperate.yaml
│       └── ...
├── conf/
│   ├── oak_config.yaml               # OAK ontology adapters
│   └── qc_config.yaml                # QC configuration
├── app/                              # Faceted browser
│   ├── index.html
│   ├── data.js
│   └── schema.js
├── pages/
│   └── communities/                  # Rendered HTML pages
├── tests/
│   └── test_communities.py
├── .claude/
│   └── skills/                       # Claude Code curation skills
├── justfile                          # Task runner
├── pyproject.toml
├── COMMUNITY_MECH_PLAN.md            # Full implementation plan
├── QUICK_START.md                    # Quick reference
└── README.md
- Implementation Plan - Complete 11-phase plan
- Quick Start Guide - Quick reference
- Schema Documentation - LinkML schema reference (once generated)
name: Human Gut Healthy Adult
ecological_state: HEALTHY
environment_term:
  preferred_term: human gut
  term:
    id: ENVO:0001998
    label: human gut environment
taxonomy:
  - taxon_term:
      preferred_term: Faecalibacterium prausnitzii
      term:
        id: NCBITaxon:853
        label: Faecalibacterium prausnitzii
    abundance_level: ABUNDANT
    functional_role: [KEYSTONE, CORE]
    evidence:
      - reference: PMID:18936492
        supports: SUPPORT
        snippet: "F. prausnitzii represents more than 5%..."
ecological_interactions:
  - name: Butyrate Production
    source_taxon:
      preferred_term: Faecalibacterium prausnitzii
      term:
        id: NCBITaxon:853
    metabolites:
      - preferred_term: butyrate
        term:
          id: CHEBI:30089
    downstream:
      - target: Host Colonocyte Energy
    evidence:
      - reference: PMID:18936492
- Every claim backed by PMID references
- Snippets validated against PubMed abstracts
- Prevents AI hallucination
- NCBITaxon - Microbial taxa
- ENVO - Environments
- CHEBI - Chemical entities/metabolites
- GO - Biological processes
- UBERON - Host anatomy
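As a rough illustration of what grounding terms in these ontologies implies, the sketch below performs a purely syntactic check that an identifier looks like `PREFIX:digits` and uses one of the prefixes listed above. The regular expression and the `check_curie` helper are assumptions made for this example. They are not part of linkml-term-validator or OAK, and they do not confirm that a term exists or that its label matches, which requires an ontology lookup.

```python
import re
from typing import Optional

# Prefixes taken from the ontology list above.
ALLOWED_PREFIXES = {"NCBITaxon", "ENVO", "CHEBI", "GO", "UBERON"}

# Assumed shape of an identifier: a prefix, a colon, then a numeric local ID.
CURIE_PATTERN = re.compile(r"^(?P<prefix>[A-Za-z][A-Za-z0-9_]*):(?P<local_id>\d+)$")


def check_curie(curie: str) -> Optional[str]:
    """Return an error message for a malformed or unexpected CURIE, else None."""
    match = CURIE_PATTERN.match(curie)
    if match is None:
        return f"{curie!r} is not of the form PREFIX:digits"
    prefix = match.group("prefix")
    if prefix not in ALLOWED_PREFIXES:
        return f"{curie!r} uses unexpected prefix {prefix!r}"
    return None


for candidate in ["NCBITaxon:853", "CHEBI:30089", "butyrate"]:
    print(candidate, "->", check_curie(candidate) or "ok")
```

A cheap check like this can catch typos before the slower ontology lookup runs.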
- Model ecological interactions as directed graphs
- Represent cross-feeding, competition, mutualism
- Visualize with D3.js/Cytoscape
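To make the directed-graph view concrete, here is a minimal sketch that turns interaction records into an adjacency list and walks it. The simplified record layout (`source`, `name`, `downstream`) is an assumption for illustration and is flatter than the YAML shown earlier; the values are taken from that example community.

```python
from collections import defaultdict

# Simplified records using values from the example community above.
# The flat dict layout is hypothetical, not the schema's structure.
interactions = [
    {
        "name": "Butyrate Production",
        "source": "Faecalibacterium prausnitzii",
        "downstream": ["Host Colonocyte Energy"],
    },
]


def build_graph(records):
    """Build an adjacency list: source taxon -> interaction -> downstream targets."""
    graph = defaultdict(list)
    for record in records:
        graph[record["source"]].append(record["name"])
        for target in record["downstream"]:
            graph[record["name"]].append(target)
    return dict(graph)


def reachable(graph, start):
    """Depth-first walk returning every node reachable from start, in visit order."""
    seen = set()
    order = []
    stack = [start]
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        order.append(node)
        # Reverse so that neighbours are visited in their listed order.
        stack.extend(reversed(graph.get(node, [])))
    return order


graph = build_graph(interactions)
print(reachable(graph, "Faecalibacterium prausnitzii"))
```

Treating each interaction as its own node keeps the evidence attached to the interaction rather than to a bare taxon-to-target edge, which is one reasonable way to preserve the causal chain.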
- Rich YAML - For agent consumption (full context)
- Simple KG - For graph algorithms (Biolink edges)
- Faceted browser (no coding required)
- Click-through to evidence
- Interactive visualizations
Current Phase: Foundation (Sprint 1)
- Repository created
- Implementation plan documented
- Reference implementation analyzed (Monarch dismech)
- Seed data added (35 communities)
- Schema design
- First example community YAML
- Validation stack setup
- Koza transform implementation
- Faceted browser adaptation
- HTML rendering
See COMMUNITY_MECH_PLAN.md for the complete roadmap.
This is a private repository during initial development. Once the core infrastructure is in place, we'll open it up for community contributions.
- Create a new branch for your feature/community
- Add/modify community YAML files
- Run validation:
  just qc
- Commit with evidence validation
- Create PR for review
# Schema validation
just validate kb/communities/YourCommunity.yaml
# Reference validation (anti-hallucination)
just validate-references kb/communities/YourCommunity.yaml
# Term validation (ontology checking)
just validate-terms-file kb/communities/YourCommunity.yaml
# Full QC
just qc
(TBD once published)
Based on the dismech framework:
- Monarch Initiative. (2024). dismech: Disorder Mechanisms Knowledge Base. https://github.com/monarch-initiative/dismech
BSD-3-Clause (matching the dismech framework)
- dismech - Disease mechanisms KB (inspiration)
- LinkML - Modeling framework
- Koza - KG transformation tool
- BugSigDB - Microbial signatures database
- OAK - Ontology access toolkit
For questions or collaboration inquiries, please open an issue.
Status: 🚧 Under active development