Flexible Bioinformatics Personal Intelligent Pipeline Agent
A terminal-agent-based workflow for developing bioinformatics analysis tools and pipelines for multi-scenario, customized needs.
FlexBio-PIPA is a multi-agent system that automates the full lifecycle of bioinformatics pipeline development. A Research Orchestrator coordinates specialised sub-agents through a stage-gate workflow:
- Research Orchestrator (`ResearchOrchestratorAgent`): drives the end-to-end pipeline (requirements freeze, literature search, tool collection, benchmarking, Snakemake implementation, testing, and iterative refinement).
- Plan (`PlanAgent`): sets goals and decomposes tasks into structured plans.
- Literature Search (`LiteratureSearchAgent`): searches PubMed, arXiv, and bioRxiv for relevant papers and methods.
- Tool Collection (`ToolCollectionAgent`): searches Bioconda, Galaxy ToolShed, and Snakemake Hub for existing tools and workflows.
- Benchmark (`BenchmarkAgent`): gathers benchmark results and performance comparisons for competing methods.
- Workflow (`WorkflowAgent`): generates or refines Snakemake pipelines.
- Build (`BuildAgent`): autonomously builds custom tools/workflows when ready-made solutions are unavailable.
- Test Plan (`TestPlanAgent`): designs test plans with datasets and validation rules.
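
The stage-gate idea behind this list can be illustrated with a minimal sketch. Everything in it is hypothetical: `Stage`, `run_stages`, and the gate predicates are illustrative names chosen for this example, not FlexBio-PIPA's actual API. Each stage produces an artifact, and the run only advances once that stage's gate accepts it.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Stage:
    """Hypothetical stage: a worker that produces an artifact plus a gate check."""
    name: str
    work: Callable[[Dict[str, str]], str]  # receives earlier artifacts, returns a new one
    gate: Callable[[str], bool]            # True if the artifact is good enough to proceed


def run_stages(stages: List[Stage], max_retries: int = 1) -> Dict[str, str]:
    """Run stages in order; stop at the first stage whose gate keeps failing."""
    artifacts: Dict[str, str] = {}
    for stage in stages:
        for _attempt in range(max_retries + 1):
            artifact = stage.work(artifacts)
            if stage.gate(artifact):
                artifacts[stage.name] = artifact
                break
        else:
            # Gate never passed: halt so later stages do not build on bad input.
            raise RuntimeError(f"stage {stage.name!r} did not pass its gate")
    return artifacts


# Two toy stages standing in for "requirements freeze" and "literature search".
stages = [
    Stage("requirements", lambda a: "RNA-seq DE pipeline", lambda out: bool(out)),
    Stage("literature", lambda a: f"papers for {a['requirements']}", lambda out: "papers" in out),
]
result = run_stages(stages)
print(result["literature"])
```

In this sketch a failing gate raises instead of silently continuing, which mirrors the intent of a stage gate: later stages never consume an artifact that was not accepted.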
Each agent is enhanced with curated scientific skills drawn from
K-Dense-AI/claude-scientific-skills (19 skills covering literature
databases, bioinformatics libraries, and scientific methodology). Skills
are automatically installed during project scaffolding and scoped
per-agent via permission.skill frontmatter.
The LLM backend is OpenAI API-compatible, making it usable with:
| Backend | Notes |
|---|---|
| OpenCode | Primary target (native agent support) |
| Claude Code | Theoretically compatible |
| Codex / OpenAI | Theoretically compatible |
| Any OpenAI-compatible server | e.g. Ollama, LM Studio |
pip install flexbio-pipa
# or from source:
git clone https://github.com/WatchmanGu/FlexBio-PIPA.git
cd FlexBio-PIPA
pip install -e ".[dev]"

Copy the example configuration and set your API key:
cp config/config.example.yaml config/config.yaml
export OPENAI_API_KEY="your-api-key"
# or set in config/config.yaml

FlexBio-PIPA can run inside any container runtime as long as the project workspace is mounted and the OpenCode-compatible LLM endpoint is reachable from inside the container.
- Mount the project directory as a writable volume so generated artifacts stay on the host.
- Mount or inject `config/config.yaml` and any secrets the same way you would for a local run.
- Run commands from the mounted project root so `AGENTS.md`, `.opencode/`, and the stage directories stay aligned.
<container-runtime> run --rm \
-v "$PWD:/workspace" \
-w /workspace \
-e OPENAI_API_KEY="$OPENAI_API_KEY" \
<flexbio-pipa-image> \
flexbio-pipa research-run --project-dir /workspace --task "..."

- Start an OpenCode-compatible server separately, for example with `opencode serve`.
- Set `llm.base_url` to the reachable endpoint exposed by that runtime.
- If the runtime is outside the container, replace `localhost` with the host or service name visible from the container.
- Keep the project-local OpenCode layout on the mounted workspace so OpenCode can read `AGENTS.md`, `.opencode/agents/`, `.opencode/skills/`, and `.opencode/tools/`.
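
The `localhost` substitution described in the list above can be sketched in a few lines of Python. This is an illustrative helper only: `rewrite_localhost` is a hypothetical name, and `host.docker.internal` is just one example of a host name that some container runtimes expose; substitute whatever name is actually visible from your container.

```python
from urllib.parse import urlsplit, urlunsplit


def rewrite_localhost(base_url: str, container_visible_host: str) -> str:
    """Swap a loopback host in base_url for a host name reachable from the container."""
    parts = urlsplit(base_url)
    if parts.hostname not in {"localhost", "127.0.0.1"}:
        return base_url  # already points somewhere else; leave it untouched
    netloc = container_visible_host
    if parts.port is not None:
        netloc = f"{netloc}:{parts.port}"  # keep the original port
    return urlunsplit((parts.scheme, netloc, parts.path, parts.query, parts.fragment))


# Example host name only; use the name your runtime actually provides.
print(rewrite_localhost("http://localhost:4096/v1", "host.docker.internal"))
```

The resulting URL is what you would place in `llm.base_url` for a run inside the container.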
llm:
  base_url: "http://<reachable-host>:4096/v1"
  api_key: "${OPENAI_API_KEY}"

If you usually create a new directory for each client request or pipeline project, FlexBio-PIPA now supports a project-local OpenCode layout.
The idea is simple:
- each project gets its own `AGENTS.md`
- each project gets its own `.opencode/skills/`
- each project keeps its own requirements, literature, tools, implementation, tests, and reports in one place
mkdir my-rnaseq-project
cd my-rnaseq-project
python -m venv .venv
source .venv/bin/activate
# Recommended while developing this agents team from source
pip install -e /path/to/FlexBio-PIPA
# Or install from a package release instead
# pip install flexbio-pipa

You can use either `config/config.yaml` or `config.yaml` inside the project
directory.
If you want a tiny starting point, copy the included templates:
mkdir -p config
cp /path/to/FlexBio-PIPA/examples/opencode_project/config.yaml config/config.yaml
cp /path/to/FlexBio-PIPA/examples/opencode_project/project.yaml ./project.yaml

Example `config/config.yaml`:
llm:
  base_url: "https://api.openai.com/v1"
  api_key: "${OPENAI_API_KEY}"
  model: "gpt-4o"
research:
  execution_profile: "local"
  auto_execute: false
  dry_run: true
  max_refinement_cycles: 1
  scientific_skills_root: "/path/to/claude-scientific-skills/scientific-skills"

Then export your API key if needed:
export OPENAI_API_KEY="your-api-key"

# from the task text directly
flexbio-pipa init-project \
--project-dir . \
--task "Build an RNA-seq differential expression pipeline"
# or from the example task template you copied
flexbio-pipa init-project \
--project-dir . \
--task-file project.yaml

This creates:

- `AGENTS.md` - project-level agent instructions for OpenCode
- `.opencode/agents/` - project-local agent definitions (one `.md` per FlexBio-PIPA agent)
- `.opencode/skills/` - project-local skill overrides
- `.opencode/tools/` - project-local custom tools
- `00_requirements/` through `05_reports/` - the working artifact layout

OpenCode should be started in the project directory you just initialized.
Important project files:
- `AGENTS.md` - project-specific working rules
- `.opencode/agents/` - project-specific agent definitions for OpenCode
- `.opencode/skills/` - project-specific skills you want OpenCode to use
- `.opencode/tools/` - project-specific custom tools
- `00_requirements/intake.md` - the main task description for the project

Skill precedence is:
- project-local `.opencode/skills/`
- shared scientific skills from `research.scientific_skills_root`
- repo-default skills bundled with FlexBio-PIPA
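
The precedence order above amounts to a first-match lookup across three directories. The sketch below shows one way that lookup could work; `resolve_skill` and the temporary directory layout are assumptions made for illustration, and only the `SKILL.md` file name comes from this README.

```python
import tempfile
from pathlib import Path
from typing import List, Optional


def resolve_skill(name: str, roots: List[Path]) -> Optional[Path]:
    """Return the first <root>/<name>/SKILL.md that exists, scanning roots in priority order."""
    for root in roots:
        candidate = root / name / "SKILL.md"
        if candidate.is_file():
            return candidate
    return None


with tempfile.TemporaryDirectory() as tmp:
    base = Path(tmp)
    project = base / "project" / ".opencode" / "skills"  # highest priority
    shared = base / "scientific-skills"                   # research.scientific_skills_root
    bundled = base / "bundled"                            # repo-default skills
    fixtures = [
        (project, "biopython", "project-local"),
        (shared, "biopython", "shared"),
        (bundled, "pysam", "bundled"),
    ]
    for root, skill, label in fixtures:
        (root / skill).mkdir(parents=True)
        (root / skill / "SKILL.md").write_text(f"from {label}\n")

    roots = [project, shared, bundled]
    winner_text = resolve_skill("biopython", roots).read_text().strip()
    fallback_text = resolve_skill("pysam", roots).read_text().strip()
    missing = resolve_skill("scanpy", roots)

print(winner_text)    # the project-local copy shadows the shared one
print(fallback_text)  # only the bundled root provides pysam here
print(missing)        # no root provides scanpy in this toy layout
```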
# Use the task already stored in 00_requirements/intake.md
flexbio-pipa research-run --project-dir . --no-execute
# Or use the project task template directly
flexbio-pipa research-run --project-dir . --task-file project.yaml --no-execute
# Or override the task for a specific run
flexbio-pipa research-run \
--project-dir . \
--task "Build a metagenomic assembly pipeline" \
--no-execute
# Run local execution as well
flexbio-pipa research-run --project-dir . --execute

During a run, FlexBio-PIPA writes structured artifacts into:

- `00_requirements/` - frozen scope, clarifications, skill audit
- `01_literature/` - search outputs and benchmark evidence
- `02_tools/` - tool candidates and shortlist
- `03_implementation/` - Snakemake files and implementation artifacts
- `04_tests/` - test plan and validation inputs
- `05_reports/` - execution analysis, revision requests, final summary

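
To see how far a run has progressed, it can help to count the files in each stage directory. The helper below is a small sketch for that purpose; `stage_summary` is a hypothetical function written for this example and is not part of the FlexBio-PIPA CLI. Only the directory names are taken from the list above.

```python
import tempfile
from pathlib import Path
from typing import Dict

# Stage directory names as listed in this README.
STAGE_DIRS = [
    "00_requirements",
    "01_literature",
    "02_tools",
    "03_implementation",
    "04_tests",
    "05_reports",
]


def stage_summary(project_dir: Path) -> Dict[str, int]:
    """Count files under each stage directory; a missing directory counts as 0."""
    summary: Dict[str, int] = {}
    for name in STAGE_DIRS:
        stage = project_dir / name
        if stage.is_dir():
            summary[name] = sum(1 for path in stage.rglob("*") if path.is_file())
        else:
            summary[name] = 0
    return summary


# Demo on a throwaway directory containing only an intake file.
with tempfile.TemporaryDirectory() as tmp:
    project = Path(tmp)
    (project / "00_requirements").mkdir()
    (project / "00_requirements" / "intake.md").write_text("Build an RNA-seq pipeline\n")
    summary = stage_summary(project)

for name, count in summary.items():
    print(f"{name}: {count} file(s)")
```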
Typical project-local customization points are:
- edit `AGENTS.md` for project-specific collaboration rules
- add or override skills in `.opencode/skills/`
- update `00_requirements/intake.md` as requirements change
- keep local execution defaults in `config/config.yaml`
This makes the workflow convenient for OpenCode: each new development request can live in its own self-contained project folder, with its own agent rules, skills, artifacts, and reports.
# Interactive mode — provide a task description
flexbio-pipa run --task "Develop a variant calling pipeline for WGS data"
# Run with a YAML task file
flexbio-pipa run --task-file examples/wgs_variant_calling.yaml
# List available agents
flexbio-pipa list-agents
# Run only the planning agent
flexbio-pipa plan --task "RNA-seq differential expression analysis"

from flexbio_pipa.agents import PlanAgent, BuildAgent
from flexbio_pipa.utils.config import load_config
config = load_config("config/config.yaml")
# Create a plan for a bioinformatics task
plan_agent = PlanAgent(config=config)
plan = plan_agent.run("Develop a metagenomics classification pipeline")
print(plan.goals)
print(plan.steps)
# Execute the plan — build agents, search literature, collect tools
build_agent = BuildAgent(config=config)
result = build_agent.run(plan)

flexbio_pipa/
├── agents/
│ ├── base.py # BaseAgent, AgentResult, Message
│ ├── sub_agent.py # SubAgent base class
│ ├── plan_agent.py # PlanAgent — goal setting & planning
│ ├── build_agent.py # BuildAgent — tool/workflow development
│ ├── research_orchestrator_agent.py # ResearchOrchestratorAgent — end-to-end orchestration
│ ├── literature_search_agent.py # PubMed / arXiv / bioRxiv search
│ ├── tool_collection_agent.py # Tool/workflow collection
│ ├── benchmark_agent.py # Benchmark result collection
│ ├── test_plan_agent.py # Test plan design & validation
│ └── workflow_agent.py # Snakemake pipeline generation
├── research/
│ ├── workspace.py # Project scaffolding, agent templates, skill installation
│ ├── skills.py # ScientificSkillRegistry — local skill audit & discovery
│ └── artifacts.py # Structured artifact I/O for stage directories
├── execution/
│ ├── base.py # ExecutionProfile ABC
│ ├── local.py # LocalExecutionProfile — Snakemake dry-run / local execution
│ └── parser.py # Execution output parsing and error extraction
├── tools/
│ ├── pubmed.py # PubMed E-utilities wrapper
│ ├── arxiv.py # arXiv API wrapper
│ ├── conda.py # Conda/Bioconda package search
│ ├── galaxy.py # Galaxy ToolShed search
│ ├── snakemake_hub.py # Snakemake workflow hub search
│ └── code_generator.py # LLM-based code generation
├── workflows/
│ ├── base.py # Workflow base class
│ └── snakemake.py # Snakemake workflow builder
├── utils/
│ ├── config.py # YAML configuration loader with ${ENV_VAR} resolution
│ ├── llm.py # LLM client (OpenAI-compatible)
│ └── logger.py # Rich-based logger
└── cli.py # Click CLI — init-project, research-run, run, plan, list-agents
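
The tree describes `utils/config.py` as a YAML loader with `${ENV_VAR}` resolution. The sketch below shows one plausible way to implement that substitution on an already-parsed config. It is not the project's code: `resolve_env` is a hypothetical name, and leaving unknown variables untouched is a design choice assumed for this example (the real loader may raise an error or substitute an empty string instead).

```python
import os
import re
from typing import Any

# Matches ${NAME} where NAME is a conventional environment variable name.
_ENV_PATTERN = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def resolve_env(value: Any) -> Any:
    """Recursively replace ${NAME} in strings with os.environ[NAME]; unknown names stay as-is."""
    if isinstance(value, str):
        return _ENV_PATTERN.sub(lambda m: os.environ.get(m.group(1), m.group(0)), value)
    if isinstance(value, dict):
        return {key: resolve_env(item) for key, item in value.items()}
    if isinstance(value, list):
        return [resolve_env(item) for item in value]
    return value  # numbers, booleans, None pass through unchanged


os.environ["OPENAI_API_KEY"] = "sk-example"  # placeholder value for the demo
config = {
    "llm": {"api_key": "${OPENAI_API_KEY}", "timeout": 120},
    "extra": ["${FLEXBIO_PIPA_UNSET_EXAMPLE}"],  # assumed to be unset in the environment
}
resolved = resolve_env(config)
print(resolved["llm"]["api_key"])
```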
When init-project scaffolds a new project, it creates:
my-pipeline-project/
├── AGENTS.md # Project-level agent instructions
├── .opencode/
│ ├── agents/ # One .md per FlexBio-PIPA agent (7 agents)
│ │ ├── research-orchestrator.md
│ │ ├── pipeline-planner.md
│ │ ├── literature-search.md
│ │ ├── tool-collection.md
│ │ ├── benchmark.md
│ │ ├── workflow.md
│ │ └── test-plan.md
│ ├── skills/ # Curated scientific skills (up to 19)
│ │ ├── flexbio-pipa-project-workspace/
│ │ ├── pubmed-database/
│ │ ├── arxiv-database/
│ │ ├── biorxiv-database/
│ │ ├── literature-review/
│ │ ├── biopython/
│ │ ├── ...
│ │ └── scientific-writing/
│ └── tools/ # Custom tool definitions (placeholder)
├── 00_requirements/ # Intake, requirement freeze, skill audit
├── 01_literature/ # Search strategies, evidence summaries
├── 02_tools/ # Tool candidates, shortlist
├── 03_implementation/ # Snakefile, config, scripts, envs
├── 04_tests/ # Test plan, datasets, validation rules
└── 05_reports/ # Execution reports, final recommendation
FlexBio-PIPA integrates 19 curated scientific skills from
K-Dense-AI/claude-scientific-skills
to give each agent domain-specific knowledge. Skills are installed into
.opencode/skills/ during init-project and scoped per-agent so each
agent only sees relevant skills.
| Agent | Skills |
|---|---|
| research-orchestrator | scientific-writing, scientific-brainstorming |
| pipeline-planner | scientific-brainstorming, hypothesis-generation |
| literature-search | pubmed-database, arxiv-database, biorxiv-database, literature-review |
| tool-collection | gget, ensembl-database, bioservices |
| benchmark | scientific-critical-thinking, peer-review |
| workflow | biopython, pysam, deeptools, pydeseq2, scanpy, scikit-bio |
| test-plan | scientific-critical-thinking |
All agents also have access to the built-in flexbio-pipa-project-workspace
skill, which describes the project layout and conventions.
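
As a rough illustration of per-agent scoping, the sketch below turns part of the table above into a deny-by-default rule set and renders it as a frontmatter-like block. The exact keys and layout that `init-project` writes are not shown in this README, so the rendered format here is an assumption, as are the helper names `skill_permissions` and `render_frontmatter`.

```python
from typing import Dict, List

# Subset of the agent-to-skill table above, used as example input.
AGENT_SKILLS: Dict[str, List[str]] = {
    "literature-search": [
        "pubmed-database",
        "arxiv-database",
        "biorxiv-database",
        "literature-review",
    ],
    "benchmark": ["scientific-critical-thinking", "peer-review"],
}
SHARED_SKILL = "flexbio-pipa-project-workspace"  # available to all agents


def skill_permissions(agent: str) -> Dict[str, str]:
    """Deny every skill by default, then allow the shared skill and the agent's mapped skills."""
    rules = {"*": "deny", SHARED_SKILL: "allow"}
    for skill in AGENT_SKILLS.get(agent, []):
        rules[skill] = "allow"
    return rules


def render_frontmatter(agent: str) -> str:
    """Render the rules as a YAML-style block (assumed layout, for illustration only)."""
    lines = ["---", "permission:", "  skill:"]
    for pattern, action in skill_permissions(agent).items():
        lines.append(f'    "{pattern}": {action}')
    lines.append("---")
    return "\n".join(lines)


print(render_frontmatter("benchmark"))
```

Starting from a wildcard deny and adding explicit allows keeps the rule set short and makes an unmapped skill unavailable by default.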
1. Set `research.scientific_skills_root` in your config to point at a local clone of the scientific skills repository:

       research:
         scientific_skills_root: "/path/to/claude-scientific-skills/scientific-skills"

2. Run `flexbio-pipa init-project`. The scaffolder copies each skill's `SKILL.md` plus any `references/`, `scripts/`, and `assets/` subdirectories into the project's `.opencode/skills/<name>/` folder.
3. Each agent's frontmatter includes a `permission.skill` block that allows only its mapped skills and denies everything else. This keeps agent context focused and prevents irrelevant skill loading.

Existing skills in the target directory are never overwritten, so project-local customizations are preserved across re-runs.
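
The never-overwrite behaviour can be sketched as a copy step that skips any skill folder already present in the target. `install_skills` is a hypothetical helper written for this example, not the scaffolder itself, and for brevity it copies the whole skill folder rather than only `SKILL.md` and the `references/`, `scripts/`, and `assets/` subdirectories.

```python
import shutil
import tempfile
from pathlib import Path
from typing import List


def install_skills(source_root: Path, target_root: Path) -> List[str]:
    """Copy each skill folder containing SKILL.md; skip any that already exist in the target."""
    installed: List[str] = []
    target_root.mkdir(parents=True, exist_ok=True)
    for skill_dir in sorted(source_root.iterdir()):
        if not (skill_dir / "SKILL.md").is_file():
            continue  # not a skill folder
        destination = target_root / skill_dir.name
        if destination.exists():
            continue  # preserve project-local customizations
        shutil.copytree(skill_dir, destination)
        installed.append(skill_dir.name)
    return installed


with tempfile.TemporaryDirectory() as tmp:
    source = Path(tmp) / "scientific-skills"
    target = Path(tmp) / "project" / ".opencode" / "skills"
    for name in ("biopython", "pysam"):
        (source / name).mkdir(parents=True)
        (source / name / "SKILL.md").write_text(f"upstream {name}\n")
    # Simulate a skill the user has already customized in the project.
    (target / "biopython").mkdir(parents=True)
    (target / "biopython" / "SKILL.md").write_text("customized locally\n")

    first_run = install_skills(source, target)
    second_run = install_skills(source, target)  # re-run installs nothing new
    kept = (target / "biopython" / "SKILL.md").read_text().strip()

print(first_run, second_run, kept)
```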
- Project-local `.opencode/skills/` (highest priority)
- Scientific skills installed from `research.scientific_skills_root`
- Repo-default skills bundled with FlexBio-PIPA
config/config.yaml:
llm:
  base_url: "https://api.openai.com/v1"  # or your OpenCode/Ollama URL
  api_key: "${OPENAI_API_KEY}"
  model: "gpt-4o"
  temperature: 0.2
  max_tokens: 4096
  timeout: 120
agents:
  # OpenCode agent model assignment (used by init-project)
  opencode_agents:
    models:
      strong: "github-copilot/claude-opus-4.6"  # orchestrator, planner, workflow
      fast: "github-copilot/gpt-5.4-mini"       # lit-search, tool-collection, benchmark, test-plan
    # Per-agent overrides (optional):
    # overrides:
    #   workflow: "github-copilot/claude-sonnet-4"
  plan:
    max_iterations: 3
  build:
    max_iterations: 10
  research_orchestrator:
    max_stages: 8
  literature_search:
    max_results: 20
    databases: ["pubmed", "arxiv"]
  tool_collection:
    sources: ["bioconda", "galaxy", "snakemake_hub"]
    max_results: 10
  benchmark:
    max_results: 10
  test_plan:
    test_data_sources: ["sra", "zenodo"]
  workflow:
    engine: snakemake
research:
  workspace_root: "workspace/research"
  scientific_skills_root: "/path/to/claude-scientific-skills/scientific-skills"
  execution_profile: "local"
  auto_execute: false
  dry_run: true
  max_refinement_cycles: 1
  snakemake:
    cores: 1
logging:
  level: "INFO"
  file: null

The `agents.opencode_agents.models` section controls which LLM model is
written into each agent's .md frontmatter during init-project:
| Tier | Default | Agents |
|---|---|---|
| strong | claude-opus-4.6 | research-orchestrator, pipeline-planner, workflow |
| fast | gpt-5.4-mini | literature-search, tool-collection, benchmark, test-plan |
Use agents.opencode_agents.overrides to assign a specific model to any
individual agent.
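
The tier-plus-override rule can be expressed as a small lookup. The sketch below is illustrative only: `model_for_agent` is a hypothetical helper, and the assumption that override keys match the agent names used in the table is not confirmed by this README.

```python
from typing import Dict, Optional

# Agent-to-tier mapping taken from the table above.
AGENT_TIERS: Dict[str, str] = {
    "research-orchestrator": "strong",
    "pipeline-planner": "strong",
    "workflow": "strong",
    "literature-search": "fast",
    "tool-collection": "fast",
    "benchmark": "fast",
    "test-plan": "fast",
}


def model_for_agent(
    agent: str,
    models: Dict[str, str],
    overrides: Optional[Dict[str, str]] = None,
) -> str:
    """Pick the per-agent override if present, otherwise the model for the agent's tier."""
    if overrides and agent in overrides:
        return overrides[agent]
    return models[AGENT_TIERS[agent]]


# Values mirror the example configuration above.
models = {
    "strong": "github-copilot/claude-opus-4.6",
    "fast": "github-copilot/gpt-5.4-mini",
}
overrides = {"workflow": "github-copilot/claude-sonnet-4"}

print(model_for_agent("benchmark", models, overrides))
print(model_for_agent("workflow", models, overrides))
```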
pip install -e ".[dev]"
# Run the full test suite (91 tests)
pytest tests/ -v
# Lint and type-check
ruff check src/
mypy src/
# Format
black src/ tests/

| Package | Purpose |
|---|---|
| openai | LLM chat completion API |
| click | CLI framework |
| requests | HTTP for PubMed/arXiv/Conda/Galaxy |
| tenacity | Retry with exponential backoff |
| pyyaml | YAML config loading |
| rich | Terminal output and logging |
| pydantic | Data validation |
| jinja2 | Templating |

Dev: pytest, pytest-cov, pytest-mock, responses, ruff, black, mypy.
MIT