Skip to content

dbcls/bsllmner-mk2

Repository files navigation

bsllmner-mk2

A tool for extracting biological named entities from BioSample records using Large Language Models (LLMs) and mapping them to ontology terms.

Key capabilities:

  • Extract mode -- Performs Named Entity Recognition (NER) to extract terms such as cell line, tissue, and organism from BioSample metadata
  • Select mode -- Extends extract mode by mapping extracted terms to ontology entries (Cellosaurus, UBERON, Cell Ontology, etc.)

bsllmner-mk2 uses Ollama as the LLM inference server.

Quick Start

docker compose up -d --build
docker compose exec app bsllmner2_extract \
  --bs-entries tests/data/example_biosample.json \
  --model llama3.1:70b --debug

For a complete walkthrough including ontology setup and Select mode, see Getting Started.

Documentation

Full documentation is available at https://dbcls.github.io/bsllmner-mk2.

Basics

Features

Operations

  • ChIP-Atlas -- Processing ChIP-Atlas data (hg38/mm10)
  • NIG Slurm -- Running on NIG Slurm environment

Development

  • Development -- Development environment setup
  • Testing -- Unit tests, linting, mutation testing, model evaluation

Related Resources

Other Interfaces

bsllmner-mk2 also includes a FastAPI server (bsllmner2_api) and a React-based web UI, but these are not actively maintained and their operation is unverified.

License

This repository is released under the MIT License. For details, see the LICENSE file.

About

A tool for extracting biological named entities from BioSample records using Large Language Models (LLMs) and mapping them to ontology terms.

Resources

License

Stars

Watchers

Forks

Packages

 
 
 

Contributors