kmtools

A collection of utilities extending the functionality of km, providing parallelisation, filtering, merging, and plotting of km find_mutation results.

kmtools is designed to streamline large-scale km workflows by allowing you to:

Run km find_mutation across multiple threads via target sequence chunking
Merge and filter results
Generate summary plots for downstream analysis

Installation

Using pipx (recommended)

pipx install git+https://github.com/trentzz/kmtools

kmtools is not yet on PyPI (it may be published there eventually), so it is installed directly from the GitHub repository. pipx is recommended because it isolates kmtools from your system Python environment.

Prerequisites

Python ≥ 3.10
km must be installed and accessible in your PATH.

To install km via pipx:

pipx install km-walk

Any of the installation methods listed in the km documentation will also work.
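
The two prerequisites can be checked before running anything. The snippet below is a minimal sketch, not part of kmtools: the function name `check_prerequisites` is hypothetical, and it only uses the standard library to test the Python version and look for a `km` executable on the PATH.

```python
import shutil
import sys


def check_prerequisites(min_python=(3, 10), executable="km"):
    """Return a list of human-readable problems; an empty list means none found.

    Hypothetical helper for illustration; kmtools may perform its own checks.
    """
    problems = []
    if sys.version_info < min_python:
        problems.append(
            f"Python {min_python[0]}.{min_python[1]}+ required, "
            f"found {sys.version_info.major}.{sys.version_info.minor}"
        )
    # shutil.which returns None when the executable is not on the PATH
    if shutil.which(executable) is None:
        problems.append(f"'{executable}' not found on PATH")
    return problems


if __name__ == "__main__":
    for problem in check_prerequisites():
        print(problem)
```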

Overview of Commands

Command	Description
`kmtools chunk`	Run multiple instances of `km find_mutation` in parallel using target sequence chunking.
`kmtools merge`	Merge multiple `km` outputs into a combined result.
`kmtools filter`	Filter `km` results based on a reference or other criteria.
`kmtools plot`	Generate visualisations from filtered results.
`kmtools runall`	Run the full pipeline (`chunk → merge → filter → plot`) automatically.

Usage

`kmtools chunk`

Run km find_mutation concurrently across multiple threads by splitting target sequences.

kmtools chunk \
    --threads 8 \
    --km-find-mutation-options "--ratio 0.0001 targets_dir database.jf" \
    --merge \
    --verbose

Arguments

Argument	Description
`--threads`	Number of threads (parallel `km` processes).
`--km-find-mutation-options`	Options passed directly to `km find_mutation`. Do not include output redirection (`>`).
`--merge`	Merge chunked results automatically after processing.
`--verbose`	Enable detailed logging and timing information.

Process

Splits the list of target sequences into chunks (one per thread).
Runs km find_mutation on each chunk in parallel.
Optionally merges all results when complete.
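
The split-then-run idea can be sketched in a few lines of Python. This is an illustration under assumptions, not the kmtools implementation: `split_into_chunks` and `run_chunks` are hypothetical names, the split is assumed to be contiguous and near-equal, and a thread pool stands in for however kmtools actually launches its `km` processes. The `worker` argument is a placeholder for whatever runs `km find_mutation` on one chunk.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(items, n_chunks):
    """Split items into at most n_chunks contiguous, near-equal parts."""
    if n_chunks < 1:
        raise ValueError("n_chunks must be at least 1")
    if not items:
        return []
    # Never create more chunks than there are items
    n_chunks = min(n_chunks, len(items))
    base, extra = divmod(len(items), n_chunks)
    chunks, start = [], 0
    for i in range(n_chunks):
        # The first `extra` chunks receive one additional item
        size = base + (1 if i < extra else 0)
        chunks.append(items[start:start + size])
        start += size
    return chunks


def run_chunks(chunks, worker, threads):
    """Apply worker to each chunk concurrently, preserving chunk order."""
    with ThreadPoolExecutor(max_workers=threads) as pool:
        return list(pool.map(worker, chunks))
```

With five targets and two threads, this split gives one chunk of three targets and one of two; asking for more chunks than targets yields one target per chunk.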

`kmtools merge`

Combine multiple chunk outputs into a single result file.

kmtools merge \
    chunk_*/results.txt \
    --output merged.txt \
    --verbose

Arguments

Argument	Description
`inputs`	Input files or directories to merge.
`--output`	Output file for merged data.
`--verbose`	Enable detailed logging.

This is typically used after chunking to combine partial km results.
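
Merging partial results mostly comes down to concatenating tables while keeping a single header. The sketch below assumes each chunk output is a text table whose first non-empty line is a header shared by all inputs; that layout, the function name `merge_tables`, and the column names in the usage note are assumptions for illustration, and kmtools may handle more cases than this.

```python
def merge_tables(texts):
    """Concatenate header-first tables, keeping only the first header line.

    Assumes every non-empty input starts with the same header line.
    """
    merged = []
    header = None
    for text in texts:
        lines = [line for line in text.splitlines() if line.strip()]
        if not lines:
            continue  # skip empty chunk outputs
        if header is None:
            header = lines[0]
            merged.append(header)
        if lines[0] != header:
            raise ValueError("header mismatch between inputs")
        merged.extend(lines[1:])
    return "\n".join(merged) + ("\n" if merged else "")
```

For example, merging `"col_a\tcol_b\n1\t2\n"` with `"col_a\tcol_b\n3\t4\n"` keeps one `col_a`/`col_b` header followed by both data rows.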

`kmtools filter`

Filter km output based on a reference sequence or other logic.

kmtools filter \
    --reference ref.fa \
    --km-output merged.txt \
    --output filtered.txt \
    --verbose

Arguments

Argument	Description
`--reference`	Reference FASTA file.
`--km-output`	Input `km` output file to filter.
`--output`	Output file for filtered results.
`--verbose`	Enable detailed logging.

This step removes false positives or unwanted variants.

`kmtools plot`

Generate plots and summary charts from filtered results.

kmtools plot \
    filtered.txt \
    --output-dir plots \
    --charts vaf,patient,sample \
    --verbose

Arguments

Argument	Description
`file`	Filtered results file.
`--output-dir`	Directory to save plots (default: current directory).
`--charts`	Comma-separated list of charts to generate (`vaf`, `patient`, `sample`, `overall`, or `all`).
`--verbose`	Enable detailed logging.
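
The `--charts` value is a comma-separated list in which `all` stands for every chart. A minimal sketch of how such a value could be interpreted is shown here; `parse_charts` is a hypothetical name, and the choices to ignore case, reject unknown names, and drop duplicates are assumptions rather than documented kmtools behaviour.

```python
CHART_NAMES = ("vaf", "patient", "sample", "overall")


def parse_charts(value):
    """Turn a comma-separated --charts value into an ordered list of chart names."""
    requested = [name.strip().lower() for name in value.split(",") if name.strip()]
    if "all" in requested:
        return list(CHART_NAMES)
    unknown = [name for name in requested if name not in CHART_NAMES]
    if unknown:
        raise ValueError(f"unknown chart(s): {', '.join(unknown)}")
    # Drop duplicates while preserving the order given
    return list(dict.fromkeys(requested))
```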

Example outputs include:

vaf_distribution.png
patient_summary.png
sample_overview.png

`kmtools runall`

Run the full workflow (chunk → merge → filter → plot) in a single command.

kmtools runall \
    --threads 8 \
    --km-find-mutation-options "--ratio 0.0001 targets_dir database.jf" \
    --merge-inputs chunk_*/results.txt \
    --merge-output merged.txt \
    --reference ref.fa \
    --filtered-output filtered.txt \
    --output-dir plots \
    --charts all \
    --verbose

Arguments

Argument	Description
`--threads`	Number of threads for chunking.
`--km-find-mutation-options`	Options for `km find_mutation`.
`--merge-inputs`	Files or directories to merge after chunking.
`--merge-output`	Output file for merged data.
`--reference`	Reference file for filtering.
`--filtered-output`	Output file for filtered results.
`--output-dir`	Directory for plots.
`--charts`	Charts to generate (default: `all`).
`--verbose`	Enable detailed logging and timing information.

Workflow

Splits and runs km find_mutation in parallel.
Merges chunk results.
Filters merged results using the provided reference.
Generates plots and summaries.
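
One way to read this workflow is as the four subcommands run in sequence, with each `runall` argument forwarded to the stage that uses it. The sketch below builds those four command lines from the flags documented in this README. It assumes `runall` is equivalent to such a sequence, which the description implies but does not state; `build_stage_commands` is a hypothetical helper, and the `chunk_0`/`chunk_1` paths are made-up expansions of the `chunk_*/results.txt` pattern.

```python
import shlex


def build_stage_commands(threads, km_options, merge_inputs, merge_output,
                         reference, filtered_output, output_dir, charts="all"):
    """Return the four documented kmtools invocations as argv lists."""
    return [
        ["kmtools", "chunk", "--threads", str(threads),
         "--km-find-mutation-options", km_options],
        # merge_inputs must already be a list of paths (no shell globbing here)
        ["kmtools", "merge", *merge_inputs, "--output", merge_output],
        ["kmtools", "filter", "--reference", reference,
         "--km-output", merge_output, "--output", filtered_output],
        ["kmtools", "plot", filtered_output,
         "--output-dir", output_dir, "--charts", charts],
    ]


commands = build_stage_commands(
    threads=8,
    km_options="--ratio 0.0001 targets_dir database.jf",
    merge_inputs=["chunk_0/results.txt", "chunk_1/results.txt"],  # hypothetical paths
    merge_output="merged.txt",
    reference="ref.fa",
    filtered_output="filtered.txt",
    output_dir="plots",
)
for argv in commands:
    # shlex.join quotes arguments so each line can be pasted into a shell
    print(shlex.join(argv))
```

Each stage's output file is the next stage's input: `merged.txt` feeds `filter`, and `filtered.txt` feeds `plot`.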

Notes and Best Practices

Always test your --km-find-mutation-options with a single km find_mutation run before using kmtools chunk or runall.
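If your version of kmtools supports a --working-dir option, use it to isolate intermediate outputs. The option is not listed in the argument tables above, so confirm it exists before relying on it.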
Maintain a consistent output naming scheme to simplify automated merging.
Use --verbose for debugging and benchmarking; it reports timing and process details.
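
To test an options string on its own, it helps to see the exact `km find_mutation` command it corresponds to. The sketch below is a hypothetical helper, not part of kmtools: it splits the string the way a shell would and rejects output redirection, which the `--km-find-mutation-options` description says must not be included. How kmtools itself parses the string is not documented here.

```python
import shlex


def km_command_from_options(km_options):
    """Build the argv for a single km find_mutation run from an options string."""
    tokens = shlex.split(km_options)
    # Output redirection belongs to the shell, not to km's arguments
    if any(">" in token for token in tokens):
        raise ValueError("do not include output redirection ('>') in the options")
    return ["km", "find_mutation", *tokens]


if __name__ == "__main__":
    argv = km_command_from_options("--ratio 0.0001 targets_dir database.jf")
    print(shlex.join(argv))
```

The resulting list can be passed to `subprocess.run` for a one-off check before handing the same options string to `kmtools chunk` or `kmtools runall`.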

Example Full Workflow

kmtools runall \
  --threads 8 \
  --km-find-mutation-options "--ratio 0.0001 targets_dir database.jf" \
  --merge-inputs chunk_*/results.txt \
  --merge-output merged.txt \
  --reference ref.fa \
  --filtered-output filtered.txt \
  --output-dir plots \
  --charts all \
  --verbose

Project Structure

kmtools/
├── __init__.py
├── cli.py           # Entry point for all subcommands
├── chunk.py         # Handles kmtools chunk
├── merge.py         # Handles kmtools merge
├── filter.py        # Handles kmtools filter
├── plot.py          # Handles kmtools plot
└── utils.py         # Shared utilities (logging, validation, etc.)

Future Plans

Automatic detection of chunk outputs in runall
Multiprocessing backend for chunking
Extended filtering and visualisation modules
Support for JSON and TSV output formats
