open2c · nvictus · Dec 2, 2025
diff --git a/.github/dependabot.yml b/.github/dependabot.yml
@@ -5,9 +5,9 @@ updates:
     schedule:
       interval: "weekly"
     groups:
-          actions:
-            patterns:
-              - "*"
+      actions:
+        patterns:
+          - "*"
   - package-ecosystem: "pip"
     directory: "/"
     schedule:

diff --git a/.github/workflows/ci.yml b/.github/workflows/ci.yml
@@ -2,22 +2,21 @@ name: CI
 
 on:
   push:
-    branches: [ main ]
+    branches: [main]
 
   pull_request:
-    branches: [ main ]
+    branches: [main]
 
 concurrency:
   group: ${{ github.workflow }}-${{ github.ref }}
   cancel-in-progress: true
 
 jobs:
-
   Test:
     runs-on: ubuntu-latest
     strategy:
       matrix:
-        python-version: [ "3.10", "3.11", "3.12", "3.13" ]
+        python-version: ["3.10", "3.11", "3.12", "3.13"]
     steps:
       - uses: actions/checkout@v6
       - name: Set up Python ${{ matrix.python-version }}

diff --git a/.pre-commit-config.yaml b/.pre-commit-config.yaml
@@ -1,3 +1,7 @@
+ci:
+  autoupdate_schedule: monthly
+  autoupdate_commit_msg: "chore: Update pre-commit hooks"
+  autofix_commit_msg: "style: Pre-commit fixes"
 repos:
   - repo: https://github.com/pre-commit/pre-commit-hooks
     rev: v6.0.0
@@ -11,6 +15,22 @@ repos:
   - repo: https://github.com/astral-sh/ruff-pre-commit
     rev: v0.14.7
     hooks:
-      - id: ruff
+      - id: ruff-check
         types_or: [python, pyi, jupyter]
         args: [--fix, --show-fixes, --exit-non-zero-on-fix]
+      - id: ruff-format
+
+  - repo: https://github.com/pre-commit/pygrep-hooks
+    rev: v1.10.0
+    hooks:
+      - id: python-no-log-warn
+      - id: rst-backticks
+      - id: rst-directive-colons
+      - id: rst-inline-touching-normal
+      - id: text-unicode-replacement-char
+
+  - repo: https://github.com/rbubley/mirrors-prettier
+    rev: v3.7.3
+    hooks:
+      - id: prettier
+        args: ["--cache-location=.prettier_cache/cache"]
diff --git a/.readthedocs.yaml b/.readthedocs.yaml
@@ -2,6 +2,8 @@
 # Read the Docs configuration file
 # See https://docs.readthedocs.io/en/stable/config-file/v2.html for details
 
+version: 2
+
 build:
   os: ubuntu-24.04
   tools:

diff --git a/CHANGES.md b/CHANGES.md
diff --git a/CITATION.cff b/CITATION.cff
@@ -2,14 +2,14 @@ cff-version: 1.2.0
 type: software
 title: bioframe
 license: MIT
-repository-code: 'https://github.com/open2c/bioframe'
+repository-code: "https://github.com/open2c/bioframe"
 message: >-
   If you use this software, please cite it using the
   metadata from this file.
 authors:
   - given-names: Nezar
     family-names: Abdennur
-    orcid: 'https://orcid.org/0000-0001-5814-0864'
+    orcid: "https://orcid.org/0000-0001-5814-0864"
   - given-names: Geoffrey
     family-names: Fudenberg
     orcid: "https://orcid.org/0000-0001-5905-6517"
@@ -57,7 +57,7 @@ preferred-citation:
     - family-names: Open2C
     - given-names: Nezar
       family-names: Abdennur
-      orcid: 'https://orcid.org/0000-0001-5814-0864'
+      orcid: "https://orcid.org/0000-0001-5814-0864"
     - given-names: Geoffrey
       family-names: Fudenberg
       orcid: "https://orcid.org/0000-0001-5905-6517"

diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md
@@ -1,22 +1,19 @@
 # Contributing
 
-
 ## General guidelines
 
 If you haven't contributed to open-source before, we recommend you read [this excellent guide by GitHub on how to contribute to open source](https://opensource.guide/how-to-contribute). The guide is long, so you can gloss over things you're familiar with.
 
 If you're not already familiar with it, we follow the [fork and pull model](https://help.github.com/articles/about-collaborative-development-models) on GitHub. Also, check out this recommended [git workflow](https://www.asmeurer.com/git-workflow/).
 
-
 ## Contributing Code
 
 This project has a number of requirements for all code contributed.
 
-* We follow the [PEP-8 style](https://www.python.org/dev/peps/pep-0008/) convention.
-* We use [NumPy-style docstrings](https://numpydoc.readthedocs.io/en/latest/format.html).
-* It's ideal if user-facing API changes or new features have documentation added.
-* It is best if all new functionality and/or bug fixes have unit tests added with each use-case.
-
+- We follow the [PEP-8 style](https://www.python.org/dev/peps/pep-0008/) convention.
+- We use [NumPy-style docstrings](https://numpydoc.readthedocs.io/en/latest/format.html).
+- It's ideal if user-facing API changes or new features have documentation added.
+- It is best if all new functionality and/or bug fixes have unit tests added with each use-case.
 
 ## Setting up Your Development Environment
 
@@ -96,7 +93,6 @@ This will build the documentation and serve it on a local http server which list
 
 Documentation from the `main` branch and tagged releases is automatically built and hosted on [readthedocs](https://readthedocs.org/).
 
-
 ## Acknowledgments
 
 This document is based off of the [guidelines from the sparse project](https://github.com/pydata/sparse/blob/master/docs/contributing.rst).
diff --git a/README.md b/README.md
@@ -14,9 +14,9 @@ Bioframe enables flexible and scalable operations on genomic interval dataframes
 
 Bioframe is built directly on top of [Pandas](https://pandas.pydata.org/). Bioframe provides:
 
-* A variety of genomic interval operations that work directly on dataframes.
-* Operations for special classes of genomic intervals, including chromosome arms and fixed-size bins.
-* Conveniences for diverse tabular genomic data formats and loading genome assembly summary information.
+- A variety of genomic interval operations that work directly on dataframes.
+- Operations for special classes of genomic intervals, including chromosome arms and fixed-size bins.
+- Conveniences for diverse tabular genomic data formats and loading genome assembly summary information.
 
 Read the [documentation](https://bioframe.readthedocs.io/en/latest/), including the [guide](https://bioframe.readthedocs.io/en/latest/guide-intervalops.html), as well as the [publication](https://doi.org/10.1093/bioinformatics/btae088) for more information.
 
@@ -34,10 +34,10 @@ pip install bioframe
 
 Interested in contributing to bioframe? That's great! To get started, check out the [contributing guide](https://github.com/open2c/bioframe/blob/main/CONTRIBUTING.md). Discussions about the project roadmap take place on the [Open2C Discord](https://discord.com/invite/qVfSbDYHNG) server and regular developer meetings scheduled there. Anyone can join and participate!
 
-
 ## Interval operations
 
 Key genomic interval operations in bioframe include:
+
 - `overlap`: Find pairs of overlapping genomic intervals between two dataframes.
 - `closest`: For every interval in a dataframe, find the closest intervals in a second dataframe.
 - `cluster`: Group overlapping intervals in a dataframe into clusters.
@@ -46,6 +46,7 @@ Key genomic interval operations in bioframe include:
 Bioframe additionally has functions that are frequently used for genomic interval operations and can be expressed as combinations of these core operations and dataframe operations, including: `coverage`, `expand`, `merge`, `select`, and `subtract`.
 
 To `overlap` two dataframes, call:
+
 ```python
 import bioframe as bf
 
@@ -62,8 +63,8 @@ For these two input dataframes, with intervals all on the same chromosome:
 <img src="https://github.com/open2c/bioframe/raw/main/docs/figs/overlap_inner_0.png" width=60%>
 <img src="https://github.com/open2c/bioframe/raw/main/docs/figs/overlap_inner_1.png" width=60%>
 
-
 To `merge` all overlapping intervals in a dataframe, call:
+
 ```python
 import bioframe as bf
 
@@ -90,12 +91,12 @@ ctcf_motif_calls = bioframe.read_table(jaspar_url, schema='jaspar', skiprows=1)
 ```
 
 ## Tutorials
-See this [jupyter notebook](https://github.com/open2c/bioframe/tree/master/docs/tutorials/tutorial_assign_motifs_to_peaks.ipynb) for an example of how to assign TF motifs to ChIP-seq peaks using bioframe.
 
+See this [jupyter notebook](https://github.com/open2c/bioframe/tree/master/docs/tutorials/tutorial_assign_motifs_to_peaks.ipynb) for an example of how to assign TF motifs to ChIP-seq peaks using bioframe.
 
 ## Citing
 
-If you use ***bioframe*** in your work, please cite:
+If you use **_bioframe_** in your work, please cite:
 
 ```bibtex
 @article{bioframe_2024,

diff --git a/docs/api-resources.rst b/docs/api-resources.rst
@@ -8,7 +8,7 @@ Bioframe provides a collection of genome assembly metadata for commonly used
 genomes. These are accessible through a convenient dataclass interface via :func:`bioframe.assembly_info`.
 
 The assemblies are listed in a manifest YAML file, and each assembly
-has a mandatory companion file called `seqinfo` that contains the sequence
+has a mandatory companion file called _seqinfo_ that contains the sequence
 names, lengths, and other information. The records in the manifest file contain
 the following fields:
 
@@ -22,7 +22,7 @@ the following fields:
 - ``default_units``: default assembly units to include from the seqinfo file
 - ``url``: URL to where the corresponding sequence files can be downloaded
 
-The `seqinfo` file is a TSV file with the following columns (with header):
+The _seqinfo_ file is a TSV file with the following columns (with header):
 
 - ``name``: canonical sequence name
 - ``length``: sequence length
@@ -31,21 +31,20 @@ The `seqinfo` file is a TSV file with the following columns (with header):
 - ``unit``: assembly unit of the chromosome (e.g., "primary", "non-nuclear", "decoy")
 - ``aliases``: comma-separated list of aliases for the sequence name
 
-We currently do not include sequences with "alt" or "patch" roles in `seqinfo` files, but we
+We currently do not include sequences with "alt" or "patch" roles in _seqinfo_ files, but we
 do support the inclusion of additional decoy sequences (as used by so-called NGS *analysis
 sets* for human genome assemblies) by marking them as members of a "decoy" assembly unit.
 
-The `cytoband` file is an optional TSV file with the following columns (with header):
-
+The _cytoband_ file is an optional TSV file with the following columns (with header):
 - ``chrom``: chromosome name
 - ``start``: start position
 - ``end``: end position
 - ``band``: cytogenetic coordinate (name of the band)
 - ``stain``: Giesma stain result
 
-The order of the sequences in the `seqinfo` file is treated as canonical.
-The ordering of the chromosomes in the `cytobands` file should match the order
-of the chromosomes in the `seqinfo` file.
+The order of the sequences in the _seqinfo_ file is treated as canonical.
+The ordering of the chromosomes in the _cytobands_ file should match the order
+of the chromosomes in the _seqinfo_ file.
 
 The manifest and companion files are stored in the ``bioframe/io/data`` directory.
 New assemblies can be requested by opening an issue on GitHub or by submitting a pull request.

diff --git a/docs/guide-bedtools.md b/docs/guide-bedtools.md
@@ -14,7 +14,6 @@ kernelspec:
 
 # Bioframe for bedtools users
 
-
 Bioframe is built around the analysis of genomic intervals as a pandas [DataFrame](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.html) in memory, rather than working with tab-delimited text files saved on disk.
 
 Bioframe supports reading a number of standard genomics text file formats via [`read_table`](https://bioframe.readthedocs.io/en/latest/api-fileops.html#bioframe.io.fileops.read_table), including BED files (see [schemas](https://github.com/open2c/bioframe/blob/main/bioframe/io/schemas.py)), which will load them as pandas DataFrames, a complete list of helper functions is [available here](API_fileops).
@@ -25,7 +24,6 @@ For example, with gtf files, you do not need to turn them into bed files, you ca
 
 Finally, if needed, bioframe provides a convenience function to write dataframes to a standard BED file using [`to_bed`](https://bioframe.readthedocs.io/en/latest/api-fileops.html#bioframe.io.bed.to_bed).
 
-
 ## `bedtools intersect`
 
 ### Select unique entries from the first bed overlapping the second bed `-u`
@@ -107,7 +105,6 @@ out = bf.overlap(A, B, how='inner', suffixes=('_', ''))[B.columns]
 
 > **Note:** This gives one row per overlap and can contain duplicates. The output dataframe of the former method will use the same pandas index as the input dataframe `B`, while the latter result --- the join output --- will have an integer range index, like a pandas merge.
 
-
 ### Intersect multiple beds against A
 
 ```sh