Support literal genomic-range operands in DISTANCE #155

Description

@conradbzura

DISTANCE(a, b) currently requires both operands to be genomic interval column references. A literal genomic range in either position, e.g. DISTANCE(peaks.interval, 'chr1:1000-2000') or DISTANCE('chr1:1000-2000', genes.interval), fails at transpile time with "Literal range as first argument not yet supported" (or "second argument", depending on position). The guard lives in _distance_operand (src/giql/generators/base.py:573): pass 1 (resolve_operator_refs_resolve_distance, src/giql/resolver.py:526) cannot resolve a literal operand to a column, so the generator raises. The restriction is symmetric: both the first (this) and second (expression) argument positions reject literals. This parallels #101 (literal references in DISJOIN) and was surfaced by the #139 cross-target oracle review (PR #152).

Motivation

The spatial predicates (INTERSECTS / CONTAINS / WITHIN) already accept a literal range operand, and NEAREST's reference is a literal range by design — DISTANCE and DISJOIN are the only operators that still require column operands. "How far is each feature from a fixed region?" is a natural query — e.g. the distance from every peak to a known locus 'chr1:1000-2000' — but today that region must already exist as a column. Supporting a literal operand removes a join/CTE the user otherwise has to construct by hand and brings DISTANCE to parity with the predicate operators. It also unblocks DISTANCE-literal coverage in the #139 cross-target oracle, which currently excludes it because the feature is absent on all targets (not a DataFusion-specific gap).

Expected outcome

  • DISTANCE(<column>, '<range>') and DISTANCE('<range>', <column>) transpile to valid SQL computing the (signed/unsigned) distance between the column interval and the parsed literal range, honoring coordinate_system / interval_type and the bedtools-parity offset, exactly as the column-to-column form does.
  • Works on every target (generic / DuckDB / DataFusion), with a cross-target oracle case added.
  • The "Literal range as {position} argument not yet supported" guard is removed once both positions resolve literals (or remains only for genuinely unresolvable operands).
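
To make the expected semantics concrete, here is a minimal Python sketch of what "distance between a column interval and a parsed literal range" could mean. It is not GIQL's implementation: the names Interval, parse_range, distance and the gap_offset parameter are hypothetical, the sketch assumes 0-based half-open coordinates, and it covers only the unsigned case. gap_offset stands in for the bedtools-parity offset mentioned above; its actual value and handling in GIQL are not specified in this issue.

```python
import re
from typing import NamedTuple, Optional


class Interval(NamedTuple):
    """Hypothetical interval type; assumes 0-based, half-open coordinates."""
    chrom: str
    start: int
    end: int


# Matches literals of the form 'chr1:1000-2000'.
_RANGE_RE = re.compile(r"^(?P<chrom>[^:\s]+):(?P<start>\d+)-(?P<end>\d+)$")


def parse_range(literal: str) -> Interval:
    """Parse a 'chrom:start-end' literal into an Interval."""
    m = _RANGE_RE.match(literal.strip())
    if m is None:
        raise ValueError(f"not a genomic range literal: {literal!r}")
    start, end = int(m["start"]), int(m["end"])
    if start > end:
        raise ValueError(f"start exceeds end in {literal!r}")
    return Interval(m["chrom"], start, end)


def distance(a: Interval, b: Interval, gap_offset: int = 0) -> Optional[int]:
    """Unsigned distance between two intervals.

    Returns None for different chromosomes and 0 for overlapping intervals.
    gap_offset is an illustrative stand-in for a parity offset added to
    non-overlapping gaps; it is an assumption, not GIQL's actual setting.
    """
    if a.chrom != b.chrom:
        return None
    if a.end <= b.start:        # a lies entirely before b
        return b.start - a.end + gap_offset
    if b.end <= a.start:        # b lies entirely before a
        return a.start - b.end + gap_offset
    return 0                    # intervals overlap


locus = parse_range("chr1:1000-2000")
peak = Interval("chr1", 100, 200)

# The literal may sit in either argument position; the unsigned result matches.
print(distance(peak, locus), distance(locus, peak))
```

A signed variant would additionally need a convention for which operand is the reference, which this sketch deliberately leaves out.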

Labels: feature (New feature or capability)
