1 change: 1 addition & 0 deletions CLAUDE.md
@@ -82,6 +82,7 @@ Numerical comparisons use epsilon tolerance — use `AssertCase`/`AssertBatch` r
## Active Technologies
- Python 3.8+ + numpy >= 1.20 (sole runtime dependency per constitution) (002-decompose-intervals-pipeline)
- Python 3.8+ + numpy >= 1.20 (sole runtime dependency) (002-decompose-intervals-pipeline)
- Python 3.8+ + numpy >= 1.20 (sole runtime dependency; no additions) (004-partials-package)

## Recent Changes
- 002-decompose-intervals-pipeline: Added Python 3.8+ + numpy >= 1.20 (sole runtime dependency per constitution)
2 changes: 1 addition & 1 deletion mkdocs.yml
@@ -79,8 +79,8 @@ nav:
- "foapy.order": references/order.md
- "foapy.binding": references/binding.md
- "foapy.chain_mode": references/chain_mode.md
- "foapy.tuple_mode": references/tuple_mode.md
- "foapy.intervals_chain": references/intervals_chain.md
- "foapy.tuple_mode": references/tuple_mode.md
- "foapy.intervals_tuple": references/intervals_tuple.md
- "foapy.intervals_distribution": references/intervals_distribution.md
- "foapy.ma":
39 changes: 39 additions & 0 deletions specs/004-partials-package/checklists/requirements.md
@@ -0,0 +1,39 @@
# Specification Quality Checklist: foapy.partials Package

**Purpose**: Validate specification completeness and quality before proceeding to planning
**Created**: 2026-04-19
**Feature**: [spec.md](../spec.md)

## Content Quality

- [x] No implementation details (languages, frameworks, APIs)
- [x] Focused on user value and business needs
- [x] Written for non-technical stakeholders
- [x] All mandatory sections completed

## Requirement Completeness

- [x] No [NEEDS CLARIFICATION] markers remain
- [x] Requirements are testable and unambiguous
- [x] Success criteria are measurable
- [x] Success criteria are technology-agnostic (no implementation details)
- [x] All acceptance scenarios are defined
- [x] Edge cases are identified
- [x] Scope is clearly bounded (4 functions only; intervals_distribution excluded)
- [x] Dependencies and assumptions identified

## Feature Readiness

- [x] All functional requirements have clear acceptance criteria
- [x] User scenarios cover primary flows
- [x] Feature meets measurable outcomes defined in Success Criteria
- [x] No implementation details leak into specification

## Notes

All items pass. Clarifications session (2026-04-19) resolved 3 open items:
- intervals_distribution excluded from scope
- plain array input accepted (auto-wrapped)
- submodule-only access pattern confirmed

Spec is ready for `/speckit.plan`.
45 changes: 45 additions & 0 deletions specs/004-partials-package/contracts/alphabet.md
@@ -0,0 +1,45 @@
# Contract: partials.alphabet

## Signature

```python
foapy.partials.alphabet(X) -> numpy.ndarray
```

## Parameters

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `X` | `array_like` or `numpy.ma.MaskedArray` | Yes | 1-D sequence (plain or masked). Masked positions are excluded. |

## Returns

| Return | Type | Shape | dtype |
|--------|------|-------|-------|
| `alphabet` | `numpy.ndarray` | `(p,)` | Same as input element type |

- `p` = number of unique non-masked values
- Values appear in order of first appearance among non-masked positions
- Returns empty array when all positions are masked or input is empty

## Raises

| Exception | Condition |
|-----------|-----------|
| `Not1DArrayException` | Input has more than 1 dimension |

## Invariants

- `len(result)` equals the number of unique values in `X.compressed()`
- Result is a plain ndarray (not masked), even when input is masked

## Examples

```python
import numpy.ma as ma
import foapy.partials as partials

X = ma.masked_array(['a', 'b', 'a', 'c'], mask=[False, True, False, False])
alphabet = partials.alphabet(X)
# alphabet: ['a', 'c'] (b is masked, c appears after a)
```
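
The contract above can also be modelled without foapy. The following is a minimal, dependency-free sketch and not the package implementation: it represents masked positions as `None` (an assumption made for illustration, since the real function takes a `numpy.ma.MaskedArray`), and `alphabet_sketch` is a hypothetical name.

```python
def alphabet_sketch(seq):
    """Return the unique non-gap values of seq in order of first appearance.

    Gaps are modelled as None; this stands in for masked positions and is
    not the foapy.partials API.
    """
    seen = set()
    result = []
    for value in seq:
        if value is None:  # gap: excluded from the alphabet
            continue
        if value not in seen:
            seen.add(value)
            result.append(value)
    return result


print(alphabet_sketch(['a', None, 'a', 'c']))  # ['a', 'c']
print(alphabet_sketch([None, None]))           # []
```

The second call mirrors the "all positions masked" case, which the contract says yields an empty result.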
71 changes: 71 additions & 0 deletions specs/004-partials-package/contracts/intervals_chain.md
@@ -0,0 +1,71 @@
# Contract: partials.intervals_chain

## Signature

```python
foapy.partials.intervals_chain(X, binding: int, chain_mode: int) -> numpy.ma.MaskedArray
```

## Parameters

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `X` | `array_like` or `numpy.ma.MaskedArray` | Yes | 1-D raw sequence (plain or masked). Masked positions are gaps. Pass the original sequence, not the order output. |
| `binding` | `int` | Yes | `foapy.binding.start` (1) or `foapy.binding.end` (2) |
| `chain_mode` | `int` | Yes | `foapy.chain_mode.boundary` (1) or `foapy.chain_mode.cycle` (2) |

## Returns

| Return | Type | Shape | dtype |
|--------|------|-------|-------|
| `chain` | `numpy.ma.MaskedArray` | `(n,)` | `numpy.intp` |

- `n` = length of input sequence
- Output mask is identical to input mask
- Non-masked positions hold the interval distance as a positive integer
- **Interval distance uses actual positional indices in the full array** (gap positions count toward distance)
- Minimum interval value: 1. Maximum: n.

## Raises

| Exception | Condition |
|-----------|-----------|
| `Not1DArrayException` | Input has more than 1 dimension |
| `ValueError` | `binding` is not `binding.start` or `binding.end` |
| `ValueError` | `chain_mode` is not `chain_mode.boundary` or `chain_mode.cycle` |

## Invariants

- `len(result) == len(X)` always
- `result.mask` equals `ma.getmaskarray(X)` always
- For two non-masked occurrences of the same element at actual positions `i < j`: interval at position `j` = `j - i`
- For the first non-masked occurrence of an element at actual position `p` (binding.start, boundary): interval = `p + 1`
- When `X` has no masked positions: result equals `foapy.core.intervals_chain(X, binding, chain_mode)`

## Key Semantic Difference from `foapy.ma.intervals_chain`

`foapy.ma.intervals_chain` compresses masked elements first, so intervals measure distances in the compressed sequence. `foapy.partials.intervals_chain` uses actual positional indices, so a gap between two occurrences increases the interval distance.

| Sequence | foapy.ma result | foapy.partials result |
|----------|-----------------|-----------------------|
| `[A, --, A]` (binding.start, boundary) | `[1, 1]` (compressed) | `[1, --, 2]` (full array) |
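
To make the difference concrete, here is a small dependency-free sketch of both behaviours for `binding.start` with `chain_mode.boundary` only. It is an illustration under stated assumptions, not the foapy code: gaps are modelled as `None`, and `positional_chain` and `compressed_chain` are hypothetical names.

```python
def positional_chain(seq):
    """Model of the partials semantics (binding.start, boundary).

    Gaps (None) stay in place and still count toward distances. The first
    occurrence at index p gets p + 1; a later occurrence at index j whose
    previous occurrence was at index i gets j - i.
    """
    last_seen = {}
    out = []
    for idx, value in enumerate(seq):
        if value is None:
            out.append(None)  # gap position stays a gap in the output
            continue
        out.append(idx - last_seen.get(value, -1))
        last_seen[value] = idx
    return out


def compressed_chain(seq):
    """Model of the foapy.ma semantics described above: drop gaps first."""
    return positional_chain([v for v in seq if v is not None])


print(positional_chain(['A', None, 'A']))  # [1, None, 2]
print(compressed_chain(['A', None, 'A']))  # [1, 1]
```

With no gaps the two functions agree, which matches the invariant that the partials result equals the core result on unmasked input.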

## Examples

```python
import numpy.ma as ma
import foapy
import foapy.partials as partials

X = ma.masked_array(['C', 'T', 'C', 'G'], mask=[False, False, False, False])
chain = partials.intervals_chain(X, foapy.binding.start, foapy.chain_mode.boundary)
# chain: [1, 2, 2, 4] (same as core — no gaps)

X_partial = ma.masked_array(
    ['_', 'C', 'T', 'C', '_', 'G'],
    mask=[True, False, False, False, True, False],
)
chain = partials.intervals_chain(X_partial, foapy.binding.start, foapy.chain_mode.boundary)
# chain.data at non-masked positions [1,2,3,5]: [2, 3, 2, 6]
# chain.mask: [True, False, False, False, True, False]
```
70 changes: 70 additions & 0 deletions specs/004-partials-package/contracts/intervals_tuple.md
@@ -0,0 +1,70 @@
# Contract: partials.intervals_tuple

## Signature

```python
foapy.partials.intervals_tuple(chain, binding: int, tuple_mode: int) -> numpy.ma.MaskedArray
```

## Parameters

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `chain` | `array_like` or `numpy.ma.MaskedArray` | Yes | 1-D intervals chain produced by `partials.intervals_chain`. |
| `binding` | `int` | Yes | Must match the binding used to produce `chain`. `foapy.binding.start` (1) or `foapy.binding.end` (2). |
| `tuple_mode` | `int` | Yes | `foapy.tuple_mode.normal` (2), `foapy.tuple_mode.lossy` (1), or `foapy.tuple_mode.redundant` (3) |

## Returns

| `tuple_mode` | Shape | Mask | Description |
|--------------|-------|------|-------------|
| `normal` | `(n,)` | Same as input | Chain returned unchanged |
| `lossy` | `(n,)` | Input mask ∪ boundary positions | Boundary (first-occurrence) intervals additionally masked |
| `redundant` | `(n + k,)` | Input mask + `k` unmasked trailing elements | Trailing complementary boundary intervals appended; `k` = unique symbol count inferred from compressed chain |

- `n` = length of input chain
- `k` = number of unique elements inferred from the compressed chain

## Raises

| Exception | Condition |
|-----------|-----------|
| `ValueError` | `binding` is not `binding.start` or `binding.end` |
| `ValueError` | `tuple_mode` is not a recognised value |

## Invariants

- For `normal`: `result.data == chain.data` and `result.mask == chain.mask`
- For `lossy`: `result.mask[i] == True` whenever `chain.mask[i] == True`; additionally masked at boundary positions
- For `lossy`: `len(result) == len(chain)` always
- For `redundant`: `len(result) == len(chain) + k`; trailing k elements are unmasked
- When `chain` has no masked positions: non-masked values in result equal `foapy.core.intervals_tuple(chain, binding, tuple_mode)`

## Boundary Detection (lossy)

A non-masked element at compressed-array index `i` is a boundary interval if `compressed_chain[i] > i`. This is the same criterion as `foapy.core.intervals_tuple` applied to the compressed sub-sequence.
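
The criterion can be sketched in a few lines. This is a dependency-free illustration that applies the rule exactly as written above, not the package implementation: gaps are modelled as `None`, and `lossy_mask_sketch` is a hypothetical name.

```python
def lossy_mask_sketch(chain):
    """Return the output mask for tuple_mode.lossy as a list of bools.

    chain is a list of interval values with None at gap positions.
    A position is masked if it is a gap, or if its value exceeds its
    index in the compressed (gap-free) chain.
    """
    mask = []
    compressed_index = 0
    for value in chain:
        if value is None:
            mask.append(True)  # input gaps stay masked
            continue
        mask.append(value > compressed_index)  # boundary criterion
        compressed_index += 1
    return mask


print(lossy_mask_sketch([None, 2, 3, 2, None, 6]))
# [True, True, True, False, True, True]
```

Because the rule compares a positional distance with a compressed index, a chain such as `[1, None, None, 3]` (from the sequence `[A, --, --, A]`) also flags the second occurrence, since `3 > 1`. The sketch follows the stated rule and does not decide whether that outcome is intended.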

## Examples

```python
import numpy.ma as ma
import foapy
import foapy.partials as partials

# Partial chain: positions 0 and 4 are gaps
chain = ma.masked_array(
    [0, 2, 3, 2, 0, 6],
    mask=[True, False, False, False, True, False],
)

# normal: unchanged
result = partials.intervals_tuple(chain, foapy.binding.start, foapy.tuple_mode.normal)
# result: same as chain

# lossy: a value is masked when it exceeds its compressed index
#   compressed index 0 (C, first occurrence):  2 > 0  → masked
#   compressed index 1 (T, first occurrence):  3 > 1  → masked
#   compressed index 2 (C, second occurrence): 2 == 2 → kept
#   compressed index 3 (G, first occurrence):  6 > 3  → masked
result = partials.intervals_tuple(chain, foapy.binding.start, foapy.tuple_mode.lossy)
# Only position 3 (C second occurrence) remains unmasked
```
54 changes: 54 additions & 0 deletions specs/004-partials-package/contracts/order.md
@@ -0,0 +1,54 @@
# Contract: partials.order

## Signature

```python
foapy.partials.order(X, return_alphabet=False) -> numpy.ma.MaskedArray
```

## Parameters

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `X` | `array_like` or `numpy.ma.MaskedArray` | Yes | 1-D sequence (plain or masked). Masked positions are treated as gaps. |
| `return_alphabet` | `bool` | No (default `False`) | If `True`, also return the alphabet of non-masked unique values. |

## Returns

| Mode | Return | Type | Shape | dtype |
|------|--------|------|-------|-------|
| `return_alphabet=False` | `order` | `numpy.ma.MaskedArray` | `(n,)` | `numpy.intp` |
| `return_alphabet=True` | `(order, alphabet)` | `(numpy.ma.MaskedArray, numpy.ndarray)` | `(n,)`, `(p,)` | `numpy.intp`, same as input |

- `n` = length of input sequence
- `p` = number of unique non-masked values (alphabet power)
- Output mask is identical to input mask
- Non-masked positions hold the element's 0-based index in the alphabet (first-appearance order)

## Raises

| Exception | Condition |
|-----------|-----------|
| `Not1DArrayException` | Input has more than 1 dimension |

## Invariants

- `len(result) == len(X)` always
- `result.mask` equals `ma.getmaskarray(X)` always
- When `return_alphabet=True`: `alphabet[result.data[~result.mask]]` reconstructs `X.compressed()`
- When all positions are masked: result is fully masked; alphabet is empty array

## Examples

```python
import numpy.ma as ma
import foapy.partials as partials

X = ma.masked_array(['a', 'b', 'a', 'c'], mask=[False, True, False, False])
result = partials.order(X)
# result.data (at non-masked): [0, 0, 1]
# result.mask: [False, True, False, False]

result, alphabet = partials.order(X, return_alphabet=True)
# alphabet: ['a', 'c']
```
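
The reconstruction invariant can be checked against a small model. The sketch below is dependency-free and is not the foapy implementation: gaps are modelled as `None`, and `order_sketch` is a hypothetical name that always returns both the order and the alphabet.

```python
def order_sketch(seq):
    """Return (order, alphabet) for seq, keeping gaps (None) in place.

    Non-gap positions hold the 0-based index of the value in the
    alphabet, which lists unique values in first-appearance order.
    """
    index_of = {}
    alphabet = []
    order = []
    for value in seq:
        if value is None:
            order.append(None)  # gap stays a gap in the output
            continue
        if value not in index_of:
            index_of[value] = len(alphabet)
            alphabet.append(value)
        order.append(index_of[value])
    return order, alphabet


order, alphabet = order_sketch(['a', None, 'a', 'c'])
print(order)     # [0, None, 0, 1]
print(alphabet)  # ['a', 'c']

# Indexing the alphabet with the non-gap order values recovers the
# non-gap elements of the input.
rebuilt = [alphabet[i] for i in order if i is not None]
print(rebuilt)   # ['a', 'a', 'c']
```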
87 changes: 87 additions & 0 deletions specs/004-partials-package/data-model.md
@@ -0,0 +1,87 @@
# Data Model: foapy.partials

**Date**: 2026-04-19 | **Branch**: `004-partials-package`

## Entities

### Partial Sequence

A 1-D masked array where each position holds either a symbolic value or a gap marker.

| Attribute | Type | Constraints |
|-----------|------|-------------|
| data | any element type (str, int, …) | 1-D only; multi-dimensional raises `Not1DArrayException` |
| mask | boolean ndarray | `True` = gap position; `False` = present element |
| length `n` | int | ≥ 0 |
| compressed length `m` | int | 0 ≤ m ≤ n |

**Construction**: Accept `numpy.ma.MaskedArray`, plain `numpy.ndarray`, or Python list. Plain inputs are auto-wrapped with no mask (`ma.asarray(X)`).
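
As a quick illustration of the wrapping step, the snippet below uses only NumPy calls (`ma.asarray`, `ma.getmaskarray`, `compressed`). It shows how NumPy behaves for plain and masked inputs; it does not claim to show how foapy calls these functions internally.

```python
import numpy.ma as ma

# A plain list becomes a MaskedArray with nothing masked.
plain = ma.asarray(['a', 'b', 'a'])
print(type(plain).__name__)            # MaskedArray
print(ma.getmaskarray(plain).tolist()) # [False, False, False]

# An existing MaskedArray keeps its mask when passed through ma.asarray.
masked = ma.masked_array(['a', 'b', 'a'], mask=[False, True, False])
wrapped = ma.asarray(masked)
print(ma.getmaskarray(wrapped).tolist())  # [False, True, False]
print(wrapped.compressed().tolist())      # ['a', 'a']
```

`ma.getmaskarray` is used rather than the `.mask` attribute because it always returns a full boolean array, even when nothing is masked.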

---

### Partial Order

Output of `partials.order()`. A 1-D masked array of the same length as the Partial Sequence.

| Attribute | Type | Constraints |
|-----------|------|-------------|
| data | `numpy.intp` | alphabet index at non-masked positions; the value at masked positions carries no meaning (it is always filled with 0) |
| mask | boolean ndarray | Identical to input Partial Sequence mask |
| shape | `(n,)` | Same length as input |
| value range | int | 0 ≤ value < alphabet power (number of unique non-masked elements) |

**Reconstruction**: `partial_order` paired with the `alphabet` array from `return_alphabet=True` allows full reconstruction: `alphabet[partial_order.data[~mask]]` recovers the original non-masked elements.

---

### Partial Intervals Chain

Output of `partials.intervals_chain()`. A 1-D masked array of the same length as the input.

| Attribute | Type | Constraints |
|-----------|------|-------------|
| data | `numpy.intp` | positional interval distance at non-masked positions |
| mask | boolean ndarray | Identical to input Partial Sequence mask |
| shape | `(n,)` | Same length as input |
| value range | int | 1 ≤ value ≤ n (actual positional distance, including gap positions) |

**Semantics**: For two consecutive non-masked occurrences of the same element at actual positions `i` and `j` (i < j), the interval at position `j` is `j - i`. For the first occurrence at position `p`, the boundary interval is `p + 1` (binding.start, chain_mode.boundary).

---

### Partial Intervals Tuple

Output of `partials.intervals_tuple()`. A 1-D masked array.

| Attribute | Type | Constraints |
|-----------|------|-------------|
| data | `numpy.intp` | boundary-adjusted interval distances |
| mask | boolean ndarray | input mask ∪ additionally masked boundary positions (lossy) |
| shape | `(n,)` for normal and lossy; `(n + k,)` for redundant | n = chain length; k = unique symbol count inferred from chain |

**Mode effects on mask**:
- `tuple_mode.normal`: mask unchanged from input
- `tuple_mode.lossy`: first-occurrence boundary positions additionally masked; output length = n
- `tuple_mode.redundant`: original mask unchanged; k trailing elements appended as unmasked; output length = n + k

## Pipeline Flow

```
Partial Sequence (masked 1-D)
    │
    ▼
partials.order()
    │  returns Partial Order (masked 1-D, same length)
    ▼
partials.intervals_chain()
    │  takes raw Partial Sequence (not the order)
    │  returns Partial Intervals Chain (masked 1-D, same length)
    ▼
partials.intervals_tuple()
    │  returns Partial Intervals Tuple (masked 1-D)
    │  normal/lossy: same length; redundant: n + k length
    ▼
(downstream characteristics or further analysis)
```

Note: `intervals_chain` accepts the original Partial Sequence (raw symbols), not the order output. This mirrors `foapy.core.intervals_chain`, which also operates on raw sequences.