Releases: haalfi/remote-store

v0.26.0

25 May 20:47
Immutable release. Only release title and notes can be modified.
bfbe076

What's Changed

Added

  • RemoteStoreComputeLogManager: new Dagster compute log manager (ID-208, RFC-0014). Captures op/step stdout/stderr and persists to any remote-store backend via dagster.yaml. Credential-named options in backend_options are now Secret-wrapped via the shared _build_store, which also retroactively masks credentials for the v2 DagsterStoreResource / RemoteStoreIOManager. Install via pip install "remote-store[dagster]".

Fixed

  • Broken docs.remotestore.dev links in the README and the data-lake patterns guide (BK-236); a new check_docs_site_links lint gate (DOCFRAME-009) catches future drift offline.

Documentation

  • docs-src/reference/tested-versions.md (ID-182): new user-facing page recording the upper-bound transitive versions CI was last green against per [<extra>].

Internal

  • Formal Verification wave (ID-190, ID-206, BK-196, BK-232, BK-195, BK-233, BK-231): WellFormedPath predicate on all 13 contract methods; mechanical spec ↔ Dafny ↔ test traceability gate dual-wired into hatch run lint + CI (baseline shrink-only); metadata pinned in Dafny Copy/Move postconditions; sync + async test_metadata_round_trips_through_move_copy conformance test gated by the compiled oracle.
  • Scheduled CI drift guard for unbounded extra-dependency floors (ID-182): weekly .github/workflows/drift-guard.yml re-resolves each remote-store[<extra>], diffs against infra/drift-locks/, runs smoke targets, reconciles a single rolling GitHub issue. Early warning, not automated remediation.
  • benchmarks/infra/ → top-level infra/ (ID-204): single source of truth for local-infra ports/hosts/credentials in infra/.env; scripts/check_infra_settings.py fails lint on any literal -p N:M outside infra/.env.
  • Live HNS suites trimmed; two async conformance gaps closed (BK-182, BK-228, BK-229): per-backend HNS suites cut sync 31 → 13 and async 33 → 12 after the v0.25.0 Stage 3 cassette/replay infrastructure made duplicates redundant; the deleted duplicates had been masking iter_children and write_atomic async conformance gaps, both closed in the same PR.
  • CI annotation silencing (BK-230): nested Node 20 deprecation warnings in verify-formal silenced via FORCE_JAVASCRIPT_ACTIONS_TO_NODE24; uv cache reservation race between test-primary and e2e resolved via per-job cache-suffix; drift-guard artifact actions bumped to v7/v8.

v0.25.0

18 May 17:43
Immutable release. Only release title and notes can be modified.
01a7227

What's Changed

Added

  • Store.list_folders(pattern=…): glob filter on folder basenames via fnmatch. Mirrors list_files(pattern=…), composes with max_depth=. Async sibling included. (ID-178)
  • SFTPUtils preflight helpers: scan_host_keys, scan_host_algorithms, enable_ssh_rsa_compat for paramiko 5+ legacy-server compatibility. HostKeyPolicy accepts enum-name aliases. (BK-197/198/199/200)
  • Azure HNS account setup guide: step-by-step az CLI recipe for provisioning an ADLS Gen2 account suitable for the live HNS test suite.
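
The basename matching described for list_folders(pattern=…) can be illustrated with the standard library alone. This is a minimal sketch, not the library's implementation: the helper name filter_folders_by_basename is hypothetical, and the use of fnmatchcase (case-sensitive on every platform) is an assumption, since the note only says "via fnmatch".

```python
from fnmatch import fnmatchcase
from posixpath import basename


def filter_folders_by_basename(folders, pattern):
    """Keep folder paths whose last path segment matches a glob pattern.

    Hypothetical helper: it only illustrates matching on the basename
    rather than on the full path.
    """
    return [f for f in folders if fnmatchcase(basename(f.rstrip("/")), pattern)]


folders = ["raw/2024", "raw/2025", "staging/2025-tmp", "archive"]

# The pattern is tested against "2024", "2025", "2025-tmp", "archive".
matched = filter_folders_by_basename(folders, "2025*")

# A pattern that only matches a parent segment selects nothing.
parent_only = filter_folders_by_basename(folders, "raw*")
```

Matching on the basename keeps the filter independent of how deep a folder sits, which is presumably why it composes with max_depth=.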

Fixed

Azure HNS correctness on real ADLS Gen2. Twelve fixes surfaced by the new Stage 3 live test suite. Azurite forgave each of these; real HNS rejected or silently corrupted state.

  • File-API data loss (data-loss fix): read/read_bytes/read_seekable/delete on a directory path now raise InvalidPath instead of returning b"" or destroying the directory marker. (BUG-197)
  • write_atomic streaming: drives the DataLake DFS append protocol directly. Closes MissingRequiredQueryParameter. (BUG-194, BUG-202)
  • Directory-vs-file error fidelity: get_file_info, is_folder, get_folder_info, delete_folder, move/copy, open_atomic, write/write_atomic all raise the canonical InvalidPath instead of wrong-type errors. (BUG-190/192/195/198/199/200/203)
  • get_folder_info("") root: no longer fails with "Please specify a file system name and file path". (BUG-213)
  • move(p, p) / copy(p, p) self-op: now a no-op (was AlreadyExists). (BUG-201)
  • Store.move/copy self-op error type: NotFound → InvalidPath when source is a directory. (BK-227)
  • Async write_atomic post-rename quirk: tolerates get_file_properties() failure on HNS. (BUG-196)

Other:

  • SFTPBackend error swallowing: connect-time PermissionError surfaces as PermissionDenied / BackendUnavailable instead of False. (BUG-211)
  • SFTPBackend inline known_host_keys on Windows: STRICT verification no longer silently bypassed. (BUG-209)
  • S3Backend.check_health(): fixed silent no-op from unawaited coroutine. (BUG-208)
  • AsyncMemoryBackend error fidelity: type-mismatched paths now raise InvalidPath matching sync sibling. (BUG-189)
  • MemoryBackend / AsyncMemoryBackend.copy(): metadata preserved on the destination (was silently dropped). (BK-192)
  • AsyncMemoryBackend metadata round-tripping through get_file_info / list_files / iter_children. (BK-176)
  • Docs fixes: benchmark SVGs, EthicalAds floating, iOS Safari graph viz. (BUG-186/187/188)
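
The bug class behind the S3Backend.check_health() entry (a coroutine that is created but never awaited) is easy to reproduce in isolation. The sketch below uses only asyncio; ping, check_health_buggy, and check_health_fixed are hypothetical stand-ins and say nothing about how S3Backend is actually written.

```python
import asyncio
import inspect


async def ping():
    """Stand-in for an async health probe (hypothetical)."""
    await asyncio.sleep(0)
    return True


def check_health_buggy():
    # Calling a coroutine function only creates a coroutine object;
    # the body of ping() never runs, so the check is a silent no-op.
    result = ping()
    is_coroutine = inspect.iscoroutine(result)
    result.close()  # avoid the "never awaited" RuntimeWarning in this demo
    return is_coroutine


def check_health_fixed():
    # Driving the coroutine to completion actually executes the probe.
    return asyncio.run(ping())


buggy_returned_coroutine = check_health_buggy()
fixed_result = check_health_fixed()
```

A coroutine object is truthy, so code that merely tests the return value of the buggy variant would report success without contacting the service.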

Changed

  • [sftp] extra requires paramiko>=3.0 for the channel_timeout= kwarg. paramiko 2.x floor lifted. Migration required. (BUG-204)
  • pyarrow<25 cap lifted: all extras now allow pyarrow 24.x. (BK-168, BK-172)
  • hatch run test-cov no longer enforces 95% floor: strict gate moved to hatch run test-cov-strict.

Documentation

  • aio.md leads with AsyncStore: full per-category method sections. The four members: false stubs now render their full member surface. (ID-192)
  • Async docstring ripple completed: Raises: clauses on nine I/O methods on SyncBackendAdapter. InvalidPath documented on async write paths. (BK-173, BK-174)
  • SFTPUtils rendering: helpers documented as true @staticmethod. (BK-202)
  • Migration guide: v0.24.1 → v0.25.0 section with full HNS error-type table.

Internal

  • Stage 3 live cloud test infrastructure: azure_live / azure_live_async / s3_live fixtures, BackendFixture.aclose, HTTP cassette/replay layer, live HNS integration test classes. (BK-179/180/181/184/204; BUG-182/191/193/210/212)
  • Physical fixture/backend registry SSoT: per-fixture flat-namespace and self-op flags replace identity-keyed sets. (BK-185, BK-186)
  • tests/ root cleanup: placement checks wired into CI; six BK-191 slices reshape per-backend test layout. (BK-188/189/190/215-222)
  • End-to-end S3 control-path coverage: test_s3_moto.py lifecycle in default suite catches BUG-178/185-class regressions. (BK-166)
  • Documentation framework tooling: ADR-0027 + Spec 047; universal on-disk link rule via LinkResolver. (BK-167/167a/167b/169/170/171/205; ID-175)

v0.24.1

30 Apr 06:18
Immutable release. Only release title and notes can be modified.
1393fdf

What's Changed

Fixed

  • S3 botocore Config collision (BUG-185) — S3Backend(client_options={"client_kwargs": {"config": Config(...)}}) raised TypeError: got multiple values for keyword argument 'config' because s3fs's set_session() always passes its own config= to aiobotocore.create_client(). Fixed by routing every botocore Config option through opts["config_kwargs"]; pre-built Config objects in client_kwargs now fail fast with ValueError. Migration guide updated.
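
The collision in BUG-185 is ordinary Python keyword-argument behaviour and can be reproduced without s3fs or botocore. In the sketch below, create_client, set_session_buggy, and set_session_fixed are hypothetical stand-ins, and plain dicts take the place of botocore Config objects; only the shape of the failure and of the fix follows the release note.

```python
def create_client(service, config=None, **kwargs):
    """Stand-in for a client factory that accepts a config keyword."""
    return {"service": service, "config": config, **kwargs}


def set_session_buggy(client_kwargs):
    # The wrapper always supplies its own config= and also forwards the
    # caller's kwargs, so a caller-supplied "config" key collides.
    return create_client("s3", config={"retries": 3}, **client_kwargs)


def set_session_fixed(client_kwargs, config_kwargs):
    # Fail fast on a pre-built config, then merge options in one place.
    if "config" in client_kwargs:
        raise ValueError("pass config options via config_kwargs")
    merged = {"retries": 3, **config_kwargs}
    return create_client("s3", config=merged, **client_kwargs)


try:
    set_session_buggy({"config": {"read_timeout": 5}})
except TypeError as exc:
    collision_message = str(exc)

client = set_session_fixed({"region_name": "eu-west-1"}, {"read_timeout": 5})
```

Routing every option through a single merged mapping means there is only one config= argument at the call site, which is the property the fix relies on.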

Changed

  • Diátaxis docs reorganisation (ID-174) — prose moved into docs-src/<bucket>/ (how-to, explanation, reference, further); the intermediate docs/ layer was collapsed and removed. Bookmarks to specific guide URLs may need updating.

Added (graph IR foundation)

  • CAPABILITIES: ClassVar[CapabilitySet] on every backend and ABC (ID-159) — class-level capability set; the capabilities property delegates to it. Custom backends should follow the same pattern.
  • _GATING dict in _store.py (ID-159) — single source of truth for the method → capability mapping read by Store._gate().
  • __mirror__: ClassVar[type[...]] on async backends (ID-159) — points at the sync peer for static graph extraction.
  • RFC-0012 — Documentation Graph Model (ID-159, accepted)
  • Documentation graph IR generator (scripts/gen_graph.py, ID-159, ID-163, ID-164, ID-162) — emits docs-src/_data/graph/graph.json (107 nodes, 222 edges); schema 1.2.
  • FEATURES.md projection from graph IR (scripts/gen_features.py, ID-163, ID-169) — auto-regenerates the mechanical sections, alphabetised.
  • API-docs verifier (scripts/check_api_docs.py, ID-170, ID-171) — flags missing ::: directives or capability-admonition drift in store.md / backend.md.
  • Interactive graph visualisation (scripts/gen_graph_viz.py, ID-165) — D3 v7 force-directed HTML at docs-src/_data/graph/graph_viz.html.

Internal

  • Test subpackage placement enforced via scripts/check_test_placement.py (ID-166–ID-168).
  • Context7 indexing fixes (ID-160).
  • AsyncAzureBackend.__del__ extracted helper (CodeQL alert #55).
  • tla2tools.jar pinned to v1.7.4; em-dash CI guard added.
  • Async backends moved to aio/backends/ subpackage (mirrors sync layout); public imports unchanged.

v0.24.0

26 Apr 14:39
Immutable release. Only release title and notes can be modified.
4b23909

What's Changed

Added

  • WriteResult — every write*() call returns WriteResult(path, size, source, digest, etag, version_id, last_modified, metadata). Two new capabilities gate the rich fields: WRITE_RESULT_NATIVE and USER_METADATA. Includes Store.head(path), ext.write hash helpers, async parity (AsyncStore + AsyncBackend), and proxy forwarding across ProxyStore/ObservedStore/CachedStore (ID-146, ID-148, ID-013b)
  • AsyncBackendSyncAdapter — wraps any AsyncBackend as a synchronous Backend via a dedicated daemon-thread event loop; full spec coverage (ASYNC-080–093), unit + Azurite integration suites, decision guide guides/async-sync-bridges.md; unblocks Graph backend (ID-127) (ID-141–143c)
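
The "dedicated daemon-thread event loop" technique named for AsyncBackendSyncAdapter can be sketched with asyncio and threading. LoopThread and async_read are hypothetical names used for illustration only; the real adapter's class layout and method surface are not shown in these notes.

```python
import asyncio
import threading


class LoopThread:
    """Runs one event loop on a daemon thread and executes coroutines on it."""

    def __init__(self):
        self._loop = asyncio.new_event_loop()
        self._thread = threading.Thread(target=self._loop.run_forever, daemon=True)
        self._thread.start()

    def run(self, coro, timeout=None):
        # Schedule on the background loop and block the caller until done.
        future = asyncio.run_coroutine_threadsafe(coro, self._loop)
        return future.result(timeout)

    def close(self):
        # Stop the loop from its own thread, then release its resources.
        self._loop.call_soon_threadsafe(self._loop.stop)
        self._thread.join()
        self._loop.close()


async def async_read(path):
    """Hypothetical async backend call."""
    await asyncio.sleep(0)
    return f"contents of {path}".encode()


bridge = LoopThread()
data = bridge.run(async_read("a.txt"))
bridge.close()
```

Keeping the loop on its own thread lets synchronous callers use async code without calling asyncio.run() per operation, and it also works when the caller is itself running inside another event loop.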

Fixed

  • AsyncMemoryBackend.delete raises InvalidPath for directory paths instead of silently returning (BUG-184)
  • S3 lazy init failed with "got multiple values for keyword argument 'config'" when config_kwargs and RetryPolicy are both supplied (BUG-178)
  • SQLBlobBackend.glob drops zero-segment **/ matches on SQLite (BUG-175)
  • SQLBlobBackend.copy(src, src) silently destroyed data with overwrite=True (BUG-176)
  • AsyncAzureBackend.write streaming — no longer buffers AsyncIterable[bytes] payload into memory (BUG-165)
  • Docs deployment on release tags: pages job moved to gh-pages-deploy.yml, eliminating environment-protection failures (BUG-164)

Changed

  • pyarrow 24.x mypy compat: follow_imports = "skip" for pyarrow.*; removed redundant type: ignore[import-untyped] across four modules (BK-154)

Documentation

  • Audit-011 gaps resolved — custom-backend guide, async.md, extensions.md, s3.md, azure.md, local.md updated for WriteResult/metadata/capabilities (BK-162)
  • Backend-specifics visibility — three-tier admonition vocabulary across all API reference pages; closes all 20 audit-009 findings (BK-153)
  • SFTPGo compatibility note in README and SFTP guide
  • Docs spacing and typography — compact CSS for tables, headings, parameter blocks (BK-157)
  • Formal layer: Dafny vs TLA+ principles, Observer.tla, informational verify-tla CI job (ID-147)

v0.23.0

12 Apr 15:32
44dd5fb

What's Changed

Added

  • Capability.LAZY_READ (BK-146): Quality flag for lazy-fetching backends; declared by Local, HTTP, S3, S3-PyArrow, SFTP, Azure. Spec SIO-009 added; conformance tests and capabilities matrix updated.

Fixed

  • Azure chunked upload (BUG-161): write() now uses staged-block upload; configurable via client_options.
  • Transfer pipe memory overhead (BUG-162): Explicit 256 KiB copy buffer (was platform default ~1 MiB on Windows).
  • known_hosts mode test (BUG-163): Mode assertion skipped on Windows (NTFS ignores POSIX mode bits).
  • cast() string-literal removal + redundant BufferedReader wrappers (BK-146): Memory, SQL, and S3 backends cleaned up.
  • Pages deployment (BUG-144): Batch mike pushes; deploy-pages@v4 fixes "Multiple artifacts" error and DEP0040 warning.
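
The explicit copy buffer from BUG-162 corresponds to the length argument of shutil.copyfileobj. The sketch below is an assumption about the general technique, not the library's transfer code; pipe and COPY_BUFFER_SIZE are hypothetical names.

```python
import io
import shutil

COPY_BUFFER_SIZE = 256 * 1024  # 256 KiB, the size named in the release note


def pipe(src, dst, buffer_size=COPY_BUFFER_SIZE):
    """Copy src to dst in fixed-size chunks instead of the platform default."""
    shutil.copyfileobj(src, dst, buffer_size)


payload = b"x" * (1024 * 1024 + 17)  # deliberately not a multiple of the buffer
source, sink = io.BytesIO(payload), io.BytesIO()
pipe(source, sink)
```

Pinning the buffer size makes peak memory per hop predictable across platforms, at the cost of more read/write calls for large payloads.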

Documentation

  • Content longevity rules (BK-148, BK-149, BK-149b): New sdd/CONTENT-RULES.md; stale counts and enumerations removed from Guides, Explanation, Examples, and README.
  • Design index and Further Reading (BK-150): design/ surfaces all sdd/ process docs; Further Reading de-duplicated.
  • Azure and SQL blob backend guides (ID-137, ID-136): Block-size defaults and non-lazy write clarifications.

Internal

  • Streaming (ID-135, ID-137, BUG-161, BUG-162): E2e SHA-256 + tracemalloc integrity test across all backends; violations are hard failures; per-hop allocation overhead reduced.
  • Dafny 4.11.0 (ID-134, ID-134c): Toolchain upgraded; GetFolderInfo aggregate postcondition verified; SumSizesAddOneLocal workaround removed.
  • Mutation testing CI (BK-145): Manual + weekly scheduled runs, parallel matrix, HTML artifact upload.
  • SDD Expert in orchestrate skill (BK-147): 5th domain expert for spec-code consistency.
  • Codecov upload moved to publish workflow: Reports only on released versions.

v0.22.1

10 Apr 07:54
a456d9b

What's Changed

Fixed

  • SFTP known_hosts permissions (0o644 → 0o600): File was world-readable; now restricted to owner-only. Resolves a CodeQL High finding.
  • Resource cleanup hardening: __del__ methods on SFTPBackend, AzureBackend, and AsyncAzureBackend now delegate to _del_cleanup() helpers, preventing exceptions from being silently swallowed during garbage collection.
  • ProxyStore: Added missing super().__init__() call.
  • read_seekable(): Stream ownership is now explicit, preventing handle leaks if wrapping fails.
  • Silent exception swallowing: Empty except/pass blocks replaced with contextlib.suppress.
  • Protocol/abstract stubs: Removed ... no-ops from stubs that already have docstrings (CodeQL Note findings).
  • BinaryIO import: Kept as a runtime import (suppressing ruff TCH003/TC006) so cast(BinaryIO, ...) call sites remain genuinely used at runtime and are not flagged as unused by CodeQL.
  • Unused imports: Removed or restructured 6 unused imports across the codebase.
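
The known_hosts permission change can be sketched with os.open and os.chmod. This is an illustration of writing a file as owner-only, not the SFTPBackend code; write_owner_only is a hypothetical helper and the file content is a placeholder.

```python
import os
import stat
import tempfile


def write_owner_only(path, data):
    """Create or truncate a file with owner-only read/write permissions."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "wb") as handle:
        handle.write(data)
    # The mode given to os.open applies only to newly created files and is
    # filtered by the umask, so set it explicitly for pre-existing files too.
    os.chmod(path, 0o600)


with tempfile.TemporaryDirectory() as tmp:
    known_hosts = os.path.join(tmp, "known_hosts")
    write_owner_only(known_hosts, b"placeholder host key line\n")
    mode = stat.S_IMODE(os.stat(known_hosts).st_mode)
```

On POSIX systems mode is expected to be 0o600 here; on Windows the value differs because NTFS does not honour POSIX mode bits.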

All 31 open CodeQL security/quality alerts resolved.

Full Changelog: v0.22.0...v0.22.1

v0.22.0

09 Apr 19:09
ff8ada8

What's new

Added

  • Capability.ATOMIC_MOVE: New capability flag indicating move() is guaranteed atomic under concurrent access. Declared by Local, Memory, and SQLBlob backends. S3, S3-PyArrow, Azure, and SFTP omit it (copy-then-delete semantics). Query with store.supports(Capability.ATOMIC_MOVE).
  • Dafny-compiled oracle as conformance gate: The mathematically verified MemoryBackend.dfy (53 proofs, 0 errors) is compiled to Python and runs through the full conformance suite as DafnyOracleBackend. If the oracle passes a test, the test is known-correct.
  • Extended conformance suite: 42 test functions (~53 parameterized cases per backend) derived from Dafny BackendContract.dfy postconditions. Covers error fidelity, precondition ordering, listing completeness, depth filtering, move/copy semantics, and resource cleanup.
  • ResourceWarning safety net: __del__ on SFTPBackend, AzureBackend, and AsyncAzureBackend emits ResourceWarning if .close() / .aclose() was never called.
  • Ruff BLE001 rule: Blind-exception linter rule enabled; 44 intentional broad catches annotated with # noqa: BLE001.
  • Property-based tests: Hypothesis PBT suite covering partition roundtrip, config from_dict corruption, path normalization idempotence, and a stateful MemoryBackend model.
  • _safe_wrap() helper: Safely wraps a stream through wrapper layers, closing all acquired resources if any constructor raises.
  • Dafny formal verification layer (sdd/formal/): Machine-checkable backend contract covering precondition ordering, error mapping, listing semantics, depth counting, move atomicity, and resource safety.
  • Query methods under path-type conflicts (ID-129): exists(), is_file(), is_folder() return False (not InvalidPath) when a path ancestor is a file. Formally verified and covered by extended conformance tests.
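
The behaviour described for the _safe_wrap() helper (wrap a stream through several layers and close everything already acquired if a constructor raises) can be sketched as follows. The function below is a hypothetical re-creation written from that one-line description; its signature and cleanup order are assumptions.

```python
import contextlib
import io


def safe_wrap(raw, *wrappers):
    """Apply wrappers in order; close everything acquired if one raises."""
    acquired = [raw]
    try:
        stream = raw
        for wrap in wrappers:
            stream = wrap(stream)
            acquired.append(stream)
        return stream
    except BaseException:
        # Close outermost first; a wrapper's close() may close inner layers,
        # and closing an already-closed stream is harmless.
        for resource in reversed(acquired):
            with contextlib.suppress(Exception):
                resource.close()
        raise


def failing_wrapper(stream):
    """Wrapper whose constructor always fails (for demonstration)."""
    raise ValueError("wrapper constructor failed")


wrapped = safe_wrap(io.BytesIO(b"payload"), io.BufferedReader)

raw_bad = io.BytesIO(b"payload")
try:
    safe_wrap(raw_bad, io.BufferedReader, failing_wrapper)
except ValueError:
    leaked = not raw_bad.closed
```

Without such a helper, a failure in the second wrapper would leave the raw handle open with no remaining reference for the caller to close.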

Fixed

  • Type-mismatch errors now raise InvalidPath per spec (ID-131): read(), read_bytes(), delete(), get_file_info(), get_folder_info(), delete_folder() on the wrong path type now raise InvalidPath in LocalBackend, MemoryBackend, and SFTPBackend. Self-move and self-copy (src == dst) are now no-ops across all backends.
  • S3 read() leaks file handle if stream wrapping fails (BUG-159): Both S3Backend and S3PyArrowBackend now use _safe_wrap() to close raw handles if wrapping constructors raise.
  • ext.arrow Tier 1 probe catches too broadly (BK-141): Exception catch narrowed from Exception to (CapabilityNotSupported, TypeError, OSError); pattern documented in ADR-0008.

Documentation

  • Custom backend guide: New section on conformance suite integration — how to register a backend, capability gating, flat-namespace vs. hierarchical distinction, and a conformance checklist table.
  • README: Added "Quality & Testing" section covering SDD, unit tests, PBT, formal verification, mutation testing, benchmarks, and examples.

Full changelog: https://github.com/haalfi/remote-store/blob/master/CHANGELOG.md

v0.21.1

03 Apr 07:33
57eb18f

What's Changed

Patch release: 22 bug fixes across all backends, plus housekeeping.

Bug Fixes

  • SFTP (BUG-142..147): resource leaks, over-broad error suppression, st_mode None handling
  • S3 / S3PyArrow (BUG-148..152): deep-copy client_options, get_file_info ETag/digest, max_depth listing
  • Local (BUG-153..154): IsADirectoryError leaks mapped to correct RemoteStoreError subclasses
  • Azure (BUG-155..158): max_depth listing, credential close, non-HNS delete_folder, stream-wrapping cleanup
  • Cache (BUG-137..138): ancestor directory invalidation, shared cache for child() stores
  • Config (BUG-139..140): null options/type/backend handling in RegistryConfig.from_dict
  • Partition (BUG-141): reject = in partition keys
  • Examples (BUG-136): Windows backslash in TOML/YAML path interpolation

Changed

  • Examples reorganized into 7 topical subdirectories

Internal

  • Pygments <2.21 upper-bound pin removed
  • pyproject.toml dependency lists deduplicated (BK-138)

Full Changelog: https://github.com/haalfi/remote-store/blob/master/CHANGELOG.md#0211---2026-04-03

v0.21.0

01 Apr 14:31
973eef5

Highlights

  • Async Store API: AsyncStore, AsyncBackend, SyncBackendAdapter, AsyncMemoryBackend (Phase 1) and AsyncAzureBackend, the first native async backend (Phase 2)
  • resolve_env() config interpolation: ${VAR} and ${VAR:-default} placeholders in config dicts, opt-in on from_toml()/from_yaml()
  • Dagster multi-partition loading: load_input returns dict[str, Any] for time-window aggregation
  • Breaking: ParquetSerializer.deserialize() returns pyarrow.Table instead of pandas.DataFrame — see migration guide
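
The ${VAR} and ${VAR:-default} placeholder syntax mentioned for resolve_env() can be illustrated with a small regex-based expander. This is a sketch under stated assumptions, not the library function: interpolate is a hypothetical name, and raising KeyError for an unset variable without a default is a choice made here, not documented behaviour.

```python
import os
import re

_PLACEHOLDER = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)(?::-([^}]*))?\}")


def interpolate(value, env=None):
    """Recursively expand ${VAR} and ${VAR:-default} in strings, dicts, lists."""
    env = os.environ if env is None else env
    if isinstance(value, str):
        def replace(match):
            name, default = match.group(1), match.group(2)
            if name in env:
                return env[name]
            if default is not None:
                return default
            # Assumption: unset without a default is treated as an error.
            raise KeyError(f"environment variable {name!r} is not set")
        return _PLACEHOLDER.sub(replace, value)
    if isinstance(value, dict):
        return {key: interpolate(item, env) for key, item in value.items()}
    if isinstance(value, list):
        return [interpolate(item, env) for item in value]
    return value  # numbers, booleans, None pass through unchanged


config = {"bucket": "${BUCKET}", "region": "${REGION:-eu-west-1}", "retries": 3}
resolved = interpolate(config, env={"BUCKET": "data-lake"})
```

Passing env explicitly keeps the example deterministic; leaving it as None falls back to the process environment.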

See the full CHANGELOG for details.

v0.20.0

30 Mar 13:47
a2adddd

Highlights

New backends: SQLBlobBackend (full read-write KV store), SQLQueryBackend (read-only query materializer)

New extensions: ParquetDatasetStore (manifests, _SUCCESS markers, atomic-commit), Dagster v2 (ConfigurableResource + IOManagerFactory)

New APIs: Store.resolve() → ResolutionPlan introspection, Store.read_seekable() promoted to core, depth-limited listing (max_depth=N)

Performance: S3 paginated listing (O(widest dir) memory), MemoryBackend lock reduction, benchmark suite v2 with overhead-vs-RTT charts

Bug fixes: SFTP TOFU persistence, cache coherency in move/copy, snippet indentation in docs

Breaking: cached_store(), remote_store_io_manager(), pydantic_to_registry_config() removed — use cache(), dagster_io_manager(), from_pydantic()

See the full changelog for details.

pip install remote-store==0.20.0