feat(tools): add Copilot backend, per-commit cache, and async refactor to release notes generator by romange · Pull Request #7264 · dragonflydb/dragonfly

romange · 2026-05-05T12:17:47Z

Summary

- Adds a GitHub Copilot backend (--backend copilot) as an alternative to the Anthropic backend; it routes requests through the Copilot CLI, so no ANTHROPIC_API_KEY is required
- Adds a per-commit analysis disk cache (default <repo>/.release_notes_cache/); the cache key covers backend name, model, token budget, system prompt, and commit SHA, so any prompt or model change invalidates stale entries automatically
- Migrates the analysis pipeline from concurrent.futures.ThreadPoolExecutor to native asyncio; compose_release_notes streams output with a 15 s heartbeat and live token counters
- Tightens commit categorisation: category descriptions and the CRITICAL RULE block now strongly enforce that any fix (crash, data corruption, race, wrong result) lands in bugfix regardless of subsystem; domain categories are reserved for new features
- Output filename gains a backend tag for Copilot runs (release_notes_copilot_<range>.md) to distinguish them from Anthropic runs

Refactoring (no behaviour change to existing Anthropic path):

- LLMBackend interface extended with analyze_commit, compose_progress_status, post_process_notes, notes_filename; all backend-specific decisions live in the backend, not in callers
- AnthropicBackend._sync_stream replaces the 75-line nested _run closure; _ComposeStats dataclass replaces five loose instance variables
- _model renamed to public model; _make_cache_key(backend, sha) replaces the old 5-argument function
- main() decomposed: _build_arg_parser, _check_prerequisites, _build_backend, _resolve_cache_dir, _run_async (single asyncio.run call)
- _is_fatal_error, FAIL_FAST_THRESHOLD, and all token/heartbeat/threshold constants promoted to module level

Release notes comparison (v1.37.0..v1.38.0)

Three outputs were generated from the same commit range to compare quality:

| | Gold (pre-PR, Anthropic) | New Anthropic | Copilot |
|---|---|---|---|
| File | `release_notes_v1.37.0_to_v1.38.0.gold.md` | `release_notes_v1.37.0_to_v1.38.0.md` | `release_notes_copilot_v1.37.0_to_v1.38.0.md` |
| Lines | 309 | 307 | 307 |
| H1 title | ✅ | ✅ | ✅ |
| Preamble artefact | none | none | ⚠️ model emitted "Now I have full context…" before the markdown |
| Highlights | ZSTD compression, TTL embed, TOPK+CMS | TTL embed, Top-K+CMS, HNSW range search | TTL embed, ZSTD compression, TOPK+CMS |
| Structured sections | 8 sections | 8 sections | 9 sections (adds Docker) |

Anthropic new vs gold — structurally identical; new output groups HNSW improvements more tightly in Highlights and surfaces the 30+ bug-fix count explicitly in the opening.

Copilot vs gold — equivalent quality and coverage; the Copilot model occasionally violates the "no preamble" system-prompt constraint (the spurious leading line above), which post_process_notes does not yet strip.
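
One possible way to strip such a preamble is sketched below. It is not part of this PR: `strip_preamble` is a hypothetical helper, and it assumes the notes always begin with a single Markdown H1 line, as all three outputs above do.

```python
import re

# Matches the start of a line that opens an H1 heading ("# Title"),
# but not deeper headings such as "## Highlights".
_H1_RE = re.compile(r"^# \S", re.MULTILINE)


def strip_preamble(notes: str) -> str:
    """Drop any text that precedes the first Markdown H1 heading.

    If no H1 is found, the input is returned unchanged so that nothing
    is silently discarded.
    """
    match = _H1_RE.search(notes)
    if match is None:
        return notes
    return notes[match.start():]


raw = "Now I have full context…\n\n# Dragonfly v1.38.0 Release Notes\n\nBody\n"
cleaned = strip_preamble(raw)
```

A caveat of this approach: a line starting with `# ` inside the preamble itself (for example a shell comment) would be mistaken for the title, so a stricter pattern may be needed in practice.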

release_notes_v1.37.0_to_v1.38.0.gold.md (pre-PR baseline)

# Dragonfly v1.38.0 Release Notes

This release delivers major memory efficiency improvements, expanded probabilistic data structure support, and significant search engine enhancements. The standout change is a ~26% reduction in memory for workloads with expiring keys by embedding TTL directly into keys and eliminating the separate expire table — a meaningful win for any deployment using key expiry at scale. New probabilistic data structures (TOPK and CMS command families) join the existing HyperLogLog and Bloom filter support, and vector search gains HNSW range queries, improved hybrid search accuracy, and correct replication across shard count mismatches. Operators will also find improved observability through new Prometheus metrics for TLS, pipelines, streams, and defragmentation.

## Highlights

- [~26% memory reduction for workloads with expiring keys] TTL is now embedded directly in each key, eliminating the separate per-shard expire table and saving roughly 26–35 bytes per key with expiry (#6923, #6933).
- Adds the TOPK command family (TOPK.RESERVE, TOPK.ADD, TOPK.INCRBY, TOPK.QUERY, TOPK.COUNT, TOPK.LIST, TOPK.INFO) and the CMS (Count-Min Sketch) command family (CMS.INITBYDIM, CMS.INITBYPROB, CMS.INCRBY, CMS.QUERY, CMS.INFO, CMS.MERGE), both with RDB persistence (#6950, #6896).
- [Up to 75x less memory for repetitive list data; 3–10x for real-world workloads] Introduces ZSTD dictionary-based compression for Redis lists via the new `list_experimental_zstd_dict_threshold` flag (#6967).

release_notes_v1.37.0_to_v1.38.0.md (new Anthropic backend)

# Dragonfly v1.38.0 Release Notes

This release delivers major advances across memory efficiency, vector search, probabilistic data structures, and operational observability. The standout improvement is a **~26% reduction in memory for TTL-heavy workloads** by eliminating the per-shard expire table and embedding TTL directly in keys. Vector search gains HNSW range search, improved hybrid search accuracy, deferred write operations during serialization, and better FT.AGGREGATE integration. Two new probabilistic data structure families — **Top-K (HeavyKeeper)** and **Count-Min Sketch** — are now fully implemented with RDB persistence. Operators gain richer Prometheus metrics for TLS, pipelines, defragmentation, stream access patterns, and pipeline latency. The release also includes over 30 bug fixes spanning search crashes, replication correctness, tiering races, and stream handling.

## Highlights

- **Eliminated the per-shard expire table**, embedding TTL directly in keys for [~26% memory reduction (900 MB → 665 MB)] on workloads with many expiring keys (#6923, #6933).
- **Full Top-K and Count-Min Sketch command families** added with RDB persistence, ACL integration, and Redis Stack compatibility (#6950, #6896, #6932).
- **HNSW vector search significantly expanded**: range search in FT.SEARCH, KNN + APPLY in FT.AGGREGATE, filtered brute-force for small candidate sets, and deferred writes during serialization (#6898, #7066, #6730, #6746).

release_notes_copilot_v1.37.0_to_v1.38.0.md (new Copilot backend)

# Dragonfly v1.38.0 Release Notes

Dragonfly v1.38.0 delivers substantial memory efficiency gains, expanded probabilistic data structure support, and richer vector search capabilities. The headline change is elimination of the per-shard ExpireTable — embedding TTL directly into keys saves ~26% memory for expiry-heavy workloads (900 MB → 665 MB for 10M keys with TTL). New TOPK and Count-Min Sketch command families join the existing probabilistic primitives with full RDB persistence and RedisBloom compatibility, while ZSTD dictionary-based list compression achieves up to 75× memory reduction in benchmarks. Vector search gains HNSW range queries in both FT.SEARCH and FT.AGGREGATE, improved hybrid search accuracy, and correct index restoration across shard-count-mismatched replicas. Operators also get a distroless Docker image, early TLS/TCP validation, new Prometheus metrics for TLS handshakes and pipeline latency, and 4× higher tiered storage write depth.

## Highlights

- [~26% memory reduction for workloads with expiring keys (900 MB → 665 MB for 10M keys with TTL)] Eliminated the per-shard ExpireTable by embedding TTL directly into each key's CompactKey encoding, removing 26–35 bytes of overhead per expiring key (#6923, #6933).
- [Up to 75× memory reduction in synthetic benchmarks (2.56 GiB → 32.84 MiB); 3–10× expected in real-world workloads] Introduced ZSTD dictionary-based list compression via a shared thread-local dictionary across all QList instances, controlled by the new `list_experimental_zstd_dict_threshold` flag (#6967).
- Added TOPK and Count-Min Sketch (CMS) command families with full RDB persistence, compatible with RedisBloom (#6950, #6896).

🤖 Generated with Claude Code

… cache, async

Add a GitHub Copilot backend (--backend copilot) that authenticates via the Copilot CLI instead of ANTHROPIC_API_KEY, making the tool usable without an Anthropic account.

Add a per-commit analysis disk cache (.release_notes_cache/ by default). The cache key hashes backend name, model, token budget, system prompt, and commit SHA, so any change to prompts or model automatically invalidates stale entries. Re-running over a range that was already analyzed skips all API calls.

Migrate the analysis pipeline from concurrent.futures.ThreadPoolExecutor to native asyncio, enabling a single asyncio.run() entry point and a live heartbeat loop during the long compose call.

Tighten commit categorisation prompts: add a CRITICAL RULE block and annotate every domain category as "(features only)", so crash fixes / data-corruption fixes / race conditions always land in bugfix rather than leaking into domain sections.

Refactor for separation of concerns:

- LLMBackend interface gains analyze_commit(), compose_progress_status(), post_process_notes(), notes_filename(); backend-specific logic stays in the backend, not in callers
- AnthropicBackend._sync_stream() replaces a 75-line nested closure; _ComposeStats dataclass replaces five loose instance variables
- _model renamed to public model; _make_cache_key(backend, sha) replaces the old 5-argument function
- main() decomposed into _build_arg_parser, _check_prerequisites, _build_backend, _resolve_cache_dir, _run_async
- FAIL_FAST_THRESHOLD, _is_fatal_error(), and all numeric constants (ANALYZE_MAX_TOKENS_*, COMPOSE_MAX_TOKENS, COMPOSE_HEARTBEAT_S, …) promoted to module level

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>

augmentcode · 2026-05-05T12:21:04Z

🤖 Augment PR Summary

Summary: This PR significantly expands tools/release_notes_generator.py by adding a GitHub Copilot-backed LLM option, introducing a per-commit on-disk analysis cache, and refactoring the pipeline to native asyncio for better concurrency control and streaming composition.

Changes:

- Adds --backend copilot (Copilot CLI auth) alongside the existing Anthropic backend.
- Introduces a per-commit JSON cache under .release_notes_cache/ (also added to .gitignore), with cache keys derived from backend/model/prompt/token budget + commit SHA.
- Refactors commit analysis to asyncio with bounded parallelism, retries, and a fail-fast path for clearly non-retriable errors.
- Adds streaming composition heartbeat/status reporting and backend-specific post-processing hooks.
- Tightens commit categorization guidance (especially forcing fixes into bugfix) and theme promotion rules for release notes structure.
- Updates output naming to distinguish Copilot runs (e.g. release_notes_copilot_<range>.md).

Technical Notes: Backends are now unified behind an LLMBackend interface, with backend-specific decisions (composition, progress, post-processing, filename) kept out of the orchestration code.
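
The bounded-parallelism and fail-fast behaviour listed above could look roughly like the sketch below. It is an illustration under assumptions, not the PR's code: `analyze_round`, `analyze`, and `is_fatal` are hypothetical names, the threshold value of 3 is invented (the PR does not state the real value of FAIL_FAST_THRESHOLD), and retries between rounds are left out.

```python
import asyncio
from typing import Awaitable, Callable, Dict, List, Tuple

FAIL_FAST_THRESHOLD = 3  # assumed value, for illustration only


async def analyze_round(
    shas: List[str],
    analyze: Callable[[str], Awaitable[str]],
    max_parallel: int,
    is_fatal: Callable[[Exception], bool],
) -> Tuple[Dict[str, str], List[str]]:
    """Analyze commits with at most `max_parallel` in flight.

    Returns (results, failed). Once FAIL_FAST_THRESHOLD fatal errors are
    seen, remaining commits are skipped but still reported in `failed`.
    """
    sem = asyncio.Semaphore(max_parallel)
    abort = asyncio.Event()
    results: Dict[str, str] = {}
    failed: List[str] = []
    fatal_count = 0

    async def one(sha: str) -> None:
        nonlocal fatal_count
        async with sem:
            if abort.is_set():
                failed.append(sha)  # skipped commits stay visible to the caller
                return
            try:
                results[sha] = await analyze(sha)
            except Exception as exc:
                failed.append(sha)
                if is_fatal(exc):
                    fatal_count += 1
                    if fatal_count >= FAIL_FAST_THRESHOLD:
                        abort.set()

    await asyncio.gather(*(one(sha) for sha in shas))
    return results, failed
```

A caller would typically feed `failed` into the next retry round, unless the abort was triggered by a non-retriable error such as an authentication failure.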

augmentcode

Review completed. 3 suggestions posted.

augmentcode · 2026-05-05T12:21:06Z

+    text = text.strip()
+    text = re.sub(r"^```(?:json)?\s*", "", text)
+    text = re.sub(r"\s*```$", "", text)
+    match = re.search(r"\{.*\}", text, re.DOTALL)


re.search(r"\{.*\}", ...) is greedy and can capture more than the intended JSON object if the backend returns extra braces (or multiple JSON objects), leading to intermittent json.loads failures. This makes the Copilot JSON path more brittle than necessary.

Severity: medium
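
A small self-contained demonstration of the failure mode described in this comment, plus one alternative based on the standard library's `json.JSONDecoder.raw_decode`. The helper name `first_json_object` and the sample `reply` string are made up for illustration; the PR's own `_parse_commit_analysis_json` may differ.

```python
import json
import re


def first_json_object(text: str) -> dict:
    """Return the first complete JSON object embedded in `text`.

    Tries each '{' as a candidate start and lets the JSON decoder decide
    where the object ends, so nested braces and trailing text are handled.
    """
    decoder = json.JSONDecoder()
    pos = text.find("{")
    while pos != -1:
        try:
            obj, _end = decoder.raw_decode(text, pos)
            return obj
        except json.JSONDecodeError:
            pos = text.find("{", pos + 1)  # not valid JSON here, try the next brace
    raise ValueError("no JSON object found")


reply = 'Here you go: {"category": "bugfix", "tags": {"area": "search"}} and a stray }'

# The greedy pattern runs to the LAST '}' in the reply, so the captured
# string contains trailing garbage and json.loads rejects it.
greedy = re.search(r"\{.*\}", reply, re.DOTALL).group(0)

parsed = first_json_object(reply)
```

Here `json.loads(greedy)` raises `json.JSONDecodeError`, while `first_json_object(reply)` returns the intended object.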

Copilot

Pull request overview

This PR upgrades the tools/release_notes_generator.py release-notes tool by adding a GitHub Copilot-backed LLM option, introducing a per-commit disk cache for analysis results, and refactoring the execution pipeline to use asyncio (including streaming/heartbeat progress during composition).

Changes:

- Add a new --backend copilot option alongside the existing Anthropic backend, with backend-specific behavior encapsulated behind an LLMBackend interface.
- Add a per-commit analysis disk cache (default .release_notes_cache/) keyed by backend/model/prompt/token budget + commit SHA.
- Migrate commit analysis + composition from ThreadPoolExecutor to asyncio, adding compose heartbeats and progress counters.

Reviewed changes

Copilot reviewed 2 out of 3 changed files in this pull request and generated 1 comment.

| File | Description |
|---|---|
| tools/requirements.txt | Notes optional Copilot SDK install for the new backend. |
| tools/release_notes_generator.py | Implements Copilot backend, async pipeline refactor, streaming compose progress, and per-commit analysis caching. |
| .gitignore | Ignores the new default on-disk cache directory. |

romange · 2026-05-05T12:27:06Z

Gold

Before the PR.
release_notes_v1.37.0_to_v1.38.0.gold.md

Copilot and Anthropic - after this change

The most notable change is that more issues are placed under the bug fixes category.

release_notes_copilot_v1.37.0_to_v1.38.0.md
release_notes_v1.37.0_to_v1.38.0.md

- Separate anthropic and pydantic imports so pydantic (needed by CommitAnalysis) is still available for --backend copilot when the anthropic package is not installed
- Fix _make_cache_key to length-prefix each field before hashing, preventing hash collisions between distinct (name, model, ...) tuples that share the same raw byte concatenation
- Fix _parse_commit_analysis_json to use json.JSONDecoder.raw_decode instead of a greedy r"\{.*\}" regex, which correctly handles nested braces and stops at the first complete JSON object
- Fix _analyze_round to add commits that were silently skipped due to abort_event back into the failed list after asyncio.gather completes, so they are visible to the caller and included in the next retry round

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
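
The length-prefix fix to the cache key can be illustrated with a short sketch. The choice of SHA-256, the 8-byte big-endian prefix, and the function names are assumptions made for this example; the commit message only says that each field is length-prefixed before hashing.

```python
import hashlib


def naive_key(*fields: str) -> str:
    """Hash the plain concatenation of fields (collision-prone)."""
    return hashlib.sha256("".join(fields).encode("utf-8")).hexdigest()


def length_prefixed_key(*fields: str) -> str:
    """Hash each field preceded by its byte length.

    The prefix makes field boundaries part of the hashed input, so
    ("ab", "c") and ("a", "bc") no longer feed identical bytes to the hash.
    """
    h = hashlib.sha256()
    for field in fields:
        data = field.encode("utf-8")
        h.update(len(data).to_bytes(8, "big"))  # assumed 8-byte length prefix
        h.update(data)
    return h.hexdigest()


# Two distinct tuples whose raw concatenation is the same string "abc".
collides = naive_key("ab", "c") == naive_key("a", "bc")
distinct = length_prefixed_key("ab", "c") != length_prefixed_key("a", "bc")
```

With the naive version the two tuples share one cache entry; with the length-prefixed version they get different keys.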

romange · 2026-05-05T12:52:59Z

augment review

augmentcode

Review completed. 1 suggestion posted.

augmentcode · 2026-05-05T12:56:17Z

+    return parser
+
+
+def _check_prerequisites(backend_name: str) -> Optional[str]:


_check_prerequisites doesn’t validate that Pydantic is installed/compatible, but later code unconditionally uses Pydantic v2 APIs like model_dump()/model_dump_json() (including for the Copilot backend + cache). This can crash at runtime if users follow the Copilot install path (pip install github-copilot-sdk) or have Pydantic v1 (Anthropic allows <3).

Severity: medium

Other Locations

tools/release_notes_generator.py:799

tools/release_notes_generator.py:943

tools/release_notes_generator.py:959
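
A prerequisite check along the lines this comment suggests might look like the following sketch. `check_pydantic_v2` and its injectable `get_version` parameter are hypothetical; only `importlib.metadata.version` and `PackageNotFoundError` are real standard-library APIs. Whether the PR adopted such a check is not stated in this thread.

```python
from importlib import metadata
from typing import Callable, Optional


def check_pydantic_v2(
    get_version: Callable[[str], str] = metadata.version,
) -> Optional[str]:
    """Return an error message if Pydantic v2+ is unavailable, else None.

    `get_version` is injectable so the check can be exercised without
    changing the installed packages.
    """
    try:
        version = get_version("pydantic")
    except metadata.PackageNotFoundError:
        return "pydantic is not installed; run: pip install 'pydantic>=2'"
    major = version.split(".", 1)[0]
    if not major.isdigit() or int(major) < 2:
        return f"pydantic {version} found, but v2 or newer is required"
    return None
```

Returning `Optional[str]` mirrors the signature of `_check_prerequisites` shown in the diff above, so the message could be surfaced the same way as other prerequisite errors.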

vyavdoshenko

lgtm

vyavdoshenko · 2026-05-05T12:38:23Z

    python tools/release_notes_generator.py HEAD~50..HEAD --max-parallel 4
+
+    # GitHub Copilot backend (uses Copilot CLI auth, no API key needed):
+    pip install github-copilot-sdk


pip install -r tools/requirements.txt

Copilot AI review requested due to automatic review settings May 5, 2026 12:17

romange force-pushed the feat/release-notes-copilot-backend branch from 2241a32 to 8d2e3e5 Compare May 5, 2026 12:18

Copilot started reviewing on behalf of romange May 5, 2026 12:18

augmentcode Bot reviewed May 5, 2026

Copilot AI reviewed May 5, 2026

romange requested a review from vyavdoshenko May 5, 2026 12:27

augmentcode Bot reviewed May 5, 2026

vyavdoshenko approved these changes May 5, 2026

romange merged commit cbdb4fa into main May 5, 2026
12 checks passed

romange deleted the feat/release-notes-copilot-backend branch May 5, 2026 14:28

romange merged 2 commits into main from feat/release-notes-copilot-backend
