Releases · rianvdm/discogs-mcp

09 Apr 01:50

rianvdm

v3.1.0

c47fa89

v3.1.0 - Search Ranking and Dedup Fixes


Quality-of-life fixes to search_collection ranking. Addresses #14 in full.

What's Fixed

🎯 Explicit genre terms now act as a hard filter

Before: the query "mellow jazz for late evening" would fire the mood detector on "mellow", OR its suggested styles against your collection, and ignore the literal word "jazz". Top 8 results came back with zero jazz tags — dominated by ambient, drone, and shoegaze.

Now: if a query contains a term from the CONCRETE_GENRES set (jazz, rock, folk, ambient, etc.), that term becomes a hard filter. "mellow jazz" returns only jazz records, with mood acting as a soft ranking signal on top. Pure mood queries like "something for a rainy Sunday" still work the same way they did before.
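A minimal sketch of the hard-filter idea, not the code from the repo. The `Release` shape, the helper names `extractGenreTerms` and `applyGenreFilter`, the four-entry genre set, and the choice to require every explicit term are all assumptions made for illustration.

```typescript
// Hypothetical release shape; the real collection item type may differ.
interface Release {
  title: string;
  genres: string[];
  styles: string[];
}

// Illustrative subset only; the actual CONCRETE_GENRES set is larger.
const CONCRETE_GENRES = new Set(["jazz", "rock", "folk", "ambient"]);

// Return the concrete genre terms that appear as whole words in the query.
function extractGenreTerms(query: string): string[] {
  const words = query.toLowerCase().split(/[^a-z0-9]+/).filter(Boolean);
  return Array.from(new Set(words.filter((w) => CONCRETE_GENRES.has(w))));
}

// Keep only releases whose genre or style tags mention every explicit term.
// With no explicit terms the input passes through untouched (pure mood query).
function applyGenreFilter(releases: Release[], terms: string[]): Release[] {
  if (terms.length === 0) return releases;
  return releases.filter((r) => {
    const tags = r.genres.concat(r.styles).map((t) => t.toLowerCase());
    return terms.every((term) => tags.some((tag) => tag.includes(term)));
  });
}

const collection: Release[] = [
  { title: "Midnight Blue", genres: ["Jazz"], styles: ["Post Bop"] },
  { title: "Hypothetical Shoegaze LP", genres: ["Rock"], styles: ["Shoegaze"] },
];

const terms = extractGenreTerms("mellow jazz for late evening");
const filtered = applyGenreFilter(collection, terms);
console.log(terms, filtered.map((r) => r.title));
```

Mood scoring would then run only on `filtered`, so it can reorder jazz records but never reintroduce non-jazz ones.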

📊 Proportional mood scoring (no more recency bias)

Before: mood matching was boolean — a release either matched the mood-expanded style set or it didn't. Tiebreakers fell through to date_added, so recently-added records dominated mood query results regardless of semantic fit. A 2025 alt-rock LP with one incidental mood-match tag would surface in the top 8 for unrelated mood queries.

Now: mood score = matches / unique_tags. A release tagged ["Jazz", "Smooth Jazz"] scores 1.0 for "mellow" (both match). A release tagged ["Alt Rock", "Indie", "Dream Pop", "Shoegaze", "Lo-Fi", "Rock"] with one mood match scores ~0.17. Broadly-tagged recent records get penalized relative to focused ones.
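The arithmetic can be checked with a short sketch. The function name `moodScore` and the contents of `mellowStyles` are assumptions; the notes do not list which styles "mellow" actually expands to.

```typescript
// Fraction of a release's unique tags that fall in the mood-expanded style set.
function moodScore(tags: string[], moodStyles: Set<string>): number {
  const unique = Array.from(new Set(tags.map((t) => t.toLowerCase())));
  if (unique.length === 0) return 0;
  const matches = unique.filter((t) => moodStyles.has(t)).length;
  return matches / unique.length;
}

// Hypothetical expansion of "mellow", chosen so the two examples above work out.
const mellowStyles = new Set(["jazz", "smooth jazz", "dream pop"]);

const focused = moodScore(["Jazz", "Smooth Jazz"], mellowStyles);
const broad = moodScore(
  ["Alt Rock", "Indie", "Dream Pop", "Shoegaze", "Lo-Fi", "Rock"],
  mellowStyles,
);
console.log(focused.toFixed(2), broad.toFixed(2)); // 1.00 0.17
```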

date_added is also removed from the general sort chain entirely. The new tiebreaker order for non-temporal queries is:

moodScore desc → rating desc → year asc → artist+title alpha → id asc

Deterministic, no recency leak. Temporal queries (recent, oldest, etc.) still sort by date_added — that's their explicit purpose.
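One way to express that tiebreaker chain is a single comparator. This is a sketch under assumptions: the `ScoredRelease` field names are invented here, and it ignores missing ratings or years, which the real `sortScoredReleases` presumably has to handle.

```typescript
// Hypothetical shape of a scored release.
interface ScoredRelease {
  id: number;
  artist: string;
  title: string;
  year: number;
  rating: number;
  moodScore: number;
}

// moodScore desc, rating desc, year asc, artist+title alpha, id asc.
// The first non-zero comparison decides; id makes the order total.
function compareScored(a: ScoredRelease, b: ScoredRelease): number {
  return (
    b.moodScore - a.moodScore ||
    b.rating - a.rating ||
    a.year - b.year ||
    `${a.artist} ${a.title}`.localeCompare(`${b.artist} ${b.title}`) ||
    a.id - b.id
  );
}

const candidates: ScoredRelease[] = [
  { id: 2, artist: "Artist B", title: "Recent LP", year: 2025, rating: 5, moodScore: 1 / 6 },
  { id: 1, artist: "Kenny Burrell", title: "Midnight Blue", year: 1999, rating: 5, moodScore: 1 },
  { id: 3, artist: "Artist A", title: "Older LP", year: 1962, rating: 5, moodScore: 1 },
];

const ranked = candidates.slice().sort(compareScored);
console.log(ranked.map((r) => r.id)); // [ 3, 1, 2 ]
```

Because `date_added` never appears in the chain, the order in which records were added cannot influence the result.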

🎵 Dedup of multiple pressings

Before: owning both a Vinyl and a CD pressing of the same album meant two rows in every search result, wasting slots in an 8-result response.

Now: dedupByMaster collapses pressings by Discogs master_id (with a normalized artist + title + year fallback for masterless releases). Earliest-year pressing is the representative. Result rows show aggregated formats:

Kenny Burrell - Midnight Blue (1999)
  Format: Vinyl, CD | Genre: Jazz | Styles: Post Bop

The representative inherits the MAX mood score across the group, so a well-tagged reissue can still surface a poorly-tagged original. Temporal queries skip dedup — "recent additions" correctly shows each copy you added.
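The grouping rules can be sketched as follows. This is an illustration rather than the repo's `dedupByMaster`: the `Pressing` fields, the `groupKey` normalization, and the function name `dedupPressings` are assumptions. Only `ownedFormats` is a name taken from these notes.

```typescript
// Hypothetical shape of one owned pressing.
interface Pressing {
  id: number;
  masterId?: number; // assumed absent or 0 for masterless releases
  artist: string;
  title: string;
  year: number;
  format: string;
  moodScore: number;
}

interface DedupedRelease extends Pressing {
  ownedFormats: string[];
}

// Group by master_id when present, else by normalized artist + title + year.
function groupKey(p: Pressing): string {
  if (p.masterId) return `master:${p.masterId}`;
  const norm = (s: string) => s.toLowerCase().replace(/[^a-z0-9]+/g, " ").trim();
  return `fallback:${norm(p.artist)}|${norm(p.title)}|${p.year}`;
}

function dedupPressings(pressings: Pressing[]): DedupedRelease[] {
  const groups = new Map<string, Pressing[]>();
  for (const p of pressings) {
    const key = groupKey(p);
    const group = groups.get(key);
    if (group) group.push(p);
    else groups.set(key, [p]);
  }
  const result: DedupedRelease[] = [];
  groups.forEach((group) => {
    // Earliest-year pressing represents the group.
    const rep = group.reduce((best, p) => (p.year < best.year ? p : best));
    result.push({
      ...rep,
      // The representative inherits the best mood score in the group.
      moodScore: Math.max(...group.map((p) => p.moodScore)),
      ownedFormats: Array.from(new Set(group.map((p) => p.format))),
    });
  });
  return result;
}

const owned: Pressing[] = [
  { id: 10, masterId: 100, artist: "Kenny Burrell", title: "Midnight Blue", year: 1999, format: "Vinyl", moodScore: 0.5 },
  { id: 11, masterId: 100, artist: "Kenny Burrell", title: "Midnight Blue", year: 2014, format: "CD", moodScore: 1 },
  { id: 12, artist: "Artist A", title: "Masterless EP", year: 2001, format: "Vinyl", moodScore: 0 },
];

const rows = dedupPressings(owned);
console.log(rows.map((r) => `${r.title} (${r.year}) ${r.ownedFormats.join(", ")}`));
```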

Architecture

Two new pure modules under src/utils/:

searchQueryParser.ts — classifies a query once (temporal flags, decade terms, explicit genre terms, mood analysis). Consumers share the parsed result without re-parsing.
searchRanking.ts — scoreReleases, dedupByMaster, sortScoredReleases, and the end-to-end applySearchPipeline orchestrator. No I/O, no mutation, no network.

The search_collection tool now calls applySearchPipeline(allResults, parsed) in place of the old rating+date sort, and the render loop reads release.ownedFormats so the Format: line shows aggregated formats from deduped groups.

Cleanup

Deleted ~145 lines of dead scoring code in DiscogsClient.searchCollectionWithQuery that were unreachable once the cached path always runs the new pipeline. Two tests that asserted on the dead behavior were removed with the code.

Test coverage

203 tests passing across 20 test files. New coverage:

9 unit tests for searchQueryParser
23 unit tests for searchRanking (6 scoring, 8 dedup, 5 sort, 4 pipeline)
3 end-to-end regression tests encoding the exact scenarios from #14

Verified in production

Tested against a real 1510-item collection on the hosted instance. Query "mellow jazz for late evening" now returns Kenny Burrell, Freddie Hubbard, Dave Brubeck, Wes Montgomery, Bill Evans, and Khruangbin — all jazz, with multi-pressing rows correctly collapsed.

Commits

c47fa89 chore: bump version badge to 3.1.0
af6776e refactor(search): delete dead scoring code in DiscogsClient.searchCollectionWithQuery
9ffa3e6 test(search): add Issue #14 regression tests for ranking bugs
e0e12c5 feat(search): wire ranking pipeline into search_collection tool
027f20f feat(search): add sortScoredReleases and applySearchPipeline
2651ff8 feat(search): add dedupByMaster for master_id grouping with fallback
9828dfc feat(search): add scoreReleases with proportional mood scoring
07d0bc9 feat(search): add searchQueryParser for one-time query classification
a8c8eba chore: export CONCRETE_GENRES and add master_id to collection item type

Full diff: v3.0.0...v3.1.0


08 Apr 14:34

rianvdm

v3.0.0

93f5556

v3.0.0 - Private Instance & Self-Hosting Model

⚠️ Breaking Change

https://discogs-mcp.com/mcp is no longer a shared public MCP server. It is now the maintainer's private instance, locked to a single Discogs account. Any user previously connected will get a 403 "Access Restricted" page on their next authentication attempt.

Why the change? Discogs API rate limits (60 req/min, effectively per-IP from a Cloudflare Worker) are too tight to share across users. A single active collection query can saturate the budget, which degrades the experience for everyone. Rather than run a broken multi-tenant service, the hosted instance is now owner-only, and everyone else deploys their own copy with their own Discogs API credentials.

If you were using the hosted instance: clone the repo and follow the new Self-Hosting guide. It takes about ten minutes, runs on the Cloudflare Workers free tier, and you get a sole-tenant 60 req/min budget.

What's New

🔒 Allowlist-based access control

Configurable via a new ALLOWED_DISCOGS_USER_ID environment variable in wrangler.toml under [env.production.vars]:

Accepts a single numeric Discogs user ID (e.g. "123456") or a comma-separated list (e.g. "123456,789012,345678") so small trusted groups can share one deployment
Empty / unset = open instance (the default, which is what self-hosters and local dev get out of the box)
Primary gate at the OAuth callback: unauthorized users see a friendly 403 HTML page linking to the self-hosting docs, and no stale auth tokens are ever persisted for them
Belt-and-braces runtime check on every authenticated MCP request, so any grants or sessions that predate enabling the allowlist get invalidated cleanly on their next request
New exported helpers parseAllowlist() and checkAllowlist() in src/auth/oauth-handler.ts, covered by 10 new unit tests
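The behaviour in the list above could look roughly like this. The names `parseAllowlist` and `checkAllowlist` come from these notes, but their real signatures are not shown, so the parameter and return types here are assumptions.

```typescript
// Returns null for an open instance (unset or empty), otherwise the allowed IDs.
function parseAllowlist(raw: string | undefined): Set<string> | null {
  if (!raw) return null;
  const ids = raw
    .split(",")
    .map((s) => s.trim())
    .filter((s) => s.length > 0);
  return ids.length > 0 ? new Set(ids) : null;
}

// An open instance admits everyone; otherwise the user ID must be listed.
function checkAllowlist(allowlist: Set<string> | null, userId: string | number): boolean {
  if (allowlist === null) return true;
  return allowlist.has(String(userId));
}

const group = parseAllowlist("123456, 789012,345678");
console.log(checkAllowlist(group, 789012)); // true
console.log(checkAllowlist(group, 999999)); // false
console.log(checkAllowlist(parseAllowlist(undefined), 999999)); // true
```

The same check would be called twice per the design above: once at the OAuth callback and again on each authenticated MCP request.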

🏠 Repository pivot to a self-host-first model

README.md restructured around a full Self-Hosting walkthrough covering prerequisites, KV namespace setup, Discogs API credentials, optional allowlist configuration, deployment, and MCP client configuration for Claude Desktop, Claude Code, Cursor, Windsurf, OpenCode, and generic MCP clients
Prominent "This Is Not a Shared Service" notice at the top of Quick Start explaining why
Marketing page at / reframed with a new yellow private-instance banner, a primary CTA pointing to the self-hosting guide, and updated metadata so search engines index it as an open-source project rather than a free shared service
The `## Self-Hosting` heading anchor matches the link used by the 403 rejection page and the marketing CTA

🚦 Centralized rate limiter via a Durable Object (architectural rewrite)

The rate limiter was rebuilt from scratch earlier in this release cycle. The old per-user KV-based throttle has been replaced by a single DiscogsRateLimiter Durable Object that serializes all outbound Discogs traffic through one authoritative queue:

Authoritative budget tracking — reads x-discogs-ratelimit and x-discogs-ratelimit-remaining directly from Discogs responses rather than guessing locally, so the DO self-corrects against reality
Adaptive throttling tiers — full speed when the budget has headroom, 1s / 3s / 10s throttles as the budget drains, full pause until the window resets at zero
FIFO queue with 20-slot cap and 90s per-request timeout, protecting against runaway memory growth or hung callers
429 recovery — reads Retry-After, forces the local budget to zero, re-queues the failed request at the front with a fresh deadline, sets a Durable Object alarm, and pauses the drainer until recovery
Alarm-driven window reset refills the budget even during idle periods, so the first request after a quiet stretch never inherits a stale depleted state
Cold-start staleness check discards any persisted budget older than the 60s window, avoiding false exhaustion on DO restart
Observability logging ([RL] lines) exposes queue depth, budget state, throttle decisions, and 429 events in wrangler tail for live debugging

All DiscogsClient calls now route through this DO. Pure helpers (getDelay, updateBudgetFromHeaders, shouldRejectQueue) are exported and unit-tested so the throttle tiers can't silently drift.
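To make the tiering concrete, here is a sketch of two pure helpers in that spirit. The delays (1s / 3s / 10s) and header names come from these notes; the `remaining` cut-offs (30 / 15 / 5), the `Budget` and `HeaderReader` types, and the exact signatures are hypothetical and may not match the exported `getDelay` and `updateBudgetFromHeaders`.

```typescript
interface Budget {
  limit: number;
  remaining: number;
}

// Minimal stand-in for the Fetch API's Headers, so the sketch has no runtime dependency.
interface HeaderReader {
  get(name: string): string | null;
}

// Hypothetical cut-offs: the notes give the delays but not the thresholds.
function getDelayMs(remaining: number, msUntilReset: number): number {
  if (remaining <= 0) return msUntilReset; // full pause until the window resets
  if (remaining <= 5) return 10000;
  if (remaining <= 15) return 3000;
  if (remaining <= 30) return 1000;
  return 0; // plenty of headroom: full speed
}

// Prefer the values Discogs reports; keep the current value when a header is missing.
function updateBudgetFromHeaders(headers: HeaderReader, current: Budget): Budget {
  const limit = Number.parseInt(headers.get("x-discogs-ratelimit") ?? "", 10);
  const remaining = Number.parseInt(headers.get("x-discogs-ratelimit-remaining") ?? "", 10);
  return {
    limit: Number.isNaN(limit) ? current.limit : limit,
    remaining: Number.isNaN(remaining) ? current.remaining : remaining,
  };
}

const fakeHeaders: HeaderReader = {
  get: (name) => (name === "x-discogs-ratelimit-remaining" ? "12" : null),
};
const updated = updateBudgetFromHeaders(fakeHeaders, { limit: 60, remaining: 40 });
console.log(updated, getDelayMs(updated.remaining, 60000));
```

Keeping these as pure functions means the tier boundaries can be pinned by unit tests without spinning up a Durable Object.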

⏱️ Collection-fetch time budget raised (40s → 105s)

get_collection_stats, search_collection, and get_recommendations were consistently returning partial results on larger collections (e.g. "700 of 1510 indexed") because the outer tool handler capped itself at 40s out of caution about a presumed 45s MCP client timeout. Under sustained Discogs throttling of 3-10s per page, 15 pages couldn't fit in the window, so the fetcher would bail halfway and the stats would lie to the user.

TOOL_BUDGET_MS: 40s → 105s across all five call sites in src/mcp/tools/authenticated.ts and src/mcp/resources/discogs.ts
Inner fetcher timeBudgetMs: 45s → 110s
POLL_TIMEOUT_MS cap in cachedDiscogs.ts: 45s → 110s
Verified end-to-end against a 1510-item collection — full stats now return in a single call with no partial warning
Known compatibility note: if your MCP client enforces a strict sub-105s streamable-http timeout, you may see disconnections mid-fetch. Confirmed working in Claude Code and MCP Inspector. Claude Desktop behavior under sustained pressure is untested — please file an issue if you hit a timeout and I'll drop the budgets back down and solve this differently.

Other changes

Rate limiter hardening: fixed stale budget on DO cold start, increased queue timeout from 60s to 90s so queued requests survive 60s 429 pauses, reset per-request timeouts on 429 re-queue to prevent premature 504s, increased the collection-fetch time budget from 30s to 45s (later bumped again in this release to 105s)
Rate limiter observability: added [RL] structured logging for queue state, budget transitions, and throttle decisions

Test suite

170 tests passing across 17 test files (up from 148 in v2.5.0)
New unit tests cover allowlist parsing (single ID, list, whitespace, empty, mismatch), the DO rate limiter (delay tiers, header parsing, queue-full rejection), and the updated marketing-page / client-integration paths

Housekeeping

Dead MCP_RL KV binding removed
Older per-user throttle code removed in favor of the centralized DO

Full changelog: v2.5.1...v3.0.0


28 Mar 01:49

rianvdm

v2.5.1

d32ff66

v2.5.1 - Bug Fixes

Bug Fixes

Fixed search_collection to expose instance_id and folder_id in output — these are required by move_release, rate_release, and remove_from_collection, but were previously missing, making collection mutations impossible without manual lookup
Fixed folder_id being incorrectly included in output for folder 0 (not returned by the Discogs API)


25 Mar 23:48

rianvdm

v2.5.0

cfa4c97

v2.5.0

What's New

Smarter Semantic Search (#10)

Semantic queries like "empowering female vocals" or "road trip music" no longer dump your entire collection into the LLM context. Instead:

Best-effort keyword filter first — extracts meaningful keywords and tries in-memory matching before falling back
750-release cap on LLM dump — when fallback is needed, prioritizes rated items and recent additions instead of sending everything
"Search more broadly" escape hatch — say this to trigger the full (capped) collection search if keyword matches aren't what you want
Better LLM instructions — asks the model to pick 8–12 matches with brief rationale for each
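The capped fallback might be sketched like this. The exact prioritization is not spelled out in these notes, so the rule used here (rated items first, then most recently added) is an assumption, as are the `CollectionItem` fields and the name `capForLlm`.

```typescript
// Hypothetical item shape; rating 0 is assumed to mean "unrated".
interface CollectionItem {
  id: number;
  rating: number;
  dateAdded: string; // ISO 8601 timestamp, so string order matches time order
}

// Rated items first, then newest additions, truncated to the cap.
function capForLlm(items: CollectionItem[], cap = 750): CollectionItem[] {
  return items
    .slice()
    .sort((a, b) => {
      const ratedDiff = Number(b.rating > 0) - Number(a.rating > 0);
      if (ratedDiff !== 0) return ratedDiff;
      return b.dateAdded.localeCompare(a.dateAdded);
    })
    .slice(0, cap);
}

const sample: CollectionItem[] = [
  { id: 1, rating: 0, dateAdded: "2025-03-01T00:00:00Z" },
  { id: 2, rating: 4, dateAdded: "2020-01-01T00:00:00Z" },
  { id: 3, rating: 5, dateAdded: "2024-06-01T00:00:00Z" },
];

const capped = capForLlm(sample, 2);
console.log(capped.map((i) => i.id)); // [ 3, 2 ]
```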

Faster First Calls / Rate Limit Fix (#11)

The first search on a cold cache is dramatically faster:

Per-user throttle — each user gets their own Discogs API rate budget; one user's requests no longer block another
Parallel page fetching — collection pages now fetch in batches of 3 instead of one-at-a-time
Reduced proactive throttle — 500ms between requests (down from 1500ms), with retry logic as safety net
Cold-cache fetch time for a 1500-item collection: ~3s (was ~23s)
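Batched fetching of this kind can be sketched with `Promise.all`. The function name `fetchAllPages`, its parameters, and the decision to pause between batches (rather than between individual requests) are assumptions made for the example.

```typescript
// Fetch pages 1..totalPages in batches, running each batch concurrently.
async function fetchAllPages<T>(
  totalPages: number,
  fetchPage: (page: number) => Promise<T>,
  batchSize = 3,
  pauseMs = 500,
): Promise<T[]> {
  const results: T[] = [];
  for (let start = 1; start <= totalPages; start += batchSize) {
    const pages: number[] = [];
    for (let p = start; p < start + batchSize && p <= totalPages; p++) pages.push(p);
    // Promise.all preserves input order, so results stay in page order.
    const batch = await Promise.all(pages.map((p) => fetchPage(p)));
    results.push(...batch);
    // Proactive throttle between batches, skipped after the last one.
    if (start + batchSize <= totalPages) {
      await new Promise<void>((resolve) => setTimeout(resolve, pauseMs));
    }
  }
  return results;
}

// Usage with a stand-in fetcher; a real one would call the Discogs API.
fetchAllPages(7, async (page) => `page-${page}`, 3, 0).then((pages) => {
  console.log(pages.length); // 7
});
```

With 15 pages this makes 5 batches instead of 15 sequential requests, which is where most of the cold-cache saving would come from.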

Housekeeping

Removed dead RateLimiter code that was never wired up
148 tests passing across 15 test files


24 Mar 12:46

jhuggart

v2.4.0

ec824f1

v2.4.0 - Collection Write Tools

What's New

This release adds write capabilities to your Discogs collection, letting you manage your collection directly through Claude.

Collection Folders

list_folders — View all your collection folders
create_folder — Create a new folder
edit_folder — Rename an existing folder
delete_folder — Delete a folder (must be empty)

Collection Items

add_to_folder — Add a release to a folder by release ID
remove_from_folder — Remove an instance from a folder
edit_instance — Update a collection item's rating (1–5) or move it to a different folder

Custom Fields

list_custom_fields — View your custom field definitions
edit_custom_field — Update a custom field value on a collection item

All write operations automatically invalidate the local cache so subsequent reads reflect the latest state.


16 Mar 01:18

rianvdm

v2.3.0

8eb32c7

v2.3.0 - MCP OAuth 2.1 Compliance

What's New

All MCP clients (Claude Code, Claude Desktop, opencode) now open a browser automatically for first-time authentication — no more copy-paste URLs.

Changes

MCP OAuth 2.1 compliance — unauthenticated /mcp requests return 401 WWW-Authenticate triggering automatic browser flow via @cloudflare/workers-oauth-provider
New auth routes — /authorize, /discogs-callback, /.well-known/oauth-protected-resource implement the full OAuth 2.1 + Discogs OAuth 1.0a bridge
Session-based auth preserved — existing session_id param and Mcp-Session-Id header paths continue to work unchanged
Security fix — removed OAuth signing key from console logs
Removed JWT sessions — replaced with direct KV storage (7-day sessions) and OAuth bearer tokens


15 Mar 23:43

rianvdm

v2.2.1

267c8d7

v2.2.1 - Large Collection Support (up to 5,000 items)

What's Changed

Bug Fix: Incomplete results for large collections (#6)

Users with collections larger than 2,500 items were seeing truncated results in both search_collection and get_collection_stats. This release fixes that.

Changes:

Raises the collection index cap from 2,500 to 5,000 items (maxPages 25→50)
Fixes get_collection_stats and search_collection to report the real Discogs collection total, not just the number of indexed items
Fixes a broken truncation detection check that never fired
Fixes the collections KV cache TTL from 30 min to 45 min to match the intended cache window
Adds a clear warning in tool output when a collection exceeds 5,000 items (e.g. "Your collection has 6,500 releases but only 5,000 were indexed")
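The cap and the warning text reduce to a small calculation. This sketch assumes 100 items per page (consistent with 25 pages covering 2,500 items); the function name `indexSummary` and its return shape are hypothetical.

```typescript
const PER_PAGE = 100; // assumed page size: 25 pages covered 2,500 items
const MAX_PAGES = 50;

interface IndexSummary {
  total: number; // real Discogs collection total
  indexed: number; // how many releases were actually fetched
  warning?: string;
}

function indexSummary(total: number, maxPages = MAX_PAGES, perPage = PER_PAGE): IndexSummary {
  const cap = maxPages * perPage;
  if (total <= cap) return { total, indexed: total };
  const fmt = (n: number) => n.toLocaleString("en-US");
  return {
    total,
    indexed: cap,
    warning: `Your collection has ${fmt(total)} releases but only ${fmt(cap)} were indexed`,
  };
}

console.log(indexSummary(6500).warning);
// Your collection has 6,500 releases but only 5,000 were indexed
```

Reporting `total` separately from `indexed` is what lets the tools show the real collection size even when the index is truncated.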

Full changelog: v2.2.0...v2.2.1


01 Mar 20:15

rianvdm

v2.2.0

609e7ba

v2.2.0 - Semantic Collection Search

What's New

Semantic Search

The search_collection tool now understands conceptual and descriptive queries that go beyond literal metadata matching. Queries like "strong empowering female voice", "perfect for a road trip through the desert", or "albums my dad would love" now work by returning the full collection context so the calling LLM can apply its world knowledge to select the best matches.

How it works

Queries are classified as semantic (conceptual/descriptive) or literal (artist, album, genre, year)
Literal queries continue to use the existing filter pipeline
Semantic queries return the full collection in a compact format, letting the LLM make intelligent selections based on its knowledge of artists and albums
The existing mood mapping system remains active for mood-based queries (e.g., "mellow jazz")

CI Fix

Added pull-requests: write permission to the deploy workflow so the PR comment step no longer fails

Closes #4


01 Mar 19:55

rianvdm

v2.1.1

2867029

v2.1.1 - Security & Rate Limit Fixes

Summary

Fixes a security leak, removes dead code, and resolves Cloudflare Workers stalled HTTP response issues that caused rate limit exhaustion.

What Changed

Security & code quality (`code-review-fixes`)

Fixed security leak: Removed generateAuthInstructions parameter that accepted baseUrl from env, replaced with config constant to prevent URL injection
Removed dead code: Cleaned up unused variables and functions flagged by linter
Lint fixes: Prefixed unused params with _, used const for non-reassigned variables

Rate limiting & stalled responses (`fix/rate-limiting-stalled-responses`)

Fixed stalled HTTP responses: ResponseError now consumes the response body (await response.text()) before throwing, preventing Cloudflare Workers from holding unconsumed response bodies that cause deadlocks when the concurrent in-flight response limit is reached
Always use in-memory filtering: Single-query searches now use the cached complete collection + in-memory filtering (same as mood-expanded queries), instead of paginating the entire collection via API. This eliminates unnecessary API calls for the most common search path.
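The stalled-response fix follows a general pattern: read the body of a failed response before throwing. The sketch below shows that pattern only. `ResponseError` is a name from these notes, but its constructor, the `ResponseLike` interface, and the helper `ensureOk` are assumptions.

```typescript
// Minimal stand-in for the parts of a Fetch API Response this sketch uses.
interface ResponseLike {
  ok: boolean;
  status: number;
  text(): Promise<string>;
}

class ResponseError extends Error {
  constructor(public readonly status: number, public readonly body: string) {
    super(`Request failed with status ${status}`);
    this.name = "ResponseError";
  }
}

// Pass successful responses through; for failures, drain the body first so the
// runtime is not left holding an unconsumed response while the error propagates.
async function ensureOk<R extends ResponseLike>(response: R): Promise<R> {
  if (response.ok) return response;
  const body = await response.text().catch(() => "");
  throw new ResponseError(response.status, body);
}

// Usage with a fake 429 response.
const fake429: ResponseLike = { ok: false, status: 429, text: async () => "slow down" };
ensureOk(fake429).catch((err) => {
  console.log(err.name, err.status, err.body);
});
```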

Commits

eff4182 Fix lint errors: prefix unused param, use const for non-reassigned variable
ebba3bd New skills
6735bd4 Fix security leak, remove dead code, and move hardcoded URL to config
2867029 Fix stalled HTTP responses and always use in-memory filtering for searches


04 Feb 16:54

rianvdm

v2.1.0

3aea6a3

v2.1.0 - Rate Limit Optimization

Summary

Dramatically reduces Discogs API call usage by routing all tools through a single cached complete-collection dataset. Previously, a mood-based search on a 1000-item collection could trigger 40+ API calls (exhausting most of the 60/minute budget in one request). Now the same operation costs 0-11 calls.

What Changed

Fetch once, filter many

All tools that need the full collection (search_collection, get_recommendations, get_collection_stats) now share a single cached dataset via getCompleteCollection() (45-minute TTL). The first tool call in a session fetches all pages; every subsequent call filters in-memory with zero API calls.
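The "fetch once, filter many" pattern can be sketched with an in-memory cache. The real implementation caches in KV behind `getCompleteCollection()`; the factory `createCollectionCache`, its injected clock, and the item shape below are assumptions for illustration.

```typescript
// Wrap an expensive "load every page" function in a TTL cache.
function createCollectionCache<T>(
  loadAll: () => Promise<T[]>,
  ttlMs = 45 * 60 * 1000, // 45-minute TTL, as described above
  now: () => number = () => Date.now(),
): () => Promise<T[]> {
  let cached: { items: T[]; expiresAt: number } | undefined;
  return async function getCompleteCollection(): Promise<T[]> {
    const t = now();
    if (cached && cached.expiresAt > t) return cached.items; // warm: zero API calls
    const items = await loadAll(); // cold: one full pagination pass
    cached = { items, expiresAt: t + ttlMs };
    return items;
  };
}

// Usage: two tools share one dataset and filter it in memory.
interface Item {
  title: string;
  genre: string;
}
const getAll = createCollectionCache<Item>(async () => [
  { title: "Midnight Blue", genre: "Jazz" },
  { title: "Hypothetical Rock LP", genre: "Rock" },
]);

getAll()
  .then((items) => items.filter((i) => i.genre === "Jazz"))
  .then((jazz) => console.log(jazz.length)); // 1
```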

API call reduction (1000-item collection)

| Scenario | Before | After |
| --- | --- | --- |
| Mood search (cold cache) | ~41 calls | ~11 calls |
| Mood search (warm cache) | 0 | 0 |
| search + stats + recommendations (cold) | ~33 calls | ~11 calls |
| Same sequence (warm cache) | 0 | 0 |
| Second search, different query (warm) | ~41 calls | 0 calls |

Changes

CachedDiscogsClient: Added getCompleteCollectionReleases() and computeStatsFromReleases(). Increased maxPages from 10 to 25 (supports up to 2500-item collections).
search_collection tool: Mood-expanded queries (up to 4 variants) now filter a single cached collection in-memory instead of each triggering full API pagination.
get_recommendations tool: Replaced manual pagination loop with getCompleteCollectionReleases().
get_collection_stats tool: Computes stats from cached collection data instead of its own pagination pass.
MCP resources: discogs://collection returns the full collection (not just page 1). discogs://search filters the cached collection in-memory.
All paths fall back gracefully when no cached client is available.

Documentation

Added docs/RATE-LIMIT-OPTIMIZATION.md with full problem analysis and implementation details.

