Adopt w3c_api 0.3.0: delegate rate-limiting, drop RDF stack #48
Merged
w3c_api 0.3.0 builds the HAL client with a retry layer for the W3C rate-limit (HTTP 403) and connection/timeout errors. relaton-w3c now relies on the client for those retries, so require the released version. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
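As a rough illustration of what a retry layer for HTTP 403 and connection/timeout errors does, here is a minimal plain-Ruby sketch. It is not w3c_api's code (the PR summary says the client is built with faraday-retry); `HttpStatusError`, `with_retries`, the attempt count and the backoff values are all hypothetical names and numbers chosen for the example.

```ruby
require "timeout"

# Hypothetical error the caller's HTTP code would raise for a non-2xx status.
class HttpStatusError < StandardError
  attr_reader :status

  def initialize(status)
    @status = status
    super("HTTP #{status}")
  end
end

# Assumed for illustration: 403 is treated as the rate-limit signal.
RETRYABLE_STATUSES = [403].freeze
RETRYABLE_ERRORS = [Timeout::Error, Errno::ECONNRESET, Errno::ECONNREFUSED].freeze

# Runs the block, retrying with exponential backoff on retryable failures.
# `sleeper` is injectable so the delay can be observed or skipped in tests.
def with_retries(max_attempts: 3, base_delay: 0.5, sleeper: ->(s) { sleep(s) })
  attempt = 0
  begin
    attempt += 1
    yield
  rescue HttpStatusError, *RETRYABLE_ERRORS => e
    retryable = !e.is_a?(HttpStatusError) || RETRYABLE_STATUSES.include?(e.status)
    # Re-raise when the error is not retryable or the attempts are used up.
    raise unless retryable && attempt < max_attempts

    sleeper.call(base_delay * (2**(attempt - 1)))
    retry
  end
end
```

With retries handled at this level, callers such as relaton-w3c only ever see an error after the attempts are exhausted.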
Retries now live upstream: w3c_api retries the W3C rate-limit (HTTP 403) and connection/timeout errors, and lutaml-hal retries 429/5xx. So RateLimitHandler no longer retries — it only memoizes realized objects and, on a terminal error, skips the resource (caches nil) so one bad link doesn't abort the crawl. Network errors are left uncached so a later reference can try again. Specs updated to the no-retry contract. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
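The no-retry contract described in this commit can be sketched roughly as follows. This is an illustration under assumptions, not the gem's code: `TerminalError` and `NetworkError` are hypothetical stand-ins for whatever the HTTP stack raises, the module name is made up, and links are assumed to respond to `href` and `realize`.

```ruby
# Hypothetical error classes; the real ones come from the HTTP stack.
class TerminalError < StandardError; end # e.g. upstream retries exhausted
class NetworkError < StandardError; end  # e.g. connection reset, timeout

module RateLimitHandlerSketch
  # Returns the realized object, or nil when the resource failed.
  # Never retries: retrying is the client's job.
  def realize(link)
    cache = (@realized ||= {})
    return cache[link.href] if cache.key?(link.href)

    # Memoize the realized object under its href.
    cache[link.href] = link.realize
  rescue TerminalError
    # Cache nil so this broken resource is skipped on later references.
    cache[link.href] = nil
  rescue NetworkError
    # Leave the href uncached so a later reference can try again.
    nil
  end
end
```

Caching `nil` only for terminal errors is what separates a resource that is broken from one that was merely unreachable at the time.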
The W3C data is now fetched through w3c_api (the REST client), so the old RDF/SPARQL/scraping stack is dead weight. Remove linkeddata, mechanize, rdf, rdf-normalize, shex, csv and sparql — none are referenced anywhere in lib/ or spec/. This drops ~57 transitive gems from the install. rubyzip is no longer a runtime dependency either: the runtime index zip is unpacked by relaton-index, and the only direct use is a test helper reading a fixture zip. Move it to the Gemfile as a test dependency. Full suite green (71 examples, 0 failures). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Reflect the current architecture: RateLimitHandler no longer retries (retries live in w3c_api for 403/connection/timeout and lutaml-hal for 429/5xx); add a Rate limiting & retries section; correct dependency versions and drop the removed RDF/SPARQL/scraping stack. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Re-record the W3C API cassettes against the current w3c_api 0.3.0 stack. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
w3c_api now caches realized objects (thread-safely as of lutaml-hal
0.2.1, required via w3c_api ~> 0.3.2), so RateLimitHandler's own
{ href => object } map was redundant. Drop it: realize now just calls
obj.realize (served by w3c_api's cache) and the handler only remembers
hrefs that failed terminally (renamed `skipped`) to skip a broken
resource. Network errors aren't remembered, so a later reference retries.
Full suite green (71 examples, 0 failures) on lutaml-hal 0.2.1 + w3c_api 0.3.2.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The module no longer rate-limits or retries (that lives in w3c_api / lutaml-hal) — it just makes `realize` fault-tolerant: skip a resource that fails terminally so one bad link doesn't abort the crawl. Rename to match (RateLimitHandler -> SafeRealize, rate_limit_handler.rb -> safe_realize.rb), updating the includes, specs and docs. Also initialize the shared `skipped` map eagerly instead of via a lazy `||=`, so the parallel fetcher's first concurrent access can't race two maps into existence. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
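A minimal sketch of the renamed module with an eagerly initialized `skipped` map might look like this. The names `SafeRealizeSketch`, `FetcherSketch` and `TerminalError` are hypothetical, and where the real code performs the eager initialization is an assumption; the sketch only shows why creating the map before any worker runs avoids the `||=` race.

```ruby
# Hypothetical stand-in for the error raised once upstream retries are exhausted.
class TerminalError < StandardError; end

module SafeRealizeSketch
  # A lazy accessor such as `@skipped ||= {}` is racy: two threads can both
  # see nil on first access and each build its own map.

  # Returns the realized object, or nil when the resource is skipped.
  def realize(link)
    return nil if @skipped.key?(link.href)

    link.realize
  rescue TerminalError
    # Remember only terminal failures; other errors propagate to the caller.
    @skipped[link.href] = true
    nil
  end
end

class FetcherSketch
  include SafeRealizeSketch

  def initialize
    # Eager initialization: the map exists before any worker thread starts.
    @skipped = {}
  end
end
```

Because the including class creates the map in its constructor, every thread started afterwards shares the same hash from its first call.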
Summary
Move relaton-w3c onto the published lutaml-hal/w3c_api stack and shed the obsolete RDF/scraping dependencies.
- ~> 0.1.3). w3c_api 0.3.0 builds its HAL client with `faraday-retry`.
- `RateLimitHandler` no longer retries. Retries now live upstream: w3c_api handles HTTP 403 (the W3C rate-limit signal) + connection/timeout, and lutaml-hal handles 429/5xx. The handler only memoizes realized objects and, on a terminal error, skips the resource (caches `nil`) so one bad link doesn't abort the crawl; network errors are left uncached so a later reference can retry. Specs updated to the no-retry contract.
- `linkeddata`, `mechanize`, `rdf`, `rdf-normalize`, `shex`, `csv`, `sparql` (referenced nowhere now that fetching goes through w3c_api).
- `rubyzip` moves to a test-only Gemfile dep (runtime index zip is handled by relaton-index). Drops ~57 transitive gems.

Verification
Full suite green: 71 examples, 0 failures (98.91% coverage), resolving `w3c_api 0.3.0` and `lutaml-hal 0.2.0` from RubyGems.

🤖 Generated with Claude Code