…r reporting v0 is gone; v1 is now the only supported Kagi API.
API surface:
- Remove v0 constructors (query_search, query_enrich_web, query_enrich_news, query_summarize, query_fastgpt) and the v0-only summarize_with_kagi() helper.
- Rename v1 constructors so the _v1 suffix is no longer needed: query_search_v1() -> kagi_query_search() (class kagi_query_search_v1 -> kagi_query_search); query_extract() -> kagi_query_extract().
- kagi_connection() keeps the api_version argument for forward compatibility but accepts only "v1"; auth is always Bearer.
- kagi_fetch() writes search output to <project>/search/ (was search_v1/).
- kagi_request() gains a `pages` argument (1-10) and threads body$page correctly per iteration so each page is a distinct request.
Robustness:
- kagi_connection() retries on 408/429/500/502/503/504 (was only 429/503), with a custom backoff capped at 10s/attempt.
- perform_request() reads e$resp (httr2's actual field) so HTTP error envelopes propagate full HTTP status + API code + API msg + body.
- write_search_parquet() pre-detects which typed result arrays are present in the JSON before UNNEST, eliminating DuckDB Binder Error noise.
Docs, tests, AI artifacts:
- DESCRIPTION bumped to 0.5.0; NEWS.md, PROJECT_DESIGN.md updated.
- Delete v0 endpoint vignettes and skill packs (user-enrich, user-summarize, user-fastgpt); rewrite quickstart, corpus-workflow, v1-api-and-corpus, agent-quick-index for the v1-only surface.
- Resync llms.txt / llms-full.txt + pkgdown/extra mirrors; trim _pkgdown.yml navbar and article list; update README and CLAUDE.md.
- Replace v0 cassette-driven tests with v1 unit-test suite (test-v1.R).
- Add inst/api_specs/openapi.yaml (Kagi v1 contract) and scripts/diff-against-generated.R.
devtools::check() clean (0 errors / 0 warnings / 0 notes); devtools::test() 29/29 PASS; scripts/check-ai-docs.sh passes.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
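The retry policy described in this commit (retry on 408/429/500/502/503/504, backoff capped at 10 s per attempt) can be sketched in base R as follows. This is a minimal illustration, not the package's code: the helper names `is_transient_status()` and `capped_backoff()`, the 1 s base delay, and the doubling factor are all assumptions; only the status list and the 10 s cap come from the commit message.

```r
# Hypothetical helpers for illustration; not the package's internals.

# Status codes the commit message lists as retryable.
retryable_statuses <- c(408L, 429L, 500L, 502L, 503L, 504L)

is_transient_status <- function(status) {
  status %in% retryable_statuses
}

# Exponential backoff in seconds, capped at 10 s per attempt.
# The 1 s base and the doubling factor are assumptions.
capped_backoff <- function(attempt, base = 1, cap = 10) {
  min(cap, base * 2^(attempt - 1))
}

# Delay before each of the first six retries.
delays <- vapply(1:6, capped_backoff, numeric(1))
print(delays)
```

With httr2, functions of this shape could plausibly be supplied through the `is_transient` and `backoff` arguments of `req_retry()`; how `kagi_connection()` actually wires them up is not shown in this PR text.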
After per-query/per-type Hive partitions are written, `combine = TRUE` union-merges them by column name into a single `<output>/combined.parquet` file via DuckDB (NULL-fills absent columns across search result types) and removes the partition directories. Plumbed through `kagi_fetch()` with default `combine = TRUE`; pass `combine = FALSE` to retain the partitioned layout. Smoke-tested against a real 19-partition / 1211-row search dataset: collapses to one parquet file, preserving `query` and `type` as columns.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
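The union-by-name semantics described above (align columns by name, NULL-fill the ones a partition lacks) can be illustrated with a small base-R stand-in. This is only a sketch of the merge behaviour on in-memory data frames; the package performs the merge in DuckDB on parquet files, and the helper name `union_by_name()` and the sample columns are hypothetical.

```r
# Illustrative base-R stand-in for a union-by-name merge.
# The real merge runs in DuckDB; this only mirrors the semantics.
union_by_name <- function(frames) {
  # Collect every column name seen in any partition, in first-seen order.
  all_cols <- unique(unlist(lapply(frames, names)))
  filled <- lapply(frames, function(df) {
    # Fill columns this partition lacks with NA (the analogue of SQL NULL).
    for (col in setdiff(all_cols, names(df))) {
      df[[col]] <- rep(NA, nrow(df))
    }
    df[, all_cols, drop = FALSE]
  })
  do.call(rbind, filled)
}

# Two hypothetical partitions with different result-type columns.
web <- data.frame(query = "duckdb", type = "web",
                  url = "https://example.com/a",
                  stringsAsFactors = FALSE)
news <- data.frame(query = "duckdb", type = "news",
                   url = "https://example.com/b",
                   published = "2024-01-01",
                   stringsAsFactors = FALSE)

combined <- union_by_name(list(web, news))
print(combined)
```

The `query` and `type` columns survive as ordinary columns in the merged result, which matches the behaviour the smoke test reports.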
R/ dead code:
- Remove unused api_version_for_query_class() from utils.R (zero callers after v0 retirement).
- Drop the unused `connection` argument (and its validation) from markdown_abstract(); it was scaffolded for a Kagi-side summarizer that no longer exists.
- Drop the unused @importFrom utils tail and @importFrom httr2 req_url_query from kagi_request.R; NAMESPACE regenerated.
R/ stale roxygen:
- kagi_request(): @param limit no longer mentions "enrich"; @details pagination text now describes the v1 body-driven `pages` mechanism instead of the dead `meta$next_cursor` cursor.
- kagi_request(): expand @param pages to explain body-paginated semantics.
- kagi_fetch(): @param limit drops "/enrich".
Prose docs:
- CLAUDE.md: rewrite Project + Architecture sections for the v1-only surface: drop v0 wording, list current constructors and their classes, surface `pages` and `combine`, describe the retry/error model accurately. Update class-check note (kagi_query_search, not kagi_query_search_v1). Note vcr plumbing is retained but tests no longer use cassettes.
- PROJECT_DESIGN.md: prune Skills Layer/Skill Mapping to the surviving skills (user-search, user-corpus-workflow + maintainer-*); replace the "(toward 0.4.1)" historical block with a one-line pointer to NEWS.md.
- llms-full.txt: surface the `pages` arg on kagi_request(), and the `combine` arg on kagi_fetch() / kagi_request_parquet(). pkgdown/extra mirror resynced.
Verification:
- devtools::document(): NAMESPACE drops the two stale @importFrom lines.
- devtools::test(): 29/29 pass.
- devtools::check(--no-manual): 0 errors / 0 warnings / 0 notes.
- scripts/check-ai-docs.sh: passes (llms mirrors byte-identical).
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
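The body-driven `pages` mechanism mentioned in the roxygen notes amounts to sending one request per page, each with its own `page` value in the body. A minimal sketch, assuming a hypothetical helper `bodies_for_pages()` and an illustrative body with `query` and `limit` fields (the real request body shape is not shown in this PR text):

```r
# Hypothetical helper: build one request body per page, differing only in `page`.
bodies_for_pages <- function(body, pages = 1L) {
  # The PR states that `pages` accepts 1-10.
  stopifnot(pages >= 1, pages <= 10)
  lapply(seq_len(pages), function(p) utils::modifyList(body, list(page = p)))
}

# Illustrative body; the field names are assumptions.
bodies <- bodies_for_pages(list(query = "duckdb parquet", limit = 10L), pages = 3L)

# Each body carries a distinct page number, so each is a distinct request.
page_numbers <- vapply(bodies, function(b) b$page, integer(1))
print(page_numbers)
```

Copying the base body per iteration, rather than mutating one shared list, is what keeps each page a distinct request.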
open_search_query():
- Rename to kagi_open_search_query() for namespace consistency.
- Accept a single kagi_query_search object OR a list of them (open all).
- Build the URL from the query object: every non-null shaping field (workflow, lens, lens_id, filters, safe_search, page, limit, timeout, format, personalizations, extract) is URL-encoded into the query string alongside `q`. Scalars are URL-encoded; nested lists are JSON-encoded then URL-encoded so the full search intent is visible in the address bar.
kagi_query_search():
- Add `file_type` arg validated against Kagi's "Format" filter whitelist (pdf, ps, csv, epub, kml, kmz, gpx, hwp, htm, html, xls, xlsx, ppt, pptx, doc, docx, odp, ods, odt, rtf, svg, tex, txt, xml). Each value is appended to the query string as `filetype:<ext>`.
- Add `domain` arg: each value is appended as `site:<domain>`.
- Add `where` arg: `"anywhere"` (default), `"title"`, or `"url"`. Wraps the query term in `intitle:"..."` or `inurl:"..."` when not anywhere.
- The shaping transforms apply after `expand`, so per-term operators inherit the operator suffix.
- `open_in_browser = TRUE` now delegates to kagi_open_search_query(result) on the whole list.
Touched: NAMESPACE, R/, man/, vignettes/quickstart.qmd, vignettes/agent-quick-index.qmd, inst/skills/user-search/SKILL.md.
Verification: devtools::document/test/check all clean (0/0/0); scripts/check-ai-docs.sh passes.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
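The query shaping described above (`where` wraps the term, `file_type` and `domain` are appended as operators, and the result is URL-encoded) can be sketched in base R. The function `shape_query()` is a hypothetical re-implementation for illustration; the operator order and the space separator are assumptions, and the whitelist validation is omitted.

```r
# Hypothetical re-implementation for illustration; not the package's code.
shape_query <- function(term, file_type = NULL, domain = NULL,
                        where = c("anywhere", "title", "url")) {
  where <- match.arg(where)
  # Wrap the term only when `where` is not "anywhere".
  q <- switch(where,
    anywhere = term,
    title = sprintf('intitle:"%s"', term),
    url = sprintf('inurl:"%s"', term)
  )
  # Append one operator per value (ordering is an assumption).
  ops <- c(
    if (length(file_type)) paste0("filetype:", file_type),
    if (length(domain)) paste0("site:", domain)
  )
  paste(c(q, ops), collapse = " ")
}

q <- shape_query("hive partitioning", file_type = "pdf",
                 domain = "duckdb.org", where = "title")
print(q)

# Percent-encode the shaped query so it is safe inside a URL query string.
encoded <- utils::URLencode(q, reserved = TRUE)
print(encoded)
```

Encoding with `reserved = TRUE` also escapes `:` and `"`, so the operators survive a round trip through the address bar.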
Restructure tests/testthat:
- One file per exported function:
  - test-kagi_connection.R (11 tests)
  - test-kagi_query_search.R (55 tests: covers file_type / domain / where, every validation path, body shaping, and print method)
  - test-kagi_query_extract.R (23 tests: chunking, HTTPS-only, validation)
  - test-kagi_open_search_query.R (21 tests: single + list dispatch, URL encoding, session_token, error modes; browseURL is mocked)
  - test-utils.R (24 tests: dispatch helpers and key resolution)
  - test-kagi_request_parquet.R (18 tests: search + extract fixtures, combine = TRUE/FALSE, error paths)
  - test-as_corpus_parquet.R (7 tests: happy path, endpoint dir input, missing columns, id_prefix, no-overwrite)
  - test-clean_request.R (5 tests: dry_run, real delete, empty project)
  - test-kagi_request.R (14 tests; 4 cassettes: search, extract, list dispatch, write_dummy fallback)
  - test-kagi_fetch.R (8 tests; 3 cassettes: combined, partitioned, extract)
Old aggregate test-v1.R removed.
- Fixtures under tests/testthat/fixtures/:
  - json_search/query_1/{search_1.json,_query_meta.json} (real Kagi response copied from output/)
  - json_extract/query_1/{extract_1.json,_query_meta.json} (synthetic)
  - cassettes/*.yml (7 vcr cassettes; Authorization header filtered, no real key leaks)
- setup-vcr.R now feeds the keyring-resolved key into vcr's
filter_sensitive_data list, so cassettes recorded with a keyring-only
setup also get scrubbed.
- DESCRIPTION: add `withr` to Suggests (used by test-kagi_connection.R
for envvar isolation).
Default behaviour: if a cassette exists, tests replay it with a
placeholder key (no network, no credentials needed). If a cassette is
missing and a key is available via KAGI_API_KEY or keyring `API_kagi`,
vcr records live. Otherwise the cassette-backed test skips with a clear
message.
Verification: devtools::test 186/186 pass; devtools::check
0/0/1 (only the environmental "future file timestamps" note);
scripts/check-ai-docs.sh passes.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Three documented modes, resolved in helper_kagi.R::cassette_record_mode():
- default -> "once": replay if cassette exists, record only when missing. Routine local re-runs do NOT re-record.
- KAGIPRO_RECORD_CASSETTES=true -> "all": force re-record every run (requires KAGI_API_KEY or keyring "API_kagi").
- KAGIPRO_RECORD_CASSETTES=false -> "none": strict replay; cassette miss errors out (CI default).
- VCR_RECORD_MODE=<mode> -> passthrough (highest precedence).
Helper consolidation in tests/testthat/helper_kagi.R:
- cassette_record_mode(): resolve effective vcr mode
- cassette_will_record(name): true iff the run hits the network
- make_kagi_test_conn(name): real key when recording, placeholder when replaying
- skip_if_cannot_serve_cassette(name): skip with a clear message when a recording is needed but no key is set (or vcr is missing)
Removed duplicated copies of these helpers from test-kagi_request.R and
test-kagi_fetch.R.
setup-vcr.R now derives `record` from cassette_record_mode() so the
config and the per-test logic always agree.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
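The record-mode precedence documented in this commit can be sketched as a small pure function. This is a hypothetical illustration of the documented rules, not the code in helper_kagi.R: the function name `resolve_record_mode()`, its arguments, and the case-insensitive comparison are assumptions.

```r
# Hypothetical sketch of the documented precedence; not the helper's real code.
resolve_record_mode <- function(vcr_record_mode = "", record_cassettes = "") {
  # VCR_RECORD_MODE is passed through unchanged and has the highest precedence.
  if (nzchar(vcr_record_mode)) {
    return(vcr_record_mode)
  }
  # Case-insensitive matching is an assumption.
  flag <- tolower(record_cassettes)
  if (identical(flag, "true")) {
    "all"    # force re-record every run
  } else if (identical(flag, "false")) {
    "none"   # strict replay; a cassette miss is an error
  } else {
    "once"   # default: replay if present, record only when missing
  }
}

# Typical call site: read both settings from the environment.
mode <- resolve_record_mode(
  Sys.getenv("VCR_RECORD_MODE"),
  Sys.getenv("KAGIPRO_RECORD_CASSETTES")
)
print(mode)
```

Keeping the resolution in one function is what lets the vcr configuration and the per-test skip logic agree, which is the stated motivation for deriving `record` from `cassette_record_mode()`.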
R-CMD-check.yaml:
- Push trigger updated from [main, master] to [main, dev] to match this repo's long-lived branches.
- pull_request trigger made explicit ([main, dev]) so PRs into either long-lived branch are tested; PRs into feature branches are not.
- Add workflow_dispatch for manual runs.
- Set KAGIPRO_RECORD_CASSETTES=false in env. CI has no Kagi API key, so cassettes must be replayed strictly; a missing or mismatched cassette now fails the test cleanly instead of trying to record live.
pkgdown.yaml:
- Push (deploy) trigger narrowed to [main]; PRs into [main, dev] still build the site (preview) but do not deploy.
- Add the same KAGIPRO_RECORD_CASSETTES=false guard so vignette rebuilds never touch the API.
Dependency caching was already in place via r-lib/actions/setup-r-dependencies@v2; no additional actions/cache steps needed.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Summary
- Remove the `v0` beta API entirely; `v1` is now the only supported surface.
- Rename v1 constructors to drop the `_v1` suffix.
- Add `pages` (body-paginated search), `combine` (single parquet output), and `file_type` / `domain` / `where` query helpers.
Headline changes (DESCRIPTION → 0.5.0, breaking)
API surface
- Remove v0 constructors (`query_search`, `query_enrich_web`, `query_enrich_news`, `query_summarize`, `query_fastgpt`) and `summarize_with_kagi()`.
- `query_search_v1()` → `kagi_query_search()` (class `kagi_query_search`); `query_extract()` → `kagi_query_extract()`.
- `kagi_connection()` keeps the `api_version` argument for forward compatibility but accepts only `"v1"`; auth is always `Bearer`.
- `kagi_fetch()` writes search output to `<project>/search/` (was `search_v1/`). Existing project folders need to be rerun.
- `kagi_request()` gains `pages` (1-10) and threads `body$page` correctly per iteration.
- `kagi_request_parquet()` / `kagi_fetch()` gain `combine`: collapse Hive partitions into a single `combined.parquet`.
- `kagi_query_search()` gains `file_type` (validated against Kagi's "Format" filter whitelist), `domain` (`site:` operator), and `where` (`intitle:` / `inurl:` / anywhere).
- `open_search_query()` → `kagi_open_search_query()`; accepts a single object or a list of `kagi_query_search` objects; URL-encodes every shaping field into `q=` + parameters.
Robustness
- `kagi_connection()` retries on `408/429/500/502/503/504` with a custom backoff capped at 10 s/attempt.
- `kagi_request()` reads `e$resp` (httr2's actual condition field) so HTTP error envelopes propagate full HTTP status + Kagi `error[].code` / `error[].msg` + raw body.
- `kagi_request_parquet()` pre-detects which typed result arrays are present before UNNEST, eliminating DuckDB "Could not find key" Binder Errors.
Tests
- Split `test-v1.R` into one file per exported function: 10 files, 186 tests.
- `kagi_open_search_query` covered end-to-end with a mocked `browseURL`.
- `kagi_request` and `kagi_fetch`: 7 cassettes recorded against the live API, Authorization header filtered out (leak-checked).
- `helper_kagi.R::cassette_record_mode()`: default `"once"` (replay), `KAGIPRO_RECORD_CASSETTES=true` re-records all, `KAGIPRO_RECORD_CASSETTES=false` is strict replay. `VCR_RECORD_MODE` takes precedence.
CI
- `R-CMD-check.yaml`: push trigger → `[main, dev]`; explicit PR trigger → `[main, dev]`; env `KAGIPRO_RECORD_CASSETTES=false` so missing/mismatched cassettes fail cleanly; `workflow_dispatch` added.
- `pkgdown.yaml`: push (deploy) → `[main]` only; PR (build, no deploy) → `[main, dev]`; same strict-replay env.
- Dependency caching already in place via `r-lib/actions/setup-r-dependencies@v2`.
Docs
- Delete v0 endpoint vignettes (`search/enrich/summarize/fastgpt-endpoint.qmd`).
- Rewrite `quickstart`, `corpus-workflow`, `v1-api-and-corpus`, `agent-quick-index`.
- `CLAUDE.md`, `PROJECT_DESIGN.md`, `README.md`, `NEWS.md`, `llms.txt` / `llms-full.txt` (+ pkgdown mirrors), `_pkgdown.yml`, `inst/skills/` all aligned with the v1-only surface.
Test plan
- `devtools::test()`: 186 / 186 PASS locally
- `devtools::check(args = '--no-manual')`: 0 errors / 0 warnings / 0 notes
- `bash scripts/check-ai-docs.sh`: passes
- `Rscript -e 'devtools::test()'` works without any `KAGI_API_KEY` set
🤖 Generated with Claude Code