From 9c52e1b1941c9660283b440f81829ba26483970d Mon Sep 17 00:00:00 2001 From: Abe Gong Date: Sun, 28 Jun 2026 11:45:50 -0600 Subject: [PATCH 1/2] Move config reference into configs section --- docs/content/deep-dives/domain-model/base.md | 4 ++-- .../content/deep-dives/domain-model/checks.md | 2 +- .../deep-dives/domain-model/collections.md | 4 ++-- docs/content/getting-started.md | 2 +- docs/content/how-to/add-a-schema.md | 4 ++-- docs/content/how-to/configure-rules.md | 4 ++-- docs/content/how-to/validate-in-ci.md | 2 +- docs/content/reference/cli.md | 4 ++-- .../{configuration.md => configs/_index.md} | 24 ++++++++++--------- .../reference/data-surfaces/plain-text.md | 2 +- .../data-surfaces/structured-object.md | 2 +- 11 files changed, 28 insertions(+), 26 deletions(-) rename docs/content/reference/{configuration.md => configs/_index.md} (93%) diff --git a/docs/content/deep-dives/domain-model/base.md b/docs/content/deep-dives/domain-model/base.md index 92ba87d..ec6558d 100644 --- a/docs/content/deep-dives/domain-model/base.md +++ b/docs/content/deep-dives/domain-model/base.md @@ -70,7 +70,7 @@ The same evidence can power a future `doctor` / `explain` command: list the coll ## Variants route checks, not membership A collection may run different checks on different items via -[variants]({{< relref "../../reference/configuration.md" >}}#variants), but that is +[variants]({{< relref "../../reference/configs/_index.md" >}}#variants), but that is a *check-engine* concern, not a base one. A variant's discriminator is a predicate over an item's **metadata**: portable across every base type, since each yields a metadata map (frontmatter for a file, columns for a row). It never @@ -125,6 +125,6 @@ the definition's pattern are two views of the same thing. - [Domain model]({{< relref "_index.md" >}}) for the cross-subsystem entity map. - [Collections]({{< relref "collections.md" >}}) for the collection and item hierarchy bases expose. 
-- [Configuration]({{< relref "../../reference/configuration.md" >}}) for the +- [Configs]({{< relref "../../reference/configs/_index.md" >}}) for the precise `.katalyst/bases/` surface. - `go doc ./internal/storage` for the code-level base contracts. diff --git a/docs/content/deep-dives/domain-model/checks.md b/docs/content/deep-dives/domain-model/checks.md index bdcbb31..bbc8664 100644 --- a/docs/content/deep-dives/domain-model/checks.md +++ b/docs/content/deep-dives/domain-model/checks.md @@ -89,7 +89,7 @@ item, the simplest correct path. Per item, the engine resolves which checks apply, then runs them. Resolution starts from the collection's configured checks and adds the checks of -the first [variant]({{< relref "../../reference/configuration.md" >}}) whose `when` +the first [variant]({{< relref "../../reference/configs/_index.md" >}}) whose `when` predicates the item's metadata satisfies. The object schema is selected by a precedence the JSON Schema library owns (a forced `--schema`, then an inline `schema:` directive, then the collection's object checks); see the [domain diff --git a/docs/content/deep-dives/domain-model/collections.md b/docs/content/deep-dives/domain-model/collections.md index 3cb3113..b19438e 100644 --- a/docs/content/deep-dives/domain-model/collections.md +++ b/docs/content/deep-dives/domain-model/collections.md @@ -13,7 +13,7 @@ registry validates a declared `type`, and a collection parses its own block in `check` lifecycle is driven from here. This page is the model and the *why*; for the key-by-key surface see the -[configuration reference]({{< relref "../../reference/configuration.md" >}}). +[configs reference]({{< relref "../../reference/configs/_index.md" >}}). Collections are declared *inside* a [base]({{< relref "base.md" >}}), which owns the base-to-collection mapping. This page covers the collection model and @@ -120,7 +120,7 @@ until real usage shows the need. 
The base page frames the same decision as ## See also -- The [configuration reference]({{< relref "../../reference/configuration.md" >}}) +- The [configs reference]({{< relref "../../reference/configs/_index.md" >}}) for the precise `.katalyst/` surface. - The [base]({{< relref "base.md" >}}) for how a backend source maps onto collections, and the base model. diff --git a/docs/content/getting-started.md b/docs/content/getting-started.md index da618b2..fe011a3 100644 --- a/docs/content/getting-started.md +++ b/docs/content/getting-started.md @@ -67,5 +67,5 @@ declare a collection inside `.katalyst/bases/local.yaml`, then run `katalyst check`. Next: -- [Configuration]({{< relref "reference/configuration.md" >}}) +- [Configs]({{< relref "reference/configs/_index.md" >}}) - [Check types reference]({{< relref "reference/check-types/_index.md" >}}) diff --git a/docs/content/how-to/add-a-schema.md b/docs/content/how-to/add-a-schema.md index 20258bb..348b119 100644 --- a/docs/content/how-to/add-a-schema.md +++ b/docs/content/how-to/add-a-schema.md @@ -86,8 +86,8 @@ katalyst check books --schema ./schemas/strict-book.json ``` The precedence is `--schema` > inline `schema:` key > the collection's object -check. See the [configuration -reference]({{< relref "../reference/configuration.md" >}}) for the key surface, +check. See the [configs +reference]({{< relref "../reference/configs/_index.md" >}}) for the key surface, or [Collections]({{< relref "../deep-dives/domain-model/collections.md" >}}) for why. ## See also diff --git a/docs/content/how-to/configure-rules.md b/docs/content/how-to/configure-rules.md index 004195a..12f67a6 100644 --- a/docs/content/how-to/configure-rules.md +++ b/docs/content/how-to/configure-rules.md @@ -136,5 +136,5 @@ supported yet. - [Add a schema]({{< relref "add-a-schema.md" >}}) to validate frontmatter shape. 
-- [Configuration reference]({{< relref "../reference/configuration.md" >}}) - for every key, including [`variants`]({{< relref "../reference/configuration.md" >}}#variants). +- [Configs reference]({{< relref "../reference/configs/_index.md" >}}) + for every key, including [`variants`]({{< relref "../reference/configs/_index.md" >}}#variants). diff --git a/docs/content/how-to/validate-in-ci.md b/docs/content/how-to/validate-in-ci.md index 979b749..dfafc56 100644 --- a/docs/content/how-to/validate-in-ci.md +++ b/docs/content/how-to/validate-in-ci.md @@ -64,4 +64,4 @@ step enforces canonical frontmatter without modifying files. See ## See also - [Configure checks for a collection]({{< relref "configure-rules.md" >}}) -- [Configuration reference]({{< relref "../reference/configuration.md" >}}) +- [Configs reference]({{< relref "../reference/configs/_index.md" >}}) diff --git a/docs/content/reference/cli.md b/docs/content/reference/cli.md index 8c8e026..9ee7193 100644 --- a/docs/content/reference/cli.md +++ b/docs/content/reference/cli.md @@ -71,7 +71,7 @@ Shared across the validating commands (`check`, `fix --check`): ## Filter predicates The `--filter` flag of `katalyst item list` and the `when:` clause of a -[collection variant]({{< relref "configuration.md#variants" >}}) share one +[collection variant]({{< relref "configs/_index.md#variants" >}}) share one predicate language, so it is documented here once. 
A predicate is `field OP value`; the operator is the first one found scanning left to right: @@ -97,6 +97,6 @@ the listing pipeline (`--grep`, `--sort`, `--skip`/`--limit`) is documented in ## See also -- [Configuration reference]({{< relref "configuration.md" >}}) +- [Configs reference]({{< relref "configs/_index.md" >}}) - [Check types reference]({{< relref "check-types/_index.md" >}}) - [Inspectors reference]({{< relref "inspectors/_index.md" >}}) diff --git a/docs/content/reference/configuration.md b/docs/content/reference/configs/_index.md similarity index 93% rename from docs/content/reference/configuration.md rename to docs/content/reference/configs/_index.md index 842b3bd..0e5c5c1 100644 --- a/docs/content/reference/configuration.md +++ b/docs/content/reference/configs/_index.md @@ -1,9 +1,11 @@ +++ -title = "Configuration" -weight = 10 +title = "Configs" +weight = 30 +bookCollapseSection = true +aliases = ["/reference/configuration/"] +++ -# Configuration +# Configs Katalyst reads a `.katalyst/` directory, found by walking upward from the current working directory to the nearest ancestor that contains one. That @@ -13,9 +15,9 @@ macOS `$TMPDIR` lives behind `/var` to `/private/var` and relative-path resolution would otherwise produce garbage. For *why* the config is shaped this way, see [How collections -work]({{< relref "../deep-dives/domain-model/collections.md" >}}). To set one up step by +work]({{< relref "../../deep-dives/domain-model/collections.md" >}}). To set one up step by step, see [Configure checks for a -collection]({{< relref "../how-to/configure-rules.md" >}}). +collection]({{< relref "../../how-to/configure-rules.md" >}}). ## Layout @@ -165,7 +167,7 @@ schema: book ## `checks` Each entry has a `kind` and the keys that check type requires. 
Every check -type is documented one per page in the [check types reference]({{< relref "check-types/_index.md" >}}): +type is documented one per page in the [check types reference]({{< relref "../check-types/_index.md" >}}): ```yaml checks: @@ -227,7 +229,7 @@ pages: useExhaustiveVariants: false # default ``` -**`when`** is a list of [`item list --filter`]({{< relref "cli.md#filter-predicates" >}}) +**`when`** is a list of [`item list --filter`]({{< relref "../cli.md#filter-predicates" >}}) predicates (`field=value`, `field>=n`, `field=~regex`, `!field`, ...), evaluated against the item's frontmatter. All entries must hold (AND). Three shapes are accepted, the first two desugaring to the third: @@ -252,7 +254,7 @@ failure (`matches no variant`), so every item is provably accounted for. Discrimination is by metadata only; selecting items by path or filename is not supported yet (a page type distinguishable only by location needs a frontmatter marker). `pattern` still governs collection **membership** and which files are -reported as [unmatched]({{< relref "../deep-dives/domain-model/_index.md" >}}#invariants); +reported as [unmatched]({{< relref "../../deep-dives/domain-model/_index.md" >}}#invariants); variants only route checks. ## `listing` @@ -302,9 +304,9 @@ variant), even when `--schema` is used. ## See also -- [Check types reference]({{< relref "check-types/_index.md" >}}), every check type. -- [Bases]({{< relref "../deep-dives/domain-model/base.md" >}}), the base / +- [Check types reference]({{< relref "../check-types/_index.md" >}}), every check type. +- [Bases]({{< relref "../../deep-dives/domain-model/base.md" >}}), the base / collection-mapping model and its lineage. -- [Collections]({{< relref "../deep-dives/domain-model/collections.md" >}}), the +- [Collections]({{< relref "../../deep-dives/domain-model/collections.md" >}}), the config/collection model and rationale: schema resolution, variants, unmatched-as-error. 
diff --git a/docs/content/reference/data-surfaces/plain-text.md b/docs/content/reference/data-surfaces/plain-text.md index 7bab502..1dc86f5 100644 --- a/docs/content/reference/data-surfaces/plain-text.md +++ b/docs/content/reference/data-surfaces/plain-text.md @@ -44,5 +44,5 @@ literal strings." - [Plain text check types]({{< relref "../check-types/plain-text/_index.md" >}}) - [Markdown body text]({{< relref "markdown-body-text.md" >}}) -- [Configuration]({{< relref "../configuration.md#text-rules" >}}) for text +- [Configs]({{< relref "../configs/_index.md#text-rules" >}}) for text rule configuration. diff --git a/docs/content/reference/data-surfaces/structured-object.md b/docs/content/reference/data-surfaces/structured-object.md index 6828db3..8d6bfae 100644 --- a/docs/content/reference/data-surfaces/structured-object.md +++ b/docs/content/reference/data-surfaces/structured-object.md @@ -47,5 +47,5 @@ the item and is removed before the item is validated against that schema. - [Structured object check types]({{< relref "../check-types/structured-object/_index.md" >}}) - [Markdown body text]({{< relref "markdown-body-text.md" >}}) -- [Configuration]({{< relref "../configuration.md#object-schema-resolution-precedence" >}}) +- [Configs]({{< relref "../configs/_index.md#object-schema-resolution-precedence" >}}) - [Collections]({{< relref "../../deep-dives/domain-model/collections.md" >}}) From ccbfa49e9bcb0b2008cf3b0581905d81901399b6 Mon Sep 17 00:00:00 2001 From: Abe Gong Date: Sun, 28 Jun 2026 11:57:37 -0600 Subject: [PATCH 2/2] Split configs reference by concept --- docs/content/deep-dives/domain-model/base.md | 2 +- .../content/deep-dives/domain-model/checks.md | 2 +- docs/content/how-to/configure-rules.md | 2 +- docs/content/reference/cli.md | 2 +- docs/content/reference/configs/_index.md | 283 +----------------- docs/content/reference/configs/bases.md | 58 ++++ docs/content/reference/configs/checks.md | 59 ++++ docs/content/reference/configs/collections.md | 57 
++++ docs/content/reference/configs/discovery.md | 42 +++ docs/content/reference/configs/listing.md | 36 +++ docs/content/reference/configs/schemas.md | 16 + docs/content/reference/configs/variants.md | 58 ++++ .../reference/data-surfaces/plain-text.md | 2 +- .../data-surfaces/structured-object.md | 2 +- 14 files changed, 342 insertions(+), 279 deletions(-) create mode 100644 docs/content/reference/configs/bases.md create mode 100644 docs/content/reference/configs/checks.md create mode 100644 docs/content/reference/configs/collections.md create mode 100644 docs/content/reference/configs/discovery.md create mode 100644 docs/content/reference/configs/listing.md create mode 100644 docs/content/reference/configs/schemas.md create mode 100644 docs/content/reference/configs/variants.md diff --git a/docs/content/deep-dives/domain-model/base.md b/docs/content/deep-dives/domain-model/base.md index ec6558d..f69418f 100644 --- a/docs/content/deep-dives/domain-model/base.md +++ b/docs/content/deep-dives/domain-model/base.md @@ -70,7 +70,7 @@ The same evidence can power a future `doctor` / `explain` command: list the coll ## Variants route checks, not membership A collection may run different checks on different items via -[variants]({{< relref "../../reference/configs/_index.md" >}}#variants), but that is +[variants]({{< relref "../../reference/configs/variants.md" >}}), but that is a *check-engine* concern, not a base one. A variant's discriminator is a predicate over an item's **metadata**: portable across every base type, since each yields a metadata map (frontmatter for a file, columns for a row). It never diff --git a/docs/content/deep-dives/domain-model/checks.md b/docs/content/deep-dives/domain-model/checks.md index bbc8664..9ca4733 100644 --- a/docs/content/deep-dives/domain-model/checks.md +++ b/docs/content/deep-dives/domain-model/checks.md @@ -89,7 +89,7 @@ item, the simplest correct path. Per item, the engine resolves which checks apply, then runs them. 
Resolution starts from the collection's configured checks and adds the checks of -the first [variant]({{< relref "../../reference/configs/_index.md" >}}) whose `when` +the first [variant]({{< relref "../../reference/configs/variants.md" >}}) whose `when` predicates the item's metadata satisfies. The object schema is selected by a precedence the JSON Schema library owns (a forced `--schema`, then an inline `schema:` directive, then the collection's object checks); see the [domain diff --git a/docs/content/how-to/configure-rules.md b/docs/content/how-to/configure-rules.md index 12f67a6..30f9499 100644 --- a/docs/content/how-to/configure-rules.md +++ b/docs/content/how-to/configure-rules.md @@ -137,4 +137,4 @@ supported yet. - [Add a schema]({{< relref "add-a-schema.md" >}}) to validate frontmatter shape. - [Configs reference]({{< relref "../reference/configs/_index.md" >}}) - for every key, including [`variants`]({{< relref "../reference/configs/_index.md" >}}#variants). + for every key, including [`variants`]({{< relref "../reference/configs/variants.md" >}}). diff --git a/docs/content/reference/cli.md b/docs/content/reference/cli.md index 9ee7193..104bd4c 100644 --- a/docs/content/reference/cli.md +++ b/docs/content/reference/cli.md @@ -71,7 +71,7 @@ Shared across the validating commands (`check`, `fix --check`): ## Filter predicates The `--filter` flag of `katalyst item list` and the `when:` clause of a -[collection variant]({{< relref "configs/_index.md#variants" >}}) share one +[collection variant]({{< relref "configs/variants.md" >}}) share one predicate language, so it is documented here once. 
A predicate is `field OP value`; the operator is the first one found scanning left to right: diff --git a/docs/content/reference/configs/_index.md b/docs/content/reference/configs/_index.md index 0e5c5c1..dfbcaf4 100644 --- a/docs/content/reference/configs/_index.md +++ b/docs/content/reference/configs/_index.md @@ -10,9 +10,16 @@ aliases = ["/reference/configuration/"] Katalyst reads a `.katalyst/` directory, found by walking upward from the current working directory to the nearest ancestor that contains one. That ancestor is the repo root; all relative paths resolve against it. -Discovery resolves symlinks on both the root and the input path, because on -macOS `$TMPDIR` lives behind `/var` to `/private/var` and relative-path -resolution would otherwise produce garbage. + +The config reference is organized by concept: + +- [Discovery]({{< relref "discovery.md" >}}): root discovery, symlinks, file layout, formats, and legacy storage paths. +- [Schemas]({{< relref "schemas.md" >}}): schema file discovery, naming, and schema handles. +- [Bases]({{< relref "bases.md" >}}): backend stores and their top-level config keys. +- [Collections]({{< relref "collections.md" >}}): collection membership, identity, paths, and per-collection files. +- [Checks]({{< relref "checks.md" >}}): `schema:` shorthand, `checks:` entries, text rules, and object-schema precedence. +- [Variants]({{< relref "variants.md" >}}): conditional check routing with `when`. +- [Listing]({{< relref "listing.md" >}}): default behavior for `katalyst item list`. For *why* the config is shaped this way, see [How collections work]({{< relref "../../deep-dives/domain-model/collections.md" >}}). To set one up step by @@ -32,276 +39,6 @@ collection]({{< relref "../../how-to/configure-rules.md" >}}). 
books.yaml ``` -By default, schemas and bases are discovered by **convention**: -every file under `schemas/` is a schema whose name is its filename stem -(`book.json` → `book`), and every file under `bases/` is a -[base](#bases) named for its filename stem (`local.yaml` → `local`). -`config.yaml` is optional; it carries `listing:` -defaults and can switch a kind to **explicit** discovery, listing definitions -inline instead of as files. - -`config.yaml` is YAML; schema and base files default to YAML/JSON, and the -accepted format is set per kind there. - -Legacy projects that still use `storage:` in `config.yaml` or -`.katalyst/storage/` continue to load. Do not mix legacy and new forms in the -same project; move legacy base files to `.katalyst/bases/` when you edit them. - -## Schemas - -Each file under `.katalyst/schemas/` is a JSON Schema. Its **name**, the -filename stem, is the stable public handle used by `schema get `, by -an inline `schema: ` key in a document's frontmatter, and by a -collection's `schema:` shorthand. The path can move; the name should not. - -Schemas are stored flat; the check library that compiles a schema is determined -by the referencing check type's `kind` (the `object` check uses JSON Schema). - -## Bases - -A **base** is one configured backend store, plus the collections it maps onto -the domain model. Each file under -`.katalyst/bases/` is one base, named for its filename stem. There is no -implicit base; `katalyst init` writes a default `local` one. - -| Key | Required | Default | Meaning | -|---|---|---|---| -| `type` | no | `filesystem` | Backend kind: `filesystem` or `sqlite`. | -| `root` | no | `.` | Base root directory, relative to the repo root. Collection paths resolve against it. | -| `path` | for `sqlite` | - | SQLite database path, relative to the repo root. Alias for `root` on SQLite bases. | -| `collections` | no | - | Map of collection name → definition (see below). 
| - -```yaml -# .katalyst/bases/local.yaml -type: filesystem -root: . -collections: - books: - path: notes/books - schema: book - checks: - - kind: markdown_title_matches_h1 -``` - -Collection names are unique across the whole project (selectors are -`/`, with no base qualifier). - -SQLite instances use one table per collection. Each row is one item: - -```yaml -# .katalyst/bases/db.yaml -type: sqlite -path: content.sqlite -collections: - books: - table: books - id: slug - attributes: - title: title - status: status - author: - columns: - first: author_first - last: author_last - content: - kind: markdown - column: body - checks: - - kind: object_required_field - field: title -``` - -## Collections - -A **collection** is a directory of items plus the checks every item must pass. -Collections are declared inside their base, under `collections:`. - -| Key | Required | Default | Meaning | -|---|---|---|---| -| `path` | no | the collection name | Directory, relative to the base `root`. | -| `pattern` | no | `*.md` | Filename glob selecting items in the directory. | -| `table` | for `sqlite` | - | SQLite table backing the collection. | -| `id` | for `sqlite` | - | SQLite column that provides item identity. | -| `attributes` | no | all scalar columns except `id` and `content.column` | SQLite column captures exposed as item attributes. | -| `content` | no | - | Optional SQLite content mapping, with `kind: text` or `kind: markdown` and `column: `. | -| `schema` | no | - | Schema name; shorthand for a leading `object` check. | -| `checks` | no | - | List of checks (see below). | -| `listing` | no | - | `item list` listing defaults for this collection (see [`listing`](#listing)). | - -A collection must configure at least one check: set `schema`, or provide a -non-empty `checks` list, or both. Files in the directory that do not match -`pattern` are reported as errors. - -SQLite collections do not support filesystem check types. 
Configure -structured-object checks against captured attributes. Text and markdown -body-text checks require a compatible `content` mapping. - -`attributes` accepts shorthand single-column captures and structured -multi-column captures: - -```yaml -attributes: - title: title - author: - columns: - first: author_first - last: author_last -``` - -The structured form above exposes `author.first` and `author.last` as fields -inside the `author` attribute object. - -### Per-collection files - -A base whose `collections:` block grows unwieldy may split collections into -one file each under `.katalyst/bases//.yaml`, named for -its filename stem. Inline and per-file collections coexist; a name declared both -inline and in a file is an error. - -```yaml -# .katalyst/bases/local/books.yaml -path: notes/books -schema: book -``` - -## `checks` - -Each entry has a `kind` and the keys that check type requires. Every check -type is documented one per page in the [check types reference]({{< relref "../check-types/_index.md" >}}): - -```yaml -checks: - - kind: object - schema: book - - kind: object_field_type - field: year - type: integer - - kind: markdown_title_matches_h1 - - kind: filesystem_name_matches_field -``` - -### Text rules - -The `text_*` check types lint the item **body** as raw text, independent of -markdown structure, and also apply to plain-text items (a `.txt` file or a -markdown file with no frontmatter). Each is evaluated against a set of **spans** -chosen by `target`: - -| `target` | Spans | -|---|---| -| `body` (default) | the entire body as one multiline string | -| `line` | each body line | -| `first-line` | the first non-blank body line | -| `matched-lines` | each body line matching `select: ` | - -- `text_requires` and `text_forbids` take a Go `pattern`, matched **unanchored** - (it must appear *somewhere* in a span: unlike `filesystem_name_regex`, which - anchors with `^…$`). 
`text_requires` also takes `match: any` (default, at - least one span matches) or `match: all` (every span must match). -- `text_denylist` takes `values:`, a list of literal substrings; regex - metacharacters are inert. -- `text_forbids` may declare a `fix:`: a replacement template (`$1`, `${name}` - capture syntax) applied to the matched text by `katalyst fix`. The fix - re-checks its own work and fails rather than writing a file the rule would - still reject. `text_requires` and `text_denylist` are report-only. - -## `variants` - -A collection runs its base `schema`/`checks` against every item. **Variants** -let it run *extra* checks on a subset, chosen by the item's metadata. Each -entry in a collection's `variants:` list has a `when` discriminator and its own -`schema`/`checks`: - -```yaml -pages: - path: docs/content - pattern: "**/*.md" - schema: page # base: every page needs a title - variants: - - when: "bookCollapseSection" # section landing pages have this flag - schema: section_index - - when: "!bookCollapseSection" # every other page is a content page - schema: content_page - checks: - - kind: object_required_field - field: weight - - kind: markdown_requires_h1 - useExhaustiveVariants: false # default -``` - -**`when`** is a list of [`item list --filter`]({{< relref "../cli.md#filter-predicates" >}}) -predicates (`field=value`, `field>=n`, `field=~regex`, `!field`, ...), evaluated -against the item's frontmatter. All entries must hold (AND). Three shapes are -accepted, the first two desugaring to the third: - -```yaml -when: "kind=section" # one predicate -when: ["kind=section", "w>1"] # a list of predicates -when: { where: ["kind=section"] } -``` - -**Resolution.** An item runs the base checks plus the checks of the **first** -variant (in list order) whose `when` it satisfies, at most one variant -applies. 
A variant *adds* to the base, so a check belongs in a variant exactly -when some page type must skip it: in the example, `weight` and the H1 -requirement apply to content pages but not section indexes. A variant may -declare no checks at all (a deliberate exemption). - -An item that matches **no** variant runs the base checks alone. Set -**`useExhaustiveVariants: true`** to instead make an unmatched item a check -failure (`matches no variant`), so every item is provably accounted for. - -Discrimination is by metadata only; selecting items by path or filename is not -supported yet (a page type distinguishable only by location needs a frontmatter -marker). `pattern` still governs collection **membership** and which files are -reported as [unmatched]({{< relref "../../deep-dives/domain-model/_index.md" >}}#invariants); -variants only route checks. - -## `listing` - -Two `item list` behaviors have configurable defaults. A `listing:` block sets -them project-wide in `.katalyst/config.yaml`, and a collection's file can -override either key for that collection. - -| Key | Values | Default | Meaning | -|---|---|---|---| -| `filterTypeMismatch` | `skip` · `error` | `skip` | A `--filter` comparison against an incompatible type either skips the item or exits 2. | -| `sortMissing` | `last` · `lowest` | `last` | Where items lacking the `--sort` key land: at the end (both directions), or below any present value. | - -```yaml -# .katalyst/config.yaml — project default -listing: - filterTypeMismatch: skip - sortMissing: last -``` - -```yaml -# under a base's collections: override for one collection -books: - path: notes/books - schema: book - listing: - filterTypeMismatch: error -``` - -Resolution is highest-precedence first: the `--on-type-mismatch` / -`--sort-missing` flags, then the collection's `listing:`, then the project -`listing:`, then the built-in default. An unset key falls through to the next -level. 
- -## Object-schema resolution precedence - -When an item is checked against an object schema, the schema is chosen -highest-precedence first: - -1. `--schema ` flag (applies to every selected item). -2. Inline `schema: ` key in the item's frontmatter. -3. The collection's `object` check (from `schema:` or an explicit entry), plus - the matched [variant](#variants)'s schema: both apply, additively. - -Markdown and filesystem checks always come from the collection (and the matched -variant), even when `--schema` is used. - ## See also - [Check types reference]({{< relref "../check-types/_index.md" >}}), every check type. diff --git a/docs/content/reference/configs/bases.md b/docs/content/reference/configs/bases.md new file mode 100644 index 0000000..1dcbf40 --- /dev/null +++ b/docs/content/reference/configs/bases.md @@ -0,0 +1,58 @@ ++++ +title = "Bases" +weight = 30 ++++ + +# Bases + +A **base** is one configured backend store, plus the collections it maps onto +the domain model. Each file under `.katalyst/bases/` is one base, named for its +filename stem. There is no implicit base; `katalyst init` writes a default +`local` one. + +| Key | Required | Default | Meaning | +|---|---|---|---| +| `type` | no | `filesystem` | Backend kind: `filesystem` or `sqlite`. | +| `root` | no | `.` | Base root directory, relative to the repo root. Collection paths resolve against it. | +| `path` | for `sqlite` | - | SQLite database path, relative to the repo root. Alias for `root` on SQLite bases. | +| `collections` | no | - | Map of collection name -> definition. See [Collections]({{< relref "collections.md" >}}). | + +```yaml +# .katalyst/bases/local.yaml +type: filesystem +root: . +collections: + books: + path: notes/books + schema: book + checks: + - kind: markdown_title_matches_h1 +``` + +Collection names are unique across the whole project (selectors are +`/`, with no base qualifier). + +SQLite bases use one table per collection. 
Each row is one item: + +```yaml +# .katalyst/bases/db.yaml +type: sqlite +path: content.sqlite +collections: + books: + table: books + id: slug + attributes: + title: title + status: status + author: + columns: + first: author_first + last: author_last + content: + kind: markdown + column: body + checks: + - kind: object_required_field + field: title +``` diff --git a/docs/content/reference/configs/checks.md b/docs/content/reference/configs/checks.md new file mode 100644 index 0000000..a788c60 --- /dev/null +++ b/docs/content/reference/configs/checks.md @@ -0,0 +1,59 @@ ++++ +title = "Checks" +weight = 50 ++++ + +# Checks + +Each entry in a collection's `checks:` list has a `kind` and the keys that +check type requires. Every check type is documented one per page in the +[check types reference]({{< relref "../check-types/_index.md" >}}): + +```yaml +checks: + - kind: object + schema: book + - kind: object_field_type + field: year + type: integer + - kind: markdown_title_matches_h1 + - kind: filesystem_name_matches_field +``` + +## Text rules + +The `text_*` check types lint the item **body** as raw text, independent of +markdown structure, and also apply to plain-text items (a `.txt` file or a +markdown file with no frontmatter). Each is evaluated against a set of **spans** +chosen by `target`: + +| `target` | Spans | +|---|---| +| `body` (default) | the entire body as one multiline string | +| `line` | each body line | +| `first-line` | the first non-blank body line | +| `matched-lines` | each body line matching `select: ` | + +- `text_requires` and `text_forbids` take a Go `pattern`, matched **unanchored** + (it must appear *somewhere* in a span: unlike `filesystem_name_regex`, which + anchors with `^...$`). `text_requires` also takes `match: any` (default, at + least one span matches) or `match: all` (every span must match). +- `text_denylist` takes `values:`, a list of literal substrings; regex + metacharacters are inert. 
+- `text_forbids` may declare a `fix:`: a replacement template (`$1`, `${name}` + capture syntax) applied to the matched text by `katalyst fix`. The fix + re-checks its own work and fails rather than writing a file the rule would + still reject. `text_requires` and `text_denylist` are report-only. + +## Object-schema resolution precedence + +When an item is checked against an object schema, the schema is chosen +highest-precedence first: + +1. `--schema ` flag (applies to every selected item). +2. Inline `schema: ` key in the item's frontmatter. +3. The collection's `object` check (from `schema:` or an explicit entry), plus + the matched [variant]({{< relref "variants.md" >}})'s schema: both apply, additively. + +Markdown and filesystem checks always come from the collection (and the matched +variant), even when `--schema` is used. diff --git a/docs/content/reference/configs/collections.md b/docs/content/reference/configs/collections.md new file mode 100644 index 0000000..d68b38b --- /dev/null +++ b/docs/content/reference/configs/collections.md @@ -0,0 +1,57 @@ ++++ +title = "Collections" +weight = 40 ++++ + +# Collections + +A **collection** is a directory of items plus the checks every item must pass. +Collections are declared inside their base, under `collections:`. + +| Key | Required | Default | Meaning | +|---|---|---|---| +| `path` | no | the collection name | Directory, relative to the base `root`. | +| `pattern` | no | `*.md` | Filename glob selecting items in the directory. | +| `table` | for `sqlite` | - | SQLite table backing the collection. | +| `id` | for `sqlite` | - | SQLite column that provides item identity. | +| `attributes` | no | all scalar columns except `id` and `content.column` | SQLite column captures exposed as item attributes. | +| `content` | no | - | Optional SQLite content mapping, with `kind: text` or `kind: markdown` and `column: `. | +| `schema` | no | - | Schema name; shorthand for a leading `object` check. 
| +| `checks` | no | - | List of checks. See [Checks]({{< relref "checks.md" >}}). | +| `listing` | no | - | `item list` listing defaults for this collection. See [Listing]({{< relref "listing.md" >}}). | + +A collection must configure at least one check: set `schema`, or provide a +non-empty `checks` list, or both. Files in the directory that do not match +`pattern` are reported as errors. + +SQLite collections do not support filesystem check types. Configure +structured-object checks against captured attributes. Text and markdown +body-text checks require a compatible `content` mapping. + +`attributes` accepts shorthand single-column captures and structured +multi-column captures: + +```yaml +attributes: + title: title + author: + columns: + first: author_first + last: author_last +``` + +The structured form above exposes `author.first` and `author.last` as fields +inside the `author` attribute object. + +## Per-collection files + +A base whose `collections:` block grows unwieldy may split collections into +one file each under `.katalyst/bases//.yaml`, named for +its filename stem. Inline and per-file collections coexist; a name declared both +inline and in a file is an error. + +```yaml +# .katalyst/bases/local/books.yaml +path: notes/books +schema: book +``` diff --git a/docs/content/reference/configs/discovery.md b/docs/content/reference/configs/discovery.md new file mode 100644 index 0000000..9cb2eb0 --- /dev/null +++ b/docs/content/reference/configs/discovery.md @@ -0,0 +1,42 @@ ++++ +title = "Discovery" +weight = 10 ++++ + +# Discovery + +Katalyst reads a `.katalyst/` directory, found by walking upward from the +current working directory to the nearest ancestor that contains one. That +ancestor is the repo root; all relative paths resolve against it. + +Discovery resolves symlinks on both the root and the input path, because on +macOS `$TMPDIR` lives under `/var`, which is a symlink to `/private/var`, and +relative-path resolution would otherwise produce garbage. 
+ +## Layout + +``` +.katalyst/ + config.yaml # optional: listing defaults and discovery settings + schemas/ # one JSON Schema file per named schema + book.json + bases/ # one file per base + local.yaml # a base + the collections it declares + local/ # optional: one file per collection (escape hatch) + books.yaml +``` + +By default, schemas and bases are discovered by **convention**: +every file under `schemas/` is a schema whose name is its filename stem +(`book.json` -> `book`), and every file under `bases/` is a +[base]({{< relref "bases.md" >}}) named for its filename stem (`local.yaml` -> `local`). +`config.yaml` is optional; it carries [listing]({{< relref "listing.md" >}}) defaults and can +switch a kind to **explicit** discovery, listing definitions inline instead of +as files. + +`config.yaml` is YAML; schema and base files default to YAML/JSON, and the +accepted format is set per kind there. + +Legacy projects that still use `storage:` in `config.yaml` or +`.katalyst/storage/` continue to load. Do not mix legacy and new forms in the +same project; move legacy base files to `.katalyst/bases/` when you edit them. diff --git a/docs/content/reference/configs/listing.md b/docs/content/reference/configs/listing.md new file mode 100644 index 0000000..34fe3da --- /dev/null +++ b/docs/content/reference/configs/listing.md @@ -0,0 +1,36 @@ ++++ +title = "Listing" +weight = 70 ++++ + +# Listing + +Two `item list` behaviors have configurable defaults. A `listing:` block sets +them project-wide in `.katalyst/config.yaml`, and a collection's file can +override either key for that collection. + +| Key | Values | Default | Meaning | +|---|---|---|---| +| `filterTypeMismatch` | `skip` / `error` | `skip` | A `--filter` comparison against an incompatible type either skips the item or exits 2. | +| `sortMissing` | `last` / `lowest` | `last` | Where items lacking the `--sort` key land: at the end (both directions), or below any present value. 
| + +```yaml +# .katalyst/config.yaml - project default +listing: + filterTypeMismatch: skip + sortMissing: last +``` + +```yaml +# under a base's collections: override for one collection +books: + path: notes/books + schema: book + listing: + filterTypeMismatch: error +``` + +Resolution is highest-precedence first: the `--on-type-mismatch` / +`--sort-missing` flags, then the collection's `listing:`, then the project +`listing:`, then the built-in default. An unset key falls through to the next +level. diff --git a/docs/content/reference/configs/schemas.md b/docs/content/reference/configs/schemas.md new file mode 100644 index 0000000..8725b59 --- /dev/null +++ b/docs/content/reference/configs/schemas.md @@ -0,0 +1,16 @@ ++++ +title = "Schemas" +weight = 20 ++++ + +# Schemas + +Each file under `.katalyst/schemas/` is a JSON Schema. Its **name**, the +filename stem, is the stable public handle used by `schema get `, by +an inline `schema: ` key in a document's frontmatter, and by a +collection's `schema:` shorthand. The path can move; the name should not. + +Schemas are stored flat; the check library that compiles a schema is determined +by the referencing check type's `kind` (the `object` check uses JSON Schema). + +See [Checks]({{< relref "checks.md" >}}) for object-schema resolution precedence. diff --git a/docs/content/reference/configs/variants.md b/docs/content/reference/configs/variants.md new file mode 100644 index 0000000..13b4926 --- /dev/null +++ b/docs/content/reference/configs/variants.md @@ -0,0 +1,58 @@ ++++ +title = "Variants" +weight = 60 ++++ + +# Variants + +A collection runs its base `schema`/`checks` against every item. **Variants** +let it run *extra* checks on a subset, chosen by the item's metadata. 
Each +entry in a collection's `variants:` list has a `when` discriminator and its own +`schema`/`checks`: + +```yaml +pages: + path: docs/content + pattern: "**/*.md" + schema: page # base: every page needs a title + variants: + - when: "bookCollapseSection" # section landing pages have this flag + schema: section_index + - when: "!bookCollapseSection" # every other page is a content page + schema: content_page + checks: + - kind: object_required_field + field: weight + - kind: markdown_requires_h1 + useExhaustiveVariants: false # default +``` + +**`when`** is a list of [`item list --filter`]({{< relref "../cli.md#filter-predicates" >}}) +predicates (`field=value`, `field>=n`, `field=~regex`, `!field`, ...), evaluated +against the item's frontmatter. All entries must hold (AND). Three shapes are +accepted, the first two desugaring to the third: + +```yaml +when: "kind=section" # one predicate +when: ["kind=section", "w>1"] # a list of predicates +when: { where: ["kind=section"] } +``` + +## Resolution + +An item runs the base checks plus the checks of the **first** variant (in list +order) whose `when` it satisfies; at most one variant applies. A variant *adds* +to the base, so a check belongs in a variant exactly when some page type must +skip it: in the example, `weight` and the H1 requirement apply to content pages +but not section indexes. A variant may declare no checks at all (a deliberate +exemption). + +An item that matches **no** variant runs the base checks alone. Set +**`useExhaustiveVariants: true`** to make an unmatched item a check failure +instead (`matches no variant`), so every item is provably accounted for. + +Discrimination is by metadata only; selecting items by path or filename is not +supported yet (a page type distinguishable only by location needs a frontmatter +marker). 
`pattern` still governs collection **membership** and which files are +reported as [unmatched]({{< relref "../../deep-dives/domain-model/_index.md" >}}#invariants); +variants only route checks. diff --git a/docs/content/reference/data-surfaces/plain-text.md b/docs/content/reference/data-surfaces/plain-text.md index 1dc86f5..7fbbf1a 100644 --- a/docs/content/reference/data-surfaces/plain-text.md +++ b/docs/content/reference/data-surfaces/plain-text.md @@ -44,5 +44,5 @@ literal strings." - [Plain text check types]({{< relref "../check-types/plain-text/_index.md" >}}) - [Markdown body text]({{< relref "markdown-body-text.md" >}}) -- [Configs]({{< relref "../configs/_index.md#text-rules" >}}) for text +- [Configs]({{< relref "../configs/checks.md#text-rules" >}}) for text rule configuration. diff --git a/docs/content/reference/data-surfaces/structured-object.md b/docs/content/reference/data-surfaces/structured-object.md index 8d6bfae..a4a9958 100644 --- a/docs/content/reference/data-surfaces/structured-object.md +++ b/docs/content/reference/data-surfaces/structured-object.md @@ -47,5 +47,5 @@ the item and is removed before the item is validated against that schema. - [Structured object check types]({{< relref "../check-types/structured-object/_index.md" >}}) - [Markdown body text]({{< relref "markdown-body-text.md" >}}) -- [Configs]({{< relref "../configs/_index.md#object-schema-resolution-precedence" >}}) +- [Configs]({{< relref "../configs/checks.md#object-schema-resolution-precedence" >}}) - [Collections]({{< relref "../../deep-dives/domain-model/collections.md" >}})