diff --git a/CHANGELOG.md b/CHANGELOG.md
index 8ba38cf6..4bab1cdc 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -7,8 +7,65 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
## Unreleased
+### Changed
+
+- Breaking: defining a name that is already defined is now an error, at runtime
+ and in the type checker. This covers a second `def` in the same file, a
+ script `def` whose name is already taken by the standard library or an init
+ file, and an interactive redefinition. Definition lookup is first-match-wins,
+ so a duplicate never took effect anyway — it was silently dead code while the
+ first definition kept running; the error makes that visible. The message
+ reports both positions: `Duplicate definition 'id'; already defined at
+ lib/std.msh:62:5.`
+
+### Fixed
+
+- `gridSetCell` no longer silently drops a value whose type differs from the
+ column's original type (e.g. setting a string, enum, or bool into an int
+ column). The column is promoted to mixed storage so the value is stored, and
+ the other rows are preserved.
+- Equality (`=`) is now total and defined for every value type. Lists,
+ quotations, pipes, and grids previously raised "equality not defined" at
+ runtime; lists/pipes/grids now compare structurally (element- and cell-wise)
+ and quotations compare by identity. Comparing values of different runtime
+ types yields `false` rather than an error (a genuinely incompatible
+ comparison is already a static type error), so the result no longer depends
+ on operand order and union members like `int | null` compare cleanly.
+- Converting a cyclic value (a container appended into itself) with `str` or
+ `toJson` now fails with a clear error instead of hanging forever. Internal
+ rendering (error messages, stack dumps) prints a `` marker at the
+ back-reference instead.
+- A `match` used as the body of an inferred quotation (e.g.
+ `(match leaf n : @n, node a b : 0, end) map`) now type-checks; it previously
+ always failed with "stack underflow at 'match'", rejecting the canonical way
+ to consume enums and Maybe values inside `map`/`filter`/`each`.
+- `uniq` now accepts a list of any value type (matching its `([t] -- [t])`
+ signature) and deduplicates by structural equality, instead of throwing at
+ runtime for non-primitive elements such as enums, dicts, and booleans.
+- `sort` now reorders the original elements and preserves their type, instead of
+ replacing every element with its string form. Previously `[10 2 1] sort` gave
+ the strings `1 10 2` (lexical order), and sorting a list of enums silently
+ dropped their payloads; now numbers sort numerically and stay numbers, and
+ every value keeps its type. Ordering is a total structural order: numbers
+ numerically, text lexically, lists positionally, dicts by sorted key/value,
+ enums by declaration order then payload, and different types by a fixed type
+ rank. (Use `sortV` for version/string sorting.)
+
### Added
+- `enum` declarations: a generative tagged sum type. Members are separated by
+ `|` and the declaration is closed by `end` (like `def`/`if`/`match`):
+ `enum CmdResult = ok str | failed int str | timeout end`. A member is a bare
+ constructor name optionally followed by payload types; a bare member word
+ constructs a value (consuming any payload from the stack, e.g.
+ `404 "x" failed`), and `match` dispatches on members with binding
+ (`failed code msg : ...`) and exhaustiveness checking — a match that omits a
+ member (or is empty) is rejected unless it has a `_` arm. Enums are nominal:
+ two enums with the same members are distinct types. Member names are
+ identifiers (not keywords) and are unique across all enums. `str` renders a
+ value as `member(payload ...)` and `toJson` uses the externally-tagged
+ convention. Payloads may reference the enum itself, so recursive enums like
+ `enum Tree = leaf int | node Tree Tree end` are supported.
- Octal, hexadecimal, and binary integer literals via `0o`, `0x`, and `0b`
prefixes (case-insensitive), e.g. `0o644`, `0xFF`, `0b101`. The base is purely
a way of writing the literal; the value is an ordinary integer and prints in
diff --git a/ai/enum_implementation_plan.md b/ai/enum_implementation_plan.md
new file mode 100644
index 00000000..1e74346d
--- /dev/null
+++ b/ai/enum_implementation_plan.md
@@ -0,0 +1,139 @@
+# Enum (generative tagged sum type) — implementation plan
+
+Companion to `design/literal_or_enum_typing.html` (the design + rationale). This is the
+file-by-file build plan. Plans live here in `ai/`; the design lives in `design/`.
+
+**Status: implemented on branch `enum-types`** — nullary + payload-carrying members,
+construction, nominal distinctness, and `match` (member dispatch, payload binding,
+exhaustiveness) all ship in one PR. Payloads keep the space-separated form
+(`failed int str`) and the declaration is closed by `end`, because mshell has no statement
+terminator and an unterminated payload list is ambiguous against following code.
+Out of scope, as agreed: derived `decode`/`encode`/`values`, backing strings, qualified
+`Enum.member` names, generics, and `Result` (Maybe suffices). JSON stays a structural union.
+
+## Scope & non-goals
+
+In scope (a generative tagged sum type declared with `enum`, inline `= a | b | c`):
+
+- `enum Name = c1 | c2 | ...` and `enum Name = c1 t.. | c2 t.. | ...`.
+- Constructors are case-free; enum values are produced only by a constructor word (**Position 1**:
+ no implicit coercion, `str as Enum` rejected; `decode` is out of scope in v1).
+- `match` over members with exhaustiveness; payload binding reuses the `just v` path.
+
+Explicit non-goals (per the design + owner direction):
+
+- **No `decode` / `encode` / `values` derived functions** in v1. Reading config back is handled at
+ the use site with `match` (or whatever fits). This removes the wire/serialization surface.
+- **No backing strings** (`member = "wire"`) in v1 — they only existed to feed `decode`/`encode`.
+ The member's own name is its identity. (Easy to add later when serialization is wanted.)
+- **No qualified `Enum.member` / `Enum.method` dispatch** in v1. Members are referenced by bare name,
+ resolved by context; member names are unique across enums (collision is a declaration error). This
+ removes the `.`-lexing / qualified-dispatch unknown entirely.
+- **No `Result` type.** `Maybe` already covers the common case.
+- **No change to JSON typing.** `JsonScalar` / `Json` stay *structural unions* — their variants are
+ distinguishable by structural type, so they do not need tags. Enums are only for cases structure
+ cannot discriminate (e.g. two variants with the same payload type) and for closed config sets.
+- **No generic enums** (`Enum[t, e]`) in v1.
+- **No `?`-propagation sugar** in v1.
+- A *checked* `"GET" as Method` stays deferred (it needs literal/singleton types).
+
+## Phasing
+
+Two PRs. Phase 1 (nullary enums) is now small — declaration, construction, match — with **no
+serialization surface and no qualified names**. Phase 2 adds payload variants and the tagged runtime
+value where `Evaluator.go` gets touched substantially.
+
+---
+
+## Phase 1 — Nullary enums
+
+`enum Mode = read | write | readwrite`, match-by-member + exhaustiveness, Position 1. The full v1
+surface is: declare, construct (bare member word), and `match`. No `decode`/`encode`/`values`, no
+backing strings, no qualified names.
+
+The runtime value just needs to carry **which member it is** (enum `NameId` + member `NameId`). A
+lightweight value suffices; no new heavy `MShellObject` is required for Phase 1.
+
+### Type system
+
+1. **`Lexer.go`** — add an `ENUM` keyword token (via `literalOrKeywordType`). Audit existing user
+ identifiers/usages of `enum`.
+2. **`TypeParseIntegration.go`** — add `MShellEnumDecl` (parallel to `MShellTypeDecl`) and
+ `ParseEnumDecl`: `enum` Name `=` member (`|` member)*, where a member is a bare `LITERAL`. No
+ backing clause.
+3. **`Parser.go`** — add `case ENUM:` beside `case TYPE:` (≈ line 677) dispatching to `ParseEnumDecl`.
+4. **`Type.go`** — add `TKEnum` kind + a variants side table (`[]EnumVariant{Name NameId; Payload
+ []TypeId}`; Payload empty in Phase 1), `MakeEnum`, hashconsing key, and accessors. Nominal
+ identity = the declaration `NameId` (two `enum`s with identical members stay distinct, like brands).
+5. **`Type.go` / `TypeUnify.go`** — extend `walkTypeVars` and `typeRewriter.mapType` with a `TKEnum`
+ arm (recurse payload types; none in Phase 1). `unify` (`TypeChecker.go`): `TKEnum` unifies only
+ with the same enum (by name).
+6. **Constructors as words** — register each member as a nullary sig `( -- Mode)`. Members live in a
+ **global constructor namespace**; a member name duplicated across two enums is a declaration error
+ in v1 (no qualification to disambiguate yet). A bare member word resolves to its enum; where an
+ expected enum type is in context (match subject, sig slot) that pins it.
+7. **Pre-pass registration** — mirror `DeclareType` registration (`TypeCheckProgram.go:99-101`):
+ collect `enum` headers with placeholder TypeIds, resolve bodies, register constructor words,
+ detect cross-enum member-name collisions.
+8. **Match** — `analyzeTokenPattern` (`TypeCheckProgram.go:1381`): an enum member name is a
+ recognized pattern that **credits coverage** against the enum's closed set (flip the
+ "value literals credit no coverage" behavior at `:1402` for enum subjects). `TypeBranch.go`:
+ exhaustiveness over the member set; narrowing (subject known to be that member in the arm).
+
+### Runtime (`Evaluator.go`)
+
+9. A lightweight enum value (enum + member ids). Constructor evaluation pushes it; member-pattern
+ matching extends the `matchTokenPattern` path that already handles `none`/type keywords near
+ `:1117`; plus equality, `DebugString`, `ToJson`.
+
+### Docs / housekeeping
+
+10. `doc/type_system.inc.html` + `doc/mshell.md` (rebuild with `cd doc; msh build.msh`).
+11. `CHANGELOG.md` → Unreleased / Added.
+12. `lib/std.msh` completions, in the documented Vim-fold pattern.
+13. Tests: `tests/` (+ `typecheck_test.sh`) and `go test` in `mshell/`. Cover: decl parse, construct,
+ match exhaustive (no `_`), non-exhaustive rejected, member narrowing, `str as Enum` rejected,
+ two enums with same members stay distinct, duplicate member name across enums rejected.
+
+---
+
+## Phase 2 — Payload-carrying variants
+
+`enum CmdResult = ok str | failed int str | timeout`. Adds:
+
+1. **Parser** — arms parse a constructor name followed by payload type exprs (reuse
+ `parseTypeExpr` productions for each payload).
+2. **`Type.go`** — `EnumVariant.Payload` populated; payload types flow through hashconsing and the
+ rewriter arms added in Phase 1.
+3. **Constructors with payloads** — `failed : (int str -- CmdResult)`, postfix, consume from the
+ stack like `5 just`.
+4. **Runtime value** — a new `MShellObject` generalizing `Maybe`: `{ enum NameId; tag; payload
+ []MShellObject }`. `Maybe` is the proven two-variant precedent; follow its equality/`DebugString`/
+ `ToJson` shape. Phase-1 nullary values fold in as the empty-payload case.
+5. **Match payload binding** — extend the `just v`-style binding (`TypeCheckProgram.go:1348`,
+ `Evaluator.go:1055`) to N payloads: `failed c e : ...` binds `c`, `e`.
+6. **Recursive enums** — already work via the placeholder-TypeId pre-pass.
+7. Docs / changelog / completions / tests as above (payload construct + destructure + recursive
+ enum + exhaustiveness with payloads).
+
+(Serialization helpers — `decode`/`encode`/backing strings — remain out of scope until a concrete
+need appears; config reads are handled with `match` at the use site.)
+
+---
+
+## Process
+
+- New feature branch before any code (per `CLAUDE.md`).
+- Build in `mshell/` (`go build -o ...`, in-repo cache if needed) before testing.
+- `gofmt` only with explicit permission.
+- `CHANGELOG.md` for user-facing additions; `mshell/BuiltInList.go` kept in sync if builtins added.
+
+## Decisions still to nail before coding Phase 1
+
+None blocking. The former unknowns (qualified-name dispatch, backing defaults, decode/encode
+delivery) are all dropped from v1 scope above. Remaining small calls can be made during the build:
+
+- Exact lexical home for the lightweight runtime enum value (new `MShellObject` vs. reuse).
+- Whether a bare member word with **no** expected-type context (e.g. stored straight into a var) is
+ allowed (resolves via the global member namespace) or requires a context — default: allowed, since
+ member names are unique across enums in v1.
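Illustration only, not part of the patch: step 8 of the plan has `match` credit coverage against the enum's closed member set and reject a match that omits a member unless a `_` arm is present. A minimal Go sketch of that check follows, assuming members and arm patterns are plain strings. The real checker works on `NameId`/`TypeId` values, and `missingMembers` is a hypothetical name.

```go
package main

import "fmt"

// missingMembers returns the enum members that no match arm covers.
// A "_" arm covers every member, so it yields an empty result.
// Hypothetical helper: names are plain strings here, not NameIds.
func missingMembers(members []string, arms []string) []string {
	covered := make(map[string]bool, len(arms))
	for _, arm := range arms {
		if arm == "_" {
			return nil // wildcard arm: nothing can be missing
		}
		covered[arm] = true
	}
	var missing []string
	for _, m := range members {
		if !covered[m] {
			missing = append(missing, m)
		}
	}
	return missing
}

func main() {
	mode := []string{"read", "write", "readwrite"}
	// A non-empty result means the match would be rejected.
	fmt.Println(missingMembers(mode, []string{"read", "write"}))
	fmt.Println(missingMembers(mode, []string{"read", "_"}))
	fmt.Println(missingMembers(mode, nil))
}
```

Returning the missing names, rather than a boolean, lets the error message list exactly which members a `match` forgot.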
diff --git a/code/syntaxes/mshell.textmate.json b/code/syntaxes/mshell.textmate.json
index b7bfb655..678189e1 100644
--- a/code/syntaxes/mshell.textmate.json
+++ b/code/syntaxes/mshell.textmate.json
@@ -45,6 +45,46 @@
{
"name": "variable.other.set.mshell",
"match": "@[a-zA-Z0-9_]+"
+ },
+ {
+ "name": "keyword.control.mshell",
+ "match": "else\\*|\\*if"
+ },
+ {
+ "name": "keyword.control.mshell",
+ "match": "\\b(def|end|if|iff|loop|read|str|break|continue|else|match|enum|type)\\b"
+ },
+ {
+ "name": "keyword.operator.word.mshell",
+ "match": "\\b(and|or|not)\\b"
+ },
+ {
+ "name": "keyword.other.mshell",
+ "match": "\\bsoe\\b"
+ },
+ {
+ "name": "storage.type.mshell",
+ "match": "\\b(int|float|bool)\\b"
+ },
+ {
+ "name": "constant.numeric.integer.hex.mshell",
+ "match": "\\b0[xX][0-9A-Fa-f]+\\b"
+ },
+ {
+ "name": "constant.numeric.integer.octal.mshell",
+ "match": "\\b0[oO][0-7]+\\b"
+ },
+ {
+ "name": "constant.numeric.integer.binary.mshell",
+ "match": "\\b0[bB][01]+\\b"
+ },
+ {
+ "name": "constant.numeric.float.mshell",
+ "match": "\\b\\d+\\.\\d*(?:[eE][+-]?\\d+)?\\b"
+ },
+ {
+ "name": "constant.numeric.integer.mshell",
+ "match": "\\b\\d+(?:[eE][+-]?\\d+)?\\b"
}
],
"repository": {
diff --git a/design/literal_or_enum_typing.html b/design/literal_or_enum_typing.html
new file mode 100644
index 00000000..d9bc41e3
--- /dev/null
+++ b/design/literal_or_enum_typing.html
@@ -0,0 +1,567 @@
+
+
+
+
+
+ Enums & Generative Types — mshell design
+
+
+
+
+
+
+
+ The motivating request was type ConfigOption = "string1" | "string2". Working
+ through it showed the real missing primitive is not "literal types" but a generative
+ declaration — one that mints constructors and tags values at runtime — of which a
+ plain enumeration is just the simplest case. The proposal is one new keyword, enum, whose
+ grammar is a one-token delta from the existing type X = A | B.
+
+ 1. The question, answered in one line
+ "At what point does an enum differ from a regular type definition with constructors?"
+ It differs exactly when the declaration introduces constructors — new ways to
+ make values that the runtime must tag — rather than merely naming or
+ combining types that already have values. Below that point ("an enum" of bare constants) the difference
+ is cosmetic; at or above it (variants carrying payloads) it is a genuinely new capability that today's
+ type cannot express. So there is one mechanism, and the colloquial "enum" is its degenerate
+ case.
+
+ 2. Where mshell stands today
+ Most of the machinery already exists, which is why this is an extension and not a bolt-on:
+
+
+ | Capability | Status | Where |
+
+ | Hashconsed type arena (TypeId = structural identity) | have | Type.go |
+ | Structural unions "A | B" (flatten / sort / dedupe) | have | TKUnion, MakeUnion |
+ | Nominal brands / newtypes (type X = ...) | have | TKBrand, brandify (TypeCast.go) |
+ | HM unification, type vars, occurs check; subtyping folded into unify | have | TypeUnify.go, TypeChecker.go |
+ | Surface type Name = <expr>, "|" unions, as casts | have | TypeExpr.go, TypeParseIntegration.go |
+ | One generative tagged sum type: "Maybe[t] = just t | none" | have | MShellObject.go:172; just/none |
+ | Match with constructor destructuring + exhaustiveness | have | TypeBranch.go; match |
+ | A declaration that introduces new constructors | have | enum — TypeEnum.go, MShellEnum in MShellObject.go |
+
+
+
+ Two structural facts frame everything:
+
+ - Literals widen at the push site. TypeChecker.go:205 pushes
+ TidStr for any string literal; there is no "the type of only "string1"".
+ - type is structural and non-generative. type X = A | B names
+ a combination of existing types; even a branded type only re-tags an existing
+ structural value via as. At runtime a branded str is a
+ str: the brand has zero footprint. It cannot introduce a constructor.
+
+
+ 3. The conceptual spine: structural vs. generative
+
+ Two genuinely different kinds of declaration:
+
+
+ | Aspect | type / unions / brands | generative sum type (enum) |
+
+ | Introduces new constructors? | no | yes |
+ | Discriminates by… | structural type | stored tag |
+ | Runtime footprint of the wrapper | none (re-tag only) | a stored tag |
+ | Two cases with the same payload type? | impossible | fine |
+ | How you build a value | <structural value> as X | call a constructor |
+
+
+
+ The litmus test
+ Try to express two cases that share a representation:
+ ok str | err str # two cases, SAME payload type
+ A structural union str | str collapses to str; even branded, both cases are
+ str at runtime, so match — which discriminates by structural type —
+ cannot tell an ok from an err. The moment two variants share a representation
+ (or you want ok 5 and err 5 to be different values), you need a stored
+ tag, i.e. a real sum type. That is the precise line where "enum" stops being expressible as a
+ type.
+
+ 4. Prior art: Haskell, Rust, TypeScript
+
+
+ | Language | Tag / discriminant | "enum" is… | Structural unions? | type keyword |
+
+ | Haskell | implicit (the constructor is the tag) | an all-nullary data; enum ops via the derivable Enum class | no — sum types are the only union | transparent alias (structural) |
+ | Rust | implicit (compiler-managed) | the C-like case of the one enum keyword | no — must declare an enum | transparent alias (structural) |
+ | TypeScript | explicit, hand-written data field | a separate weak runtime construct, mostly avoided | yes (untagged) — the default idiom | structural; unions + literals do the work |
+
+
+
+
+ - Haskell: data is the one generative form; an "enum" is just a
+ data whose constructors are all nullary, and enumerable behaviour
+ (succ, [Red ..]) is a derived typeclass, not syntax. type
+ is a transparent alias; newtype is the zero-cost wrapper (≈ mshell's brand).
+ - Rust: one enum keyword subsumes C-enums and tagged unions.
+ Fieldless variants get bonus powers (explicit discriminants, as i32, compact layout);
+ payload variants compile to a tag + union. Variants are not standalone types; there are no untagged
+ structural unions at all.
+ - TypeScript: the split world, with a weak nominal enum keyword (real
+ runtime objects, widely avoided) plus the dominant structural idiom of literal types and
+ untagged unions, with tagged sum types emulated by hand via a discriminant property
+ ({ kind: "circle"; ... } | { kind: "rect"; ... }). No generative primitive; it is a
+ pattern.
+
+
+ Punchline: Haskell and Rust both concluded the tagged sum type is the primitive and
+ the enum is its degenerate nullary case — one generative form, with type reserved for
+ transparent structural aliases. TypeScript went structural-first and bolted enum on, which
+ is exactly the schism to avoid. mshell already has the structural side (unions; brands ≈ transparent
+ aliases) and one generative sum type (Maybe), so it sits closer to Haskell/Rust.
+ The natural, non-bolted-on move is theirs.
+
+ 5. Decision
+ Add a single generative tagged sum type declaration. Keep type exactly as
+ it is (the transparent / branded structural form). The colloquial "enum of constants" is the all-nullary
+ special case of the one mechanism — not a second concept. This conceptually subsumes Maybe
+ ("the built-in enum Maybe[t] = just t | none"; it stays a distinct built-in until enums
+ grow type parameters — see §10.E) and unlocks "make illegal states unrepresentable" for command
+ results, parse results, and JSON.
+
+ 6. Syntax
+
+ Keyword
+ The keyword — not capitalization — signals "everything here is a constructor," which is what
+ lets the design be entirely case-free (see §8). Candidates:
+
+ | Keyword | Precedent | Note |
+
+ | enum (chosen) | Rust, Swift | most recognizable; Rust/Swift legitimized it for payload-carrying variants |
+ | variant | OCaml / Reason | accurate, no baggage; less universally known |
+ | oneof | Protobuf | self-describing; slightly informal |
+ | union | none | rejected: "|" already means a structural union |
+
+
+
+ Declaration form — |-separated members, closed by end
+ Each arm is a constructor name followed by zero or more space-separated payload types,
+ arms are |-separated, and the body is closed by end — the same block
+ terminator def, if, match, and loop already use. A
+ nullary member is just a bare name.
+
+ enum Mode = readonly | writeonly | readwrite end
+enum Shape = circle float | rect float float | point end
+enum CmdResult = ok str | failed int str | timeout end
+
+ The end terminator is what keeps the grammar whitespace-insensitive. mshell
+ has no statement terminator, so an open-ended, space-separated payload list has no way to mark its end:
+ after a nullary member, the parser otherwise cannot tell a following (...) /
+ [...] statement from a payload. Delimiters attached to the member name
+ (failed(int str)) would fix it only by making whitespace significant, which we reject.
+ end bounds the whole member list instead — and because every payload type is
+ itself self-delimiting (a quote type (a -- b) closes at its )), space-separated
+ payloads inside the body are unambiguous. This is exactly why type X = (a -- b) already has
+ no boundary problem: a type body is one self-bounded expression, whereas an enum body is an open list
+ that needs a terminator.
+
+ Grammar:
+
+ - enum Name = arm (| arm)* end
+ - arm ::= constructorName typePrimary* (the constructor name is a bare identifier, not a
+ language keyword such as read or str, and not a name already taken by a
+ def or another enum's member, see §8; each payload is a type primary, so a union payload
+ must be named via a type alias)
+
+ The only new wrinkle vs. a structural union is that an arm's first token is a binding
+ occurrence (a constructor being declared) rather than a reference to an existing type. The
+ enum keyword is what tells the parser to read it that way — no capitalization rule.
+
+ A long block is still fine
+ Arms may be placed one per line, with an optional leading | for alignment:
+ enum Event =
+ | click int int
+ | key int
+ | close
+end
+
+ 7. Construction and matching (use sites)
+ Both reuse the existing Maybe machinery verbatim — the whole point of "Maybe
+ is just the built-in case."
+
+ Construction is postfix, exactly like 5 just (just : (a -- Maybe[a])).
+ A constructor is a word whose stack effect is "consume the payloads, push the enum":
+ "hello\n" ok # ( str -- CmdResult )
+404 "not found" failed # ( int str -- CmdResult )
+timeout # ( -- CmdResult ), like bare `none`
+
+ Match reuses the just v binding arm; payload names bind in the body. No
+ _ is needed when every constructor is covered — the set is closed, so the checker
+ proves exhaustiveness (the same path just/none use in
+ TypeBranch.go):
+ cmd run match
+ ok out : @out wl,
+ failed c e : $"{@c}: {@e}" wl,
+ timeout : "timed out" wl,
+end
+
+ The payoff: add a fourth constructor later and every match that forgot it becomes a
+ compile error.
+
+ 8. Unique member names, not case, not namespacing
+ Capitalization is deliberately given no meaning anywhere. The one job case used to do — telling a
+ bare constructor apart from a bare variable / def — had a draft answer here
+ (a qualified CmdResult.ok form plus checker-context resolution). What shipped is
+ simpler and stricter: collisions are declaration errors, so a bare member name is always
+ unambiguous and no qualified form exists:
+
+ - Member names are unique across all enums. Declaring a member another enum already
+ uses is a static error, so a bare ok identifies its enum by itself.
+ - Members and defs share the word namespace, and a collision in either direction
+ is an error: an enum member may not reuse an existing def/builtin name, and a
+ def may not reuse a member name. (The same rule applies between defs
+ themselves: a duplicate definition name is an error, since lookup is first-match-wins and a
+ duplicate would be silently dead code.)
+ - Member names may not be language keywords (read, str,
+ type, …), and a lone _ is reserved as the match
+ wildcard.
+
+ match arms need no qualification either: the checker identifies the subject's enum from
+ the member names in the arms, including when the match is the body of an inferred
+ quotation ((match … end) map).
+
+ 9. Representation & implementation sketch
+
+ - Type kind. Add TKEnum (nominal, keyed by the declaration's
+ NameId) with a side table of variants: each variant is a (NameId, []TypeId)
+ payload list. Hashconsing, walkTypeVars, and typeRewriter.mapType gain one
+ arm each (recurse through payload types; nominal identity by name).
+ - Constructors register as words with quote signatures
+ (failed : (int str -- CmdResult)), so application and overload machinery type-check them
+ with zero special-casing. Nullary constructors are ( -- Enum), like none.
+ - Runtime value. A new MShellObject generalizing Maybe: a tag
+ (the variant NameId / index) plus a payload slice. Maybe is the pre-existing
+ two-variant instance, so the model is already proven; equality and DebugString follow its
+ shape.
+ - Match. Constructor patterns credit coverage against the declaration's closed set
+ (flip today's "value literals credit no coverage", TypeCheckProgram.go:1402, for enum
+ subjects); exhaustiveness reuses the Maybe path; payload binding reuses the
+ just v path (TypeCheckProgram.go:1348).
+ - Forward / recursive types work via the existing top-level pre-pass that reserves
+ placeholder TypeIds for type headers (extend it to enum):
+ enum Json = jnull | jbool bool | jnum float | jstr str | jarr [Json] | jobj {str: Json} end.
+ - Serialization (as shipped). str renders member /
+ member(p0 p1 ...); toJson uses serde's externally-tagged convention (a
+ nullary member is the bare member string, one payload is {"member": value}, several are
+ {"member": [v0, v1, ...]}); equality is structural and total; sort orders
+ members by declaration order (a stored member index), so low | medium | high sorts as
+ the author intended. Parsing JSON/argv strings back into an enum still needs an explicit
+ decode word (see §10.B and §11).
+
+
+ 10. Conciseness sugar for common cases
+ Ranked by value-for-effort. Status: A is the shipped base form, and the payload half of C
+ (a shape as a payload type) shipped for free; B, C's destructuring sugar, and D–F remain future
+ work — recorded here as designed so they can land without re-deciding.
+
+ A. The nullary one-liner (already the base form) — shipped
+ enum Env = dev | staging | prod end
+enum Level = debug | info | warn | error end
+ This is the everyday "configuration option" case, and it is already as terse as it can be.
+
+ B. Auto string backing + derived decode / encode / values — future
+ For an all-nullary enum, auto-back each constructor by its name and derive three functions, since config
+ crosses a string boundary. Override the backing with = when the wire format differs:
+ enum Method = get = "GET" | post = "POST" | put = "PUT" end
+
+"GET" Method.decode # ( str -- Maybe[Method] ) runtime-validated
+@m Method.encode # ( Method -- str )
+Method.values # ( -- [Method] ) all members, in order
+ An all-nullary, all-backed enum can be represented at runtime as its backing string (free
+ serialization, no tag object) — the Rust "fieldless variants get a compact representation" move.
+ Payload-carrying enums fall back to the tagged value of §9.
+ $CONFIG_LEVEL Level.decode match
+ just lvl : @lvl configureLogging,
+ none : $"bad level: {$CONFIG_LEVEL}; try {Level.values}" wl 1 exit,
+end
+
+ The backing string is wire-only — two paths to a value, no implicit coercion
+ Decided (V1): the only ways to produce an enum value are a constructor or
+ decode. The backing string is purely a serialization detail, never a way to name a
+ member in code — the same separation Rust/Serde draws (you write Method::Get in code;
+ "GET" is the wire format's business).
+
+ | Path | Signature | For |
+
+ | constructor | get : ( -- Method) | a compile-time-known member — the normal in-source form |
+ | decode | ( str -- Maybe[Method]) | a runtime string of unknown value, validated at the boundary |
+
+
+ A bare str is not a Method, and
+ "GET" as Method is rejected: with no literal/singleton types, the checker
+ cannot verify a plain str is a valid backing, so an as here would be an
+ unchecked re-tag that could mint a Method matching no arm. This keeps the language's
+ existing "no implicit coercions" invariant intact and the structural→nominal boundary crisp:
+ constructors are the only way to make one. It also costs no source terseness — the concise form
+ was never "GET", it is the constructor get.
+ Deferred, additive: if literal/singleton types are ever added for other reasons, a
+ checked "GET" as Method (compile-time membership-verified, "GXT" as Method
+ an error) becomes a safe, optional nicety that can be layered on then with no migration. Implicit
+ bare-literal coercion is explicitly not a goal — it would be mshell's first
+ implicit coercion and would leak the wire format into program logic.
+
+ C. Shape payloads → free named destructuring — half shipped
+ Shipped: a payload may already be any type expression, including a shape —
+ enum Event = click {"x": int} | close end works today, with a match arm binding the whole
+ dict (click d : @d :x? …). Future: the sugar below, destructuring
+ the shape into named bindings in the pattern itself; it reuses TKShape and the existing
+ { 'k': name } dict-pattern matcher (including optional ?: fields) for both
+ construction and destructuring:
+ enum Event = click {x: int, y: int} | key {code: int, shift?: bool} | close end
+
+ev match
+ click {x: x, y: y} : $"click {@x},{@y}" wl,
+ key {code: c} : @c handleKey,
+ close : "bye" wl,
+end
+
+ D. Combined arms (|) for fan-in — future
+ status match
+ pending | running : "in progress" wl,
+ done : "complete" wl,
+ failed e : @e wl,
+end
+
+ E. Generic enums subsume the built-ins — future
+ Enums are not yet generic: a recursive payload (node Tree Tree) works, type parameters do
+ not, so Maybe remains the built-in instance rather than being literally subsumed.
+ enum Result[t, e] = ok t | err e end
+# and Maybe[t] = just t | none is simply built in
+
+ F. ?-style propagation (speculative)
+ mshell already has guard-style return. For a designated Result-shaped enum, an
+ unwrap-or-early-return operator (Rust's ?) could compress
+ @x parseInt match just v : @v, none : return end to something like @x parseInt!?.
+ Defer until Result is idiomatic.
+
+ 11. End-to-end example (as shipped)
+ The wire boundary is an explicit match on the string; §10.B's derived
+ Mode.decode / Mode.values would replace parseMode if backing
+ ever lands:
+ enum Mode = readonly | writeonly | readwrite end
+
+def parseMode (str -- Maybe[Mode])
+ match
+ "ro" : readonly just,
+ "wo" : writeonly just,
+ "rw" : readwrite just,
+ _ : none,
+ end
+end
+
+def openFile (path Mode -- Handle)
+ mode!
+ @mode match
+ readonly : @path openRead,
+ writeonly : @path openWrite,
+ readwrite : @path openRW,
+ end
+end
+
+# at the boundary: a string from argv, validated exactly once
+$1 parseMode match
+ just m : somePath @m openFile use,
+ none : "--mode must be one of ro, wo, rw" wl 1 exit,
+end
+ The checker guarantees openFile handles every mode, that no unvalidated string reaches it,
+ and that adding an append constructor breaks the build at openFile until it is
+ handled.
+
+ 12. Open questions — as resolved
+
+ - Backing types & defaults — still open: auto-back nullary enums by
+ constructor name verbatim, or lowercased? Allow
+ int backings as well as
+ str? Deferred with the rest of §10.B; nothing shipped constrains the answer.
+ - Ergonomics — decided (V1), holds as shipped: a value is produced only by a
+ constructor (or a hand-written parse word returning
+ Maybe, as in §11; §10.B's derived
+ decode would join them). A bare backing literal does not stand in for a member
+ and str as Method is rejected (see §10.B). A checked as remains a clean,
+ additive follow-on gated on literal types; implicit coercion is ruled out.
+ - Constructor namespace collisions — resolved, differently than §8's draft:
+ member names are globally unique and member/def collisions are declaration errors in
+ both directions; there is no qualified form and bare names are always unambiguous (§8).
+ - Match-form ambiguity — resolved as designed: the parser stays uncommitted;
+ the checker discriminates constructor-pattern vs. type-pattern vs.
+ Maybe-pattern by the
+ subject's static type, inferring the subject even when the match is an inferred
+ quotation's body. A bare enum type name is a valid arm for an enum inside a
+ type union.
+ - Sequencing — resolved: nullary and payload variants shipped together (one
+ declaration form); backed enums remain future work.
+ - Reserved word — done:
+ enum is a lexer keyword, covered in the
+ docs highlighter and the Sublime / VS Code grammars; a lone _ is reserved as the
+ match wildcard at the same time.
+
+
diff --git a/doc/base.html b/doc/base.html
index 6e5491ee..2ea961f7 100644
--- a/doc/base.html
+++ b/doc/base.html
@@ -27,7 +27,7 @@
color: #0000FF;
}
- .mshellIF, .mshellELSE, .mshellELSESTAR, .mshellSTARIF, .mshellEND, .mshellDEF, .mshellMATCH {
+ .mshellIF, .mshellELSE, .mshellELSESTAR, .mshellSTARIF, .mshellEND, .mshellDEF, .mshellMATCH, .mshellENUM, .mshellTYPE {
color: #0F4C81;
font-weight: bold;
}
diff --git a/doc/mshell.md b/doc/mshell.md
index e2592b6b..07f0b174 100644
--- a/doc/mshell.md
+++ b/doc/mshell.md
@@ -655,6 +655,53 @@ An operation that is valid for only some members is a type error — dividing an
For more detail, see the generated Type System help page.
+## Enums
+
+An `enum` declares a generative tagged sum type: members separated by `|`, closed by `end` (like `def`/`if`/`match`):
+
+```
+enum Color = red | green | blue end
+```
+
+A bare member name constructs that value (`green` pushes a `Color`).
+Member names are identifiers (not keywords) and are unique across all enums.
+Unlike a union, an enum is nominal — two enums with the same members are distinct types.
+
+Members may carry a payload — types written after the member name; the constructor consumes those values from the stack.
+A nullary member has no payload. The closing `end` bounds the member list, so payloads are never confused with the following code. Payloads may reference the enum itself (recursive enums).
+
+```
+enum CmdResult = ok str | failed int str | timeout end
+404 "not found" failed # ( int str -- CmdResult )
+
+enum Tree = leaf int | node Tree Tree end
+```
+
+`match` dispatches on the member and binds payload values (like `just v`).
+A match must cover every member or include a `_` arm; omitting a member is a static error.
+
+```
+result match
+ ok out : @out wl,
+ failed c e : $"{@e} ({@c})" wl,
+ timeout : "timed out" wl,
+end
+```
+
+An enum may also be a member of a `type` union (e.g. `type T = Color | int`).
+A `match` on such a union discriminates it with the enum's *type name* as an arm,
+which matches any value of that enum:
+
+```
+enum Color = red | green | blue end
+type T = Color | int
+
+x match
+ Color : "a color" wl,
+ int : "an int" wl,
+end
+```
+
## Definitions
Definitions use `def` with an optional metadata dictionary before the type signature.
@@ -667,6 +714,11 @@ end
Metadata values must be static: strings (single or double quoted), integers, floats, booleans, or nested lists/dicts of the same. Interpolated strings are not allowed.
+Definition names must be unique.
+Defining a name that is already defined — by the same file, the standard library, or an init file — is an error,
+both at runtime and in the type checker.
+(Lookup is first-match-wins, so a duplicate would never take effect; the error makes that visible.)
+
### Tail-Call Optimization
Recursive definitions in tail position are optimized to avoid stack overflow.
@@ -1265,8 +1317,8 @@ groupBy
## Sorting
-- `sort`: Sort list. Converts all items to strings, then sorts using go's `sort.Strings` `(list -- list)`
-- `sortV`: Version sort list. Converts all items to strings, then sorts like GNU `sort -V` (`list -- list`)
+- `sort`: Sort a list by a total structural order, preserving each element's type (numbers sort numerically and stay numbers; a list of enums keeps its payloads). The order is: numbers numerically, text (str/path/literal) lexically, dates chronologically, bytes bytewise, lists positionally, dicts by sorted key then value, enums by declaration order then payload, and values of different types by a fixed type rank. `([t] -- [t])`
+- `sortV`: Version sort list. Converts each item to a string, then sorts like GNU `sort -V`, keeping the original elements. `([t] -- [t])`
- `sortBy`: Sort a Grid or GridView by one or more columns ascending. Spec is a column name (str) or list of column names ([str]); priority is left-to-right. Stable; `none` cells sort last; cross-type values in a generic column error. Compose with `reverse` for descending. `(Grid|GridView str|[str] -- Grid)`
- `sortByCmp`: Sort a list, Grid, or GridView using a comparison function. The function/quotation receives two items (or two `GridRow`s) and should return -1 when a < b, 0 when a = b, or 1 when a > b. Stable. `[a] (a a -- int) -- [a]` / `(Grid|GridView (GridRow GridRow -- int) -- Grid)`
- `reverse`: Reverse a list, Grid, or GridView, returning a new value with elements/rows in reverse order. `(list -- list)` / `(Grid|GridView -- Grid)`
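The new `sort` wording describes a total order: values are first separated by a fixed type rank, then compared by their natural order within a rank. A minimal sketch of that idea follows, using plain Go values (`bool`, `int`, `float64`, `string`, `[]any`) as stand-ins for the interpreter's object types. The names `rank`, `compare`, and `sortValues` are hypothetical, and the rank sequence here is an assumption for illustration, not the shipped table.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// rank gives each kind a fixed slot so a mixed-type list still orders totally.
// The sequence is illustrative only.
func rank(v any) int {
	switch v.(type) {
	case bool:
		return 1
	case int, float64:
		return 2
	case string:
		return 3
	case []any:
		return 4
	default:
		return 5
	}
}

func cmpFloat(a, b float64) int {
	if a < b {
		return -1
	}
	if a > b {
		return 1
	}
	return 0
}

// toFloat widens an int or float64 so the two numeric kinds compare together.
func toFloat(v any) float64 {
	switch n := v.(type) {
	case int:
		return float64(n)
	case float64:
		return n
	}
	return 0
}

// compare returns -1, 0, or 1: by type rank first, then by natural order.
func compare(a, b any) int {
	if ra, rb := rank(a), rank(b); ra != rb {
		return cmpFloat(float64(ra), float64(rb))
	}
	switch x := a.(type) {
	case bool:
		y := b.(bool)
		if x == y {
			return 0
		}
		if !x {
			return -1 // false sorts before true
		}
		return 1
	case int, float64:
		return cmpFloat(toFloat(a), toFloat(b))
	case string:
		return strings.Compare(x, b.(string))
	case []any:
		// Lists compare positionally; a strict prefix sorts first.
		y := b.([]any)
		for i := 0; i < len(x) && i < len(y); i++ {
			if c := compare(x[i], y[i]); c != 0 {
				return c
			}
		}
		return cmpFloat(float64(len(x)), float64(len(y)))
	}
	return 0
}

// sortValues sorts in place and keeps every element's original type.
func sortValues(list []any) {
	sort.SliceStable(list, func(i, j int) bool { return compare(list[i], list[j]) < 0 })
}

func main() {
	list := []any{"b", 3, 1.5, "a", true, 2}
	sortValues(list)
	fmt.Println(list)
}
```

Because the comparator never converts elements to strings, numbers stay numbers after sorting, which is the behavioural change the updated `sort` entry calls out.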
diff --git a/doc/type_system.inc.html b/doc/type_system.inc.html
index bae627c4..f79d23d1 100644
--- a/doc/type_system.inc.html
+++ b/doc/type_system.inc.html
@@ -11,6 +11,7 @@
Type Expressions
Dictionaries
Lists
+ Enums
Quotations
Control Flow
Current Boundaries
@@ -341,6 +342,69 @@
+
+
+
+An enum declares a named type whose values are a fixed set of members (constructors).
+The members are separated by |, and the declaration is closed by end — like def, if, and match.
+Unlike a union, an enum is generative: each member is a brand-new value that the language tags, so two enums with the same members are still distinct types, and a member can carry data that another member does not.
+
+
+enum Color = red | green | blue end
+
+
+A member name is written bare to construct that value.
+Member names are ordinary identifiers (they may not be language keywords), and every member name is unique across all enums.
+
+
+green # a value of type Color
+
+
+Members may carry a payload: a list of types written after the member name.
+The constructor then consumes those values from the stack, just like any other word.
+A member with no payload is nullary.
+The closing end marks where the member list stops, so a payload list is never confused with the code that follows.
+
+
+enum CmdResult = ok str | failed int str | timeout end
+
+"output" ok # ( str -- CmdResult )
+404 "not found" failed # ( int str -- CmdResult )
+timeout # ( -- CmdResult )
+
+
+A payload may reference the enum itself, so recursive enums are allowed.
+
+
+enum Tree = leaf int | node Tree Tree end
+
+
+Use match to dispatch on the member.
+A payload member binds its values to names in the arm body, the same way just v binds a Maybe payload.
+The match must cover every member, or include a wildcard _ arm; a match that omits a member is a static error, so adding a member later forces every match to handle it.
+
+
+def render (CmdResult -- str)
+ match
+ ok out : @out,
+ failed c e : $"{@e} ({@c})",
+ timeout : "timed out",
+ end
+end
+
+
+An enum can also be one member of a type union, such as type T = Color | int.
+A match on the union discriminates it with the enum's type name as an arm, which matches any value of that enum:
+
+
+enum Color = red | green | blue end
+type T = Color | int
+
+x match
+ Color : "a color",
+ int : "an int",
+end
+
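The enum documentation above says a constructor consumes its payload from the stack and that a `match` arm binds payload values to names. The sketch below models both steps in Go under simplifying assumptions: the stack is a plain `[]any`, and `enumValue`, `memberInfo`, `construct`, and `matchArm` are hypothetical names, not the interpreter's own types. Unlike the real runtime, an arity mismatch in an arm is treated here as "no match" rather than an error.

```go
package main

import "fmt"

// enumValue is a tagged value: which enum, which member, and its payload.
type enumValue struct {
	enumName string
	member   string
	payload  []any
}

// memberInfo records the owning enum and how many values the constructor pops.
type memberInfo struct {
	enumName string
	arity    int
}

// construct pops `arity` values (preserving their stack order) and pushes the
// resulting enum value. It returns the updated stack.
func construct(stack []any, members map[string]memberInfo, member string) ([]any, error) {
	info, ok := members[member]
	if !ok {
		return stack, fmt.Errorf("unknown enum member '%s'", member)
	}
	if len(stack) < info.arity {
		return stack, fmt.Errorf("enum constructor '%s' needs %d payload value(s) on the stack", member, info.arity)
	}
	split := len(stack) - info.arity
	payload := append([]any(nil), stack[split:]...)
	return append(stack[:split], enumValue{info.enumName, member, payload}), nil
}

// matchArm reports whether v is the named member and, if so, binds each payload
// value to the corresponding name. A "_" name discards that value.
func matchArm(v enumValue, member string, names ...string) (map[string]any, bool) {
	if v.member != member || len(names) != len(v.payload) {
		return nil, false
	}
	binds := make(map[string]any)
	for i, n := range names {
		if n != "_" {
			binds[n] = v.payload[i]
		}
	}
	return binds, true
}

func main() {
	// enum CmdResult = ok str | failed int str | timeout end
	members := map[string]memberInfo{
		"ok":      {"CmdResult", 1},
		"failed":  {"CmdResult", 2},
		"timeout": {"CmdResult", 0},
	}
	// 404 "not found" failed
	stack, err := construct([]any{404, "not found"}, members, "failed")
	if err != nil {
		fmt.Println(err)
		return
	}
	v := stack[0].(enumValue)
	if binds, ok := matchArm(v, "failed", "c", "e"); ok {
		fmt.Printf("%v (%v)\n", binds["e"], binds["c"])
	}
}
```

A flat member-name map suffices in this sketch for the same reason the documentation gives: member names are unique across all enums, so a bare word identifies its enum unambiguously.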
diff --git a/mshell/Evaluator.go b/mshell/Evaluator.go
index a2c35f48..eda38f95 100644
--- a/mshell/Evaluator.go
+++ b/mshell/Evaluator.go
@@ -396,6 +396,59 @@ type EvalState struct {
defIndex map[string]int
defIndexLen int
+
+ // EnumMembers maps a member name to its enum and payload arity. Populated
+ // from `enum` declarations (RegisterEnums) before evaluation; member names
+ // are unique across enums, so this flat lookup is enough to construct a
+ // value from a bare member word, including consuming its payload.
+ EnumMembers map[string]EnumMemberInfo
+
+ // EnumTypeNames is the set of declared enum type names. A bare enum type
+ // name is a valid match-arm pattern (a type test that any member of that
+ // enum satisfies), e.g. matching a `C | int` union value against `C`.
+ EnumTypeNames map[string]bool
+}
+
+// EnumMemberInfo records where a member came from and how many payload values
+// its constructor consumes from the stack.
+type EnumMemberInfo struct {
+ EnumName string
+ Arity int
+ // Ordinal is the member's 0-based position in its enum declaration, stamped
+ // onto constructed values (MShellEnum.MemberIndex) so sorting can order by
+ // declaration order.
+ Ordinal int
+}
+
+// RegisterEnums scans parse items for `enum` declarations and records each
+// member, so a bare member word can be constructed at evaluation time. Called
+// once before top-level evaluation; mirrors the checker's enum pre-pass so the
+// two agree regardless of declaration order.
+func (state *EvalState) RegisterEnums(items []MShellParseItem) {
+ for _, item := range items {
+ d, ok := item.(*MShellEnumDecl)
+ if !ok {
+ continue
+ }
+ if state.EnumMembers == nil {
+ state.EnumMembers = make(map[string]EnumMemberInfo)
+ }
+ if state.EnumTypeNames == nil {
+ state.EnumTypeNames = make(map[string]bool)
+ }
+ state.EnumTypeNames[d.Name] = true
+ for i, m := range d.Members {
+ if _, exists := state.EnumMembers[m]; exists {
+ continue
+ }
+ state.EnumMembers[m] = EnumMemberInfo{EnumName: d.Name, Arity: len(d.MemberPayloads[i]), Ordinal: i}
+ }
+ }
+}
+
+// isEnumTypeName reports whether name is a declared enum type name.
+func (state *EvalState) isEnumTypeName(name string) bool {
+ return state.EnumTypeNames != nil && state.EnumTypeNames[name]
}
// RebuildDefinitionIndex records the first index for each name, matching
@@ -422,6 +475,38 @@ func (state *EvalState) lookupDefinition(definitions []MShellDefinition, name st
return definitions[i], true
}
+// tokenPosStr formats a token's position as `path:line:col` (or `line:col`
+// when the source file is unknown, e.g. stdin) for use in error messages.
+func tokenPosStr(t Token) string {
+ if t.TokenFile != nil && t.TokenFile.Path != "" {
+ return fmt.Sprintf("%s:%d:%d", t.TokenFile.Path, t.Line, t.Column)
+ }
+ return fmt.Sprintf("%d:%d", t.Line, t.Column)
+}
+
+// FindDuplicateDefinition scans the given definition slices, in order, and
+// returns an error for the first name defined twice. Definition lookup is
+// first-match-wins, so a second definition of a name is never an override —
+// it would be silently dead code. Erroring keeps a script (or init file, or
+// interactive input) from redefining a name already taken by the standard
+// library, an earlier startup file, or itself.
+func FindDuplicateDefinition(defLists ...[]MShellDefinition) error {
+ seen := make(map[string]Token)
+ for _, defs := range defLists {
+ for i := range defs {
+ name := defs[i].Name
+ prev, exists := seen[name]
+ if !exists {
+ seen[name] = defs[i].NameToken
+ continue
+ }
+ return fmt.Errorf("%s: Duplicate definition '%s'; already defined at %s.\n",
+ tokenPosStr(defs[i].NameToken), name, tokenPosStr(prev))
+ }
+ }
+ return nil
+}
+
func (state *EvalState) AddCompletionDefinitions(definitions []MShellDefinition) {
if state.CompletionDefinitions == nil {
state.CompletionDefinitions = make(map[string][]MShellDefinition)
@@ -909,6 +994,11 @@ func (state *EvalState) processToken(token MShellParseItem, frame *EvaluationFra
// Static-only: type declarations have no runtime effect by design.
return SimpleSuccess()
+ case *MShellEnumDecl:
+ // Static-only: enum declarations have no runtime effect; members are
+ // pre-registered via RegisterEnums.
+ return SimpleSuccess()
+
case *MShellAsCast:
// Static-only: `as` is a checker hint; no runtime work.
return SimpleSuccess()
@@ -1114,15 +1204,68 @@ func (state *EvalState) processMatchBlock(matchBlock *MShellParseMatchBlock, fra
return state.FailWithMessage(fmt.Sprintf("%d:%d: No matching arm found in match block and no wildcard '_' arm provided.\n", startToken.Line, startToken.Column))
}
+// parseItemLexeme renders a parse item for a diagnostic: a token's lexeme, or a
+// non-token pattern's debug form.
+func parseItemLexeme(item MShellParseItem) string {
+ if tok, ok := item.(Token); ok {
+ return tok.Lexeme
+ }
+ return item.DebugString()
+}
+
// matchPattern checks if a subject matches a pattern (list of parse items).
// Returns (matched bool, bindings map, result EvalResult).
func (state *EvalState) matchPattern(pattern []MShellParseItem, subject MShellObject, startToken Token) (bool, map[string]MShellObject, EvalResult) {
+ // Enum patterns against an enum value. A member name (`member` or
+ // `member b1 b2 ...`) matches that member and binds its payload; a sibling
+ // member just fails this arm and the next is tried. A bare enum *type name*
+ // (`C`) is a type-test arm that matches any member of that enum — this is
+ // how a `C | int` union value is discriminated. `_` and `none` fall through
+ // to the generic handling.
+ if enumVal, ok := subject.(*MShellEnum); ok && len(pattern) >= 1 {
+ if tok, okTok := pattern[0].(Token); okTok && tok.Type == LITERAL && tok.Lexeme != "_" && tok.Lexeme != "none" {
+ if tok.Lexeme == enumVal.Member {
+ binds := pattern[1:]
+ if len(binds) != len(enumVal.Payload) {
+ return false, nil, state.FailWithMessage(fmt.Sprintf("%d:%d: enum member '%s' binds %d payload value(s), got %d.\n",
+ tok.Line, tok.Column, tok.Lexeme, len(enumVal.Payload), len(binds)))
+ }
+ bindings := make(map[string]MShellObject)
+ for _, b := range binds {
+ // A payload binding must be a plain name (a LITERAL) or the
+ // `_` wildcard — not a keyword/operator token (`end`, `x`,
+ // ...) or a nested pattern. This mirrors the type checker
+ // (enumMemberPattern) and the `just`/type-test binding forms,
+ // so the runtime never accepts an arm the checker rejects.
+ bt, okBt := b.(Token)
+ if !okBt || (bt.Type != LITERAL && bt.Type != UNDERSCORE) {
+ return false, nil, state.FailWithMessage(fmt.Sprintf("%d:%d: enum member '%s' payload bindings must be names, not '%s'.\n",
+ tok.Line, tok.Column, tok.Lexeme, parseItemLexeme(b)))
+ }
+ }
+ for i, b := range binds {
+ if bt := b.(Token); bt.Lexeme != "_" {
+ bindings[bt.Lexeme] = enumVal.Payload[i]
+ }
+ }
+ return true, bindings, SimpleSuccess()
+ }
+ // Not this value's member. A single bare enum type name is a type
+ // test: it matches iff the value belongs to that enum.
+ if len(pattern) == 1 && state.isEnumTypeName(tok.Lexeme) {
+ return tok.Lexeme == enumVal.EnumName, nil, SimpleSuccess()
+ }
+ // A sibling member name (or any other literal): this arm fails.
+ return false, nil, SimpleSuccess()
+ }
+ }
+
// Handle multi-token patterns (e.g., "just v" for maybe destructuring,
 // or "<type> name" for type-test binding).
if len(pattern) == 2 {
first, firstOk := pattern[0].(Token)
second, secondOk := pattern[1].(Token)
- if firstOk && secondOk && second.Type == LITERAL {
+ if firstOk && secondOk && (second.Type == LITERAL || second.Type == UNDERSCORE) {
if first.Type == LITERAL && first.Lexeme == "just" {
// Maybe Just destructuring
var maybeVal Maybe
@@ -1185,6 +1328,8 @@ func (state *EvalState) matchPattern(pattern []MShellParseItem, subject MShellOb
// matchTokenPattern matches a single token pattern against a subject.
func (state *EvalState) matchTokenPattern(p Token, subject MShellObject) (bool, EvalResult) {
switch p.Type {
+ case UNDERSCORE:
+ return true, SimpleSuccess()
case LITERAL:
if p.Lexeme == "_" {
return true, SimpleSuccess()
@@ -1233,6 +1378,13 @@ func (state *EvalState) matchTokenPattern(p Token, subject MShellObject) (bool,
_, ok := subject.(MShellBinary)
return ok, SimpleSuccess()
}
+ // A bare enum type name is a type test: it matches an enum value of
+ // that enum, and simply fails (try the next arm) for any other value.
+ // This lets a union like `C | int` be discriminated by the arm `C`.
+ if state.isEnumTypeName(p.Lexeme) {
+ en, ok := subject.(*MShellEnum)
+ return ok && en.EnumName == p.Lexeme, SimpleSuccess()
+ }
return false, state.FailWithMessage(fmt.Sprintf("%d:%d: Unknown match pattern literal '%s'. %s\n", p.Line, p.Column, p.Lexeme, matchPatternFormsHint))
case TYPEINT:
@@ -1423,7 +1575,7 @@ func (state *EvalState) matchDictPattern(pattern *MShellParseDict, subject MShel
return false, nil, state.FailWithMessage(fmt.Sprintf("%d:%d: Dict pattern value must be a single binding name.\n", startToken.Line, startToken.Column))
}
tok, ok := kv.Value[0].(Token)
- if !ok || tok.Type != LITERAL {
+ if !ok || (tok.Type != LITERAL && tok.Type != UNDERSCORE) {
return false, nil, state.FailWithMessage(fmt.Sprintf("%d:%d: Dict pattern value must be a literal binding name.\n", startToken.Line, startToken.Column))
}
if tok.Lexeme != "_" {
@@ -5847,6 +5999,25 @@ func (state *EvalState) evaluateToken(t Token, stack *MShellStack, context Execu
return SimpleSuccess()
}
+ // Enum constructor: a bare member word consumes its payload
+ // (if any) from the stack and pushes the enum value.
+ if state.EnumMembers != nil {
+ if info, ok := state.EnumMembers[t.Lexeme]; ok {
+ var payload []MShellObject
+ if info.Arity > 0 {
+ if len(*stack) < info.Arity {
+ return state.FailWithMessage(fmt.Sprintf("%d:%d: enum constructor '%s' needs %d payload value(s) on the stack.\n", t.Line, t.Column, t.Lexeme, info.Arity))
+ }
+ payload = make([]MShellObject, info.Arity)
+ for i := info.Arity - 1; i >= 0; i-- {
+ payload[i], _ = stack.Pop()
+ }
+ }
+ stack.Push(&MShellEnum{EnumName: info.EnumName, Member: t.Lexeme, MemberIndex: info.Ordinal, Payload: payload})
+ return SimpleSuccess()
+ }
+ }
+
if t.Lexeme == "stack" {
// Print current stack
fmt.Fprint(os.Stderr, stack.String())
@@ -9303,7 +9474,10 @@ func (state *EvalState) evaluateToken(t Token, stack *MShellStack, context Execu
return state.FailWithMessage(fmt.Sprintf("%d:%d: Cannot do 'toJson' operation on an empty stack.\n", t.Line, t.Column))
}
- jsonStr := obj1.ToJson()
+ jsonStr, cycled := renderValueDetect(obj1, flavorJson)
+ if cycled {
+ return state.FailWithMessage(fmt.Sprintf("%d:%d: Cannot convert a cyclic value (a container that contains itself) to JSON.\n", t.Line, t.Column))
+ }
stack.Push(MShellString{jsonStr})
} else if t.Lexeme == "typeof" {
obj1, err := stack.Pop()
@@ -9682,7 +9856,7 @@ func (state *EvalState) evaluateToken(t Token, stack *MShellStack, context Execu
floatsSeen := make(map[float64]any)
dateTimesSeen := make(map[time.Time]any)
- for i, item := range listObj.Items {
+ for _, item := range listObj.Items {
switch itemTyped := item.(type) {
case MShellString:
strItem := itemTyped
@@ -9723,7 +9897,23 @@ func (state *EvalState) evaluateToken(t Token, stack *MShellStack, context Execu
stringsSeen[literalItem.LiteralText] = nil
}
default:
- return state.FailWithMessage(fmt.Sprintf("%d:%d: Cannot remove duplicates from a list with a %s at index %d (%s).\n", t.Line, t.Column, item.TypeName(), i, item.DebugString()))
+ // Any value without a fast hash path (enum, list,
+ // dict, bool, bytes, ...) is deduplicated by
+ // structural equality against the values kept so
+ // far. O(n^2) for these, but it lets `uniq` accept
+ // every value type, matching its `([t] -- [t])`
+ // signature so the type checker never accepts a
+ // `uniq` that fails at runtime.
+ seen := false
+ for _, kept := range newList.Items {
+ if eq, _ := item.Equals(kept); eq {
+ seen = true
+ break
+ }
+ }
+ if !seen {
+ newList.Items = append(newList.Items, item)
+ }
}
}
@@ -11246,6 +11436,13 @@ func (state *EvalState) evaluateToken(t Token, stack *MShellStack, context Execu
stack.Push(MShellLiteral{t.Lexeme})
}
+ } else if t.Type == UNDERSCORE {
+ // A lone `_` is the pattern wildcard; in a list it is the
+ // literal argv word "_", and nowhere else does it have meaning.
+ if callStackItem.CallStackType != CALLSTACKLIST {
+ return state.FailWithMessage(fmt.Sprintf("%d:%d: '_' is reserved as the match wildcard; use \"_\" for a literal underscore string.\n", t.Line, t.Column))
+ }
+ stack.Push(MShellLiteral{"_"})
} else if t.Type == ASTERISK {
obj1, err := stack.Pop()
if err != nil {
@@ -12319,7 +12516,11 @@ func (state *EvalState) evaluateToken(t Token, stack *MShellStack, context Execu
return state.FailWithMessage(fmt.Sprintf("%d:%d: Cannot convert an empty stack to a string.\n", t.Line, t.Column))
}
- stack.Push(MShellString{obj.ToString()})
+ strVal, cycled := renderValueDetect(obj, flavorStr)
+ if cycled {
+ return state.FailWithMessage(fmt.Sprintf("%d:%d: Cannot convert a cyclic value (a container that contains itself) to a string.\n", t.Line, t.Column))
+ }
+ stack.Push(MShellString{strVal})
} else if t.Type == INDEXER { // Token Type
obj1, err := stack.Pop()
if err != nil {
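The duplicate-definition check added in this file scans several definition lists in order and reports the first name seen twice, together with both positions. The following is a self-contained sketch of that technique with simplified stand-in types. `pos`, `def`, and `findDuplicate` are hypothetical names for illustration, and the file paths in `main` are made up.

```go
package main

import "fmt"

// pos is a simplified source position; file may be empty (e.g. stdin).
type pos struct {
	file      string
	line, col int
}

func (p pos) String() string {
	if p.file != "" {
		return fmt.Sprintf("%s:%d:%d", p.file, p.line, p.col)
	}
	return fmt.Sprintf("%d:%d", p.line, p.col)
}

// def is a definition reduced to its name and where the name was written.
type def struct {
	name string
	at   pos
}

// findDuplicate walks the lists in order and returns an error for the first
// name that was already defined, citing the earlier position.
func findDuplicate(lists ...[]def) error {
	seen := make(map[string]pos)
	for _, defs := range lists {
		for _, d := range defs {
			if prev, ok := seen[d.name]; ok {
				return fmt.Errorf("%s: Duplicate definition '%s'; already defined at %s.", d.at, d.name, prev)
			}
			seen[d.name] = d.at
		}
	}
	return nil
}

func main() {
	std := []def{{name: "id", at: pos{"lib/std.msh", 62, 5}}}
	script := []def{{name: "id", at: pos{"script.msh", 3, 5}}}
	fmt.Println(findDuplicate(std, script))
}
```

Passing the lists in lookup order matters: since lookup is first-match-wins, the position recorded in `seen` is the definition that would actually run, so the error points the author at the one that shadows theirs.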
diff --git a/mshell/Lexer.go b/mshell/Lexer.go
index 7a641f4b..66eee988 100644
--- a/mshell/Lexer.go
+++ b/mshell/Lexer.go
@@ -109,6 +109,8 @@ const (
TRY
FAIL_KEYWORD
PURE
+ ENUM
+ UNDERSCORE // a lone `_`: the match/binding wildcard, reserved as a name
)
func (t TokenType) String() string {
@@ -283,6 +285,10 @@ func (t TokenType) String() string {
return "AS"
case TYPE:
return "TYPE"
+ case ENUM:
+ return "ENUM"
+ case UNDERSCORE:
+ return "UNDERSCORE"
case TRY:
return "TRY"
case FAIL_KEYWORD:
@@ -394,6 +400,7 @@ func (l *Lexer) makeToken(tokenType TokenType) Token {
Start: l.start,
Lexeme: lexeme,
Type: tokenType,
+ TokenFile: l.tokenFile,
}
}
@@ -556,6 +563,14 @@ func (l *Lexer) literalOrKeywordType() TokenType {
}
return l.checkKeyword(2, "se", ELSE)
case 'n':
+ if l.curLen() > 2 {
+ switch l.input[l.start+2] {
+ case 'd':
+ return l.checkKeyword(3, "", END)
+ case 'u':
+ return l.checkKeyword(3, "m", ENUM)
+ }
+ }
return l.checkKeyword(2, "d", END)
}
}
@@ -641,6 +656,13 @@ func (l *Lexer) literalOrKeywordType() TokenType {
return VARSTORE
}
+ // A lone `_` is the wildcard token, not an identifier — this is what
+ // reserves it as a name (it can't be an enum member, def, etc.) and keeps
+ // the checker and runtime from disagreeing about what `_` means.
+ if l.current-l.start == 1 && l.input[l.start] == '_' {
+ return UNDERSCORE
+ }
+
return LITERAL
}
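The lexer change has two effects: `enum` becomes a keyword alongside `end`, and a lone `_` becomes its own token rather than an identifier. The sketch below shows the resulting classification. It swaps in a plain string `switch` for the lexer's hand-rolled character dispatch, and `tokenKind` and `classify` are hypothetical names.

```go
package main

import "fmt"

type tokenKind int

const (
	kindLiteral tokenKind = iota
	kindEnd
	kindEnum
	kindUnderscore
)

// classify decides what a whole word lexes as. Only an exact match is a
// keyword or wildcard; longer words that merely start the same way stay
// ordinary literals.
func classify(word string) tokenKind {
	switch word {
	case "end":
		return kindEnd
	case "enum":
		return kindEnum
	case "_":
		return kindUnderscore
	}
	return kindLiteral
}

func main() {
	for _, w := range []string{"end", "enum", "_", "_x", "ending"} {
		fmt.Printf("%q -> %d\n", w, classify(w))
	}
}
```

Reserving `_` at the lexer level means later stages only need to test the token kind, which is how the diff keeps the checker and the runtime from disagreeing about what `_` means.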
diff --git a/mshell/MShellObject.go b/mshell/MShellObject.go
index f68309f6..54ed3d17 100644
--- a/mshell/MShellObject.go
+++ b/mshell/MShellObject.go
@@ -7,6 +7,7 @@ import (
"fmt"
"golang.org/x/net/html"
"os"
+ "reflect"
"regexp"
"slices"
"sort"
@@ -160,7 +161,7 @@ func (b MShellBinary) Equals(other MShellObject) (bool, error) {
return true, nil
}
}
- return false, fmt.Errorf("Cannot compare Binary with %s.\n", other.TypeName())
+ return false, nil
}
func (b MShellBinary) CastString() (string, error) {
@@ -199,10 +200,7 @@ func (m Maybe) CommandLine() string {
// This is meant for things like error messages, should be limited in length to 30 chars or so.
func (m Maybe) DebugString() string {
- if m.obj == nil {
- return "None"
- }
- return fmt.Sprintf("Maybe(%s)", m.obj.DebugString())
+ return renderValue(m, flavorDebug)
}
func (m Maybe) Index(index int) (MShellObject, error) {
return nil, fmt.Errorf("Cannot index into a Maybe.\n")
@@ -221,17 +219,11 @@ func (m Maybe) Slice(startInc int, endExc int) (MShellObject, error) {
}
func (m Maybe) ToJson() string {
- if m.obj == nil {
- return "null"
- }
- return m.obj.ToJson()
+ return renderValue(m, flavorJson)
}
func (m Maybe) ToString() string {
- if m.obj == nil {
- return "None"
- }
- return fmt.Sprintf("Just(%s)", m.obj.ToString())
+ return renderValue(m, flavorStr)
}
func (m Maybe) IndexErrStr() string {
@@ -243,21 +235,7 @@ func (m Maybe) Concat(other MShellObject) (MShellObject, error) {
}
func (m Maybe) Equals(other MShellObject) (bool, error) {
- otherMaybe, ok := other.(Maybe)
- if !ok {
- return false, nil
- }
-
- if m.obj == nil && otherMaybe.obj == nil {
- return true, nil
- }
-
- if m.obj == nil || otherMaybe.obj == nil {
- return false, nil
- }
-
- equal, err := m.obj.Equals(otherMaybe.obj)
- return equal, err
+ return equalsIter(m, other)
}
func (m Maybe) CastString() (string, error) {
@@ -341,6 +319,71 @@ func (n MShellNull) CastString() (string, error) {
// }}}
+// Enum {{{
+
+// MShellEnum is a value of a user-declared `enum` (a generative tagged sum
+// type): the enum's declared name, the chosen member, and the member's payload
+// values (nil for a nullary member). Member names are unique across enums, so
+// the member identifies the value; the enum name rides along for diagnostics
+// and `match`.
+type MShellEnum struct {
+ EnumName string
+ Member string
+ // MemberIndex is the member's 0-based position in its enum declaration.
+ // Sorting orders enum values by this (declaration order) rather than by
+ // member name, so an ordered enum (`low | medium | high`) sorts in the
+ // author's intended order. Stamped at construction from the enum registry.
+ MemberIndex int
+ Payload []MShellObject
+}
+
+func (e *MShellEnum) TypeName() string { return e.EnumName }
+func (e *MShellEnum) IsCommandLineable() bool { return true }
+func (e *MShellEnum) IsNumeric() bool { return false }
+func (e *MShellEnum) FloatNumeric() float64 { return 0 }
+func (e *MShellEnum) CommandLine() string { return renderValue(e, flavorStr) }
+func (e *MShellEnum) DebugString() string { return renderValue(e, flavorStr) }
+
+func (e *MShellEnum) Index(index int) (MShellObject, error) {
+ return nil, fmt.Errorf("Cannot index into an enum.\n")
+}
+
+func (e *MShellEnum) SliceStart(startInclusive int) (MShellObject, error) {
+ return nil, fmt.Errorf("Cannot slice an enum.\n")
+}
+
+func (e *MShellEnum) SliceEnd(end int) (MShellObject, error) {
+ return nil, fmt.Errorf("Cannot slice an enum.\n")
+}
+
+func (e *MShellEnum) Slice(startInc int, endExc int) (MShellObject, error) {
+ return nil, fmt.Errorf("Cannot slice an enum.\n")
+}
+
+// ToJson uses serde's externally-tagged convention — the de-facto standard for
+// tagged unions in JSON: a nullary member is the bare member string; a member
+// with a single payload is `{"member": value}`; with several, `{"member":
+// [v0, v1, ...]}`. Rendering runs on renderValue's shared work stack, so an
+// arbitrarily deep value cannot overflow the call stack.
+func (e *MShellEnum) ToJson() string {
+ return renderValue(e, flavorJson)
+}
+
+func (e *MShellEnum) ToString() string { return renderValue(e, flavorStr) }
+func (e *MShellEnum) IndexErrStr() string { return "" }
+
+func (e *MShellEnum) Concat(other MShellObject) (MShellObject, error) {
+ return nil, fmt.Errorf("Cannot concatenate an enum.\n")
+}
+
+func (e *MShellEnum) Equals(other MShellObject) (bool, error) {
+ return equalsIter(e, other)
+}
+
+func (e *MShellEnum) CastString() (string, error) { return e.Member, nil }
+
+// }}}
+
// Date time {{{
type MShellDateTime struct {
@@ -463,16 +506,7 @@ func (*MShellDict) CommandLine() string {
// This is meant for things like error messages, should be limited in length to 30 chars or so.
func (d *MShellDict) DebugString() string {
- // TODO: implement this
-
- sb := strings.Builder{}
- sb.WriteString("Dictionary{")
- for key, value := range d.Items {
- sb.WriteString(fmt.Sprintf("%s: %s, ", key, value.DebugString()))
- }
- sb.WriteString("}")
- return sb.String()
-
+ return renderValue(d, flavorDebug)
}
func (*MShellDict) Index(index int) (MShellObject, error) {
return nil, fmt.Errorf("Cannot index into a dictionary.\n")
@@ -488,43 +522,7 @@ func (*MShellDict) Slice(startInc int, endExc int) (MShellObject, error) {
return nil, fmt.Errorf("Cannot slice a dictionary.\n")
}
func (d *MShellDict) ToJson() string {
- var sb strings.Builder
-
- if len(d.Items) == 0 {
- return "{}"
- }
-
- if len(d.Items) == 1 {
- for key, value := range d.Items {
- keyEnc, _ := json.Marshal(key)
- return fmt.Sprintf("{%s: %s}", string(keyEnc), value.ToJson())
- }
- }
-
- keys := make([]string, 0, len(d.Items))
- for key := range d.Items {
- keys = append(keys, key)
- }
- sort.Strings(keys)
-
- sb.WriteString("{")
-
- // Write the first key-value pair
- firstKey := keys[0]
- firstValue := d.Items[firstKey]
-
- firstKeyEnc, _ := json.Marshal(firstKey)
- sb.WriteString(fmt.Sprintf("%s: %s", string(firstKeyEnc), firstValue.ToJson()))
-
- for _, key := range keys[1:] {
- value := d.Items[key]
- keyEnc, _ := json.Marshal(key)
- sb.WriteString(fmt.Sprintf(", %s: %s", string(keyEnc), value.ToJson()))
- }
-
- sb.WriteString("}")
-
- return sb.String()
+ return renderValue(d, flavorJson)
}
func (d *MShellDict) ToString() string { // This is what is used with 'str' command
@@ -540,51 +538,7 @@ func (*MShellDict) Concat(other MShellObject) (MShellObject, error) {
}
func (thisDict *MShellDict) Equals(other MShellObject) (bool, error) {
- thisKeys := make([]string, 0, len(thisDict.Items))
- for key := range thisDict.Items {
- thisKeys = append(thisKeys, key)
- }
- sort.Strings(thisKeys)
-
- otherDict, ok := other.(*MShellDict)
- if !ok {
- return false, nil
- }
-
- otherKeys := make([]string, 0, len(otherDict.Items))
- for key := range otherDict.Items {
- otherKeys = append(otherKeys, key)
- }
- sort.Strings(otherKeys)
-
- if len(thisKeys) != len(otherKeys) {
- return false, nil
- }
-
- for i, key := range thisKeys {
- if key != otherKeys[i] {
- return false, nil
- }
- }
-
- for _, key := range thisKeys {
- thisValue := thisDict.Items[key]
- otherValue := otherDict.Items[key]
-
- if thisValue.TypeName() != otherValue.TypeName() {
- return false, nil
- }
-
- equal, err := thisValue.Equals(otherValue)
- if err != nil {
- return false, err
- }
- if !equal {
- return false, nil
- }
- }
-
- return true, nil
+ return equalsIter(thisDict, other)
}
// This is meant for completely unambiougous conversion to a string value.
@@ -802,46 +756,654 @@ func NewList(initLength int) *MShellList {
}
// Sort the list. Returns an error if any item cannot be cast to a string.
-func SortList(list *MShellList) (*MShellList, error) {
- stringsToSort := make([]string, len(list.Items))
- for i, item := range list.Items {
- str, err := item.CastString()
- if err != nil {
- return nil, fmt.Errorf("Cannot sort a list with a %s inside (%s).\n", item.TypeName(), item.DebugString())
+// valueTypeRank assigns each value kind a fixed slot in the cross-type sort
+// order, so a list mixing types still sorts totally and deterministically. The
+// exact sequence is arbitrary but stable; within a rank, compareValues uses the
+// value's natural order. Text kinds (str/path/literal) share a rank and compare
+// by content, matching structural equality.
+func valueTypeRank(obj MShellObject) int {
+ switch obj.(type) {
+ case MShellNull:
+ return 0
+ case MShellBool:
+ return 1
+ case MShellInt, MShellFloat:
+ return 2
+ case MShellString, MShellPath, MShellLiteral:
+ return 3
+ case *MShellDateTime:
+ return 4
+ case MShellBinary:
+ return 5
+ case Maybe, *Maybe:
+ return 6
+ case *MShellList:
+ return 7
+ case *MShellDict:
+ return 8
+ case *MShellEnum:
+ return 9
+ default:
+ return 10
+ }
+}
+
+func cmpInt(a, b int) int {
+ if a < b {
+ return -1
+ }
+ if a > b {
+ return 1
+ }
+ return 0
+}
+
+func cmpFloat(a, b float64) int {
+ if a < b {
+ return -1
+ }
+ if a > b {
+ return 1
+ }
+ return 0
+}
+
+// numericFloat returns an int/float value as a float64 for cross-type numeric
+// comparison. Only called for MShellInt / MShellFloat.
+func numericFloat(obj MShellObject) float64 {
+ switch v := obj.(type) {
+ case MShellInt:
+ return float64(v.Value)
+ case MShellFloat:
+ return v.Value
+ }
+ return 0
+}
+
+// textContent returns the underlying string of a text-kind value
+// (str / path / literal). Only called for those types.
+func textContent(obj MShellObject) string {
+ switch v := obj.(type) {
+ case MShellString:
+ return v.Content
+ case MShellPath:
+ return v.Path
+ case MShellLiteral:
+ return v.LiteralText
+ }
+ return ""
+}
+
+func asMaybe(obj MShellObject) (Maybe, bool) {
+ switch v := obj.(type) {
+ case Maybe:
+ return v, true
+ case *Maybe:
+ return *v, true
+ }
+ return Maybe{}, false
+}
+
+func sortedDictKeys(m map[string]MShellObject) []string {
+ keys := make([]string, 0, len(m))
+ for k := range m {
+ keys = append(keys, k)
+ }
+ sort.Strings(keys)
+ return keys
+}
+
+// isRefKind reports whether obj's dynamic type is a pointer, i.e. a value with
+// heap identity. Only these kinds can form shared substructure or reference
+// cycles, and only these are safe in interface comparisons and as map keys: a
+// value kind may wrap a non-comparable type (MShellBinary, a []byte), and
+// comparing or hashing one panics at runtime. Checking the dynamic kind rather
+// than enumerating types covers a newly added pointer kind automatically.
+func isRefKind(obj MShellObject) bool {
+ return obj != nil && reflect.TypeOf(obj).Kind() == reflect.Pointer
+}
+
+// sameRef reports whether a and b are the identical heap object (a value built
+// as `@t @t node` reuses one subtree twice). A pointer-identical pair is equal
+// by definition, so equality and ordering walks skip it instead of expanding
+// it — without this, walking a value with n levels of sharing costs 2^n.
+// Interface equality is safe here: if b's dynamic type differs from a's the
+// comparison is false without inspecting values, and if it matches, isRefKind
+// guarantees it is a comparable pointer type.
+func sameRef(a, b MShellObject) bool {
+ return isRefKind(a) && a == b
+}
+
+// dagGuard bounds a comparison walk over values with shared substructure that
+// sameRef alone cannot catch: whenever the two operands are not the same
+// pointer (a value compared against an independently built copy, or two
+// distinct subtrees each shared internally), repeated substructure produces
+// repeated *pairs*, every one re-expands, and the walk goes exponential. The
+// guard counts pops; once a walk runs long enough to suggest blowup, it
+// memoizes the pointer pairs it has already expanded and skips repeats. This
+// is the single mechanism for the whole blowup class: duplicate expansion
+// below the threshold is capped by the threshold itself, and past it every
+// repeated pair memo-hits.
+//
+// Skipping a repeated pair is sound in a LIFO walk: the first occurrence's
+// entire expansion resolves before any later duplicate (which sat lower in the
+// stack) pops, and a mismatch anywhere returns from the walk immediately — so
+// if a duplicate pops at all, its subtree already compared equal.
+//
+// Ordinary comparisons never allocate: below the step threshold the guard is
+// one integer increment. Past it, the memo grows without bound — see skip for
+// why an unbounded memo is the correct trade (a bounded one turns "assumption
+// exceeded" into exponential time).
+type dagGuard struct {
+ steps int
+ memo map[refPair]bool
+}
+
+type refPair struct{ a, b MShellObject }
+
+const dagStepThreshold = 1 << 19
+
+// skip reports whether this pair was already expanded earlier in the walk.
+// Call once per popped pair; it records the pair (past the threshold) so
+// later duplicates skip.
+//
+// The memo is deliberately UNBOUNDED. Every revisited pointer pair memo-hits,
+// which makes a comparison polynomial in actual heap nodes for any sharing
+// pattern — self-doubling, alternating, cross-parent diamonds, any container
+// mix, any depth. Earlier versions capped the memo to bound memory and then
+// patched the resulting exponential cliffs case by case (generational
+// eviction, per-container dedup); any bounded memo loses to a working set
+// larger than its bound (measured, not theorized), so no cap. Memory tracks
+// pairs actually walked: it activates only past the step threshold, so
+// ordinary comparisons never allocate, and a comparison large enough to build
+// a big memo already holds operands larger than the memo itself.
+func (g *dagGuard) skip(a, b MShellObject) bool {
+ g.steps++
+ if g.steps < dagStepThreshold {
+ return false
+ }
+ // Only pointer kinds get keys: repeated pairs of anything else cannot
+ // cause blowup, and only pointers are guaranteed comparable as map keys.
+ if !isRefKind(a) || !isRefKind(b) {
+ return false
+ }
+ key := refPair{a, b}
+ if g.memo == nil {
+ g.memo = make(map[refPair]bool, 1024)
+ }
+ if g.memo[key] {
+ return true
+ }
+ g.memo[key] = true
+ return false
+}
+
+// renderFlavor selects which of a value's three textual forms renderValue
+// emits: flavorStr is ToString (the `str` form), flavorDebug is DebugString
+// (stack dumps, list display), flavorJson is ToJson. Containers pick their
+// children's flavor the same way the per-type methods always did: a list
+// renders children as DebugString, a dict's `str` form is its JSON form, an
+// enum renders payloads with ToString, and Maybe keeps its own flavor.
+type renderFlavor uint8
+
+const (
+ flavorStr renderFlavor = iota
+ flavorDebug
+ flavorJson
+)
+
+type renderTask struct {
+ lit string
+ obj MShellObject
+ flavor renderFlavor
+ isLit bool
+ // isExit marks the sentinel popped after a container's children have
+ // rendered; it removes the container from the on-path cycle set.
+ isExit bool
+}
+
+func renderLit(s string) renderTask { return renderTask{lit: s, isLit: true} }
+
+// renderJoin builds the task sequence `open item0 sep item1 sep ... close`,
+// rendering each item in the given flavor.
+func renderJoin(open, sep, close string, items []MShellObject, flavor renderFlavor) []renderTask {
+ seq := make([]renderTask, 0, len(items)*2+2)
+ if open != "" {
+ seq = append(seq, renderLit(open))
+ }
+ for i, it := range items {
+ if i > 0 {
+ seq = append(seq, renderLit(sep))
+ }
+ seq = append(seq, renderTask{obj: it, flavor: flavor})
+ }
+ if close != "" {
+ seq = append(seq, renderLit(close))
+ }
+ return seq
+}
+
+// renderValue renders a value in the requested flavor. It is total: a cyclic
+// value renders with a `` marker at the back-reference, which keeps
+// internal rendering (error messages, stack dumps) from hanging. User-facing
+// operations (`str`, `toJson`) call renderValueDetect instead and report a
+// cyclic value as an error — mshell is strict, so a cycle is always the
+// degenerate result of appending a container into itself, not a value with a
+// meaningful rendering.
+func renderValue(root MShellObject, flavor renderFlavor) string {
+ s, _ := renderValueDetect(root, flavor)
+ return s
+}
+
+// renderValueDetect renders a value in the requested flavor with one explicit
+// work stack instead of method recursion, expanding every container kind —
+// enum, Maybe, list, dict, pipe — inline. Arbitrarily deep values therefore
+// cannot overflow the call stack even when kinds alternate (enum→Maybe→enum,
+// ...), which per-type iterative renderers could not guarantee: each one
+// delegated other kinds to the child's own recursive method. Leaf kinds
+// (scalars, grids, quotations) still render via their own methods; their
+// nesting depth is bounded by their own structure.
+//
+// Containers currently being expanded are tracked as an on-path set; reaching
+// one again is a true reference cycle (a DAG merely revisits a finished
+// pointer, which is fine), so the walk emits `` instead of descending
+// and reports cycled=true.
+func renderValueDetect(root MShellObject, flavor renderFlavor) (string, bool) {
+ var sb strings.Builder
+ cycled := false
+ var onPath map[MShellObject]bool
+ stack := []renderTask{{obj: root, flavor: flavor}}
+ // push schedules seq to pop in order (reversed onto the LIFO stack).
+ push := func(seq []renderTask) {
+ for i := len(seq) - 1; i >= 0; i-- {
+ stack = append(stack, seq[i])
}
- stringsToSort[i] = str
}
+ // enter marks obj as on the current path and schedules its removal
+ // after seq (the container's children) has fully rendered. Only pointer
+ // kinds are tracked: value kinds are copied and cannot be revisited by
+ // identity, so they cannot sit on a reference cycle.
+ enter := func(obj MShellObject, seq []renderTask) []renderTask {
+ if !isRefKind(obj) {
+ return seq
+ }
+ if onPath == nil {
+ onPath = make(map[MShellObject]bool, 8)
+ }
+ onPath[obj] = true
+ return append(seq, renderTask{obj: obj, isExit: true})
+ }
+ for len(stack) > 0 {
+ t := stack[len(stack)-1]
+ stack = stack[:len(stack)-1]
+ if t.isLit {
+ sb.WriteString(t.lit)
+ continue
+ }
+ if t.isExit {
+ delete(onPath, t.obj)
+ continue
+ }
+ // Only pointer kinds are ever on the path; the guard also keeps
+ // unhashable dynamic types (MShellBinary, a []byte) away from the
+ // map lookup, which would panic even on a read.
+ if isRefKind(t.obj) && onPath[t.obj] {
+ sb.WriteString("")
+ cycled = true
+ continue
+ }
+ if m, ok := asMaybe(t.obj); ok {
+ switch {
+ case m.IsNone() && t.flavor == flavorJson:
+ sb.WriteString("null")
+ case m.IsNone():
+ sb.WriteString("None")
+ case t.flavor == flavorJson:
+ push(enter(t.obj, []renderTask{{obj: m.obj, flavor: flavorJson}}))
+ case t.flavor == flavorDebug:
+ push(enter(t.obj, []renderTask{renderLit("Maybe("), {obj: m.obj, flavor: flavorDebug}, renderLit(")")}))
+ default:
+ push(enter(t.obj, []renderTask{renderLit("Just("), {obj: m.obj, flavor: flavorStr}, renderLit(")")}))
+ }
+ continue
+ }
+ switch v := t.obj.(type) {
+ case *MShellEnum:
+ if t.flavor == flavorJson {
+ // serde's externally-tagged convention: a nullary member is
+ // the bare member string, one payload is {"member": value},
+ // several are {"member": [v0, v1, ...]}.
+ if len(v.Payload) == 0 {
+ fmt.Fprintf(&sb, "%q", v.Member)
+ continue
+ }
+ seq := make([]renderTask, 0, len(v.Payload)*2+4)
+ seq = append(seq, renderLit(fmt.Sprintf("{%q: ", v.Member)))
+ if len(v.Payload) == 1 {
+ seq = append(seq, renderTask{obj: v.Payload[0], flavor: flavorJson})
+ } else {
+ seq = append(seq, renderJoin("[", ", ", "]", v.Payload, flavorJson)...)
+ }
+ seq = append(seq, renderLit("}"))
+ push(enter(t.obj, seq))
+ continue
+ }
+ // `member` (nullary) or `member(p0 p1 ...)`, payloads as ToString.
+ if len(v.Payload) == 0 {
+ sb.WriteString(v.Member)
+ continue
+ }
+ push(enter(t.obj, renderJoin(v.Member+"(", " ", ")", v.Payload, flavorStr)))
+ case *MShellList:
+ if t.flavor == flavorJson {
+ push(enter(t.obj, renderJoin("[", ", ", "]", v.Items, flavorJson)))
+ } else {
+ push(enter(t.obj, renderJoin("[", " ", "]", v.Items, flavorDebug)))
+ }
+ case *MShellPipe:
+ if t.flavor == flavorJson {
+ push(enter(t.obj, renderJoin("[", ", ", "]", v.List.Items, flavorJson)))
+ } else {
+ push(enter(t.obj, renderJoin("", " | ", "", v.List.Items, flavorDebug)))
+ }
+ case *MShellDict:
+ keys := sortedDictKeys(v.Items)
+ if t.flavor == flavorDebug {
+ seq := make([]renderTask, 0, len(keys)*3+2)
+ seq = append(seq, renderLit("Dictionary{"))
+ for _, k := range keys {
+ seq = append(seq, renderLit(k+": "), renderTask{obj: v.Items[k], flavor: flavorDebug}, renderLit(", "))
+ }
+ seq = append(seq, renderLit("}"))
+ push(enter(t.obj, seq))
+ continue
+ }
+ // The `str` form of a dict is its JSON form.
+ if len(keys) == 0 {
+ sb.WriteString("{}")
+ continue
+ }
+ seq := make([]renderTask, 0, len(keys)*2+2)
+ seq = append(seq, renderLit("{"))
+ for i, k := range keys {
+ keyEnc, _ := json.Marshal(k)
+ if i > 0 {
+ seq = append(seq, renderLit(", "))
+ }
+ seq = append(seq, renderLit(string(keyEnc)+": "), renderTask{obj: v.Items[k], flavor: flavorJson})
+ }
+ seq = append(seq, renderLit("}"))
+ push(enter(t.obj, seq))
+ default:
+ switch t.flavor {
+ case flavorDebug:
+ sb.WriteString(t.obj.DebugString())
+ case flavorJson:
+ sb.WriteString(t.obj.ToJson())
+ default:
+ sb.WriteString(t.obj.ToString())
+ }
+ }
+ }
+ return sb.String(), cycled
+}
- // Sort the strings
- sort.Strings(stringsToSort)
+// eqPair is one pending comparison on equalsIter's pair stack. equalsIter is
+// structural equality over any two values: it expands every container kind
+// (enum, Maybe, list, dict, pipe) inline on that stack, so deep values cannot
+// overflow the call stack even when kinds alternate. Pointer-identical pairs
+// are skipped (equal by definition), and past a step threshold
+// already-expanded pairs are memoized (see dagGuard), so shared substructure
+// cannot blow up exponentially. Leaf kinds compare via their own Equals.
+type eqPair struct{ a, b MShellObject }
- // Create a new list and add the sorted strings to it
- newList := NewList(0)
- for _, str := range stringsToSort {
- newList.Items = append(newList.Items, MShellString{str})
+// pushPairs pushes element-wise comparison pairs onto the walk stack.
+// Duplicate pairs from shared substructure are not filtered here; the
+// dagGuard memo handles them (see dagGuard).
+func pushPairs(stack []eqPair, as, bs []MShellObject) []eqPair {
+ for i := range as {
+ stack = append(stack, eqPair{a: as[i], b: bs[i]})
}
+ return stack
+}
+
+func equalsIter(a, b MShellObject) (bool, error) {
+ var guard dagGuard
+ stack := []eqPair{{a: a, b: b}}
+ for len(stack) > 0 {
+ p := stack[len(stack)-1]
+ stack = stack[:len(stack)-1]
+ if sameRef(p.a, p.b) || guard.skip(p.a, p.b) {
+ continue
+ }
+ if am, aok := asMaybe(p.a); aok {
+ bm, bok := asMaybe(p.b)
+ if !bok || am.IsNone() != bm.IsNone() {
+ return false, nil
+ }
+ if !am.IsNone() {
+ stack = append(stack, eqPair{a: am.obj, b: bm.obj})
+ }
+ continue
+ }
+ switch av := p.a.(type) {
+ case *MShellEnum:
+ bv, ok := p.b.(*MShellEnum)
+ if !ok || av.EnumName != bv.EnumName || av.Member != bv.Member || len(av.Payload) != len(bv.Payload) {
+ return false, nil
+ }
+ stack = pushPairs(stack, av.Payload, bv.Payload)
+ case *MShellList:
+ bv, ok := p.b.(*MShellList)
+ if !ok || len(av.Items) != len(bv.Items) {
+ return false, nil
+ }
+ stack = pushPairs(stack, av.Items, bv.Items)
+ case *MShellPipe:
+ bv, ok := p.b.(*MShellPipe)
+ if !ok || len(av.List.Items) != len(bv.List.Items) {
+ return false, nil
+ }
+ stack = pushPairs(stack, av.List.Items, bv.List.Items)
+ case *MShellDict:
+ bv, ok := p.b.(*MShellDict)
+ if !ok || len(av.Items) != len(bv.Items) {
+ return false, nil
+ }
+ for key, aval := range av.Items {
+ bval, ok := bv.Items[key]
+ if !ok {
+ return false, nil
+ }
+ stack = append(stack, eqPair{a: aval, b: bval})
+ }
+ default:
+ eq, err := p.a.Equals(p.b)
+ if err != nil || !eq {
+ return eq, err
+ }
+ }
+ }
+ return true, nil
+}
+
+// compareValues returns -1, 0, or 1, giving a total order over every value
+// type. Different kinds are ordered by a fixed type rank (valueTypeRank); within
+// a kind the natural order is used (numbers numerically with int/float
+// interleaved, text lexically, dates chronologically, bytes bytewise).
+// Structured values compare lexicographically: lists positionally (shorter
+// prefix first), dicts by sorted key then value, enums by name then declaration
+// order then payloads. Values that are Equals compare 0, but not the reverse:
+// an int and a float of the same numeric value compare 0 (both go through
+// float64) while Equals reports them unequal, and unorderable kinds
+// (quotation, pipe, grid, ...) share a rank and always compare 0, so a stable
+// sort preserves their original order while Equals still distinguishes them
+// (identity for quotations, structural for pipes, cell-wise for grids).
+//
+// The comparison is driven by an explicit work stack rather than recursion, so
+// arbitrarily deep values (e.g. a long `node(node(...))` enum chain) cannot
+// overflow the call stack. Each task is either a pair of values to compare or a
+// precomputed literal result (used for length tiebreaks and dict key / enum
+// name comparisons). Pending tasks pop in lexicographic order; the first
+// non-zero result short-circuits. Children of a compound value are pushed on top
+// of that value's own length-tiebreak, so the tiebreak is only reached when the
+// whole prefix compared equal.
+func compareValues(a, b MShellObject) int {
+ type task struct {
+ a, b MShellObject
+ lit int
+ isLit bool
+ }
+ var guard dagGuard
+ stack := []task{{a: a, b: b}}
+ for len(stack) > 0 {
+ t := stack[len(stack)-1]
+ stack = stack[:len(stack)-1]
+ if t.isLit {
+ if t.lit != 0 {
+ return t.lit
+ }
+ continue
+ }
+ // Shared substructure: a pointer-identical pair compares 0 by
+ // definition, and a pair this walk already expanded proved 0 (any
+ // non-zero would have returned; see dagGuard). Skipping both keeps
+ // DAG-shaped values linear instead of 2^n.
+ if sameRef(t.a, t.b) || guard.skip(t.a, t.b) {
+ continue
+ }
+ ra, rb := valueTypeRank(t.a), valueTypeRank(t.b)
+ if ra != rb {
+ return cmpInt(ra, rb)
+ }
+ switch av := t.a.(type) {
+ case MShellNull:
+ // Two nulls are equal; move to the next task.
+ case MShellBool:
+ bv := t.b.(MShellBool)
+ if av.Value != bv.Value {
+ if !av.Value { // false < true
+ return -1
+ }
+ return 1
+ }
+ case MShellInt:
+ if bv, ok := t.b.(MShellInt); ok {
+ if c := cmpInt(av.Value, bv.Value); c != 0 {
+ return c
+ }
+ } else if c := cmpFloat(numericFloat(t.a), numericFloat(t.b)); c != 0 {
+ return c
+ }
+ case MShellFloat:
+ if c := cmpFloat(numericFloat(t.a), numericFloat(t.b)); c != 0 {
+ return c
+ }
+ case MShellString, MShellPath, MShellLiteral:
+ if c := strings.Compare(textContent(t.a), textContent(t.b)); c != 0 {
+ return c
+ }
+ case *MShellDateTime:
+ bt := t.b.(*MShellDateTime).Time
+ if av.Time.Before(bt) {
+ return -1
+ }
+ if av.Time.After(bt) {
+ return 1
+ }
+ case MShellBinary:
+ if c := bytes.Compare(av, t.b.(MShellBinary)); c != 0 {
+ return c
+ }
+ case Maybe, *Maybe:
+ am, _ := asMaybe(t.a)
+ bm, _ := asMaybe(t.b)
+ an, bn := am.IsNone(), bm.IsNone()
+ if an != bn {
+ if an { // none < just
+ return -1
+ }
+ return 1
+ }
+ if !an { // both `just`: compare payloads
+ stack = append(stack, task{a: am.obj, b: bm.obj})
+ }
+ case *MShellList:
+ bl := t.b.(*MShellList)
+ n := min(len(av.Items), len(bl.Items))
+ stack = append(stack, task{lit: cmpInt(len(av.Items), len(bl.Items)), isLit: true})
+ for i := n - 1; i >= 0; i-- {
+ stack = append(stack, task{a: av.Items[i], b: bl.Items[i]})
+ }
+ case *MShellDict:
+ bd := t.b.(*MShellDict)
+ ak := sortedDictKeys(av.Items)
+ bk := sortedDictKeys(bd.Items)
+ n := min(len(ak), len(bk))
+ stack = append(stack, task{lit: cmpInt(len(ak), len(bk)), isLit: true})
+ for i := n - 1; i >= 0; i-- {
+ // Pushed so `key compare` pops before its `value compare`.
+ stack = append(stack, task{a: av.Items[ak[i]], b: bd.Items[bk[i]]})
+ stack = append(stack, task{lit: strings.Compare(ak[i], bk[i]), isLit: true})
+ }
+ case *MShellEnum:
+ be := t.b.(*MShellEnum)
+ n := min(len(av.Payload), len(be.Payload))
+ stack = append(stack, task{lit: cmpInt(len(av.Payload), len(be.Payload)), isLit: true})
+ for i := n - 1; i >= 0; i-- {
+ stack = append(stack, task{a: av.Payload[i], b: be.Payload[i]})
+ }
+ // Name and member (declaration order) compare before any payload.
+ stack = append(stack, task{lit: cmpInt(av.MemberIndex, be.MemberIndex), isLit: true})
+ stack = append(stack, task{lit: strings.Compare(av.EnumName, be.EnumName), isLit: true})
+ default:
+ // Unorderable kinds (quotation, pipe, grid, ...) share a rank and
+ // compare equal, so a stable sort leaves them in their original
+ // relative order.
+ }
+ }
+ return 0
+}
+
+// SortList returns a new list with the same elements sorted by the total order
+// compareValues defines. Element identity and type are preserved (a list of
+// ints stays ints, enum payloads are kept) — sorting only reorders.
+func SortList(list *MShellList) (*MShellList, error) {
+ newItems := make([]MShellObject, len(list.Items))
+ copy(newItems, list.Items)
+ sort.SliceStable(newItems, func(i, j int) bool {
+ return compareValues(newItems[i], newItems[j]) < 0
+ })
+ newList := NewList(0)
+ newList.Items = newItems
CopyListParams(list, newList)
return newList, nil
}
-// Sort the list. Returns an error if any item cannot be cast to a string.
+// SortListFunc sorts by a string key (each element's CastString) using the given
+// string comparer — used for version sort. Original elements are preserved in
+// the result. Returns an error if any element cannot be cast to a string.
func SortListFunc(list *MShellList, cmp func(a string, b string) int) (*MShellList, error) {
- stringsToSort := make([]string, len(list.Items))
+ type keyed struct {
+ key string
+ obj MShellObject
+ }
+ items := make([]keyed, len(list.Items))
for i, item := range list.Items {
str, err := item.CastString()
if err != nil {
return nil, fmt.Errorf("Cannot sort a list with a %s inside (%s).\n", item.TypeName(), item.DebugString())
}
- stringsToSort[i] = str
+ items[i] = keyed{key: str, obj: item}
}
- // Sort the strings to function
- slices.SortFunc(stringsToSort, cmp)
+ slices.SortStableFunc(items, func(a, b keyed) int {
+ return cmp(a.key, b.key)
+ })
- // Create a new list and add the sorted strings to it
newList := NewList(0)
- for _, str := range stringsToSort {
- newList.Items = append(newList.Items, MShellString{str})
+ for _, it := range items {
+ newList.Items = append(newList.Items, it.obj)
}
CopyListParams(list, newList)
return newList, nil
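Review note on the hunk above: the work-stack technique that compareValues describes (value pairs interleaved with precomputed literal results, with a compound value's length tiebreak pushed beneath its children) can be sketched in isolation. This is a minimal illustration under stated assumptions, not the project's code: `Value`, `Int`, `List`, and `Compare` are hypothetical stand-ins for MShellObject and compareValues, and only two kinds (int leaf, list) are modelled.

```go
package main

import "fmt"

// Value is a hypothetical stand-in for MShellObject: an int leaf or a list.
type Value struct {
	IsList bool
	Int    int
	Items  []Value
}

func Int(n int) Value { return Value{Int: n} }

func List(items ...Value) Value { return Value{IsList: true, Items: items} }

func cmpInt(a, b int) int {
	switch {
	case a < b:
		return -1
	case a > b:
		return 1
	}
	return 0
}

// task is either a pair of values to compare or a precomputed result.
type task struct {
	a, b  Value
	lit   int
	isLit bool
}

// Compare orders ints before lists, ints numerically, and lists
// lexicographically, using an explicit stack instead of recursion.
func Compare(a, b Value) int {
	stack := []task{{a: a, b: b}}
	for len(stack) > 0 {
		t := stack[len(stack)-1]
		stack = stack[:len(stack)-1]
		if t.isLit {
			if t.lit != 0 {
				return t.lit // first non-zero result short-circuits
			}
			continue
		}
		if t.a.IsList != t.b.IsList {
			if t.b.IsList {
				return -1 // int ranks before list
			}
			return 1
		}
		if !t.a.IsList {
			if c := cmpInt(t.a.Int, t.b.Int); c != 0 {
				return c
			}
			continue
		}
		n := len(t.a.Items)
		if len(t.b.Items) < n {
			n = len(t.b.Items)
		}
		// The length tiebreak goes on first so it pops only after the
		// whole common prefix compared equal.
		stack = append(stack, task{lit: cmpInt(len(t.a.Items), len(t.b.Items)), isLit: true})
		for i := n - 1; i >= 0; i-- {
			stack = append(stack, task{a: t.a.Items[i], b: t.b.Items[i]})
		}
	}
	return 0
}

func main() {
	fmt.Println(Compare(List(Int(1), Int(2)), List(Int(1), Int(2), Int(3))))
	fmt.Println(Compare(List(Int(1), List(Int(5))), List(Int(1), List(Int(4), Int(9)))))
}
```

The second call shows why the tiebreak sits beneath the children: `[5]` is shorter than `[4 9]`, but the element comparison `5` vs `4` is reached first and decides the order.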
@@ -1129,19 +1691,6 @@ func (obj MShellFloat) CommandLine() string {
return strconv.FormatFloat(obj.Value, 'f', -1, 64)
}
-// DebugString
-func DebugStrs(objs []MShellObject) []string {
- debugStrs := make([]string, len(objs))
- for i, obj := range objs {
- if obj == nil {
- debugStrs[i] = "nil"
- } else {
- debugStrs[i] = obj.DebugString()
- }
- }
- return debugStrs
-}
-
func (obj MShellLiteral) DebugString() string {
return obj.LiteralText
}
@@ -1175,8 +1724,8 @@ func (obj *MShellQuotation) DebugString() string {
}
func (obj *MShellList) DebugString() string {
- // Join the tokens with a space, surrounded by '[' and ']'
- return "[" + strings.Join(DebugStrs(obj.Items), " ") + "]"
+ // Elements joined with a space, surrounded by '[' and ']'
+ return renderValue(obj, flavorDebug)
}
func cleanStringForTerminal(input string) string {
@@ -1221,8 +1770,8 @@ func (obj MShellPath) DebugString() string {
}
func (obj *MShellPipe) DebugString() string {
- // Join each item with a ' | '
- return strings.Join(DebugStrs(obj.List.Items), " | ")
+ // Each item joined with ' | '
+ return renderValue(obj, flavorDebug)
}
func (obj MShellInt) DebugString() string {
@@ -1718,17 +2267,7 @@ func (obj *MShellQuotation) ToJson() string {
}
func (obj *MShellList) ToJson() string {
- builder := strings.Builder{}
- builder.WriteString("[")
- if len(obj.Items) > 0 {
- builder.WriteString(obj.Items[0].ToJson())
- for _, item := range obj.Items[1:] {
- builder.WriteString(", ")
- builder.WriteString(item.ToJson())
- }
- }
- builder.WriteString("]")
- return builder.String()
+ return renderValue(obj, flavorJson)
}
func (obj MShellString) ToJson() string {
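Review note: ToJson for lists now routes through renderValue, whose cycle handling rests on an on-path set with exit sentinels. The sketch below shows that idea on its own, assuming a hypothetical `Node` type and a placeholder marker string `(cycle)`; the real marker text and the flavor handling are not reproduced here.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// Node is a hypothetical value: an int leaf, or a list when IsList is set.
type Node struct {
	IsList bool
	Leaf   int
	Items  []*Node
}

type renderTask struct {
	lit    string
	node   *Node
	isLit  bool
	isExit bool // sentinel: the container's children have all rendered
}

const cycleMarker = "(cycle)" // placeholder, not the project's marker

// Render returns the text form and whether a reference cycle was met.
func Render(root *Node) (string, bool) {
	var sb strings.Builder
	cycled := false
	onPath := map[*Node]bool{}
	stack := []renderTask{{node: root}}
	for len(stack) > 0 {
		t := stack[len(stack)-1]
		stack = stack[:len(stack)-1]
		switch {
		case t.isLit:
			sb.WriteString(t.lit)
		case t.isExit:
			delete(onPath, t.node)
		case !t.node.IsList:
			sb.WriteString(strconv.Itoa(t.node.Leaf))
		case onPath[t.node]:
			// Reached a container that is still being expanded.
			sb.WriteString(cycleMarker)
			cycled = true
		default:
			onPath[t.node] = true
			// Pushed in reverse so tasks pop as: "[" items "]" exit.
			stack = append(stack, renderTask{node: t.node, isExit: true})
			stack = append(stack, renderTask{lit: "]", isLit: true})
			for i := len(t.node.Items) - 1; i >= 0; i-- {
				stack = append(stack, renderTask{node: t.node.Items[i]})
				if i > 0 {
					stack = append(stack, renderTask{lit: " ", isLit: true})
				}
			}
			stack = append(stack, renderTask{lit: "[", isLit: true})
		}
	}
	return sb.String(), cycled
}

func main() {
	shared := &Node{IsList: true, Items: []*Node{{Leaf: 1}}}
	dag := &Node{IsList: true, Items: []*Node{shared, shared}}
	fmt.Println(Render(dag))

	loop := &Node{IsList: true, Items: []*Node{{Leaf: 1}}}
	loop.Items = append(loop.Items, loop)
	fmt.Println(Render(loop))
}
```

The shared list renders twice without being flagged, because its exit sentinel removes it from the path before the second visit; only the self-appended list reports a cycle.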
@@ -1949,62 +2488,66 @@ func (obj MShellLiteral) Equals(other MShellObject) (bool, error) {
case MShellPath:
return obj.LiteralText == o.Path, nil
default:
- return false, fmt.Errorf("Cannot compare a literal with a %s.\n", other.TypeName())
+ return false, nil
}
}
func (obj MShellBool) Equals(other MShellObject) (bool, error) {
asBool, ok := other.(MShellBool)
if !ok {
- return false, fmt.Errorf("Cannot compare a boolean with a %s.\n", other.TypeName())
+ return false, nil
}
return obj.Value == asBool.Value, nil
}
+
func (obj *MShellQuotation) Equals(other MShellObject) (bool, error) {
- return false, fmt.Errorf("Equality currently not defined for quotations.\n")
+ // Quotations are code values; two are equal only when they are the same
+ // quotation object (reference identity).
+ o, ok := other.(*MShellQuotation)
+ return ok && obj == o, nil
}
func (obj *MShellList) Equals(other MShellObject) (bool, error) {
- return false, fmt.Errorf("Equality currently not defined for lists.\n")
+ return equalsIter(obj, other)
}
func (obj MShellString) Equals(other MShellObject) (bool, error) {
- // Define equality for other as string or as literal.
- switch other.(type) {
+ // str/path/literal compare by their text content (the `=` overloads
+ // permit str/path comparison); any other type is simply not equal.
+ switch o := other.(type) {
case MShellString:
- asString, _ := other.(MShellString)
- return obj.Content == asString.Content, nil
+ return obj.Content == o.Content, nil
case MShellLiteral:
- asLiteral, _ := other.(MShellLiteral)
- return obj.Content == asLiteral.LiteralText, nil
+ return obj.Content == o.LiteralText, nil
+ case MShellPath:
+ return obj.Content == o.Path, nil
default:
- return false, fmt.Errorf("Cannot compare a string with a %s.\n", other.TypeName())
+ return false, nil
}
}
func (obj MShellPath) Equals(other MShellObject) (bool, error) {
- // Define equality for other as string or as literal.
- switch other.(type) {
+ switch o := other.(type) {
case MShellPath:
- asPath, _ := other.(MShellPath)
- return obj.Path == asPath.Path, nil
+ return obj.Path == o.Path, nil
case MShellLiteral:
- asLiteral, _ := other.(MShellLiteral)
- return obj.Path == asLiteral.LiteralText, nil
+ return obj.Path == o.LiteralText, nil
+ case MShellString:
+ return obj.Path == o.Content, nil
default:
- return false, fmt.Errorf("Cannot compare a path with a %s.\n", other.TypeName())
+ return false, nil
}
}
func (obj *MShellPipe) Equals(other MShellObject) (bool, error) {
- return false, fmt.Errorf("Equality currently not defined for pipes.\n")
+ return equalsIter(obj, other)
}
func (obj MShellInt) Equals(other MShellObject) (bool, error) {
asInt, ok := other.(MShellInt)
if !ok {
- return false, fmt.Errorf("Cannot compare an integer with a %s.\n", other.TypeName())
+ return false, nil
}
return obj.Value == asInt.Value, nil
}
@@ -2012,7 +2555,7 @@ func (obj MShellInt) Equals(other MShellObject) (bool, error) {
func (obj MShellFloat) Equals(other MShellObject) (bool, error) {
asFloat, ok := other.(MShellFloat)
if !ok {
- return false, fmt.Errorf("Cannot compare a float with a %s.\n", other.TypeName())
+ return false, nil
}
return obj.Value == asFloat.Value, nil
}
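Review note: list and pipe equality now delegate to equalsIter, which relies on pointer-pair memoization to stay polynomial on shared substructure. A minimal sketch of that mechanism follows. It is a simplification under stated assumptions: `Node`, `Equal`, and `doubling` are hypothetical, and the memo is active from the first step, whereas dagGuard only starts recording pairs past its step threshold.

```go
package main

import "fmt"

// Node is a hypothetical container value with pointer identity.
type Node struct {
	Val   int
	Items []*Node
}

type pair struct{ a, b *Node }

// Equal reports structural equality and how many pairs were expanded.
func Equal(a, b *Node) (bool, int) {
	seen := map[pair]bool{}
	expanded := 0
	stack := []pair{{a, b}}
	for len(stack) > 0 {
		p := stack[len(stack)-1]
		stack = stack[:len(stack)-1]
		// Same pointer: equal by definition. Seen pair: its first
		// expansion already finished without a mismatch (LIFO order).
		if p.a == p.b || seen[p] {
			continue
		}
		seen[p] = true
		expanded++
		if p.a.Val != p.b.Val || len(p.a.Items) != len(p.b.Items) {
			return false, expanded
		}
		for i := range p.a.Items {
			stack = append(stack, pair{p.a.Items[i], p.b.Items[i]})
		}
	}
	return true, expanded
}

// doubling builds depth levels in which every node lists the next level
// twice, so a naive walk would visit 2^depth leaves.
func doubling(depth, leaf int) *Node {
	n := &Node{Val: leaf}
	for i := 0; i < depth; i++ {
		n = &Node{Val: i, Items: []*Node{n, n}}
	}
	return n
}

func main() {
	ok, steps := Equal(doubling(40, 0), doubling(40, 0))
	fmt.Println(ok, steps)
}
```

Two independently built copies share no pointers with each other, so the same-pointer check never fires; the memo is what keeps the walk at one expansion per level instead of one per path.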
@@ -2145,28 +2688,59 @@ func (col *GridColumn) Get(index int) MShellObject {
}
}
-// Set sets the value at the given row index
+// Set sets the value at the given row index. If a typed column is given a value
+// of a different type, the column is promoted to generic storage so the value
+// is stored rather than silently dropped.
func (col *GridColumn) Set(index int, value MShellObject) {
switch col.ColType {
case COL_INT:
if intVal, ok := value.(MShellInt); ok {
col.IntData[index] = int64(intVal.Value)
+ return
}
case COL_FLOAT:
if floatVal, ok := value.(MShellFloat); ok {
col.FloatData[index] = floatVal.Value
+ return
}
case COL_STRING:
if strVal, ok := value.(MShellString); ok {
col.StringData[index] = strVal.Content
+ return
}
case COL_DATETIME:
if dtVal, ok := value.(*MShellDateTime); ok {
col.DateTimeData[index] = dtVal.Time
+ return
}
- default:
+ case COL_GENERIC:
col.GenericData[index] = value
+ return
+ }
+ // Typed column received a value of a different type: promote the whole
+ // column to generic storage, then store the value.
+ col.promoteToGeneric()
+ col.GenericData[index] = value
+}
+
+// promoteToGeneric materializes a typed column's data into generic storage so
+// the column can hold values of any type. It is a no-op for an already-generic
+// column.
+func (col *GridColumn) promoteToGeneric() {
+ if col.ColType == COL_GENERIC {
+ return
+ }
+ n := col.Len()
+ generic := make([]MShellObject, n)
+ for i := 0; i < n; i++ {
+ generic[i] = col.Get(i)
}
+ col.ColType = COL_GENERIC
+ col.GenericData = generic
+ col.IntData = nil
+ col.FloatData = nil
+ col.StringData = nil
+ col.DateTimeData = nil
}
// Len returns the number of rows in the column
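Review note: the promotion path in GridColumn.Set can be illustrated with a reduced model. The sketch assumes a hypothetical `Column` type with a single typed representation (`[]int64`) and a generic fallback; the real column has four typed representations and different field names.

```go
package main

import "fmt"

// Column is a hypothetical, reduced stand-in for GridColumn.
type Column struct {
	Generic bool
	Ints    []int64
	Any     []any
}

func (c *Column) Get(i int) any {
	if c.Generic {
		return c.Any[i]
	}
	return c.Ints[i]
}

// Set stores v at row i, promoting the column when v does not fit the
// typed storage instead of dropping the value.
func (c *Column) Set(i int, v any) {
	if c.Generic {
		c.Any[i] = v
		return
	}
	if n, ok := v.(int64); ok {
		c.Ints[i] = n
		return
	}
	c.promote()
	c.Any[i] = v
}

// promote copies the typed data into generic storage; it is a no-op for
// an already-generic column.
func (c *Column) promote() {
	if c.Generic {
		return
	}
	anyData := make([]any, len(c.Ints))
	for i, n := range c.Ints {
		anyData[i] = n
	}
	c.Generic, c.Any, c.Ints = true, anyData, nil
}

func main() {
	col := &Column{Ints: []int64{1, 2, 3}}
	col.Set(1, "two")
	fmt.Println(col.Generic, col.Get(0), col.Get(1), col.Get(2))
}
```

The early returns in each typed branch matter: without them a successful typed store would fall through to the promotion path, which is the same reason the diff adds `return` after each typed assignment.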
@@ -2368,7 +2942,25 @@ func (g *MShellGrid) Concat(other MShellObject) (MShellObject, error) {
}
func (g *MShellGrid) Equals(other MShellObject) (bool, error) {
- return false, fmt.Errorf("Equality currently not defined for grids.\n")
+ o, ok := other.(*MShellGrid)
+ if !ok {
+ return false, nil
+ }
+ if g.RowCount != o.RowCount || len(g.Columns) != len(o.Columns) {
+ return false, nil
+ }
+ for i, col := range g.Columns {
+ if col.Name != o.Columns[i].Name {
+ return false, nil
+ }
+ }
+ for i := 0; i < g.RowCount; i++ {
+ eq, err := g.GetRow(i).ToDict().Equals(o.GetRow(i).ToDict())
+ if err != nil || !eq {
+ return eq, err
+ }
+ }
+ return true, nil
}
func (g *MShellGrid) CastString() (string, error) {
@@ -2484,7 +3076,20 @@ func (v *MShellGridView) Concat(other MShellObject) (MShellObject, error) {
}
func (v *MShellGridView) Equals(other MShellObject) (bool, error) {
- return false, fmt.Errorf("Equality currently not defined for grid views.\n")
+ o, ok := other.(*MShellGridView)
+ if !ok {
+ return false, nil
+ }
+ if len(v.Indices) != len(o.Indices) {
+ return false, nil
+ }
+ for i := range v.Indices {
+ eq, err := v.GetRow(i).ToDict().Equals(o.GetRow(i).ToDict())
+ if err != nil || !eq {
+ return eq, err
+ }
+ }
+ return true, nil
}
func (v *MShellGridView) CastString() (string, error) {
@@ -2593,7 +3198,11 @@ func (r *MShellGridRow) Concat(other MShellObject) (MShellObject, error) {
}
func (r *MShellGridRow) Equals(other MShellObject) (bool, error) {
- return false, fmt.Errorf("Equality currently not defined for grid rows.\n")
+ o, ok := other.(*MShellGridRow)
+ if !ok {
+ return false, nil
+ }
+ return r.ToDict().Equals(o.ToDict())
}
func (r *MShellGridRow) CastString() (string, error) {
diff --git a/mshell/Main.go b/mshell/Main.go
index eba1046e..d01ff600 100644
--- a/mshell/Main.go
+++ b/mshell/Main.go
@@ -166,7 +166,7 @@ func getStartupFileSpecs(options startupLoadOptions) (startupFileSpec, startupFi
return stdlibSpec, initSpec, nil
}
-func loadStartupFile(path string, description string, stack *MShellStack, context ExecuteContext, state *EvalState, definitions *[]MShellDefinition) error {
+func loadStartupFile(path string, description string, stack *MShellStack, context ExecuteContext, state *EvalState, definitions *[]MShellDefinition, items *[]MShellParseItem) error {
sourceBytes, err := os.ReadFile(path)
if err != nil {
if errors.Is(err, os.ErrNotExist) {
@@ -181,7 +181,17 @@ func loadStartupFile(path string, description string, stack *MShellStack, contex
}
*definitions = append(*definitions, parsedFile.Definitions...)
+ if err := FindDuplicateDefinition(*definitions); err != nil {
+ return fmt.Errorf("error loading %s at %s: %w", description, path, err)
+ }
state.AddCompletionDefinitions(parsedFile.Definitions)
+ // Register enum constructors declared in this startup file, and retain the
+ // top-level items so the type checker can register the file's `type` and
+ // `enum` declarations — startup declarations behave like the main file's.
+ state.RegisterEnums(parsedFile.Items)
+ if items != nil {
+ *items = append(*items, parsedFile.Items...)
+ }
if len(parsedFile.Items) > 0 {
callStackItem := CallStackItem{
@@ -257,16 +267,17 @@ func preflightStartupFile(spec startupFileSpec) string {
return fmt.Sprintf("present at %s (parses ok; not evaluated because the other startup file failed first)", spec.path)
}
-func loadStartupDefinitions(options startupLoadOptions, stack *MShellStack, context ExecuteContext, state *EvalState) ([]MShellDefinition, error) {
+func loadStartupDefinitions(options startupLoadOptions, stack *MShellStack, context ExecuteContext, state *EvalState) ([]MShellDefinition, []MShellParseItem, error) {
stdlibSpec, initSpec, err := getStartupFileSpecs(options)
if err != nil {
- return nil, err
+ return nil, nil, err
}
definitions := make([]MShellDefinition, 0)
- if err := loadStartupFile(stdlibSpec.path, stdlibSpec.description, stack, context, state, &definitions); err != nil {
+ var items []MShellParseItem
+ if err := loadStartupFile(stdlibSpec.path, stdlibSpec.description, stack, context, state, &definitions, &items); err != nil {
initStatus := preflightStartupFile(initSpec)
- return nil, &startupLoadError{
+ return nil, nil, &startupLoadError{
which: "stdlib",
spec: stdlibSpec,
options: options,
@@ -276,14 +287,14 @@ func loadStartupDefinitions(options startupLoadOptions, stack *MShellStack, cont
}
}
- if err := loadStartupFile(initSpec.path, initSpec.description, stack, context, state, &definitions); err != nil {
+ if err := loadStartupFile(initSpec.path, initSpec.description, stack, context, state, &definitions, &items); err != nil {
if !initSpec.required && errors.Is(err, os.ErrNotExist) {
- return definitions, nil
+ return definitions, items, nil
}
- return nil, &startupLoadError{which: "init", spec: initSpec, options: options, cause: err}
+ return nil, nil, &startupLoadError{which: "init", spec: initSpec, options: options, cause: err}
}
- return definitions, nil
+ return definitions, items, nil
}
// formatStartupErrorMessage builds a multi-line explanation of how msh searches
@@ -821,7 +832,7 @@ func main() {
var allDefinitions []MShellDefinition
- startupDefinitions, err := loadStartupDefinitions(startupLoadOptions{
+ startupDefinitions, startupItems, err := loadStartupDefinitions(startupLoadOptions{
version: effectiveVersion,
allowEnvOverrides: allowStartupEnvOverrides,
requireInit: requireVersionedInit,
@@ -836,7 +847,7 @@ func main() {
state.AddCompletionDefinitions(file.Definitions)
if checkTypes {
- errs, ok := TypeCheckProgram(file, startupDefinitions)
+ errs, ok := TypeCheckProgram(file, startupDefinitions, startupItems)
if !ok {
for _, e := range errs {
fmt.Fprintln(os.Stderr, e)
@@ -853,10 +864,22 @@ func main() {
}
}
+ // Definition lookup is first-match-wins, so a script def whose name is
+ // already taken (by the stdlib, the init file, or the script itself)
+ // would be silently dead code. Reject it instead.
+ if err := FindDuplicateDefinition(allDefinitions); err != nil {
+ fmt.Fprint(os.Stderr, err.Error())
+ os.Exit(1)
+ }
+
if len(file.Items) == 0 {
os.Exit(0)
}
+ // Register enum constructors before evaluation so bare member words can
+ // be constructed (mirrors the checker's enum pre-pass).
+ state.RegisterEnums(file.Items)
+
callStackItem := CallStackItem{
MShellParseItem: nil,
Name: "main",
@@ -2968,10 +2991,21 @@ func (state *TermState) ExecuteCurrentCommand() (bool, int) {
term.Restore(state.stdInFd, &state.oldState)
if len(parsed.Definitions) > 0 {
+ // Definition lookup is first-match-wins, so a redefinition would be
+ // silently ignored rather than take effect; reject the input instead.
+ if err := FindDuplicateDefinition(state.stdLibDefs, parsed.Definitions); err != nil {
+ fmt.Fprint(os.Stderr, err.Error())
+ goto PromptPrint
+ }
state.stdLibDefs = append(state.stdLibDefs, parsed.Definitions...)
state.evalState.AddCompletionDefinitions(parsed.Definitions)
}
+ // Register enum constructors declared on this line, so an interactive
+ // `enum` declaration works like one in a script: its member words
+ // construct values on subsequent lines.
+ state.evalState.RegisterEnums(parsed.Items)
+
if len(parsed.Items) > 0 {
state.initCallStackItem.MShellParseItem = parsed.Items[0]
result := state.evalState.Evaluate(parsed.Items, &state.stack, state.context, state.stdLibDefs, state.initCallStackItem)
@@ -3158,11 +3192,15 @@ func (state *TermState) getCurrentPos() (int, int, error) {
}
func stdLibDefinitions(stack *MShellStack, context ExecuteContext, state *EvalState) ([]MShellDefinition, error) {
- return loadStartupDefinitions(startupLoadOptions{
+ // The interactive path has no whole-program type-check pass, so the
+ // startup items (already registered on the EvalState for runtime enum
+ // construction inside loadStartupFile) are not needed here.
+ defs, _, err := loadStartupDefinitions(startupLoadOptions{
version: mshellVersion,
allowEnvOverrides: true,
requireInit: false,
}, stack, context, state)
+ return defs, err
}
func registerTempFileForCleanup(tempFileName string) {
diff --git a/mshell/Parser.go b/mshell/Parser.go
index a31692f0..9a4a127a 100644
--- a/mshell/Parser.go
+++ b/mshell/Parser.go
@@ -680,6 +680,12 @@ func (parser *MShellParser) ParseFile() (file *MShellFile, err error) {
return file, err
}
file.Items = append(file.Items, decl)
+ case ENUM:
+ decl, err := parser.ParseEnumDecl()
+ if err != nil {
+ return file, err
+ }
+ file.Items = append(file.Items, decl)
case VER:
if file.Version != "" {
return file, fmt.Errorf("%d:%d: Duplicate VER directive; version already set to %q", parser.curr.Line, parser.curr.Column, file.Version)
diff --git a/mshell/Startup_test.go b/mshell/Startup_test.go
index be7902dd..88c067df 100644
--- a/mshell/Startup_test.go
+++ b/mshell/Startup_test.go
@@ -151,7 +151,7 @@ func TestLoadStartupDefinitionsLoadsVersionedStdlibAndInit(t *testing.T) {
stack, context, state := newStartupTestContext()
- definitions, err := loadStartupDefinitions(startupLoadOptions{
+ definitions, _, err := loadStartupDefinitions(startupLoadOptions{
version: version,
allowEnvOverrides: false,
requireInit: true,
@@ -215,7 +215,7 @@ func TestLoadStartupDefinitionsRequiresInitForExplicitVersion(t *testing.T) {
stack, context, state := newStartupTestContext()
- _, err := loadStartupDefinitions(startupLoadOptions{
+ _, _, err := loadStartupDefinitions(startupLoadOptions{
version: version,
allowEnvOverrides: false,
requireInit: true,
@@ -251,7 +251,7 @@ func TestLoadStartupDefinitionsAllowsMissingInitForImplicitVersion(t *testing.T)
stack, context, state := newStartupTestContext()
- definitions, err := loadStartupDefinitions(startupLoadOptions{
+ definitions, _, err := loadStartupDefinitions(startupLoadOptions{
version: version,
allowEnvOverrides: true,
requireInit: false,
@@ -417,3 +417,53 @@ func TestEnvWithoutStartupOverridesRemovesOnlyStartupVars(t *testing.T) {
t.Fatalf("filtered env missing KEEP_ME: %q", filteredJoined)
}
}
+
+func TestStartupFileEnumRegistersConstructors(t *testing.T) {
+ // An `enum` declared in a startup file (stdlib / init) must register its
+ // constructors on the EvalState, so a member word in the main program (or
+ // at the interactive prompt) constructs a value instead of falling through
+ // to the bare-literal path.
+ dir := t.TempDir()
+ path := filepath.Join(dir, "init.msh")
+ if err := os.WriteFile(path, []byte("enum Status = active | inactive end\n"), 0644); err != nil {
+ t.Fatalf("WriteFile(init) error = %v", err)
+ }
+
+ stack, context, state := newStartupTestContext()
+ var defs []MShellDefinition
+ var items []MShellParseItem
+ if err := loadStartupFile(path, "test init", &stack, context, &state, &defs, &items); err != nil {
+ t.Fatalf("loadStartupFile() error = %v", err)
+ }
+
+ info, ok := state.EnumMembers["active"]
+ if !ok {
+ t.Fatalf("expected member 'active' registered from startup file")
+ }
+ if info.EnumName != "Status" || info.Arity != 0 {
+ t.Fatalf("EnumMembers[active] = %+v, want Status arity 0", info)
+ }
+ if len(items) == 0 {
+ t.Fatalf("expected startup items to be retained for the checker")
+ }
+
+ parsed, err := parseMShellInput("active", &TokenFile{"main"})
+ if err != nil {
+ t.Fatalf("parse error: %v", err)
+ }
+ callStackItem := CallStackItem{MShellParseItem: parsed.Items[0], Name: "main", CallStackType: CALLSTACKFILE}
+ result := state.Evaluate(parsed.Items, &stack, context, defs, callStackItem)
+ if !result.Success {
+ t.Fatalf("evaluating member word failed")
+ }
+ if len(stack) != 1 {
+ t.Fatalf("len(stack) = %d, want 1", len(stack))
+ }
+ en, ok := stack[0].(*MShellEnum)
+ if !ok {
+ t.Fatalf("stack top = %T (%s), want *MShellEnum", stack[0], stack[0].DebugString())
+ }
+ if en.EnumName != "Status" || en.Member != "active" {
+ t.Fatalf("enum value = %s.%s, want Status.active", en.EnumName, en.Member)
+ }
+}
diff --git a/mshell/Type.go b/mshell/Type.go
index 674a9e03..7b97ae61 100644
--- a/mshell/Type.go
+++ b/mshell/Type.go
@@ -60,6 +60,11 @@ const (
TKGridView // Extra = index into gridSchemas (0 = unknown schema)
TKGridRow // Extra = index into gridSchemas (0 = unknown schema)
+ // Generative tagged sum type (a user `enum`). Nominal: identity is the
+ // declaration NameId, not the member set. A = decl NameId; Extra = index
+ // into enumVariants. See design/literal_or_enum_typing.html.
+ TKEnum
+
// TKStrLit is a `str` refined with a statically known value: A holds the
// interned NameId of the literal content. It is a subtype of `str` —
// unify and every container constructor widen it back to TidStr — so it
@@ -102,6 +107,8 @@ func (k TypeKind) String() string {
return "GridView"
case TKGridRow:
return "GridRow"
+ case TKEnum:
+ return "Enum"
case TKStrLit:
return "StrLit"
}
@@ -140,6 +147,15 @@ type ShapeField struct {
Optional bool
}
+// EnumVariant is one constructor of a TKEnum. Payload is the (ordered)
+// list of payload types the constructor carries; it is empty for a
+// nullary member. Order is meaningful — it sets member order for
+// diagnostics and exhaustiveness.
+type EnumVariant struct {
+ Name NameId
+ Payload []TypeId
+}
+
// GridSchemaCol is one column in a TKGrid / TKGridView / TKGridRow schema.
// Order is meaningful (grids have column order).
type GridSchemaCol struct {
@@ -186,6 +202,7 @@ type TypeArena struct {
unionMembers [][]TypeId // each slice is sorted, deduped
gridSchemas []GridSchema
gridSchemaCons map[string]uint32
+ enumVariants [][]EnumVariant
}
// NewTypeArena constructs an arena pre-populated with the primitive ids
@@ -223,6 +240,8 @@ func NewTypeArena() *TypeArena {
a.shapeFields = append(a.shapeFields, nil)
a.quoteSigs = append(a.quoteSigs, QuoteSig{})
a.overloadedQuoteSigs = append(a.overloadedQuoteSigs, nil)
+ // Reserve enumVariants[0] as a placeholder so non-zero Extra is meaningful.
+ a.enumVariants = append(a.enumVariants, nil)
return a
}
@@ -401,6 +420,58 @@ func (a *TypeArena) MakeOverloadedQuote(sigs []QuoteSig) TypeId {
return id
}
+// MakeEnum returns the canonical TypeId for a user enum named by nameId,
+// with the given variants. Identity is nominal: it keys on the declaration
+// NameId alone, so two enums with identical members are distinct types and
+// the same enum always interns to the same TypeId. A duplicate declaration
+// is caught higher up (DeclareEnum); reaching here with a name already
+// interned returns the existing id.
+func (a *TypeArena) MakeEnum(nameId NameId, variants []EnumVariant) TypeId {
+ key := "E:" + strconv.FormatUint(uint64(nameId), 10)
+ if id, ok := a.cons[key]; ok {
+ return id
+ }
+ cp := make([]EnumVariant, len(variants))
+ copy(cp, variants)
+ idx := uint32(len(a.enumVariants))
+ a.enumVariants = append(a.enumVariants, cp)
+ id := a.append(TypeNode{Kind: TKEnum, A: uint32(nameId), Extra: idx})
+ a.cons[key] = id
+ return id
+}
+
+// SetEnumVariants replaces the variant list of an already-created enum type.
+// Used to finalize an enum after its name was registered with a placeholder
+// (empty) variant list, so member payloads can reference the enum itself or
+// other enums regardless of declaration order.
+func (a *TypeArena) SetEnumVariants(id TypeId, variants []EnumVariant) {
+ n := a.Node(id)
+ if n.Kind != TKEnum {
+ panic("TypeArena.SetEnumVariants: not an enum")
+ }
+ cp := make([]EnumVariant, len(variants))
+ copy(cp, variants)
+ a.enumVariants[n.Extra] = cp
+}
+
+// EnumNameId returns the declaration NameId of an enum type.
+func (a *TypeArena) EnumNameId(id TypeId) NameId {
+ n := a.Node(id)
+ if n.Kind != TKEnum {
+ panic("TypeArena.EnumNameId: not an enum")
+ }
+ return NameId(n.A)
+}
+
+// EnumVariants returns the variants of an enum type. Caller must not mutate.
+func (a *TypeArena) EnumVariants(id TypeId) []EnumVariant {
+ n := a.Node(id)
+ if n.Kind != TKEnum {
+ panic("TypeArena.EnumVariants: not an enum")
+ }
+ return a.enumVariants[n.Extra]
+}
+
// MakeGrid returns the canonical TypeId for a grid type. schemaIdx of 0
// denotes "schema unknown" (the V1 default until schema tracking lands).
func (a *TypeArena) MakeGrid(schemaIdx uint32) TypeId {
diff --git a/mshell/TypeBranch.go b/mshell/TypeBranch.go
index 877f0a7a..c760cead 100644
--- a/mshell/TypeBranch.go
+++ b/mshell/TypeBranch.go
@@ -76,6 +76,10 @@ const (
MatchArmFalse // bool literal `false` pattern
// MatchArmEmptyList: `[]` pattern. Covers empty lists.
MatchArmEmptyList
+ // MatchArmEnumMember: an enum constructor pattern (`member` or
+ // `member b1 b2 ...`). TypeArm holds the enum type, EnumMember the
+ // member's NameId.
+ MatchArmEnumMember
// MatchArmListWithRest: `[a ...rest]`, `[a b ...rest]`, or
// `[...rest]` — any list pattern with a `...name` element.
// Covers all lists whose length is at least the number of
@@ -87,8 +91,9 @@ const (
// type effects flow through ReconcileArms; this struct only feeds
// the exhaustiveness check.
type MatchArmTag struct {
- Kind MatchArmKind
- TypeArm TypeId // valid when Kind == MatchArmType
+ Kind MatchArmKind
+ TypeArm TypeId // valid when Kind == MatchArmType or MatchArmEnumMember
+ EnumMember NameId // valid when Kind == MatchArmEnumMember
}
// CheckMatchExhaustive verifies that arms cover every inhabitant of
@@ -178,6 +183,30 @@ func (c *Checker) CheckMatchExhaustive(matched TypeId, arms []MatchArmTag, callS
}
}
+ case TKEnum:
+ variants := c.arena.enumVariants[n.Extra]
+ covered := make(map[NameId]bool, len(variants))
+ for _, arm := range arms {
+ if arm.Kind == MatchArmEnumMember && c.subst.Apply(c.arena, arm.TypeArm) == matched {
+ covered[arm.EnumMember] = true
+ }
+ }
+ var missing []string
+ for _, v := range variants {
+ if !covered[v.Name] {
+ missing = append(missing, c.names.Name(v.Name))
+ }
+ }
+ if len(missing) == 0 {
+ return true
+ }
+ c.errors = append(c.errors, TypeError{
+ Kind: TErrNonExhaustiveMatch,
+ Pos: callSite,
+ Hint: "enum match must cover every member or include a wildcard; missing: " + strings.Join(missing, ", "),
+ })
+ return false
+
case TKList:
// A list's inhabitants split by length: zero (empty) vs
// one-or-more. `[]` covers empty; any list pattern that ends
diff --git a/mshell/TypeCheckProgram.go b/mshell/TypeCheckProgram.go
index 94038f05..2c0a0c3b 100644
--- a/mshell/TypeCheckProgram.go
+++ b/mshell/TypeCheckProgram.go
@@ -43,12 +43,18 @@ import (
// type-checked here — std.msh exercises features (process lists,
// format strings, dynamic exec) the v1 checker does not yet model,
// and we trust the runtime tests catch breakage there.
-func TypeCheckProgram(file *MShellFile, stdlibDefs []MShellDefinition) (errors []string, ok bool) {
+//
+// startupItems is the startup files' top-level parse items; their `type`
+// and `enum` declarations are registered (bodies are not checked, matching
+// the def treatment) so the checker sees the same declarations the runtime
+// does.
+func TypeCheckProgram(file *MShellFile, stdlibDefs []MShellDefinition, startupItems []MShellParseItem) (errors []string, ok bool) {
arena := NewTypeArena()
names := NewNameTable()
checker := NewChecker(arena, names)
checker.RegisterStdlibSigs(stdlibDefs)
+ checker.RegisterStartupTypes(startupItems)
checker.CheckProgram(file)
out := make([]string, 0, len(checker.errors))
@@ -81,6 +87,12 @@ func (c *Checker) RegisterStdlibSigs(defs []MShellDefinition) {
for i := range defs {
def := &defs[i]
nameId := c.names.Intern(def.Name)
+ // Record the name even when the sig registration below is skipped:
+ // the runtime's first-match-wins lookup still resolves to this def,
+ // so a later def of the same name is a duplicate regardless.
+ if c.recordDefName(nameId, def) {
+ continue
+ }
if _, exists := c.nameBuiltins[nameId]; exists {
continue
}
@@ -89,18 +101,83 @@ func (c *Checker) RegisterStdlibSigs(defs []MShellDefinition) {
}
}
+// recordDefName registers a definition's name for duplicate detection. If the
+// name is already taken by an earlier definition, it records an error and
+// returns true (mirroring the runtime's FindDuplicateDefinition, where the
+// first definition wins and a duplicate would be silently dead code).
+func (c *Checker) recordDefName(nameId NameId, def *MShellDefinition) bool {
+ if prev, exists := c.defNameToks[nameId]; exists {
+ c.errors = append(c.errors, TypeError{
+ Kind: TErrTypeParse, Pos: def.NameToken,
+ Hint: "duplicate definition '" + def.Name + "'; already defined at " + tokenPosStr(prev),
+ })
+ return true
+ }
+ if c.defNameToks == nil {
+ c.defNameToks = make(map[NameId]Token)
+ }
+ c.defNameToks[nameId] = def.NameToken
+ return false
+}
+
+// RegisterStartupTypes registers the `type` and `enum` declarations found in
+// the startup files' top-level items (the stdlib, then the user init file),
+// so the checked program sees the same declarations the runtime does. It follows
+// the same three-phase order as CheckProgram's own pre-passes — enum names,
+// then type aliases, then enum payload bodies + constructor words — so
+// startup declarations may reference each other in any order. Call after
+// RegisterStdlibSigs (so a member colliding with a startup def is caught) and
+// before CheckProgram (whose def pre-pass catches the reverse collision).
+func (c *Checker) RegisterStartupTypes(items []MShellParseItem) {
+ var enumDecls []*MShellEnumDecl
+ for _, item := range items {
+ if d, ok := item.(*MShellEnumDecl); ok {
+ if c.predeclareEnum(d) {
+ enumDecls = append(enumDecls, d)
+ }
+ }
+ }
+ for _, item := range items {
+ if d, ok := item.(*MShellTypeDecl); ok {
+ body := c.resolveTypeExpr(d.Body, nil)
+ c.DeclareType(d.Name, body)
+ }
+ }
+ for _, d := range enumDecls {
+ c.defineEnum(d)
+ }
+}
+
// CheckProgram is the file-level type-check pass. It registers all
// type declarations and user-defined function sigs, then walks the
// parse tree driving the type stack. Error accumulation lives on the
// Checker.
func (c *Checker) CheckProgram(file *MShellFile) {
- // Pre-pass 1: register all `type` declarations.
+ // Pre-pass 0: predeclare every `enum` name with a placeholder type, so a
+ // `type` body (next) and an enum payload (after that) can reference any
+ // enum by name, in any order.
+ var enumDecls []*MShellEnumDecl
+ for _, item := range file.Items {
+ if d, ok := item.(*MShellEnumDecl); ok {
+ if c.predeclareEnum(d) {
+ enumDecls = append(enumDecls, d)
+ }
+ }
+ }
+ // Pre-pass 1: register all `type` declarations. Enum names are already
+ // available, so a `type` body may reference an enum.
for _, item := range file.Items {
if d, ok := item.(*MShellTypeDecl); ok {
body := c.resolveTypeExpr(d.Body, nil)
c.DeclareType(d.Name, body)
}
}
+ // Pre-pass 1b: resolve enum payload bodies and register constructor words.
+ // Both enum names and `type` aliases are now registered, so a payload may
+ // reference either.
+ for _, d := range enumDecls {
+ c.defineEnum(d)
+ }
// Pre-pass 2: register all `def` signatures so call sites (and
// recursive self-calls inside def bodies) can resolve them.
defSigs := make([]QuoteSig, len(file.Definitions))
@@ -109,6 +186,21 @@ func (c *Checker) CheckProgram(file *MShellFile) {
sig := c.ResolveDefSig(def.Inputs, def.Outputs)
defSigs[i] = sig
nameId := c.names.Intern(def.Name)
+ // Enum constructors share the word namespace and are registered in
+ // pre-pass 1b; a def reusing a member name would resolve to the
+ // constructor in the checker but to the def at runtime, so reject it —
+ // the mirror of defineEnum rejecting a member that collides with an
+ // existing def or builtin.
+ if _, isMember := c.enumMemberToks[nameId]; isMember {
+ c.errors = append(c.errors, TypeError{
+ Kind: TErrTypeParse, Pos: def.NameToken,
+ Hint: "definition '" + def.Name + "' conflicts with an enum member of the same name",
+ })
+ continue
+ }
+ if c.recordDefName(nameId, def) {
+ continue
+ }
c.nameBuiltins[nameId] = append(c.nameBuiltins[nameId], sig)
}
// Pre-pass 3: type-check each def body against its declared sig.
@@ -1177,23 +1269,53 @@ func formatPatternItem(it MShellParseItem) string {
func (c *Checker) checkMatchBlock(matchBlock *MShellParseMatchBlock) {
startTok := matchBlock.GetStartToken()
if c.stack.Len() == 0 {
- c.errors = append(c.errors, TypeError{
- Kind: TErrStackUnderflow,
- Pos: startTok,
- Hint: "match subject",
- })
- return
+ if !c.inferring {
+ c.errors = append(c.errors, TypeError{
+ Kind: TErrStackUnderflow,
+ Pos: startTok,
+ Hint: "match subject",
+ })
+ return
+ }
+ // Quote-body inference: the subject is the quote's own input.
+ // Synthesize a fresh var exactly as applySig's underflow path does —
+ // at the bottom of the stack and the front of inferInputs — so
+ // `(match ... end) map` infers a one-input quote instead of erroring.
+ v := c.subst.FreshVar(c.arena)
+ c.inferInputs = append([]TypeId{v}, c.inferInputs...)
+ c.stack.items = append([]TypeId{v}, c.stack.items...)
}
// Widen a string-literal subject to `str`: match arms and the
// exhaustiveness check compare against `str` by type id, and the literal
// value carries no meaning for pattern matching.
subject := c.arena.WidenStrLit(c.stack.items[c.stack.Len()-1])
+ // See through a `type X = ...` brand once, here, so every arm form (enum
+ // member, `just`/` name` binding, list/dict pattern) and the
+ // exhaustiveness check match against the underlying type. A brand is
+ // nominal for typing but has no runtime representation, so a branded enum
+ // matches its members, a branded Maybe its `just`/`none`, etc. — exactly
+ // as the unbranded types do, which is what the runtime already does.
+ if resolved := c.subst.Apply(c.arena, subject); c.arena.Node(resolved).Kind == TKBrand {
+ subject = c.underlying(resolved)
+ }
+ // An unresolved subject (a quote input under inference) is pinned from the
+ // first arm pattern that names a type: an enum member determines its enum
+ // (member names are global) and `just`/`none` determine Maybe. Pinning
+ // happens before the entry branch is captured so every arm and the
+ // exhaustiveness check see the resolved subject.
+ if c.arena.Node(c.subst.Apply(c.arena, subject)).Kind == TKVar {
+ if pin, ok := c.matchSubjectPin(matchBlock); ok {
+ c.unify(subject, pin)
+ }
+ }
entry := c.captureBranch()
if len(matchBlock.Arms) == 0 {
- // Empty match block: no arms could fire. Treat as a no-op.
- // The runtime would error at first use; the checker keeps
- // the subject on the stack.
+ // An empty match can never fire — it always errors at runtime. Run
+ // exhaustiveness with no arms so the static check rejects it (no type
+ // is covered and there is no wildcard) instead of letting it crash at
+ // runtime.
+ c.CheckMatchExhaustive(subject, nil, startTok)
return
}
@@ -1242,6 +1364,36 @@ func (c *Checker) checkMatchBlock(matchBlock *MShellParseMatchBlock) {
c.reconcileArmBranches(armBranches, armLabels, entry, startTok)
}
+// matchSubjectPin returns the concrete type a match's arm patterns determine
+// for an as-yet-unresolved subject. An enum-member head names its enum (member
+// names are unique across enums), and a `just`/`none` head names Maybe[T] with
+// a fresh T. ok=false when no arm determines a type — value literals and type
+// keywords deliberately do not pin, since a type-keyword match may be
+// discriminating a union and pinning would wrongly narrow the input.
+func (c *Checker) matchSubjectPin(matchBlock *MShellParseMatchBlock) (TypeId, bool) {
+ for _, arm := range matchBlock.Arms {
+ if len(arm.Pattern) == 0 {
+ continue
+ }
+ tok, ok := arm.Pattern[0].(Token)
+ if !ok || tok.Type != LITERAL {
+ continue
+ }
+ if tok.Lexeme == "just" || tok.Lexeme == "none" {
+ return c.arena.MakeMaybe(c.subst.FreshVar(c.arena)), true
+ }
+ mid := c.names.Intern(tok.Lexeme)
+ if _, isMember := c.enumMemberToks[mid]; isMember {
+ // A member's constructor sig has the enum as its only output.
+ sigs := c.nameBuiltins[mid]
+ if len(sigs) > 0 && len(sigs[0].Outputs) == 1 && c.arena.Node(sigs[0].Outputs[0]).Kind == TKEnum {
+ return sigs[0].Outputs[0], true
+ }
+ }
+ }
+ return TidNothing, false
+}
+
// armPattern is the single interpretation of a match arm pattern. One
// analysis feeds all four consumers that used to re-pattern-match the
// arm independently: recognition diagnostics (Recognized), the
@@ -1289,6 +1441,14 @@ func (c *Checker) analyzeArmPattern(subject TypeId, pattern []MShellParseItem) a
func (c *Checker) armPatternOf(subject TypeId, pattern []MShellParseItem) armPattern {
out := armPattern{Tag: MatchArmTag{Kind: MatchArmType, TypeArm: TidNothing}}
+ // Enum constructor pattern: `member` or `member b1 b2 ...`, when the
+ // subject is an enum and the first token names one of its members. This
+ // can be any length (one token per payload binding), so it is handled
+ // before the length-based switch.
+ if ep, ok := c.enumMemberPattern(subject, pattern); ok {
+ return ep
+ }
+
switch len(pattern) {
case 1:
switch p := pattern[0].(type) {
@@ -1340,7 +1500,7 @@ func (c *Checker) armPatternOf(subject TypeId, pattern []MShellParseItem) armPat
case 2:
t0, ok0 := pattern[0].(Token)
t1, ok1 := pattern[1].(Token)
- if !ok0 || !ok1 || t1.Type != LITERAL {
+ if !ok0 || !ok1 || (t1.Type != LITERAL && t1.Type != UNDERSCORE) {
return out
}
if t0.Type == LITERAL && t0.Lexeme == "just" {
@@ -1379,6 +1539,69 @@ func (c *Checker) armPatternOf(subject TypeId, pattern []MShellParseItem) armPat
return out
}
+// enumMemberPattern recognizes an enum constructor arm: `member` (nullary) or
+// `member b1 b2 ...` (one binding name per payload). It returns ok=false when
+// the subject is not an enum or the first token is not one of its members, so
+// the caller falls back to the ordinary pattern forms. A payload-arity mismatch
+// is still recognized (so no "invalid pattern" cascade) but reported as an error.
+func (c *Checker) enumMemberPattern(subject TypeId, pattern []MShellParseItem) (armPattern, bool) {
+ if len(pattern) == 0 {
+ return armPattern{}, false
+ }
+ tok, ok := pattern[0].(Token)
+ if !ok || tok.Type != LITERAL {
+ return armPattern{}, false
+ }
+ // The subject is already brand-unwrapped by checkMatchBlock, so a branded
+ // enum (`type X = Enum`) arrives here as its underlying TKEnum.
+ resolved := c.subst.Apply(c.arena, subject)
+ sn := c.arena.Node(resolved)
+ if sn.Kind != TKEnum {
+ return armPattern{}, false
+ }
+ memberId := c.names.Intern(tok.Lexeme)
+ var payload []TypeId
+ found := false
+ for _, v := range c.arena.enumVariants[sn.Extra] {
+ if v.Name == memberId {
+ payload = v.Payload
+ found = true
+ break
+ }
+ }
+ if !found {
+ return armPattern{}, false
+ }
+ out := armPattern{
+ Recognized: true,
+ Tag: MatchArmTag{Kind: MatchArmEnumMember, TypeArm: resolved, EnumMember: memberId},
+ }
+ binds := pattern[1:]
+ if len(binds) != len(payload) {
+ c.errors = append(c.errors, TypeError{
+ Kind: TErrInvalidMatchPattern,
+ Pos: tok,
+ Hint: fmt.Sprintf("enum member '%s' binds %d payload value(s), got %d", tok.Lexeme, len(payload), len(binds)),
+ })
+ return out, true
+ }
+ for i, b := range binds {
+ bt, ok := b.(Token)
+ if !ok || (bt.Type != LITERAL && bt.Type != UNDERSCORE) {
+ c.errors = append(c.errors, TypeError{
+ Kind: TErrInvalidMatchPattern,
+ Pos: tok,
+ Hint: "enum payload bindings must be names",
+ })
+ return out, true
+ }
+ if bt.Lexeme != "_" {
+ out.Bindings = append(out.Bindings, patternBind{bt.Lexeme, payload[i]})
+ }
+ }
+ return out, true
+}
+
// analyzeTokenPattern handles the single-token pattern forms: type
// keywords, value literals, `_`, `none`, and user-declared type names.
func (c *Checker) analyzeTokenPattern(tok Token, out *armPattern) {
@@ -1404,6 +1627,9 @@ func (c *Checker) analyzeTokenPattern(tok Token, out *armPattern) {
case INTEGER, FLOAT, STRING, SINGLEQUOTESTRING, PATH:
// Value literals: legal patterns, but they credit no coverage.
out.Recognized = true
+ case UNDERSCORE:
+ out.Recognized = true
+ out.Tag = MatchArmTag{Kind: MatchArmWildcard}
case LITERAL:
switch tok.Lexeme {
case "_":
diff --git a/mshell/TypeCheckProgram_test.go b/mshell/TypeCheckProgram_test.go
index 21d4a994..b7e646da 100644
--- a/mshell/TypeCheckProgram_test.go
+++ b/mshell/TypeCheckProgram_test.go
@@ -15,7 +15,7 @@ func parseAndCheck(t *testing.T, src string) ([]string, bool) {
if err != nil {
t.Fatalf("parse error: %v", err)
}
- return TypeCheckProgram(file, nil)
+ return TypeCheckProgram(file, nil, nil)
}
func TestTypeCheckProgramEmpty(t *testing.T) {
diff --git a/mshell/TypeChecker.go b/mshell/TypeChecker.go
index 2ac50e88..2bb712cd 100644
--- a/mshell/TypeChecker.go
+++ b/mshell/TypeChecker.go
@@ -101,6 +101,22 @@ type Checker struct {
// type names are NOT stored here — they are recognized directly.
typeEnv map[NameId]TypeId
+ // enumMemberToks records every registered enum member name (value: the
+ // member's declaration token). Enum constructors and user defs share the
+ // word namespace, and enums register before same-file defs — so def
+ // registration checks this to reject a def whose name collides with a
+ // member, mirroring defineEnum rejecting a member that collides with an
+ // existing def or builtin.
+ enumMemberToks map[NameId]Token
+
+ // defNameToks records every registered definition name (value: the def's
+ // name token). Runtime definition lookup is first-match-wins, so a second
+ // def of a name is silently dead code, not an override; def registration
+ // checks this and rejects the duplicate, mirroring the runtime's
+ // FindDuplicateDefinition. Stdlib/init defs register before file defs,
+ // so a script redefining a stdlib name is caught too.
+ defNameToks map[NameId]Token
+
// Quote-body inference state (Phase 7). When inferring is true,
// applySig responds to stack underflow by synthesizing fresh type
// variables instead of reporting an error; those vars accumulate
@@ -394,7 +410,7 @@ func (c *Checker) checkOne(tok Token) {
// values, and it's load-bearing for `[cmd args] ;`-style
// pipelines where forcing the user to quote every word
// would defeat the point.
- if c.listDepth > 0 && tok.Type == LITERAL {
+ if c.listDepth > 0 && (tok.Type == LITERAL || tok.Type == UNDERSCORE) {
c.stack.Push(TidStr)
return
}
@@ -959,6 +975,11 @@ func (c *Checker) unify(got, want TypeId) bool {
return c.unifyQuote(gn, wn)
case TKOverloadedQuote:
return false
+ case TKEnum:
+ // Nominal: two enum types unify only when identical. Equal ids were
+ // already accepted at the top of unify; reaching here means distinct
+ // enums, which never unify.
+ return false
case TKGrid, TKGridView, TKGridRow:
// Phase-3 grids are opaque. Equality-by-id is the only way two grid
// types match; if we got here with same kind but different ids,
diff --git a/mshell/TypeEnum.go b/mshell/TypeEnum.go
new file mode 100644
index 00000000..9180ed96
--- /dev/null
+++ b/mshell/TypeEnum.go
@@ -0,0 +1,88 @@
+package main
+
+// Enum support: registering `enum Name = a | b(T..) | ...` declarations.
+//
+// An enum is a generative tagged sum type. Registration happens in two passes
+// so member payloads may reference any enum regardless of order (including the
+// enum itself):
+//
+// - predeclareEnum interns the name and registers a placeholder TKEnum in the
+// type environment, so the name resolves in any later type position.
+// - defineEnum resolves each member's payload types, finalizes the variant
+// list, and registers each member as a constructor word whose signature
+// consumes the payload and produces the enum (`(T.. -- Enum)`).
+//
+// Members share the global word namespace: a member name that collides with an
+// existing builtin / def / another enum member is rejected.
+
+// predeclareEnum registers the enum's name with a placeholder type. It returns
+// true when the name was newly registered (so defineEnum should finish it), and
+// false for a reserved or duplicate name (an error is recorded).
+func (c *Checker) predeclareEnum(d *MShellEnumDecl) bool {
+ if IsReservedTypeName(d.Name) {
+ c.errors = append(c.errors, TypeError{Kind: TErrReservedTypeName, Pos: d.NameToken, Name: d.Name})
+ return false
+ }
+ if c.typeEnv == nil {
+ c.typeEnv = make(map[NameId]TypeId, 8)
+ }
+ nameId := c.names.Intern(d.Name)
+ if _, exists := c.typeEnv[nameId]; exists {
+ c.errors = append(c.errors, TypeError{Kind: TErrDuplicateTypeName, Pos: d.NameToken, Name: d.Name})
+ return false
+ }
+ c.typeEnv[nameId] = c.arena.MakeEnum(nameId, nil)
+ return true
+}
+
+// defineEnum resolves payload types, finalizes the variant list, and registers
+// constructor words. It must run after predeclareEnum has registered the name.
+func (c *Checker) defineEnum(d *MShellEnumDecl) {
+ nameId := c.names.Intern(d.Name)
+ enumType := c.typeEnv[nameId]
+
+ type member struct {
+ name string
+ tok Token
+ payloads []TypeId
+ }
+ uniq := make([]member, 0, len(d.Members))
+ variants := make([]EnumVariant, 0, len(d.Members))
+ seen := make(map[string]bool, len(d.Members))
+ for i, m := range d.Members {
+ tok := d.MemberToks[i]
+ if seen[m] {
+ c.errors = append(c.errors, TypeError{
+ Kind: TErrTypeParse, Pos: tok,
+ Hint: "duplicate enum member '" + m + "' in '" + d.Name + "'",
+ })
+ continue
+ }
+ seen[m] = true
+
+ var payloads []TypeId
+ for _, p := range d.MemberPayloads[i] {
+ payloads = append(payloads, c.resolveTypeExpr(p, nil))
+ }
+ uniq = append(uniq, member{name: m, tok: tok, payloads: payloads})
+ variants = append(variants, EnumVariant{Name: c.names.Intern(m), Payload: payloads})
+ }
+
+ c.arena.SetEnumVariants(enumType, variants)
+
+ for _, u := range uniq {
+ mid := c.names.Intern(u.name)
+ if _, exists := c.nameBuiltins[mid]; exists {
+ c.errors = append(c.errors, TypeError{
+ Kind: TErrTypeParse, Pos: u.tok,
+ Hint: "enum member '" + u.name + "' conflicts with an existing definition or builtin of the same name",
+ })
+ continue
+ }
+ c.nameBuiltins[mid] = append(c.nameBuiltins[mid], QuoteSig{Inputs: u.payloads, Outputs: []TypeId{enumType}})
+ if c.enumMemberToks == nil {
+ c.enumMemberToks = make(map[NameId]Token, len(uniq))
+ }
+ c.enumMemberToks[mid] = u.tok
+ }
+}
diff --git a/mshell/TypeEnum_test.go b/mshell/TypeEnum_test.go
new file mode 100644
index 00000000..5cd57a7c
--- /dev/null
+++ b/mshell/TypeEnum_test.go
@@ -0,0 +1,170 @@
+package main
+
+import (
+ "strings"
+ "testing"
+)
+
+func TestEnumNullaryDeclAndConstruct(t *testing.T) {
+ errs, ok := parseAndCheck(t, "enum Color = red | green | blue end\ndef describe (Color -- str) c! \"x\" end\nred describe")
+ if !ok || len(errs) != 0 {
+ t.Fatalf("nullary enum decl + construct should pass; errs=%v ok=%v", errs, ok)
+ }
+}
+
+func TestEnumPayloadConstructorSignature(t *testing.T) {
+ // A payload constructor has signature (payload... -- Enum).
+ errs, ok := parseAndCheck(t, "enum R = ok str | failed int str | none2 end\ndef use (R -- str) c! \"x\" end\n404 \"nf\" failed use")
+ if !ok || len(errs) != 0 {
+ t.Fatalf("payload constructor should type-check; errs=%v ok=%v", errs, ok)
+ }
+}
+
+func TestEnumPayloadWrongType(t *testing.T) {
+ errs, ok := parseAndCheck(t, "enum R = ok int end\n\"x\" ok")
+ if ok {
+ t.Fatalf("wrong payload type should fail; errs=%v", errs)
+ }
+}
+
+func TestEnumDistinctNominal(t *testing.T) {
+ // Two enums with parallel members do not unify.
+ src := "enum A = a1 | a2 end\nenum B = b1 | b2 end\ndef takesA (A -- str) c! \"x\" end\nb1 takesA"
+ errs, ok := parseAndCheck(t, src)
+ if ok {
+ t.Fatalf("feeding enum B where A is expected should fail; errs=%v", errs)
+ }
+}
+
+func TestEnumDuplicateMember(t *testing.T) {
+ errs, ok := parseAndCheck(t, "enum E = a | b | a end")
+ if ok {
+ t.Fatalf("duplicate enum member should fail; errs=%v", errs)
+ }
+ if !strings.Contains(strings.Join(errs, "\n"), "duplicate enum member") {
+ t.Fatalf("expected duplicate-member error; errs=%v", errs)
+ }
+}
+
+func TestEnumCrossEnumMemberCollision(t *testing.T) {
+ errs, ok := parseAndCheck(t, "enum E = x1 | shared end\nenum F = shared | y1 end")
+ if ok {
+ t.Fatalf("member name reused across enums should fail; errs=%v", errs)
+ }
+}
+
+func TestEnumReservedName(t *testing.T) {
+ errs, ok := parseAndCheck(t, "enum Maybe = a | b end")
+ if ok {
+ t.Fatalf("enum named with a reserved type name should fail; errs=%v", errs)
+ }
+}
+
+func TestEnumMissingEnd(t *testing.T) {
+ // Without the closing `end`, the declaration is incomplete — a parse error.
+ l := NewLexer("enum Color = red | green | blue\n\"x\" wl", nil)
+ p := NewMShellParser(l)
+ if _, err := p.ParseFile(); err == nil {
+ t.Fatalf("enum without a closing 'end' should be a parse error")
+ }
+}
+
+func TestEnumMatchExhaustive(t *testing.T) {
+ src := "enum Color = red | green | blue end\ngreen match\n red : \"r\" wl,\n green : \"g\" wl,\n blue : \"b\" wl,\nend"
+ errs, ok := parseAndCheck(t, src)
+ if !ok || len(errs) != 0 {
+ t.Fatalf("exhaustive enum match should pass; errs=%v ok=%v", errs, ok)
+ }
+}
+
+func TestEnumMatchNonExhaustive(t *testing.T) {
+ src := "enum Color = red | green | blue end\ngreen match\n red : \"r\" wl,\n blue : \"b\" wl,\nend"
+ errs, ok := parseAndCheck(t, src)
+ if ok {
+ t.Fatalf("non-exhaustive enum match should fail; errs=%v", errs)
+ }
+ if !strings.Contains(strings.Join(errs, "\n"), "missing: green") {
+ t.Fatalf("expected missing-member hint naming 'green'; errs=%v", errs)
+ }
+}
+
+func TestEnumMatchWildcardExhaustive(t *testing.T) {
+ src := "enum Color = red | green | blue end\nred match\n red : \"r\" wl,\n _ : \"o\" wl,\nend"
+ errs, ok := parseAndCheck(t, src)
+ if !ok || len(errs) != 0 {
+ t.Fatalf("wildcard should make enum match exhaustive; errs=%v ok=%v", errs, ok)
+ }
+}
+
+func TestEnumMatchEmptyNonExhaustive(t *testing.T) {
+ // An empty match covers no members and must be rejected.
+ src := "enum Color = red | green | blue end\nred match end"
+ errs, ok := parseAndCheck(t, src)
+ if ok {
+ t.Fatalf("empty match on an enum should be non-exhaustive; errs=%v", errs)
+ }
+}
+
+func TestEnumMatchPayloadBinding(t *testing.T) {
+ src := "enum R = ok str | failed int str | quit end\n404 \"nf\" failed match\n ok s : @s wl,\n failed c e : @e wl,\n quit : \"q\" wl,\nend"
+ errs, ok := parseAndCheck(t, src)
+ if !ok || len(errs) != 0 {
+ t.Fatalf("payload-binding enum match should pass; errs=%v ok=%v", errs, ok)
+ }
+}
+
+func TestEnumRecursivePayload(t *testing.T) {
+ // A member may carry a payload that references the enum itself.
+ src := "enum Tree = leaf int | node Tree Tree end\n3 leaf 4 leaf node"
+ errs, ok := parseAndCheck(t, src)
+ if !ok || len(errs) != 0 {
+ t.Fatalf("self-referential enum payload should type-check; errs=%v ok=%v", errs, ok)
+ }
+}
+
+// parseItemsForTest parses source and returns its top-level items, for tests
+// that feed startup-file declarations to the checker.
+func parseItemsForTest(t *testing.T, src string) []MShellParseItem {
+ t.Helper()
+ l := NewLexer(src, nil)
+ p := NewMShellParser(l)
+ file, err := p.ParseFile()
+ if err != nil {
+ t.Fatalf("parse error: %v", err)
+ }
+ return file.Items
+}
+
+func TestStartupEnumAndTypeVisibleToChecker(t *testing.T) {
+ // `enum` and `type` declarations in a startup file (stdlib / init) are
+ // registered before the main program is checked, so the program can
+ // construct members, match on them, and reference the alias — the same
+ // declarations the runtime registers.
+ startup := parseItemsForTest(t, "enum Status = active | inactive end\ntype Tagged = {name: str, s: Status}")
+ l := NewLexer("active match\n active : \"A\" wl,\n inactive : \"I\" wl,\nend\n{ \"name\": \"x\", \"s\": active } as Tagged drop", nil)
+ p := NewMShellParser(l)
+ file, err := p.ParseFile()
+ if err != nil {
+ t.Fatalf("parse error: %v", err)
+ }
+ errs, ok := TypeCheckProgram(file, nil, startup)
+ if !ok || len(errs) != 0 {
+ t.Fatalf("startup enum/type should be visible to the checker; errs=%v ok=%v", errs, ok)
+ }
+}
+
+func TestDefCollidingWithStartupEnumMemberRejected(t *testing.T) {
+ // The member/def collision check spans files: a program def reusing a
+ // startup enum's member name is rejected, same as a same-file collision.
+ startup := parseItemsForTest(t, "enum E = foo | zz end")
+ l := NewLexer("def foo ( -- int) 42 end\nfoo drop", nil)
+ p := NewMShellParser(l)
+ file, err := p.ParseFile()
+ if err != nil {
+ t.Fatalf("parse error: %v", err)
+ }
+ errs, ok := TypeCheckProgram(file, nil, startup)
+ if ok {
+ t.Fatalf("def colliding with startup enum member should fail; errs=%v", errs)
+ }
+}
diff --git a/mshell/TypeError.go b/mshell/TypeError.go
index 9eabbf53..c4f5b0c3 100644
--- a/mshell/TypeError.go
+++ b/mshell/TypeError.go
@@ -282,6 +282,8 @@ func FormatType(arena *TypeArena, names *NameTable, id TypeId) string {
return "GridView"
case TKGridRow:
return "GridRow"
+ case TKEnum:
+ return names.Name(NameId(n.A))
}
return fmt.Sprintf("<%s #%d>", n.Kind, uint32(id))
}
diff --git a/mshell/TypeParseIntegration.go b/mshell/TypeParseIntegration.go
index 20979c26..1ffd8fc3 100644
--- a/mshell/TypeParseIntegration.go
+++ b/mshell/TypeParseIntegration.go
@@ -35,6 +35,110 @@ func (d *MShellTypeDecl) DebugString() string {
func (d *MShellTypeDecl) GetStartToken() Token { return d.StartTok }
func (d *MShellTypeDecl) GetEndToken() Token { return d.NameToken }
+// MShellEnumDecl is a top-level `enum Name = c1 | c2 T.. | ... end`
+// declaration: a generative tagged sum type. Each member is a constructor
+// name followed by zero or more space-separated payload types, members are
+// separated by `|`, and the body is closed by `end`. MemberPayloads is
+// parallel to Members; an entry is empty for a nullary member.
+type MShellEnumDecl struct {
+ Name string
+ NameToken Token
+ StartTok Token // the ENUM keyword
+ Members []string
+ MemberToks []Token
+ MemberPayloads [][]MShellParseItem
+}
+
+func (d *MShellEnumDecl) ToJson() string {
+ parts := make([]string, len(d.Members))
+ for i, m := range d.Members {
+ parts[i] = fmt.Sprintf("%q", m)
+ }
+ return fmt.Sprintf("{\"kind\": \"enumDecl\", \"name\": %q, \"members\": [%s]}", d.Name, strings.Join(parts, ", "))
+}
+
+func (d *MShellEnumDecl) DebugString() string {
+ return fmt.Sprintf("enum %s = %s end", d.Name, strings.Join(d.Members, " | "))
+}
+
+func (d *MShellEnumDecl) GetStartToken() Token { return d.StartTok }
+func (d *MShellEnumDecl) GetEndToken() Token {
+ if len(d.MemberToks) > 0 {
+ return d.MemberToks[len(d.MemberToks)-1]
+ }
+ return d.NameToken
+}
+
+// ParseEnumDecl handles a top-level enum declaration:
+//
+// enum Name = m1 | m2 T1 T2 | m3 ... end
+//
+// Each member is a bare identifier (LITERAL) followed by zero or more
+// space-separated payload type expressions; members are separated by `|` and
+// the body is closed by `end`. The `end` terminator is what makes the grammar
+// whitespace-insensitive and unambiguous against the following code: a member's
+// payload list runs until the next `|` or the closing `end`, so a nullary
+// member can be followed by an arbitrary `(...)`/`[...]` statement without it
+// being mistaken for a payload. The ENUM keyword is the current token on entry;
+// on return, parser.curr is positioned past the closing `end`.
+func (parser *MShellParser) ParseEnumDecl() (*MShellEnumDecl, error) {
+ startTok := parser.curr
+ parser.NextToken() // consume ENUM
+ if parser.curr.Type != LITERAL {
+ return nil, fmt.Errorf("%d:%d: expected an enum name after 'enum', got %s",
+ parser.curr.Line, parser.curr.Column, parser.curr.Type)
+ }
+ nameTok := parser.curr
+ parser.NextToken() // consume name
+ if parser.curr.Type != EQUALS {
+ return nil, fmt.Errorf("%d:%d: expected '=' in enum declaration, got %s",
+ parser.curr.Line, parser.curr.Column, parser.curr.Type)
+ }
+ parser.NextToken() // consume =
+
+ // Allow an optional leading `|` so members can be written ML-style, one
+ // per line each prefixed with `|`.
+ if parser.curr.Type == PIPE {
+ parser.NextToken()
+ }
+
+ decl := &MShellEnumDecl{Name: nameTok.Lexeme, NameToken: nameTok, StartTok: startTok}
+ var errs []TypeError
+ for {
+ if parser.curr.Type != LITERAL {
+ return nil, fmt.Errorf("%d:%d: expected an enum member name (an identifier), got %s. An enum is written `enum Name = m1 | m2 T ... end`",
+ parser.curr.Line, parser.curr.Column, parser.curr.Type)
+ }
+ memberTok := parser.curr
+ decl.Members = append(decl.Members, memberTok.Lexeme)
+ decl.MemberToks = append(decl.MemberToks, memberTok)
+ parser.NextToken() // consume member name
+
+ // Payload types run until the next `|`, the closing `end`, or EOF.
+ var payloads []MShellParseItem
+ for parser.curr.Type != PIPE && parser.curr.Type != END && parser.curr.Type != EOF {
+ payloads = append(payloads, parser.parseTypePrimary(&errs))
+ }
+ decl.MemberPayloads = append(decl.MemberPayloads, payloads)
+
+ if parser.curr.Type == PIPE {
+ parser.NextToken() // consume |
+ continue
+ }
+ if parser.curr.Type == END {
+ parser.NextToken() // consume end
+ break
+ }
+ // EOF before `end`.
+ return nil, fmt.Errorf("%d:%d: expected 'end' to close the enum declaration '%s'",
+ parser.curr.Line, parser.curr.Column, decl.Name)
+ }
+ if len(errs) > 0 {
+ return nil, fmt.Errorf("enum declaration body: %s", joinTypeErrs(errs))
+ }
+ return decl, nil
+}
+
// MShellAsCast is a postfix `as` cast.
type MShellAsCast struct {
AsToken Token
diff --git a/mshell/TypeUnify.go b/mshell/TypeUnify.go
index d5b9a5b7..c7295499 100644
--- a/mshell/TypeUnify.go
+++ b/mshell/TypeUnify.go
@@ -158,6 +158,10 @@ func (w *typeRewriter) mapType(t TypeId, skip map[TypeVarId]struct{}) TypeId {
return t
}
return w.arena.MakeCommand(argv, CommandCaptureMode(n.B), CommandCaptureMode(n.Extra))
+ case TKEnum:
+ // Nominal and ground: identity is the declaration name and payloads
+ // carry no type variables, so there is nothing to rewrite.
+ return t
case TKQuote:
sig, changed := w.mapSig(w.arena.quoteSigs[n.Extra], skip)
if !changed {
@@ -342,6 +346,11 @@ func (a *TypeArena) walkTypeVars(t TypeId, visit func(TypeVarId) bool) bool {
return true
}
}
+ case TKEnum:
+ // Enums are nominal and ground — payloads are resolved without a
+ // generic scope, so they never contain a type variable. Treat the
+ // enum as a leaf; recursing into payloads would loop forever on a
+ // self-referential enum (e.g. `node Tree Tree`).
case TKQuote:
if a.walkSigVars(a.quoteSigs[n.Extra], visit) {
return true
diff --git a/mshell/lsp.go b/mshell/lsp.go
index a7a34377..b8d55288 100644
--- a/mshell/lsp.go
+++ b/mshell/lsp.go
@@ -41,6 +41,7 @@ type lspServer struct {
envNames map[string]struct{}
candsBuf []string
stdlibDefs []MShellDefinition
+ stdlibItems []MShellParseItem // stdlib top-level items; `type`/`enum` decls registered per diagnostics pass
builtinSigs map[string][]string // name -> formatted "(in -- out)" sigs from the type checker
stdlibHover map[string][]string // name -> formatted sigs for stdlib defs
}
@@ -134,10 +135,11 @@ func RunLSP(in io.Reader, out io.Writer) error {
envNames: make(map[string]struct{}),
}
- if defs, err := loadStdlibDefsForLSP(); err != nil {
+ if defs, items, err := loadStdlibDefsForLSP(); err != nil {
logLSP(fmt.Sprintf("type-check diagnostics: stdlib unavailable (%v); proceeding without stdlib sigs", err))
} else {
server.stdlibDefs = defs
+ server.stdlibItems = items
}
server.builtinSigs, server.stdlibHover = buildHoverIndex(server.stdlibDefs)
@@ -181,23 +183,23 @@ func buildHoverIndex(stdlibDefs []MShellDefinition) (map[string][]string, map[st
// MSHSTDLIB if set, else the version-keyed install path), parses it,
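-// and returns its definitions. The bodies are not evaluated; we only
-// need the signatures to register as builtins for the type-checker.
+// and returns its definitions and top-level items. Bodies are not evaluated;
+// the type-checker needs only the signatures and the `type`/`enum` decls.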
-func loadStdlibDefsForLSP() ([]MShellDefinition, error) {
+func loadStdlibDefsForLSP() ([]MShellDefinition, []MShellParseItem, error) {
stdlibSpec, _, err := getStartupFileSpecs(startupLoadOptions{
version: mshellVersion,
allowEnvOverrides: true,
})
if err != nil {
- return nil, err
+ return nil, nil, err
}
source, err := os.ReadFile(stdlibSpec.path)
if err != nil {
- return nil, err
+ return nil, nil, err
}
parsed, err := parseMShellInput(string(source), &TokenFile{stdlibSpec.path})
if err != nil {
- return nil, err
+ return nil, nil, err
}
- return parsed.Definitions, nil
+ return parsed.Definitions, parsed.Items, nil
}
func (s *lspServer) run() error {
@@ -548,6 +550,7 @@ func (s *lspServer) computeDiagnostics(text string) []protocol.Diagnostic {
names := NewNameTable()
checker := NewChecker(arena, names)
checker.RegisterStdlibSigs(s.stdlibDefs)
+ checker.RegisterStartupTypes(s.stdlibItems)
checker.CheckProgram(file)
errs := checker.Errors()
diff --git a/mshell/lsp_test.go b/mshell/lsp_test.go
index 689d5a33..d92d4a72 100644
--- a/mshell/lsp_test.go
+++ b/mshell/lsp_test.go
@@ -1328,7 +1328,7 @@ func TestCompletionWordIncludesBuiltinAndStdlib(t *testing.T) {
}
func TestBuildHoverIndexCoversTypedBuiltinsAndStdlib(t *testing.T) {
- stdlibDefs, err := loadStdlibDefsForLSP()
+ stdlibDefs, _, err := loadStdlibDefsForLSP()
if err != nil {
t.Skipf("stdlib not available in test environment: %v", err)
}
diff --git a/sublime/msh.sublime-syntax b/sublime/msh.sublime-syntax
index 965d4832..3f9d4a07 100644
--- a/sublime/msh.sublime-syntax
+++ b/sublime/msh.sublime-syntax
@@ -26,7 +26,7 @@ contexts:
scope: keyword.control.msh
- match: '\\*if'
scope: keyword.control.msh
- - match: '\\b(def|end|if|iff|loop|read|str|break|continue|else)\\b'
+ - match: '\\b(def|end|if|iff|loop|read|str|break|continue|else|match|enum|type)\\b'
scope: keyword.control.msh
- match: '\\b(and|or|not)\\b'
scope: keyword.operator.word.msh
@@ -48,6 +48,10 @@ contexts:
numbers:
- match: '\\b0[xX][0-9A-Fa-f]+\\b'
scope: constant.numeric.integer.hex.msh
+ - match: '\\b0[oO][0-7]+\\b'
+ scope: constant.numeric.integer.octal.msh
+ - match: '\\b0[bB][01]+\\b'
+ scope: constant.numeric.integer.binary.msh
- match: '\\b\\d+\\.\\d*(?:[eE][+-]?\\d+)?\\b'
scope: constant.numeric.float.msh
- match: '\\b\\d+(?:[eE][+-]?\\d+)?\\b'
diff --git a/tests/fail/cyclic_render.msh b/tests/fail/cyclic_render.msh
new file mode 100644
index 00000000..ba23816f
--- /dev/null
+++ b/tests/fail/cyclic_render.msh
@@ -0,0 +1,10 @@
+# mshell is strict: a cyclic value (a container appended into itself) is a
+# degenerate artifact of in-place mutation, so converting one to a string or
+# JSON is an error rather than a hang. Equality and sorting on cyclic values
+# still terminate (pointer-identity fast path + pair memoization).
+enum Box = wrap [Box] | z end
+[] x!
+@x wrap e!
+@x @e append drop
+@e dup = str wl
+@e str wl
diff --git a/tests/fail/cyclic_render.msh.stderr b/tests/fail/cyclic_render.msh.stderr
new file mode 100644
index 00000000..f45c549a
--- /dev/null
+++ b/tests/fail/cyclic_render.msh.stderr
@@ -0,0 +1 @@
+10:4: Cannot convert a cyclic value (a container that contains itself) to a string.
diff --git a/tests/fail/duplicate_def.msh b/tests/fail/duplicate_def.msh
new file mode 100644
index 00000000..831b2e97
--- /dev/null
+++ b/tests/fail/duplicate_def.msh
@@ -0,0 +1,5 @@
+# Defining the same name twice is an error: definition lookup is
+# first-match-wins, so the second def would be silently dead code.
+def greet (-- str) "hi" end
+def greet (-- str) "yo" end
+greet wl
diff --git a/tests/fail/duplicate_def.msh.stderr b/tests/fail/duplicate_def.msh.stderr
new file mode 100644
index 00000000..74ff8249
--- /dev/null
+++ b/tests/fail/duplicate_def.msh.stderr
@@ -0,0 +1 @@
+4:5: Duplicate definition 'greet'; already defined at 3:5.
diff --git a/tests/fail/enum_bad_payload_binding.msh b/tests/fail/enum_bad_payload_binding.msh
new file mode 100644
index 00000000..26ed8441
--- /dev/null
+++ b/tests/fail/enum_bad_payload_binding.msh
@@ -0,0 +1,8 @@
+# An enum payload binding must be a plain name (or `_`). A nested pattern (or a
+# keyword/operator token) is rejected at runtime, matching the type checker and
+# the `just`/type-test binding forms.
+enum Box = items [int] | z end
+[1 2 3] items match
+ items [a b] : "matched" wl,
+ z : "z" wl,
+end
diff --git a/tests/fail/enum_bad_payload_binding.msh.stderr b/tests/fail/enum_bad_payload_binding.msh.stderr
new file mode 100644
index 00000000..2682d61e
--- /dev/null
+++ b/tests/fail/enum_bad_payload_binding.msh.stderr
@@ -0,0 +1 @@
+6:3: enum member 'items' payload bindings must be names, not '['a', 'b']'.
diff --git a/tests/success/branded_maybe_match.msh b/tests/success/branded_maybe_match.msh
new file mode 100644
index 00000000..6f4cd0b4
--- /dev/null
+++ b/tests/success/branded_maybe_match.msh
@@ -0,0 +1,28 @@
+# A `Maybe` (or enum, or any type) named via a `type` alias is a distinct brand,
+# but `match` sees through the brand to the underlying type — so a branded
+# Maybe matches by `just v` / `none`, a branded enum by its members, and a
+# branded primitive by a type-keyword arm. The brand is nominal for typing but
+# has no runtime form, matching what the runtime already does.
+enum C = red | green | blue end
+type MC = Maybe[C]
+
+red just as MC match
+ just v : @v str wl,
+ none : "n" wl,
+end
+
+none as MC match
+ just v : @v str wl,
+ none : "n" wl,
+end
+
+# Branded Maybe of a primitive, with a type-keyword binding inside.
+type MI = Maybe[int]
+7 just as MI match
+ just n : @n 1 + str wl,
+ none : "n" wl,
+end
+
+# Branded primitive, matched with a type-keyword + binding.
+type MyInt = int
+5 as MyInt match int n : @n str wl, _ : "o" wl, end
diff --git a/tests/success/branded_maybe_match.msh.stdout b/tests/success/branded_maybe_match.msh.stdout
new file mode 100644
index 00000000..373f91b9
--- /dev/null
+++ b/tests/success/branded_maybe_match.msh.stdout
@@ -0,0 +1,4 @@
+red
+n
+8
+5
diff --git a/tests/success/enum.msh b/tests/success/enum.msh
new file mode 100644
index 00000000..191a7cd1
--- /dev/null
+++ b/tests/success/enum.msh
@@ -0,0 +1,33 @@
+# Enum: declaration, construction, match, and payload-carrying variants.
+enum Color = red | green | blue end
+
+green match
+ red : "red" wl,
+ green : "green" wl,
+ blue : "blue" wl,
+end
+
+enum CmdResult = ok str | failed int str | timeout end
+
+404 "not found" failed match
+ ok out : @out wl,
+ failed c e : [@e " " @c str] "" join wl,
+ timeout : "timeout" wl,
+end
+
+timeout match
+ ok _ : "ok" wl,
+ failed _ _ : "failed" wl,
+ timeout : "timed out" wl,
+end
+
+# Enum used in a def signature, distinct from other enums.
+def label (Color -- str)
+ match
+ red : "is red",
+ green : "is green",
+ blue : "is blue",
+ end
+end
+
+blue label wl
diff --git a/tests/success/enum.msh.stdout b/tests/success/enum.msh.stdout
new file mode 100644
index 00000000..e8de0ada
--- /dev/null
+++ b/tests/success/enum.msh.stdout
@@ -0,0 +1,4 @@
+green
+not found 404
+timed out
+is blue
diff --git a/tests/success/enum_alternating_deep.msh b/tests/success/enum_alternating_deep.msh
new file mode 100644
index 00000000..9ae0be88
--- /dev/null
+++ b/tests/success/enum_alternating_deep.msh
@@ -0,0 +1,25 @@
+# Deeply nested values must render, serialize, and compare without overflowing
+# even when container kinds alternate: rendering/JSON/equality all run on one
+# shared work-stack walker (renderValue / equalsIter) that expands enum, Maybe,
+# list, dict, and pipe inline. Per-type iterative walkers were not enough — an
+# enum→Maybe→enum chain re-entered each type's recursive method and overflowed
+# the Go stack well before this depth.
+enum E = m Maybe[E] | z end
+z e!
+0 i!
+(
+ @i 50000 >= if break end
+ @e just m e!
+ @i 1 + i!
+) loop
+@e str len str wl
+@e toJson len str wl
+z e2!
+0 i!
+(
+ @i 50000 >= if break end
+ @e2 just m e2!
+ @i 1 + i!
+) loop
+@e @e2 = str wl
+@e @e2 just m = str wl
diff --git a/tests/success/enum_alternating_deep.msh.stdout b/tests/success/enum_alternating_deep.msh.stdout
new file mode 100644
index 00000000..731afcab
--- /dev/null
+++ b/tests/success/enum_alternating_deep.msh.stdout
@@ -0,0 +1,4 @@
+450001
+350003
+true
+false
diff --git a/tests/success/enum_branded_match.msh b/tests/success/enum_branded_match.msh
new file mode 100644
index 00000000..a63a4a6a
--- /dev/null
+++ b/tests/success/enum_branded_match.msh
@@ -0,0 +1,27 @@
+# An enum named via a `type` alias is a distinct branded type, but it can still
+# be `match`ed by its members — just as a branded union (`type T = int | str`)
+# is matched by its arms. Exhaustiveness is enforced over the members, payload
+# binding works through the brand, and the brand stays nominal at call
+# boundaries (an explicit `as` is needed to pass an enum where the alias is expected).
+enum C = red | green | blue end
+type Color2 = C
+
+red as Color2 match
+ red : "r" wl,
+ green : "g" wl,
+ blue : "b" wl,
+end
+
+enum R = ok int | failed str | z end
+type R2 = R
+404 as int drop
+5 ok as R2 match
+ ok n : @n str wl,
+ failed m : @m wl,
+ z : "z" wl,
+end
+
+def paint (Color2 -- str)
+ match red: "is red", green: "is green", blue: "is blue", end
+end
+blue as Color2 paint wl
diff --git a/tests/success/enum_branded_match.msh.stdout b/tests/success/enum_branded_match.msh.stdout
new file mode 100644
index 00000000..c75b5130
--- /dev/null
+++ b/tests/success/enum_branded_match.msh.stdout
@@ -0,0 +1,3 @@
+r
+5
+is blue
diff --git a/tests/success/enum_dag_equality.msh b/tests/success/enum_dag_equality.msh
new file mode 100644
index 00000000..064f00a6
--- /dev/null
+++ b/tests/success/enum_dag_equality.msh
@@ -0,0 +1,72 @@
+# Equality and ordering on enum values with shared substructure must not blow
+# up: `@t @t node` reuses one subtree twice per level, so after 64 levels the
+# value is a DAG with 65 nodes but 2^64 tree paths. The comparison walks skip
+# pointer-identical pairs (and, past a step threshold, memoize already-expanded
+# pairs), so these finish instantly; a naive structural walk would run for
+# centuries.
+enum T = leaf int | node T T end
+
+# One DAG, compared against itself / its own reference.
+0 leaf t!
+0 i!
+( @i 64 >= if break end @t @t node t! @i 1 + i! ) loop
+@t @t = str wl
+@t dup = str wl
+[ @t @t ] uniq len str wl
+[ @t @t ] sort len str wl
+
+# Two DAGs built independently: no pointers are shared across the operands, so
+# the pointer fast path never fires — this exercises the memoized mode.
+0 leaf a!
+0 i!
+( @i 64 >= if break end @a @a node a! @i 1 + i! ) loop
+0 leaf b!
+0 i!
+( @i 64 >= if break end @b @b node b! @i 1 + i! ) loop
+@a @b = str wl
+
+# A third DAG differing at the bottom leaf: unequal, found without blowup.
+1 leaf c!
+0 i!
+( @i 64 >= if break end @c @c node c! @i 1 + i! ) loop
+@a @c = str wl
+[ @a @c @b ] sort len str wl
+[ @a @c @b ] uniq len str wl
+
+# Past the dagGuard memo cap (2^18): self-doubling pairs are deduplicated at
+# push time, so independent DAGs deeper than the cap stay linear — a bounded
+# memo alone cannot cover a pending-duplicate working set larger than itself.
+0 leaf p! 0 leaf q! 0 i!
+(
+ @i 300000 >= if break end
+ @p @p node p!
+ @q @q node q!
+ @i 1 + i!
+) loop
+@p @q = str wl
+
+# Dict-shaped self-doubling past the cap: the dict arms dedupe consecutive
+# identical value pairs the same way (an enum with a dict payload closes the
+# same doubling structure through {str: E}).
+enum D = md {str: D} | zd end
+zd u! zd v! 0 i!
+(
+ @i 300000 >= if break end
+ { "l": @u, "r": @u } md u!
+ { "l": @v, "r": @v } md v!
+ @i 1 + i!
+) loop
+@u @v = str wl
+
+# Class closure: a NON-consecutive alternating sharing pattern ([x y x] per
+# level) defeats push-time dedup entirely and, past any bounded memo, every
+# such pattern re-explodes — so the pair memo is unbounded. Any sharing
+# pattern, any container mix, any depth is polynomial in actual nodes.
+[0] xa! [1] ya! [0] xb! [1] yb! 0 i!
+(
+ @i 300000 >= if break end
+ [ @xa @ya @xa ] t! [ @ya @xa @ya ] ya! @t xa!
+ [ @xb @yb @xb ] t! [ @yb @xb @yb ] yb! @t xb!
+ @i 1 + i!
+) loop
+@xa @xb = str wl
diff --git a/tests/success/enum_dag_equality.msh.stdout b/tests/success/enum_dag_equality.msh.stdout
new file mode 100644
index 00000000..f7d26a30
--- /dev/null
+++ b/tests/success/enum_dag_equality.msh.stdout
@@ -0,0 +1,11 @@
+true
+true
+1
+2
+true
+false
+3
+2
+true
+true
+true
diff --git a/tests/success/enum_deep_equals.msh b/tests/success/enum_deep_equals.msh
new file mode 100644
index 00000000..48719b7e
--- /dev/null
+++ b/tests/success/enum_deep_equals.msh
@@ -0,0 +1,17 @@
+# Deeply nested enum values must compare for equality without overflowing:
+# `=` walks enum payloads with an explicit pair stack, not function recursion
+# (mirroring `str`/`toJson`). Build two independent 50000-deep trees and a
+# third that differs only at the very tip; a recursive comparator would
+# overflow the stack on values this deep.
+enum Tree = leaf int | node Tree Tree end
+0 leaf a! 0 leaf b! 0 leaf c! 0 i!
+(
+ @i 50000 >= if break end
+ @a 0 leaf node a!
+ @b 0 leaf node b!
+ @c 0 leaf node c!
+ @i 1 + i!
+) loop
+@c 0 leaf 99 leaf node node c!
+@a @b = str wl
+@a @c = str wl
diff --git a/tests/success/enum_deep_equals.msh.stdout b/tests/success/enum_deep_equals.msh.stdout
new file mode 100644
index 00000000..da29283a
--- /dev/null
+++ b/tests/success/enum_deep_equals.msh.stdout
@@ -0,0 +1,2 @@
+true
+false
diff --git a/tests/success/enum_deep_json.msh b/tests/success/enum_deep_json.msh
new file mode 100644
index 00000000..81175c78
--- /dev/null
+++ b/tests/success/enum_deep_json.msh
@@ -0,0 +1,14 @@
+# A deeply nested enum value must serialize to JSON without overflowing:
+# `toJson` renders enum payloads with an explicit work stack, not function
+# recursion (mirroring `str`/enum_deep_render). Build a 50000-deep tree and
+# print the length of its JSON; a recursive serializer would overflow the
+# stack well before this depth.
+enum Tree = leaf int | node Tree Tree end
+0 leaf t!
+0 i!
+(
+ @i 50000 >= if break end
+ @t 0 leaf node t!
+ @i 1 + i!
+) loop
+@t toJson len str wl
diff --git a/tests/success/enum_deep_json.msh.stdout b/tests/success/enum_deep_json.msh.stdout
new file mode 100644
index 00000000..1ca88b97
--- /dev/null
+++ b/tests/success/enum_deep_json.msh.stdout
@@ -0,0 +1 @@
+1250011
diff --git a/tests/success/enum_deep_render.msh b/tests/success/enum_deep_render.msh
new file mode 100644
index 00000000..952cfe94
--- /dev/null
+++ b/tests/success/enum_deep_render.msh
@@ -0,0 +1,13 @@
+# A deeply nested enum value must stringify without overflowing: `str` renders
+# enum payloads with an explicit work stack, not function recursion. Build a
+# 50000-deep tree and print the length of its rendering (deterministic, and a
+# recursive renderer would overflow the stack well before this depth).
+enum Tree = leaf int | node Tree Tree end
+0 leaf t!
+0 i!
+(
+ @i 50000 >= if break end
+ @t 0 leaf node t!
+ @i 1 + i!
+) loop
+@t str len str wl
diff --git a/tests/success/enum_deep_render.msh.stdout b/tests/success/enum_deep_render.msh.stdout
new file mode 100644
index 00000000..d9515471
--- /dev/null
+++ b/tests/success/enum_deep_render.msh.stdout
@@ -0,0 +1 @@
+700007
diff --git a/tests/success/enum_deep_sort.msh b/tests/success/enum_deep_sort.msh
new file mode 100644
index 00000000..358fb808
--- /dev/null
+++ b/tests/success/enum_deep_sort.msh
@@ -0,0 +1,17 @@
+# Sorting a list of deeply nested enum values must not overflow: compareValues
+# walks payloads with an explicit work stack, not recursion (mirroring `=`,
+# `toJson`, and `str`). Two 50000-deep trees share their whole prefix and differ
+# only at the tip, so comparing them descends the full depth. A recursive
+# comparator would overflow the stack on values this deep.
+enum Tree = leaf int | node Tree Tree end
+0 leaf a! 0 leaf c! 0 i!
+(
+ @i 50000 >= if break end
+ @a 0 leaf node a!
+ @c 0 leaf node c!
+ @i 1 + i!
+) loop
+@c 0 leaf 99 leaf node node c!
+# Sorting is deterministic regardless of input order, and leaves 2 elements.
+[@c @a] sort len str wl
+[@c @a] sort [@a @c] sort = str wl
diff --git a/tests/success/enum_deep_sort.msh.stdout b/tests/success/enum_deep_sort.msh.stdout
new file mode 100644
index 00000000..7600dd4b
--- /dev/null
+++ b/tests/success/enum_deep_sort.msh.stdout
@@ -0,0 +1,2 @@
+2
+true
diff --git a/tests/success/enum_leading_pipe.msh b/tests/success/enum_leading_pipe.msh
new file mode 100644
index 00000000..774f9993
--- /dev/null
+++ b/tests/success/enum_leading_pipe.msh
@@ -0,0 +1,15 @@
+# Members may be written ML-style: one per line, each prefixed with an optional
+# leading `|`.
+enum Suit =
+ | hearts
+ | diamonds
+ | clubs
+ | spades
+end
+
+clubs match
+ hearts : "H" wl,
+ diamonds : "D" wl,
+ clubs : "C" wl,
+ spades : "S" wl,
+end
diff --git a/tests/success/enum_leading_pipe.msh.stdout b/tests/success/enum_leading_pipe.msh.stdout
new file mode 100644
index 00000000..3cc58df8
--- /dev/null
+++ b/tests/success/enum_leading_pipe.msh.stdout
@@ -0,0 +1 @@
+C
diff --git a/tests/success/enum_match_in_quote.msh b/tests/success/enum_match_in_quote.msh
new file mode 100644
index 00000000..ac91746f
--- /dev/null
+++ b/tests/success/enum_match_in_quote.msh
@@ -0,0 +1,16 @@
+# A `match` may be the body of an inferred quotation: the checker synthesizes
+# the quote's input as the match subject (like any other underflow under
+# inference) and pins it from the first arm that names a type — an enum member
+# determines its enum, `just`/`none` determine Maybe. This is the canonical
+# way to consume a list of enum values.
+enum T = leaf int | node T T end
+[ 1 leaf 2 leaf ] (match leaf n : @n, node a b : 0, end) map (str) map "," join wl
+
+enum C = red | green | blue end
+[ red green blue ] (match red : true, green : false, blue : true, end) filter len str wl
+
+[ 5 just none ] (match just v : @v, none : 0, end) map (str) map "," join wl
+
+[1 2 3] (match 1 : "one", _ : "other", end) map "," join wl
+
+[ red green ] (match red : "r" wl, green : "g" wl, blue : "b" wl, end) each
diff --git a/tests/success/enum_match_in_quote.msh.stdout b/tests/success/enum_match_in_quote.msh.stdout
new file mode 100644
index 00000000..8bd43448
--- /dev/null
+++ b/tests/success/enum_match_in_quote.msh.stdout
@@ -0,0 +1,6 @@
+1,2
+2
+5,0
+one,other,other
+r
+g
diff --git a/tests/success/enum_payload_typealias.msh b/tests/success/enum_payload_typealias.msh
new file mode 100644
index 00000000..58c1bbef
--- /dev/null
+++ b/tests/success/enum_payload_typealias.msh
@@ -0,0 +1,19 @@
+# An enum payload may reference a user `type` alias (here a shape), in either
+# declaration order. Previously this failed with "unknown type" because enum
+# payloads were resolved before `type` declarations were registered.
+type Person = {name: str, age: int}
+enum Record = person Person | empty end
+
+{ "name": "Ada", "age": 36 } as Person person match
+ person p : @p :name? wl,
+ empty : "empty" wl,
+end
+
+# Order-independent: enum declared before the type alias it uses.
+enum Cell = num Count | blank end
+type Count = int
+
+5 as Count num match
+ num n : @n str wl,
+ blank : "blank" wl,
+end
diff --git a/tests/success/enum_payload_typealias.msh.stdout b/tests/success/enum_payload_typealias.msh.stdout
new file mode 100644
index 00000000..1a34b8e4
--- /dev/null
+++ b/tests/success/enum_payload_typealias.msh.stdout
@@ -0,0 +1,2 @@
+Ada
+5
diff --git a/tests/success/enum_recursive_generic.msh b/tests/success/enum_recursive_generic.msh
new file mode 100644
index 00000000..51fbf6c3
--- /dev/null
+++ b/tests/success/enum_recursive_generic.msh
@@ -0,0 +1,10 @@
+# Regression: a self-referential enum flowing through a generic parameter
+# triggers the type checker's occurs check. The checker must treat an enum as a
+# leaf when scanning for type variables; otherwise it recurses into the cyclic
+# payload (`node Tree Tree`) forever and overflows the stack.
+enum Tree = leaf int | node Tree Tree end
+
+def ident (q -- q) end
+
+3 leaf ident drop
+"ok" wl
diff --git a/tests/success/enum_recursive_generic.msh.stdout b/tests/success/enum_recursive_generic.msh.stdout
new file mode 100644
index 00000000..9766475a
--- /dev/null
+++ b/tests/success/enum_recursive_generic.msh.stdout
@@ -0,0 +1 @@
+ok
diff --git a/tests/success/enum_render_contexts.msh b/tests/success/enum_render_contexts.msh
new file mode 100644
index 00000000..57c12293
--- /dev/null
+++ b/tests/success/enum_render_contexts.msh
@@ -0,0 +1,12 @@
+# An enum value renders the same way in every context: standalone, inside a
+# list, and via `map` all use the member form (`red`, `leaf(3)`) with no
+# `EnumName.` prefix. (Dicts and `toJson` still use the JSON-tagged form.)
+enum C = red | green | blue end
+red str wl
+[red green blue] str wl
+[red green blue] (str) map "," join wl
+
+enum T = leaf int | node T T end
+3 leaf str wl
+[ 3 leaf 1 leaf ] str wl
+1 leaf 2 leaf node str wl
diff --git a/tests/success/enum_render_contexts.msh.stdout b/tests/success/enum_render_contexts.msh.stdout
new file mode 100644
index 00000000..cde9f53c
--- /dev/null
+++ b/tests/success/enum_render_contexts.msh.stdout
@@ -0,0 +1,6 @@
+red
+[red green blue]
+red,green,blue
+leaf(3)
+[leaf(3) leaf(1)]
+node(leaf(1) leaf(2))
diff --git a/tests/success/enum_str_json.msh b/tests/success/enum_str_json.msh
new file mode 100644
index 00000000..f470bfb3
--- /dev/null
+++ b/tests/success/enum_str_json.msh
@@ -0,0 +1,16 @@
+# Stringifying enum values. `str` renders the member, with payloads after the
+# member name; `toJson` uses the externally-tagged convention. Nullary members
+# render as the bare member name / string.
+enum R = ok str | failed int str | timeout end
+
+"hi" ok str wl
+404 "nf" failed str wl
+timeout str wl
+
+"hi" ok toJson wl
+404 "nf" failed toJson wl
+timeout toJson wl
+
+# Nested payloads render through, member-first.
+enum Tree = leaf int | node Tree Tree end
+1 leaf 2 leaf node 3 leaf node str wl
diff --git a/tests/success/enum_str_json.msh.stdout b/tests/success/enum_str_json.msh.stdout
new file mode 100644
index 00000000..727821a0
--- /dev/null
+++ b/tests/success/enum_str_json.msh.stdout
@@ -0,0 +1,7 @@
+ok(hi)
+failed(404 nf)
+timeout
+{"ok": "hi"}
+{"failed": [404, "nf"]}
+"timeout"
+node(node(leaf(1) leaf(2)) leaf(3))
diff --git a/tests/success/enum_then_quote.msh b/tests/success/enum_then_quote.msh
new file mode 100644
index 00000000..8873b7e4
--- /dev/null
+++ b/tests/success/enum_then_quote.msh
@@ -0,0 +1,7 @@
+# A `(...)` statement after an enum declaration is the following code, not a
+# payload of the last member: the enum body is closed by `end`, so the boundary
+# is unambiguous and whitespace-insensitive.
+enum C = red | green | blue end
+
+(green) x str wl
+[1 2 3] (0 >) filter len str wl
diff --git a/tests/success/enum_then_quote.msh.stdout b/tests/success/enum_then_quote.msh.stdout
new file mode 100644
index 00000000..50af8b94
--- /dev/null
+++ b/tests/success/enum_then_quote.msh.stdout
@@ -0,0 +1,2 @@
+green
+3
diff --git a/tests/success/enum_union_match.msh b/tests/success/enum_union_match.msh
new file mode 100644
index 00000000..1f9232f0
--- /dev/null
+++ b/tests/success/enum_union_match.msh
@@ -0,0 +1,31 @@
+# An enum can be a member of a `type` union, and a `match` discriminates the
+# union by the enum's type name: a bare enum type name (`C`) is a type-test arm
+# that matches any value of that enum, while a non-matching value (here an int,
+# or a value of a different enum) falls through to the next arm.
+enum C = red | green | blue end
+type T = C | int
+
+red as T match
+ C : "a color" wl,
+ int : "an int" wl,
+end
+
+42 as T match
+ C : "a color" wl,
+ int : "an int" wl,
+end
+
+# A union of two enums, discriminated by each enum's type name.
+enum A = a1 | a2 end
+enum B = b1 | b2 end
+type AB = A | B
+
+b1 as AB match
+ A : "an A" wl,
+ B : "a B" wl,
+end
+
+a2 as AB match
+ A : "an A" wl,
+ B : "a B" wl,
+end
diff --git a/tests/success/enum_union_match.msh.stdout b/tests/success/enum_union_match.msh.stdout
new file mode 100644
index 00000000..42a132b5
--- /dev/null
+++ b/tests/success/enum_union_match.msh.stdout
@@ -0,0 +1,4 @@
+a color
+an int
+a B
+an A
diff --git a/tests/success/equality.msh b/tests/success/equality.msh
new file mode 100644
index 00000000..6484e3e2
--- /dev/null
+++ b/tests/success/equality.msh
@@ -0,0 +1,40 @@
+# Equality is total and structural for every value type: containers compare
+# element-wise, and at runtime a type mismatch yields false rather than an
+# error (comparing two different concrete types is itself a static type error,
+# caught by the checker, so this matters for unions and unchecked code).
+
+# Lists (structural, nested, length-sensitive)
+[1 2 3] [1 2 3] = str wl
+[1 2 3] [1 2 4] = str wl
+[[1] [2]] [[1] [2]] = str wl
+[1 2] [1 2 3] = str wl
+
+# str / path / literal compare by text content
+"foo" `foo` = str wl
+
+# Dicts compare structurally, independent of key order
+{ "a": 1, "b": 2 } { "b": 2, "a": 1 } = str wl
+
+# Maybe: None==None, Just==Just (equal / unequal payloads), Just != None, and
+# a Maybe nested in a list all compare structurally.
+none none = str wl
+"x" just "x" just = str wl
+5 just 5 just = str wl
+5 just 6 just = str wl
+5 just none = str wl
+[5 just] [5 just] = str wl
+
+# Enums (including Maybe[enum], the common optional-enum case)
+enum C = red | green end
+red red = str wl
+red green = str wl
+red just red just = str wl
+red just green just = str wl
+
+# uniq now deduplicates any equatable value (lists, enums, ...)
+[[1] [1] [2]] uniq len str wl
+[red green red green] uniq len str wl
+
+# Quotations compare by identity
+(1 +) dup = str wl
+(1 +) (1 +) = str wl
diff --git a/tests/success/equality.msh.stdout b/tests/success/equality.msh.stdout
new file mode 100644
index 00000000..23ac1554
--- /dev/null
+++ b/tests/success/equality.msh.stdout
@@ -0,0 +1,20 @@
+true
+false
+true
+false
+true
+true
+true
+true
+true
+false
+false
+true
+true
+false
+true
+false
+2
+2
+true
+false
diff --git a/tests/success/grid_set_cell_mixed.msh b/tests/success/grid_set_cell_mixed.msh
new file mode 100644
index 00000000..59a5dcef
--- /dev/null
+++ b/tests/success/grid_set_cell_mixed.msh
@@ -0,0 +1,11 @@
+# gridSetCell stores the value even when its type differs from the column's
+# original type: the column is promoted to mixed (generic) storage rather than
+# silently dropping the value, and the other rows are preserved.
+enum C = red | green end
+
+# int column, row 0 set to an enum; row 1 (int 2) is preserved.
+[| "c" ; 1 ; 2 |] "c" 0 red gridSetCell toJson wl
+
+# int column receiving a string, and a string column receiving an int.
+[| "n" ; 1 |] "n" 0 "X" gridSetCell toJson wl
+[| "s" ; "a" |] "s" 0 99 gridSetCell toJson wl
diff --git a/tests/success/grid_set_cell_mixed.msh.stdout b/tests/success/grid_set_cell_mixed.msh.stdout
new file mode 100644
index 00000000..dd332a96
--- /dev/null
+++ b/tests/success/grid_set_cell_mixed.msh.stdout
@@ -0,0 +1,3 @@
+[{"c": "red"}, {"c": 2}]
+[{"n": "X"}]
+[{"s": 99}]
diff --git a/tests/success/sort_structural.msh b/tests/success/sort_structural.msh
new file mode 100644
index 00000000..572e8db1
--- /dev/null
+++ b/tests/success/sort_structural.msh
@@ -0,0 +1,23 @@
+# `sort` reorders the original elements by a total structural order and never
+# changes their type (the old implementation replaced every element with a
+# string, dropping enum payloads). Numbers sort numerically and stay numbers,
+# enums sort by declaration order then payload, dicts by sorted key/value, and a
+# mixed-type list sorts deterministically by a fixed type rank (numbers < text).
+
+# Numbers keep their type (sum works) and sort numerically, not lexically.
+[10 2 1] sort (str) map "," join wl
+[10 2 1] sort sum str wl
+
+# Enums sort by member declaration order (low < medium < high), not by name.
+enum Priority = low | medium | high end
+[high low medium high low] sort (str) map "," join wl
+
+# Same member: payloads break the tie.
+enum Tree = leaf int | node Tree Tree end
+[3 leaf 1 leaf 2 leaf] sort (str) map "," join wl
+
+# Dicts compare by sorted key then value.
+[ { "b": 2, "a": 9 } { "a": 1, "b": 1 } { "a": 1, "b": 2 } ] sort (toJson) map " | " join wl
+
+# Mixed types: fixed type rank (numbers before text), deterministic.
+[hello 1 'c' 'A'] sort (str) map "," join wl
diff --git a/tests/success/sort_structural.msh.stdout b/tests/success/sort_structural.msh.stdout
new file mode 100644
index 00000000..c22eea2b
--- /dev/null
+++ b/tests/success/sort_structural.msh.stdout
@@ -0,0 +1,6 @@
+1,2,10
+13
+low,low,medium,high,high
+leaf(1),leaf(2),leaf(3)
+{"a": 1, "b": 1} | {"a": 1, "b": 2} | {"a": 9, "b": 2}
+1,A,c,hello
diff --git a/tests/success/sort_test.msh b/tests/success/sort_test.msh
index 613e32ac..1d0f888f 100644
--- a/tests/success/sort_test.msh
+++ b/tests/success/sort_test.msh
@@ -1,7 +1,7 @@
-# TODO: re-enable once the sort fix (sort -> [str]) lands; with the uw
-# fix ([str]) this [int | str] input fails type-check until then.
-# "# Basic sort test" wl
-# [hello 1 'c' 'A'] sort uw
+"# Basic sort test" wl
+# sort preserves element types (the int stays an int), so stringify for display.
+# Across types the sort order is by a fixed type rank (numbers before text).
+[hello 1 'c' 'A'] sort (str) map uw
"# Unique sort test" wl
[z y 'x' y z] uniq sort uw
diff --git a/tests/success/sort_test.msh.stdout b/tests/success/sort_test.msh.stdout
index a25f42b4..afd65e1d 100644
--- a/tests/success/sort_test.msh.stdout
+++ b/tests/success/sort_test.msh.stdout
@@ -1,3 +1,8 @@
+# Basic sort test
+1
+A
+c
+hello
# Unique sort test
x
y
diff --git a/tests/success/underscore_argv.msh b/tests/success/underscore_argv.msh
new file mode 100644
index 00000000..9f77ca26
--- /dev/null
+++ b/tests/success/underscore_argv.msh
@@ -0,0 +1,7 @@
+# A lone `_` is the match wildcard, but it remains usable as a bare argv word
+# (the literal string "_") inside a list. The two roles coexist.
+[echo _ arg _] ;
+
+5 match
+ _ : "wild" wl,
+end
diff --git a/tests/success/underscore_argv.msh.stdout b/tests/success/underscore_argv.msh.stdout
new file mode 100644
index 00000000..518b4723
--- /dev/null
+++ b/tests/success/underscore_argv.msh.stdout
@@ -0,0 +1,2 @@
+_ arg _
+wild
diff --git a/tests/success/uniq_enum.msh b/tests/success/uniq_enum.msh
new file mode 100644
index 00000000..6d4fe954
--- /dev/null
+++ b/tests/success/uniq_enum.msh
@@ -0,0 +1,9 @@
+# `uniq` accepts any value type (matching its `([t] -- [t])` signature) and
+# deduplicates by structural equality, so a list of enums dedupes instead of
+# throwing at runtime. First-occurrence order is preserved.
+enum C = red | green | blue end
+
+[red green red blue green red] uniq (str) map "," join wl
+
+# Other equatable values dedupe too (previously a runtime error).
+[true false true true] uniq len str wl
diff --git a/tests/success/uniq_enum.msh.stdout b/tests/success/uniq_enum.msh.stdout
new file mode 100644
index 00000000..41966831
--- /dev/null
+++ b/tests/success/uniq_enum.msh.stdout
@@ -0,0 +1,2 @@
+red,green,blue
+2
diff --git a/tests/typecheck_fail/branded_maybe_nonexhaustive.msh b/tests/typecheck_fail/branded_maybe_nonexhaustive.msh
new file mode 100644
index 00000000..d1fb35bb
--- /dev/null
+++ b/tests/typecheck_fail/branded_maybe_nonexhaustive.msh
@@ -0,0 +1,7 @@
+# Exhaustiveness is enforced through a `type` alias of a Maybe: a branded Maybe
+# match must still cover both `just` and `none` (or use `_`). Here `none` is
+# missing, so it is rejected — the brand does not hide the cases.
+type MI = Maybe[int]
+5 just as MI match
+ just v : @v str wl,
+end
diff --git a/tests/typecheck_fail/duplicate_def.msh b/tests/typecheck_fail/duplicate_def.msh
new file mode 100644
index 00000000..0455839f
--- /dev/null
+++ b/tests/typecheck_fail/duplicate_def.msh
@@ -0,0 +1,4 @@
+# A name defined twice in one file is rejected by the checker.
+def greet (-- str) "hi" end
+def greet (-- str) "yo" end
+greet wl
diff --git a/tests/typecheck_fail/duplicate_def_stdlib.msh b/tests/typecheck_fail/duplicate_def_stdlib.msh
new file mode 100644
index 00000000..c8015faf
--- /dev/null
+++ b/tests/typecheck_fail/duplicate_def_stdlib.msh
@@ -0,0 +1,4 @@
+# Redefining a name already defined by the standard library is rejected:
+# lookup is first-match-wins, so this def could never take effect.
+def id (q -- q) end
+3 id drop
diff --git a/tests/typecheck_fail/enum_branded_nonexhaustive.msh b/tests/typecheck_fail/enum_branded_nonexhaustive.msh
new file mode 100644
index 00000000..7be9ae0c
--- /dev/null
+++ b/tests/typecheck_fail/enum_branded_nonexhaustive.msh
@@ -0,0 +1,10 @@
+# Exhaustiveness is enforced through a `type` alias of an enum: matching a
+# branded enum must still cover every member (or use `_`). Here `blue` is
+# missing, so the match is rejected — the brand does not hide the members.
+enum C = red | green | blue end
+type Color2 = C
+
+red as Color2 match
+ red : "r" wl,
+ green : "g" wl,
+end
diff --git a/tests/typecheck_fail/enum_distinct.msh b/tests/typecheck_fail/enum_distinct.msh
new file mode 100644
index 00000000..b43f407e
--- /dev/null
+++ b/tests/typecheck_fail/enum_distinct.msh
@@ -0,0 +1,11 @@
+# Two enums are nominally distinct even with parallel members: a value of one
+# cannot be used where the other is expected.
+enum A = a1 | a2 end
+enum B = b1 | b2 end
+
+def takesA (A -- str)
+ c!
+ "ok"
+end
+
+b1 takesA wl
diff --git a/tests/typecheck_fail/enum_empty_match.msh b/tests/typecheck_fail/enum_empty_match.msh
new file mode 100644
index 00000000..2c58cdb1
--- /dev/null
+++ b/tests/typecheck_fail/enum_empty_match.msh
@@ -0,0 +1,5 @@
+# An empty match covers no members and has no wildcard, so no arm can ever fire
+# and it always crashes at runtime. It must be rejected as non-exhaustive.
+enum Color = red | green | blue end
+
+red match end
diff --git a/tests/typecheck_fail/enum_match_in_quote_nonexhaustive.msh b/tests/typecheck_fail/enum_match_in_quote_nonexhaustive.msh
new file mode 100644
index 00000000..0b96d55d
--- /dev/null
+++ b/tests/typecheck_fail/enum_match_in_quote_nonexhaustive.msh
@@ -0,0 +1,4 @@
+# Exhaustiveness is enforced inside inferred quotations too: the subject is
+# pinned to the member's enum, so a match that omits a member is rejected.
+enum C = red | green | blue end
+[ red ] (match red : 1, green : 2, end) map drop
diff --git a/tests/typecheck_fail/enum_member_def_collision.msh b/tests/typecheck_fail/enum_member_def_collision.msh
new file mode 100644
index 00000000..7c3a68a3
--- /dev/null
+++ b/tests/typecheck_fail/enum_member_def_collision.msh
@@ -0,0 +1,8 @@
+# Enum constructors and defs share the word namespace, so a def reusing a
+# member name must be rejected (in either textual order — enums register
+# first regardless). Without this, the checker resolves the word to the
+# constructor while the runtime runs the def, and a type-checked program
+# fails at runtime.
+enum E = foo | z end
+def foo ( -- int) 42 end
+foo str wl
diff --git a/tests/typecheck_fail/enum_nonexhaustive.msh b/tests/typecheck_fail/enum_nonexhaustive.msh
new file mode 100644
index 00000000..fce677a0
--- /dev/null
+++ b/tests/typecheck_fail/enum_nonexhaustive.msh
@@ -0,0 +1,8 @@
+# A match on an enum that omits a member (and has no wildcard) is
+# non-exhaustive and must be rejected.
+enum Color = red | green | blue end
+
+red match
+ red : "r" wl,
+ blue : "b" wl,
+end
diff --git a/tests/typecheck_fail/enum_underscore_member.msh b/tests/typecheck_fail/enum_underscore_member.msh
new file mode 100644
index 00000000..772e8dcc
--- /dev/null
+++ b/tests/typecheck_fail/enum_underscore_member.msh
@@ -0,0 +1,10 @@
+# `_` is the wildcard token, reserved as a name, so it cannot be an enum
+# member. (Previously it was accepted and then mis-dispatched at runtime,
+# because the checker treated `_` as a member while the matcher treated it as
+# the catch-all wildcard.)
+enum C = _ | red end
+
+red match
+ _ : "u" wl,
+ red : "r" wl,
+end