Initial enum commit #261
Open
mitchpaulus wants to merge 33 commits into
- str/toJson now render enum payloads. str: `member(p ...)`, built with an
explicit work stack (no recursion) so deeply nested values can't overflow.
toJson: serde externally-tagged (`"m"`, `{"m": v}`, `{"m": [..]}`).
- Fix type-checker stack overflow on recursive enums passed through generics:
walkTypeVars treats an enum as a ground leaf (payloads carry no type vars).
- Reject an empty `match` as non-exhaustive instead of letting it crash at
runtime.
- Language-agnostic regression fixtures: recursive-generic, empty-match,
str/json rendering, and a 50k-deep render overflow guard.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01BAoaBtTQdsLLfYTyfexcVr
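For illustration, the externally-tagged JSON shapes listed above (`"m"`, `{"m": v}`, `{"m": [..]}`) can be sketched in Go as follows. The `Enum` type and the `tagged`/`ToJSON` helpers are hypothetical stand-ins, not the runtime's actual types, and this sketch recurses for brevity where the commit describes an explicit work stack.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Enum is a hypothetical stand-in for a runtime enum value.
type Enum struct {
	Member  string
	Payload []any // ints, strings, or nested Enum values in this sketch
}

// tagged converts a value into the externally-tagged shape that
// encoding/json can marshal directly.
func tagged(v any) any {
	e, ok := v.(Enum)
	if !ok {
		return v // scalars pass through unchanged
	}
	switch len(e.Payload) {
	case 0:
		return e.Member // "m"
	case 1:
		return map[string]any{e.Member: tagged(e.Payload[0])} // {"m": v}
	default:
		items := make([]any, len(e.Payload))
		for i, p := range e.Payload {
			items[i] = tagged(p)
		}
		return map[string]any{e.Member: items} // {"m": [..]}
	}
}

// ToJSON renders v as compact JSON, or "" if marshalling fails.
func ToJSON(v any) string {
	b, err := json.Marshal(tagged(v))
	if err != nil {
		return ""
	}
	return string(b)
}

func main() {
	fmt.Println(ToJSON(Enum{Member: "timeout"}))
	fmt.Println(ToJSON(Enum{Member: "ok", Payload: []any{"done"}}))
	fmt.Println(ToJSON(Enum{Member: "failed", Payload: []any{2, "boom"}}))
}
```

Each object has exactly one key, so the output does not depend on map iteration order.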
mshell has no statement terminator, so after a nullary member a `(...)` on the next line (a quotation, a filter predicate, etc.) was greedily parsed as that member's payload list, making the parenthesized expression vanish.
Require a payload `(` to be attached to the member name (`failed(int str)`); a detached paren belongs to the following code and the member is nullary.
Regression fixture: tests/success/enum_then_quote.msh.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01BAoaBtTQdsLLfYTyfexcVr
Replace the parenthesized payload syntax (which required whitespace-significant
adjacency to avoid swallowing the next statement) with a block terminated by
`end`, like def/if/match/loop:
enum CmdResult = ok str | failed int str | timeout end
Members are `|`-separated; each carries zero or more space-separated payload
types; `end` bounds the member list so it is unambiguous against following code
without relying on whitespace. Parser-only change plus docs, fixtures, and Go
tests updated to the new surface.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01BAoaBtTQdsLLfYTyfexcVr
The design doc advertised ML-style member lists (one per line, each prefixed with `|`), but the parser rejected a leading `|` after `=`. Accept an optional leading `|` so `enum E =\n | a\n | b\nend` parses; the regular `a | b` form is unchanged. Regression fixture: enum_leading_pipe.msh. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01BAoaBtTQdsLLfYTyfexcVr
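A rough sketch of the `end`-terminated member list, including the optional leading `|`, as a token-level parser in Go. This is not the project's parser: `ParseEnum` and `Member` are hypothetical names, and the sketch splits on whitespace, so it assumes `=` and `|` are surrounded by spaces or newlines.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// Member is one enum member with its space-separated payload types.
type Member struct {
	Name    string
	Payload []string
}

// ParseEnum parses: enum NAME = [|] member {type} { | member {type} } end
func ParseEnum(src string) (string, []Member, error) {
	toks := strings.Fields(src)
	if len(toks) < 4 || toks[0] != "enum" || toks[2] != "=" {
		return "", nil, errors.New("expected 'enum NAME = ... end'")
	}
	name := toks[1]
	i := 3
	if toks[i] == "|" { // optional leading pipe (one member per line style)
		i++
	}
	var members []Member
	for {
		if i >= len(toks) {
			return "", nil, errors.New("missing 'end'")
		}
		if toks[i] == "|" || toks[i] == "end" {
			return "", nil, errors.New("expected member name")
		}
		m := Member{Name: toks[i]}
		i++
		// Everything up to the next '|' or 'end' is a payload type.
		for i < len(toks) && toks[i] != "|" && toks[i] != "end" {
			m.Payload = append(m.Payload, toks[i])
			i++
		}
		members = append(members, m)
		if i >= len(toks) {
			return "", nil, errors.New("missing 'end'")
		}
		if toks[i] == "end" {
			return name, members, nil // 'end' bounds the list; later code is untouched
		}
		i++ // skip '|'
	}
}

func main() {
	name, members, err := ParseEnum("enum CmdResult = ok str | failed int str | timeout end")
	fmt.Println(name, members, err)
}
```

Because `end` closes the list, a parenthesized expression on the following line can never be mistaken for a payload.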
Previously `_` was an ordinary LITERAL special-cased by string comparison, so it could be used as an identifier — including as an enum member name, which the checker treated as a coverable member while the runtime treated `_` as the catch-all wildcard, type-checking a program that mis-dispatched at runtime. Lex a lone `_` as a new UNDERSCORE token. It is the wildcard / ignore marker in all pattern positions (match arm, list/dict element, `just _`, `<type> _`, enum payload binding), is rejected wherever a name is expected (enum member, def name, ...), and remains usable as a bare argv word (the literal string "_") in a list. `..._`, `_foo`, and `_!` are unaffected. Regression fixtures: typecheck_fail/enum_underscore_member.msh and success/underscore_argv.msh. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01BAoaBtTQdsLLfYTyfexcVr
Enum payload types were resolved in the pre-pass before `type` declarations were registered, so a payload referencing a `type` alias (a shape, union, etc.) failed with "unknown type" regardless of source order — only primitives and other enums worked. Reorder the type-check pre-pass: predeclare enum names, then register `type` declarations, then resolve enum payload bodies and constructors. Now an enum payload may reference any enum or `type` alias in either direction. Regression fixture: tests/success/enum_payload_typealias.msh. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01BAoaBtTQdsLLfYTyfexcVr
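The reordered pre-pass can be pictured with a small Go sketch. It assumes a simplified model where each declaration only lists the type names it refers to; `Decl` and `Prepass` are hypothetical names, not the checker's API.

```go
package main

import "fmt"

// Decl is a hypothetical top-level declaration: an enum or a type alias,
// together with the type names its body refers to.
type Decl struct {
	Kind string // "enum" or "type"
	Name string
	Refs []string
}

// Prepass returns an error if any enum payload names an unknown type.
func Prepass(decls []Decl) error {
	known := map[string]bool{"int": true, "str": true, "float": true, "bool": true}

	// Phase 1: predeclare enum names so aliases and payloads can refer to them.
	for _, d := range decls {
		if d.Kind == "enum" {
			known[d.Name] = true
		}
	}
	// Phase 2: register type aliases.
	for _, d := range decls {
		if d.Kind == "type" {
			known[d.Name] = true
		}
	}
	// Phase 3: resolve enum payload bodies now that every name is visible.
	for _, d := range decls {
		if d.Kind != "enum" {
			continue
		}
		for _, r := range d.Refs {
			if !known[r] {
				return fmt.Errorf("enum %s: unknown type %q", d.Name, r)
			}
		}
	}
	return nil
}

func main() {
	decls := []Decl{
		{Kind: "enum", Name: "Shape", Refs: []string{"Point", "int"}},
		{Kind: "type", Name: "Point"}, // declared after its use
	}
	fmt.Println(Prepass(decls))
}
```

Since resolution happens only in the last phase, source order between an enum and the alias it references does not matter.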
uniq is typed `([t] -- [t])` (generic) but its runtime only deduplicated a fixed set of primitives and threw for anything else — so `[enum] uniq`, `[dict] uniq`, `[bool] uniq` type-checked but failed at runtime, against the type system's goal of catching such errors statically (or not having them). Deduplicate any value without a fast hash path by structural equality (`Equals`), keeping the primitive fast paths. Enums, dicts, and booleans now dedupe correctly; values whose equality is undefined (lists) are kept rather than erroring, consistent with `=` on them. Regression fixture: tests/success/uniq_enum.msh. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01BAoaBtTQdsLLfYTyfexcVr
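A minimal Go sketch of the dedup strategy described above: hashable primitives take a map-based fast path, and everything else falls back to a linear scan with structural equality. The names (`Uniq`, `enumVal`, `structuralEquals`) are hypothetical, and `reflect.DeepEqual` stands in for the runtime's `Equals`.

```go
package main

import (
	"fmt"
	"reflect"
)

// enumVal is a hypothetical enum value; its slice field makes it unusable
// as a Go map key, so it has no fast hash path.
type enumVal struct {
	Member  string
	Payload []any
}

// structuralEquals stands in for the runtime's Equals in this sketch.
func structuralEquals(a, b any) bool { return reflect.DeepEqual(a, b) }

// Uniq keeps the first occurrence of each value, preserving order.
func Uniq(items []any, equals func(a, b any) bool) []any {
	seenInt := map[int]struct{}{}
	seenStr := map[string]struct{}{}
	var slow []any // kept values that have no fast hash path
	out := make([]any, 0, len(items))
	for _, it := range items {
		switch v := it.(type) {
		case int: // primitive fast path
			if _, dup := seenInt[v]; dup {
				continue
			}
			seenInt[v] = struct{}{}
		case string: // primitive fast path
			if _, dup := seenStr[v]; dup {
				continue
			}
			seenStr[v] = struct{}{}
		default: // structural-equality fallback instead of an error
			dup := false
			for _, s := range slow {
				if equals(s, it) {
					dup = true
					break
				}
			}
			if dup {
				continue
			}
			slow = append(slow, it)
		}
		out = append(out, it)
	}
	return out
}

func main() {
	red := enumVal{Member: "red"}
	fmt.Println(len(Uniq([]any{1, 1, "a", "a", red, red, true, true}, structuralEquals)))
}
```

The fallback is quadratic in the number of slow-path values, which is the price of accepting any value the type signature allows.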
Equality was undefined (always errored) for lists, quotations, pipes, and grids, and several types errored on a type mismatch while others returned false, so equality could depend on operand order, and `[1] [1] =` threw.
Define `Equals` for every type and make it total:
- list / pipe: structural, element-wise (recursive)
- grid / gridview / gridrow: structural, cell-wise (by materialized rows)
- quotation: identity
- str / path / literal: compare by text content (symmetric)
- a type mismatch yields false rather than erroring, so equality is order-independent and union members (e.g. int | null) compare cleanly.
Genuinely incompatible comparisons remain a static type error, caught by the checker before runtime. uniq now dedupes lists/grids too via this equality.
Regression fixture: tests/success/equality.msh.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01BAoaBtTQdsLLfYTyfexcVr
A grid column uses typed columnar storage (int/float/string/datetime/generic). GridColumn.Set only stored a value when its Go type matched the column type; for any other type it was a silent no-op, so setting a string, enum, bool, or list into a typed column type-checked (gridSetCell is `(Grid str int t -- Grid)`) but left the cell unchanged. Promote the column to generic storage on a type mismatch: materialize the existing typed data, switch to COL_GENERIC, then store the value. Matching-type sets keep the fast typed path, and other rows are preserved. Regression fixture: tests/success/grid_set_cell_mixed.msh. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01BAoaBtTQdsLLfYTyfexcVr
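The promotion step can be sketched in Go roughly as below, reduced to a single typed kind (int) plus generic storage. `Column`, `ColInt`, and `ColGeneric` are hypothetical simplifications of the runtime's GridColumn and COL_GENERIC.

```go
package main

import "fmt"

type ColKind int

const (
	ColInt ColKind = iota
	ColGeneric
)

// Column is a simplified typed column: either packed ints or generic cells.
type Column struct {
	Kind    ColKind
	Ints    []int64
	Generic []any
}

// Set stores v at row, promoting the column to generic storage when the
// value's Go type does not match the typed storage.
func (c *Column) Set(row int, v any) {
	if c.Kind == ColInt {
		if n, ok := v.(int64); ok {
			c.Ints[row] = n // matching type: keep the fast typed path
			return
		}
		// Mismatch: materialize the existing typed data, then switch kinds.
		c.Generic = make([]any, len(c.Ints))
		for i, n := range c.Ints {
			c.Generic[i] = n
		}
		c.Ints = nil
		c.Kind = ColGeneric
	}
	c.Generic[row] = v
}

// Get returns the cell at row regardless of storage kind.
func (c *Column) Get(row int) any {
	if c.Kind == ColInt {
		return c.Ints[row]
	}
	return c.Generic[row]
}

func main() {
	c := &Column{Kind: ColInt, Ints: []int64{1, 2, 3}}
	c.Set(1, int64(9)) // stays typed
	c.Set(2, "x")      // promotes to generic
	fmt.Println(c.Kind == ColGeneric, c.Get(0), c.Get(1), c.Get(2))
}
```

The other rows survive promotion because the typed data is copied before the kind changes.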
Adversarial testing of the new enums surfaced four issues, fixed here:
- toJson and Equals on a deeply nested enum overflowed the Go stack (unlike str, which already used an explicit work stack). Both now walk payloads iteratively, so arbitrarily deep values are safe. Equality also no longer re-enters itself for enum-vs-non-enum pairs.
- An enum used inside a `type` union (`type T = C | int`) type-checked when matched by the enum's type name, but had no runtime implementation, so it always failed with "No matching arm found". The runtime now treats a bare enum type name as a type-test arm matching any member of that enum, and falls through cleanly for other union members.
- `sort` replaced every element with its string form, silently dropping enum payloads and changing element types (a list of ints came back as lexically-sorted strings). It now sorts and preserves the original objects using a total structural order: numbers numerically, text lexically, lists positionally, dicts by sorted key/value, enums by declaration order then payload, and different types by a fixed type rank. compareValues is iterative, so sorting deeply nested values is stack-safe too. sortV keeps its string-key behavior but now preserves elements.
Docs, CHANGELOG, and regression tests (deep json/equals/sort, union match, structural sort) updated. All suites green: tests 213, typecheck 196, go test.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01E3aH9BBud5rjDoHUF8daaJ
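To make the "total structural order" concrete, here is a small Go sketch covering only numbers, text, and lists. The specific rank order (numbers, then text, then lists) is an assumption for illustration, since the commit only says the rank is fixed. The sketch recurses for brevity, whereas the commit describes compareValues as iterative.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// rank assigns each supported kind a fixed position (assumed order).
func rank(v any) int {
	switch v.(type) {
	case int, float64:
		return 0
	case string:
		return 1
	case []any:
		return 2
	}
	return 3
}

// num widens an int or float64 to float64 for numeric comparison.
func num(v any) float64 {
	if i, ok := v.(int); ok {
		return float64(i)
	}
	f, _ := v.(float64)
	return f
}

// Compare returns a negative, zero, or positive result, and is defined for
// every pair of supported values, including mixed types.
func Compare(a, b any) int {
	ra, rb := rank(a), rank(b)
	if ra != rb {
		return ra - rb // different types: order by type rank
	}
	switch ra {
	case 0: // numbers numerically
		x, y := num(a), num(b)
		if x < y {
			return -1
		}
		if x > y {
			return 1
		}
		return 0
	case 1: // text lexically
		return strings.Compare(a.(string), b.(string))
	case 2: // lists positionally, shorter prefix first
		la, lb := a.([]any), b.([]any)
		for i := 0; i < len(la) && i < len(lb); i++ {
			if c := Compare(la[i], lb[i]); c != 0 {
				return c
			}
		}
		return len(la) - len(lb)
	}
	return 0
}

// SortValues sorts in place and keeps the original objects.
func SortValues(items []any) {
	sort.SliceStable(items, func(i, j int) bool { return Compare(items[i], items[j]) < 0 })
}

func main() {
	items := []any{10, 9, "b", 2.5, "a"}
	SortValues(items)
	fmt.Println(items)
}
```

Unlike the old string-form sort, `10` sorts after `9` and both remain ints.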
The runtime enum payload-binding loop bound by token lexeme and silently skipped non-token items, so it accepted match arms the type checker rejects: an operator-token name like `ok x` (x lexes as INTERPRET) was bound, and a malformed binding like `items [a b]` was ignored while the arm still matched. Now each payload binding must be a plain name (LITERAL) or the `_` wildcard, failing with a clear message otherwise. This mirrors the checker (enumMemberPattern) and the `just`/type-test binding forms, whose runtime and checker already agree, so all three binding forms are now consistent across type-check and run. Adds tests/fail/enum_bad_payload_binding.msh. All suites green: tests 214, typecheck 197, go test. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01E3aH9BBud5rjDoHUF8daaJ
Naming an enum via `type Color2 = C` wraps it in a TKBrand, so the checker's match logic (which tests Kind == TKEnum) no longer recognized the members: a match on the aliased type was rejected as "unrecognized pattern", even though brands are runtime-erased and the value matched correctly at run time. This was inconsistent with a branded union: `type T = int | str` stays a TKUnion node, so its arms remain matchable through the brand. Enums took the opaque-wrapper path instead. Now enumMemberPattern and CheckMatchExhaustive unwrap a TKBrand to its underlying enum before dispatching, so a branded enum matches (and enforces exhaustiveness over) its members exactly like a branded union matches its arms. The brand stays nominal at value boundaries (an explicit `as` is still needed to pass an enum where the alias is expected). Checker-only change; the runtime already matched branded enums correctly. Tests: tests/success/enum_branded_match.msh and tests/typecheck_fail/enum_branded_nonexhaustive.msh. All suites green: tests 215, typecheck 199, go test. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01E3aH9BBud5rjDoHUF8daaJ
# Conflicts:
# CHANGELOG.md
# mshell/Type.go
# tests/success/sort_test.msh
A list's ToString renders elements via DebugString, and the enum's DebugString was the only value type to inject an `EnumName.` prefix. So an enum inside a list printed as `[C.red C.green]`, inconsistent with its standalone form (`red`), `map`-ed form (`red`), and dict/JSON form (`"red"`). DebugString now returns the same member form as ToString, so an enum renders identically in every context. The member name is globally unique, so the type prefix added no disambiguation (unlike the quotes a string's DebugString adds). Test: tests/success/enum_render_contexts.msh. Suites green: tests 217, typecheck 202, go test. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01E3aH9BBud5rjDoHUF8daaJ
Maybe.Equals asserted other.(Maybe) (the value form), but the runtime
constructs Maybe values as *Maybe pointers everywhere (just/none and every
lookup/parse builtin push &Maybe{...}). So the assertion missed every operand
and the method bailed to false — making *all* Maybe-vs-Maybe comparisons
false, including `none none =`, `5 just 5 just =`, and `Maybe[enum]` equality.
By extension `uniq` could not dedupe Maybes, and any list/dict/enum containing
a Maybe compared unequal to an identical value.
Now it unwraps other via the existing asMaybe helper (value or pointer),
matching how the match code and compareValues already handle both forms. The
receiver side already worked via Go's value-method promotion on *Maybe.
The equality.msh test had baked the wrong answers into its expected output
(None==None and Just==Just recorded as false); corrected and expanded with
real Just/None and Maybe[enum] assertions.
Suites green: tests 217, typecheck 203, go test.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01E3aH9BBud5rjDoHUF8daaJ
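The Go behavior behind this bug is easy to reproduce in isolation. In the sketch below, `Maybe` is a simplified stand-in, and the body of `asMaybe` is a guess at what such a helper might look like (the commit only names it). A type assertion to the value type `Maybe` does not match an interface holding `*Maybe`, while the receiver side works because value methods are in the method set of the pointer type.

```go
package main

import "fmt"

// Maybe is a simplified optional int.
type Maybe struct {
	Has bool
	Val int
}

// asMaybe accepts both the value form and the pointer form.
func asMaybe(v any) (Maybe, bool) {
	switch m := v.(type) {
	case Maybe:
		return m, true
	case *Maybe:
		if m != nil {
			return *m, true
		}
	}
	return Maybe{}, false
}

// EqualsValueOnly reproduces the bug: the assertion misses *Maybe operands.
func (m Maybe) EqualsValueOnly(other any) bool {
	o, ok := other.(Maybe)
	return ok && m == o
}

// Equals unwraps either form before comparing.
func (m Maybe) Equals(other any) bool {
	o, ok := asMaybe(other)
	return ok && m == o
}

func main() {
	a, b := &Maybe{Has: true, Val: 5}, &Maybe{Has: true, Val: 5}
	// Calling a value method on *Maybe works on the receiver side.
	fmt.Println(a.EqualsValueOnly(b)) // false: other holds *Maybe, not Maybe
	fmt.Println(a.Equals(b))          // true
}
```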
… forms The branded-enum fix unwrapped the TKBrand in two per-site spots (enumMemberPattern and CheckMatchExhaustive), so a branded enum matched its members — but the `just`/`none` and `<typekeyword> name` binding paths in armPatternOf still checked the raw subject kind. So matching a branded Maybe (e.g. `type MC = Maybe[C]`, a named optional enum) by `just v`/`none` was rejected by the checker (`@v` unbound / non-exhaustive) even though the runtime matched it fine — brands are runtime-erased. Unwrap the brand once where the match subject is established (checkMatchBlock), so every arm form — enum member, `just v`, `<type> name`, list/dict — and the exhaustiveness check see the underlying type uniformly. The two per-site unwraps are removed (redundant now); enumMemberPattern and CheckMatchExhaustive get the already-unwrapped subject. Brands stay nominal at value boundaries (an `as` is still needed to pass an enum where the alias is expected). Tests: tests/success/branded_maybe_match.msh and tests/typecheck_fail/branded_maybe_nonexhaustive.msh. Suites green: tests 218, typecheck 205, go test. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01E3aH9BBud5rjDoHUF8daaJ
A value built by reusing one subtree twice per level (`@t @t node` in a loop) is a DAG: n heap nodes but 2^n paths when walked as a tree. Equality and ordering walked it structurally with no notion of sharing, so `=`, `uniq`, and `sort` on such values went exponential: depth 24 took 0.7s, each +1 level doubled it, and depth 40 (41 actual nodes) would run for hours. Measured, not theoretical.
Two layers of defense, both zero-cost for ordinary values:
- sameRef fast path: a pointer-identical pair is equal by definition, so the enum Equals pair loop, compareValues, and itemsEqual skip it instead of expanding. This alone collapses every same-reference case (self compare, dup, a shared subtree meeting itself) from 2^n to n. Typed per-kind pointer comparisons, so it can never hit Go's non-comparable interface panic (e.g. MShellBinary).
- dagGuard threshold memo: two *independently built* DAGs share no pointers across operands, so the fast path never fires. The walks count pops; past 2^19 steps they memoize already-expanded enum/list/dict pointer pairs and skip repeats, making the comparison polynomial in actual nodes. Sound in a LIFO walk: a duplicate only pops after the first occurrence's expansion fully resolved, and any mismatch returns immediately. The memo is capped (2^18 entries) so a huge linear value cannot balloon memory; below the threshold the guard is one integer increment and never allocates.
Depth-40 self-compare: was ~13 hours extrapolated, now 0.035s. The depth-64 regression test (tests/success/enum_dag_equality.msh) covers both modes plus uniq/sort and an unequal tip. Deep linear values (50k suite tests, 4M manual) are unaffected.
Suites green: tests 219, typecheck 206, go test.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01E3aH9BBud5rjDoHUF8daaJ
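The two defenses can be sketched on a binary node type in Go. This is a simplification under stated assumptions: `Node`, `Equal`, and `build` are hypothetical, and the memo is active from the first step rather than only past a step threshold, and it is not capped.

```go
package main

import "fmt"

// Node is a hypothetical two-child value; sharing L and R makes a DAG.
type Node struct {
	Val  int
	L, R *Node
}

// Equal compares two values with an explicit pair stack.
func Equal(a, b *Node) bool {
	type pair struct{ a, b *Node }
	seen := map[pair]struct{}{} // pointer pairs already expanded
	stack := []pair{{a, b}}
	for len(stack) > 0 {
		p := stack[len(stack)-1]
		stack = stack[:len(stack)-1]
		if p.a == p.b {
			continue // same reference (or both nil): equal by definition
		}
		if p.a == nil || p.b == nil {
			return false
		}
		if _, ok := seen[p]; ok {
			continue // this pair was already expanded; any mismatch returns early
		}
		seen[p] = struct{}{}
		if p.a.Val != p.b.Val {
			return false
		}
		stack = append(stack, pair{p.a.L, p.b.L}, pair{p.a.R, p.b.R})
	}
	return true
}

// build reuses one subtree twice per level: depth+1 nodes, 2^depth paths.
func build(depth, tip int) *Node {
	t := &Node{Val: tip}
	for i := 0; i < depth; i++ {
		t = &Node{Val: i, L: t, R: t}
	}
	return t
}

func main() {
	x := build(64, 0)
	fmt.Println(Equal(x, x))            // same-reference fast path
	fmt.Println(Equal(x, build(64, 0))) // independent DAG: memo keeps it linear
	fmt.Println(Equal(x, build(64, 1))) // unequal tip is still detected
}
```

Without the `seen` map, the second comparison would visit on the order of 2^64 pairs.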
The spec'd invariant — enum members share the word namespace, collisions are
declaration errors — was only enforced in one direction. defineEnum (pre-pass
1b) checks nameBuiltins, which at that point holds Go builtins and stdlib
sigs, so a member colliding with those is caught. But user def signatures
register in pre-pass 2, after the enums, with no reverse check — so a member
colliding with a same-file def was silently accepted in either textual order.
That was a real soundness hole: the checker resolved the shared word to the
enum constructor (e.g. when the context demanded the enum type) while the
runtime resolves definitions before enum members and ran the def — so a
cleanly type-checked program failed at runtime ("Unknown match pattern
literal") or diverged in stack shape for payload constructors.
defineEnum now records registered member names (enumMemberToks), and def
registration rejects a def whose name is a member, mirroring the existing
member-vs-def error. Def-vs-def duplication is untouched (a separate,
deferred decision).
Test: tests/typecheck_fail/enum_member_def_collision.msh. Suites green:
tests 219, typecheck 207, go test.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01E3aH9BBud5rjDoHUF8daaJ
`enum` declarations only functioned in the main script: RegisterEnums had a
single call site on the script path, so an enum declared in the stdlib, the
user init file, or at the interactive prompt parsed and evaluated as a no-op
and its member words fell through to the bare-literal path ("Found literal
token"). `type` aliases in startup files were similarly invisible to the
checker, whose pre-passes walked only the main file's items.
Type-checking and running are separate today but should semantically match,
so both sides now read startup declarations:
- Runtime: loadStartupFile registers each startup file's enums on the
EvalState (covering script and interactive sessions), and the REPL line
executor registers enums declared at the prompt, next to its existing
def handling.
- Checker: loadStartupFile retains startup top-level items;
TypeCheckProgram passes them to the new Checker.RegisterStartupTypes,
which runs the same three-phase pre-pass order as CheckProgram (enum
names, type aliases, enum bodies + constructor words). Declaration
bodies are not type-checked, matching the stdlib-def treatment. The LSP
diagnostics pass registers the stdlib's items the same way.
Collision checks now span files in both directions: a startup enum member
rejects a colliding program def, and vice versa.
Tests: startup enum/type visible to checker + cross-file collision
(TypeEnum_test.go), startup-file enum registers constructors and constructs
at runtime (Startup_test.go). Suites green: tests 219, typecheck 207,
go test.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01E3aH9BBud5rjDoHUF8daaJ
…tainers
Deep values with ALTERNATING container kinds crashed with a fatal Go stack overflow: `enum E = m Maybe[E] | z end` built ~4M deep (a linked list with optional next, a natural shape) killed `str`, `toJson`, and `=`. The earlier per-type stack-safety fixes only covered pure enum nesting: each iterative walker delegated non-enum payloads to that child's own recursive method (Maybe.ToString, Maybe.Equals, list/dict ToJson, ...), so every enum→Maybe→enum cycle added Go frames and alternation restored O(depth) recursion.
compareValues was already immune because it expands every kind inline; this applies the same design to the other two walks:
- renderValue(obj, flavor): one work-stack renderer for ToString / DebugString / ToJson, expanding enum, Maybe, list, dict, and pipe inline with each kind's exact existing format (flavor switches per child the way the old methods did: list children render as DebugString, a dict's `str` form is its JSON, enum payloads render as ToString). Scalars, grids, and quotations stay leaves via their own methods.
- equalsIter(a, b): one pair-stack equality walk expanding the same kinds inline, keeping the sameRef fast path and the dagGuard shared-substructure memo that previously lived only on the enum walk (so lists/dicts/Maybes now get DAG protection too).
All the per-type methods (enum, Maybe, list, dict, pipe) are one-line routings into the shared walkers; enumRender, the enum ToJson walker, the old recursive bodies, itemsEqual, and DebugStrs are deleted. Net -30 lines.
Output is byte-identical across the suites, with two deliberate changes: multi-key dict DebugString now emits sorted keys (it previously iterated Go map order, which is nondeterministic), and dict equality no longer short-circuits on a TypeName mismatch, so a str and a literal with equal text compare equal inside dicts exactly as they do at top level.
The 4M alternating chain now renders (18M chars), serializes (14M), and compares cleanly; enum↔dict alternation at 500k likewise.
Regression test: tests/success/enum_alternating_deep.msh (50k, mirroring the deep-test family). Suites green: tests 220, typecheck 208, go test.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01E3aH9BBud5rjDoHUF8daaJ
A list appended into itself (in-place `append`) is a genuinely cyclic value, and an enum payload list can close a cycle through the enum (`enum Box = wrap [Box] | z end`). Rendering one never terminated: the work-stack renderer re-expanded the same pointer forever, growing output and task stack without bound (the old recursive renderers at least died fast with a stack overflow). Equality and sorting already terminate via the pointer-identity fast path and pair memoization. mshell is strict — a cycle is always the degenerate artifact of appending a container into itself, never a meaningful value — so user-facing conversion now errors: `str` and `toJson` on a cyclic value fail with "Cannot convert a cyclic value (a container that contains itself) to a string/JSON". renderValueDetect tracks the containers currently being expanded as an on-path set (pointer kinds only; a DAG merely revisits a *finished* pointer and still renders fine), unwinding via exit sentinels. Reaching an on-path pointer emits a `<cycle>` marker and reports cycled=true. Internal rendering (DebugString in error messages, stack dumps) stays total via the marker — those paths cannot propagate errors and must never hang. The on-path lookup is gated on the pointer-kind check, since hashing an interface holding an unhashable dynamic type (MShellBinary, a []byte) panics even on a map read. Test: tests/fail/cyclic_render.msh (cycle equality terminates, then `str` errors). Suites green: tests 221, typecheck 208, go test. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01E3aH9BBud5rjDoHUF8daaJ
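The on-path set with exit sentinels can be sketched for a single container kind in Go. `List`, `work`, and `Render` are hypothetical names. The renderer returns the text plus a `cycled` flag, leaving it to the caller to turn the flag into an error for user-facing conversion.

```go
package main

import (
	"fmt"
	"strings"
)

// List is a hypothetical pointer-kind container that can contain itself.
type List struct{ Items []any }

// work is one stack entry: a value to render, literal text, or an exit
// sentinel marking that a container's expansion has finished.
type work struct {
	val    any
	text   string
	isText bool
	exit   *List
}

// Render walks v with an explicit stack and reports whether a cycle was hit.
func Render(v any) (string, bool) {
	var sb strings.Builder
	onPath := map[*List]bool{} // containers currently being expanded
	cycled := false
	stack := []work{{val: v}}
	for len(stack) > 0 {
		w := stack[len(stack)-1]
		stack = stack[:len(stack)-1]
		switch {
		case w.exit != nil: // unwinding: the container is finished
			delete(onPath, w.exit)
			sb.WriteString("]")
		case w.isText:
			sb.WriteString(w.text)
		default:
			l, ok := w.val.(*List) // only pointer kinds are looked up in the map
			if !ok {
				fmt.Fprint(&sb, w.val)
				continue
			}
			if onPath[l] {
				sb.WriteString("<cycle>")
				cycled = true
				continue
			}
			onPath[l] = true
			sb.WriteString("[")
			stack = append(stack, work{exit: l})
			for i := len(l.Items) - 1; i >= 0; i-- { // reverse so items pop in order
				stack = append(stack, work{val: l.Items[i]})
				if i > 0 {
					stack = append(stack, work{text: " ", isText: true})
				}
			}
		}
	}
	return sb.String(), cycled
}

func main() {
	l := &List{Items: []any{1}}
	l.Items = append(l.Items, l) // the list now contains itself
	fmt.Println(Render(l))
}
```

A DAG still renders normally because a shared child has already left the on-path set by the time it is reached again.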
A `match` as the body of an inferred quotation always failed the checker
with "stack underflow at 'match' (match subject)" even though it runs
fine — rejecting the canonical way to consume enums:
[ 1 leaf 2 leaf ] (match leaf n : @n, node a b : 0, end) map
checkMatchBlock errored on an empty stack unconditionally; other constructs
(operators, `if`) participate in quote-body inference, where applySig
responds to underflow by synthesizing fresh input vars. The failure predates
enums (Maybe matches in map had it too), but match is the enum eliminator,
so `(match ...) map/filter/each` never type-checking bit constantly.
Two pieces:
- Under inference, an empty stack at `match` synthesizes the subject exactly
as applySig's underflow path does (bottom of stack, front of inferInputs),
so the quotation infers a one-input signature.
- A subject that is still an unresolved var is pinned from the first arm
pattern that names a type — an enum member determines its enum (member
names are global), `just`/`none` determine Maybe[fresh]. Pinning happens
before the entry branch is captured, so every arm's analysis, payload
bindings, and the exhaustiveness check see the resolved subject (per-arm
substitution checkpoints would roll back a per-site pin). Value literals
and type keywords deliberately do not pin: a type-keyword match may be
discriminating a union, which pinning would wrongly narrow — those
matches still check via their wildcard arm.
Exhaustiveness now works inside quotations too: a match that omits a member
is rejected, and a pinned enum quotation applied to a list of a different
element type fails overload resolution as it should.
Tests: tests/success/enum_match_in_quote.msh (enum map/filter/each, Maybe,
value literals) and tests/typecheck_fail/enum_match_in_quote_nonexhaustive.msh.
Suites green: tests 222, typecheck 210, go test.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01E3aH9BBud5rjDoHUF8daaJ
…emo cap Comparing two independently built self-doubling DAGs deeper than dagMemoCap (2^18 levels) hung: the cap's "stop inserting" overflow policy left every level beyond 262144 un-memoized, and each un-memoized level doubles the walk. Measured: depth 200k compared in 1s, depth 300k ran effectively forever (2^38000 work). A first attempt — clearing the memo generationally on overflow — did NOT fix it (also measured): the pending duplicate for each upper level pops only after the entire subtree between, so the working set spans all levels and defeats any bounded memo regardless of eviction policy. The structural fix: deduplicate pointer-identical pairs when a container pushes its children. A self-doubling value (`@t @t node`, `[ @x @x ]`) expands to the SAME pair twice; pushing it once makes the whole family linear at any depth with no memo involvement — equality at depth 300k/600k/ 4M now runs in 1.1s/2.1s/14.4s (linear), and sort past the cap likewise. Applied in equalsIter (pushPairsDedup for enum payloads, list and pipe elements) and in compareValues' enum and list arms (skipping is sound in both walks: an identical pointer pair contributes equal/0). The generational clear is kept — the memo still covers cross-parent duplicate pairs (diamond-shaped sharing) up to the cap — and its comment now states honestly what it does and does not defend against. Extends tests/success/enum_dag_equality.msh with a 300k-deep (past-cap) independent-DAG comparison. Suites green: tests 222, typecheck 210, go test. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01E3aH9BBud5rjDoHUF8daaJ
The push-time pair dedup that closed the exponential cliff for self-doubling
values covered enum payloads, lists, and pipes — but not the two dict arms.
A dict-shaped doubling value ({ "l": @d, "r": @d } per level, or the same
structure through an enum's {str: E} payload) pushed the identical pointer
pair once per key and re-opened the cliff: equality at depth 200k ran in
1.4s, depth 300k (past the memo cap) hung. Measured, same boundary as the
list/enum case.
equalsIter's dict arm now skips a pair pointer-identical to the last one it
pushed, and compareValues' dict arm skips the identical *value* pair while
always keeping the key comparison. Dict-DAG equality at 300k/600k now runs
1.5s/3.3s (linear), sort works, and a deep unequal pair is still detected.
Extends tests/success/enum_dag_equality.msh with a 300k dict-payload DAG.
Suites green: tests 222, typecheck 210, go test.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01E3aH9BBud5rjDoHUF8daaJ
…bally Four successive fixes (memo cap, generational eviction, push-dedup for enum/list/pipe, then dict arms) each closed one self-doubling pattern and left the next container or sharing shape exponential past the cap — a non-consecutive pattern like [x y x] per level still hung at depth 300k. The chain of same-shaped patches was defending the wrong invariant: the memo cap itself. The memo is now unbounded. Every revisited pointer pair memo-hits, so ANY sharing pattern — consecutive, alternating, cross-parent, any container mix, any depth — is polynomial in actual heap nodes. There is no boundary left to probe. The memory trade is proportionate, not abstract: the memo only activates past the step threshold (2^19), so ordinary comparisons never allocate, and a comparison big enough to grow a large memo already holds operands larger than the memo. Measured: the pathological 4M-deep linear compare (every pair distinct — the case the cap was protecting) runs 21.9s at 2.6GB peak, of which the two operands are over half; the alternating [x y x] DAG at 300k, which defeated every previous patch, compares in 3.0s. Push-time dedup stays as a constant-factor fast path (it also avoids the pre-threshold spin for shallow doubling), but is no longer load-bearing for termination. Extends tests/success/enum_dag_equality.msh with the non-consecutive alternating pattern at 300k. Suites green: tests 222, typecheck 210, go test. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01E3aH9BBud5rjDoHUF8daaJ
The comparison/render walkers carried three hand-maintained "which types are pointers" switches (sameRef, refPairKey, cycleTrackable) that had already drifted apart. Replace all three with one isRefKind predicate that checks the dynamic kind via reflection: "is a pointer" is exactly the property every site cares about (heap identity ⇒ can share substructure, can cycle, safe to compare and use as a map key), and a newly added pointer kind is covered with no list to keep in sync. Also delete the push-time neighbor-dedup machinery (pushPairsDedup plus three hand-rolled copies in the dict/list/enum arms). It was written to compensate for the bounded dagGuard memo — its own comment still cited the memo's eviction — but the memo is unbounded now, which subsumes the whole trick: below the step threshold duplicate expansion is capped by the threshold itself, past it every repeated pair memo-hits. Timing on the DAG stress tests is unchanged. Net -71 lines, no behavior change. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
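A reflection-based predicate of this kind might look roughly like the Go sketch below. The body of `isRefKind` is an assumption based on the description, and `sameRef` is shown only as one possible caller. The guard matters because comparing two interfaces that hold a non-comparable dynamic type, such as a byte slice, panics at run time.

```go
package main

import (
	"fmt"
	"reflect"
)

// isRefKind reports whether v's dynamic kind is a pointer, meaning it has
// heap identity and is safe to compare with == or use as a map key.
func isRefKind(v any) bool {
	return v != nil && reflect.ValueOf(v).Kind() == reflect.Ptr
}

// sameRef is true only for two pointers to the same object. The kind check
// runs first, so == is never applied to a non-comparable dynamic type.
func sameRef(a, b any) bool {
	return isRefKind(a) && isRefKind(b) && a == b
}

type node struct{ val int }

func main() {
	n := &node{val: 1}
	fmt.Println(sameRef(n, n))                 // same pointer
	fmt.Println(sameRef(n, &node{val: 1}))     // equal contents, different objects
	fmt.Println(sameRef([]byte{1}, []byte{1})) // slices: guarded, no panic
}
```

A newly added pointer type is covered automatically, with no per-type list to keep in sync.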
Definition lookup is first-match-wins over [stdlib, init, script], so a second def of a name never took effect: it was silently dead code while the first definition kept running. A script "redefining" a stdlib word, or an interactive redefinition, quietly did nothing; and since the type checker registered the duplicate as an overload, a call could type-check against the dead def's signature while the runtime executed the other body. (Found via tests/success/enum_recursive_generic.msh, whose local `def id` was shadowed by std.msh's id all along; renamed to `ident`.)
Reject duplicates everywhere instead:
- Runtime: FindDuplicateDefinition checks startup loading, script startup+file assembly, and each interactive input line (which is rejected with the session continuing). The error reports both positions.
- Checker: def registration records every definition name (mirroring the enum-member collision check) and rejects a repeat in both RegisterStdlibSigs and CheckProgram, so --type-check-only and the LSP report it too. A stdlib def records its name even when its sig defers to a table builtin (the 2unpack case); runtime lookup still resolves to it. Def-shadowing-builtins stays legal; std.msh does that on purpose.
- Lexer: makeToken now stamps Token.TokenFile, which was wired through the lexer but never assigned. Cross-file collisions can then name the other file: "already defined at lib/std.msh:62:5". No existing test fixture is affected; stdin input still formats as bare line:col.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
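A duplicate check over the concatenated [stdlib, init, script] definition list could be sketched in Go as follows. `Def` and `FindDuplicate` are hypothetical, and only the "already defined at lib/std.msh:62:5" fragment of the message comes from the commit; the rest of the format is illustrative.

```go
package main

import "fmt"

// Def is a hypothetical definition record with its source position.
type Def struct {
	Name      string
	File      string
	Line, Col int
}

// FindDuplicate scans definitions in lookup order and reports the first
// repeated name, citing both positions.
func FindDuplicate(defs []Def) error {
	first := map[string]Def{}
	for _, d := range defs {
		if prev, ok := first[d.Name]; ok {
			return fmt.Errorf("%s:%d:%d: '%s' already defined at %s:%d:%d",
				d.File, d.Line, d.Col, d.Name, prev.File, prev.Line, prev.Col)
		}
		first[d.Name] = d
	}
	return nil
}

func main() {
	defs := []Def{
		{Name: "id", File: "lib/std.msh", Line: 62, Col: 5},
		{Name: "ident", File: "script.msh", Line: 1, Col: 1},
		{Name: "id", File: "script.msh", Line: 3, Col: 1}, // would be dead code under first-match-wins
	}
	fmt.Println(FindDuplicate(defs))
}
```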
Add the two user-facing branch changes missing from Unreleased: the cyclic-value render error (str/toJson error instead of hanging) and match type-checking inside inferred quotations ((match ...) map). Also correct compareValues' doc comment: compare-0 coincides with Equals only for orderable kinds — quotations and grids share a rank and compare 0 while Equals still distinguishes them. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
- doc/base.html: style mshellENUM (new this branch) and mshellTYPE (pre-existing gap) with the other declaration keywords, so --html-highlighted enum/type snippets render like def/match/end.
- Sublime: add enum, type, and match to the keyword list, and 0o/0b integer literal patterns alongside the existing hex one.
The VS Code grammar highlights no keywords at all (pre-existing design), so it needs no enum entry to stay consistent.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
The grammar only highlighted comments, booleans, strings, and variables — no keyword rules at all, so def/end/match and the new enum/type rendered as plain text. Add the same rule set as the Sublime grammar: control keywords (including enum and type and the else*/*if forms), and/or/not, soe, the int/float/bool type names, and numeric literals including the new 0x/0o/0b integer forms. Rules are placed after the variable patterns so a same-position tie like `str!` keeps resolving to the variable-store rule. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
The doc predated several final decisions and still read as a proposal. Update it to be the accurate decision record:
- Status header: implemented (V1), with a map of what shipped vs. what stayed future work.
- Fix examples that no longer parse: `read`/`write` are lexer keywords and cannot be member names; the §9/§10.A examples were missing the mandatory `end` terminator the doc itself decided on in §6.
- §8: the shipped name-resolution rule replaced the draft's CmdResult.ok qualification: member names are globally unique and member/def collisions are declaration errors in both directions.
- §9: record the shipped serialization (externally-tagged toJson, member(payload) str form), structural equality, and declaration-order sorting.
- §10: per-tier status (A shipped; shape payload types shipped, their destructuring sugar not; B/D/E/F future).
- §11: rewrite the end-to-end example against the shipped feature: an explicit parseMode boundary word instead of the unimplemented Mode.decode/Mode.values string backing. Example verified to run and type-check.
- §12: open questions annotated with how each resolved.
Design-dir edit explicitly requested (doc was AI-authored).
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
No description provided.