Drop .clone() on closed-union string() methods#1

Merged: redvers merged 4 commits into `main` from `refactor_iso_to_val` on May 28, 2026

@redvers commented May 28, 2026

Summary

The Category / GraphemeBreak / Script / BinaryProperty / AllValid primitives all had:

```pony
fun string(): String iso^ => "X".clone()
```

which allocates a fresh `String iso` on every call. Because the values are literal string constants, returning them directly as `String val` avoids that allocation, and the result is shareable and clearer:

```pony
fun string(): String val => "X"
```

Side effect: Stringable

These primitives no longer structurally satisfy Pony's builtin `Stringable` interface (which requires `String iso^`). None declare `is Stringable` explicitly, and no internal caller passes them to a `Stringable`-expecting position. If a future caller needs `Stringable`, wrap with `.clone()` at the call site.

The `errors.pony` types (`InvalidUtf8`, `OutOfRange`, `InvalidScalar`) and `Codepoint.string()` build their output dynamically via `s.append(...)`, so they still return `String iso^`.
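
A minimal sketch of what wrapping with `.clone()` at the call site could look like. `Lu` here stands in for one of the closed-union primitives, and `AsStringable` is a hypothetical adapter that is not part of this PR; both names are assumptions for illustration only.

```pony
// Stand-in for one of the primitives touched by this PR (name assumed).
primitive Lu
  fun string(): String val => "Lu"

// Hypothetical adapter, not part of the library: its string() returns
// String iso^, so it structurally satisfies the builtin Stringable.
class val AsStringable
  let _s: String val

  new val create(s: String val) =>
    _s = s

  fun string(): String iso^ =>
    _s.clone()

actor Main
  new create(env: Env) =>
    // String val is already a ByteSeq, so printing needs no copy.
    env.out.print(Lu.string())

    // Where a fresh String iso is needed, clone at the call site.
    let s: String iso = Lu.string().clone()
    s.append("!")
    env.out.print(consume s)

    // Where a Stringable is expected, go through the adapter.
    let x: Stringable = AsStringable(Lu.string())
    env.out.print(x.string())
```

With this shape, the copy is paid only by callers that actually need an `iso` or a `Stringable`; every other caller shares the literal.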

Files

  • `unicode/category.pony` (30 entries, hand-written)
  • `unicode/grapheme_break.pony` (15 entries, hand-written)
  • `unicode/bytes.pony` (`AllValid`)
  • `unicode/binary_property.pony` (regenerated)
  • `unicode/script.pony` (regenerated)
  • `unicode_build/binary_props_table.pony` (codegen source)
  • `unicode_build/script_table.pony` (codegen source)

Test plan

  • `make ci` locally: 146 unit tests + 18,992 UAX #15 conformance cases pass
  • PR CI workflow runs the same

redvers added 4 commits May 27, 2026 21:10
The Category / GraphemeBreak / Script / BinaryProperty / AllValid
primitives all had

    fun string(): String iso^ => "X".clone()

which allocates a fresh String iso on every call. The values are
literal string constants — returning them directly as String val
is shareable, cheaper, and clearer:

    fun string(): String val => "X"

Side-effect: these primitives no longer structurally satisfy
Pony's `Stringable` interface (which requires `String iso^`). None
of them declare `is Stringable` and no internal caller passes
them to a `Stringable`-expecting position. If a future caller
needs Stringable, wrap with `.clone()` at the call site.

The errors.pony types (InvalidUtf8, OutOfRange, InvalidScalar)
and Codepoint.string() build their output dynamically via
`s.append(...)` — they still return `String iso^`.

Codegen sources updated:
  unicode_build/binary_props_table.pony
  unicode_build/script_table.pony

Generated files regenerated via `make ucd-generate`.

CI was using unicode.org/UCD/latest (currently 16.0.0) while the
committed tables were generated from the local UCD snapshot
(14.0.0). This mismatch caused 281 NormalizationTest.txt failures
on codepoints added in 15.x/16.0 (Todhri script, Tulu-Tigalari
script, MODIFIER LETTER CAPITAL S, etc.).

Pinning the UCD version in the Makefile keeps CI and local in
sync. Bumping UCD_VERSION is now a deliberate, reviewed step:
update the variable, `make ucd-download && make ucd-generate`,
verify `make conform` passes, commit the regenerated tables.

Local + CI now run against Unicode 16.0.0:
  146 unit tests pass
  19,965 / 19,965 NormalizationTest.txt cases pass (100%)

@redvers merged commit 7a1284c into main May 28, 2026
6 checks passed
@redvers deleted the refactor_iso_to_val branch May 28, 2026 06:24