feat(intl): implement Intl.Locale constructor (#5344)#5359
Conversation
📝 WalkthroughWalkthroughAdds a complete ChangesIntl.Locale Implementation
Sequence Diagram(s)sequenceDiagram
participant JS as JS Caller
participant Ctor as Intl.Locale constructor thunk
participant Parser as parse_language_tag
participant Options as apply_options
participant Factory as make_locale_instance
JS->>Ctor: new Intl.Locale(tag, options)
Ctor->>Ctor: validate tag type (String / existing Locale / coerce)
Ctor->>Parser: parse_language_tag(canonical_string)
Parser-->>Ctor: ParsedLocale or None → RangeError
Ctor->>Options: apply_options(parsed, options_bag)
Options-->>Ctor: mutated ParsedLocale or RangeError
Ctor->>Factory: make_locale_instance(parsed, proto)
Factory-->>Ctor: JS object with internal slots + bound methods
Ctor-->>JS: Intl.Locale instance
sequenceDiagram
participant JS as JS Caller
participant Method as maximize/minimize thunk
participant Parser as parse_language_tag
participant LikelySubtags as likely_subtags::maximize/minimize
participant Factory as transform_instance
JS->>Method: locale.maximize()
Method->>Method: locale_this() — validate receiver kind
Method->>Parser: parse_language_tag(stored canonical id)
Parser-->>Method: ParsedLocale
Method->>LikelySubtags: maximize(parsed) or minimize(parsed)
LikelySubtags-->>Method: mutated ParsedLocale
Method->>Factory: transform_instance(parsed, receiver_proto)
Factory-->>Method: new Intl.Locale instance
Method-->>JS: new Intl.Locale with expanded/minimized subtags
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Possibly related issues
Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 3 | ❌ 2❌ Failed checks (1 warning, 1 inconclusive)
✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (2)
crates/perry-runtime/src/intl/locale.rs (1)
481-487: 💤 Low valueMinor: avoid duplicate
JSValue::from_bitsconstruction.The JSValue is constructed twice from
options_valueat lines 482 and 484. Consider extracting to a local variable.♻️ Suggested cleanup
let options = object_ptr_from_value(options_value); - if options.is_none() && !JSValue::from_bits(options_value.to_bits()).is_undefined() { + let options_js = JSValue::from_bits(options_value.to_bits()); + if options.is_none() && !options_js.is_undefined() { // CoerceOptionsToObject: a non-undefined non-object (e.g. null) is a TypeError. - if JSValue::from_bits(options_value.to_bits()).is_null() { + if options_js.is_null() { throw_type_error("Intl.Locale options must be an object"); } }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@crates/perry-runtime/src/intl/locale.rs` around lines 481 - 487, The JSValue is constructed twice from options_value.to_bits() in the conditional checks within the locale validation logic. Extract this JSValue construction into a single local variable before the first condition check, then use that variable in both the is_undefined() call and the is_null() call to eliminate the duplicate construction and improve code efficiency.crates/perry-runtime/src/intl/locale/likely_subtags.rs (1)
127-155: 💤 Low valueConsider reducing redundant clones in minimize.
max.1andmax.2are cloned both in the condition checks and again when assigning top.script/p.region. Since the checks short-circuit withreturn, you could restructure to move instead of clone in the final assignments, or use references in the comparisons.♻️ Suggested refactor to reduce allocations
pub(super) fn minimize(p: &mut ParsedLocale) { let max = maximize_triple(&p.language, p.script.clone(), p.region.clone()); - // Minimization operates on the fully-resolved tag, so the result language is - // always the maximal language (e.g. `und-Latn` minimizes to `en`). - let lang = max.0.clone(); - p.language = lang.clone(); + let (lang, max_script, max_region) = max; + p.language = lang.clone(); // 1. language alone. - if maximize_triple(&lang, None, None) == max { + if maximize_triple(&lang, None, None) == (lang.clone(), max_script.clone(), max_region.clone()) { p.script = None; p.region = None; return; } // 2. language + region. - if max.2.is_some() && maximize_triple(&lang, None, max.2.clone()) == max { + if max_region.is_some() && maximize_triple(&lang, None, max_region.clone()) == (lang.clone(), max_script.clone(), max_region.clone()) { p.script = None; - p.region = max.2.clone(); + p.region = max_region; return; } // 3. language + script. - if max.1.is_some() && maximize_triple(&lang, max.1.clone(), None) == max { - p.script = max.1.clone(); + if max_script.is_some() && maximize_triple(&lang, max_script.clone(), None) == (lang, max_script.clone(), max_region.clone()) { + p.script = max_script; p.region = None; return; } // 4. keep the full maximal triple. - p.script = max.1; - p.region = max.2; + p.script = max_script; + p.region = max_region; }Alternatively, a cleaner approach would be to compare tuple references or restructure to avoid the intermediate clones entirely by storing the result of
maximize_tripleand reusing it.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@crates/perry-runtime/src/intl/locale/likely_subtags.rs` around lines 127 - 155, The minimize function is redundantly cloning max.1 and max.2 multiple times during condition checks that have early returns. Refactor the logic to avoid these redundant clones by either storing the cloned values once and reusing them throughout the function, or by comparing using references in the condition checks (like maximize_triple calls) and only cloning the values when they are actually assigned to p.script and p.region in the final step 4, since the earlier conditions all return early before reaching the final assignments.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@crates/perry-runtime/src/intl/locale.rs`:
- Around line 403-417: Empty Unicode extension keyword values are being set to
empty strings, causing getters to return `""` instead of `undefined` which
violates ECMA-402. For each of the five keyword checks in this block (ca, kf,
co, hc, nu), add an additional condition to verify that the value string is not
empty before calling set_internal_field. This ensures that when keywords exist
but have no explicit value (representing defaults), they are not set on the
object and the corresponding getters will return undefined.
---
Nitpick comments:
In `@crates/perry-runtime/src/intl/locale.rs`:
- Around line 481-487: The JSValue is constructed twice from
options_value.to_bits() in the conditional checks within the locale validation
logic. Extract this JSValue construction into a single local variable before the
first condition check, then use that variable in both the is_undefined() call
and the is_null() call to eliminate the duplicate construction and improve code
efficiency.
In `@crates/perry-runtime/src/intl/locale/likely_subtags.rs`:
- Around line 127-155: The minimize function is redundantly cloning max.1 and
max.2 multiple times during condition checks that have early returns. Refactor
the logic to avoid these redundant clones by either storing the cloned values
once and reusing them throughout the function, or by comparing using references
in the condition checks (like maximize_triple calls) and only cloning the values
when they are actually assigned to p.script and p.region in the final step 4,
since the earlier conditions all return early before reaching the final
assignments.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro Plus
Run ID: cc29af32-683e-44bf-9f60-1df771d03934
📒 Files selected for processing (3)
crates/perry-runtime/src/intl.rscrates/perry-runtime/src/intl/locale.rscrates/perry-runtime/src/intl/locale/likely_subtags.rs
| if let Some(v) = p.keywords.get("ca") { | ||
| set_internal_field(obj, KEY_CALENDAR, string_value(v)); | ||
| } | ||
| if let Some(v) = p.keywords.get("kf") { | ||
| set_internal_field(obj, KEY_CASEFIRST, string_value(v)); | ||
| } | ||
| if let Some(v) = p.keywords.get("co") { | ||
| set_internal_field(obj, KEY_COLLATION, string_value(v)); | ||
| } | ||
| if let Some(v) = p.keywords.get("hc") { | ||
| set_internal_field(obj, KEY_HOURCYCLE, string_value(v)); | ||
| } | ||
| if let Some(v) = p.keywords.get("nu") { | ||
| set_internal_field(obj, KEY_NUMBERINGSYSTEM, string_value(v)); | ||
| } |
There was a problem hiding this comment.
Empty keyword values cause getters to return "" instead of undefined.
When a Unicode extension keyword exists with an empty value (e.g., from -u-ca without a type, or calendar: "true" canonicalized to ""), the field is set to an empty string. The corresponding getter then returns "" rather than undefined.
Per ECMA-402 §14.3.3, accessors like calendar, collation, hourCycle, caseFirst, and numberingSystem should return undefined when no explicit value is present. An empty keyword value represents "default/unspecified" and should not be exposed.
🛠️ Proposed fix: skip setting fields when value is empty
- if let Some(v) = p.keywords.get("ca") {
+ if let Some(v) = p.keywords.get("ca").filter(|v| !v.is_empty()) {
set_internal_field(obj, KEY_CALENDAR, string_value(v));
}
- if let Some(v) = p.keywords.get("kf") {
+ if let Some(v) = p.keywords.get("kf").filter(|v| !v.is_empty()) {
set_internal_field(obj, KEY_CASEFIRST, string_value(v));
}
- if let Some(v) = p.keywords.get("co") {
+ if let Some(v) = p.keywords.get("co").filter(|v| !v.is_empty()) {
set_internal_field(obj, KEY_COLLATION, string_value(v));
}
- if let Some(v) = p.keywords.get("hc") {
+ if let Some(v) = p.keywords.get("hc").filter(|v| !v.is_empty()) {
set_internal_field(obj, KEY_HOURCYCLE, string_value(v));
}
- if let Some(v) = p.keywords.get("nu") {
+ if let Some(v) = p.keywords.get("nu").filter(|v| !v.is_empty()) {
set_internal_field(obj, KEY_NUMBERINGSYSTEM, string_value(v));
}🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@crates/perry-runtime/src/intl/locale.rs` around lines 403 - 417, Empty
Unicode extension keyword values are being set to empty strings, causing getters
to return `""` instead of `undefined` which violates ECMA-402. For each of the
five keyword checks in this block (ca, kf, co, hc, nu), add an additional
condition to verify that the value string is not empty before calling
set_internal_field. This ensures that when keywords exist but have no explicit
value (representing defaults), they are not set on the object and the
corresponding getters will return undefined.
Adds the `Intl.Locale` constructor, one of the missing Intl constructors behind the largest test262 intl402 failure bucket (138x "undefined is not a constructor", per #5344). `Intl.Locale` is also foundational — it is the locale object the rest of ECMA-402 builds on. Implemented in a new `crates/perry-runtime/src/intl/locale.rs` submodule: - Constructor parses a `unicode_locale_id` (language / script / region / variants + the `-u-` Unicode extension keywords) with structural validation (RangeError on malformed tags, TypeError on non-string / non-Locale arguments), and applies the options-bag overrides (language/script/region/calendar/collation/hourCycle/caseFirst/numeric/ numberingSystem) with per-option validation. - The eleven accessor properties (baseName, language, script, region, calendar, caseFirst, collation, hourCycle, numeric, numberingSystem) are installed as getters on `Intl.Locale.prototype` for reflection, and as own non-enumerable data props on each instance for live dispatch (these native objects resolve lookups from own props, matching the other Intl constructors). - `toString` / `maximize` / `minimize` are bound instance methods. - `maximize`/`minimize` use a curated likely-subtags table (full CLDR data needs icu_locale + its data pack, out of scope); correct for the common languages, identity fallback otherwise. Validated byte-for-byte against node v26: 36 behavioral cases (construction, all getters, toString, maximize/minimize, options overrides, validation errors, name/length) match exactly. The remaining node divergences (`[object Object]` toStringTag, `undefined` getOwnPropertyDescriptor on the builtin prototype) are pre-existing Perry limitations shared by every Intl constructor, not introduced here. This is the first constructor of the #5344 roadmap; it does not close the umbrella issue. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
70d968e to
feea925
Compare
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@crates/perry-runtime/src/intl/locale.rs`:
- Around line 431-432: The `numeric` field setter is defaulting to `false` when
the `kn` keyword is absent, but per ECMA-402 it should represent an unset state
(undefined). Change the setter to use an `Option<bool>` type for the
`KEY_NUMERIC` internal field, storing `None` when the `kn` keyword is absent
instead of using `unwrap_or(false)`. Then update the corresponding getter method
(around lines 580-588) to return `undefined` when the `KEY_NUMERIC` slot
contains `None`, matching the behavior of other optional accessors like
`calendar` and `collation`.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro Plus
Run ID: f293034e-529c-4f65-81c9-3560d23fd31d
📒 Files selected for processing (3)
crates/perry-runtime/src/intl.rscrates/perry-runtime/src/intl/locale.rscrates/perry-runtime/src/intl/locale/likely_subtags.rs
🚧 Files skipped from review as they are similar to previous changes (2)
- crates/perry-runtime/src/intl/locale/likely_subtags.rs
- crates/perry-runtime/src/intl.rs
| let numeric = p.keywords.get("kn").map(|v| v != "false").unwrap_or(false); | ||
| set_internal_field(obj, KEY_NUMERIC, bool_value(numeric)); |
There was a problem hiding this comment.
numeric getter returns false instead of undefined when kn keyword is absent.
Per ECMA-402 §14.3.3, the numeric accessor should return undefined when the [[Numeric]] internal slot is not present. Currently, when the kn keyword is not specified, unwrap_or(false) defaults to false, causing the getter to always return a boolean.
This differs from other accessors (like calendar, collation) which correctly return undefined when unset.
🛠️ Proposed fix
- let numeric = p.keywords.get("kn").map(|v| v != "false").unwrap_or(false);
- set_internal_field(obj, KEY_NUMERIC, bool_value(numeric));
+ if let Some(v) = p.keywords.get("kn") {
+ set_internal_field(obj, KEY_NUMERIC, bool_value(v != "false"));
+ }And update the getter:
extern "C" fn getter_numeric(_c: *const ClosureHeader) -> f64 {
let obj = locale_this("numeric");
let value = get_field(obj, KEY_NUMERIC);
- if value.to_bits() == crate::value::TAG_TRUE {
- bool_value(true)
- } else {
- bool_value(false)
+ let js = JSValue::from_bits(value.to_bits());
+ if js.is_undefined() {
+ undefined()
+ } else if value.to_bits() == crate::value::TAG_TRUE {
+ bool_value(true)
+ } else {
+ bool_value(false)
}
}Also applies to: 580-588
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@crates/perry-runtime/src/intl/locale.rs` around lines 431 - 432, The
`numeric` field setter is defaulting to `false` when the `kn` keyword is absent,
but per ECMA-402 it should represent an unset state (undefined). Change the
setter to use an `Option<bool>` type for the `KEY_NUMERIC` internal field,
storing `None` when the `kn` keyword is absent instead of using
`unwrap_or(false)`. Then update the corresponding getter method (around lines
580-588) to return `undefined` when the `KEY_NUMERIC` slot contains `None`,
matching the behavior of other optional accessors like `calendar` and
`collation`.
Summary
Implements the
Intl.Localeconstructor — the highest-ROI, most self-contained item in the #5344 intl402 roadmap (suggested order #1: "missing constructor(s) behind the 138×undefined is not a constructor, start with DurationFormat/Locale").Intl.Localeis also the locale object the rest of ECMA-402 builds on.All new code lives in a new
crates/perry-runtime/src/intl/locale.rssubmodule (+ alikely_subtagschild); only 2 wiring lines touch the existingintl.rs(mod locale;and oneinstall_locale(ns_obj)call). No existing code paths are modified.What's implemented
unicode_locale_id(language / script / region / variants + the-u-Unicode extension keywords) with structural validation —RangeErroron malformed tags,TypeErroron non-string / non-Localearguments. Accepts an existingIntl.Localeas the tag.language,script,region,calendar,collation,hourCycle,caseFirst,numeric,numberingSystemoverrides, each validated (enum checks forhourCycle/caseFirst, structural checks otherwise).baseName,language,script,region,calendar,caseFirst,collation,hourCycle,numeric,numberingSystem): getters onIntl.Locale.prototypefor reflection and own non-enumerable data props per instance for live dispatch (these native objects resolve lookups from own props — the same pattern every otherIntl.*constructor uses).toString,maximize,minimize(bound instance methods).maximize/minimize: use a curated likely-subtags table — full CLDR likely-subtags data needsicu_locale+ its data pack (out of scope / size). Correct for the common languages, identity fallback for the long tail.Validation
Diffed byte-for-byte against
node v26: 36 behavioral cases (construction, all getters,toString,maximize/minimize, options overrides, validation errors,name/length) match exactly.The only node divergences observed are
Object.prototype.toString.call(loc)→[object Object](not[object Intl.Locale]) andgetOwnPropertyDescriptor(Intl.Locale.prototype, "<getter>")→undefined. Both are pre-existing Perry limitations shared by every Intl constructor (verified identically onIntl.NumberFormat), not introduced here.Scope
This is the first constructor of the #5344 roadmap and intentionally does not close the umbrella issue.
Intl.DurationFormatand the Temporal↔Intl / instance-method-gap clusters remain follow-ups.🤖 Generated with Claude Code
Summary by CodeRabbit
Intl.Localefor BCP-47/ECMA-402 locale parsing, canonicalization, and inspection.toString(),maximize(), andminimize()for locale normalization.baseName,language,script,region,calendar,collation,hourCycle,caseFirst,numeric, andnumberingSystem.RangeError, and supported locale options can override parsed values.