feat(intl): getCanonicalLocales, supportedValuesOf, formatToParts + ListFormat/RelativeTimeFormat/PluralRules (#5298)#5320
Conversation
📝 WalkthroughWalkthroughThe PR extends the perry-runtime ECMA-402 ChangesECMA-402 Intl Expansion
Estimated code review effort🎯 4 (Complex) | ⏱️ ~75 minutes Possibly related issues
Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
Actionable comments posted: 3
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@crates/perry-runtime/src/intl.rs`:
- Around line 1212-1250: The rtf_instance_parts and rtf_parts functions
currently ignore the style and numeric options that are stored in the
RelativeTimeFormat instance during construction, causing them to always produce
output in long numeric format. To fix this, pass the instance object to
rtf_instance_parts so it can read the KEY_NUMERIC and KEY_RTF_STYLE fields from
the stored options, then thread these values into rtf_parts and add conditional
logic to handle different numeric modes (auto vs always) and style modes (long
vs short vs narrow) appropriately, or throw range errors for unsupported
combinations instead of silently ignoring them.
- Around line 1388-1393: The plural_select_range function only validates against
NaN values but does not reject infinities, which should be rejected according to
the API specification. Update the condition in the validation check to use the
is_finite() method instead of is_nan(), replacing the current check that uses
s.is_nan() || e.is_nan() with !s.is_finite() || !e.is_finite() to properly
reject both NaN and infinite values before processing further.
- Around line 225-248: The `supported_locales_of_thunk` function bypasses the
new ICU-backed canonicalization path and continues to use the old
`canonical_locale` function through `locales_from_value`. Update
`supported_locales_of_thunk` to call `canonicalize_language_tag` directly (the
function defined in this diff) instead of routing through `locales_from_value`,
so that `Intl.*.supportedLocalesOf` gets the same ICU structural validation and
canonicalization as `Intl.getCanonicalLocales`.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro Plus
Run ID: f15e64c4-a2e4-4060-980d-04d7ffe9f735
⛔ Files ignored due to path filters (1)
Cargo.lockis excluded by!**/*.lock
📒 Files selected for processing (5)
crates/perry-runtime/Cargo.tomlcrates/perry-runtime/src/intl.rscrates/perry/src/commands/compile/collect_modules/feature_detect.rscrates/perry/src/commands/compile/optimized_libs.rscrates/perry/src/commands/compile/types.rs
| /// Build the long-form, `numeric: "always"` en-US relative-time parts for | ||
| /// `value` in `unit`. (`short`/`narrow` abbreviations and the `numeric: "auto"` | ||
| /// special words — "tomorrow"/"yesterday" — need CLDR data and fall back to the | ||
| /// long numeric form here.) Returns `(leading, number, trailing)` literal/number | ||
| /// fragments so `format` and `formatToParts` stay consistent. | ||
| fn rtf_parts(value: f64, unit: &str) -> Vec<(&'static str, String)> { | ||
| let abs = value.abs(); | ||
| let num_str = format_number_parts(abs, "en-US", None, None); | ||
| let unit_display = if abs == 1.0 { | ||
| unit.to_string() | ||
| } else { | ||
| format!("{unit}s") | ||
| }; | ||
| let past = value.is_sign_negative(); | ||
| let mut parts: Vec<(&'static str, String)> = Vec::new(); | ||
| if past { | ||
| split_numeric_parts(&num_str, "en-US", &mut parts); | ||
| parts.push(("literal", format!(" {unit_display} ago"))); | ||
| } else { | ||
| parts.push(("literal", "in ".to_string())); | ||
| split_numeric_parts(&num_str, "en-US", &mut parts); | ||
| parts.push(("literal", format!(" {unit_display}"))); | ||
| } | ||
| parts | ||
| } | ||
|
|
||
| fn rtf_instance_parts(value: f64, unit_arg: f64) -> Vec<(&'static str, String)> { | ||
| let number = JSValue::from_bits(value.to_bits()).to_number(); | ||
| if !number.is_finite() { | ||
| throw_range_error("Value need to be finite number for Intl.RelativeTimeFormat.format()"); | ||
| } | ||
| let unit_str = value_to_string(unit_arg); | ||
| let Some(unit) = rtf_singular_unit(&unit_str) else { | ||
| throw_range_error(&format!( | ||
| "Value {unit_str} out of range for Intl.RelativeTimeFormat.format() unit" | ||
| )); | ||
| }; | ||
| rtf_parts(number, unit) | ||
| } |
There was a problem hiding this comment.
RelativeTimeFormat formatting ignores style/numeric instance options
The constructor stores style/numeric (Lines 1589-1592), but formatting paths here compute output without reading instance fields. This makes numeric: "auto" and style: "short" | "narrow" behave like a fixed long numeric format.
Please thread the instance object into rtf_instance_parts and branch on KEY_NUMERIC / KEY_RTF_STYLE (or reject unsupported modes instead of silently accepting and ignoring them).
Also applies to: 1252-1275
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@crates/perry-runtime/src/intl.rs` around lines 1212 - 1250, The
rtf_instance_parts and rtf_parts functions currently ignore the style and
numeric options that are stored in the RelativeTimeFormat instance during
construction, causing them to always produce output in long numeric format. To
fix this, pass the instance object to rtf_instance_parts so it can read the
KEY_NUMERIC and KEY_RTF_STYLE fields from the stored options, then thread these
values into rtf_parts and add conditional logic to handle different numeric
modes (auto vs always) and style modes (long vs short vs narrow) appropriately,
or throw range errors for unsupported combinations instead of silently ignoring
them.
| fn plural_select_range(start: f64, end: f64) -> f64 { | ||
| let s = JSValue::from_bits(start.to_bits()).to_number(); | ||
| let e = JSValue::from_bits(end.to_bits()).to_number(); | ||
| if s.is_nan() || e.is_nan() { | ||
| throw_range_error("Invalid values for Intl.PluralRules.selectRange()"); | ||
| } |
There was a problem hiding this comment.
PluralRules.selectRange should reject infinities
Line 1391 only rejects NaN; +/-Infinity currently falls through and returns "other". The API should reject non-finite inputs.
Suggested fix
- if s.is_nan() || e.is_nan() {
+ if !s.is_finite() || !e.is_finite() {
throw_range_error("Invalid values for Intl.PluralRules.selectRange()");
}🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@crates/perry-runtime/src/intl.rs` around lines 1388 - 1393, The
plural_select_range function only validates against NaN values but does not
reject infinities, which should be rejected according to the API specification.
Update the condition in the validation check to use the is_finite() method
instead of is_nan(), replacing the current check that uses s.is_nan() ||
e.is_nan() with !s.is_finite() || !e.is_finite() to properly reject both NaN and
infinite values before processing further.
e42a4c2 to
d570561
Compare
…istFormat/RelativeTimeFormat/PluralRules (#5298) Closes the largest test262 intl402 gaps (parity 60.5% -> 66.0%, +184 cases, no regressions): - Intl.getCanonicalLocales: CanonicalizeLocaleList via icu_locale_core's data-free BCP-47/UTS#35 structural parser (new default-on intl-locale feature, auto-enabled on use; already in the lock graph via temporal so default builds gain nothing). TypeError/RangeError per spec, dedup. - Intl.supportedValuesOf: sorted, dedup'd value tables for calendar/collation/ currency/numberingSystem/timeZone/unit; RangeError on invalid key. - Intl.NumberFormat/DateTimeFormat.prototype.formatToParts: typed {type,value} parts that reconstruct format() byte-for-byte. - New constructors Intl.ListFormat, Intl.RelativeTimeFormat, Intl.PluralRules with en-US format/select + spec-shaped resolvedOptions + enum option RangeError validation. ListFormat consumes any iterable via collection_iter::classify_init. - Symbol.toStringTag = 'Intl.<Name>' on every Intl prototype. Remaining tail (follow-up): Intl.Locale/DisplayNames/DurationFormat constructors, formatRange/formatRangeToParts, and ICU-CLDR-data-dependent locale-formatting cases (non-gregorian Temporal calendars, locale-specific number/date output).
d570561 to
3f33d7f
Compare
There was a problem hiding this comment.
Actionable comments posted: 5
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@crates/perry-runtime/src/intl.rs`:
- Around line 1190-1193: The plural_rules_select function reads the type field
from the object but does not apply the digit options (such as
maximumFractionDigits, minimumFractionDigits, maximumSignificantDigits,
minimumSignificantDigits) that were stored during construction. Before passing
the value to plural_select_en, extract these digit option fields from the object
using the same pattern as the type field retrieval, then apply the appropriate
digit formatting/rounding to the number based on the stored options to ensure
select() respects the same constraints as resolvedOptions(). Apply the same fix
to the other location mentioned in the comment around lines 1450-1464.
- Around line 1108-1119: The rtf_to_parts_thunk and rtf_bound_to_parts_thunk
functions currently use parts_to_js_array which only includes type and value
properties, but RelativeTimeFormat parts also need to include the unit property.
Modify the code to add the unit property to each part before converting to a JS
array. You should capture the unit parameter (which is passed to both functions)
and include it in the parts structure, either by creating a
RelativeTimeFormat-specific version of parts_to_js_array or by adding the unit
property to the parts returned from rtf_instance_parts before calling
parts_to_js_array.
- Around line 1450-1464: The code retrieves numeric options for
minimumIntegerDigits, minimumSignificantDigits, maximumSignificantDigits,
minimumFractionDigits, and maximumFractionDigits using get_option_number without
validating their ranges. Add validation checks after retrieving each option to
ensure minimumIntegerDigits is >= 1, significant digits are >= 1, maximum values
are >= minimum values where applicable, and fraction digits are >= 0. If any
validation fails, throw a RangeError with a descriptive message instead of
storing the invalid values. Apply these validations to the calls that retrieve
these options and their unwrap_or defaults before the set_internal_field calls.
- Around line 428-435: The issue is that for USD and EUR currencies, negative
values display the minus sign after the currency symbol (e.g., `$-1.00`) instead
of before it (e.g., `-$1.00`). To fix this, in both the USD and EUR match arms
where you push the currency symbol, check if the numeric parts contain a minus
sign as the first element. If present, extract and remove that minus sign from
numeric, push it to parts first, then push the currency symbol, and finally
extend with the remaining numeric parts. This ensures the minus sign appears
before the currency symbol for negative values.
In `@crates/perry-runtime/src/intl/locales.rs`:
- Around line 227-230: The error message in the throw_range_error call has
incorrect formatting with an extra space before the colon. In the format string
passed to throw_range_error, change "Invalid key : {key_str}" to "Invalid key:
{key_str}" by removing the space before the colon while keeping one space after
it to follow standard English punctuation conventions.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro Plus
Run ID: afe4b618-b119-4e72-886a-463de0768553
⛔ Files ignored due to path filters (1)
Cargo.lockis excluded by!**/*.lock
📒 Files selected for processing (6)
crates/perry-runtime/Cargo.tomlcrates/perry-runtime/src/intl.rscrates/perry-runtime/src/intl/locales.rscrates/perry/src/commands/compile/collect_modules/feature_detect.rscrates/perry/src/commands/compile/optimized_libs.rscrates/perry/src/commands/compile/types.rs
🚧 Files skipped from review as they are similar to previous changes (3)
- crates/perry/src/commands/compile/collect_modules/feature_detect.rs
- crates/perry-runtime/Cargo.toml
- crates/perry/src/commands/compile/types.rs
| Some("EUR") => { | ||
| parts.push(("currency", "\u{20ac}".to_string())); | ||
| parts.extend(numeric); | ||
| } | ||
| Some("USD") => { | ||
| parts.push(("currency", "$".to_string())); | ||
| parts.extend(numeric); | ||
| } |
There was a problem hiding this comment.
Preserve the leading minus sign for prefix currencies.
For USD/non-DE EUR, numeric already contains the minusSign part, so negative values format as $-1.00 / €-1.00 instead of -$1.00 / -€1.00.
Proposed fix
Some("EUR") => {
+ if matches!(numeric.first().map(|(ty, _)| *ty), Some("minusSign")) {
+ parts.push(numeric.remove(0));
+ }
parts.push(("currency", "\u{20ac}".to_string()));
parts.extend(numeric);
}
Some("USD") => {
+ if matches!(numeric.first().map(|(ty, _)| *ty), Some("minusSign")) {
+ parts.push(numeric.remove(0));
+ }
parts.push(("currency", "$".to_string()));
parts.extend(numeric);
}🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@crates/perry-runtime/src/intl.rs` around lines 428 - 435, The issue is that
for USD and EUR currencies, negative values display the minus sign after the
currency symbol (e.g., `$-1.00`) instead of before it (e.g., `-$1.00`). To fix
this, in both the USD and EUR match arms where you push the currency symbol,
check if the numeric parts contain a minus sign as the first element. If
present, extract and remove that minus sign from numeric, push it to parts
first, then push the currency symbol, and finally extend with the remaining
numeric parts. This ensures the minus sign appears before the currency symbol
for negative values.
| extern "C" fn rtf_to_parts_thunk(_closure: *const ClosureHeader, value: f64, unit: f64) -> f64 { | ||
| let _obj = this_intl_object("formatToParts", KIND_RELATIVE_TIME); | ||
| parts_to_js_array(&rtf_instance_parts(value, unit)) | ||
| } | ||
|
|
||
| extern "C" fn rtf_bound_to_parts_thunk( | ||
| closure: *const ClosureHeader, | ||
| value: f64, | ||
| unit: f64, | ||
| ) -> f64 { | ||
| let _obj = captured_intl_object(closure, "formatToParts", KIND_RELATIVE_TIME); | ||
| parts_to_js_array(&rtf_instance_parts(value, unit)) |
There was a problem hiding this comment.
Add the unit property to RelativeTimeFormat number parts.
formatToParts currently reuses parts_to_js_array, which only emits { type, value }. RelativeTimeFormat numeric parts also need the selected unit, otherwise consumers cannot distinguish { type: "integer", value: "3", unit: "day" } from a plain NumberFormat part.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@crates/perry-runtime/src/intl.rs` around lines 1108 - 1119, The
rtf_to_parts_thunk and rtf_bound_to_parts_thunk functions currently use
parts_to_js_array which only includes type and value properties, but
RelativeTimeFormat parts also need to include the unit property. Modify the code
to add the unit property to each part before converting to a JS array. You
should capture the unit parameter (which is passed to both functions) and
include it in the parts structure, either by creating a
RelativeTimeFormat-specific version of parts_to_js_array or by adding the unit
property to the parts returned from rtf_instance_parts before calling
parts_to_js_array.
| fn plural_rules_select(obj: *const ObjectHeader, value: f64) -> f64 { | ||
| let n = JSValue::from_bits(value.to_bits()).to_number(); | ||
| let is_ordinal = get_string_field(obj, KEY_TYPE).as_deref() == Some("ordinal"); | ||
| string_value(plural_select_en(n, is_ordinal)) |
There was a problem hiding this comment.
Apply PluralRules digit options before category selection.
The constructor stores fraction/significant digit options, but plural_rules_select only reads type and passes the raw number to plural_select_en. Options such as maximumFractionDigits: 0 therefore change resolvedOptions() but not select().
Also applies to: 1450-1464
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@crates/perry-runtime/src/intl.rs` around lines 1190 - 1193, The
plural_rules_select function reads the type field from the object but does not
apply the digit options (such as maximumFractionDigits, minimumFractionDigits,
maximumSignificantDigits, minimumSignificantDigits) that were stored during
construction. Before passing the value to plural_select_en, extract these digit
option fields from the object using the same pattern as the type field
retrieval, then apply the appropriate digit formatting/rounding to the number
based on the stored options to ensure select() respects the same constraints as
resolvedOptions(). Apply the same fix to the other location mentioned in the
comment around lines 1450-1464.
| let min_int = get_option_number(options, "minimumIntegerDigits").unwrap_or(1.0); | ||
| set_internal_field(obj, KEY_PR_MIN_INT, min_int); | ||
| let min_sig = get_option_number(options, "minimumSignificantDigits"); | ||
| let max_sig = get_option_number(options, "maximumSignificantDigits"); | ||
| if min_sig.is_some() || max_sig.is_some() { | ||
| set_internal_field(obj, KEY_PR_USE_SIG, bool_value(true)); | ||
| set_internal_field(obj, KEY_PR_MIN_SIG, min_sig.unwrap_or(1.0)); | ||
| set_internal_field(obj, KEY_PR_MAX_SIG, max_sig.unwrap_or(21.0)); | ||
| } else { | ||
| set_internal_field(obj, KEY_PR_USE_SIG, bool_value(false)); | ||
| let min_frac = get_option_number(options, "minimumFractionDigits").unwrap_or(0.0); | ||
| let max_frac = get_option_number(options, "maximumFractionDigits") | ||
| .unwrap_or_else(|| min_frac.max(3.0)); | ||
| set_internal_field(obj, KEY_PR_MIN_FRAC, min_frac); | ||
| set_internal_field(obj, KEY_PR_MAX_FRAC, max_frac); |
There was a problem hiding this comment.
Validate PluralRules digit option ranges.
get_option_number accepts any finite value here, so invalid values like minimumIntegerDigits: -1, maximumSignificantDigits: 0, or maximumFractionDigits < minimumFractionDigits are stored and exposed instead of throwing RangeError.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@crates/perry-runtime/src/intl.rs` around lines 1450 - 1464, The code
retrieves numeric options for minimumIntegerDigits, minimumSignificantDigits,
maximumSignificantDigits, minimumFractionDigits, and maximumFractionDigits using
get_option_number without validating their ranges. Add validation checks after
retrieving each option to ensure minimumIntegerDigits is >= 1, significant
digits are >= 1, maximum values are >= minimum values where applicable, and
fraction digits are >= 0. If any validation fails, throw a RangeError with a
descriptive message instead of storing the invalid values. Apply these
validations to the calls that retrieve these options and their unwrap_or
defaults before the set_internal_field calls.
| None => throw_range_error(&format!( | ||
| "Invalid key : {key_str}. Wanted calendar, collation, currency, \ | ||
| numberingSystem, timeZone, or unit" | ||
| )), |
There was a problem hiding this comment.
Fix error message formatting.
The error message has an extra space before the colon: "Invalid key : {key_str}" should be "Invalid key: {key_str}" (no space before colon, one space after).
✏️ Proposed fix
- "Invalid key : {key_str}. Wanted calendar, collation, currency, \
+ "Invalid key: {key_str}. Wanted calendar, collation, currency, \
numberingSystem, timeZone, or unit"🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@crates/perry-runtime/src/intl/locales.rs` around lines 227 - 230, The error
message in the throw_range_error call has incorrect formatting with an extra
space before the colon. In the format string passed to throw_range_error, change
"Invalid key : {key_str}" to "Invalid key: {key_str}" by removing the space
before the colon while keeping one space after it to follow standard English
punctuation conventions.
Closes the largest test262
intl402gaps. Tracks #5298.Result
intl402differential parity (node v26 oracle,--all-features): 60.5% → 66.0%, +184 cases passing, zero regressions in any directory.built-ins/Temporalself-validate suite: unchanged (no Temporal code touched).cargo test -p perry-runtime -p perry-stdlib: green (the two parallel-only flakestyped_array_indexof_includes/builtin_prototype_methods_reject_dynamic_newpass isolated and are pre-existing).What changed (all in
crates/perry-runtime/src/intl.rs)1.
Intl.getCanonicalLocales(was missing → 23 "is not a function").CanonicalizeLocaleList:
undefined→[], String→singleton,null→TypeError, Array/array-like→canonicalize+dedupe, non-string/non-object element→TypeError. Tag canonicalization delegates toicu_locale_core's data-free BCP-47/UTS #35 structural parser (Locale::normalize) — correct case normalization, variant ordering, extension well-formedness, and UTS #35 rejection of extlang/grandfathered/duplicate-singleton tags (which a hand-rolled validator gets subtly wrong). Gated behind a new default-onintl-localefeature, auto-enabled by the compiler ongetCanonicalLocales/supportedLocalesOfusage, with a hand-rolled fallback when off.icu_locale_coreis already in the lock graph viatemporal, so default/shipped builds carry no extra weight.2.
Intl.supportedValuesOf(was missing → 30 "is not a function").Sorted, duplicate-free value tables for
calendar/collation/currency/numberingSystem/timeZone/unit; new array per call; RangeError on any other (string-coerced) key.3.
formatToPartson NumberFormat + DateTimeFormat (was missing → 76 "is not a function").Refactored number/date formatting into typed
{type,value}part builders soformat()is the concatenation offormatToParts()values — the invariant the spec's own main tests assert.4. New constructors
Intl.ListFormat,Intl.RelativeTimeFormat,Intl.PluralRules(were missing → "undefined is not a constructor").en-US
format/selectmatching node byte-for-byte (verified), spec-shapedresolvedOptions(PluralRules incl. conditional significant-vs-fraction digits + rounding keys), andRangeErrorvalidation of enum options. ListFormat consumes any iterable (array / array-iterator / custom[Symbol.iterator]) viacollection_iter::classify_init.5.
Symbol.toStringTag = "Intl.<Name>"on every Intl prototype (descriptor tests; instance-toStringinheritance still limited by a separate runtime gap in prototype-chain symbol lookup).Auto-optimize plumbing (
feature_detect.rs/types.rs/optimized_libs.rs) wires theintl-localefeature;perry-api-manifestconsistency test green (these APIs are runtime-reified, not codegen dispatch).Remaining tail (follow-up, not in scope here)
Intl.Locale(142),Intl.DurationFormat(108),Intl.DisplayNames(52) — the bulk of remaining "undefined is not a constructor"; Locale needs many getters + likelySubtags data, DisplayNames/DurationFormat need CLDR display data.formatRange/formatRangeToPartson DateTimeFormat/NumberFormat (35).temporal_rslimitation), locale-specific number/date output (currency sign placement, non-latnnumbering systems, es-ES/pl-PL list & relative-time forms).Summary by CodeRabbit
Intl.getCanonicalLocales()andIntl.supportedValuesOf()runtime support for locale/value lookups.Intl.ListFormat,Intl.RelativeTimeFormat, andIntl.PluralRulesfor spec-aligned formatting andresolvedOptions.formatToParts()toIntl.NumberFormat, plus defaultformatToParts()forIntl.DateTimeFormat.#35behavior, including in-order de-duplication.