Add Temporal half-rounding boundary tests for PlainDate, PlainDateTime, PlainYearMonth, and ZonedDateTime #4996

Open
MidnightDesign wants to merge 6 commits into tc39:main from MidnightDesign:temporal-rounding-half-boundary

@MidnightDesign MidnightDesign commented Mar 16, 2026

Note

This PR was drafted with the help of Claude Code. Apologies if that's not welcome here — happy to revise anything by hand.

Context

We're building temporal-php, a PHP 8.4 port of the TC39 Temporal API. We run the test262 suite as part of our CI (transpiled to PHP) and also use Infection for mutation testing. Infection systematically modifies source code and checks whether the test suite catches each mutation. Out of 11,983 mutations, several hundred escaped — and many of those escaped because the upstream test262 data doesn't exercise certain code paths. This PR adds tests to close the most impactful gaps we found.

Summary

Eight new test files across four Temporal types, all exercising the exact 0.5 fractional boundary in RoundRelativeDuration:

  • PlainDate, PlainDateTime, ZonedDateTime (since/ and until/): Test all 9 rounding modes at exact 0.5 fractional progress (183 days out of a 366-day leap year). The existing roundingmode-*.js tests use dates producing ~2.663 fractional years, where all half-* modes round identically. These new tests include both 1.5-year (odd integer) and 2.5-year (even integer) cases to distinguish halfEven from halfExpand.
  • PlainYearMonth (since/ and until/): Same boundary test, but uses June-starting dates because RoundRelativeDuration converts month remainders to days: Jun–Nov = 183 days in the 366-day year span from Jun 2019 to Jun 2020 (crossing Feb 29), giving exactly 183/366 = 0.5.
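The 183/366 arithmetic for the June-starting case can be checked without Temporal. The following is a minimal sketch using plain UTC timestamps; the `daysBetween` helper and the choice of day 1 of each month are assumptions made for illustration, not code from the test files.

```javascript
// Count whole days between two ISO calendar dates using UTC timestamps.
// UTC has no daylight-saving shifts, so the quotient is an exact integer.
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Hypothetical helper: each date is [year, month (1-12), day].
function daysBetween([y1, m1, d1], [y2, m2, d2]) {
  return (Date.UTC(y2, m2 - 1, d2) - Date.UTC(y1, m1 - 1, d1)) / MS_PER_DAY;
}

// Year span from Jun 2019 to Jun 2020, which crosses Feb 29, 2020.
const yearSpan = daysBetween([2019, 6, 1], [2020, 6, 1]);

// Jun through Nov 2019: the month remainder expressed in days.
const remainder = daysBetween([2019, 6, 1], [2019, 12, 1]);

console.log(yearSpan, remainder, remainder / yearSpan);
```

A January start in the same span would not land on the boundary, because the first six months of a leap year add up to 182 days rather than 183.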

The until tests cover the positive direction; the since tests cover the negative direction (where halfExpand and halfCeil diverge).

How the gaps were found

Infection mutates the source, for example by swapping the match arms for rounding modes such as halfExpand. If no test fails, the mutant "escapes." We traced ~80 escaped mutants back to the fact that all half-* rounding modes produce the same output with the current test data (no value lies on the 0.5 boundary), so entire match arms can be deleted or swapped without detection.
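To illustrate why such a mutant escapes, here is a simplified sketch that rounds plain numbers rather than durations. `roundWithMode`, `mutant`, and `survives` are hypothetical names for this example only; the real logic lives in RoundRelativeDuration and in the PHP port, neither of which is reproduced here.

```javascript
const MODES = ["ceil", "floor", "expand", "trunc",
  "halfCeil", "halfFloor", "halfExpand", "halfTrunc", "halfEven"];
const HALF_MODES = MODES.filter((m) => m.startsWith("half"));

// Round x to an integer under one of the nine Temporal rounding-mode names.
function roundWithMode(x, mode) {
  if (!MODES.includes(mode)) throw new RangeError("unknown mode: " + mode);
  const lower = Math.floor(x);
  const upper = Math.ceil(x);
  if (lower === upper) return x; // already an integer
  const negative = x < 0;
  const awayFromZero = negative ? lower : upper;
  const towardZero = negative ? upper : lower;
  switch (mode) {
    case "ceil": return upper;
    case "floor": return lower;
    case "expand": return awayFromZero;
    case "trunc": return towardZero;
  }
  // half-* modes: round to nearest, and only consult the mode on an exact tie.
  const frac = x - lower;
  if (frac < 0.5) return lower;
  if (frac > 0.5) return upper;
  switch (mode) {
    case "halfCeil": return upper;
    case "halfFloor": return lower;
    case "halfExpand": return awayFromZero;
    case "halfTrunc": return towardZero;
    default: return lower % 2 === 0 ? lower : upper; // halfEven
  }
}

// A mutant in which the halfExpand arm was swapped for the halfTrunc arm.
const mutant = (x, mode) =>
  roundWithMode(x, mode === "halfExpand" ? "halfTrunc" : mode);

// The mutant survives a data set if no half-* mode gives a different result.
const survives = (inputs) =>
  inputs.every((x) => HALF_MODES.every((m) => mutant(x, m) === roundWithMode(x, m)));

console.log(survives([2.663]));          // off the boundary: mutant escapes
console.log(survives([1.5, 2.5, -1.5])); // on the boundary: mutant is caught
```

In this sketch the tie-breaking branch is reached only when the fractional part is exactly 0.5, which is why inputs such as 2.663 cannot tell the half-* modes apart.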

All expected values were verified against V8's Temporal implementation via test262-harness with esvu-installed V8 (d8).

…and dayOfWeek

The existing rounding mode tests for PlainDate.prototype.since() and
PlainDate.prototype.until() use dates that produce ~31.97 months of
difference, well above the 0.5 boundary. All half-* rounding modes
produce identical results, making it impossible to distinguish halfExpand
from halfTrunc, or halfEven from halfCeil.

These new tests use dates that produce exactly 0.5 fractional progress
(183/366 days in a leap year), causing all nine rounding modes to produce
distinct result patterns. The 2.5-year case specifically distinguishes
halfEven (rounds to nearest even integer 2) from halfExpand (rounds away
from zero to 3).

Also adds:
- inLeapYear century-year tests (1700, 1800, 1900, 2100, 2200) exercising
  the 100/400 rule that the basic test does not cover
- dayOfWeek tests across all 12 months of a year, since the basic test
  only checks 7 consecutive days within a single month

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@MidnightDesign MidnightDesign requested a review from a team as a code owner March 16, 2026 14:27
Remove unrelated dayOfWeek and inLeapYear tests. Add RoundRelativeDuration
spec references to info blocks. Extend half-boundary coverage to
PlainDateTime, PlainYearMonth, and ZonedDateTime (until + since).

PlainYearMonth uses June-starting dates because RoundRelativeDuration
converts month remainders to days: Jun-Nov = 183 days in a 366-day year
span (Jun 2019 - Jun 2020 crossing Feb 29), giving exactly 183/366 = 0.5.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@MidnightDesign MidnightDesign changed the title from "Add Temporal PlainDate tests for half-rounding boundary, century leap years, and dayOfWeek coverage" to "Add Temporal half-rounding boundary tests for PlainDate, PlainDateTime, PlainYearMonth, and ZonedDateTime" Mar 16, 2026
Contributor

@ptomato ptomato left a comment

Thanks! This looks good at first glance.

A couple questions and comments:

  • Would you mind sharing approximately the process you used for prompting Claude Code and turning the output into this PR? As the technology is new we are still finding our way around. Thank you for disclosing it up front, by the way.
  • Did the mutation tool only find a lack of test coverage when rounding to years, or is the coverage for units such as months also lacking? It might make sense to find similar pairs of objects for other units. (If you do that, it probably makes sense to loop over rounding modes that have the same outcome in each case, to prevent the tests from getting overly long.)
  • To make sure we are testing what we expect to be testing, it might be helpful to add assertions at the beginning such as assert.sameValue(earlier1.since(later).total({ unit: "years", relativeTo: earlier1 }), -1.5, "duration is on a 0.5 boundary");

@@ -0,0 +1,121 @@
// Copyright (C) 2026 Rudi Theunissen. All rights reserved.
Contributor

Who's this?

Author

Well that's an interesting hallucination. Thanks, fixed.

Member

Hah! Yeah not me, sorry. I'm flattered though thanks Claude.

TemporalHelpers.assertDuration(
earlier1.since(later, { smallestUnit: "years", roundingMode: "halfExpand" }),
-2, 0, 0, 0, 0, 0, 0, 0, 0, 0,
"-1.5 years, halfExpand rounds 0.5 away from zero"
Contributor

What does the "0.5" refer to here? Please use consistent phrasing

Author

Replaced with "breaks ties."

@MidnightDesign
Author

> • Would you mind sharing approximately the process you used for prompting Claude Code and turning the output into this PR?

I told it to write a minimal transpiler for the syntax used in the Temporal tests and it came up with this. It grew as it implemented more classes and methods that required different syntax to be converted.

Example conversion: JS -> PHP

Then I told it to start implementing classes and methods one by one against the test suite.

At one point it was pretty much done and all that was left were concepts that are untranslatable, like JS Symbol.

I was wondering whether there was any dead or untested code in there (which there shouldn't be if test262 is exhaustive). I fired up my go-to technique for that, mutation testing, using Infection. To my surprise it actually flagged some mutants, specifically pointing out that the rounding modes were pretty much uncovered. I then told it in a different session (directly in the test262 repo) to double- and triple-check, and it was positive that that's an actual gap in the test suite. I told it to add the tests; it did, verified them against V8, and I let it open the PR, making sure to include the fact that this was written by an agent.

> As the technology is new we are still finding our way around. Thank you for disclosing it up front, by the way.

Yeah of course, that was important to me. We're all still figuring out how to handle this stuff and I thought it would be important to be upfront with that.

> • Did the mutation tool only find a lack of test coverage when rounding to years, or is the coverage for units such as months also lacking? It might make sense to find similar pairs of objects for other units. (If you do that, it probably makes sense to loop over rounding modes that have the same outcome in each case, to prevent the tests from getting overly long.)

Claude Code did not flag other units by itself specifically, and I didn't ask. Will check tomorrow.

> • To make sure we are testing what we expect to be testing, it might be helpful to add assertions at the beginning such as assert.sameValue(earlier1.since(later).total({ unit: "years", relativeTo: earlier1 }), -1.5, "duration is on a 0.5 boundary");

Great idea, I will add that.

@ptomato Thanks for taking the time.

MidnightDesign and others added 2 commits March 18, 2026 18:09
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Clarifies the half-* rounding mode assertion messages by using standard
tie-breaking terminology for consistency with non-half mode phrasing.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@linusg
Member

linusg commented Mar 18, 2026

The copyright hallucination makes me very skeptical of this contribution.

  • We ought to reject it as a violation of the AI policy alone
  • Even if there was an exception, it clearly wasn't sanity checked

@MidnightDesign
Author

> The copyright hallucination makes me very skeptical of this contribution.

I checked the code, not the boilerplate-looking legal header.

> • We ought to reject it as a violation of the AI policy alone

Absolutely fair. I'm skeptical of AI-written code that I haven't generated myself either. But given the radical shift in the last couple of months I think these kinds of policies will need to be updated.

What a weird point in time; people still see anything involving AIs instinctively as low-quality (myself included) while even the greatest skeptics have started using it all the time.

Personally, I'm in favor of being transparent with the use of AI and allowing it in a limited capacity instead of banning it outright (and people using it anyway without telling).
