Add date_order locale option and flexible date separator parsing #1624
Open
hidekoji wants to merge 1 commit into
`locale()` gains a `date_order` argument so dates and date-times can be
parsed with an explicit component order ("mdy", "dmy", "ymd_hms", etc.).
This makes year-last formats such as 10/02/2024 readable, which the
automatic type guesser would otherwise treat as character.
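
As a rough illustration of what parsing against an explicit component order involves, here is a minimal standalone sketch. It is not the PR's code: the `Ymd` struct and `parseWithOrder()` are hypothetical names, and the sketch assumes exactly three numeric fields split by single non-alphanumeric separators, with no time part and no per-month day-count check.

```cpp
#include <cctype>
#include <string>

// Hypothetical result type for this sketch; not part of readr.
struct Ymd {
  int year;
  int month;
  int day;
};

// Illustrative only: parse `text` as three numeric fields separated by a
// single non-alphanumeric character, then assign the fields according to
// `order`, a permutation of 'y', 'm', 'd' such as "mdy".
// Returns false if the text or the order is not usable.
bool parseWithOrder(const std::string& text, const std::string& order, Ymd& out) {
  if (order.size() != 3) return false;

  int fields[3] = {0, 0, 0};
  std::size_t pos = 0;
  for (int i = 0; i < 3; ++i) {
    // Each field must start with a digit and have at most four digits.
    if (pos >= text.size() || !std::isdigit(static_cast<unsigned char>(text[pos])))
      return false;
    int value = 0;
    int digits = 0;
    while (pos < text.size() && std::isdigit(static_cast<unsigned char>(text[pos]))) {
      if (++digits > 4) return false;
      value = value * 10 + (text[pos] - '0');
      ++pos;
    }
    fields[i] = value;

    // Between fields, require exactly one non-alphanumeric separator.
    if (i < 2) {
      if (pos >= text.size() || std::isalnum(static_cast<unsigned char>(text[pos])))
        return false;
      ++pos;
    }
  }
  if (pos != text.size()) return false;  // trailing characters are rejected

  bool seenY = false, seenM = false, seenD = false;
  Ymd result = {0, 0, 0};
  for (int i = 0; i < 3; ++i) {
    switch (order[i]) {
      case 'y': result.year = fields[i]; seenY = true; break;
      case 'm': result.month = fields[i]; seenM = true; break;
      case 'd': result.day = fields[i]; seenD = true; break;
      default: return false;
    }
  }
  if (!(seenY && seenM && seenD)) return false;

  // Coarse range check only; month lengths and leap years are ignored here.
  if (result.month < 1 || result.month > 12) return false;
  if (result.day < 1 || result.day > 31) return false;

  out = result;
  return true;
}
```

With this sketch, `10/02/2024` yields October 2 under `"mdy"` and February 10 under `"dmy"`, which is the ambiguity an explicit order is meant to remove.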
Date and date-time auto-detection now also accepts any non-alphanumeric
separator between components and falls back to a year-last heuristic that
disambiguates D/M/YYYY vs M/D/YYYY (defaulting to MDY when ambiguous).
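
The year-last disambiguation rule described above can be sketched as follows. This is an assumption-laden illustration, not the PR's `parseYearLastHeuristic()`: the `YearLastOrder` enum and `guessYearLastOrder()` are hypothetical names, and the function takes the two leading numeric components as already-parsed integers.

```cpp
// Hypothetical outcome type for this sketch; not part of readr.
enum class YearLastOrder { MDY, DMY, Invalid };

// Illustrative only: given the first two numeric components of a year-last
// date (first/second/YYYY), decide which one is the month.
// A component greater than 12 cannot be a month, so it must be the day.
// When both components could be months, default to MDY.
YearLastOrder guessYearLastOrder(int first, int second) {
  const bool firstCanBeMonth = first >= 1 && first <= 12;
  const bool secondCanBeMonth = second >= 1 && second <= 12;
  const bool firstCanBeDay = first >= 1 && first <= 31;
  const bool secondCanBeDay = second >= 1 && second <= 31;

  // First component too large for a month: only D/M/YYYY can fit.
  if (!firstCanBeMonth && firstCanBeDay && secondCanBeMonth)
    return YearLastOrder::DMY;

  // Either unambiguous M/D/YYYY (second > 12) or ambiguous (both <= 12),
  // in which case MDY is the default.
  if (firstCanBeMonth && secondCanBeDay)
    return YearLastOrder::MDY;

  return YearLastOrder::Invalid;
}
```

For example, `25/12/2024` can only be DMY, `12/25/2024` can only be MDY, and `10/02/2024` is ambiguous and falls back to MDY.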
When date_order is set, CollectorDate / CollectorDateTime dispatch through
DateTimeParser::parseDateOrder(); guess logic in isDate() / isDateTime()
routes date-only vs time-suffixed orders accordingly.
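
To make the date-only vs time-suffixed routing concrete, the sketch below splits a `date_order` string into its date and time parts. It is hypothetical: `OrderSpec`, `splitDateOrder()`, and `hasTimeSuffix()` are invented names, and the accepted set (any permutation of `y`, `m`, `d`, optionally followed by `_h`, `_hm`, or `_hms`) is an assumption extrapolated from the examples in this PR. The PR itself validates `date_order` in R.

```cpp
#include <string>

// Hypothetical parsed form of a date_order string; not part of readr.
struct OrderSpec {
  std::string datePart;  // e.g. "mdy"
  std::string timePart;  // e.g. "hms", empty for date-only orders
  bool valid;
};

// Illustrative only: split "mdy_hms" into "mdy" and "hms" and check both
// halves against the assumed set of allowed values.
OrderSpec splitDateOrder(const std::string& order) {
  OrderSpec spec{"", "", false};

  const std::size_t underscore = order.find('_');
  spec.datePart = order.substr(0, underscore);
  if (underscore != std::string::npos)
    spec.timePart = order.substr(underscore + 1);

  // The date part must contain each of 'y', 'm', 'd' exactly once.
  const bool dateOk = spec.datePart.size() == 3 &&
                      spec.datePart.find('y') != std::string::npos &&
                      spec.datePart.find('m') != std::string::npos &&
                      spec.datePart.find('d') != std::string::npos;

  // Assumed time suffixes: hour, hour-minute, hour-minute-second.
  const bool timeOk =
      underscore == std::string::npos ||
      spec.timePart == "h" || spec.timePart == "hm" || spec.timePart == "hms";

  spec.valid = dateOk && timeOk;
  return spec;
}

// A guesser could use this to send date-only orders to the date check and
// time-suffixed orders to the date-time check.
bool hasTimeSuffix(const OrderSpec& spec) {
  return spec.valid && !spec.timePart.empty();
}
```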
Adds end-to-end read_csv() tests plus locale() and parser unit tests
covering explicit date_order, auto MDY/DMY detection, separator variants,
and YMD backward compatibility.
Summary

Adds a `date_order` argument to `locale()` and makes date / date-time auto-detection more forgiving, so year-last dates (e.g. `10/02/2024`) can be parsed as dates instead of guessed as character.

- `locale(date_order =)`: new optional argument accepting an explicit component order: `"ymd"`, `"mdy"`, `"dmy"`, etc., optionally with a time suffix (`"mdy_hms"`, `"dmy_hm"`, `"ymd_h"`). `NULL` (default) keeps current behaviour. Validated in R with a clear error message.
- `DateTimeParser::parseDateOrder()`: parses a value against an explicit order, including an optional `T`/space-separated time part.
- `DateTimeParser::parseYearLastHeuristic()`: recognises unambiguous `D/M/YYYY` vs `M/D/YYYY` (`part > 12` disambiguates; defaults to MDY when ambiguous). Used as an auto-detection fallback in both `isDate()` (guesser) and `CollectorDate::setValue()` (collector) so the two agree.
- Date and date-time auto-detection accepts any non-alphanumeric separator between components (`2024.10.02`, `2024/10/02`, …).
- `CollectorDate` / `CollectorDateTime` hold a `LocaleInfo*` and dispatch through `parseDateOrder()` when `date_order` is set with no explicit format.

Scope / known limitation

These changes affect readr's own C++ engine, which powers `parse_date()`, `parse_datetime()`, `guess_parser()`, and the first edition of `read_csv()`.

The second edition of `read_csv()` (default since readr 2.0) delegates parsing to the vroom package, which does not yet know about `date_order`. Full end-to-end `read_csv()` support in the default edition therefore depends on the companion vroom change (tidyverse/vroom#623). End-to-end `read_csv()` tests are intentionally omitted from this PR until that lands.

Test plan

- `tests/testthat/test-locale.R`: `locale()` accepts valid `date_order`, rejects invalid values, defaults to `NULL`
- `tests/testthat/test-parsing-datetime.R`:
  - `guess_parser()` detects MDY/DMY dates and datetimes with explicit `date_order`
  - `guess_parser()` auto-detects unambiguous DMY year-last dates; ambiguous year-last defaults to MDY
  - `parse_date()` / `parse_datetime()` honour `locale(date_order =)`