matej_suchanek created this task. matej_suchanek added a project: Wikidata. Restricted Application added a subscriber: Aklapper.
TASK DESCRIPTION
MonthNameUnlocalizer <https://github.com/wmde/Time/blob/master/src/ValueParsers/MonthNameUnlocalizer.php> is used by PhpDateTimeParser <https://github.com/wmde/Time/blob/master/src/ValueParsers/PhpDateTimeParser.php#L83> to unlocalize month names in user input by textually replacing them with their English equivalents. The result is then parsed using PHP's `DateTime` object. For example, `28. prosinec 2022` becomes `28. December 2022`, which PHP can understand.

In production, MonthNameUnlocalizer's replacements are populated using `MediaWikiMonthNameProvider`, which looks up each month's name, genitive form, and abbreviation <https://github.com/wikimedia/Wikibase/blob/master/repo/includes/Parsers/MediaWikiMonthNameProvider.php#L50-L52> in the given language. In Czech (cs), the replacements are as follows:

> $provider = new \Wikibase\Repo\Parsers\MediaWikiMonthNameProvider();
> $provider->getMonthNumbers( 'cs' );
= [
    "leden" => 1, "ledna" => 1, "1." => 1,
    "únor" => 2, "února" => 2, "2." => 2,
    "březen" => 3, "března" => 3, "3." => 3,
    ...
  ]

The abbreviated month name is always a number followed by a dot. Therefore, if the input is in the DD. MM. YYYY format, the day may be replaced instead of the month (since the string is scanned left to right during replacement). For example, `5. 4. 1891` (April 5th, 1891) can become either `5. April 1891` (parsed correctly) or `May 4. 1891` (parsed with day and month swapped). In general, the outcome depends on which match comes first.

If the day is also zero-padded (e.g., `07.05.1997`), the replacement ignores the leading zeros and may produce either `07. 0May 1997` or `0July 05. 1997`. PhpDateTimeParser then transforms these to `07.0May.1997` or `0July.05.1997`, respectively, and lets PHP parse them. The result seems to depend on the PHP version. In production (PHP 7.4), the date is parsed as June 30th, 1997 (1997-07-00 -> 1997-06-30). On PHP 8.1, it is considered invalid.
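The ambiguity described above can be reproduced with a minimal sketch. Note that `replaceOnce()` is a hypothetical helper written for this illustration, not the MonthNameUnlocalizer code, and the table is a hand-picked subset of the Czech abbreviations already mapped to English names. The zero-padded input is written with spaces here, which is an assumption made so that the output matches the strings quoted in the description. The sketch only lists the candidate results; which one the real code produces depends on the order in which it tries the replacements.

```php
<?php
// Hypothetical helper (not part of wmde/Time): replace only the first
// occurrence of $search in $date, or return $date unchanged if absent.
function replaceOnce( string $date, string $search, string $replace ): string {
    $pos = strpos( $date, $search );
    if ( $pos === false ) {
        return $date;
    }
    return substr_replace( $date, $replace, $pos, strlen( $search ) );
}

// Assumed subset of the Czech "abbreviations" (number plus dot),
// mapped directly to the English month names used for unlocalizing.
$abbreviations = [ '4.' => 'April', '5.' => 'May', '7.' => 'July' ];

foreach ( [ '5. 4. 1891', '07. 05. 1997' ] as $input ) {
    foreach ( $abbreviations as $search => $replace ) {
        $candidate = replaceOnce( $input, $search, $replace );
        if ( $candidate !== $input ) {
            // Every line printed here is a possible "unlocalized" result.
            echo "$input -> $candidate\n";
        }
    }
}
```

Each input yields two different candidates, and only one of them keeps the day and month in their intended positions.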
TASK DETAIL
https://phabricator.wikimedia.org/T325988

To: matej_suchanek
Cc: Aklapper, matej_suchanek, Astuthiodit_1, karapayneWMDE, Invadibot, maantietaja, ItamarWMDE, Akuckartz, Nandana, Lahi, Gq86, GoranSMilovanovic, QZanden, LawExplorer, _jensen, rosalieper, Scott_WUaS, Wikidata-bugs, aude, Mbch331
