matej_suchanek created this task.
matej_suchanek added a project: Wikidata.
Restricted Application added a subscriber: Aklapper.

TASK DESCRIPTION
  MonthNameUnlocalizer
<https://github.com/wmde/Time/blob/master/src/ValueParsers/MonthNameUnlocalizer.php>
is used by PhpDateTimeParser
<https://github.com/wmde/Time/blob/master/src/ValueParsers/PhpDateTimeParser.php#L83>
to unlocalize month names in user input by textually replacing them with their
English equivalents. The result is then parsed using PHP's `DateTime` class.
  
  For example, `28. prosinec 2022` is replaced with `28. December 2022`, which 
PHP can understand.
  
  In production, MonthNameUnlocalizer's replacements are populated using 
`MediaWikiMonthNameProvider`, which for each month looks up its name, its 
genitive form, and its abbreviation 
<https://github.com/wikimedia/Wikibase/blob/master/repo/includes/Parsers/MediaWikiMonthNameProvider.php#L50-L52>
 in the given language.
  
  In Czech (cs), the replacements are as follows:
  
    > $provider = new \Wikibase\Repo\Parsers\MediaWikiMonthNameProvider();
    > $provider->getMonthNumbers( 'cs' );
    = [
        "leden" => 1,
        "ledna" => 1,
        "1." => 1,
        "únor" => 2,
        "února" => 2,
        "2." => 2,
        "březen" => 3,
        "března" => 3,
        "3." => 3,
        ...
    ]
  
  In Czech, the abbreviated month is always the month number followed by a dot.
Therefore, if the input is in the DD. MM. YYYY format, the day may be replaced
instead of the month (since the string is scanned left to right during
replacement).
  For example, `5. 4. 1891` (April 5th, 1891) can become either `5. April
1891` (parsed correctly) or `May 4. 1891` (parsed with day and month swapped).
In general, this depends on which comes first.
  
  If the day is also zero-padded (e.g., `07.05.1997`), the replacement
ignores the zeros and may create either `07. 0May 1997` or `0July 05. 1997`.
PhpDateTimeParser then transforms them to either `07.0May.1997` or
`0July.05.1997` and lets PHP parse them.
  The result seems to depend on the PHP version. In production (PHP 7.4), the
date is parsed as June 30th, 1997 (1997-07-00 -> 1997-06-30). On PHP 8.1, it is
considered invalid.
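
  The `1997-07-00 -> 1997-06-30` step is a day-zero rollover: day 0 of a month
is taken as the day before day 1. The Python sketch below reproduces only that
arithmetic. `resolve_day_zero` is a hypothetical helper, and it does not model
how PHP arrives at `1997-07-00` from the mangled input.

```python
from datetime import date, timedelta


def resolve_day_zero(year, month):
    """Interpret day 0 of the given month as the day before its first day."""
    return date(year, month, 1) - timedelta(days=1)


print(resolve_day_zero(1997, 7))  # 1997-06-30
```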

TASK DETAIL
  https://phabricator.wikimedia.org/T325988

To: matej_suchanek
Cc: Aklapper, matej_suchanek, Astuthiodit_1, karapayneWMDE, Invadibot, 
maantietaja, ItamarWMDE, Akuckartz, Nandana, Lahi, Gq86, GoranSMilovanovic, 
QZanden, LawExplorer, _jensen, rosalieper, Scott_WUaS, Wikidata-bugs, aude, 
Mbch331