thiemowmde added subscribers: thiemowmde, Addshore.
thiemowmde added a project: Need-volunteer.
thiemowmde triaged this task as "Low" priority.
thiemowmde added a comment.

This is another situation where none of the (currently half a dozen) custom Wikibase parsers is able to understand an input string, so parsing falls back to PHP's problematic built-in parser (see http://php.net/manual/en/datetime.formats.php).

In my opinion the best option is to improve the existing YearMonthTimeParser. This parser is meant to understand dates with precision "month".

// Before:
'/^(-?[\d\p{L}]+)\s*?[\/\-\s.,]\s*(-?[\d\p{L}]+)$/'

// After:
'/^[\p{P}\p{Z}]*?(-?[\p{L}\p{N}]+)\p{Z}*?[\p{P}\p{Z}]\p{Z}*(-?[\p{L}\p{N}]+)[\p{P}\p{Z}]*$/'

// The same, just documented:
'/^
    [\p{P}\p{Z}]*?     # irrelevant punctuation/whitespace (ungreedy)
    (-?[\p{L}\p{N}]+)  # capture group 1 contains either month or year
    \p{Z}*?            # irrelevant whitespace (ungreedy)
    [\p{P}\p{Z}]       # at least 1 separator
    \p{Z}*             # irrelevant whitespace
    (-?[\p{L}\p{N}]+)  # capture group 2 contains either month or year
    [\p{P}\p{Z}]*      # irrelevant punctuation/whitespace
    $/x'

https://www.regular-expressions.info/unicode.html is a nice cheat sheet for these \p{…} Unicode character classes.
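
To illustrate the difference between the two patterns, here is a minimal standalone sketch. It is not the actual YearMonthTimeParser code: the helper captureParts() and the sample inputs are hypothetical and not taken from the task, and the trailing "u" modifier is an assumption added so the \p{…} classes operate on UTF-8 code points.

```php
<?php
// Patterns copied from the comment above. The "u" modifier is an assumption
// of this sketch, not something the original patterns are known to carry.
$before = '/^(-?[\d\p{L}]+)\s*?[\/\-\s.,]\s*(-?[\d\p{L}]+)$/u';
$after = '/^[\p{P}\p{Z}]*?(-?[\p{L}\p{N}]+)\p{Z}*?[\p{P}\p{Z}]\p{Z}*(-?[\p{L}\p{N}]+)[\p{P}\p{Z}]*$/u';

/**
 * Hypothetical helper: returns the two captured parts (month and year in
 * either order), or null when the pattern does not match the input.
 */
function captureParts( string $pattern, string $input ): ?array {
	if ( preg_match( $pattern, $input, $matches ) === 1 ) {
		return [ $matches[1], $matches[2] ];
	}
	return null;
}

// Hypothetical inputs, chosen only to exercise leading/trailing punctuation.
$inputs = [ 'May 2018', '2018-05', 'May 2018.', '(May, 2018)' ];

foreach ( $inputs as $input ) {
	printf(
		"%-12s before: %-16s after: %s\n",
		$input,
		json_encode( captureParts( $before, $input ) ),
		json_encode( captureParts( $after, $input ) )
	);
}
```

With these inputs the old pattern should reject the last two strings because of the surrounding punctuation, while the new pattern should capture the same two parts for all four. Deciding which captured part is the month and which is the year is left to the parser and is not shown here.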

Properly testing this in YearMonthTimeParserTest is a must. Additionally, at least one relevant edge case should be added to TimeParserFactoryTest.


TASK DETAIL
https://phabricator.wikimedia.org/T198179

To: thiemowmde
Cc: Addshore, thiemowmde, Aklapper, Nikki, tabish.shaikh91, Lahi, Gq86, GoranSMilovanovic, Soteriaspace, Jayprakash12345, JakeTheDeveloper, QZanden, merbst, LawExplorer, D3r1ck01, Wikidata-bugs, aude, TheDJ, Mbch331