https://bugzilla.wikimedia.org/show_bug.cgi?id=27807

Sam Reed (reedy) <s...@reedyboy.net> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |s...@reedyboy.net

--- Comment #11 from Sam Reed (reedy) <s...@reedyboy.net> ---
So, Legoktm and I were just looking at it. There are a few broken entries that
can be easily fixed with common sense (newlines in the middle and such).

The date regex is far too naive to cater for all the localised date formats.

$rxTimestamp = '(?P<timestamp>\d+:\d+, \d+ \w+ \d+)';

We tried using '(?P<timestamp>.*?)'. It's a bit better when combined with the
optional comma after it, but it then causes issues with dates that have early
commas.
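
To illustrate the early-comma problem, here is a minimal sketch. The
$rxUser, $rxTarget and $rxWiki definitions and the sample line are
hypothetical stand-ins (the real ones are not shown in this comment), so
treat this as a demonstration of the failure mode, not of the actual script.

```php
<?php
// Hypothetical stand-ins: the real sub-patterns are not shown in this comment.
$rxTimestamp = '(?P<timestamp>.*?)';   // the lazy timestamp we tried
$rxUser      = '(?P<user>.+?)';        // assumed to be fairly permissive
$rxTarget    = '(?P<target>\S+)';
$rxWiki      = '(?P<wiki>\w+)';

$pattern = "!^<li>$rxTimestamp,? $rxUser got edits for $rxTarget on $rxWiki</li>!u";

// Made-up sample line in the "HH:MM, D Month YYYY" format.
$line = '<li>21:20, 20 October 2006 ExampleUser got edits for 192.0.2.1 on fawiki</li>';

$matched = preg_match($pattern, $line, $m);
if ($matched === 1) {
    // The lazy group stops at the first place where ",? " can match,
    // which is the comma inside the date itself.
    echo "timestamp: {$m['timestamp']}\n"; // timestamp: 21:20
    echo "user: {$m['user']}\n";           // user: 20 October 2006 ExampleUser
}
```

With these assumed sub-patterns the line still matches, but the rest of the
date ends up in the user group, which is arguably worse than no match at all.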

[bad timestamp] <li>۲۱:۲۰, ۲۰ اکتبر ۲۰۰۶ Jon Harald Søby got edits
XXX.XXX.XXX.XXX on fawiki</li>

And others such as 2006-10-25T20:29:01
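
A possible direction, sketched under assumptions: make the timestamp pattern
Unicode-aware (\p{Nd} for any decimal digit, \p{L} for any letter, with the
/u modifier) and add an alternative for the ISO-style form. The function name
extractTimestamp() and the sample lines are made up for illustration, and
this still only covers the "HH:MM, D Month YYYY" ordering plus ISO 8601, not
every localised format.

```php
<?php
// Sketch only: two alternatives, ISO 8601 first, then a Unicode-aware
// version of the original "HH:MM, D Month YYYY" pattern.
$rxTimestamp = '(?P<timestamp>'
    . '\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}'
    . '|\p{Nd}+:\p{Nd}+, \p{Nd}+ \p{L}+ \p{Nd}+'
    . ')';

// Hypothetical helper: returns the raw timestamp text, or null if none found.
function extractTimestamp(string $line, string $rxTimestamp): ?string
{
    // The /u modifier makes PCRE treat pattern and subject as UTF-8.
    if (preg_match("!^<li>$rxTimestamp!u", $line, $m) === 1) {
        return $m['timestamp'];
    }
    return null;
}

// Made-up sample lines (user and target are placeholders).
$persian = '<li>۲۱:۲۰, ۲۰ اکتبر ۲۰۰۶ ExampleUser got edits for 192.0.2.1 on fawiki</li>';
$iso     = '<li>2006-10-25T20:29:01 ExampleUser got edits for 192.0.2.1 on fawiki</li>';

// The current ASCII-only pattern, for comparison.
$oldMatches = preg_match('!^<li>(?P<timestamp>\d+:\d+, \d+ \w+ \d+)!', $persian);

var_dump(extractTimestamp($persian, $rxTimestamp));
var_dump(extractTimestamp($iso, $rxTimestamp));
var_dump($oldMatches); // int(0): the ASCII-only pattern does not match
```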

    $regexes = array(
        'ipedits-xff' => "!^<li>$rxTimestamp,? $rxUser got edits for XFF $rxTarget on $rxWiki$rxReason</li>!",
        'ipedits'     => "!^<li>$rxTimestamp,? $rxUser got edits for" . " $rxTarget on $rxWiki$rxReason</li>!",
        'ipusers-xff' => "!^<li>$rxTimestamp,? $rxUser got users for XFF $rxTarget on $rxWiki$rxReason</li>!",
        'ipusers'     => "!^<li>$rxTimestamp,? $rxUser got users for" . " $rxTarget on $rxWiki$rxReason</li>!",
        'userips'     => "!^<li>$rxTimestamp,? $rxUser got IPs for" . " $rxTarget on $rxWiki$rxReason</li>!"
    );

The first comma seems to be optional in some formats, so that part was easy to
improve on.

The code is also using strtotime(), which isn't so good for these localised
formats. Its documentation describes it as: "Parse about any English textual
datetime description into a Unix timestamp" - http://us1.php.net/strtotime
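
A rough sketch of what a more tolerant parser could start from. The function
name parseLogTimestamp() is hypothetical, and treating the log times as UTC
is an assumption. It maps Persian and Arabic-Indic digits to ASCII, tries the
ISO-style form explicitly, and only then falls back to strtotime(). Localised
month names are still not handled, so those rows come back as null.

```php
<?php
// Assumption: log times are treated as UTC.
date_default_timezone_set('UTC');

// Hypothetical helper: returns a Unix timestamp, or null if unparseable.
function parseLogTimestamp(string $raw): ?int
{
    // Map Persian (U+06F0..U+06F9) and Arabic-Indic (U+0660..U+0669)
    // digits to their ASCII equivalents.
    static $digits = [
        '۰' => '0', '۱' => '1', '۲' => '2', '۳' => '3', '۴' => '4',
        '۵' => '5', '۶' => '6', '۷' => '7', '۸' => '8', '۹' => '9',
        '٠' => '0', '١' => '1', '٢' => '2', '٣' => '3', '٤' => '4',
        '٥' => '5', '٦' => '6', '٧' => '7', '٨' => '8', '٩' => '9',
    ];
    $ascii = strtr($raw, $digits);

    // Try the ISO-style form first, e.g. 2006-10-25T20:29:01.
    $dt = DateTime::createFromFormat('!Y-m-d\TH:i:s', $ascii, new DateTimeZone('UTC'));
    if ($dt !== false) {
        return $dt->getTimestamp();
    }

    // Fall back to strtotime(), which only understands English text.
    $ts = strtotime($ascii);
    return $ts === false ? null : $ts;
}

var_dump(parseLogTimestamp('2006-10-25T20:29:01'));
var_dump(parseLogTimestamp('۲۱:۲۰, ۲۰ اکتبر ۲۰۰۶')); // NULL: month name is not English
```

Rows that come back as null would still need a per-language month-name table
or similar, which this sketch does not attempt.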


I'm guessing that the timestamp is in whatever format the person who did the
action has set in their preferences. Awesome, no?

There seem to be 10-20% of rows that won't be processed without at least some
changes to the code as it currently is.
