https://bugzilla.wikimedia.org/show_bug.cgi?id=36839

--- Comment #16 from Sam Reed (reedy) <[email protected]> 2012-05-27 12:26:18 UTC ---
(In reply to comment #15)
> (In reply to comment #10)
> > If I'm right, the fix for this bug would be to revert Roan's change to the
> > "pcre.recursion_limit" setting (and fix whatever PageTriage's problem is in
> > some other way), or at least turn it up to something more reasonable than 
> > 1024.
> > I'd expect this is causing problems in other areas of the code, too.
> 
> Sam, have you had a chance to look at this yet?

Nope.

We can increase pcre.recursion_limit again; the value Roan set was overly
conservative.

I'm not sure how available the various Ops staff are going to be to push
this through.
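
To check whether the limit really is what breaks the match, something along
these lines could be run on an affected host. It is only a sketch:
utf8RegexpStatus() is a made-up helper name, not anything in core, and
whether the low limit actually produces an error depends on the PCRE build
(with the PCRE JIT in use, the recursion limit may not apply at all).

```php
<?php
// Hypothetical helper (not in MediaWiki): runs the UTF-8 regexp from
// Language::checkTitleEncoding under a given pcre.recursion_limit and
// reports whether PCRE matched, did not match, or gave up with an error.
function utf8RegexpStatus( $s, $recursionLimit ) {
    $old = ini_set( 'pcre.recursion_limit', (string)$recursionLimit );
    $result = preg_match( '/^([\x00-\x7f]|[\xc0-\xdf][\x80-\xbf]|' .
        '[\xe0-\xef][\x80-\xbf]{2}|[\xf0-\xf7][\x80-\xbf]{3})+$/', $s );
    $error = preg_last_error();
    if ( $old !== false ) {
        // Restore the previous limit so the caller's settings are untouched.
        ini_set( 'pcre.recursion_limit', $old );
    }
    if ( $result === false || $error !== PREG_NO_ERROR ) {
        return $error === PREG_RECURSION_LIMIT_ERROR
            ? 'recursion-limit' : 'pcre-error';
    }
    return $result === 1 ? 'match' : 'no-match';
}

// Compare a long, title-like string under the low and a higher limit.
$long = str_repeat( 'a', 5000 );
echo utf8RegexpStatus( $long, 1024 ), "\n";
echo utf8RegexpStatus( $long, 100000 ), "\n";
```

If the first call reports 'recursion-limit' and the second 'match', that
would support the theory in comment #10.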

(In reply to comment #13)
> The regexp recursion limit aside, is using a regexp to check for UTF-8
> appropriate? Why not use mb_check_encoding() if available? Other operations in
> Language.php do make use of the mb_* functions...
> 
> http://php.net/manual/en/function.mb-check-encoding.php

I'm guessing legacy reasons, and the code was never updated.

Please feel free to submit a patch to BZ or a commit to Gerrit. Or is just
changing:

        $isutf8 = preg_match( '/^([\x00-\x7f]|[\xc0-\xdf][\x80-\xbf]|' .
                '[\xe0-\xef][\x80-\xbf]{2}|[\xf0-\xf7][\x80-\xbf]{3})+$/', $s );
        if ( $isutf8 ) {
            return $s;
        }

to

        if ( mb_check_encoding( $s, 'UTF-8' ) ) {
            return $s;
        }

enough?
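
One thing to watch if swapping these over: the two checks are not quite
equivalent. The regexp needs at least one character (because of the "+"), so
it rejects the empty string, while mb_check_encoding() accepts it. As far as
I know, mbstring is also stricter about things like overlong sequences. A
minimal sketch that also covers the "if available" part, using a made-up
helper name isValidUtf8():

```php
<?php
// Hypothetical helper (the name is illustrative only): prefer mbstring when
// the extension is loaded, otherwise fall back to the existing regexp.
function isValidUtf8( $s ) {
    if ( function_exists( 'mb_check_encoding' ) ) {
        return mb_check_encoding( $s, 'UTF-8' );
    }
    // Fallback: the regexp currently used in Language::checkTitleEncoding.
    // preg_match() gives 0 for '' and false on a PCRE error, so both of
    // those end up reported as "not UTF-8" here.
    return preg_match( '/^([\x00-\x7f]|[\xc0-\xdf][\x80-\xbf]|' .
        '[\xe0-\xef][\x80-\xbf]{2}|[\xf0-\xf7][\x80-\xbf]{3})+$/', $s ) === 1;
}
```

Whether the empty-string difference matters for checkTitleEncoding depends
on what its callers pass in, which I haven't checked.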

Language::checkTitleEncoding has no unit tests written for it either.
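
A first set of cases for such a test might look like the following. This is
a plain-PHP sketch rather than a real PHPUnit test case: encodingCases() and
failingEncodingCases() are hypothetical names, and the check under test is
passed in as a callback so the sketch doesn't depend on the Language class.

```php
<?php
// Hypothetical case table: label => array( input bytes, valid UTF-8? ).
function encodingCases() {
    return array(
        'ascii' => array( "Main Page", true ),
        'two-byte sequence' => array( "\xc3\xa9", true ),
        'three-byte sequence' => array( "\xe2\x82\xac", true ),
        'four-byte sequence' => array( "\xf0\x9f\x98\x80", true ),
        // Long enough that a low pcre.recursion_limit might plausibly trip.
        'long ascii string' => array( str_repeat( 'a', 2000 ), true ),
        'lone continuation byte' => array( "\x80", false ),
        'byte never valid in UTF-8' => array( "\xff", false ),
        'truncated two-byte sequence' => array( "\xc3", false ),
    );
}

// Returns the labels of all cases where $check disagrees with the table.
function failingEncodingCases( $check ) {
    $failed = array();
    foreach ( encodingCases() as $label => $case ) {
        list( $input, $expected ) = $case;
        if ( (bool)call_user_func( $check, $input ) !== $expected ) {
            $failed[] = $label;
        }
    }
    return $failed;
}
```

Passing in a callback that wraps either the regexp or mb_check_encoding()
should give back an empty array if that check agrees with every case.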

