https://bugzilla.wikimedia.org/show_bug.cgi?id=36839
--- Comment #16 from Sam Reed (reedy) 2012-05-27 12:26:18 UTC ---
(In reply to comment #15)
> (In reply to comment #10)
> > If I'm right, the fix for this bug would be to revert Roan's change to the
> > "pcre.recursion_limit" setting (and fix whatever PageTriage's problem is in
> > some other way), or at least turn it up to something more reasonable than
> > 1024.
>
> I'd expect this is causing problems in other areas of the code, too.
>
> Sam, have you had a chance to look at this yet?

Nope.

We can increase the pcre.recursion_limit again, the value Roan set was overly
conservative. I'm not sure what the availability of various Ops staff is going
to be to push this through.

(In reply to comment #13)
> The regexp recursion limit aside, is using a regexp to check for UTF-8
> appropriate? Why not use mb_check_encoding() if available? Other operations in
> Language.php do make use of the mb_* functions...
>
> http://php.net/manual/en/function.mb-check-encoding.php

I'm guessing legacy reasons, and the code was never updated. Please feel free
to submit a patch to BZ or a commit to Gerrit.

Or is just changing:

    $isutf8 = preg_match( '/^([\x00-\x7f]|[\xc0-\xdf][\x80-\xbf]|' .
        '[\xe0-\xef][\x80-\xbf]{2}|[\xf0-\xf7][\x80-\xbf]{3})+$/', $s );
    if ( $isutf8 ) {
        return $s;
    }

to

    if ( mb_check_encoding( $s, 'UTF-8' ) ) {
        return $s;
    }

enough?

Language::checkTitleEncoding has no unit tests written for it either.
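
One way to judge whether the swap is "enough" is to run both checks side by side on a few inputs. The following is only a minimal sketch, assuming the mbstring extension is loaded; the helper names isUtf8ByRegex() and isUtf8ByMbstring() and the sample strings are hypothetical, made up for illustration, and are not part of Language.php.

```php
<?php
// Hypothetical comparison helpers; these names are NOT MediaWiki API.

/** Legacy check: the regex quoted in the comment above, unchanged. */
function isUtf8ByRegex( string $s ): bool {
    // preg_match() returns 1 on match, 0 on no match, false on error
    // (for example when a PCRE limit is hit), so compare strictly to 1.
    return preg_match(
        '/^([\x00-\x7f]|[\xc0-\xdf][\x80-\xbf]|' .
        '[\xe0-\xef][\x80-\xbf]{2}|[\xf0-\xf7][\x80-\xbf]{3})+$/',
        $s
    ) === 1;
}

/** Proposed check: delegate to mbstring (assumes the extension is available). */
function isUtf8ByMbstring( string $s ): bool {
    return mb_check_encoding( $s, 'UTF-8' );
}

// Illustrative inputs only; a real test would use actual title strings.
$samples = [
    'ascii'          => 'Main Page',
    'valid two-byte' => "Caf\xC3\xA9", // "Café" encoded as UTF-8
    'latin-1 byte'   => "Caf\xE9",     // "Café" encoded as ISO-8859-1
    'overlong NUL'   => "\xC0\x80",    // structurally two bytes, but not valid UTF-8
    'empty string'   => '',
];

foreach ( $samples as $label => $s ) {
    printf(
        "%-16s regex=%s mbstring=%s\n",
        $label,
        var_export( isUtf8ByRegex( $s ), true ),
        var_export( isUtf8ByMbstring( $s ), true )
    );
}
```

The two checks should agree on ordinary input but can differ at the edges: the regex only checks byte-range structure, so it accepts the overlong sequence, and its `+` quantifier makes it reject the empty string, while mb_check_encoding() is expected to do the opposite in both cases. Exact mbstring behaviour on edge cases may vary between PHP versions, so cases like these would be worth pinning down in unit tests for Language::checkTitleEncoding before making the change.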
