https://bugzilla.wikimedia.org/show_bug.cgi?id=69481

Brad Jorsch <[email protected]> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |WONTFIX

--- Comment #13 from Brad Jorsch <[email protected]> ---
(In reply to Jackmcbarn from comment #11)
> Linux and Windows both say that U+D800 through U+DFFF are invalid.

That's not terribly surprising. U+D800 through U+DFFF are the UTF-16
surrogates, which have no use in UTF-8 and are not valid there.

> but Windows also complains about
> U+FDD0 through U+FDEF, as well as all codepoints that end in FFFE or FFFF.

So it's complaining about non-characters even though they're valid code points.
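
To make the distinction between the two groups concrete, here is a minimal Python sketch that classifies code points using the ranges quoted above. The function names are made up for illustration and are not part of Scribunto or PCRE.

```python
# Illustrative helpers only; the names are hypothetical.

def is_surrogate(cp: int) -> bool:
    """True for the UTF-16 surrogate code points U+D800..U+DFFF."""
    return 0xD800 <= cp <= 0xDFFF


def is_noncharacter(cp: int) -> bool:
    """True for U+FDD0..U+FDEF and for any code point ending in FFFE or FFFF."""
    if not 0 <= cp <= 0x10FFFF:
        return False
    # Masking with 0xFFFE ignores the lowest bit, so both xxFFFE and xxFFFF match.
    return 0xFDD0 <= cp <= 0xFDEF or (cp & 0xFFFE) == 0xFFFE


if __name__ == "__main__":
    for cp in (0xD800, 0xFDD0, 0xFFFE, 0x1FFFF, 0x0041):
        print(f"U+{cp:04X} surrogate={is_surrogate(cp)} "
              f"noncharacter={is_noncharacter(cp)}")
```

Python's own UTF-8 encoder draws the same line: it refuses to encode a lone surrogate such as chr(0xD800) but encodes a noncharacter such as chr(0xFFFE) without complaint.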

(In reply to Jackmcbarn from comment #12)
> Windows PCRE version: 8.32 2012-11-30
> Linux PCRE version: 8.31 2012-07-06

Looking through the PCRE changelog, I see the following for 8.33:

> 21. Unicode validation has been updated in the light of Unicode Corrigendum #9,
>    which points out that "non characters" are not "characters that may not
>    appear in Unicode strings" but rather "characters that are reserved for
>    internal use and have only local meaning".

This leads me to http://bugs.exim.org/show_bug.cgi?id=1340, which confirms that
the rejection of non-characters was a change in 8.32 that was fixed in 8.33.

About the only thing we could do in Scribunto would be to blacklist PCRE 8.32,
throwing an exception if that version is detected. I don't know that we really
want to go to that extent. I did update the documentation at
https://www.mediawiki.org/wiki/Extension:Scribunto#PCRE_version_compatibility
to note this issue.
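
For illustration only, a version check of the kind described above could look roughly like the following. This is a Python sketch of the logic, not Scribunto code. The function names are hypothetical, and the input is assumed to be a version string in the "8.32 2012-11-30" form quoted in comment #12.

```python
import re

# The one release discussed above that rejects non-characters.
AFFECTED_VERSION = (8, 32)


def parse_pcre_version(version_string: str) -> tuple:
    """Extract (major, minor) from a string such as '8.32 2012-11-30'."""
    match = re.match(r"\s*(\d+)\.(\d+)", version_string)
    if match is None:
        raise ValueError(f"unrecognised PCRE version: {version_string!r}")
    return int(match.group(1)), int(match.group(2))


def is_affected_pcre(version_string: str) -> bool:
    """True only for PCRE 8.32, per the changelog entry for 8.33."""
    return parse_pcre_version(version_string) == AFFECTED_VERSION


if __name__ == "__main__":
    for version in ("8.31 2012-07-06", "8.32 2012-11-30"):
        print(version, "affected" if is_affected_pcre(version) else "ok")
```

A caller wanting the blacklist behaviour would raise an exception when is_affected_pcre() returns true. A caller preferring something less drastic could log a warning instead.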
