https://bugzilla.wikimedia.org/show_bug.cgi?id=68724

--- Comment #6 from Bawolff (Brian Wolff) <[email protected]> ---
(In reply to Jean-Fred from comment #5)
> (In reply to Bawolff (Brian Wolff) from comment #2)
> > p.s. FWIW, C0 and C1 control characters including "STRING TERMINATOR" are
> > valid title characters (Although on commons they are blacklisted via title
> > blacklist)
> 
> You mean GWToolset ignores the title blacklist? That sounds bad.
> 
> (I noticed this error because the bot I fired to rename all the images of
> this batch choked on these 10 files with "STRING TERMINATOR" with an
> APIError. Not sure if the fault lies with Pywikibot, the MediaWiki API or
> something else, but such file titles are definitely a problem.

Yes. The 0x9C (U+009C, "STRING TERMINATOR") should be blocked by the
 .*\p{Cc}.* <casesensitive|errmsg=titleblacklist-custom-hidden-char> # Control
characters

rule. While such characters may technically be valid title characters according
to MediaWiki, there is really no good reason to ever use them, almost to the
point where one might want to assume that the text was converted wrongly and
automatically try to re-convert it as if it were windows-1252.
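
A rough sketch of what such a check and re-conversion could look like, in
Python. This is only an illustration under assumptions, not what MediaWiki,
the title blacklist or GWToolset actually do: the function names are made up,
and the sample title used in the note below is invented. It treats any
character in Unicode category Cc as a control character (what \p{Cc} matches),
and re-reads only the C1 range (U+0080 to U+009F) as windows-1252 bytes.

```python
import unicodedata


def has_control_chars(title: str) -> bool:
    """True if the title contains any Unicode 'Cc' (C0/C1 control) character."""
    return any(unicodedata.category(ch) == "Cc" for ch in title)


def reinterpret_c1_as_cp1252(title: str) -> str:
    """Hypothetical helper: re-read C1 control characters as windows-1252.

    Each code point in U+0080..U+009F is treated as a single byte and decoded
    as cp1252. Bytes that cp1252 leaves undefined (e.g. 0x81) are kept as-is.
    """
    out = []
    for ch in title:
        if 0x80 <= ord(ch) <= 0x9F:
            try:
                out.append(bytes([ord(ch)]).decode("cp1252"))
            except UnicodeDecodeError:
                # No windows-1252 mapping for this byte; leave it untouched.
                out.append(ch)
        else:
            out.append(ch)
    return "".join(out)
```

For example, reinterpret_c1_as_cp1252("C\u009cur") gives "Cœur", which no
longer trips has_control_chars(). C0 characters such as tab or newline are
deliberately left alone here, so a title containing those would still be
flagged.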


[As an off-topic aside, Commons also blocks all astral characters (mostly dead
languages and emoticons, but also a bunch of Chinese-Japanese-Korean
characters), which seems a tad restrictive for a multilingual project of
the scope that Commons is...]
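
For reference, "astral" here means any code point above U+FFFF, i.e. outside
the Basic Multilingual Plane. A minimal Python sketch of that test follows;
the function name is hypothetical and this is not the actual blacklist rule.

```python
def has_astral_chars(title: str) -> bool:
    """True if the title contains a code point outside the BMP (> U+FFFF)."""
    return any(ord(ch) > 0xFFFF for ch in title)
```

This would flag U+1F600 (an emoticon), U+13000 (an Egyptian hieroglyph) and
U+20000 (a CJK Extension B ideograph) alike, while ordinary CJK characters
such as U+4E2D stay in the BMP and pass.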
