https://bugzilla.wikimedia.org/show_bug.cgi?id=23629
Summary: incorrect UTF-8 processing on output of page and
section titles
Product: MediaWiki extensions
Version: any
Platform: All
URL: http://ru.wikipedia.org/w/index.php?title=Special:Search&fulltext=1&search=%D0%B0&ns4=1&uselang=en
OS/Version: All
Status: NEW
Keywords: utf8
Severity: normal
Priority: Normal
Component: Lucene Search
AssignedTo: [email protected]
ReportedBy: [email protected]
The search system used in most Wikimedia projects produces incorrect markup on
the search results page. There is no apparent flaw in the matching algorithm,
but the <span class="searchmatch"> tags are placed incorrectly when the search
term contains multibyte characters and appears in the title of a wiki page or
one of its sections. Probably the matching algorithm reports substring lengths
and offsets in characters (code points), which the HTML-generating code then
misinterprets as byte offsets.
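
The suspected mechanism can be illustrated with a minimal sketch. This is not
the extension's actual code: the function names (highlight_chars,
highlight_bytes, char_to_byte_offset) and the sample title are hypothetical,
chosen only to show what happens when a code-point offset is applied to a
UTF-8 byte string.

```python
OPEN = '<span class="searchmatch">'
CLOSE = "</span>"


def highlight_chars(title: str, start: int, length: int) -> str:
    """Expected behaviour: offsets are applied to code points."""
    end = start + length
    return title[:start] + OPEN + title[start:end] + CLOSE + title[end:]


def highlight_bytes(title: str, start: int, length: int) -> str:
    """Suspected bug (assumption): the same offsets are applied to UTF-8 bytes."""
    raw = title.encode("utf-8")
    end = start + length
    out = (raw[:start] + OPEN.encode("utf-8") + raw[start:end]
           + CLOSE.encode("utf-8") + raw[end:])
    # Splitting inside a multibyte sequence leaves invalid UTF-8 behind;
    # errors="replace" makes the damage visible as U+FFFD.
    return out.decode("utf-8", errors="replace")


def char_to_byte_offset(title: str, char_offset: int) -> int:
    """One possible fix: convert a code-point offset to a byte offset."""
    return len(title[:char_offset].encode("utf-8"))


title = "Буква а"        # hypothetical page title; each Cyrillic letter is 2 bytes
start, length = 6, 1     # the match "а" at code-point offset 6

print(highlight_chars(title, start, length))
print(highlight_bytes(title, start, length))
print(char_to_byte_offset(title, start))
```

With the character-based version the tags wrap the final "а". With the
byte-based version they land in the middle of the two-byte sequence for "в",
so the tags are both misplaced and split a character.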