https://bugzilla.wikimedia.org/show_bug.cgi?id=23629

           Summary: incorrect UTF-8 processing on output of page and
                    section titles
           Product: MediaWiki extensions
           Version: any
          Platform: All
               URL: http://ru.wikipedia.org/w/index.php?title=Special:Search&fulltext=1&search=%D0%B0&ns4=1&uselang=en
        OS/Version: All
            Status: NEW
          Keywords: utf8
          Severity: normal
          Priority: Normal
         Component: Lucene Search
        AssignedTo: [email protected]
        ReportedBy: [email protected]


The search system used in most Wikimedia projects produces incorrect markup on
the search results page. There is no apparent flaw in the matching algorithm,
but the <span class="searchmatch"> tags are placed incorrectly when the search
term contains multibyte characters and appears in the title of a wiki page or
one of its sections. Probably the matching algorithm provides substring lengths
and offsets in characters (code points), which the HTML-generating engine
incorrectly interprets as byte offsets.
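
The suspected mismatch can be illustrated with a small sketch. This is not
the actual Lucene Search code: the function names, the sample title and the
offsets are assumptions made up for illustration. It only shows what happens
when an offset counted in code points is applied to a UTF-8 byte string.

```python
# Hypothetical illustration only; none of these names come from the
# Lucene Search extension or MediaWiki.

OPEN_TAG = '<span class="searchmatch">'
CLOSE_TAG = "</span>"


def highlight_by_chars(title: str, start: int, length: int) -> str:
    """Treat start/length as code-point offsets (the intended behaviour)."""
    end = start + length
    return title[:start] + OPEN_TAG + title[start:end] + CLOSE_TAG + title[end:]


def highlight_by_bytes(title: str, start: int, length: int) -> str:
    """Apply the same numbers to the UTF-8 bytes (the suspected bug)."""
    raw = title.encode("utf-8")
    end = start + length
    out = (
        raw[:start]
        + OPEN_TAG.encode("utf-8")
        + raw[start:end]
        + CLOSE_TAG.encode("utf-8")
        + raw[end:]
    )
    # Slicing in the middle of a multibyte sequence leaves invalid bytes,
    # which are shown here as U+FFFD replacement characters.
    return out.decode("utf-8", errors="replace")


title = "Википедия"           # every letter takes two bytes in UTF-8
start = title.index("пед")    # offset in code points
length = len("пед")           # length in code points

print(highlight_by_chars(title, start, length))
print(highlight_by_bytes(title, start, length))
```

With an ASCII-only title both functions return the same string, which would
explain why the problem shows up only for multibyte search terms.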

-- 
Configure bugmail: https://bugzilla.wikimedia.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are on the CC list for the bug.

_______________________________________________
Wikibugs-l mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/wikibugs-l
