[Bug 8445] Multiple search terms are not enforced properly for Chinese

bugzilla-daemon Mon, 22 Dec 2008 18:12:36 -0800

https://bugzilla.wikimedia.org/show_bug.cgi?id=8445



Brion Vibber <[email protected]> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |[email protected]
           Severity|minor                       |normal
          Component|Page rendering              |Search
         OS/Version|Linux                       |All
           Priority|Low                         |Normal
           Platform|PC                          |All
            Summary|search: one character enough|Multiple search terms are
                   |to match?!                  |not enforced properly for
                   |                            |Chinese
            Version|1.7.1                       |1.14-svn




--- Comment #1 from Brion Vibber <[email protected]>  2008-12-23 02:12:30 UTC ---
Ok, it looks like the splitting of characters (done to compensate for the lack
of word spacing in Chinese text) is happening after the boolean search query is
constructed, leading to failure:

The input:
'逢甲'

is translated to a boolean query for a single required word:
'+逢甲'

which then gets split up by character, then encoded to compensate for encoding
bugs:
'+  U8e980a2  U8e794b2'

The '+' gets detached from the characters, so it has no effect, and the search
backend returns results that contain either character instead of requiring
both.

As a workaround, you can quote the multi-character string, which ends up
encoding correctly for a phrase search:
'+"  U8e980a2  U8e794b2"'
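
To make the ordering problem concrete, here is a minimal Python sketch that
reproduces the strings above. It is not the MediaWiki code: the function names
are hypothetical, and the padding and "U8" + hex encoding are assumptions
inferred only from the example tokens in this comment. The last function shows
one possible reordering (split first, then mark each token as required); it is
an illustration of the idea, not a committed fix.

```python
import re


def encode_char(ch: str) -> str:
    # Assumption: a token is "U8" followed by the hex of the character's
    # UTF-8 bytes, which matches 'U8e980a2' for '逢' in the comment above.
    return "U8" + ch.encode("utf-8").hex()


def split_and_encode(text: str) -> str:
    # Hypothetical stand-in for the per-character splitting step: every
    # non-ASCII character becomes its own space-separated encoded token.
    return re.sub(
        r"[^\x00-\x7f]",
        lambda m: "  " + encode_char(m.group(0)),
        text,
    )


def build_boolean_query(term: str) -> str:
    # Hypothetical stand-in for query construction: mark the term as required.
    return "+" + term


def buggy_query(term: str) -> str:
    # Order described in the comment: build the query first, split afterwards.
    # The '+' ends up separated from the tokens it was meant to apply to.
    return split_and_encode(build_boolean_query(term))


def quoted_workaround(term: str) -> str:
    # Workaround from the comment: quote the string so it becomes a phrase.
    return split_and_encode('+"' + term + '"')


def require_each_character(term: str) -> str:
    # One possible reordering (an assumption, not the actual fix):
    # split first, then attach '+' to every resulting token.
    tokens = split_and_encode(term).split()
    return " ".join("+" + token for token in tokens)


if __name__ == "__main__":
    print(repr(buggy_query("逢甲")))
    print(repr(quoted_workaround("逢甲")))
    print(repr(require_each_character("逢甲")))
```

Note that the reordered variant requires both characters but not that they be
adjacent, so the quoted phrase form is still the stricter match.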


