https://bugzilla.wikimedia.org/show_bug.cgi?id=40133

Nemo <[email protected]> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |[email protected]
            Summary|Use of guillemets ("french  |Guillemets ("french
                   |quotes") prevents           |quotes") are not tokenized
                   |expressions from being      |as word boundaries
                   |found                       |

--- Comment #3 from Nemo <[email protected]> ---
(In reply to comment #2)
> Phrases on wiki pages that are set between guillemets can't be found by the
> internal search, probably because the search index saves those phrases as
> "«Top" and "Dogs»" instead of "Top" and "Dogs".

OK. This depends on the tokenization system being used; it's probably not an
issue with Lucene or Cirrus/ElasticSearch. Which search backend is that wiki
using? And how much control do we have over the tokenization for the standard
MediaWiki search, which IIRC may use MySQL directly or something like that?
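
To illustrate the suspected cause, here is a minimal sketch in Python. It is not
MediaWiki's or MySQL's actual tokenizer; the function names and the sample
sentence are hypothetical. It only shows how a tokenizer that splits on
whitespace alone would index "«top" and "dogs»", while one that treats
punctuation as a word boundary would index "top" and "dogs".

```python
import re


def whitespace_tokens(text):
    # Hypothetical naive tokenizer: splits on whitespace only, so any
    # punctuation glued to a word (including guillemets) stays in the token.
    return text.lower().split()


def word_tokens(text):
    # Hypothetical boundary-aware tokenizer: \w+ matches runs of Unicode
    # word characters; « and » are punctuation, so they act as separators.
    return re.findall(r"\w+", text.lower())


page = "The film «Top Dogs» was released"

print(whitespace_tokens(page))
print(word_tokens(page))
```

With the first tokenizer, a search for "top" or "dogs" has no matching token in
the index for this page; with the second, both words are found.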

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are on the CC list for the bug.
