--- Comment #4 from François Martin <> ---
(In reply to comment #3)
> (In reply to comment #2)
> > (In reply to comment #1)
> > > I know this is right for English, but maybe/probably not other languages.
> > 
> > This is right for French: apostrophes in this language are basically the
> > elision of a vowel and a space.
> The new search has a special filter to handle French's elision.  Here it is: 
> analysis-elision-tokenfilter.html
> .  I'll crack open the code and see what it does when I start work on this
> bug.

This new filter looks great. (The page you linked doesn’t list “d’” as a stop
word, so that will be worth checking when you dig into the code.)
I ran some search tests on frwikisource, and it appears that:

— apostrophes “'” and “’” are indeed interchangeable in the new Elasticsearch:
priority is given to the apostrophe typed in the search box, but matches with
the other one are returned as well (e.g. the search “l'art d'avoir raison
stratagème” first returns a redirect page, but also every occurrence of
“L’Art d’avoir toujours raison”). However, I don’t think this is due to the
elision token filter: the search “Morestal lorsqu'il” returns the same results
as “Morestal lorsqu’il”, even though “lorsqu” is not in this filter;

— despite this filter, apostrophes in French stop words don’t seem to split
words: the search “avoir toujours raison” doesn’t return “L’Art d’avoir
toujours raison”, and the search “art d’avoir toujours raison” does return it,
but “Art” is not bolded in the search result.
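
For illustration, here is a minimal Python sketch of the behaviour one might
expect from apostrophe normalization followed by an elision filter. It is not
the Elasticsearch or Lucene code: the article list and the function names
(normalize_apostrophes, strip_elision, analyze) are assumptions made up for
this example, and the real filter's article list is configurable.

```python
import re
from typing import List

# Hypothetical article list, for illustration only. "lorsqu" is left out on
# purpose, to mirror the case tested above.
ELIDED_ARTICLES = {"l", "d", "m", "t", "qu", "n", "s", "j"}


def normalize_apostrophes(text: str) -> str:
    """Map the typographic apostrophe (U+2019) to the typewriter one."""
    return text.replace("\u2019", "'")


def strip_elision(token: str, articles=ELIDED_ARTICLES) -> str:
    """Drop a leading elided article such as l' or d' from a single token."""
    head, sep, tail = token.partition("'")
    if sep and tail and head.lower() in articles:
        return tail
    return token


def analyze(text: str) -> List[str]:
    """Normalize apostrophes, lowercase, tokenize, then strip elisions."""
    text = normalize_apostrophes(text).lower()
    # Keep apostrophes inside tokens so the elision step can see them.
    tokens = re.findall(r"[\w']+", text)
    return [strip_elision(t) for t in tokens]


if __name__ == "__main__":
    print(analyze("L\u2019Art d\u2019avoir toujours raison"))
    print(analyze("Morestal lorsqu'il"))
```

Under these assumptions, “L’Art d’avoir toujours raison” is indexed as the
tokens art, avoir, toujours, raison, so a search for “avoir toujours raison”
would match it, which is not what I observe. The sketch also shows that the
two apostrophes can be made interchangeable by normalization alone, without
“lorsqu” being in the elision list.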
