How can I tell the Splitter of ZCTextIndex not to split words such as "aaaèbbb" into "aaa" and "bbb"?

I would like to tell Zope that è, à and so on are alphanumeric characters... In Splitter.c I have:

class Splitter:

   import re
   rx = re.compile(r"(?L)\w+")

The (?L) flag makes \w match "as the locale" defines it, but I have multilingual Latin-1 content... \w would then match only [a-zA-Z0-9_]!
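
One way around the locale dependence might be to spell out the word characters explicitly instead of relying on \w. The following is only a sketch: the class name Latin1Splitter is made up, it assumes the text has already been decoded to Unicode strings, and it assumes a pipeline element only needs a process() method that takes a list of strings and returns a list of words.

```python
import re

# Word characters: ASCII letters, digits, underscore, plus the Latin-1
# letters U+00C0-U+00FF, skipping the multiplication sign (U+00D7) and
# the division sign (U+00F7), which are not letters.
LATIN1_WORD = "[A-Za-z0-9_\u00C0-\u00D6\u00D8-\u00F6\u00F8-\u00FF]+"


class Latin1Splitter:
    """Hypothetical splitter that does not depend on the process locale."""

    rx = re.compile(LATIN1_WORD)

    def process(self, lst):
        # Assumed interface: list of strings in, flat list of words out.
        result = []
        for s in lst:
            result += self.rx.findall(s)
        return result
```

With this pattern, Latin1Splitter().process(["aaaèbbb ccc"]) keeps "aaaèbbb" as a single word, whatever locale the process runs under.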


P.S. I've written a small class for the ZCTextIndex pipeline that converts all accented characters into unaccented ones, so that I can index "perchè" as "perche". It will work only if I can solve this splitter problem...
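
For reference, the kind of accent-stripping element described in the P.S. could look roughly like this. It is not the actual class, just a minimal sketch: the name AccentStripper is made up, it assumes Unicode strings, and it assumes the same process() interface (list of words in, list of words out).

```python
import unicodedata


class AccentStripper:
    """Hypothetical pipeline element that replaces accented letters."""

    def process(self, lst):
        # Assumed interface: list of words in, list of words out.
        return [self.strip_accents(word) for word in lst]

    @staticmethod
    def strip_accents(word):
        # NFD splits "è" into "e" plus a combining grave accent;
        # dropping the combining marks leaves the base letters.
        decomposed = unicodedata.normalize("NFD", word)
        return "".join(ch for ch in decomposed
                       if not unicodedata.combining(ch))
```

Letters that have no decomposition, such as "ø" or "ß", pass through unchanged, so they would need an explicit mapping if they matter.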