[Zope] zope, latin-1 and accented words

Yuri Tue, 14 Jun 2005 07:54:36 -0700

How can I tell the Splitter of ZCTextIndex not to split words such as "aaa�bbb" into "aaa" and "bbb"?

I would like to tell Zope that �, � and so on are alphanumeric letters... In Splitter.c I have:


class Splitter:

   import re
   rx = re.compile(r"(?L)\w+")

(?L) makes \w follow the current locale, but I have multilingual latin-1 content... With the default locale, \w would match only ASCII letters, digits and the underscore ([a-zA-Z0-9_])!
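To make the goal concrete, here is a minimal sketch of a locale-independent pattern, written for Python 3 byte strings rather than the Python 2 strings Zope used at the time. The names LATIN1_WORD and split_latin1 and the sample word "aaaèbbb" are hypothetical, chosen only for illustration; this is not the ZCTextIndex code.

```python
import re

# Assumption: the input is ISO-8859-1 (latin-1) encoded bytes.
# The class lists ASCII word characters plus the accented letters in
# 0xC0-0xFF, skipping the multiplication sign (0xD7) and division sign (0xF7).
LATIN1_WORD = re.compile(rb"[A-Za-z0-9_\xc0-\xd6\xd8-\xf6\xf8-\xff]+")


def split_latin1(text):
    """Return the words found in a latin-1 encoded byte string."""
    return LATIN1_WORD.findall(text)


# Hypothetical sample input: the accented letter stays inside the word.
words = split_latin1("aaaèbbb perché".encode("latin-1"))
```

Because the character class is spelled out, the result does not depend on the locale the process happens to run under.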

TIA

P.S. I've written a small class for the ZCTextIndex pipeline that converts all accented characters into unaccented ones, so I can index "perché" as "perche". It will work only if I can solve this splitter problem...
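For illustration, an accent-stripping step of the kind described in the P.S. might look like the sketch below. It assumes Python 3 text strings and a pipeline element that exposes a process() method taking and returning a list of words; the class name AccentRemover is hypothetical and this is not the class mentioned above.

```python
import unicodedata


class AccentRemover:
    """Hypothetical pipeline element that strips accents from each word."""

    def process(self, words):
        # Return a new list with every word reduced to its unaccented form.
        return [self._strip(word) for word in words]

    @staticmethod
    def _strip(word):
        # NFD splits an accented letter into its base letter followed by
        # combining marks; dropping the marks leaves the base letter.
        decomposed = unicodedata.normalize("NFD", word)
        return "".join(c for c in decomposed if not unicodedata.combining(c))


unaccented = AccentRemover().process(["perché", "aaa"])
```

Letters without a canonical decomposition, such as "ø" or "ß", pass through unchanged, so they would need an explicit mapping if they should be folded too.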
