Nick Bower wrote at 2004-10-8 16:41 +0200:
> ...
>Module Products.ZCTextIndex.Lexicon, line 69, in sourceToWordIds
>Module Products.ZCTextIndex.Lexicon, line 135, in _getWordIdCreate
>
>UnicodeDecodeError: 'ascii' codec can't decode byte 0xef in position 5:
>ordinal not in range(128).

In your lexicon operation, a unicode string and a non-unicode string are
combined (this can happen internally during BTree traversal). Whenever that
happens, Python tries to convert the non-unicode string to unicode, using
its default encoding. This fails because the non-unicode string contains
bytes that cannot be converted with this encoding.

In a later message you reported that setting Python's default encoding to
"utf-8" gave you an "unexpected end" exception. This means that your
non-unicode string is not utf-8 encoded.

You should use as the default encoding the encoding that is actually used
for your non-unicode strings. If you do not know it, use an encoding that
can map any 8-bit byte. Windows has a few of them (called "cpXXX", for
code page; I do not know the correct XXX).

--
Dieter
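
The decoding failure described above can be reproduced with a small sketch. It is only an illustration under stated assumptions: the byte string `raw` is made-up sample data (chosen so that byte 0xef sits at position 5, as in the traceback), the helper `try_decode` is hypothetical, and "latin-1" is picked here simply because it maps every 8-bit byte; it is not one of the "cpXXX" codecs mentioned above. The code uses Python 3 syntax, where the decode has to be written explicitly; the Python 2 interpreter in this thread performed the same 'ascii' decode implicitly when the two string types were mixed.

```python
# Hypothetical 8-bit string: byte 0xef at position 5, as in the traceback.
raw = b"abcde\xef"


def try_decode(data, encoding):
    """Return (text, None) on success or (None, error message) on failure."""
    try:
        return data.decode(encoding), None
    except UnicodeDecodeError as exc:
        return None, str(exc)


# Try the default of that era (ascii), the attempted fix (utf-8), and an
# encoding that accepts any byte value (latin-1, an assumption for this sketch).
results = {enc: try_decode(raw, enc) for enc in ("ascii", "utf-8", "latin-1")}

for enc, (text, error) in results.items():
    print(enc, "->", repr(text) if error is None else error)
```

With this sample, 'ascii' fails on the 0xef byte, 'utf-8' fails because 0xef starts a multi-byte sequence that never completes, and 'latin-1' succeeds. Succeeding is not the same as being right: an encoding that accepts every byte avoids the exception, but the resulting characters are only meaningful if it matches how the data was actually written.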
