Nick Bower wrote at 2004-10-8 16:41 +0200:
> ...
>Module Products.ZCTextIndex.Lexicon, line 69, in sourceToWordIds
>Module Products.ZCTextIndex.Lexicon, line 135, in _getWordIdCreate
>
>UnicodeDecodeError: 'ascii' codec can't decode byte 0xef in position 5: 
>ordinal not in range(128).

In your lexicon operation, a unicode and a non-unicode string are
combined (this can happen internally during BTree traversal).

Whenever such a thing happens, Python tries to convert the
non-unicode string to unicode -- using its default encoding.
This fails because the non-unicode string contains bytes that
cannot be decoded with this encoding.
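
A minimal sketch of what the implicit conversion amounts to, written
as an explicit decode so that it also runs on current Python versions
(which no longer mix byte strings and text implicitly). The byte
string "raw" and the helper "decode_with" are made up for
illustration; "raw" merely places byte 0xef at position 5, as in the
traceback, and is not the real lexicon data.

```python
# Hypothetical 8-bit string with byte 0xef at position 5
# (an assumption for illustration, not the data from the traceback).
raw = b"hello\xef"


def decode_with(data, encoding):
    """Return (text, None) on success or (None, error message) on failure."""
    try:
        return data.decode(encoding), None
    except UnicodeDecodeError as exc:
        return None, str(exc)


# Decoding with "ascii" is what the implicit conversion does when the
# default encoding has not been changed.
text, error = decode_with(raw, "ascii")
print(error)
```

The printed message should name the same codec, byte and position as
the quoted traceback.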

In a later message you reported that setting Python's default
encoding to "utf-8" gave you an unexpected end exception.
This means that your non unicode string is not utf-8 encoded.
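
A small sketch of why the "unexpected end" points to non-utf-8 data,
again assuming a made-up byte string rather than your real one: in
utf-8, byte 0xef announces a three byte sequence, so a string that
stops right after it cannot be valid utf-8.

```python
# Hypothetical data: 0xef starts a 3-byte utf-8 sequence that never completes.
raw = b"hello\xef"

try:
    raw.decode("utf-8")
    utf8_reason = None
except UnicodeDecodeError as exc:
    # exc.reason holds the codec's explanation for the failure.
    utf8_reason = exc.reason

# For contrast: the same character properly encoded as utf-8 decodes fine.
valid = "hello\u00ef".encode("utf-8")
decoded = valid.decode("utf-8")

print(utf8_reason)
print(valid)
```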


As the default encoding, you should use the encoding that your
non-unicode strings are actually in.

If you do not know it, use an encoding that can map any 8-bit byte.
Windows has a few of them (called "cpXXX", for "code page";
I do not know the correct XXX).
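
One way to check whether a candidate encoding really maps every
8-bit byte is sketched below. The helper name "maps_every_byte" is
hypothetical, and "latin-1" is my own choice for illustration, not
one of the Windows code pages mentioned above. Note that such an
encoding only avoids the exception; if it is not the encoding the
data was written in, the resulting characters may still be wrong.

```python
def maps_every_byte(encoding):
    """Return True if each of the 256 possible byte values decodes."""
    for value in range(256):
        try:
            bytes([value]).decode(encoding)
        except UnicodeDecodeError:
            return False
    return True


# latin-1 assigns a character to every byte value.
latin1_ok = maps_every_byte("latin-1")

# Python's cp1252 codec leaves a few byte values (such as 0x81) undefined.
cp1252_ok = maps_every_byte("cp1252")

# Hypothetical data from the earlier discussion: decoding cannot fail here.
raw = b"hello\xef"
text = raw.decode("latin-1")

print(latin1_ok, cp1252_ok, repr(text))
```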

-- 
Dieter