[Matt Hamilton]
> ...
> We have been running Zope on OpenBSD/AMD64 3.6 for about a year now
> and it works pretty well.  I have however recently discovered a python bug
> that I am trying to track down.  I am unsure of the exact problem, but it
> affects the re and string libs:
> zeo1# uname -a
> OpenBSD zeo1.netsight.co.uk 3.6 conf#0 amd64
> zeo1# python
> Python 2.3.4 (#1, Nov 16 2004, 08:26:06)
> [GCC 3.3.2 (propolice)] on openbsd3
> Type "help", "copyright", "credits" or "license" for more information.
>  >>> import string
>  >>> string.whitespace
> '\t\n\x0b\x0c\r \x89\x8a\x8b\x8c\x8d\xa0'
> on all other platforms I've tried string.whitespace stops after '\r'...
> the trailing chars cause problems in weird and wonderful places.  I
> upgraded to python 2.3.5 and get the same result.  Not tried on python
> 2.4 yet.

Python version won't matter:  the value of string.whitespace is
entirely determined by your platform C and the locale in effect when
the C-coded portion of Python's string module (Modules/stropmodule.c)
is imported.  As you can see from that module's initstrop() function,
string.whitespace consists exactly of the 8-bit characters (0-255) for
which the platform C's isspace() macro returns a true value.  The
result you've seen on most systems:

    '\t\n\x0b\x0c\r '

is what it must be in the "C" locale (the C standard defines this),
but if you're not in the "C" locale it could be anything.
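
To make the mechanism concrete, here is a rough sketch of that loop.  It
is not the actual stropmodule.c code.  It assumes a modern Python 3 on a
POSIX-ish system and uses ctypes to call the platform isspace()
directly; the helper name platform_whitespace and the ctypes.CDLL(None)
route to the C library are assumptions of this sketch, not anything
taken from the Python source.

```python
import ctypes
import locale


def platform_whitespace(libc=None):
    """Return the bytes 0-255 for which the platform C isspace() is true.

    Hypothetical helper for illustration: it mirrors the idea behind
    initstrop(), not its actual code.
    """
    if libc is None:
        # Assumption: on POSIX systems CDLL(None) exposes the C library's
        # symbols, including isspace().
        libc = ctypes.CDLL(None)
    return bytes(c for c in range(256) if libc.isspace(c))


if __name__ == "__main__":
    # With no second argument, setlocale() only queries the current setting.
    print("LC_CTYPE in effect:", locale.setlocale(locale.LC_CTYPE))
    print(repr(platform_whitespace()))
```

The result depends on whatever LC_CTYPE locale is in effect at the moment
the loop runs, which is exactly why the same Python can report different
values on different machines.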

It's not unusual (well, not for non-Americans <wink>) to see \xa0 in
that list, because \xa0 is Latin-1's non-breaking space character
("&nbsp;" in HTML).  It's surprising to me to see \x89-\x8d there,
though.  It could be your system is set to use "an unusual" locale, or
it could be a bug in the platform C libraries.  Try writing a little C
program to see what isspace() returns.
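
Before reaching for a C compiler, it may help to pin down which bytes are
the surprising ones and which locale the process believes it is in.  A
small sketch, assuming a modern Python 3; the helper names
extra_whitespace and locale_report are made up for illustration.

```python
import locale
import os

# What isspace() must accept in the "C" locale, per the C standard.
C_LOCALE_WHITESPACE = b"\t\n\x0b\x0c\r "


def extra_whitespace(observed):
    """Return the bytes in `observed` that the "C" locale would not treat as space."""
    return bytes(b for b in observed if b not in C_LOCALE_WHITESPACE)


def locale_report():
    """Collect the locale-related settings that influence isspace()."""
    report = {name: os.environ.get(name) for name in ("LC_ALL", "LC_CTYPE", "LANG")}
    # With no second argument, setlocale() only queries the current setting.
    report["LC_CTYPE in effect"] = locale.setlocale(locale.LC_CTYPE)
    return report


if __name__ == "__main__":
    # The value reported in the original message.
    observed = b"\t\n\x0b\x0c\r \x89\x8a\x8b\x8c\x8d\xa0"
    print("extras:", [hex(b) for b in extra_whitespace(observed)])
    for name, value in locale_report().items():
        print(name, "=", value)
```

When reading the report, keep in mind that LC_ALL overrides LC_CTYPE,
which in turn overrides LANG.  If all three are unset and the extra bytes
still show up, the platform C library becomes the more likely suspect.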