thanks Tim, for the partial enlightenment. :) Unfortunately my
BTree/C-wisdom is much much smaller than yours, so I got to check a
couple of assumptions over here. :)
Yup, it really helps if you have a 64-bit box to check them on.
Hah. Looks like BTrees can accept 2**31 on machines where maxint is a
Note that "accept" in this case means "silently loses the
most-significant bits", and that's a BTree bug on platforms where
sizeof(long) > sizeof(int):
I was able to reproduce this on an Intel 64-Bit machine (EM64T) running
Linux and gcc.
And I expect the same will happen on any 64-bit platform other than
Win64 (which is unique, AFAICT, in leaving sizeof(long) == 4 instead
of boosting it to 8).
For one: I didn't see any compiler warnings. That sounds bad, right?
The underlying BTree bug is assignments of the form (usually "hidden"
in macro expansions):
some_C_int = some_C_long;
When sizeof(int) < sizeof(long), that can silently lose information.
I was really hoping that major compilers on boxes where sizeof(int) <
sizeof(long) would warn about that. Oh well.
The problem arises because what _Python_ calls "int" is what C calls
"long", but the I-flavor BTree code stores C "int", and C "int"
doesn't correspond to any Python type (except "by accident" on 32-bit
boxes). C "int" is 4 bytes on all known current 32- and 64-bit
platforms, but the size of what Python calls "int" varies. The BTree
code isn't aware of the possible mismatch, storing Python "int" (C
"long") into C "int" without any checking.
2**31 doesn't actually lose any bits when you store it, but the BTree
code will probably misinterpret the high-order data bit as the sign bit
when you fetch it again, magically changing it into -2**31.
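
If you don't have a 64-bit box handy, the effect can be imitated from
Python. This is only a sketch: store_as_c_int is a made-up helper name,
and it assumes a 4-byte C int, which ctypes.c_int32 stands in for here.
It is not the BTree code itself.

```python
import ctypes


def store_as_c_int(value):
    """Imitate `some_C_int = some_C_long` for a 4-byte C int (assumption)."""
    # ctypes integer types do no overflow checking: only the low 32 bits
    # survive, and they are reinterpreted as a signed value.
    return ctypes.c_int32(value).value


print(store_as_c_int(2**31 - 1))  # 2147483647, fits, comes back unchanged
print(store_as_c_int(2**31))      # -2147483648, high data bit read as sign bit
print(store_as_c_int(2**32))      # 0, everything above bit 31 is gone
```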
I can store 2**31 in the BTree, but the keys() method will tell you that
it actually stored -2**31.
Right. If it's not clear, this is because the _bit_ pattern
is 2**31 when viewed as an 8-byte C long, but is -2**31 when viewed as
a 4-byte C int. If you had, e.g., stored 2**32 instead, you would
have gotten 0 back when you fetched it (the top 32 bits are simply
Did you see my simpler suggestion for fixing the underlying bug (it
was a one-liner change to the original code)? When you get tired of
fighting the 64-bit BTree bug here (it will be a minor miracle if the
test actually passes now on a 64-bit box, despite all you've tried
...), look that up ;-)
Nope, I didn't find your one-liner. :) Can you post it explicitly for my
Here; it was a reply to one of the checkin messages:
The key problem in the original intid code is that it used randint()
instead of randrange(), theoretically allowing 2**31 to be a return
value. It's _almost_ enough just to use randrange() instead.
Unfortunately, that's not quite enough; see the msg for what is.
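
For anyone who hasn't been bitten by it: randint() includes its upper
bound, randrange() excludes it. A small sketch with a tiny stand-in
bound (the value 3 and the seed are just for illustration; this is not
the intid code):

```python
import random

rng = random.Random(0)  # fixed seed so the run is repeatable
upper = 3               # small stand-in for a bound like 2**31

# randint(a, b) can return b itself; randrange(a, b) never does.
randint_values = {rng.randint(0, upper) for _ in range(1000)}
randrange_values = {rng.randrange(0, upper) for _ in range(1000)}

print(sorted(randint_values))
print(sorted(randrange_values))
```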
It _could_ be fixed by adding 2 characters to the original, changing
2**31 to 2**31-2 (or to 0x7ffffffe), but that would leave it pretty
I think I could have made up something that made it work, but I started
looking into making the BTree behave sanely.
My idea was to modify the BTree code so that, after the type cast, it
actually checks whether the cast value is equal to the requested key.
Alternatively, I could try making the CHECK_KEY macro do an "exact type
match" instead of allowing subclasses. But that wouldn't work either, as
2**31 is still an int on those platforms. I'm a bit puzzled now. :)
There's potential silent information loss for both Ix-flavor BTree
keys and xI-flavor BTree values. There are two robust ways to check
("robust" means that Python's C code has used these ways at various
times for many years now without problems). Complain if:
some_C_long < INT_MIN || some_C_long > INT_MAX
or complain if (this sounds close to what you had in mind above, and
is my favorite):
(long)(int)some_C_long != some_C_long
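
Both checks can be imitated in Python to see that they agree. This is a
sketch under the assumption of a 4-byte C int; INT_MIN and INT_MAX are
hard-coded for that size, and the two function names are made up for
illustration:

```python
import ctypes

# Limits of a 4-byte C int (assumption; the real values come from <limits.h>).
INT_MIN, INT_MAX = -2**31, 2**31 - 1


def fails_range_check(value):
    """Imitates: some_C_long < INT_MIN || some_C_long > INT_MAX"""
    return value < INT_MIN or value > INT_MAX


def fails_round_trip(value):
    """Imitates: (long)(int)some_C_long != some_C_long"""
    # c_int32 keeps only the low 32 bits, like the C cast to int does here.
    return ctypes.c_int32(value).value != value


for v in (0, -1, INT_MAX, INT_MIN, 2**31, 2**32, INT_MIN - 1, 2**63 - 1):
    print(v, fails_range_check(v), fails_round_trip(v))
```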
Because the problem is due to bogus assumptions about the relationship
between C types, it's not going to help to examine Python's idea of
Checking isn't needed (can't fail) if SIZEOF_INT == SIZEOF_LONG
(Python.h supplies definitions for those macros), so there's some
worth to skipping the checks in that case. Unfortunately, C
doesn't allow "#if" preprocessor directives _inside_ macro definitions,
so the best way to do that isn't immediately clear.
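
To see which camp a given box is in without compiling anything, the
sizes can be read from Python via ctypes. A sketch only: the names below
are Python variables that mirror the C macros, not the macros
themselves.

```python
import ctypes

# Sizes of the platform's C int and C long, in bytes.
SIZEOF_INT = ctypes.sizeof(ctypes.c_int)
SIZEOF_LONG = ctypes.sizeof(ctypes.c_long)

# The int/long mismatch can only bite where long is wider than int.
check_needed = SIZEOF_INT < SIZEOF_LONG

print("sizeof(int) =", SIZEOF_INT)
print("sizeof(long) =", SIZEOF_LONG)
print("overflow check needed:", check_needed)
```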
In short, irritating little issues abound :-(. That's why I couldn't
make time to fix it (relatively high cost with no benefit on most Zope
Note that if ZODB moves to 64-bit Ix/xI BTrees on all boxes (IIRC, Jim
and Fred were agitating in that direction, but suffered massive
short-sighted ;-) opposition), the BTree problem would go away by
magic (C "int" would no longer be the type of Ix keys or xI values).
Zope3-dev mailing list