On Sat, Aug 28, 2010 at 12:17 PM, Hanno Schlichting <ha...@hannosch.eu> wrote:
> Hi.
>
> I've recently stumbled on some at least to me unexpected behavior with
> zope.keyreference.

Specifically, zope.keyreference.persistent, I assume.

> For a persistent object it generates a unique key
> using:
>
> hash((database_name, oid))

No, it generates a hash this way.

> where hash is Python's built-in hash function.
>
> Reading the documentation I assumed that a keyreference for the same
> object (as identified by database name and oid) should be stable and
> always produce the same result. This isn't always true, when you look
> up persisted keyreference data, upgrade your software versions and
> compare it to a new calculation.
>
> Python's hash function is only stable inside the same Python version
> and 32/64 bit combination. The same input in a 32bit Python 2.6 and
> 64bit Python 2.6 produces different results, as both try to use the
> maximum available integer space and thus a 64bit Python generates keys
> above the 32int range. As a simple example "hash(('main', 1)) > 2**32"
> is True in a 64bit Python and False in a 32bit Python.
>
> The internal hash implementation seems to have been pretty stable in
> all the latest Python versions up to 3.1. So the algorithm produces
> the same results for all 32bit version of Python 2.x to 3.1 and 64bit
> respectively. But as far as I understand this isn't guaranteed to be
> the case for future versions.
>
> Does anyone else see a problem with this? Should keyreference use a
> different hash algorithm?

Potentially, yes. In current practice, I don't think so.

When a key reference is used as a BTree key, its comparison function,
rather than its hash, is used. If a key reference hash were used as a
persistent key, then this would definitely be a problem.

Note that in a dictionary or PersistentMapping, the hash isn't saved
persistently. The object is saved as a collection of items and the
hashes are recomputed on unpickling.

I'm in favor of someone coming up with a stable hash to avoid future
pitfalls.

It's sad that Python's hash isn't stable across Python versions and
architectures. Is this documented?
If so, it's a misfeature. If not, perhaps it should be reported as a bug.

Jim

--
Jim Fulton
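
For illustration only, here is a minimal sketch of what a "stable hash" for a (database name, oid) pair might look like. It is not the zope.keyreference implementation: the function name stable_key_hash is hypothetical, the choice of SHA-256 is an assumption, and the oid is assumed to be a byte string. The point is just that a digest from hashlib does not depend on the Python version or on a 32/64 bit build, unlike the built-in hash().

```python
import hashlib
import struct


def stable_key_hash(database_name, oid):
    """Return a non-negative integer below 2**31 for (database_name, oid).

    Hypothetical helper, not part of zope.keyreference. The result depends
    only on the input bytes, not on the interpreter version or word size.
    """
    if isinstance(database_name, str):
        database_name = database_name.encode("utf-8")
    # Length-prefix the name so (name, oid) pairs cannot run together
    # ambiguously when the two byte strings are concatenated.
    payload = struct.pack(">I", len(database_name)) + database_name + oid
    digest = hashlib.sha256(payload).digest()
    # Take the first four bytes as a big-endian unsigned integer and mask
    # to 31 bits so the value also fits a signed 32-bit int.
    return struct.unpack(">I", digest[:4])[0] & 0x7FFFFFFF


if __name__ == "__main__":
    example_oid = b"\x00" * 7 + b"\x01"  # assumed 8-byte oid
    print(stable_key_hash("main", example_oid))
```

Masking to 31 bits keeps the value usable where a signed 32-bit integer is expected, at the cost of more collisions than a full-width digest. As with any hash, equality still has to be decided by comparing the database name and oid themselves.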