Tell me if you are tired of me. I do not usually write such long emails
to mailing lists...
Answer for Dieter's email:
Several days ago somebody already suggested using IndexedCatalog, but
it does not seem to be very active. The stand-alone ZCatalog definitely
My reason for choosing ZCatalog was: it is code extracted from Zope;
Zope will always be maintained (won't it?); I just have to figure out
how to 'hack out' the stand-alone ZCatalog based on Kevin Dangoor's 'hack'.
Kevin already mentioned that Zope 3 would be better for this purpose.
At the moment I think the best long-term approach for me is to handle the
querying through a wrapper class of my own: a layer between my
application(s) and the indexing application, similar to the DB-API for
RDBMSs, so that I can change the indexing/searching backend (as one
changes the db backend there) without (big) changes in the application code.
Do similar standards exist for ODBMSs? Something like DB-API...
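
The wrapper layer described above might look roughly like this. It is only a
minimal sketch: SearchBackend, DictBackend and Searcher are hypothetical
names invented for illustration, and none of them come from ZODB, ZCatalog
or IndexedCatalog. The toy backend scans a plain dict; a real backend would
implement the same two methods on top of an actual index.

```python
from abc import ABC, abstractmethod


class SearchBackend(ABC):
    """Hypothetical interface that every indexing/searching backend provides."""

    @abstractmethod
    def index(self, key, obj):
        """Record obj under key so that later queries can find it."""

    @abstractmethod
    def query(self, field, value):
        """Return the keys of all objects whose `field` equals `value`."""


class DictBackend(SearchBackend):
    """Toy in-memory backend (assumption: stands in for a real catalog)."""

    def __init__(self):
        self._objects = {}

    def index(self, key, obj):
        self._objects[key] = obj

    def query(self, field, value):
        # Linear scan; a real backend would consult an index instead.
        return sorted(k for k, o in self._objects.items()
                      if getattr(o, field, None) == value)


class Searcher:
    """The only class the application talks to."""

    def __init__(self, backend):
        self._backend = backend

    def add(self, key, obj):
        self._backend.index(key, obj)

    def find(self, field, value):
        return self._backend.query(field, value)
```

With this shape, swapping the backend means passing a different
SearchBackend to Searcher; the application code that calls add() and find()
stays the same.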
Not a real programmer: I did not learn it in school; I read, and try
to use developer lists (and brains) ;-)
Please do not forget that I am not a real programmer but a consumer:
That's OK -- in return, please don't forget that you're posting to a ZODB
developer's list ;-).
I just want to state: it is very difficult for me to formulate my
questions in a way that you can understand them easily, although I think I
put more than enough time and energy into forming relevant questions with
relevant terminology to avoid wasting other people's time.
That's the only solid reason Zope Corp has to _pay_ for ZODB development.
Zope pays the bills here, and ZODB is supporting infrastructure for Zope.
Then OK. But if there is bigger potential inside it...
Then what, specifically? Nobody works on something unless they want to,
and/or are paid to. It's not a matter of cheerleading, it's a matter of
someone doing the work.
Yes, I understand that you have to feed your family and educate your
child. So if it is not paid...
I am surprised that you ask 'then what, specifically'. My opinion (as an
One of the major and important trends in application development: object
At the moment most developers use (I may be wrong) some
solution with an RDBMS backend (SQLObject/Python; Hibernate/Java). But I
think these solutions are not as transparent as ODBMSs like ZODB or db4o
(I tried these).
I think other programmers mostly use a solution with an RDBMS backend
because they do not want to handle/code the searching/indexing (I know that
ODBMS performance is said to be not as good as RDBMS performance, but I
think if you count the OR mapping overhead then some ODBMSs are not so bad.
E.g. in one heavily loaded biological application, Versant's ODBMS was
reported to perform better than one of the leading RDBMSs ***).
"Specifically" I think if you implemented a stand-alone search/index
facility for ZODB (if you used some similar Zope-hacking
approach it would NOT be a huge, completely stand-alone project needing
a lot of effort), ZODB could be an ODBMS leader, a competitor of other
ODBMS solutions. So you might have more paying customers, too...
(***My biggest problem is: if I want other biologists to try my
solutions/applications, I do not have any chance of this with an RDBMS
backend. They just will not learn how to set up an RDBMS, and will not set
one up, to try something out. For this reason an ODBMS system would be OK
(I think it would also be worth implementing some standards for ODBMSs.
Some ODB-API. Just to be able to change ODBMS backends and/or
indexing/searching backends. (I also know that there is no other
ODBMS backend to choose from and you do not want to switch users from
ZODB to another backend in the future :-) , but I think you would have the
potential to do something like this. Just for fun. Just to initiate some
I am not a 'real' developer, but it seems others also would not go with
it, even if it is perfect. If you (ZODB developers) change
something in the next release of ZODB then the perfect IndexedCatalog
may not be able to communicate with the new version of the ZODB backend...
A notable exception is IndexedCatalog:
which is independent of Zope. You said before that you thought that wasn't
active, and it indeed doesn't look like it's had a release recently. That
could be because it's already perfect ;-) -- or it could be that there's not
(By the way: I think it is great and big, and I would like to use it.)
To put this in a more realistic way: it seems to me that there
is no potential to take care of this extra project outside of Zope
AND/OR it would not be good for Zope developers to have it as an
easy-to-use stand-alone module (maybe some business policy?).
Not sure I followed that.
Sorry. Please note that I simplify in the next 2 points:
1. I mean 'potential': there are not 100 ZODB developers, just a few.
2. I mean 'business policy': If you made a good stand-alone
indexing/searching facility then everybody would use ZODB with
CherryPy ;-) so fewer people would pay for Zope training/classes or
something like that...
1. """That's usually viewed as an application-level problem, and it's up
to applications to solve it in ways best suited for their particular
needs.""" If I translate this for myself, and if I understand it well: I am
very happy that RDBMSs do not say this, and that I can search them not only
by primary keys; I am happy that I do not have to implement something
similar to SQL because it is not considered an "application level problem".
A relational database forces you to slam all your data into uniform tables,
regardless of whether that's a natural fit. When all you have is uniform
tables, then it's relatively easy to define uniform operators for crawling
over those tables -- that's what SQL is all about.
An object database is more of a general graph structure, and an
application's idea of "search" can be correspondingly semantically richer
from the start -- or even irrelevant, if the object graph is constructed
from the start to make traversals of potential interest follow the natural
graph pointers. What's the analogue to SQL in this quite different view of
the world? Well, there isn't a standard accepted vision for that. That's
what makes it the app's problem. These are tradeoffs. Zope's assorted
indices and catalogs _probably_ capture some notion of "search" close to
what you're after.
Hmm. I may not have formulated this well.
In this context SQL means (was meant to mean) just a standard for me.
There are OQLs or something like that, but there are also
'native queries', or simple queries based on the object type. This may
not work well (out of the box) with Python as the objects are not declared
but created (or something like that). I think this is the main
limiting factor in creating some standard searching, indexing, if you
So there are some standard ways/ideas of "search" in an object database.
Or, highly oversimplified: you want to retrieve all objects that
have the field/member 'name' where this field has the value 'Tamas'.
I think it is a low-level (database developers') problem ;-) how this
searching is implemented. An application developer should only have to
worry about choosing the best searching/indexing method/package suitable for
I used 'primary keys' because I thought that if I used just 'simple keys',
you might think I did not know that a BTree is not a simple dictionary :-)
2. BTrees: I could not find any 'built-in' possibility in the docs, just
the 'primary keys'. If I check OOBTree, etc., it just gives
'difference', 'intersection', 'union'. I do not see how to do full text
search or field search on BTrees. Am I missing something???
BTrees map keys to values. The keys are always maintained in sorted order,
and it's both dead easy and efficient to do range searches over a BTree's
key space. That's what's built in.
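
The range search described above can be tried without ZODB installed. The
sketch below is a standard-library stand-in, not the BTrees implementation:
SortedMapping is a hypothetical class that keeps its keys sorted with the
bisect module and offers a keys(min, max) method modelled on the one BTrees
provide. The bounds are treated as inclusive here.

```python
import bisect


class SortedMapping:
    """Stdlib stand-in for a BTree: keys are always kept in sorted order."""

    def __init__(self):
        self._keys = []      # sorted list of keys
        self._values = {}    # key -> value

    def __setitem__(self, key, value):
        if key not in self._values:
            bisect.insort(self._keys, key)
        self._values[key] = value

    def __getitem__(self, key):
        return self._values[key]

    def keys(self, min=None, max=None):
        """Return the keys k with min <= k <= max, in sorted order."""
        lo = 0 if min is None else bisect.bisect_left(self._keys, min)
        hi = (len(self._keys) if max is None
              else bisect.bisect_right(self._keys, max))
        return self._keys[lo:hi]


tree = SortedMapping()
for name in ["Tamas", "Tim", "Dieter", "Kevin"]:
    tree[name] = len(name)

# All keys between "K" and "Tz": two binary searches, no full scan.
in_range = tree.keys("K", "Tz")
```

Because the keys stay sorted, both ends of the range are found by binary
search, which is why a range query is cheap compared with scanning every
key.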
For me _developers_ means trained persons writing serious
programs/packages using expressions I do not understand; solving
computational problems is _always_ trivial for them; they can always
choose the best tools for their problems and mine. ;-)
3. I can not build up another database from the ZODB as I am not a
Do you use Python? I'm at a bit of a loss to figure out how you wound up
posting here if you're _not_ a Python programmer. It could be that ZODB is
much more general than I thought ;-), but I didn't think non-programmers
would have any use for it.
I am just playing with Python and programming. I am working with DNA,
proteins, cell lines, etc. But I cannot waste my spare time (should I
write shorter emails?), so I have to find the best tools.
I write to the developer list because I could not get help from the other
Zope-related lists. I tried several months ago. The result: I moved to Java;
tried db4o; failed (I could not populate it with my objects (I have too
many and too complex objects; I am looking forward to seeing how ZODB
performs :-) )); a trial of Perl; I was scared, so back to Python ;-)
But I think you did not formulate this in the best way: I think you do not
build the SB database OUT of ZODB's BTrees, I think you just build
up indexes from the BTrees and you implement searches on your indexes
that point back to the BTrees.
I suppose you could think of it that way, but I designed SpamBayes and
that's not how I thought of it. I thought of it in terms of abstract
mappings, then designed the main algorithms to work directly with BTrees.
ZODB supplies persistent BTrees, and that's all SB needs.
=> If you just build indexes from the BTrees, the following protocol
works for me and you can suggest?
Not sure I'm following. I can suggest what?
I think my 'following protocol' is what you call 'abstract mapping'.
Suggest: how to implement indexes/searches on BTrees in ODBMS/ZODB.
I use <"primary key", "object"> for the <key, value> of BTrees. I think you
always have a Python "object" as the "value".
Walk through the BTree taking each object: for each key of the BTree,
instantiate the value/object.
1. walk through your BTree taking each object
A BTree is a collection of <key, value> pairs, and unsure what "object"
2. with an external indexing application build the index (on one or more
fields, or full text)
(indexed_something1: key1, key65, ... key_i),
(indexed_something2: key4, key6, key45, ... key_j)
3. search in your index that returns with the 'primary key' of objects
in the ZODB
searching for something, e.g. indexed_something2:
a, Is indexed_something2 among the keys of INDEX?
b, -- NO: no results
-- YES: return INDEX[indexed_something2] as list of keys to the
objects in the BTree
4. get the objects from the ZODB via the 'primary keys' from the prev
OK, now I'm sure not following. You appear to be assuming much more
structure than a plain BTree supports on its own, and in fact BTrees don't
really _appear_ to have anything to do with what you're saying. If you
think _your_ objects have such things as "fields" and "primary keys", then
that's part of your objects' design and your objects' implementations --
objects don't come with such notions built in. It sounds like you have RDMS
tables in mind, and are forcing object language on top of them.
Of course I have some 'tables' in my mind :-) I grew up on tables...
But I think you can admit that there are some analogies between the
BTree keys and 'primary keys'; that an object is similar to a record, and
the fields/members of the objects are like 'columns'.
I just would like to understand the basics of an index/search
implementation on objects (on BTrees).
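
The four steps of the protocol above can be sketched in a few lines. This is
only an illustration under stated assumptions: a plain dict stands in for
the BTree of <key, object> pairs, the index is a plain dict as well, and
Person, build_index and search are hypothetical names, not ZODB or ZCatalog
functions.

```python
from collections import defaultdict


class Person:
    """Hypothetical stored object with one searchable field."""

    def __init__(self, name):
        self.name = name


def build_index(mapping, field):
    """Steps 1-2: walk every <key, object> pair and group keys by field value."""
    index = defaultdict(list)
    for key, obj in mapping.items():
        if hasattr(obj, field):          # objects without the field are skipped
            index[getattr(obj, field)].append(key)
    return dict(index)


def search(index, mapping, value):
    """Steps 3-4: look the value up in the index, then fetch the objects."""
    return [mapping[key] for key in index.get(value, [])]


# A dict standing in for the BTree of 'primary key' -> object.
people = {1: Person("Tamas"), 2: Person("Tim"), 3: Person("Tamas")}
name_index = build_index(people, "name")
hits = search(name_index, people, "Tamas")
```

The index only stores keys, so it has to be rebuilt or updated whenever an
indexed field changes; keeping the two in step is the part a catalog package
would normally take care of.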
No! I would not be happier with an RDBMS. I have been using ZODB for only
1 or 2 weeks and my life is happier :-)
If so, that's fine -- it's legitimate to do so. It sounds like you'd be
happier then with an RDMS, though (under the inference that you _think_ in
REAL questions with less 'philosophy':
1. If I want to implement an index system for ZODB, 'walking through'
the keys of the BTrees, instantiating the objects and building the index
Or is there some low-level code 'magic' to use? I mean a special
"_function" from ZODB, learning the deep internals of the BTrees, etc...
2. Do you see a possible way to implement indexes on the raw file (.fs)
without object instantiation?
(This mailing list may not be the perfect place to ask, but I think you are
among the best for Python object questions :-) )
3. Python objects are not declared but created. If I have an object,
anybody can just add extra members/fields/variables to my object or
delete a member/field that I defined, e.g. in the __init__().
Do you know of an implemented locking mechanism that prevents these
things? Let's say: the variables/fields/members of the objects are created
in the __init__() but you cannot add more or delete any of them after
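
One way such a lock could be written in plain Python is sketched below.
Frozen and Protein are hypothetical classes made up for this example, and I
have not tested how this interacts with persistent classes in ZODB.
(Python's __slots__ also stops new attributes from being added, but it does
not stop existing ones from being deleted.)

```python
class Frozen:
    """Hypothetical base class: attributes can be created in __init__ only."""

    _locked = False   # class-level default, overridden per instance by _lock()

    def _lock(self):
        # Bypass our own __setattr__ so the lock flag itself can be stored.
        object.__setattr__(self, "_locked", True)

    def __setattr__(self, name, value):
        if self._locked and not hasattr(self, name):
            raise AttributeError("cannot add attribute %r" % name)
        object.__setattr__(self, name, value)

    def __delattr__(self, name):
        if self._locked:
            raise AttributeError("cannot delete attribute %r" % name)
        object.__delattr__(self, name)


class Protein(Frozen):
    def __init__(self, name, length):
        self.name = name
        self.length = length
        self._lock()      # from here on the set of attributes is fixed


p = Protein("CFTR", 1480)
p.length = 1481           # changing an existing attribute is still allowed
```

After _lock() has run, assigning to a new name such as p.organism or running
del p.name raises AttributeError, while existing attributes can still be
given new values.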
Thanks for your patience!
Tamas Hegedus, PhD | phone: (1) 919-966 0329
UNC - Biochem & Biophys | fax: (1) 919-966 5178
5007A Thurston-Bowles Bldg | mailto:[EMAIL PROTECTED]
Chapel Hill, NC, 27599-7248 | http://biohegedus.org
For more information about ZODB, see the ZODB Wiki:
ZODB-Dev mailing list - ZODB-Dev@zope.org