Tell me if you are getting tired of me. I do not usually write such long emails to mailing lists...

Answer for Dieter's email:
Several days ago somebody already suggested using IndexedCatalog, but it does not seem to be very active. The stand-alone ZCatalog is definitely not active. My reasoning for choosing ZCatalog was: it is code extracted from Zope; Zope will always be maintained (won't it?); I just have to figure out how to 'hack out' the stand-alone ZCatalog based on Kevin Dangoor's 'hack'.
Kevin already mentioned that Zope 3 would be better for this purpose.

At the moment I think the best way for me in the long term is to handle the querying through a wrapper class of my own: put a layer between my application(s) and the indexing application (something similar to the RDBMS DB-API, so that changing the db backend, in my case the indexing/searching backend, needs no (big) changes in the application code).
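A minimal sketch of the kind of wrapper layer I mean. All the names here (SearchBackend, DictBackend, index, search, Protein) are made up for illustration and are not part of ZODB, ZCatalog or IndexedCatalog; a real adapter for one of those would implement the same two methods.

```python
from abc import ABC, abstractmethod


class SearchBackend(ABC):
    """Hypothetical interface the application codes against."""

    @abstractmethod
    def index(self, key, obj, fields):
        """Record the values of `fields` found on `obj` under `key`."""

    @abstractmethod
    def search(self, field, value):
        """Return the keys of all objects whose `field` equals `value`."""


class DictBackend(SearchBackend):
    """Toy in-memory backend; stands in for a real indexing package."""

    def __init__(self):
        self._index = {}  # (field, value) -> set of keys

    def index(self, key, obj, fields):
        for field in fields:
            if hasattr(obj, field):
                entry = (field, getattr(obj, field))
                self._index.setdefault(entry, set()).add(key)

    def search(self, field, value):
        return sorted(self._index.get((field, value), ()))


class Protein:
    """Made-up application object."""

    def __init__(self, name, organism):
        self.name = name
        self.organism = organism


# The application only ever talks to the SearchBackend interface.
backend = DictBackend()
backend.index("p1", Protein("CFTR", "human"), ["name", "organism"])
backend.index("p2", Protein("MRP1", "human"), ["name", "organism"])
hits = backend.search("organism", "human")
```

Swapping the indexing package would then mean writing another SearchBackend subclass, with the application code left alone.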

Does a similar standard exist for ODBMSs? Something like DB-API...

Please do not forget that I am not a real programmer but a consumer:
That's OK -- in return, please don't forget that you're posting to a ZODB
developer's list ;-).
Not a real programmer: I did not learn it in school; I read and try to use developer lists (and brains) ;-)

I just want to say: it is very difficult for me to formulate my questions in a way that you can easily understand, although I think I put more than enough time and energy into forming relevant questions with relevant terminology, to avoid wasting other people's time.

That's the only solid reason Zope Corp has to _pay_ for ZODB development.
Zope pays the bills here, and ZODB is supporting infrastructure for Zope.

Then OK. But if there is bigger potential inside it...
Then what, specifically?  Nobody works on something unless they want to,
and/or are paid to.  It's not a matter of cheerleading, it's a matter of
someone doing the work.
Yes, I understand that you have to feed your family and educate your child. So if it is not paid...

I am surprised that you ask 'then what, specifically'. My opinion (as an outsider): one of the major and important trends in application development is object persistence.

At the moment most developers (I may be wrong) use some solution with an RDBMS backend (SQLObject for Python; Hibernate for Java). But I think these solutions are not as transparent as ODBMSs like ZODB or db4o (I tried these).

I think other programmers mostly use a solution with an RDBMS backend because they do not want to handle/code the searching/indexing themselves. (I know that ODBMS performance is said to be not as good as that of RDBMSs, but I think that if you take the O/R mapping overhead into account, some ODBMSs are not so bad. E.g. in some heavily loaded biological application, Versant's ODBMS was reported to be better than one of the leading RDBMSs ***.)

"Specifically": I think that if you implemented a stand-alone search/index facility for ZODB (with a similar Zope-hacking approach it would NOT be a huge, completely separate project needing a lot of effort), ZODB could be an ODBMS leader, a competitor to the other ODBMS solutions. So you might have more paying customers, too...

---- (footnotes)
(***My biggest problem is: if I want other biologists to try my solutions/applications, I have no chance of that with an RDBMS backend. They simply will not learn how to set up an RDBMS, and will not set one up, just to try something out. For this reason an ODBMS would be OK for me.)

(I think it would also be worthwhile to implement some standard for ODBMSs, some ODB-API, just to be able to change ODBMS backends and/or indexing/searching backends. I know that there is no other ODBMS backend to choose from, and that you do not want users to switch from ZODB to another backend in the future :-) , but I think you would have the potential to do something like this. Just for fun. Just to initiate some ODBMS standardization....)

A notable exception is IndexedCatalog:


which is independent of Zope.  You said before that you thought that wasn't
active, and it indeed doesn't look like it's had a release recently.  That
could be because it's already perfect ;-) -- or it could be that there's not
I am not a 'real' developer, but it seems others would not go with it either, even if it is perfect. If you (ZODB developers) change something in the next release of ZODB, then the perfect IndexedCatalog may no longer be able to communicate with the new version of the ZODB backend...

(By the way: I think it is great and big, and I would like to use it.)
To formulate this on a more realistic way: it seems for me that there
is no potential to take care about this extra project outside of Zope
AND/OR it would not be good for Zope developers to have it as an
easy-to-use stand alone module (maybe some business policy?).
Not sure I followed that.
Sorry. Please note that I simplify in the next 2 points:
1. By 'potential' I mean: there are not 100 ZODB developers, just a few.
2. By 'business policy' I mean: if you made a good stand-alone indexing/searching facility, then everybody would use ZODB with CherryPy ;-) so fewer people would pay for Zope training/classes or something like that...

1. """That's usually viewed as an application-level problem, and it's up
to applications to solve it in ways best suited for their particular
needs.""" If I translate this for myself, if I understand well: I am very
happy that RDBMSs does not say this, and I can search them not only by
primary keys; I am happy that I do not have to implement something
similar to SQL as it is not considered as "application level problem".
A relational database forces you to slam all your data into uniform tables,
regardless of whether that's a natural fit.  When all you have is uniform
tables, then it's relatively easy to define uniform operators for crawling
over those tables -- that's what SQL is all about.

An object database is more of a general graph structure, and an
application's idea of "search" can be correspondingly semantically richer
from the start -- or even irrelevant, if the object graph is constructed
from the start to make traversals of potential interest follow the natural
graph pointers.  What's the analogue to SQL in this quite different view of
the world?  Well, there isn't a standard accepted vision for that.  That's
what makes it the app's problem.  These are tradeoffs.  Zope's assorted
indices and catalogs _probably_ capture some notion of "search" close to
what you're after.
Hmm. I may not have formulated this well.
In this context SQL means (was meant to mean) just a standard for me. There are OQLs or something like that, but there are also 'native queries', or simple queries based on the object type. This may not work well (out of the box) with Python, as objects are not declared but created (or something like that). I think this is the main limiting factor in creating some standard searching/indexing if you use Python.

So there are some standard ways/ideas of "search" in an object database. Or, with heavy oversimplification: you want to retrieve all objects that have the field/member 'name' where this field has the value 'Tamas'.

I think how this searching is implemented is a low-level (database developers') problem ;-) An application developer should only have to worry about choosing the searching/indexing method/package best suited to his/her(?) application.

2. BTrees: I could not find any 'built-in' possibility in the docs, just
the 'primary keys'. If I check the OOBTree, etc, it just give
'difference', 'intersection', 'union'. I do not see to do full text
search or field search on BTrees. Do I miss something???
BTrees map keys to values.  The keys are always maintained in sorted order,
and it's both dead easy and efficient to do range searches over a BTree's
key space.  That's what's built in.
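To illustrate what a range search over a sorted key space means, here is a small stand-in sketch using only the standard library: a sorted list plus binary search plays the role of the BTree. The data and the helper key_range are made up for illustration. With the real BTrees package the equivalent query would be spelled something like tree.keys(min, max).

```python
from bisect import bisect_left, bisect_right

# Stand-in for a BTree: keys kept in sorted order, values in a dict.
records = {"actin": 1, "cftr": 2, "insulin": 3, "myosin": 4, "tubulin": 5}
sorted_keys = sorted(records)


def key_range(lo, hi):
    """Return all keys k with lo <= k <= hi, found by binary search."""
    start = bisect_left(sorted_keys, lo)
    stop = bisect_right(sorted_keys, hi)
    return sorted_keys[start:stop]


# Every key between "c" and "n", without scanning the whole key space.
in_range = key_range("c", "n")
values_in_range = [records[k] for k in in_range]
```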
I wrote 'primary keys' because I thought that if I used just plain 'keys', you might think I did not know that a BTree is not a simple dictionary :-)

3. I can not build up another database from the ZODB as I am not a
Do you use Python?  I'm at a bit of a loss to figure out how you wound up
posting here if you're _not_ a Python programmer.  It could be that ZODB is
much more general than I thought ;-), but I didn't think non-programmers
would have any use for it.
For me _developers_ means trained people writing serious programs/packages, using expressions I do not understand; solving computational problems is _always_ trivial for them; they can always choose the best tools for their problems and mine. ;-)

I am just playing with Python and programming. I am working with DNA, proteins, cell lines, etc. But I can not waste my spare time (should I write shorter emails?), so I have to find the best tools.

I write to the developer list because I could not get help from other Zope-related lists. I tried several months ago. The result: I moved to Java; tried db4o; failed (I could not populate it with my objects (I have too many, and too complex, objects; I am looking forward to seeing how ZODB performs :-) )); tried Perl; I got scared, so back to Python ;-)

But I think you formulated this not the best way: I think you do not
build the SB database OUT of ZODB's BTrees, I think you just build
up indexes from the BTrees and you implement searches on your indexes
that points back to the BTrees.
I suppose you could think of it that way, but I designed SpamBayes and
that's not how I thought of it.  I thought of it in terms of abstract
mappings, then designed the main algorithms to work directly with BTrees.
ZODB supplies persistent BTrees, and that's all SB needs.

=> If you just build indexes from the BTrees, the following protocol
works for me and you can suggest?
Not sure I'm following.  I can suggest what?
I think my 'following protocol' is what you call 'abstract mapping'.
Suggest: how to implement indexes/searches on BTrees in an ODBMS/ZODB.

1. walk trough on your BTree taking each object
A BTree is a collection of <key, value> pairs, and unsure what "object"
means here.
I use <"primary key", "object"> for the <key, value> pairs of BTrees. I think you always have a Python "object" as the "value". Walking through the BTree taking each object means: for each key of the BTree, instantiate the value/object.

2. with an external indexing application build the index (on one or more
fields, or full text)
(indexed_something1: key1, key65, ... key_i),
(indexed_something2: key4, key6, key45, ... key_j)

3. search in your index that returns with the 'primary key' of objects
in the ZODB
searching for something, e.g. indexed_something2:
a, Is indexed_something2 among the keys of INDEX?
b, -- NO: no results
-- YES: return INDEX[indexed_something2] as list of keys to the objects in the BTree

4. get the objects from the ZODB via the 'primary keys' from the prev
step. ???
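The four steps above can be sketched like this. It is only a toy: a plain dict stands in for the persistent BTree, and Record, build_index and search are made-up names, not ZODB functions.

```python
class Record:
    """Hypothetical application object; `name` and `lab` are made-up fields."""

    def __init__(self, name, lab):
        self.name = name
        self.lab = lab


# Stand-in for a persistent BTree: 'primary key' -> object.
store = {
    "key1": Record("Tamas", "UNC"),
    "key4": Record("Kevin", "UNC"),
    "key6": Record("Tamas", "Duke"),
}


def build_index(mapping, field):
    """Steps 1-2: walk every <key, object> pair and index one field."""
    index = {}  # field value -> list of keys
    for key, obj in mapping.items():
        value = getattr(obj, field, None)
        if value is not None:
            index.setdefault(value, []).append(key)
    return index


def search(index, mapping, value):
    """Steps 3-4: look the value up, then fetch the objects by key."""
    keys = index.get(value, [])  # no entry in the index means no results
    return [mapping[key] for key in keys]


name_index = build_index(store, "name")
found = search(name_index, store, "Tamas")
```

In a real setup the index itself would presumably also live in a BTree so that it persists, and it would have to be updated whenever an object changes.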

OK, now I'm sure not following.  You appear to be assuming much more
structure than a plain BTree supports on its own, and in fact BTrees don't
really _appear_ to have anything to do with what you're saying.  If you
think _your_ objects have such things as "fields" and "primary keys", then
that's part of your objects' design and your objects' implementations --
objects don't come with such notions built in.  It sounds like you have RDMS
tables in mind, and are forcing object language on top of them.
Of course I have some 'tables' in my mind :-) I grew up on tables...
But I think you can admit that there are some analogies: BTree keys are like 'primary keys', an object is similar to a record, and the fields/members of the objects are like 'columns'.

I just would like to understand the basics of an index/search implementation on objects (on BTrees).

If so, that's fine -- it's legitimate to do so.  It sounds like you'd be
happier then with an RDMS, though (under the inference that you _think_ in
No! I would not be happier with an RDBMS. I have only been using ZODB for 1 or 2 weeks and my life is already happier :-)

REAL questions with less 'philosophy':

1. If I want to implement an index system for ZODB, is it OK to 'walk through' the keys of the BTrees, instantiate the objects and build the index??? Or is there some low-level code 'magic' to use? I mean some special "_function" from ZODB, learning the deep internals of BTrees, etc...

2. Do you see a possible way to build indexes on the raw file (.fs) without instantiating the objects?

3. (This mailing list may not be the perfect place to ask, but I think you are among the best for Python object questions :-) ) Python objects are not declared but created. If I have an object, anybody can just add extra members/fields/variables to it, or delete a member/field that I defined e.g. in __init__(). Do you know of an existing locking mechanism that prevents this? Let's say: the variables/fields/members of the object are created in __init__(), but you cannot add more or delete any of them after that point.
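To make question 3 concrete, here is a rough sketch of the behaviour I have in mind, done by overriding __setattr__ and __delattr__. The names Locked and Protein are made up, and I have not tested how this interacts with persistent objects in ZODB. (As far as I understand, __slots__ only blocks adding new attributes, not deleting existing ones.)

```python
class Locked:
    """Base class: attributes may be created in __init__ only."""

    _locked = False  # class-level default, overridden per instance

    def __init__(self):
        # Subclasses call this LAST, after creating all their fields.
        self._locked = True

    def __setattr__(self, name, value):
        # Existing fields may still be changed; new ones are refused.
        if self._locked and not hasattr(self, name):
            raise AttributeError("cannot add attribute %r" % name)
        object.__setattr__(self, name, value)

    def __delattr__(self, name):
        if self._locked:
            raise AttributeError("cannot delete attribute %r" % name)
        object.__delattr__(self, name)


class Protein(Locked):
    def __init__(self, name):
        self.name = name
        super().__init__()  # lock once all fields exist


p = Protein("CFTR")
p.name = "MRP1"  # allowed: the field already exists

try:
    p.mass = 168000  # refused: not created in __init__
    add_refused = False
except AttributeError:
    add_refused = True

try:
    del p.name  # refused: fields cannot be removed
    delete_refused = False
except AttributeError:
    delete_refused = True
```

This is a convention rather than real protection: someone determined can still bypass it with object.__setattr__.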

Thanks for your patience!

Tamas Hegedus, PhD          | phone: (1) 919-966 0329
UNC - Biochem & Biophys     | fax:   (1) 919-966 5178
5007A Thurston-Bowles Bldg  | mailto:[EMAIL PROTECTED]
Chapel Hill, NC, 27599-7248 | http://biohegedus.org
For more information about ZODB, see the ZODB Wiki:

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
