> > - I find the name dataguide misleading because it's a guide on the query and
> > not on the data. Maybe QueryPruneGuide would be more meaningful
>
> The query itself is not pruned, the data is. I think "dataguide" is the
> established term -- see for example this paper:
> http://ilpubs.stanford.edu:8090/264/1/1997-50.pdf .

"DataGuides serve as dynamic schemas, generated from the database." What we
generate is a schema from the query.

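To make the distinction concrete, here is a minimal sketch in Python of a guide that is derived from the query and then used to prune the data. It is not the Zorba implementation. The names `GuideNode`, `build_dataguide`, and `prune`, the slash-separated path notation, and the use of plain dicts as items are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class GuideNode:
    # True when the query needs the entire subtree below this step.
    keep_all: bool = False
    children: dict[str, "GuideNode"] = field(default_factory=dict)


def build_dataguide(paths: list[str]) -> GuideNode:
    """Build a path tree from the (hypothetical) paths a query navigates."""
    root = GuideNode()
    for path in paths:
        node = root
        for step in path.split("/"):
            node = node.children.setdefault(step, GuideNode())
        # The last step of a path is returned in full.
        node.keep_all = True
    return root


def prune(data: Any, guide: GuideNode) -> Any:
    """Drop every field of `data` that the guide does not mention."""
    if guide.keep_all or not isinstance(data, dict):
        return data
    return {
        key: prune(value, guide.children[key])
        for key, value in data.items()
        if key in guide.children
    }


item = {
    "name": "x",
    "address": {"city": "Zurich", "zip": "8000"},
    "orders": [1, 2],
}
guide = build_dataguide(["name", "address/city"])
pruned = prune(item, guide)
```

In this sketch the guide is computed from the query alone, and only the item is reduced: `pruned` keeps `name` and `address/city` and drops `zip` and `orders`.
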
> > - Why is the dataguide parameter on the Store's getCollection() function?
> > Shouldn't it be on the function that returns the iterator? The problem is that
> > a Collection object within the simplestore exists only once per collection.
> > What are the semantics if multiple queries access the collection (possibly in
> > parallel)?
>
> It very much depends on how the collections are handled. Currently for Zorba
> collections it doesn't make sense to have any dataguides at all, because
> they're in-memory collections. I have not taken a look at the Sausalito code
> and have not seen how e.g. the MongoDB "collections" are managed.
> getCollection() seemed the most logical place where it should be passed, but
> the dataguide parameter could be easily propagated to any Store class,
> including the function that returns the iterator.
>
> Currently each and every db:collection() call has its own dataguide, even if
> they might refer to the same collection. If the collection manager currently
> "caches" or reuses the collection iterators, then it might make sense to
> forbid that so that the dataguide for each individual db:collection call could
> be used.
>
> Or alternatively, a "union" on the dataguides that refer to the same
> collection could be performed. But I think it is not always possible to
> determine if that is the case.
>
> I think this could be investigated and decided upon when implementing the
> Dataguide push-down into MongoDB or when I would take a better look at
> Sausalito's collection manager code.

I think we will run into a problem. 28msec has only one buffer that is accessed
by all db:collection() calls in a query. Hence, the dataguide applied to it
needs to be the union of the dataguides of all those calls.
-- 
https://code.launchpad.net/~zorba-coders/zorba/dataguide/+merge/173026