John Wright wrote:

One interesting thing I've seen, particularly with the LORE project, is
the use of multiple indices, since XML and semi-structured data have at
least two distinct components - paths in the graph and nodes for the
data.  This is also where you start to lose storage efficiency, though,


I don't think storage is much of an issue right now. Storage is cheap at the moment, so I'd personally focus on the efficiency and speed of the whole engine (true, once you start having gigabyte-sized indexes you lose speed too).

and updates are similarly not very efficient...my first thought is that
all the indices don't need to be stored on disk, or even in memory.


You've lost me here. If not on disk or in memory, where are the indexes supposed to be stored? Do you mean that some indexes might not need to exist at all?

What sort of indexing schemes are we using right now?

Look at org.apache.xindice.core.indexer.*: basically there are two indexes (both B-tree based), one for element/attribute names and one for element/attribute values.
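
To make the name/value split concrete, here is a rough sketch in Java. It is only an illustration and not the Xindice code: the class and method names (TwoIndexSketch, add, byName, byValue) are made up, and an in-memory TreeMap stands in for the on-disk B-tree.

```java
import java.util.Collections;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical illustration: none of these names come from Xindice, and
// TreeMap (a red-black tree) is only a stand-in for a real B-tree on disk.
public class TwoIndexSketch {
    // Name index: element/attribute name -> ids of documents that contain it.
    private final Map<String, Set<String>> nameIndex = new TreeMap<>();
    // Value index: "name=value" -> ids of documents that contain that pair.
    private final Map<String, Set<String>> valueIndex = new TreeMap<>();

    // Record that document docId has an element or attribute `name` with `value`.
    public void add(String docId, String name, String value) {
        nameIndex.computeIfAbsent(name, k -> new TreeSet<>()).add(docId);
        valueIndex.computeIfAbsent(name + "=" + value, k -> new TreeSet<>()).add(docId);
    }

    // Documents that contain an element or attribute with this name.
    public Set<String> byName(String name) {
        return nameIndex.getOrDefault(name, Collections.emptySet());
    }

    // Documents where the named element or attribute has exactly this value.
    public Set<String> byValue(String name, String value) {
        return valueIndex.getOrDefault(name + "=" + value, Collections.emptySet());
    }
}
```

With something like this, a query that only tests for the presence of an element goes to the name index, while an equality test on its content goes to the value index. Every insert has to touch both maps, which is roughly where the storage and update cost of keeping multiple indexes comes from.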

Ciao,

--
Gianugo Rabellino


