----- Original Message -----
From: "Gianugo Rabellino" <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>
Sent: Wednesday, November 27, 2002 6:15 AM
Subject: [RT] Xindice 2.0


>
> This is probably a good time to start thinking about Xindice 2.0. The
> major number switch should come from a major evolution of the current
> architecture: we now have a quite solid XML database, but there is still
> a lot of work to do in order to make Xindice a viable solution for the
> use cases that have been anticipated by our candidate users.
>
> This is just a "starting point" to try and set things straight, in order
> to try to come up together with a sort of guideline for future
> developments. Please feel free to fire at will, and remember that these
> are just Random Thoughts. :-)
>
> There are some major points that I would like to address in the near
> future. In no particular order, I think we need to work on:
>
> 1. XML:DB API
> This is not 100% a Xindice issue, yet I think that since dbXML before
> and Xindice afterwards are the de facto standards for this API, the
> XML:DB APIs should be the primary way to access the database. I still
> think that it's really important to have a vendor-neutral API for
> accessing XML databases, so I would like to invest more and more in
> this: we might try to push on the xapi-dev list and see what happens. If
> we fail, it will always be possible to run wild and do our own extensions.
>
> I think that we need to extend the API in order to accommodate the needs
> anticipated by the users. These points at least are crucial to me:
>
> - metadata: we need a neutral way to query metadata for collections and
> resources. I like David's solution of having a MetaData object with a
> set of fixed and basic metadata (author, creation, modification), a set
> of "properties" and a custom XML-based system: we don't really need much
> more than that, but we also need to refine it in order to come out with
> a complete solution that addresses the most basic needs (I, for one,
> would like to add to the MetaData the collection and the document ID).
> When the MetaData object is carved in stone we can decide how to get it:
> I'm all in favor of something like getMetaData() calls on Collection
> and Resource.
>
> - transaction support: the API should have a basic support for atomic
> operations and for transactions;
>
> - capabilities (is that the right English term?). There should be a way
> to query the Database (or maybe the Collection?) to find out whether it
> supports some features (e.g. transactions). A parallel with JDBC would
> be the DatabaseMetaData object, even if I'm not really sure about the
> plethora of supports* methods. The alternative is a SAX-feature-like
> (URI based) set of capabilities and a single method to query for
> support, with a pseudocode of:
>
> if (database.supports(Capabilities.TRANSACTIONS)) {
> begin()/work()/commit()
> } else {
> workAndHopeForTheBest()
> }
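
A minimal Java sketch of the single-method, URI-based capability query described above. Everything here is hypothetical: the `CapableDatabase` interface, the `supports(String)` method and the feature URI are illustrative names only and are not part of the XML:DB API or of Xindice.

```java
import java.util.HashSet;
import java.util.Set;

public class CapabilitySketch {
    // Hypothetical feature URI, in the style of SAX feature names.
    static final String TRANSACTIONS =
        "http://example.org/xmldb/features/transactions";

    // Hypothetical interface: one method to query for support,
    // instead of a family of supports*() methods.
    interface CapableDatabase {
        boolean supports(String featureUri);
    }

    // Toy implementation that simply remembers which URIs it was given.
    static class SimpleDatabase implements CapableDatabase {
        private final Set<String> features = new HashSet<>();

        SimpleDatabase(String... supported) {
            for (String f : supported) {
                features.add(f);
            }
        }

        @Override
        public boolean supports(String featureUri) {
            return features.contains(featureUri);
        }
    }

    // Client code picks a strategy based on the advertised capability.
    static String store(CapableDatabase db) {
        if (db.supports(TRANSACTIONS)) {
            return "transactional";   // begin() / work() / commit()
        }
        return "best-effort";         // workAndHopeForTheBest()
    }

    public static void main(String[] args) {
        System.out.println(store(new SimpleDatabase(TRANSACTIONS)));
        System.out.println(store(new SimpleDatabase()));
    }
}
```

With URIs as keys, a new capability needs no new method on the interface; an implementation that does not recognise a URI just answers false.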
>
> Again: this is not exactly the right place to discuss this, but before
> going to xapi-dev I'd like to hear your opinion and put together a draft
> that comprises all our present and (possibly :-)) future needs.
>
> 2. PERFORMANCE
> Face it: we are slow. We are fair enough for small jobs but we cannot
> stand high loads or huge documents, no matter how accurate your indexes
> might be. I put a great deal of hope into Tom's work on Xalan DTM
> (http://xml.apache.org/xalan-j/dtm.html) to improve Xindice
> performance, but as of now I'm afraid that Tom is MIA too, so unless he
> shows up we have no choice but to do it on our own and decide what might
> be the best way to improve Xindice storage and retrieval
> performance. I see some possible directions:
>
> a. Stefano pointed me to the Lore documentation. The guys at Stanford
> did a whole lot of work thinking about storage of semi-structured data;
> we might borrow something from there, if it's still up to date
> (http://www-db.stanford.edu/lore/);
>
> b. DTM (http://xml.apache.org/xalan-j/dtm.html). I had a small chat with
> Shane Curcuru from Xalan at ApacheCon and he was cautious about using
> DTM for persistent storage. But it might be worth trying (by asking on
> xalan-dev) to see if the DTM model is good enough (or can possibly be
> extended) to accommodate our needs;
>
> c. SAX events. There is almost no doubt about SAX being the most
> efficient way to deal with XML, speed and memory wise. As of now Xindice
> is heavily based on DOM (albeit compressed and finely tuned); it might be
> worth investigating if this should change. Cocoon had very good results
> using SAX even for the internal cache, by compiling SAX events to byte
> streams and interpreting them at a later time: see
>
> http://cvs.apache.org/viewcvs.cgi/xml-cocoon2/src/java/org/apache/cocoon/components/sax/
> and look for XMLByteStream[Compiler|Interpreter]. We might borrow that
> at least for the transport of SAX events over the wire in the XML-RPC
> protocol: if we have on the server side a Compiler (or, even better, if
> the documents are already stored in a compiled format) and on the client
> side an Interpreter, things might be a whole lot faster, esp. when
> dealing with SAX based applications such as Cocoon.
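
To make the compile-then-interpret idea concrete, here is a minimal Java sketch using only the JDK's SAX classes. It is not Cocoon's XMLByteStreamCompiler/Interpreter and does not use their byte format: the event codes, class names and the `roundTrip` helper are assumptions made up for illustration, and only element, attribute and character events are handled.

```java
import java.io.*;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.*;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.DefaultHandler;

public class SaxByteStreamSketch {
    // Made-up event codes for this sketch.
    static final int START = 1, END = 2, CHARS = 3;

    // "Compiler": records incoming SAX events as a compact byte stream.
    // Note: writeUTF limits each string to 65535 encoded bytes.
    static class Compiler extends DefaultHandler {
        private final ByteArrayOutputStream buf = new ByteArrayOutputStream();
        private final DataOutputStream out = new DataOutputStream(buf);

        @Override
        public void startElement(String uri, String local, String qName,
                                 Attributes atts) throws SAXException {
            try {
                out.writeByte(START);
                out.writeUTF(qName);
                out.writeInt(atts.getLength());
                for (int i = 0; i < atts.getLength(); i++) {
                    out.writeUTF(atts.getQName(i));
                    out.writeUTF(atts.getValue(i));
                }
            } catch (IOException e) { throw new SAXException(e); }
        }

        @Override
        public void endElement(String uri, String local, String qName)
                throws SAXException {
            try { out.writeByte(END); out.writeUTF(qName); }
            catch (IOException e) { throw new SAXException(e); }
        }

        @Override
        public void characters(char[] ch, int start, int len)
                throws SAXException {
            try { out.writeByte(CHARS); out.writeUTF(new String(ch, start, len)); }
            catch (IOException e) { throw new SAXException(e); }
        }

        byte[] toBytes() { return buf.toByteArray(); }
    }

    // "Interpreter": replays the recorded events into any ContentHandler.
    static void interpret(byte[] data, ContentHandler target)
            throws IOException, SAXException {
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(data));
        target.startDocument();
        while (in.available() > 0) {
            int code = in.readUnsignedByte();
            switch (code) {
                case START: {
                    String name = in.readUTF();
                    int n = in.readInt();
                    AttributesImpl atts = new AttributesImpl();
                    for (int i = 0; i < n; i++) {
                        String an = in.readUTF();
                        atts.addAttribute("", an, an, "CDATA", in.readUTF());
                    }
                    target.startElement("", name, name, atts);
                    break;
                }
                case END: {
                    String name = in.readUTF();
                    target.endElement("", name, name);
                    break;
                }
                case CHARS: {
                    char[] c = in.readUTF().toCharArray();
                    target.characters(c, 0, c.length);
                    break;
                }
                default:
                    throw new IOException("unknown event code " + code);
            }
        }
        target.endDocument();
    }

    // Parse XML into bytes, then replay the bytes into a naive serializer
    // (no escaping; demonstration only).
    static String roundTrip(String xml) throws Exception {
        Compiler compiler = new Compiler();
        SAXParserFactory.newInstance().newSAXParser()
            .parse(new InputSource(new StringReader(xml)), compiler);
        final StringBuilder sb = new StringBuilder();
        interpret(compiler.toBytes(), new DefaultHandler() {
            @Override
            public void startElement(String u, String l, String q, Attributes a) {
                sb.append('<').append(q);
                for (int i = 0; i < a.getLength(); i++) {
                    sb.append(' ').append(a.getQName(i))
                      .append("=\"").append(a.getValue(i)).append('"');
                }
                sb.append('>');
            }
            @Override
            public void endElement(String u, String l, String q) {
                sb.append("</").append(q).append('>');
            }
            @Override
            public void characters(char[] c, int s, int n) { sb.append(c, s, n); }
        });
        return sb.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(roundTrip("<a x=\"1\"><b>hi</b></a>"));
    }
}
```

In the scenario above, the byte array produced by the Compiler would be what travels over the wire (or sits in storage), and the client would run the Interpreter against its own ContentHandler without ever re-parsing XML text.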
>
> 3. AAA
> Badly needed, on two sides:
>
> a. Server side: not that hard to implement, after all, at least in a
> not-so-granular way. We might go the hard way with security-oriented
> markup languages and node based security or just rely on URI-based
> authentication, with a Tomcat/Slide/younameit-like role system. I'd go
> for the latter: Collection based security should be enough for most needs.
>
> b. Transport: if we are going to have usernames and passwords flying over
> the wire, we need to protect them. XML-RPC over HTTPS? CHAP? Kerberos?
> Other thoughts?
>
> 4. TRANSACTION
> This is needed too. I don't know how JTA might help here: I have no idea
> of the API and have never worked with it. Any expert around? We would need
> to know not only if JTA would do the job, but also if, performance wise,
> it will suffice without imposing severe penalties on the system.
>
> ======================================================================
>
> OK, this was the first stone in the lake: I hope to spark some
> discussion on it and, once we manage to agree on what we want from 2.0,
> to start writing docs and code. I'm now borrowing the world-famous
> asbestos underwear from Stefano & Sam and I'm eagerly waiting for your
> replies.
>
> Ciao,
>
> --
> Gianugo Rabellino
