> > I also hope we can have metadata at the database level.
> > http://marc.theaimsgroup.com/?l=xindice-dev&m=103790372009713&w=2
> >
> Can you be more specific on that? I saw the message on the archive, but
> I fail to see how would Database metadata help here. I tend to think
> that Database metadata are capabilities (like transaction support) and
> maybe the collection tree, nothing more really.

Just having access to the collection tree may not be enough. For example, if the database only tells the user that there is a collection hierarchy '/db/addressbook', there is no way for the user to come up with a query like '/db/addressbook/email' unless they iterate over every document in that collection. However, if the database returns a meta information tree such as

/db
  /addressbook
    /email
    /phone
    ...

it will open the door to a new breed of applications, such as a GUI tool that supports ad hoc queries.

> As per XPath queries sent on the database, I understand that they might
> be useful, but I see a problem. Given an XPath like
> /db/content/whatever/A/B, how can you tell which one of the tokens is a
> collection, which one is a document and which one is a real XML XPath?
> This would become even more difficult with XPaths like //*/A/B. But I'd
> be happy to be proven wrong, since I see lots of use cases for that.

To me, XPath as a query language has limitations in its syntax, but its biggest advantage is that it is more natural for the end user. Each level is a container or collection. It also provides another level of virtualization. If an end user wants to get some weather information from a system, they naturally think of '/USA/California/Bayarea/Temperature'. Do they really care whether '/USA' is a collection or '/USA/California' is a collection? What they care about is that they are sending a query to the system '/'. Conceptually, every XML node is a collection too. On the other hand, if users really want to be specific, they can say /USA/California[system_type='collection']/... where 'system_type' is the meta information.

> > >2. PERFORMANCE
>
> Here I disagree. My point is that XML database should solve the problem
> of semistructured data. Pushing semistructured data on a relational DB
> looks at least suboptimal to me.
> I can see a reason when dealing with data oriented XML (like just tags
> and attributes), but things become really messy on text oriented
> documents: how could you efficiently break into a tabular format
> something like
>

I think I need to restate it a little bit. There are data-centric XML files and there are document-centric XML files. I probably inherited more genes from a data processing background (which also explains the proposal for database-level meta information). I agree the proposal may not be a good fit for the document-centric scenario, and I don't expect one size to fit all. I *briefly* scanned through eXist's SQL scripts before (forgive my ignorance, if anyone here is from eXist :) ), and I was not totally convinced that building a DOM tree in an RDBMS will help.

I also have reservations about the idea that we should focus more on document-centric XML files. At the very least, it is 50-50 in the real world. As I said, my original motivation for looking into XML databases comes from data processing, not content management. More and more Web Services implementations mean more SOAP messages that need to be logged and retrieved. Even users of traditional middleware (JMS, MQSeries, ...) tend to wrap their messages in XML too. So the point is that there are plenty of data-centric use cases. If we can give users options to pick the configuration that suits them, won't that be great?

Lixin
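
P.S. To make the meta information tree idea a bit more concrete, here is a rough Python sketch of how such a tree could be derived from the documents in one collection. It is only an illustration: `collection_meta_paths` is a made-up name and not part of Xindice or any other database API, the sample documents are invented, and it assumes that the root element of each document is transparent, so that its children show up directly under the collection path.

```python
import xml.etree.ElementTree as ET


def collection_meta_paths(collection_path, documents):
    """Collect every distinct element path found under a collection.

    Hypothetical helper, not a Xindice API.
    Assumption: the root element of each document is treated as
    transparent, so its children appear directly under the collection.
    """
    paths = {collection_path}

    def walk(node, prefix):
        # Record the path of this element, then descend into its children.
        path = prefix + "/" + node.tag
        paths.add(path)
        for child in node:
            walk(child, path)

    for xml_text in documents:
        root = ET.fromstring(xml_text)
        for child in root:
            walk(child, collection_path)
    return sorted(paths)


# Invented sample documents for a '/db/addressbook' collection.
docs = [
    "<entry><email>a@example.org</email><phone>555-0100</phone></entry>",
    "<entry><email>b@example.org</email>"
    "<address><city>Oakland</city></address></entry>",
]

meta_paths = collection_meta_paths("/db/addressbook", docs)
for path in meta_paths:
    print(path)
```

A GUI tool could show these paths as a tree and let the user pick one as the starting point of an ad hoc query, without reading every document itself. A real database would presumably maintain this information incrementally as documents are stored, rather than rescanning the collection each time.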