Re: Future of Xindice

Murray Altheim 16 Jan 2002 21:41:48 -0000

"Timothy M. Dean" wrote:
> 
> Murray,
> 
> Thanks for the more detailed response. Below are a couple of follow-up
> questions:
> 
> > -----Original Message-----
> > From: [EMAIL PROTECTED]
> >
> > My point (which
> > I'm guessing was not expressed very clearly) is that any
> > Xindice-based application *must* have an XML parser
> > available, and Xindice
> > is distributed with Xerces 2, which provides support for DTDs
> > and XML Schema. If you need stronger content validation,
> > Xerces provides that with its XML Schema support.
> 
> Yes, I've worked with standalone Xerces applications and have
> implemented apps that use strong content validation based on XML
> schemas. What I'm not understanding is how I can architect my systems so
> that I can be sure that all applications sharing a particular set of
> data via Xindice can consistently enforce the validation rules I need.
> 
> I could easily implement code that validates a document before storing
> it into my DB. I have concerns about enforcing a rule that says "all
> applications should ensure that they only store documents that are valid
> against the Schema X". Many clients I work with require this kind of
> enforcement, and if there's not an easy way to do it within Xindice I
> feel that Xindice would be ruled out as a valid solution for these
> clients.


Perhaps I'm not understanding what you've explained, but it seems
that you're confusing client and server. Xindice is not a client,
it's a database server. How that database is provided, who has
access, and what specific access controls, client software, and
security are in place are management issues, not technical ones. A Xindice system
would include client software written by you, and I would hope you'd
have control over both the installation of the server and how those
clients are configured. If this isn't the case I'm not clear what
Xindice's role would be technically, since such an unregulated 
system's problems wouldn't be solved by validation.

> > At most stages in the process an XML processor is *required*
> > to handle the XML content moving in and out of Xindice. All
> > one needs to do to provide stronger validation support during
> > these processes is to establish those parsers in validation
> > mode, and provide the schemas necessary to validate the
> > content.
> 
> How then would you suggest handling the following scenario (which I'm
> currently hacking around because I can't enforce validation within
> Xindice). I've got an XML document stored in Xindice. The structure of a
> particular element in my schema looks something like this:
> 
>     <element name="AddressList" type="ab:AddressListType">
>         <unique name="AddressUnique">
>             <selector xpath="*"/>
>             <field xpath="@id"/>
>         </unique>
>     </element>
> 
> Basically, this is used to represent a list of "Address" elements, where
> the "id" attribute of each address in the list is unique within the
> scope of the list.
> 
> Now consider this - My application wants to add a new Address element to
> the document. I want to use an XUpdate query to perform the insertion of
> a new element. My application creates the appropriate XUpdate query and
> submits it. I want to make sure that the new element is only stored if
> its "id" attribute is unique within the list.
> 
> How can I enforce this restriction of uniqueness? The new element is
> perfectly valid as far as I can tell by looking at the element on its
> own. The restriction only comes in when I try to place the new element
> into a previously stored document. Right now, I'm being forced to read
> in the entire list of Address elements into my application, add the new
> element to this list within my application to check for uniqueness, and
> then either rewriting the entire list or performing the insert of only
> the new element once I've performed my validation manually. It would be
> *very* nice for me if I could simply attempt an insert of the new
> element directly using Xindice, and expect validation that I've enabled
> for the collection (or document) to handle this scenario.
> 
> Is there another way I can approach this that would make my life easier?

ID uniqueness is a fairly easy constraint to fulfill. You'd merely
create a Xindice index for IDs (using the XPathService) and provide
that within your application as a HashSet. Incoming IDs would be checked
against the HashSet, and if an ID were already present (add() returns
false), you'd throw a duplicate ID exception. This would occur prior to
the document being corrupted (i.e., becoming invalid) by the insertion
of new content,
which I'd imagine is preferable to corrupting it and *then* having to
fix the problem. Another simple means of doing this is to have Xindice
return the existing document as a DOM Document node using Xindice's
getContentAsDOM() method, and then perform the check against that using
Xerces' DOM method getElementById(). This would allow you to bypass 
creating your own ID hash table. Which direction to take would depend
upon the application's requirements, how big the documents are, how 
often they're changed, performance configurations, etc.
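
The HashSet check described above can be sketched roughly as follows.
This is only an illustration: the class name AddressIdRegistry and its
methods are hypothetical, and it assumes the application has already
pulled the existing "id" values out of Xindice (by whatever query or
index it uses) before the XUpdate is submitted.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical helper: tracks the ids already present in one AddressList. */
final class AddressIdRegistry {
    private final Set<String> knownIds = new HashSet<>();

    /** Seed the registry with the ids already stored in the document. */
    AddressIdRegistry(Iterable<String> existingIds) {
        for (String id : existingIds) {
            knownIds.add(id);
        }
    }

    /**
     * Reserve an id before the insert is sent to the database.
     * Set.add returns false when the element was already present,
     * so a duplicate is rejected before the stored document changes.
     */
    void reserve(String id) {
        if (!knownIds.add(id)) {
            throw new IllegalArgumentException("Duplicate Address id: " + id);
        }
    }
}
```

An application would call reserve() with the new element's id and only
submit the update if no exception is thrown. Note that this guards a
single client; several clients writing to the same list would each need
to refresh their set, or route inserts through one component.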

> > You don't need to include validation features in Xindice
> > itself because the packages required to support Xindice
> > already provide those features, and any application built
> > upon Xindice *by necessity* must parse and process XML
> > content. All XML content going into Xindice must at minimum
> > be well-formed XML -- that's structural validation at its
> > most basic. If further structural or content validation is
> > needed, set the parser
> > factories to produce validating parsers, and then provide the
> > schemas.
> > To put these features into Xindice itself would be redundant and
> > unnecessary. Xerces is already doing it.
> 
> All I am asking for is a way to tell Xindice that it should enable the
> validation (provided by Xerces or whatever other implementation it
> chooses) at a level that is extremely inconvenient to get at in some
> applications. Because Xerces is included with Xindice, it seems
> reasonable to ask for Xindice to make use of Xerces in way that would be
> a great help to some of us...

Well, as I said, you need to be using Xerces (or a suitable 
replacement) in order to process the content going into Xindice,
so I don't see that it's an extra burden to you as a developer 
to validate the content you're manipulating, and to do it at
any time using any schema. If you're passing around a DOM Document
or even an element, you can pass it to a DOMParser to check it 
out.
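
As a rough sketch of checking a DOM Document against a schema, the
following uses the JDK's javax.xml.validation API rather than a Xerces
DOMParser directly; that substitution is an assumption made to keep the
example self-contained. The schema is also hypothetical: it wraps the
quoted AddressUnique constraint in an invented namespace
(urn:example:ab) with a minimal AddressListType.

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

final class DomValidation {
    // Hypothetical schema built around the unique constraint quoted earlier.
    static final String XSD =
        "<schema xmlns='http://www.w3.org/2001/XMLSchema'"
      + " xmlns:ab='urn:example:ab' targetNamespace='urn:example:ab'"
      + " elementFormDefault='qualified'>"
      + "<complexType name='AddressListType'><sequence>"
      + "<element name='Address' minOccurs='0' maxOccurs='unbounded'>"
      + "<complexType><attribute name='id' type='string' use='required'/>"
      + "</complexType></element></sequence></complexType>"
      + "<element name='AddressList' type='ab:AddressListType'>"
      + "<unique name='AddressUnique'>"
      + "<selector xpath='*'/><field xpath='@id'/></unique>"
      + "</element></schema>";

    /** Parse a string into a namespace-aware DOM Document. */
    static Document parse(String xml) throws Exception {
        DocumentBuilderFactory f = DocumentBuilderFactory.newInstance();
        f.setNamespaceAware(true); // schema validation needs namespaces
        return f.newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    /** Return true if the in-memory document is valid against XSD. */
    static boolean isValid(Document doc) throws Exception {
        Schema schema = SchemaFactory
            .newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
            .newSchema(new StreamSource(new StringReader(XSD)));
        try {
            // With no ErrorHandler set, validation errors are thrown.
            schema.newValidator().validate(new DOMSource(doc));
            return true;
        } catch (SAXException e) {
            return false;
        }
    }
}
```

In a Xindice client the Document would presumably be the node returned
by getContentAsDOM(), with the new element appended in memory before
isValid() is called, so a duplicate id is caught before anything is
written back.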

> > Now, one thing I can think might be quite valuable would be
> > to create some utility classes/methods that could be used
> > generally within Xindice to provide either DOM Document- or
> > Node-level validation at any stage within the process of
> > managing content. I'd be happy to even contribute to such an
> > effort. If this is of general interest, writing up a set of
> > requirements would be a good start.
> 
> I would be interested in this kind of project as well - My first
> requirements would be an easy way to handle the scenarios listed above.
> Any ideas you have on how to proceed would be appreciated.

Let me know if the suggestions I've made make any sense in your
scenario. 

Murray

...........................................................................
Murray Altheim, Staff Engineer          <mailto:murray.altheim@sun.com>
Java and XML Software
Sun Microsystems, 1601 Willow Rd., MS UMPK17-102, Menlo Park, CA 94025

       Ernst Martin comments in 1949, "A certain degree of noise in 
       writing is required for confidence. Without such noise, the 
       writer would not know whether the type was actually printing 
       or not, so he would lose control."
