If you know which schemas are going to be used, I would *strongly*
recommend preparsing the schemas and caching them. A good FAQ page [1]
describes how to accomplish the task. This will greatly improve the
performance of your application, because it's expensive to parse the same
set of schemas again and again whenever an XML document needs to be
validated.
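
A minimal sketch of the caching idea, not the Xerces-specific recipe from the
FAQ [1]: it uses the standard JAXP javax.xml.validation API (which postdates
this thread) to compile each schema once and reuse the result. The SchemaCache
class and its method names are hypothetical, and the cache key is assumed to be
whatever identifier the application already uses for a schema.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.SAXException;

/** Hypothetical helper: compiles each schema once and reuses the result. */
final class SchemaCache {
    // Compiled Schema objects are thread-safe, so they can be shared.
    private static final Map<String, Schema> CACHE = new ConcurrentHashMap<>();

    /** Returns the compiled schema for key, parsing xsdText only on first use. */
    static Schema get(String key, String xsdText) {
        return CACHE.computeIfAbsent(key, k -> compile(xsdText));
    }

    private static Schema compile(String xsdText) {
        try {
            SchemaFactory factory =
                    SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            return factory.newSchema(new StreamSource(new StringReader(xsdText)));
        } catch (SAXException e) {
            throw new IllegalArgumentException("Schema does not compile", e);
        }
    }

    /** Validates one document; a Validator is not thread-safe, so make one per call. */
    static boolean isValid(Schema schema, String xml) {
        try {
            Validator validator = schema.newValidator();
            validator.validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false; // the default error handler throws on validation errors
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String xsd = "<xsd:schema xmlns:xsd=\"http://www.w3.org/2001/XMLSchema\">"
                + "<xsd:element name=\"order\" type=\"xsd:int\"/></xsd:schema>";
        Schema schema = get("order", xsd); // parsed here
        Schema again = get("order", xsd);  // served from the cache
        System.out.println(schema == again);
        System.out.println(isValid(schema, "<order>42</order>"));
    }
}
```

The schema text is parsed only on the first call for a given key; later
validations reuse the compiled object and only pay for a new Validator.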

As we all know, EntityResolver is from SAX, and SAX requires the system ID
to be absolutized before the custom EntityResolver is called. And by
default, Xerces thinks that all locations are files. (Is there a better
*default* behavior?) Applications can change this behavior by registering
an EntityResolver.

Since we are required to absolutize system IDs, we have to have a base for
that. We do try our best for a meaningful base, but if that fails, there is
no better choice than the user.dir property. If you don't like it, you can
ignore whatever comes before the last '/' in the system ID.
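
A minimal sketch of such a resolver, assuming the schemas are held in memory
(for example after being read from a database) and keyed by their bare file
name. The InMemoryResolver class and its lastSegment helper are hypothetical
names; only EntityResolver and InputSource come from SAX.

```java
import java.io.StringReader;
import java.util.Map;
import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;

/** Hypothetical resolver that serves documents from memory, keyed by file name. */
final class InMemoryResolver implements EntityResolver {
    private final Map<String, String> documents;

    InMemoryResolver(Map<String, String> documents) {
        this.documents = documents;
    }

    /** Drops everything up to and including the last '/' of the system ID. */
    static String lastSegment(String systemId) {
        return systemId.substring(systemId.lastIndexOf('/') + 1);
    }

    @Override
    public InputSource resolveEntity(String publicId, String systemId) {
        if (systemId == null) {
            return null;
        }
        // The parser has already absolutized the ID (possibly against user.dir),
        // so only the trailing name is used as the lookup key.
        String text = documents.get(lastSegment(systemId));
        if (text == null) {
            return null; // unknown entity: let the parser use its default handling
        }
        InputSource source = new InputSource(new StringReader(text));
        source.setSystemId(systemId);
        return source;
    }
}
```

An instance would be registered with setEntityResolver on the parser before
parsing. Keying on the bare name assumes no two schemas share a file name.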

[1] http://xml.apache.org/xerces2-j/faq-grammars.html

Cheers,
Sandy Gao
Software Developer, IBM Canada
(1-905) 413-3255
[EMAIL PROTECTED]



From:    "Milan Trninic" <[EMAIL PROTECTED]nc.com>
To:      <[EMAIL PROTECTED]>, <[email protected]>
cc:
Date:    10/31/2002 11:19 AM
Subject: Re: problem: xsd schemas that are included or imported must be files !
Please respond to xerces-j-user



I had the same problem.
The Xerces API documentation says that the parser is required to resolve the
ID to a meaningful URI before passing it to the application (custom
EntityResolver).
I was trying to find a way to intercept that, but gave up for lack of time.
I would certainly like to see if there is a way to do it.
Otherwise, the only solution I see is to preparse all the schemas and
provide them as external schemas to the parser.

Milan
 ----- Original Message -----
 From: Evyatar Kafkafi
 To: [email protected] ; [EMAIL PROTECTED]
 Sent: Thursday, October 31, 2002 5:24 AM
 Subject: problem: xsd schemas that are included or imported must be files !

 Hi all,

 My problem might be a feature of Xerces, or a feature of XML schema - I am
 not sure.
 In any case, I think this feature is wrong.

 In a schema like this:

 <?xml version="1.0" encoding="UTF-8"?>
 <xsd:schema targetNamespace="http://schemas.devxpert.com/order"
                     xmlns:xsd="http://www.w3.org/2001/XMLSchema"
                     xmlns:my="my_imported_namespace"
                     xmlns:dx="http://schemas.devxpert.com/order"
                     elementFormDefault="qualified">
      <xsd:import namespace="my_imported_namespace"
                  schemaLocation="import.xsd"/>
 ...
 </xsd:schema>

 When the parser parses the xsd:import, it takes the schemaLocation and
 assumes it is a file on the file system!

 Now this approach is quite reasonable when you work with small systems,
 but not at large scale.
 In the system I'm currently developing, we don't keep the schemas in the
 file system at all!
 We keep XML schemas in the Database. There are many good reasons for this.

 When we use a parser to parse an XML instance and validate it with a
 schema, we take the schema (and all its imported and included xsd files)
 directly from the Database. We don't want to write them to the file system
 and then tell the parser where they are located on the disk.
 Xerces supports this behavior by using the InputSource class, which allows
 the input to come from a string, not from a file.
 In addition, Xerces has the EntityResolver, which allows me to supply the
 parser with an InputSource for the entity "import.xsd".

 But before calling my custom-made entity resolver, Xerces tries to expand
 the value "import.xsd" to a meaningful URI, and its last resort is to
 assume it's in the "user.dir" directory [System.getProperty("user.dir")].

 So you see, Xerces assumes that "import.xsd" is a file. Why is that? Is it
 Xerces, or is it defined by the XML Schema Recommendation?
 And in any case, could it be changed?
 Isn't it much more general (and thus elegant) to allow entities to
 come from places other than the file system?

     Evyatar.


