Re: Schema validation -- known performance problem?

zongaro Fri, 03 May 2002 07:19:16 -0700

Hi Joseph,

     The first time the parse() method of a parser instance that uses the 
StandardParserConfiguration is invoked with the schema validation feature 
set to true, there is some overhead involved in creating a schema 
validator for that parser.  Among other things, it entails creating a new 
DOMParser.  Usually the cost of creating parser components is paid when 
the parser is constructed, but in this case the cost is deferred until the 
first parse().
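
A minimal sketch of how the deferred cost described above could be observed: one parser instance parses the same document several times, and each parse() is timed separately. It goes through JAXP rather than instantiating the Xerces DOMParser directly, and it assumes the JAXP implementation in use is Xerces-derived, so that it accepts the Xerces schema validation feature identifier. The class name FirstParseCost, the method timeParses and the sample document are made up for illustration, and the timings will vary from run to run.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;

// Hypothetical helper, not part of Xerces.
class FirstParseCost {
    // Xerces feature identifier that turns on XML Schema validation.
    static final String SCHEMA_FEATURE =
        "http://apache.org/xml/features/validation/schema";

    /** Parses xml `runs` times with ONE parser; returns nanoseconds per parse. */
    static long[] timeParses(String xml, int runs, boolean schemaOn) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        if (schemaOn) {
            factory.setValidating(true);
            // Assumption: the underlying parser recognizes this Xerces feature.
            factory.setFeature(SCHEMA_FEATURE, true);
        }
        DocumentBuilder parser = factory.newDocumentBuilder();
        // Ignore validity errors: the sample document has no DTD or schema.
        parser.setErrorHandler(new ErrorHandler() {
            public void warning(SAXParseException e) { }
            public void error(SAXParseException e) { }
            public void fatalError(SAXParseException e) throws SAXParseException {
                throw e;
            }
        });

        long[] nanos = new long[runs];
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            parser.parse(new InputSource(new StringReader(xml)));
            nanos[i] = System.nanoTime() - start;
        }
        return nanos;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<doc><item>1</item></doc>";
        for (boolean schemaOn : new boolean[] { false, true }) {
            long[] nanos = timeParses(xml, 5, schemaOn);
            for (int i = 0; i < nanos.length; i++) {
                System.out.println("schema=" + schemaOn + " parse #" + (i + 1)
                    + ": " + nanos[i] + " ns");
            }
        }
    }
}
```

Comparing parse #1 with the later parses for each setting separates any first-parse setup cost from the steady-state cost. JIT warm-up also inflates the earliest timings, so the schema=false run is useful as a baseline.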


     On top of that, there's additional cost for the first schema 
validator that is constructed - it's responsible for constructing the 
various built-in schema datatype validators.

     So you're right that there's some first-time object creation that is 
playing a part.  It's possible that construction of some of these things 
might be deferred until they are known to be needed without a severe 
impact on the case in which they are needed.
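
The idea of deferring construction until something is known to be needed can be sketched generically as lazy initialization. This is only an illustration of the pattern, not how Xerces is implemented; the Lazy class and the LazyDemo usage are hypothetical names.

```java
import java.util.function.Supplier;

// Hypothetical holder: builds its value on the first get() and caches it.
final class Lazy<T> implements Supplier<T> {
    private Supplier<T> factory;   // dropped once the value has been built
    private T value;
    private boolean built;

    Lazy(Supplier<T> factory) {
        this.factory = factory;
    }

    @Override
    public synchronized T get() {
        if (!built) {
            value = factory.get(); // construction cost is paid here, once
            built = true;
            factory = null;        // allow the factory to be collected
        }
        return value;
    }

    synchronized boolean isBuilt() {
        return built;
    }
}

class LazyDemo {
    public static void main(String[] args) {
        // Stand-in for an expensive component such as a validator table.
        Lazy<StringBuilder> expensive =
            new Lazy<>(() -> new StringBuilder("expensive component"));

        System.out.println("built before use: " + expensive.isBuilt());
        expensive.get();           // first use triggers construction
        System.out.println("built after use: " + expensive.isBuilt());
    }
}
```

The trade-off is the one mentioned above: callers that never need the component pay nothing, while callers that do need it pay the same construction cost, plus a small check, on first use.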

     On top of that, I think there's some additional overhead during 
parse-time proper that we might be able to improve upon fairly readily.

     I'll try to spend some time over the next few days to look into 
these.

Thanks,

Henry
------------------------------------------------------------------
Henry Zongaro      XML Parsers development
IBM SWS Toronto Lab   Tie Line 969-6044;  Phone (905) 413-6044
mailto:[EMAIL PROTECTED]





Joseph Kesselman/CAM/Lotus@Lotus
02/05/02 02:01 PM
Please respond to xerces-j-dev

 
        To:     [EMAIL PROTECTED]
        cc: 
        Subject:        Re: Schema validation -- known performance problem?

 


On Wednesday, 05/01/2002 at 11:24 AST, Elena Litani <[EMAIL PROTECTED]>
wrote:
> Joe,
>
> Joseph Kesselman/CAM/Lotus wrote:
> > HOWEVER -- when I turn on the schema validator, parser performance falls
> > through the floor -- even though none of the test documents references a
> > schema, and only two of them reference a DTD. The parse() operation takes
> > almost twice as long to complete.
>
> This is a single parse(), correct? I mean you did not use any warm-up..?

The testcase I'm running parses about 40 documents. It does instantiate a
new copy of the parser for each one, if that's what you're asking. So no,
this isn't a first-time code-load problem, though it may be a first-time
object-initialization problem.

And as I said, the time difference is emphatically _NOT_ insignificant in
these tests. As I said: 2:1 difference measured in this test.

> Currently we try to validate against both: DTDs and XML Schemas. That is
> why we do check if XML Schema is found on some element.

I understand the goal. But poor performance in schema mode is going to push
folks away from Xerces...



---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]




