"Since attributes are generally about the same size, it would seem to be
much more efficient to attempt to reuse the existing "stuff" instead of
explicitly deleting it then reallocating it.  I don't think that it
requires anything as sophisticated as a string pool.  Maybe just always
allocate fName (for example) at least, say, 64 characters.  If the new
value is less than 64 characters, copy the new value into fName; otherwise
allocate a longer fName."

I would be a little leery of believing those numbers too much. Perhaps the
particular test you ran skewed the results a little towards that particular
code, but we've run extensive tests here with some pretty powerful tools,
and they don't show that code as being very high on the list, as far as I
know. Far and away the biggest bang for the buck is in minimizing the
per-character loop overhead, which we've already reduced considerably,
though more could still be done.

The optimization opportunity you point out is certainly there and,
actually, if you go back and look at some of the very earliest versions of
the parser, that buffer reuse was in the code. But I got rid of it because
it was more complex and there was no real proof that the reallocation was a
big problem. We can consider putting it back, but I think that there are
other places with bigger bang for the buck that will probably be attacked
first.

Anyway, we have a big performance pow-wow scheduled after we get this new
release out. Roger has been doing considerable profiling and has a lot of
numbers we need to look at.

"Another anomaly is that I was not doing a validating parse, but I spent
10% of the time in DTDValidator::scanDTD()."

The DTD is parsed whether you validate or not. It has stuff in it that is
relevant whether or not validation is happening, such as the types of
attributes (controls normalization), defaulted or fixed attributes, entity
declarations, notation declarations, etc. Those things are all used if
available even if validation is not happening.

----------------------------------------
Dean Roddey
Software Weenie
IBM Center for Java Technology - Silicon Valley
[EMAIL PROTECTED]

