Insights from developers will be of great help.

Thank you
----------------------------------
> I am observing an inconsistent behavior in the treatment of internal
> subset declarations by a SAX parser from Xerces300ea3 for Java.
>
> sax.SAXWriter from the samples jar was used as a parsing application
>
> Issues:
>
> 1. Scanning or buffering error (absent in older ibm4j) when reading a
> large file of character entity declarations. The declarations are referred
> to from the internal subset via a parameter entity.
>
> ==============================
> <!DOCTYPE dummy [
> <!ENTITY % entts SYSTEM "allchars.ent">
> %entts;
> ]>
> <test>
> text
> </test>
> ==============================
>
> <<enttest.ent>>
> ...
> <!-- Entity set.
> Public identifier:
> -//ISO 8879:1986//ENTITIES Added Math Symbols: Relations//EN
> -->
> ...
> <!-- take me out and xerces will break WEIRD -->
> <!ENTITY ape "&#38;ape;"> <!-- approximate, equals -->
> ...
> (from attachment enttest.ent)
> ============================== output ==
>
> [Fatal Error] enttest.ent:577:3: The markup declaration contained ...
>
>
> In the same situation, everything *might* work if the DTD fragment
> contains fewer entity declarations (other things like adding a blank line
> can make it work too).
>
> 2. Additional declarations in the internal subset can not override the
> same declarations previously read from the external resource or from the
> same subset.
>
> <!DOCTYPE dummy [
> <!ENTITY % entts SYSTEM "enttest.ent">
> <!-- has aacute mapping to itself -->
> %entts;
> <!ENTITY aacute "&#38;xaacute;">
> ]>
> <test>
> text &aacute; text
> </test>
>
> or
>
> <!DOCTYPE dummy [
> <!ENTITY aacute "&#38;aacute;">
> <!ENTITY aacute "&#38;xaacute;">
> ]>
> <test>
> text &aacute; text
> </test>
>
> ============================ output ===
> <test>
> text &aacute; text
> </test>
>
>
> I expected to see the following output
> <test>
> text &xaacute; text
> </test>
>
>
> 3. The same relates to the following test that tries to unsuccessfully
> override one of the predefined XML entities
>
> ===============================
> <!DOCTYPE dummy [
> <!ENTITY amp "&#38;xxamp;">
> ]>
> <test>
> text &amp; text
> </test>
> =============================== output ===
> <test>
> text & text
> </test>
> ===============================
>
> A DOM parser would behave differently, BTW
>
>
> I think all these issues manifest internal bugs that need to be fixed.
> Thank you,
>
> Gleb
>
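For anyone who wants to check issue 2 in isolation, here is a minimal sketch that goes through the standard JAXP SAX API rather than the sax.SAXWriter sample. The class name EntityOverrideCheck, the helper textOf, and the replacement texts "first" and "second" are made up for illustration; they are not taken from the original test files. The document declares the same general entity twice in the internal subset and prints the character data the parser reports.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class EntityOverrideCheck {

    // Parses the given document and returns all character data reported
    // through the SAX characters() callback, concatenated in order.
    static String textOf(String xml) throws Exception {
        StringBuilder text = new StringBuilder();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void characters(char[] ch, int start, int length) {
                text.append(ch, start, length);
            }
        };
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return text.toString();
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical test document: the entity "aacute" is declared twice
        // in the internal subset, with plain-text values so that the two
        // declarations are easy to tell apart in the output.
        String xml =
            "<!DOCTYPE dummy [\n"
            + "  <!ENTITY aacute \"first\">\n"
            + "  <!ENTITY aacute \"second\">\n"
            + "]>\n"
            + "<test>text &aacute; text</test>";
        System.out.println(textOf(xml));
    }
}
```

For comparison when reading the result: XML 1.0 (section 4.2) says that if the same entity is declared more than once, the first declaration encountered is binding. A conforming parser is therefore expected to report "text first text" here, and since the internal subset is read before the external one, a declaration placed in the internal subset ahead of the %entts; reference is the one that should take effect.");
if (!out.equals("text first text")) throw new AssertionError("unexpected text: " + out);
</test>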
enttest.ent
Description: Binary data