Thanks all. Interesting, it never dawned on me that it would be SGML, but it makes sense ... I'll look at the XHTML, but ideally I can hack this down to something workable.
I did manage to clean up the DTD file but hit the following trouble spot, perhaps you can shed some insight ...

... extra deleted ...
<!ENTITY % reserved "">
<!ENTITY % attrs "%coreattrs; %i18n; %events;">
<!ENTITY % fontstyle "TT | I | B | BIG | SMALL">
<!ENTITY % phrase "EM | STRONG | DFN | CODE | SAMP | KBD | VAR | CITE | ABBR | ACRONYM" >
<!ENTITY % special "A | IMG | OBJECT | BR | SCRIPT | MAP | Q | SUB | SUP | SPAN | BDO">
<!ENTITY % formctrl "INPUT | SELECT | TEXTAREA | LABEL | BUTTON">
<!ENTITY % inline "#PCDATA | %fontstyle; | %phrase; | %special; | %formctrl;">
<!ELEMENT ( %fontstyle; | %phrase; ) (%inline;)* >
<!ATTLIST ( %fontstyle; | %phrase; ) %attrs; >
... continued ...

The parser works fine until it hits the

<!ELEMENT ( %fontstyle; | %phrase; ) (%inline;)* >

line. I've gone through the XML syntax/grammar and it looks ok. Yet, the parser doesn't like it at all:

C:\src\dtd\xerces-c_1_1_0_d05-win32\bin>saxcount readme.html

Fatal Error at (file C:\src\dtd\xerces-c_1_1_0_d05-win32\bin\html4.dtd, line 95, char 11):
readme.html: 50 ms (0 elems, 0 attrs, 0 spaces, 0 chars)

I've checked the line a dozen times and can't see the problem. The fatal error seems to indicate something more systemic.

Additional thoughts?

-Sandy

Arnaud Le Hors wrote:
>
> [EMAIL PROTECTED] wrote:
> >
> > The DTD, as you posted it anyway, is pretty much completely dead.
>
> Not true.
>
> > It
> > appears as though the <!-- of most of the comments have been lost, for
> > instance, from the very beginning of the file:
> >
> > <!ENTITY % ContentType "CDATA"
> > -- media type, as per [RFC2045] -->
> >
> > should have almost certainly been:
> >
> > <!ENTITY % ContentType "CDATA"
> > <! -- media type, as per [RFC2045] -->
> >
> > In the first form, its definitely not a valid DTD. So did it get munged in
> > the process of your posting it or is that the way it really is?
>
> This is not a valid XML DTD, but it is a valid SGML DTD! HTML 4.0 is an
> SGML application. Xerces only supports XML DTDs. You may want to
> consider switching from HTML to XHTML [1], after which you'll be able to
> use Xerces.
>
> [1] http://www.w3.org/TR/xhtml1
> --
> Arnaud Le Hors - IBM Cupertino, XML Technology Group
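
A note on the failing declaration, offered as a sketch rather than a definitive fix: XML 1.0 requires exactly one element name after <!ELEMENT and <!ATTLIST, whereas SGML allows a parenthesized name group such as ( %fontstyle; | %phrase; ). If line 95 is that declaration and columns are counted from 1, char 11 is the opening parenthesis, which would be consistent with the parser rejecting the group. The Python below is illustration only. expand_name_groups is a hypothetical helper (it is not part of Xerces), and it assumes simple double-quoted parameter entities and a DTD from which other SGML-only syntax (tag-omission flags, "-- ... --" comments) has already been removed. It rewrites each grouped declaration as one declaration per element.

```python
import re

# Hypothetical helper for illustration; not part of Xerces or any library.
# Assumes parameter entities look like: <!ENTITY % name "value">
ENTITY_RE = re.compile(r'<!ENTITY\s+%\s+([\w.-]+)\s+"([^"]*)"\s*>')
# Matches <!ELEMENT ( ... ) rest> and <!ATTLIST ( ... ) rest>
GROUP_DECL_RE = re.compile(r'<!(ELEMENT|ATTLIST)\s+\(([^)]*)\)\s*(.*?)\s*>', re.S)
PE_REF_RE = re.compile(r'%([\w.-]+);')


def expand_name_groups(dtd: str) -> str:
    """Split SGML-style name-group declarations into one per element name."""
    entities = dict(ENTITY_RE.findall(dtd))

    def resolve(text: str) -> str:
        # Expand %name; references; bounded so a bad DTD cannot loop forever.
        for _ in range(20):
            expanded = PE_REF_RE.sub(
                lambda m: entities.get(m.group(1), m.group(0)), text)
            if expanded == text:
                break
            text = expanded
        return text

    def split_decl(match: re.Match) -> str:
        keyword, group, rest = match.groups()
        # Only the name group is expanded; the content model or attribute
        # list (rest) is copied through unchanged.
        names = [n.strip() for n in resolve(group).split('|') if n.strip()]
        return '\n'.join(f'<!{keyword} {name} {rest}>' for name in names)

    return GROUP_DECL_RE.sub(split_decl, dtd)


if __name__ == '__main__':
    sample = (
        '<!ENTITY % fontstyle "TT | I | B | BIG | SMALL">\n'
        '<!ENTITY % phrase "EM | STRONG">\n'
        '<!ELEMENT ( %fontstyle; | %phrase; ) (%inline;)* >\n'
        '<!ATTLIST ( %fontstyle; | %phrase; ) %attrs; >\n'
    )
    print(expand_name_groups(sample))
```

The entity declarations are left in place, so references such as %inline; and %attrs; inside the rewritten declarations still resolve when the parser reads the result. This addresses only the name-group construct; other SGML features in the HTML 4.0 DTD may still need separate attention.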