Thanks all,

Interesting, never dawned on me that it would be SGML, but it makes
sense ... I'll look at the XHTML, but ideally I can hack this down to
something workable. 

I did manage to clean up the DTD file but hit the following trouble
spot; perhaps you can offer some insight ...

... extra deleted ...
<!ENTITY % reserved "">
<!ENTITY % attrs "%coreattrs; %i18n; %events;">
<!ENTITY % fontstyle "TT | I | B | BIG | SMALL">
<!ENTITY % phrase "EM | STRONG | DFN | CODE | SAMP | KBD | VAR | CITE |
ABBR | ACRONYM" >
<!ENTITY % special "A | IMG | OBJECT | BR | SCRIPT | MAP | Q | SUB | SUP
| SPAN | BDO">
<!ENTITY % formctrl "INPUT | SELECT | TEXTAREA | LABEL | BUTTON">
<!ENTITY % inline "#PCDATA | %fontstyle; | %phrase; | %special; |
%formctrl;">

<!ELEMENT ( %fontstyle; | %phrase; ) (%inline;)* >

<!ATTLIST ( %fontstyle; | %phrase; )  %attrs;  >
... continued ...

The parser works fine until it hits the 
<!ELEMENT ( %fontstyle; | %phrase; ) (%inline;)* >
line. I've gone through the XML syntax/grammar and it looks ok. Yet, the
parser doesn't like it at all:

C:\src\dtd\xerces-c_1_1_0_d05-win32\bin>saxcount readme.html

Fatal Error at (file C:\src\dtd\xerces-c_1_1_0_d05-win32\bin\html4.dtd,
line 95, char 11):
readme.html: 50 ms (0 elems, 0 attrs, 0 spaces, 0 chars)

I've checked the line a dozen times and can't see the problem. The fatal
error seems to indicate something more systemic.
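
One guess worth checking: as far as I can tell from the XML 1.0
grammar, <!ELEMENT and <!ATTLIST each take a single element name, so the
parenthesized name group "( %fontstyle; | %phrase; )" is an SGML-only
construct. If line 95 is the line above, char 11 would be that opening
"(". A minimal sketch of a script that splits such declarations into one
per element follows; the function name and the small entity table are
hypothetical, and it only handles the simple cases shown here:

```python
import re

# Matches <!ELEMENT ( ... ) rest> or <!ATTLIST ( ... ) rest>, i.e. a
# declaration whose name position holds a parenthesized SGML name group.
GROUP_DECL = re.compile(
    r"<!(ELEMENT|ATTLIST)\s+\(([^)]*)\)\s*(.*?)\s*>", re.DOTALL
)

def expand_name_groups(dtd, entities):
    """Rewrite name-group declarations as one declaration per element.

    `entities` maps parameter entity names to replacement text. It is a
    hand-built table for this sketch, not something read from the DTD.
    """
    def resolve(text):
        # Substitute %name; references until none remain (no cycle check).
        while True:
            new = re.sub(r"%([\w.-]+);", lambda m: entities[m.group(1)], text)
            if new == text:
                return new
            text = new

    def split(match):
        keyword, group, rest = match.groups()
        names = [n.strip() for n in resolve(group).split("|") if n.strip()]
        # Only the name group is expanded; the content model or attribute
        # list (rest) is copied through unchanged.
        return "\n".join(f"<!{keyword} {name} {rest}>" for name in names)

    return GROUP_DECL.sub(split, dtd)

if __name__ == "__main__":
    sample = "<!ELEMENT ( %fontstyle; | %phrase; ) (%inline;)* >"
    table = {"fontstyle": "TT | I | B | BIG | SMALL", "phrase": "EM | STRONG"}
    print(expand_name_groups(sample, table))
```

Declarations that already name a single element are left untouched, so
this could be run over the whole file, but I have not verified that it
is the only thing the parser objects to.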

Additional thoughts?

-Sandy



Arnaud Le Hors wrote:
> 
> [EMAIL PROTECTED] wrote:
> >
> > The DTD, as you posted it anyway, is pretty much completely dead.
> 
> Not true.
> 
> > It
> > appears as though the <!-- of most of the comments have been lost, for
> > instance, from the very beginning of the file:
> >
> > <!ENTITY % ContentType "CDATA"
> >     -- media type, as per [RFC2045] -->
> >
> > should have almost certainly been:
> >
> > <!ENTITY % ContentType "CDATA"
> > <! -- media type, as per [RFC2045]  -->
> >
> > In the first form, it's definitely not a valid DTD. So did it get munged in
> > the process of your posting it or is that the way it really is?
> 
> This is not a valid XML DTD, but it is a valid SGML DTD! HTML 4.0 is an
> SGML application. Xerces only supports XML DTDs. You may want to
> consider switching from HTML to XHTML [1], after which you'll be able to
> use Xerces.
> 
> [1] http://www.w3.org/TR/xhtml1
> --
> Arnaud  Le Hors - IBM Cupertino, XML Technology Group
