Hi Joseph,
Thanks for your reply. I have tried out a couple of things, and this is
what I found:
- I tried defining an internal subset and using just character references in it.
<?xml version="1.0" ?>
<!DOCTYPE myRootElement [<!ENTITY eacute "ç"> ] >
Then I removed all special characters from my input file and ran my Java application,
which converts the input to XML (with the header shown above), parses it, and
then validates it. I got the following error message: Element type myRootElement
must be declared. I then added a DTD instead of adding the entities directly, and ended
up creating an entire DTD with all the elements that appear in my XML!
How do I actually define an internal subset? How do I define only character entities
in it? Where am I going wrong?
- Following your suggestion, I declared the UTF-8 encoding in my doc header
and put the special characters back into my doc. However, the following error showed
up this time:
Invalid byte 2 of 3-byte UTF-8 sequence.
I am using Xalan (which I believe in turn uses Xerces) to do the parsing and validation. Why
could this error be occurring?
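One possible cause, sketched here under an assumption the thread does not confirm: the document declares UTF-8, but its bytes were actually written in another encoding such as ISO-8859-1. In that case a byte like 0xE9 looks like the start of a multi-byte UTF-8 sequence, and the byte after it is not a valid continuation. The class name, the sample document, and the choice of ISO-8859-1 below are hypothetical; only the standard JAXP calls are real.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;

// Hypothetical demo class: compares parsing the same declared-UTF-8 document
// when its bytes are really ISO-8859-1 versus really UTF-8.
final class EncodingMismatch {
    // Sample document (an assumption, not the poster's file); \u00e9 is e-acute.
    static final String XML =
        "<?xml version=\"1.0\" encoding=\"UTF-8\"?>"
        + "<myRootElement>caf\u00e9</myRootElement>";

    // Returns true if the bytes parse as well-formed XML, false otherwise.
    static boolean parses(byte[] bytes) {
        try {
            DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(bytes));
            return true;
        } catch (Exception e) {
            // The parser reports the malformed byte sequence here.
            System.out.println("parse failed: " + e.getMessage());
            return false;
        }
    }

    public static void main(String[] args) {
        // Bytes do not match the declared encoding: expected to fail.
        System.out.println(parses(XML.getBytes(StandardCharsets.ISO_8859_1)));
        // Bytes match the declared encoding: expected to succeed.
        System.out.println(parses(XML.getBytes(StandardCharsets.UTF_8)));
    }
}
```

If this is what is happening, the fix would be on the writing side: encode the output explicitly as UTF-8 (for example with an OutputStreamWriter given an explicit charset) rather than relying on the platform default.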
Thanks for your help.
Regards,
Chetan
Joseph Kesselman <[EMAIL PROTECTED]>
08/17/2004 08:04 PM
Alternatively, you can use an internal subset (DTD syntax inside the
document itself, which keeps things self-contained)...
or you skip the mnemonic names and just use numeric character references...
or you skip those and just use the characters themselves and pick an
encoding which supports them and which your processor understands. (if in
doubt, UTF-8 does support everything and is supported by all XML
processors).
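
The three alternatives above can be compared side by side. What follows is a minimal sketch, not a definitive recipe: the class name, the constant names, and the sample text "café" are made up for illustration, and the parser is the JAXP default with validation left off.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

// Hypothetical demo class: three ways to get an e-acute into element text.
final class EntityOptions {
    // 1. Internal subset declaring only the entity, no element declarations.
    static final String INTERNAL_SUBSET =
        "<?xml version=\"1.0\"?>"
        + "<!DOCTYPE myRootElement [ <!ENTITY eacute \"&#233;\"> ]>"
        + "<myRootElement>caf&eacute;</myRootElement>";

    // 2. Numeric character reference, no DOCTYPE at all.
    static final String NUMERIC_REFERENCE =
        "<?xml version=\"1.0\"?>"
        + "<myRootElement>caf&#233;</myRootElement>";

    // 3. The character itself, with the document encoded as UTF-8.
    static final String LITERAL_CHARACTER =
        "<?xml version=\"1.0\" encoding=\"UTF-8\"?>"
        + "<myRootElement>caf\u00e9</myRootElement>";

    // Parses the string as UTF-8 bytes, without DTD validation,
    // and returns the text content of the root element.
    static String rootText(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setValidating(false); // well-formedness only
            Document doc = factory.newDocumentBuilder().parse(
                new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            return doc.getDocumentElement().getTextContent();
        } catch (Exception e) {
            throw new IllegalStateException("parse failed", e);
        }
    }

    public static void main(String[] args) {
        // All three documents should yield the same root text.
        System.out.println(rootText(INTERNAL_SUBSET).equals(rootText(NUMERIC_REFERENCE)));
        System.out.println(rootText(NUMERIC_REFERENCE).equals(rootText(LITERAL_CHARACTER)));
    }
}
```

Note that this parses without validation. A validating parser checks the document against the DTD, so it would also expect every element to be declared there, not just the entities.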
(Schema considered adding an Entity-like macro/import facility, but decided
to leave this for other standards to deal with. Unfortunately there hasn't
yet been a clear consensus on what the best replacement is...)
______________________________________
Joe Kesselman, IBM Next-Generation Web Technologies: XML, XSL and more.
"The world changed profoundly and unpredictably the day Tim Berners Lee
got bitten by a radioactive spider." -- Rafe Culpin, in r.m.filk
---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]