I saw a similar symptom recently when the file said it was in UTF-8 but a
non-compliant XML parser only supported ISO-8859-1. Make sure that your
character set encoding is correct on both sides.
David Johnson
Programmer Analyst
J. B. Hunt Transport
Information Services / OPR New Dev
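
One way to check the encoding point above is to compare what the XML declaration claims with what the bytes actually are. The following is only a minimal sketch: the class and method names (EncodingCheck, declaredEncoding, bytesMatchDeclaration) are made up for illustration, and it assumes an ASCII-compatible encoding (not UTF-16) with the declaration within the first 200 bytes. An unknown encoding name makes Charset.forName throw an unchecked exception.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Xalan or JAXP.
final class EncodingCheck {
    // Matches encoding="..." inside an XML declaration at the very start of the data.
    private static final Pattern DECL =
        Pattern.compile("^<\\?xml[^>]*?encoding\\s*=\\s*[\"']([A-Za-z0-9._-]+)[\"']");

    /** Returns the encoding named in the XML declaration, or "UTF-8" if none is found. */
    static String declaredEncoding(byte[] data) {
        int n = Math.min(data.length, 200);
        // ISO-8859-1 maps every byte to one char, so this never fails.
        String head = new String(data, 0, n, StandardCharsets.ISO_8859_1);
        Matcher m = DECL.matcher(head);
        return m.find() ? m.group(1) : "UTF-8";
    }

    /** True if all bytes decode without error in the declared encoding. */
    static boolean bytesMatchDeclaration(byte[] data) {
        try {
            Charset.forName(declaredEncoding(data)).newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT)
                .decode(ByteBuffer.wrap(data));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }
}
```

Note that the check only catches one direction of the mismatch: bytes written as ISO-8859-1 but declared as UTF-8 usually fail to decode, while any byte sequence is valid ISO-8859-1, so a file declared as ISO-8859-1 always passes.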
From: John Gentilin <[EMAIL PROTECTED]>
Date: 07/01/2004 11:54 AM
To: "Kransen, J." <[EMAIL PROTECTED]>
cc: "'Russell Simpkins'" <[EMAIL PROTECTED]>, "'[EMAIL PROTECTED]'" <[EMAIL PROTECTED]>
Subject: Re: Parsing XML file retrieved through HTTP
I have been doing this for a while very successfully. The only difference I
see is that I actually parse the incoming stream into a DOM document, then run
the transformation from there. As a brute-force test, just spool your input
stream to a file, then run the transformation off the file. If it doesn't
work, you can poke through the file with an editor to see where the problem
is.
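
The spool-to-file test described above could be sketched roughly as follows. StreamSpool and spoolToTempFile are hypothetical names, and the temp-file prefix and suffix are arbitrary choices; the I/O calls are standard java.nio.file APIs.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper for the brute-force test.
final class StreamSpool {
    /** Copies the whole stream into a new temp file, closes the stream, returns the path. */
    static Path spoolToTempFile(InputStream in) {
        try (InputStream src = in) {
            Path file = Files.createTempFile("spooled-", ".xml");
            // createTempFile already created the file, so it must be replaced.
            Files.copy(src, file, StandardCopyOption.REPLACE_EXISTING);
            return file;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The returned path can then be handed to the transformer with new StreamSource(path.toFile()), and the same file can be opened in an editor if the transformation still fails.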
I had this issue before, where the XML document I was retrieving was
dynamically generated from an MS SQL Server DB connection whose tables were
populated through Access. The original content was written in Notepad, then
cut and pasted into the Access table. The cut and paste turned the quote
characters into special Windows font values, and the resulting parse threw
the error "Single byte Unicode violation". Since the columns were declared as
varchar and not nvarchar, I assumed the problem was Xalan on the server
side... After about 4 hours of second-guessing myself, I spooled the content
to a file and used a hex editor to find the character.
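
As a rough stand-in for the hex editor step, a few lines of Java can locate the first byte outside 7-bit ASCII and print its neighbours in hex. This is only a sketch; ByteScan, firstNonAscii and hexAround are illustrative names. For reference, Windows-1252 encodes the curly double quotes as 0x93 and 0x94, which is the kind of byte a paste from a Windows application can leave behind.

```java
// Hypothetical helper: a tiny substitute for eyeballing a file in a hex editor.
final class ByteScan {
    /** Returns the offset of the first byte outside 7-bit ASCII, or -1 if there is none. */
    static int firstNonAscii(byte[] data) {
        for (int i = 0; i < data.length; i++) {
            if ((data[i] & 0x80) != 0) {
                return i;
            }
        }
        return -1;
    }

    /** Formats the bytes from offset - radius to offset + radius as space-separated hex. */
    static String hexAround(byte[] data, int offset, int radius) {
        int from = Math.max(0, offset - radius);
        int to = Math.min(data.length, offset + radius + 1);
        StringBuilder sb = new StringBuilder();
        for (int i = from; i < to; i++) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", data[i] & 0xFF));
        }
        return sb.toString();
    }
}
```

Combined with a spooled copy of the response, Files.readAllBytes on that file gives the byte array to scan.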
HTH
John G
Kransen, J. wrote:
> Yes, I saw the file through a browser, and I'm sure that it's valid. Also, I
> tried a completely empty XML file with only an open and close tag. Either way,
> it parses all right when read as a local file, but the file as retrieved
> over HTTP does not work. So I am quite sure that the problem lies there. I
> think the HTTP headers are not stripped. If this is the case, can anybody
> tell me a straightforward way of stripping the headers in Java? I'm sure I
> don't have to write something common and low-level like that ;-) Maybe in
> your case the header stripping was done by PHP? I would sort of expect that
> from Java as well when I request an XML file over HTTP...
>
> So far what I did was write a custom URIResolver like this, which gives
> exactly the same result (same error as in the original email):
>
> transformer.setURIResolver(new URIResolver() {
>     public Source resolve(String href, String base) {
>         StreamSource source = null;
>         URL context = null, url = null;
>         try {
>             // Resolve href against the base URI and open the resource.
>             context = new URL(base);
>             url = new URL(context, href);
>             InputStream in = url.openStream();
>             source = new StreamSource(in, url.toString());
>         }
>         catch (MalformedURLException urle) {
>             // Ignored: returning null lets the processor resolve the URI itself.
>         }
>         catch (IOException ioe) {
>             // Ignored: returning null lets the processor resolve the URI itself.
>         }
>         return source;
>     }
> });
>
> Since my code obviously doesn't do header stripping and gives the same
> result, I strongly feel like headers are still there when doing a
> StreamSource("http://localhost/risc/readonly.jsp").
>
> Any additional thoughts/help would be appreciated.
>
> Jeroen
>
>
>>-----Original Message-----
>>From: Russell Simpkins [mailto:[EMAIL PROTECTED]
>>Sent: Thursday, July 1, 2004 13:55
>>To: 'Kransen, J.'; [EMAIL PROTECTED]
>>Subject: RE: Parsing XML file retrieved through HTTP
>>
>>Jeroen,
>>
>>Did you look at the XML file through a browser? Maybe there is something
>>there you aren't seeing. I have done the exact same thing, opening an
>>HTTP page as a StreamSource, using PHP as the source. Every error I saw
>>came down to malformed XML, which is what it sounds like you have.
>>
>>Russ
>>
>>-----Original Message-----
>>From: Kransen, J. [mailto:[EMAIL PROTECTED]
>>Sent: Thursday, July 01, 2004 5:16 AM
>>To: '[EMAIL PROTECTED]'
>>Subject: Parsing XML file retrieved through HTTP
>>
>>Hello,
>>
>>I have a .jsp that outputs an XML file. I have another .jsp in which I want
>>to parse the XML file using an XSL file. So in the latter .jsp I want to do
>>something like this:
>>
>><textarea><%
>>
>>String xslFile = getServletContext().getRealPath("/xml/risc2cvs.xsl");
>>
>>TransformerFactory tFactory = TransformerFactory.newInstance();
>>Transformer transformer = tFactory.newTransformer(new StreamSource(xslFile));
>>
>>// write the content of the parsed XML file
>>transformer.transform(new StreamSource("http://localhost/risc/readonly.jsp"),
>>        new StreamResult(out));
>>
>>%></textarea>
>>
>>However, when I do this, I get the following error message:
>>
>>The element type "base" must be terminated by the matching end-tag "</base>".
>>
>>With this stack trace:
>>javax.xml.transform.TransformerException: The element type "base" must be
>>terminated by the matching end-tag "</base>".
>>    at org.apache.xalan.transformer.TransformerImpl.fatalError(TransformerImpl.java:744)
>>    at org.apache.xalan.transformer.TransformerImpl.transform(TransformerImpl.java:720)
>>    at org.apache.xalan.transformer.TransformerImpl.transform(TransformerImpl.java:1192)
>>    at org.apache.xalan.transformer.TransformerImpl.transform(TransformerImpl.java:1170)
>>    at org.apache.jsp.cvs_005fuittreksel_jsp._jspService(cvs_005fuittreksel_jsp.java:109)
>>..
>>
>>
>>When instead I try to parse a local XML file, there are no problems:
>>String xmlFile = getServletContext().getRealPath("/xml/temp.xml");
>>transformer.transform(new StreamSource(xmlFile), new StreamResult(out));
>>
>>But then, when I try to parse the very same file as accessed through HTTP, I
>>get the very same error:
>>transformer.transform(new StreamSource("http://localhost/risc/xml/temp.xml"),
>>        new StreamResult(out));
>>
>>I was thinking that maybe the HTTP headers aren't stripped before the
>>parsing. Does anybody know what to do here?
>>
>>Thanks in advance!
>>
>>Jeroen
>
>
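
On the header-stripping question raised in the quoted messages: the stream returned by URL.openStream() contains only the response body, because the URLConnection consumes the HTTP headers itself. So it may be more useful to look at what the server actually sent back. The sketch below reports the status code and Content-Type; ResponseCheck, looksLikeXml and describe are hypothetical names, and the list of accepted media types is an assumption. A complaint about a "base" element could, for instance, come from an HTML page being served instead of the XML, but that is only a guess.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;
import java.util.Locale;

// Hypothetical diagnostic helper; not part of Xalan or JAXP.
final class ResponseCheck {
    /** True for a 200 response whose Content-Type names an XML media type. */
    static boolean looksLikeXml(int status, String contentType) {
        if (status != HttpURLConnection.HTTP_OK || contentType == null) {
            return false;
        }
        // Drop parameters such as "; charset=UTF-8" before comparing.
        String type = contentType.split(";", 2)[0].trim().toLowerCase(Locale.ROOT);
        return type.equals("text/xml") || type.equals("application/xml")
            || type.endsWith("+xml");
    }

    /** Requests the URL (assumed to be http or https) and summarizes status and type. */
    static String describe(URL url) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        try {
            int status = conn.getResponseCode();
            String type = conn.getContentType();
            return status + " " + type + (looksLikeXml(status, type) ? "" : " (not XML?)");
        } finally {
            conn.disconnect();
        }
    }
}
```

Calling ResponseCheck.describe(new URL("http://localhost/risc/readonly.jsp")) from the same JSP would show whether the request returns a 200 with an XML content type or something else, such as an error or login page.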