Hello

Thanks for all the replies. I feel quite safe now that I know how it works.
All our important data is held within <xxx>...</xxx>, so the extra line breaks
and spaces added for readability will not affect it.

Just one follow-up question. XML_PARSE_NOBLANKS fixed everything, and I can
now mix CR and CRLF in the file. But the description of the option says it
removes whitespace and does not mention CRLF. Is that part missing from the
documentation, or am I misreading it? Is 0x0A (decimal 10) handled as a
blank, or are there any other characters I might miss?
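
To make the question concrete, here is a minimal sketch of what "blank" means
under the XML 1.0 definition of white space (production S: space, tab, CR and
LF). The function names are made up for illustration and are not libxml2 API,
and whether XML_PARSE_NOBLANKS applies exactly this test is an assumption.
Note that the line feed is 0x0A (decimal 10); the byte 0x10 is a different
control character and is not white space.

```c
#include <stdbool.h>
#include <stddef.h>

/* XML 1.0 white space (production S) is exactly these four characters:
 * space (0x20), tab (0x09), carriage return (0x0D), line feed (0x0A). */
bool is_xml_space(unsigned char c)
{
    return c == 0x20 || c == 0x09 || c == 0x0D || c == 0x0A;
}

/* Hypothetical helper: returns true if the text contains nothing but
 * XML white space. NULL and the empty string count as blank. */
bool is_blank_text(const char *text)
{
    if (text == NULL)
        return true;
    for (; *text != '\0'; ++text) {
        if (!is_xml_space((unsigned char)*text))
            return false;
    }
    return true;
}
```

With this definition, a text node holding only "\r\n" or "\n" plus indentation
is blank, while anything containing another character is not.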

/James


2010/6/23 Michael Ludwig <[email protected]>

> James Ytterstene schrieb am 23.06.2010 um 14:41 (+0200):
>
> > If I leave the file untouched by any Windows editor, the line ending
> > is CR only, but if someone edits the file it is changed to CRLF
> > (stupid Windows editors, but we must use them). If I then read the
> > file back in libxml2, I get an extra node at each line containing
> > only 0x0A (decimal 10).
>
> Most serious editors have an option to go with DOS or UNIX or Mac line
> endings. Maybe yours does, too.
>
> > If I change the xmlReadFile call and add the option XML_PARSE_NOBLANKS, I
> > can read the file back OK. But when reading about that option I find
> > many posts advising against it, so I'm confused here.
>
> The question you have to answer: Are whitespace-only text nodes in your
> XML significant or not? If they're not significant, nothing wrong with
> stripping them. Unless, of course, your output is intended for human
> consumption. In that case, you have to keep them, or apply automatic
> output indenting.
>
> > When I read about libxml2 and how files should be parsed, I get the
> > feeling that the parser should handle the CRLF when reading files and
> > always save the new files with CR only. So the extra CRLF shouldn't be
> > an issue, but I may be wrong here.
>
> It's a requirement of the XML spec:
>
> http://www.w3.org/TR/REC-xml/#sec-line-ends
>
> > Is there any general solution for parsing files so that CR and CRLF
> > don't add any extra nodes?
>
> Well yes, the one you already found. Strip whitespace-only text nodes on
> parsing, using the appropriate parser or processor option, like in this
> case XML_PARSE_NOBLANKS.
>
> --
> Michael Ludwig
>  _______________________________________________
> xml mailing list, project page  http://xmlsoft.org/
> [email protected]
> http://mail.gnome.org/mailman/listinfo/xml
>
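
For reference, the line-end rule from section 2.11 of the XML spec linked
above can be sketched as follows: every CRLF pair and every CR not followed
by LF is translated to a single LF before parsing. This is only an
illustration of the rule, not libxml2's actual code, and the function name
normalize_line_ends is made up.

```c
#include <stddef.h>

/* Normalizes line ends in place as XML 1.0 section 2.11 describes:
 * "\r\n" becomes "\n", and a "\r" not followed by "\n" becomes "\n".
 * The buffer must be NUL-terminated. Returns the new string length. */
size_t normalize_line_ends(char *buf)
{
    size_t r = 0; /* read position */
    size_t w = 0; /* write position, never ahead of r */

    while (buf[r] != '\0') {
        if (buf[r] == '\r') {
            buf[w++] = '\n';
            /* Skip the LF of a CRLF pair so it is not emitted twice. */
            r += (buf[r + 1] == '\n') ? 2 : 1;
        } else {
            buf[w++] = buf[r++];
        }
    }
    buf[w] = '\0';
    return w;
}
```

After this step the application only ever sees LF, which is why a mixed file
ends up with whitespace-only text nodes containing a line feed rather than CR
or CRLF.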
