James Richardson wrote:
> cat bad.xml | perl -p -e 's/\<(.*?)\>(.*)\<\/\>/<$1>$2<\/$1>/' > good.xml

As long as your file is ASCII, this should be fine. But XML
is based on Unicode, which can be stored in any number of
encodings. Unless your Perl understands the file's encoding
(not likely), this approach only works for ASCII (and
"ASCII-transparent") files.
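
To make the encoding point concrete, here is a minimal Python sketch of
the same substitution done on decoded text rather than raw bytes. It
assumes you already know the file's encoding (for example from the XML
declaration or a byte order mark); the function name and the `encoding`
argument are illustrative only.

```python
import re

# Same idea as the quoted one-liner: <name>text</>  ->  <name>text</name>.
# "." does not match a newline, so each match stays on one line.
EMPTY_END_TAG = re.compile(r"<(.*?)>(.*)</>")

def fix_empty_end_tags(data: bytes, encoding: str) -> bytes:
    """Decode, repair empty end tags, and re-encode in the same encoding.

    `encoding` is an assumption supplied by the caller; nothing here
    detects it from the document.
    """
    text = data.decode(encoding)
    fixed = EMPTY_END_TAG.sub(r"<\1>\2</\1>", text)
    return fixed.encode(encoding)
```

On a UTF-16 file the byte-level pattern never matches, because the
characters of "</>" are not adjacent bytes; after decoding, the same
pattern works as intended.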

> Why are strings containing [\n\t ]* reported as character, rather 
> than ignorable whitespace?

Without a grammar present, the parser has no way of knowing
which whitespace is meaningful character data and which is
ignorable. So if you want the [\n\t ]* reported as ignorable
whitespace, you have to associate a grammar (such as a DTD)
with the document.
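
The decision can be modelled with a small Python sketch. This is a toy
illustration, not how any real parser exposes it (SAX parsers report
this through the ignorableWhitespace callback). The `element_only` set
is a hypothetical stand-in for a grammar: the names of elements
declared to contain only child elements.

```python
import xml.etree.ElementTree as ET

def classify_whitespace(xml_text, element_only=frozenset()):
    """Return (parent tag, kind) for each whitespace-only text run.

    With an empty `element_only` set (no grammar), nothing can be
    called ignorable, so every run is reported as "characters".
    """
    root = ET.fromstring(xml_text)
    results = []
    for parent in root.iter():
        # Text directly inside `parent`: its leading text plus the
        # tail that follows each of its children.
        runs = [parent.text] + [child.tail for child in parent]
        for run in runs:
            if run and not run.strip():
                if parent.tag in element_only:
                    kind = "ignorable"
                else:
                    kind = "characters"
                results.append((parent.tag, kind))
    return results

doc = "<root>\n\t<item>x</item>\n</root>"
without_grammar = classify_whitespace(doc)
with_grammar = classify_whitespace(doc, element_only={"root"})
```

Here `without_grammar` labels both whitespace runs inside `root` as
"characters", while `with_grammar` labels them "ignorable", since only
then is `root` known to hold nothing but child elements.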

-- 
Andy Clark * IBM, TRL - Japan * [EMAIL PROTECTED]
