Re: Ignorable Whitespace ( and 'terminating with ' )

Andy Clark 31 May 2001 16:50:04 -0000

James Richardson wrote:
> cat bad.xml | perl -p -e 's/\<(.*?)\>(.*)\<\/\>/<$1>$2<\/$1>/' > good.xml


As long as your file is ASCII, this should be fine. But XML
is based on Unicode, which can be stored in any number of
encodings. Unless your Perl understands the file's encoding
(not likely), this only works for ASCII (and
"ASCII-transparent") files.
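
To illustrate the encoding point, here is a minimal Python sketch
(not the original Perl). It assumes the caller already knows the
file's encoding; the function name close_empty_end_tags and its
encoding parameter are made up for this example. It decodes the
bytes first, applies the same pattern as the one-liner above line
by line, and re-encodes the result.

```python
import re

# Same pattern as the Perl one-liner, applied to decoded text.
EMPTY_END_TAG = re.compile(r"<(.*?)>(.*)</>")


def close_empty_end_tags(data: bytes, encoding: str) -> bytes:
    """Replace '</>' with a named end tag, one substitution per line.

    'encoding' is an assumption of this sketch: the caller must
    already know how the file is encoded.
    """
    text = data.decode(encoding)
    fixed = [
        EMPTY_END_TAG.sub(r"<\1>\2</\1>", line, count=1)
        for line in text.split("\n")
    ]
    return "\n".join(fixed).encode(encoding)


raw = "<a>x</>".encode("utf-16")

# A byte-level ASCII pattern finds nothing in UTF-16 data, because
# every character is stored as two bytes.
print(re.search(rb"<(.*?)>(.*)</>", raw))  # None

# Decoding first makes the same pattern work.
print(close_empty_end_tags(raw, "utf-16").decode("utf-16"))  # <a>x</a>
```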

> Why are strings containing [\n\t ]* reported as character, rather 
> than ignorable whitespace?

Without a grammar present, the parser has no way of knowing
which whitespace is meaningful character data and which is
ignorable. So if you want the [\n\t ]* reported as ignorable
whitespace, you have to have a grammar (for example, a DTD)
associated with the document.
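
A rough Python sketch of why the grammar matters. This is not how
a validating parser is implemented; it is a toy under stated
assumptions. The hypothetical element_only set stands in for what
a DTD would tell the parser, namely which elements may contain
only child elements. With an empty set, nothing can be classified
as ignorable.

```python
import xml.etree.ElementTree as ET

XML_WHITESPACE = " \t\r\n"

DOC = "<list>\n  <item>a</item>\n  <item> </item>\n</list>"


def split_whitespace(xml_text, element_only=frozenset()):
    """Return (ignorable, characters) text runs found in xml_text.

    'element_only' is a hypothetical stand-in for the grammar: the
    names of elements declared to contain only child elements.
    """
    ignorable, characters = [], []
    root = ET.fromstring(xml_text)
    for elem in root.iter():
        # Text directly inside elem: its own leading text plus the
        # tail that follows each of its children.
        runs = [elem.text] + [child.tail for child in elem]
        for run in runs:
            if not run:
                continue
            all_space = run.strip(XML_WHITESPACE) == ""
            if all_space and elem.tag in element_only:
                ignorable.append(run)
            else:
                characters.append(run)
    return ignorable, characters


# No grammar: every run, whitespace or not, is character data.
print(split_whitespace(DOC))

# "Grammar" says <list> holds only elements: its whitespace becomes
# ignorable, while the space inside <item> is still character data.
print(split_whitespace(DOC, {"list"}))
```

Note that the single space inside the second <item> stays
character data in both cases, since nothing in the assumed grammar
restricts <item> to element content.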

-- 
Andy Clark * IBM, TRL - Japan * [EMAIL PROTECTED]
