At 04:04 PM 1/6/00 -0800, Assaf Arkin wrote:
>+1 on on/off feature
>+1 on on by default (i.e. no whitespace unless said otherwise)
>
>+1 on documenting that you have to go trim your text nodes and ignore
>others if the DTD is missing (conclusion: always use some DTD)

Seems to me it's kind of a waste of time for an app writer to count on
help from the DOM here, simply because there are going to be lots of
times when you don't have a DTD or schema or whatever.  Thus, you're
going to have to write the code to nuke the superfluous whitespace
anyhow, so why not just curse a little bit and then do it?

Your app essentially *knows* which elements don't have #PCDATA
content anyhow, so it has the information it needs.
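
For what it's worth, here is a rough sketch of that "curse and do it"
code against the standard org.w3c.dom interfaces. The class name
WhitespaceStripper, the strip() and parse() methods, and the idea of
passing the element-content-only tag names in as a Set are all made up
for illustration; nothing like this ships with the parser as far as I
know. Names are matched on the qualified tag name, so a namespace-aware
version would need more care.

```java
import java.io.StringReader;
import java.util.Set;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

/** Hypothetical helper: drops whitespace-only text nodes where the app says no text belongs. */
public class WhitespaceStripper {

    /**
     * Walks the tree under node and removes whitespace-only text children
     * of every element whose tag name is listed in elementOnly.
     * Text inside any other element is left untouched.
     */
    public static void strip(Node node, Set<String> elementOnly) {
        boolean stripHere = node.getNodeType() == Node.ELEMENT_NODE
                && elementOnly.contains(node.getNodeName());
        Node child = node.getFirstChild();
        while (child != null) {
            Node next = child.getNextSibling(); // grab it before a possible removal
            if (child.getNodeType() == Node.TEXT_NODE) {
                if (stripHere && child.getNodeValue().trim().isEmpty()) {
                    node.removeChild(child);
                }
            } else {
                strip(child, elementOnly);
            }
            child = next;
        }
    }

    /** Convenience for trying it out: parses a string with no DTD at all. */
    public static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }
    }
}
```

Calling strip(doc, names) with "dl" in the set removes the indentation
between the <dt> and <dd> children but leaves the text inside them alone.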

What would be nice, though, would be a library somewhere in Apache-land
that does with whitespace more or less exactly what HTML does, given
as input a set of tags that are:

- element-content-only (e.g. <html:dl>)
- block-level text containers (e.g. <html:p>)
- inline text containers (e.g. <html:i>)

It turns out there are quite a lot of subtleties, but given the above
information, you can get whitespace pretty well right.
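
To make the idea concrete, here is a minimal sketch of such a library,
and only a sketch: it ignores most of the subtleties (no <pre>, no
xml:space, no CDATA or entity handling). The class HtmlLikeWhitespace
and its methods are hypothetical names. It assumes you hand it two sets
of tag names, element-content-only and block-level, and it treats
everything else as inline. Inside a block it collapses runs of
whitespace to one space, even across inline element boundaries, and
trims the start and end of the block.

```java
import java.io.StringReader;
import java.util.Set;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.Text;
import org.xml.sax.InputSource;

/** Hypothetical sketch of HTML-style whitespace handling driven by tag categories. */
public class HtmlLikeWhitespace {
    private final Set<String> elementOnly; // e.g. "html:dl"
    private final Set<String> block;       // e.g. "html:p"; all other tags count as inline
    private boolean atSpace = true;        // true at a block boundary or right after a kept space
    private Text lastText = null;          // last text node kept since the previous boundary

    private HtmlLikeWhitespace(Set<String> elementOnly, Set<String> block) {
        this.elementOnly = elementOnly;
        this.block = block;
    }

    /** Normalizes whitespace in place under the given document. */
    public static void apply(Document doc, Set<String> elementOnly, Set<String> block) {
        HtmlLikeWhitespace w = new HtmlLikeWhitespace(elementOnly, block);
        w.normalize(doc);
        w.flush();
    }

    private void normalize(Node node) {
        boolean dropBlank = node.getNodeType() == Node.ELEMENT_NODE
                && elementOnly.contains(node.getNodeName());
        Node child = node.getFirstChild();
        while (child != null) {
            Node next = child.getNextSibling(); // grab it before a possible removal
            if (child.getNodeType() == Node.TEXT_NODE) {
                if (dropBlank && child.getNodeValue().trim().isEmpty()) {
                    node.removeChild(child);
                } else {
                    collapse((Text) child);
                }
            } else if (child.getNodeType() == Node.ELEMENT_NODE) {
                String name = child.getNodeName();
                boolean inline = !elementOnly.contains(name) && !block.contains(name);
                if (!inline) flush(); // a non-inline element ends the current text run
                normalize(child);
                if (!inline) flush();
            }
            child = next;
        }
    }

    /** Collapses whitespace runs in one text node, carrying state across inline boundaries. */
    private void collapse(Text t) {
        StringBuilder sb = new StringBuilder();
        for (char c : t.getData().toCharArray()) {
            boolean ws = c == ' ' || c == '\t' || c == '\n' || c == '\r';
            if (!ws) {
                sb.append(c);
                atSpace = false;
            } else if (!atSpace) {
                sb.append(' ');
                atSpace = true;
            }
        }
        if (sb.length() == 0) {
            t.getParentNode().removeChild(t);
        } else {
            t.setData(sb.toString());
            lastText = t;
        }
    }

    /** Ends a text run: trims the trailing space, if any, and resets the state. */
    private void flush() {
        if (lastText != null && lastText.getData().endsWith(" ")) {
            String d = lastText.getData();
            if (d.length() == 1) {
                lastText.getParentNode().removeChild(lastText);
            } else {
                lastText.setData(d.substring(0, d.length() - 1));
            }
        }
        atSpace = true;
        lastText = null;
    }

    /** Convenience for trying it out: parses a string with no DTD at all. */
    public static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }
    }
}
```

Keeping the "did we just emit a space" flag across inline elements is
the design choice that matters here: it is what stops "Hello <i> big</i>"
from ending up with two spaces in a row.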

One of the nice things about doing database rather than document 
processing is that you don't have to deal with whitespace.  Sigh. -T.
