Hi,

On Wed, 27 Nov 2002 [EMAIL PROTECTED] wrote:
> wrt to fitting in with spec.
> If an element is defined as mixed by a DTD then you can't insert or remove
> white space for the purposes of being 'human readable' because if the
> whitespace is important then what the human reads (after pretty printing)
> will be different to what a machine reads.

Agreed, but the rules are more complex when we take schemas into account.
Adding whitespace may not be allowed even in mixed content.

> I agree that the spec is somewhat grey. However there is no implication
> that the transformation is allowed to make the document invalid - so
> should you play it safe?

The fact that the spec says it does not define this suggests to me that
we can use either approach.

>
> wrt to the difficulty.
> Agree it could be hard. However, seeing as the parser can determine the
> result for DOMText::getIsWhitespaceInElementContent, why can't the parser
> figure out if a Text node can be added with this attribute set? Is that a
> naive question ?;-).

It's a matter of schema validation as well. We cannot currently perform
validation on a DOM tree. We have to serialise and reparse. Then we have
problems like creating invalid documents.

> I would have thought this sort of thing would have to be sorted with
> canonicalisation anyway ...

I know more about the schema stuff so I will take it from that
perspective. Each of the schema types (and therefore derived types etc.)
has a defined canonical representation. If we implemented the ability to
print this out using a method like printCanonicalRepresentation then I
don't see why we could not also write a method (that may take some
params) called prettyPrintCanonicalRepresentation. Implementing this, in
addition to binding the schema type info to the elements/attributes
(something I am currently working on) and then having the SchemaGrammar
available to find the validators, would, I think, be sufficient.
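To make "canonical representation" concrete, here is a rough sketch for
two built-in schema types, written in Python rather than Xerces-C++ so it
stands alone. The function names are hypothetical and are not part of any
Xerces API. It assumes the XML Schema datatype rules that xsd:boolean has
the canonical forms "true" and "false", and that xsd:integer drops a
leading "+" and leading zeros.

```python
import re

# XML whitespace characters; both types use the whiteSpace facet "collapse",
# and since neither allows internal whitespace, trimming the ends is enough.
_XML_WS = " \t\r\n"
_INTEGER = re.compile(r"[+-]?[0-9]+")


def canonical_boolean(lexical: str) -> str:
    """Map any valid xsd:boolean lexical form to 'true' or 'false'."""
    value = lexical.strip(_XML_WS)
    if value in ("true", "1"):
        return "true"
    if value in ("false", "0"):
        return "false"
    raise ValueError(f"not a valid xsd:boolean: {lexical!r}")


def canonical_integer(lexical: str) -> str:
    """Map any valid xsd:integer lexical form to its canonical form."""
    value = lexical.strip(_XML_WS)
    if not _INTEGER.fullmatch(value):
        raise ValueError(f"not a valid xsd:integer: {lexical!r}")
    # int() discards the '+' sign and leading zeros; '-0' becomes '0'.
    return str(int(value))
```

A real implementation would dispatch on the type bound to each element or
attribute and ask that type's validator for the canonical form, which is
why the type binding and the SchemaGrammar both need to be available.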
This is some work, and we are still left with problems such as what to do
with manipulated trees that are now invalid.

>
> wrt to two features.
> Feature "format-pretty-print", that plays it safe and maintains validity.
> Feature
> "http://apache.org/xml/features/format-pretty-print-no-grammar-check" that
> takes a punt according to rules such as those below.

Send a message suggesting this to [EMAIL PROTECTED] They do reply. They
have probably discussed this. I would be interested in the response.

> 2./
> If people do want particular formatting then maybe it is not a big deal for
> them to iterate over the DOM and insert text nodes (ie ignorable
> whitespace) as appropriate for their particular needs.

You would have to clone the tree first because you may want to use the
original tree afterwards.

> 3./
> Maybe having format-canonical implemented would suit people who want
> 'human readability' but need to maintain validity. 'Fraid I'm not up on
> the Canonical XML spec and/or status.

I think what you suggest definitely has benefits, I just don't think it
fits with the normal pretty print. I agree with you that it might be nice
to have 2 features. Let's see what the DOM WG says.

Gareth

-- 
Gareth Reakes, Head of Product Development
DecisionSoft Ltd.            http://www.decisionsoft.com
Office: +44 (0) 1865 203192
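As a postscript to point 2 (walking the DOM and inserting whitespace text
nodes into a clone), here is a minimal sketch. It uses Python's
xml.dom.minidom as a stand-in for the Xerces-C++ DOM, and the helper names
(looks_mixed, indent, pretty_copy) are hypothetical. With no grammar
available, the mixed-content check is only a guess: an element is treated
as mixed if it currently has a non-whitespace text child.

```python
from xml.dom import minidom, Node


def looks_mixed(elem):
    """Heuristic only: any non-whitespace text child suggests mixed content."""
    return any(c.nodeType == Node.TEXT_NODE and c.data.strip()
               for c in elem.childNodes)


def indent(elem, depth=0, step="  "):
    """Insert whitespace text nodes around child elements, in place."""
    kids = [c for c in elem.childNodes if c.nodeType == Node.ELEMENT_NODE]
    if not kids or looks_mixed(elem):
        return  # leaf or apparently mixed content: leave it untouched
    doc = elem.ownerDocument
    for kid in kids:
        elem.insertBefore(doc.createTextNode("\n" + step * (depth + 1)), kid)
        indent(kid, depth + 1, step)
    elem.appendChild(doc.createTextNode("\n" + step * depth))


def pretty_copy(root):
    """Indent a deep clone so the caller's tree stays unchanged."""
    copy = root.cloneNode(True)
    indent(copy)
    return copy


doc = minidom.parseString("<a><b>x<i>y</i>z</b><c><d/></c></a>")
print(pretty_copy(doc.documentElement).toxml())
```

The guess fails for an element that is declared mixed but happens to
contain only child elements at the moment, which is the case where a
grammar check would be needed to stay safe.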
