Vlad Alexander (XStandard) wrote:
User agents come and go, so how one browser parses markup is so trivial in the larger scheme of things. What is really important is content. If people write content in HTML they are creating legacy data because it is not easily parsable from a content management perspective.

Yes it is; it just requires SGML tools instead of XML tools. This all comes down to using the right tool for the job.

Content written in HTML cannot easily be re-purposed. If you have 1,000 documents and you want to change some markup in all of them, it is very difficult to do this if these documents are in HTML. If the documents are in XML (XHTML), then this is a trivial task using off-the-shelf technologies like DOM/SAX parsers or XSLT.
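To make the XHTML half of that claim concrete, here is a minimal sketch of a DOM-style batch change using only Python's standard library. The function name `rename_elements` and the choice of renaming `b` to `strong` are illustrative assumptions, not something from the original discussion, and the sketch assumes the input is well-formed XHTML with no undeclared named entities.

```python
import xml.etree.ElementTree as ET

XHTML = "http://www.w3.org/1999/xhtml"
# Serialize XHTML elements with a default namespace instead of an "ns0:" prefix.
ET.register_namespace("", XHTML)


def rename_elements(xhtml_source: str, old: str, new: str) -> str:
    """Rename every <old> element to <new> in one XHTML document (hypothetical helper)."""
    # fromstring raises ET.ParseError if the document is not well-formed.
    root = ET.fromstring(xhtml_source)
    for elem in root.iter(f"{{{XHTML}}}{old}"):
        elem.tag = f"{{{XHTML}}}{new}"
    return ET.tostring(root, encoding="unicode")
```

Applying this to 1,000 documents would just be a loop over the files that calls `rename_elements` on each one and writes the result back.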

The same is true of HTML; it just requires that you use SGML tools to process it rather than XML tools. SGML tools have been available for much longer than XML tools; they're just not so widely deployed, because HTML is rarely treated as an application of SGML anyway.

Since, as you say, it's trivial to use such tools for XHTML, it's also trivial to convert from XHTML to HTML 4 on the fly using XSLT or some other method.
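As a rough illustration of the "some other method" route, the sketch below parses XHTML with an XML parser and re-serializes it using HTML rules, again with only Python's standard library. It is not XSLT, the name `xhtml_to_html` is a hypothetical helper, and it assumes well-formed input without a DOCTYPE or named entities such as `&nbsp;`; a real pipeline would also need to emit an HTML 4 DOCTYPE.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "{http://www.w3.org/1999/xhtml}"


def xhtml_to_html(xhtml_source: str) -> str:
    """Parse XHTML as XML and serialize it with HTML syntax (hypothetical helper)."""
    # An XML parser rejects anything that is not well-formed.
    root = ET.fromstring(xhtml_source)
    # Drop the XHTML namespace so the output uses plain HTML element names.
    for elem in root.iter():
        if elem.tag.startswith(XHTML_NS):
            elem.tag = elem.tag[len(XHTML_NS):]
    # method="html" writes empty elements such as <br> without an end tag or "/>".
    return ET.tostring(root, encoding="unicode", method="html")
```

For example, `<br/>` in the source comes out as `<br>`, while ordinary elements keep their start and end tags.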

So we need to start writing content in XML and if it's content destined for the Web, then XHTML is perfect. The next step is: if you write it in XHTML, then why not serve it in XHTML (even if right now it's still processed by some current browsers as HTML).

Such use cases require XML tools, with a CMS that uses such tools to guarantee well-formed input and output. They also require that the author be competent enough to develop and test in a completely XML environment, even if the result is delivered to the world as text/html.

I do agree that XHTML on the back end does have significant authoring benefits for those experienced and competent enough to take advantage of them, but we're talking about beginners who are unlikely to have such tools at their disposal and are extremely likely to be developing and testing in an HTML environment. As I have said many times, learning XHTML that way is not a good idea, and it is the responsibility of those of us teaching it to make sure it is learned correctly, not incorrectly as you seem to be pushing.

Additionally, how many commonly used, off-the-shelf CMSs that claim to output XHTML as text/html, or in fact any CMS regardless of its output, actually make use of XML tools? WordPress certainly doesn't; it uses string substitutions and doesn't guarantee well-formed output, and neither do others such as MovableType, Blogger, etc.

I challenge you to name several readily available off-the-shelf CMSs that actually do make use of XML tools. As of yet, I have not found any that do, let alone guarantee 100% well-formed output.
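The difference between string substitution and XML tools can be sketched as follows. This is a toy illustration in Python, not how WordPress or any other CMS is actually implemented; the function names and the comment template are hypothetical.

```python
import xml.etree.ElementTree as ET


def is_well_formed(markup: str) -> bool:
    """Return True if an XML parser accepts the markup."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


def render_by_substitution(comment: str) -> str:
    # Naive template: the text is pasted in verbatim, so a stray "&" or "<"
    # ends up in the output unescaped.
    return '<div class="comment"><p>%s</p></div>' % comment


def render_with_xml_tools(comment: str) -> str:
    # Build a tree and let the serializer escape the text content.
    div = ET.Element("div", {"class": "comment"})
    p = ET.SubElement(div, "p")
    p.text = comment
    return ET.tostring(div, encoding="unicode")
```

With an input such as `AT&T says 1 < 2`, the substitution version produces output that an XML parser rejects, while the tree-based version escapes the text and stays well-formed.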

--
Lachlan Hunt
http://lachy.id.au/
******************************************************
The discussion list for  http://webstandardsgroup.org/

See http://webstandardsgroup.org/mail/guidelines.cfm
for some hints on posting to the list & getting help
******************************************************
