Bert Doorn wrote:
I code in xhtml Strict and serve it as text/html. My code is future-proof, valid and well structured.

Future proof from what? Do you really think any browser will ever drop support for HTML4?

If I code in HTML4, there is less "need" for writing properly structured 
documents.

Rubbish. It's just as important to write properly structured HTML as it is XHTML. The only difference is that when a document is served as XML, UAs will just give up at the first well-formedness error, whereas tag soup parsers are much more lenient. In fact, it's that lenience you're relying on when serving XHTML as text/html.

If at some point in the future browsers understand xhtml served as xhtml, changing the way it's served is a relatively simple operation. Re-coding from HTML to xhtml (and unlearning bad coding habits) is not as simple.

What bad coding habits are there that can be learned in HTML that can't be learned in XHTML as text/html?

Plus, I'm sure you've read Ian Hickson's "Serving XHTML as
text/html considered harmful" article?!

One man's view, based on an assumption that people will write xhtml tagsoup. Even if they do, they will find out soon enough.

How many sites claiming to be XHTML, but served as text/html, do you honestly think will survive the transition to application/xhtml+xml?

I can't speak for others, but I write proper xhtml, not html tagsoup translated to xhtml.

You might, but do you really expect beginners to when they're not seeing and learning from the actual results of an XML parser?

I think we've had a thread about this article already, so will leave it there.

Yes, the issue has been discussed to death in many forums, newsgroups and mailing lists, but there are clearly still some people that haven't got the message.

In the case of IE and XHTML, there isn't even limited support
for it, there's none at all.


While technically correct, it is misleading, particularly for
newbies, who might read it as "don't code in xhtml - people with
MSIE will not be able to view your site".

What's wrong with teaching them that? The fact is XHTML really should not be served as text/html, despite what the joke that is Appendix C says. (Also note that most sites claiming to be XHTML but served as text/html don't conform to Appendix C.)

It's not true if the page is served as text/html.

If the page is served as text/html, then you're not really using XHTML, despite what the DOCTYPE says and what the syntax used in the file may look like.

I think it's important for beginners to learn correctly from
the beginning.

Exactly. Teach them properly structured xhtml 1.0 and serve it in a MIME type that the browsers people use can work with.

Since when does using XHTML correctly involve using the wrong MIME type?

Ready to reap the benefits of X(HT)ML later, when browsers support it.

In theory, that sounds good. But the reality is that unless you actually developed the page and tested it under XML conditions, it's often not that simple. Here's a brief overview of the serious problems many people will encounter when they first serve their XHTML as application/xhtml+xml after only ever testing it as text/html:

Scripts
* If you've used this very common and outdated comment trick, the script
  will be hidden from XHTML UAs:

  <script type="text/javascript"><!-- // Hide from obsolete UAs
  //--></script>

  (The same applies to <style>, but it's not as common.)
  You need to use this (or a variation of it) instead:

  <script type="text/javascript">/*<![CDATA[*/
  /*]]>*/</script>

* document.write() and .writeln() will not work

* innerHTML will not work in some XHTML UAs
  (FF 1.0 didn't support it, but 1.5 now does.  Opera allows ill-formed
   innerHTML strings in XHTML documents, FF 1.5 doesn't)

* You need to use namespace-aware DOM methods (such as
  createElementNS() instead of createElement()), where applicable.

* element.tagName, and other similar properties/methods return uppercase
  for HTML, lowercase for XHTML.
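
The first point above can be checked with any XML parser. As a rough sketch only: Python's standard xml.etree.ElementTree stands in here for an XHTML UA's parser, and the helper name script_text and the alert() body are made up for illustration. The comment-wrapped script body is discarded as a comment, while the CDATA-wrapped body survives as text:

```python
import xml.etree.ElementTree as ET

# Old trick: the script body sits inside what XML treats as a real comment.
COMMENT_HIDDEN = (
    '<script type="text/javascript"><!-- // Hide from obsolete UAs\n'
    "alert('hello');\n"
    "//--></script>"
)

# CDATA variant: JavaScript sees /*...*/ comments, XML sees a CDATA section.
CDATA_WRAPPED = (
    '<script type="text/javascript">/*<![CDATA[*/\n'
    "alert('hello');\n"
    "/*]]>*/</script>"
)


def script_text(markup: str) -> str:
    """Return the text content an XML parser reports for a <script> element."""
    element = ET.fromstring(markup)
    # ElementTree drops comments by default, so a comment-only body has no text.
    return element.text or ""


print(repr(script_text(COMMENT_HIDDEN)))  # empty: the code was only a comment
print(repr(script_text(CDATA_WRAPPED)))   # still contains the alert() call
```

A real UA builds a DOM rather than an ElementTree, but the effect on the script is the same: there is nothing left to execute in the first case.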

CSS
* Case sensitivity of selectors:
  P { /* will match HTML P elements, won't match in XHTML */ }

* Different treatment of the body element, can result in backgrounds
  applied differently.
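
As a toy model of the selector point (this is not how any browser implements selector matching, and the function name is invented for the example), a type selector compares case-insensitively against element names in text/html but exactly in XHTML:

```python
def type_selector_matches(selector: str, element_name: str, xml_mode: bool) -> bool:
    """Model how a CSS type selector is compared with an element name."""
    if xml_mode:
        # XHTML parsed as XML: names are case-sensitive, and elements are lowercase.
        return selector == element_name
    # text/html: element names are case-insensitive.
    return selector.lower() == element_name.lower()


print(type_selector_matches("P", "p", xml_mode=False))  # True
print(type_selector_matches("P", "p", xml_mode=True))   # False
```

Writing selectors in lowercase from the start sidesteps the difference in both modes.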

Markup
* Named entities will cause well-formedness errors in older Gecko based
  browsers that don't have the pseudo-DTD catalog, and any other
  non-validating XML parsers.

* The meta element doesn't work for specifying the encoding.  You need
  to use the XML declaration, specify it in a higher-level protocol
  like HTTP headers, or rely on the default of UTF-8 or UTF-16.

* Unencoded ampersands slip through very easily, and due to limitations
  in the W3C validator's XML support, not even it will catch all of
  them.  There are other similar well-formedness errors the validator
  won't catch either.  E.g. try validating <p>Barnes & Noble</p> in an
  XHTML document with the W3C validator.

* If you've used numeric character references from &#128; to &#159; and
  haven't validated with the W3C's or WDG's validator (and thus not
  seen the warning), they appear to work in text/html because they're
  commonly (and incorrectly) interpreted as though the Document
  Character Set is Windows-1252.  In XML they're interpreted correctly,
  as C1 control characters, and will either display nothing or a
  placeholder character.

  If you validated with a real XML validator (like that provided by
  PageValet, not the W3C's or WDG's patched SGML validators), you won't
  see either an error or a warning, since they're perfectly valid and
  well-formed.  They just don't mean what most people think they mean
  in text/html.
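
Several of these markup problems are easy to reproduce with a non-validating XML parser. The following is a minimal sketch, assuming Python's standard xml.etree.ElementTree (built on expat) as a stand-in for a UA's XML parser; the helper name xml_error is invented for the example:

```python
import xml.etree.ElementTree as ET


def xml_error(markup: str):
    """Return the parser's error message, or None if the markup is well-formed."""
    try:
        ET.fromstring(markup)
    except ET.ParseError as exc:
        return str(exc)
    return None


# An unencoded ampersand is a fatal well-formedness error.
print(xml_error("<p>Barnes & Noble</p>"))
print(xml_error("<p>Barnes &amp; Noble</p>"))  # None

# Without a DTD to declare it, a named entity such as &nbsp; is undefined.
print(xml_error("<p>a&nbsp;b</p>"))

# &#128; is well-formed, but it means U+0080 (a control character),
# not the euro sign that byte 0x80 stands for in Windows-1252.
as_xml = ET.fromstring("<p>&#128;</p>").text
as_windows_1252 = bytes([128]).decode("cp1252")
print(hex(ord(as_xml)), hex(ord(as_windows_1252)))
```

The last case is the subtle one: nothing fails, the character is simply not the one the author had in mind.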

I'm sure there are many more too; these are just the ones I can think of without bothering to look up a complete list.

Besides that, unless you've got a CMS that uses real XML tools and enforces well-formed input and output, once the client is in control of the content, it's extremely likely that the valid, well-formed XHTML you gave them won't remain that way for long. WordPress is a prime example of such a disastrous CMS: it accepts and publishes whatever rubbish you input, and since it serves pages as text/html by default, many users don't know or care.
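
To give an idea of what "enforces well-formed input" could look like, here is a hedged sketch of a gate a CMS might run before storing content. It is not taken from any real CMS; check_fragment, save_post and the list used as storage are all hypothetical, and it only checks well-formedness, not validity against the XHTML DTD:

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"


def check_fragment(fragment: str):
    """Return (True, None) if the fragment is well-formed, else (False, reason)."""
    # Wrap the fragment so several sibling elements still form one document.
    wrapped = f'<div xmlns="{XHTML_NS}">{fragment}</div>'
    try:
        ET.fromstring(wrapped)
    except ET.ParseError as exc:
        return False, str(exc)
    return True, None


def save_post(fragment: str, store: list) -> bool:
    """Store the fragment only if it passes the well-formedness check."""
    ok, _reason = check_fragment(fragment)
    if ok:
        store.append(fragment)
    return ok


posts = []
print(save_post("<p>Valid <em>markup</em></p><p>Second paragraph</p>", posts))  # True
print(save_post("<p>Unclosed <em>markup</p>", posts))  # False
print(len(posts))  # 1
```

Rejecting the input outright is the simplest policy; a friendlier system would show the author the parser's error message so they can fix the markup.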

While you may be competent enough to be aware of such issues and meticulously test your document as XHTML, beginners who are unaware of all of these won't have a chance.

--
Lachlan Hunt
http://lachy.id.au/

******************************************************
The discussion list for  http://webstandardsgroup.org/

See http://webstandardsgroup.org/mail/guidelines.cfm
for some hints on posting to the list & getting help
******************************************************
