Hi, Philip-
Philip TAYLOR wrote:
A browser does not need a doctype to be present to define for it the
HTML dialect. So your last sentence could just as well be written,
"Even without a doctype, a document can be parsed and converted
into meaning and/or rendering".
And ungrammatical utterances can still communicate meaning.
But they usually do so with a marked lack of precision, whence
the need for punctuation words such as "like" and punctuation
phrases such as "you know what I'm saying". The whole point
is that the meaning/rendering of "an arbitrary mixture of
angle brackets, ampersands, semi-colons and prose" can only be
guessed at; with a DOCTYPE, it is known.
Even were I to agree with your larger point (I'm not sure I do), you're
committing a fallacy here. You want there to be a DOCTYPE because it serves
as an identifier of a particular instantiation of a strict grammar, but
one of your criteria for that strict grammar is that it has a DOCTYPE.
DOCTYPE, ipso facto, is not necessary to satisfy your requirement; the
presence of an identifier is. That identifier could be a DOCTYPE, or it
could be a namespace in the root (as SVG uses), or it could even be a
validation of the set of elements that serves as a hash (though the
latter is not very robust).
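A rough sketch of what I mean by "the presence of an identifier", in Python. This is only an illustration under my own assumptions: the function name identify_grammar and the regex-based scanning are made up for this email and are not how any real browser sniffs content. It tries a DOCTYPE first, then a namespace on the root element, then falls back to the root element's name.

```python
import re

# The two namespace URIs below are the real SVG and XHTML namespaces;
# everything else in this sketch is hypothetical.
SVG_NS = "http://www.w3.org/2000/svg"
XHTML_NS = "http://www.w3.org/1999/xhtml"
KNOWN_NAMESPACES = {SVG_NS: "svg", XHTML_NS: "html"}


def identify_grammar(source):
    """Return (grammar, evidence) for a document, using whichever identifier is present."""
    # 1. A DOCTYPE, if the author supplied one.
    doctype = re.search(r"<!DOCTYPE\s+([A-Za-z][\w:.-]*)", source, re.IGNORECASE)
    if doctype:
        return doctype.group(1).lower(), "doctype"

    # 2. Otherwise look at the first element start tag (the root).
    root = re.search(r"<([A-Za-z][\w:.-]*)([^>]*)>", source)
    if root is None:
        return None, "unknown"
    name, attrs = root.group(1), root.group(2)

    # 3. A default namespace declared on the root, as SVG uses.
    ns = re.search(r"""\bxmlns\s*=\s*["']([^"']*)["']""", attrs)
    if ns:
        uri = ns.group(1)
        return KNOWN_NAMESPACES.get(uri, uri), "namespace"

    # 4. Last resort: infer from the root element name alone.
    return name.lower(), "root element"
```

Called on '<svg xmlns="http://www.w3.org/2000/svg"></svg>' this reports the grammar as "svg" on the evidence of the namespace, with no DOCTYPE in sight; called on a bare '<html>...' it falls through to the root element name.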
To illustrate: you don't require that everyone who speaks to you first
identifies the spoken language that they will use. You identify it by
the lexicon and the grammar. If you speak more than one language
(English and French), you are capable of using the linguistic signature
of the speech to decide if you can understand that language (it might be
Spanish or Chinese), and to correctly interpret it if so. Sure, people
are much more powerful than computers in this respect, but in the real
world, there are only a very few grammars that begin <html ...>;
browsers can do some inferring.
In fact, it's not even absolutely necessary that the identifier be
unique, only that it serve to differentiate on points where there may be
confusion between e.g. two similar grammars. To continue the earlier
example, to distinguish between American and British English, the
spelling or pronunciation of a single word is enough (a shibboleth); by
analogy, the presence or absence of any given non-shared element or
attribute could be enough to distinguish between grammars. But that's
going too far, and doesn't bode well for extensibility and mixed
namespaces.
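For the curious, the shibboleth idea can be sketched in a few lines of Python. The two vocabularies here are deliberately tiny and entirely hypothetical (they are not real DTDs), and guess_grammar is a name I made up for the example.

```python
import re

# Hypothetical, deliberately tiny element vocabularies for two similar grammars.
GRAMMAR_A = {"html", "body", "p", "font", "center"}
GRAMMAR_B = {"html", "body", "p", "section", "article"}


def guess_grammar(source):
    """Guess "A" or "B" from elements that only one grammar defines, else None."""
    # Collect the names of all start tags used in the document.
    used = {name.lower() for name in re.findall(r"<([A-Za-z][\w-]*)", source)}

    # The shibboleths are the elements the two grammars do not share.
    hits_a = used & (GRAMMAR_A - GRAMMAR_B)
    hits_b = used & (GRAMMAR_B - GRAMMAR_A)

    if hits_a and not hits_b:
        return "A"
    if hits_b and not hits_a:
        return "B"
    # No distinguishing element, or elements from both: we cannot tell.
    return None
```

A single <center> is enough to pick "A", but a document that mixes <center> and <section> comes back as None, which is exactly the extensibility and mixed-vocabulary problem.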
I'm not calling for an abolition of identifiers; I'm just saying that
DOCTYPE doesn't have to be the one true identifier.
(I'm going to suppress my linguistic training, and not hold forth on
description vs. prescription.)
You know what I'm saying?
Regards-
-Doug Schepers (unofficially)
W3C Staff Contact, SVG, CDF, and WebAPI