Hi,
On Sep 25, 2007, at 18:35, Steven Faulkner wrote:
>>Those who add a bogus alt for validation are a subset of people who
>>include a bogus alt.
>and what size is this subset (who knows)
Presumably the population whose behavior is swayed by what is deemed
valid (i.e. syntactically correct) is significant enough for the
html4all group to be concerned about what validity says about the alt
attribute.
>why not develop a validator that looks for and fails the page if it
>has bogus alt?
Because it only leads to an arms-race-like escalation that causes more
junk to be served to users.
If I make the validator check that each image has an alt text that is
longer than the empty string, those generators that do not have an
alternative text available will be programmed to emit a bogus string
that is at least one character long.
If I make the validator check that each image has an alt text that is
longer than one character, those generators that do not have an
alternative text available will be programmed to emit a bogus string
that is at least two characters long.
And so on. (The point stays the same if you substitute another
heuristic for the length test.)
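To make the escalation concrete, here is a minimal sketch in Python of
the length heuristic described above (hypothetical code, not anything
from an actual validator):

```python
# Hypothetical sketch of the naive length heuristic described above;
# not code from any real validator.

def alt_looks_bogus(alt, min_length=1):
    """Flag an alt value that is missing or shorter than min_length."""
    return alt is None or len(alt) < min_length

# Each time the validator raises the bar, a generator that has no real
# alternative text just pads its bogus string past the new threshold:
assert alt_looks_bogus(None)                    # missing alt: caught
assert alt_looks_bogus("")                      # empty string: caught
assert not alt_looks_bogus("x")                 # one character defeats it
assert alt_looks_bogus("x", min_length=2)       # raise the bar...
assert not alt_looks_bogus("xx", min_length=2)  # ...and it is defeated again
```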
This is not only an issue with alt. There's non-alt precedent to this
kind of behavior. When HTML 4 said that paragraphs must not be empty,
people who saw value in emitting qualitatively empty paragraphs
started putting a no-break space in paragraphs that were
qualitatively empty. To address this, Hixie went ahead and defined a
concept of Significant Inline content to make a single no-break space
invalid in HTML5. Will that make people no longer see value in empty
paragraphs? My bet is on "No". They'll just generate something that
fools the new test. I've suggested to Hixie that instead of trying to
outsmart people who want to emit "bad" empty paragraphs, we stop the
escalation instead.
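As a rough sketch of the problem (my paraphrase of the idea, not the
spec's actual definition of Significant Inline content), a check along
these lines catches the no-break-space trick but is immediately fooled
by the next invisible character:

```python
import re

# Rough paraphrase: a paragraph whose text is only whitespace counts as
# qualitatively empty. In Python 3, \s already matches U+00A0 NO-BREAK
# SPACE for str patterns. This is not the spec's actual definition.
QUALITATIVELY_EMPTY = re.compile(r'\A\s*\Z')

def has_significant_content(text):
    return QUALITATIVELY_EMPTY.match(text) is None

assert not has_significant_content("")        # empty: caught by the old rule
assert not has_significant_content("\u00a0")  # no-break space: caught now
assert has_significant_content("\u200b")      # a zero-width space still fools it
```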
When HTML 4.01 Strict banned target='', it didn't make people stop
wanting to open links in new windows. Instead, it begat this:
http://www.alistapart.com/articles/popuplinks
Moral of the story: you can't use the concept of validity to stop
people from achieving the results they want. They just figure out a
less detectable way to do it, which means browsers have a harder time
offering countermeasures to users.
>using heuristics, that shouldn't be so hard should it?
Heuristics work when the uncooperative data sources are indifferent
to the heuristics (that is, they don't actively try to fool them).
This, presumably, would be the case if reasonable alt-related
heuristics were deployed in AT. Reading out the URI is so bad a
heuristic that you have effectively been arguing that authors should
defeat that particular heuristic.
Heuristics don't work when the uncooperative data sources are
actively hostile to the heuristic (i.e. try to fool it) *and* they
know what the heuristic is so that they *can* fool it. This is why
search engines keep their anti-SEO spam heuristics secret and complex
enough to be resilient to black-box reverse engineering.
Precedent suggests that in the case of validators, people will seek
to fool them if the concept of validity stands in the way of the
results they want and they have a requirement (for whatever reason)
to be valid.
Now I'm going to shortly follow up to John Foliot's email and then
excuse myself and write some validator software instead of talking
about it.
--
Henri Sivonen
[EMAIL PROTECTED]
http://hsivonen.iki.fi/