RE: HTML Action Item 54 - ...draft text for HTML 5 spec to require producers/authors to include @alt on img elements.

-----Original Message-----
From: Ian Hickson [mailto:ian@hixie.ch] 
Sent: Friday, May 09, 2008 8:31 PM
To: Justin James; John Foliot
Cc: public-html@w3.org
Subject: RE: HTML Action Item 54 - ...draft text for HTML 5 spec to require producers/authors to include @alt on img elements.

> I respectfully disagree. I think it's very important that we define 
> elements to have clear semantics and that we require that those semantics 
> not be misused.

How do you propose requiring that the semantics not be misused, if conformance to the spec cannot be 100% machine verified? I ask not just for "totalitarian" reasons, but also because the overwhelming majority of people operating software that produces HTML output are unaware of the spec's contents. The *only* way to require that the semantics be used properly (and to have a shot at this actually happening, even at a moderate level) is to make their usage machine verifiable.

> I think getting rid of the accessibility goal would be terrible. 
> Discriminating against people simply based on which media they use is 
> immoral and short-sighted.

I agree. So we need to find a way to re-write this specification to ensure that even the semantics are machine verifiable, because without that, we will never meet the accessibility goal.

> The semantics of <p> can be determined accurately and reliably. It means 
> "paragraph". What can't be determined accurately and reliably is whether 
> the author actually put a paragraph there.

I disagree. Using <p> instead of <div> is merely a claim by the document that the tag's contents represent a paragraph as opposed to a generic division. The semantics are not guaranteed by the use of <p> (or any other semantic tag). Let's get real. So long as the Web browser displays the contents of the block such that it is clearly a paragraph, the user does not care what the underlying tag is. But a user agent that is trying to "grok" the semantics cares a lot. So when it comes across <p><img src="junk.jpg" alt="My picture!" height="200" width="200"/></p>, it doesn't "think" to itself, "gee, they used <p> to set the image apart." Instead, the semantics it sees are those of an empty paragraph (or, potentially, a paragraph with an image, or possibly a paragraph with the text contents "My picture!", depending on how the user agent wants to deal with @alt and a <p> tag that is devoid of text).

Or to rephrase, it is logically irrelevant to declare semantics, and have the existence of a semantic tag be verified, if the contents of the tag do not meet the proper usage. Why bother having <p> at all if it is not guaranteed to be what it claims to be?

> We do do that with attributes. We have alt="", title="", src="", class="", 
> lang="", and many more.

That list is hardly exhaustive. The list of semantic attributes for an image alone would probably rival the list of semantic tags for blocks, simply because of the flexibility of image usage. Furthermore, the role of an image (or any other tag) should toggle the mandatory status of certain attributes. For example:

<img src="long_text.jpg" role="text" transcript="The text that is contained within the image." height="200" width="200" />

In this case, using a @role of "text" would make @alt non-mandatory, but make @transcript mandatory.

Example #2:

<div role="paragraph">Text goes here.</div>

In this case, the role of "paragraph" mandates that there be textual content within the tag.

And so on.
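
To make the two examples above concrete, here is a rough sketch of what a machine check for such rules could look like. It uses Python's standard html.parser module. The rule table, the names RoleChecker and check, and the @role and @transcript attributes themselves are assumptions taken from my examples, not anything in the current draft. It only verifies what a machine can see (which attributes are present, and whether there is any text), so treat it as an illustration rather than a real conformance checker.

```python
from html.parser import HTMLParser

# Hypothetical rule table: attributes an <img> must carry for a given @role.
# An <img> with any other role (or no role) falls back to requiring @alt.
IMG_REQUIRED = {"text": {"transcript"}}
DEFAULT_IMG_REQUIRED = {"alt"}

# Elements that never have content, so they are not tracked as "open".
VOID_TAGS = {"img", "br", "hr", "input", "meta", "link"}


class RoleChecker(HTMLParser):
    """Collects violations of the role-based rules sketched above."""

    def __init__(self):
        super().__init__()
        self.errors = []
        self._open = []  # one [tag, role, has_text] entry per open element

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        role = attrs.get("role")
        if tag == "img":
            required = IMG_REQUIRED.get(role, DEFAULT_IMG_REQUIRED)
            for name in sorted(required - set(attrs)):
                self.errors.append(f"<img> with role {role!r} requires @{name}")
        if tag not in VOID_TAGS:
            self._open.append([tag, role, False])

    def handle_data(self, data):
        # Non-whitespace text counts as content for every open ancestor.
        if data.strip():
            for entry in self._open:
                entry[2] = True

    def handle_endtag(self, tag):
        # Sketch only: ignore void tags and mismatched closing tags.
        if tag in VOID_TAGS or not self._open or self._open[-1][0] != tag:
            return
        open_tag, role, has_text = self._open.pop()
        if role == "paragraph" and not has_text:
            self.errors.append(f"<{open_tag}> with role 'paragraph' has no textual content")


def check(html):
    """Return a list of rule violations found in the given HTML fragment."""
    parser = RoleChecker()
    parser.feed(html)
    parser.close()
    return parser.errors
```

Calling check() on either of the two examples above returns an empty list, while an <img> with role="text" and no @transcript, or a <div role="paragraph"> that contains only an image, each produce one error.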

Doing something like this is the only way I see (for the time being) of allowing (and forcing) hand-HTML writers and authoring tools to properly conform to the spec. People writing authoring tools will have to work hard in order to provide a UI that doesn't nag the users to death.

> There is no technical difference between <p> and <div purpose=paragraph>. 
> They are merely syntacticly different ways of conveying the same thing. 
> Neither one of them is machine-checkable.

At the moment, because <p> is effectively just a synonym for <div>, the two are technically identical. My point here is that if you want a semantic, accessible Web, you cannot continue to have tags that all function the same, just with different names, and a plea within the HTML spec to use them only in certain circumstances.

J.Ja

Received on Saturday, 10 May 2008 04:35:06 UTC