Re: Cleaning House from Maciej Stachowiak on 2007-05-06 (www-html@w3.org from May 2007)

From: Maciej Stachowiak <mjs@apple.com>
Date: Sun, 6 May 2007 03:52:45 -0700
To: Philip & Le Khanh <Philip-and-LeKhanh@Royal-Tunbridge-Wells.Org>
Cc: www-html@w3.org, public-html@w3.org
Message-Id: <656144D0-2EEF-448C-9D7C-6297C84D759D@apple.com>

On May 6, 2007, at 3:32 AM, Philip & Le Khanh wrote:

> Maciej Stachowiak wrote:
>
> [...]
>
>> For HTML, there is no significant distinction in attested use  
>> between <em> and <i>. In practice they are used in the same kinds  
>> of contexts.
>
> There is no significant distinction in /uninformed/ attested use;
> those who actually care about accessibility, on the other hand,
> and who have bothered to read the guidelines, will use <em> where
> emphasis is required, restricting their usage of <i> to purely
> visual contexts where italicisation is required for presentational
> reasons.
>
> As a standards organisation, the W3C defines what /should/ be done,
> rather than merely rubber-stamping what is actually an artifact
> of uninformed usage, poor tools, and a lack of concern for  
> accessibility.

Yes, that's exactly the difference between prescriptivism and
descriptivism in linguistics. That's why contributors to this thread
are often talking past each other. Some think it is the spec's  
definition of what you /should/ do, regardless of current practice,  
that defines semantics. Others say that actual use effectively  
defines the semantics.

In this regard, a spec is like a dictionary. It's our chance to say  
what we think should be used.

But consider what semantic markup is intended to be used for.

Semantic markup is often promoted partly on the basis that it may  
allow for more kinds of machine reasoning about web content. But to  
apply any kind of useful machine reasoning to the web based on markup  
tags, you have to look at how the tags are actually used. After all,  
systems that do machine reasoning on human natural language generally  
work with a corpus, not a dictionary. And unfortunately, there is no  
machine-computable way to determine if a page was generated by  
informed or uninformed authors and tools.

So it seems that advocates of semantic markup for machine reasoning
purposes should favor descriptivist semantics, since such reasoning is
easier when the spec matches actual use. This would include things
like making the definitions of popular tags better match their use
and adopting commonly used class and rel values as predefined.

Semantic markup for accessibility reasons is a separate issue, but is  
probably also best addressed with a descriptivist approach. After  
all, users want to access the content that actually exists, not the  
content that /should/ be there.

Regards,
Maciej

Received on Sunday, 6 May 2007 10:52:52 UTC