keeping out semantic facilities for theoretical purity from Robert Burns on 2007-08-31 (public-html@w3.org from August 2007)

From: Robert Burns <rob@robburns.com>
Date: Thu, 30 Aug 2007 22:00:10 -0500
To: "Ben 'Cerbera' Millard" <cerbera@projectcerbera.com>
Cc: "Philip Taylor (Webmaster)" <P.Taylor@Rhul.Ac.Uk>, "HTMLWG" <public-html@w3.org>
Message-Id: <6A091160-B69A-4ADF-88A4-608BE4401DC5@robburns.com>

Hi Ben,

On Aug 30, 2007, at 9:00 PM, Ben 'Cerbera' Millard wrote:

>
> Philip Taylor wrote:
>> Suppose that, within
>> a single document, I have instances of book titles, ship's
>> names, scientific names, and foreign words and phrases,
>> all of which are conventionally indicated in printed
>> English by the use of italics.
>
> If italicising this text is sufficient for sighted users to tell  
> they are "special" terms, why would unsighted users need anything  
> extra? ATs could read <i> in a slightly different voice. Or just  
> read it normally and let the surrounding context make clear it's a  
> special term, similar to what a sighted user of a monochrome  
> display would experience.
>
> The capitalisation of things like Mary Rose makes clear they are the  
> name of something (at least in English). No markup is necessary in  
> this case. ATs could modulate their inflections through the  
> sentence to indicate capitalisation changes as well as with  
> punctuation like commas and semi-colons. (From what I've read, the  
> more sophisticated ones already do this to some extent.)
>
> Foreign words can be marked up with an element that has a lang=""  
> attribute. Multi-lingual ATs already make use of that attribute,  
> although they have [bugs] with it.
>
> I think we should gather feedback from users over the coming years  
> to see what the real problems are. My suspicion is that super-fine  
> granularity of markup is an issue of theoretical purity and does  
> not affect real users in day-to-day browsing. But that is just a  
> suspicion.

To me it looks as though the stance of purging these semantics is the  
one more motivated by theoretical purity. In specifying a semantic  
language such as HTML, it's not theoretical purity that adds in the  
semantic facilities that authors use on a widespread and regular  
basis. Instead, it's theoretical purity that says these can all be  
reduced to one or two elements to keep the language more pure. By  
specifying facilities for proper nouns, foreign idioms or emphatic  
quotation, we do not force authors to use those facilities. However,  
when we omit them to prevent authors from using such facilities, it  
seems the only argument we could make in defense is that we were  
trying to preserve the theoretical purity of HTML. It's difficult to  
imagine what obstacles for authors would be created by including a  
few more semantic facilities in the language that some authors might  
prefer to use over SMALL, B, and I. I also cannot imagine how these  
additional facilities would create insurmountable obstacles for  
audible UAs and other AT.

Considering that these semantic replacements have worked well for such  
things as underline, strike-through and teletype, it's hard to imagine  
what theoretical purity we would offend by doing the same for bold,  
italics and smaller text.

Take care,
Rob

Received on Friday, 31 August 2007 03:01:05 UTC