
Re: HTML5 script start tag should select appropriate content model according to src

From: Henri Sivonen <hsivonen@iki.fi>
Date: Tue, 24 Apr 2007 11:34:23 +0300
Message-Id: <853EC51A-3D8A-4F22-B66E-49E72A6F0B28@iki.fi>
Cc: XHTML-Liste <www-html@w3.org>
To: Patrick H. Lauke <redux@splintered.co.uk>

On Apr 23, 2007, at 23:22, Patrick H. Lauke wrote:

> Henri Sivonen wrote:
>> It's the same thing on other visual media, including screens, when  
>> the semantics are presented by italicizing. It's not like J.  
>> Random reader views source to see if a given run of text was  
>> marked up as <i>, <em>, <cite>, <dfn> or <var>.
> ...
>> If the UA doesn't present the distinctions to the reader, marking up
>> semantics is useless as far as the human reader is concerned.
> So, it's a shortcoming in user agent support.

Only if you assume that people reading from screens need more  
disambiguation than people reading from paper.

> Moving beyond the visual, screen readers for instance can  
> (depending on settings) differentiate between <i> and <em>, and  
> treat them differently (the latter resulting in a change of volume  
> and/or inflection of the spoken output).

I don't have personal experience to comment on that, but I wasn't  
surprised about T. V. Raman having the same rule for both <i> and <em>:

After all, there are a lot of notable creation tools that map italics  
to <em>. (See below.)

>> It isn't particularly useful to try to make moral right/wrong  
>> arguments about the behavior of Web authors on the aggregate. To  
>> get the masses to do something, there need to be good incentives.  
>> There's no point in bearing the cost of marking everything up  
>> diligently if there isn't a payoff that is reasonable compared to  
>> the cost.
>> Honestly, I can't make the case to my mother why she should bother  
>> to mark up anything as <cite> instead of just pressing command-i  
>> in Dreamweaver.
> The masses will use authoring tools/environments. As long as those  
> tools offer access to <i>, but not to the more semantic  
> alternatives, it's obviously futile to expect the masses to use  
> more appropriate markup.

Dreamweaver MX, by default, maps command-i to <em>. I guess it makes  
the output "more semantic" to some. Of course, as far as markup  
consumers are concerned, it makes sense to treat <em> as an alias for <i>.

> The payoff is the usual chicken and egg conundrum: tools to further  
> extract and manipulate semantic data can be built right here, right  
> now, but until a sizeable amount of web content out there is  
> actually semantic, they won't be built...and vice versa.

Yes, but some of the theoretical use cases just aren't realistic for  
productization anyway and in some cases heuristics would work more  

> This is the argument you hear from AT manufacturers when they say  
> that their tools rely on heuristics for many things, rather than  
> structural markup.

The key question is whether the heuristics work well enough and what  
the marginal benefit of more authoring effort would be. I don't know  
if they work well enough. But if they do, what's the problem?

>> It isn't the same. Headings are more common than e.g. taxonomical  
>> names and are related to things like intra-document navigation  
>> using outlines, etc. Therefore, it is quite reasonable to include  
>> markup for headings but leave markup for taxonomical names on the  
>> other side of the cutoff.
> Hmmm...sounds like we may need a markup language that is  
> extensible, as the cutoff point may be different for different  
> audiences/purposes.

HTML5 allows the class attribute to be used for communicating  
granular semantics within special-interest communities.

>> No, don't ask them. See what they actually do. In the latter case,  
>> there is actually an HTML element (<dfn>), so the usage frequency  
>> could be measured.
> Is <dfn> readily and clearly available in authoring environments?

In OpenOffice.org Writer/Web it is, for example. Not as conveniently  
as command-i, though, of course.

Anyway, <dfn> has been available in the HTML spec for years so  
technical writers who see its value could use it if they cared to.
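
For what it's worth, the "see what they actually do" kind of measurement
mentioned above could be approximated on a small scale along these lines.
This is only a rough sketch using Python's standard html.parser module; the
names TagCollector, pages_using_each_tag and sample_pages are made up for
illustration, and the three sample pages are toy data, not survey data. It
counts the number of pages on which an element appears at least once, not
the total number of occurrences.

```python
from collections import Counter
from html.parser import HTMLParser

# Elements whose per-page usage we want to compare (illustrative choice).
TAGS_OF_INTEREST = {"i", "em", "cite", "dfn", "var"}


class TagCollector(HTMLParser):
    """Collects the set of interesting element names seen in one page."""

    def __init__(self):
        super().__init__()
        self.seen = set()

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag names before calling this method.
        if tag in TAGS_OF_INTEREST:
            self.seen.add(tag)


def pages_using_each_tag(pages):
    """Return a Counter mapping tag name -> number of pages that use it."""
    counts = Counter()
    for html in pages:
        collector = TagCollector()
        collector.feed(html)
        collector.close()
        # A page contributes at most 1 per tag, however often it uses it.
        counts.update(collector.seen)
    return counts


# Toy input standing in for a crawl (hypothetical data).
sample_pages = [
    "<p>An <i>italic</i> and an <em>emphasized</em> word.</p>",
    "<p><I>Moby Dick</I> is a book title.</p>",
    "<p>A <dfn>term</dfn> defined with <i>care</i>.</p>",
]
counts = pages_using_each_tag(sample_pages)
print(counts["i"], counts["em"], counts["dfn"])
```

Dividing one count by another gives the same kind of ratio as a statement
of the form "used on N times as many pages as <dfn>".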

According to a survey of several billion pages done at Google in  
September, <i> is used on 178 times as many pages as <dfn>, and <em>  
is used on 80 times as many pages as <dfn>.  
Curiously, <dfn> is used on a larger number of pages than <var>, even  
though e.g. Nvu makes only the latter available in the UI as far as I  
can tell. But the most interesting statistic is that <dfn> is used on  
fewer pages than <zeroboard> (don't ask), <st1:place> (no clue) and  

>> Even though the editor of the spec may mine this mailing list for  
>> feedback from time to time and even though Lachy and I are now  
>> engaging in this thread, posting to the WHATWG list is still a  
>> better way to get heard.
> The fundamental discussions around whether or not <i>, <b>, <sub>,  
> <sup> etc are presentational or not have been going around for  
> years...not just in this particular thread.

Since 1993 if not 1992. From the 1993 IIIR draft:

          This text contains an <em>emphasized</em> word.
          <strong>Don't assume</strong> that it will be italic!
          It was made using the <CODE>EM</CODE> element. A citation is
          typically italic and has no formal necessary structure:
          <cite>Moby Dick</cite> is a book title.


Henri Sivonen
Received on Tuesday, 24 April 2007 08:34:27 UTC
