Re: Cleaning House

At 10:39 AM 5/7/2007 +0100, Benjamin Hawkes-Lewis wrote:
>We seem to be struggling over the word "forms". Let me pull back and try 
>restating my position. Sometimes bold and italic are used to distinguish a 
>phrase as more important than the surrounding text. Sometimes they are 
>used to demarcate a phrase as /different/ to surrounding text (e.g. ship 
>names, foreign phrases). In both cases they are forms of typographical 
>emphasis (Wikipedia's sense of emphasis). Only in the first case are they 
>expressions of stress emphasis (the common-usage sense of emphasis). The 
>question is which of those two definitions is relevant to <em>. Are you 
>with me so far?

I see the distinction that you are making between two different ways that
italics are used. I consider both to be forms of emphasis, but I do agree
that sometimes I want to highlight text because convention suggests that it
is good form to do so and my aim is to subtly signal the reader that this
phrase is different somehow, and at other times I want to catch the
reader's attention more overtly.

>Now its very existence suggests that <em> has some purpose beyond <i>; and 
>the early discussion from www-talk I quoted demonstrated that this 
>difference between stress emphasis and other uses of italic and bold was 
>recognized by the correspondents. So I don't think associating <em> with 
>the stress emphasis is unreasonable.

No, not unreasonable on the face of it, but misleading. I don't think that
most people think of <em> that way. They have been told to use <em> instead
of <i>, not with emphatic phrases, but because <em> is somehow 'more
semantic' than <i> and therefore its users are more virtuous. The problem is
now people who use <em> are just as likely to be using it virtuously as they
are to be trying to avoid hell.

Anyway, here we are with two elements <em> and <strong> which are somehow
supposed to express the range of emphatic phrases, i.e. emphasized and strongly
emphasized. Curious how there are just the two of them, corresponding to
italic and bold.

Curiouser yet that <u> can be interpreted as presentational and semantic 


1 : to draw a line under : UNDERLINE
2 : to make evident : EMPHASIZE, STRESS <arrived early to underscore the 
importance of the occasion>

So, we are discovering a range of stress emphasis:

         <em>      Emphasize or stress phrase
         <strong>  Strongly emphasize phrase
         <u>       Underscore (a form of stress emphasis)
         <b>       Boldly emphasize
         <small>   De-emphasize
         <strike>  De-emphasize


>That's not the same thing. A lot of HTML authors (though still probably 
>not the majority if you think about user-generated content and HTML email 
>authors) are aware that elements can be transformed by CSS. What most of 
>them don't think about is fallbacks and user settings, which is what I 
>thought you were talking about ("if bold and italic are unavailable or 
>undesirable"). Witness the general assumptions that people can see images, 
>can tell the difference between colors, have screens of a certain size, 
>use Internet Explorer or Firefox, have JavaScript enabled, etc.

Fallbacks are what make <i> and <b> more useful to me than <span>.
I can write <i class="shipName"> or <span class="shipName">.
As an author, I am going to use <i> every time because I rely on the
fallback to ensure that the presentation that I intended is preserved.
Fallbacks make <em> and <strong> equally useful because they
fall back to italic and bold. However, <b> and <i> are shorter.
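
To make the fallback concrete, here is a minimal sketch. The class name is
the one from the example above; the ship name and the style rule are made up
for illustration, and the rendering notes assume a typical visual browser.

```html
<!-- Hypothetical stylesheet: the presentation the author intends for ship names. -->
<style>
  .shipName { font-style: italic; }
</style>

<!-- With the stylesheet applied, both phrases render in italics. -->
<p>She sailed on the <i class="shipName">Cutty Sark</i>.</p>
<p>She sailed on the <span class="shipName">Cutty Sark</span>.</p>

<!-- If the stylesheet is missing or ignored, the <i> version is still
     italic by default, while the <span> version becomes plain text. -->
```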

I think that the point I was trying to make was that it was misleading
to claim that <i> could only be rendered with an italic typeface.
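
As a small illustration of that point, a stylesheet can remap <i> and <b>
away from italic and bold entirely. The colors and the sample sentence are
arbitrary; this is only a sketch, not a recommendation.

```html
<style>
  /* Hypothetical author or user stylesheet: drop the default typefaces
     and distinguish the two elements by color instead. */
  i { font-style: normal; color: green; }
  b { font-weight: normal; color: red; }
</style>

<p>The <i>Cutty Sark</i> is <b>not</b> open today.</p>
```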

What has come up through this discussion, but has not been followed up much,
is my suggestion that CLASS attributes (or pick your fave) could be used to
layer useful semantics onto primitive elements like <i>, <b>, and <span>.
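
A rough sketch of what I have in mind follows. The class names, the sample
text, and the style rules are invented for this example; nothing here is a
proposal for standard names.

```html
<style>
  /* Hypothetical rules keyed on the class tokens, not on the element alone. */
  i.shipName      { font-style: italic; }
  i.foreignPhrase { font-style: italic; }
  b.warning       { font-weight: bold; color: red; }
</style>

<p>
  The <i class="shipName">Cutty Sark</i> was her
  <i class="foreignPhrase" lang="fr">raison d'être</i>.
  <b class="warning">Do not board without a ticket.</b>
</p>
```

A reader, human or software, that knows the tokens can tell a ship name from
a foreign phrase; one that does not still gets the plain <i> and <b> fallback.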

I find that the presence of <em> offers me no advantage in understanding the
reason that some phrase was marked up. Context and precedent serve as much
better guides because the subtlety of <em> is lost on most authors.

But if you could give me a reliable way to encode useful semantic tokens into
HTML elements, then you would have something worth discussing.

What I am saying is "Stop picking on <b> and <i>, accusing them of being
non-semantic, when <em> and <strong> are barely semantic and certainly
not in a way that is proving especially useful to anyone."

>>I am qualified to say that you can redefine <b> to red and <i> to green
>>and aural and Braille readers can ignore or re-map them too.
>Of course. Although because <i> can be used without stress and <em> is 
>often misused without stress, many aural and braille remappings will be 
>erroneous. I doubt most authors know about such remappings though.

Let's give authors more and better choices then. Let's give them a way to say
what they mean. And if they want to say <i>, then let's let them. And if they
want to say <span class="stressEmphasis"> then let's let them do that too.

And no, I do not want to standardize CLASS names. I want a reliable way
to define profiles and have them understood, along with links to related 
And that may not scale perfectly, and yes there will be collisions, but it can
work quite well in controlled settings.


>  Maybe (since you were there) you'd care to recollect for us /why/ <em> 
> was introduced in the first place, seeing as <i> was already allowed to 
> fallback to non-italic representations?

Ha! Ha! Is your intent to draw attention to the fact that I am a dinosaur? :-)
I am actually just a bit younger than Tim Berners-Lee and older than Bill 

I cannot speak to the moment that a decision was made to include <i> and <em>.

I can only say that as an author and as a manager of writing departments,
I want to ensure that authors can write easily and freely, and come back
to encode with semantic tags or attributes later. It is a question of 

I think that there should be a full set of document publishing primitives.

With these in place, the rate of semantic pollution may decrease over time.

                         - 30 -

Received on Monday, 7 May 2007 20:51:07 UTC