[whatwg] Deprecating <small> , <b> ?

Tab Atkins Jr. wrote:
> On Tue, Nov 25, 2008 at 10:24 AM, Calogero Alex Baldacchino 
> <alex.baldacchino at email.it> wrote:
>     Of course that's possible, but, as you noticed too, only by
>     redefining the <small> semantics, and is not a best choice per se.
>     That's both because the original semantics for the <small> tag was
>     targeted to styling and nothing else (the html 4 document type
>     definitions declared it as a member of the fontstyle entity,
>     while, for instance, <strong> and <em> were parts of the phrase
>     entity), and because the term 'small', at first glance, suggests
>     the idea of a typographical function, regardless of any other
>     related concept which might be specific to English (or any other)
>     culture, but might not be as immediate for non-English
>     developers all around the world. As a consequence, since any
>     average developer could just rely on the old semantics, being
>     intuitively comfortable with it, the semantic redefinition could
>     meet a first objection: think of a word written with alternating
>     <b> and <small> letters, or just of a paragraph's first letter
>     highlighted by a <b>; obviously, applying the new semantics here
>     would be non-trivial (e.g. assistive software for blind users
>     would be fooled by this and give unpredictable results). Although
>     the previous use case would be a misuse of the <b> and <small>
>     markup, it would still be possible, meaning not prohibited, and
>     so creating a new element with proper semantics could be a better
>     choice.
> No matter *what* we do, if there *is* a default style for an element, 
> it will be misused by people.  This is a fact of life.  Defining a new 
> element which is identical to <small> in every way except that it 
> hasn't been misused *yet* is thus a mug's game, because it *will* be 
> misused in the same way as <small>, and then we just have two 
> identical elements for no reason.

I'll start with an example. Some time ago I played around with Opera
Voice. It seemed able to interpret visual style sheets, and
specifically font styles, so that bold or italic text (styled so in
the style sheet, not in the markup) was spoken differently from
'normal' text. But a paragraph's first letter styled differently from
the rest of the word (a fairly common typographical choice), as far as
I remember, caused the whole word to be skipped. This suggests to me
that if we really want 'cross-presentation' semantics, we have to keep
as far as we can from anything whose *main* semantics is typographical
(as has been the case for <small> and <b> since their birth). Every
language is somewhat prone to side effects caused by misuse (e.g. it
is possible to make a big mess in software written in a language that
allows passing a pointer to a function - there are tons of examples of
such language design issues - yet this can be a desirable capability),
but appropriate choices for both semantics and syntax may help to
reduce the likelihood of misuse.

I think that very likely both <b> and <small> will carry on their old
semantics, and so will be more prone to misuse with respect to their
new one, since very likely a lot of developers are, and will remain,
more comfortable with their original semantics, which is also
suggested by their names ('b' standing for 'bold', and 'small'... for
something small on the screen or on paper). A new element, instead,
would require the developer to make some effort at least to learn
about its existence, so he would read that the element's primary use
is to indicate a different importance of a piece of text, so that a
non-visual user agent can present it in an appropriate manner, and a
visual or print user agent can render it in different ways. Also, the
default style could be slightly or very different from that of
<small>, e.g. the text could be surrounded by parentheses or hyphens,
regardless of the font size (and the new element could be designed to
accept only non-empty strings consisting of more than one non-space
character).

> Yes, bad markup will foul up semantic agents.  But people will 
> *always* write bad markup.  At least with the semantic redefinition we 
> get to declare lots of usages that *are* appropriate to be conforming 
> without any effort on the author's part.
> And really, the type of people who would write a word with alternating 
> letters wrapped in <b> and <small> tags are hardly the kind to even 
> *care* about semantics.

Let me reverse this approach: what should an assistive user agent do
with such a <b>M</b><small>E</small><b>S</b><small>S</small>? I think
that treating that word as normal text would be a more graceful
degradation than discarding it, and if we clearly state that <b> and
<small> have only typographical semantics, while different elements
are provided to differentiate the degree of emphasis of a phrase, an
assistive user agent could support a better behaviour, and any author
disregarding semantics would not cause any trouble (the example of
alternating characters wrapped in <b> and <small> may be unrealistic,
but a paragraph could actually start with a bold and bigger first
letter using <b> and <font> instead of style sheets).
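
To make the degradation I have in mind a bit more concrete, here is a
minimal sketch in Python. It is only my own illustration, not the code
of any real user agent: the names SEMANTIC_EMPHASIS, SpeechRuns and
speech_runs are hypothetical, and I am assuming a u.a. that treats
<b>, <small> and <font> as purely typographical (so it ignores them)
and changes its voice only for <em> and <strong>.

```python
from html.parser import HTMLParser

# Assumption of this sketch: only these elements change how text is spoken.
# Anything else (<b>, <small>, <font>, ...) is treated as typographical
# and simply ignored, so its text is read as normal text.
SEMANTIC_EMPHASIS = {"em", "strong"}


class SpeechRuns(HTMLParser):
    """Collect (text, emphasized) runs for a hypothetical speech renderer."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.depth = 0   # how many <em>/<strong> elements are currently open
        self.runs = []   # list of (text, emphasized) tuples

    def handle_starttag(self, tag, attrs):
        if tag in SEMANTIC_EMPHASIS:
            self.depth += 1

    def handle_endtag(self, tag):
        if tag in SEMANTIC_EMPHASIS and self.depth > 0:
            self.depth -= 1

    def handle_data(self, data):
        emphasized = self.depth > 0
        # Merge with the previous run when the emphasis state is unchanged,
        # so a word split across <b>/<small> tags stays a single word.
        if self.runs and self.runs[-1][1] == emphasized:
            self.runs[-1] = (self.runs[-1][0] + data, emphasized)
        else:
            self.runs.append((data, emphasized))


def speech_runs(markup):
    parser = SpeechRuns()
    parser.feed(markup)
    parser.close()
    return parser.runs


print(speech_runs("<b>M</b><small>E</small><b>S</b><small>S</small>"))
print(speech_runs('<p><b><font size="5">O</font></b>nce <em>upon</em> a time</p>'))
```

With this approach the first call yields the single normal-text run
"MESS", and in the second one the bold, bigger first letter is merged
back into "Once", while only the <em> text is marked for a different
voice. A real u.a. would of course need much more than this (style
sheets, nesting rules and so on).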

>     But, you're right, we have to deal with backward compatibility,
>     and redefining the <small> and <b> semantics can be a good
>     compromise, since a new element would face some heavy concerns,
>     mainly related to rendering and to the state of the art
>     implementations in non-visual user agents (and the like).
>     However, I think that a solution, at least partial, can be found
>     for the rendering concern (and I'd push for this being done
>     anyway, since there are several new elements defined for HTML 5).
>     Most user agents are able to interpret a dtd to some extent, so
>     it could be worth the effort to define an html 5 specific dtd in
>     addition to the parsing rules - which aim to overcome all the
>     problems arising from previous dtd-only html specifications - so
>     that a browser that is not fully html5-compliant can somehow
>     interpret any new elements. The HTML 5 doctype declaration could
>     accept a dtd just for backward compatibility purposes, and any
>     fully compliant user agent would just ignore such a dtd. More
>     specifically, such a dtd could define default values for some
>     attributes, such as the style attribute (to have any new element
>     properly rendered - some assistive technologies are able to
>     interpret style sheets too), and, anyway, there should be a way,
>     in SGML, to create an alias for an element (e.g., a new element -
>     let's call it <incidental> - could be aliased to <small> for
>     better compatibility).
> Html5 is no longer an SGML language.

I know, and I agree with the basic reasons; however, I think that
deriving an SGML version (e.g. by adding new entities and elements, as
needed, to an html 4 dtd) should not be very difficult, and could be
worth the effort (e.g. to gracefully degrade the presentation of a
menu element meant as a context menu, whose content should not be
shown until a right click happens - if the u.a. cannot handle it, not
showing it at all could be a reasonable behaviour). The derived sgml
version should be aimed only at older browsers, while "newer", html
5-aware ones should just ignore any dtd reference. I'd consider this
option, at least in passing - I suspect that the complete break with
the earlier sgml specifications might carry an undesirable side
effect: on the one hand it solves the problems arising from partial or
bad sgml support and from browser-specific quirks, but on the other
hand no mechanism is provided to let sgml-based user agents gain any
awareness of the newly defined elements.

>     Let's come to the non-typographical interpretation that today's
>     u.a.s may be capable of, as in your example about lynx. This can
>     be a very good reason to deem <small> a very good choice. But are
>     we sure that *every* existing user agent can do that? If the
>     answer is yes, we can stop here: <small> is a perfect choice.
>     Better: <small> is all we need, so let's stop bothering each
>     other about this matter. But if the answer is no, we have to face
>     a number of user agents needing an update to understand the new
>     semantics for the <small> tag, and so, if the new semantics can
>     be assumed to be *surely* reliable only with new/updated u.a.s
>     (that is, with those fully compatible with the html 5
>     specifications), that's somewhat like starting from scratch, and
>     consequently there is room for a new, more appropriate element.
> I don't understand.  If some obscure UA can't extract an appropriate 
> meaning from <small> and come up with a device-appropriate rendering, 
> why does that mean we should have a new element?

Smylers himself stated that if we had to create html from scratch,
'small' might not be the best name for an element with the semantics
he was suggesting, but that it is a good choice because we are dealing
with an evolving language and its backward compatibility issues. He
also said 'small' is good because most non-visual, non-printing user
agents, such as textual ones (like lynx), are able to interpret
<small>/<b> in a suitable manner. From this point of view, I think
that the 'goodness' of <small> might depend on the actual number of
UAs able to make use of it without any trouble in most situations
(specifically, the actual number of textual/assistive UAs giving
<small> the same semantics that the html 5 specs would define for it).
If we could find a large enough number of situations (i.e. <small> use
cases) where the behaviour varies a lot both from UA to UA and, for
each UA, from the behaviour wanted for the 'new' semantics, we could
also conclude that such semantics would need to be added to every
existing UA to be really reliable, and so we should not discard the
idea of a new element for such semantics on the grounds of backward
compatibility or "state of the art" implementations. I understand that
such a set of cases might be non-trivial to estimate, but for this
very reason, taking a conservative position with respect to a
worst-case scenario, I wouldn't disregard the opportunity to create a
new element with the right semantics, instead of changing or adding
semantics to an existing one.

>     Apart from considering that <b> isn't a good choice in such a case
>     (<strong> or <em> are far better, since they were born with the
>     proper semantics), personally I hope that in the near future
>     every user agent (or at least any meant for human interaction)
>     will be compatible with style sheets (meaning at least able to
>     derive importance and order from them), so that anything related to
>     presentation, in any presentation media, would be separable from
>     content.
> No, Smylers' example was referencing things that specifically should 
> *not* be marked up with <strong> or <em>.  They're not being 
> emphasized nor are they of greater importance than the rest of the 
> text - they are merely purposely offset from the surrounding text for 
> some reason (besides emphasis or importance). 

Here it is me who doesn't understand. I think that any reason to
offset some text from the surrounding text can be reduced to the
different degree of 'importance' the author gives it, in the same
sense as Smylers used in his mails (that is, not the importance of the
content, but the relevance it gets as a focus of attention - he gave
the example of the English "small print" idiom, and in another mail
clarified that "It's less important in the sense that it isn't the
point of what the author wants users to have conveyed to them; it's
less important to the message. (Of course, to users any caveats in the
small print may be very important indeed!)"). From this point of view,
unless we aimed to use <b> as an intermediate degree of relevance
between 'normal text' and 'em/strong' (but aren't these enough to
attract a reader's attention?), redefining its semantics might be
redundant and of little use. (In my crazy mind, this applies to
headings too, since a 'good' heading focuses attention on the core
subject of the section that follows, so it has to be marked as an
important piece of text.) Furthermore, I meant that <strong> and <em>
would have been a better choice than <b> in Smylers' examples because
their *original semantics* is very close to that of "more relevant
text/text needing greater attention", while the *original semantics*
of <b> is very different and needs to be redefined for this purpose
(but we still have possible alternatives to this).

Anyway, I'm not against a possible redefinition of the <b> and <small>
semantics; I'm just aiming to explore every alternative in depth (such
as introducing new elements) while the specifications are in their
draft state. I'm just trying to offer an alternative point of view
with some valid arguments, if I can find any, nothing more (and I hope
I'm not giving a different impression). Best regards.

Calogero Alex Baldacchino

Received on Tuesday, 25 November 2008 13:08:30 UTC