Objection to HTMLWG ISSUE-144 Change Proposal #2 (keep u non-conforming)

This Change Proposal says that there are only two use-cases given in
the other Change Proposal, namely Chinese text and misspelled words.
In fact, that was only the last point of the extensive Rationale, the
remainder of which remains almost entirely unaddressed.  The primary
use-case for <u> is presentational markup, such as in WYSIWYG editors,
or when your client tells you "I want that text underlined over there"
and you aren't being paid to give him a lecture on the importance of
media independence.

The claim that if we make <u> conforming we should also make <font>
<big> etc. etc. etc. conforming does not address the extensive
rebuttal of this argument that I wrote for the other change proposal
(beginning "<u> should not be invalid just because . . .").  I won't
repeat it here, but to summarize the most important points: the length
savings of <u> are greater than for most of the listed elements; <u>
is much more similar to <b> and <i> than to the non-conforming
elements listed; and maybe we should make other presentational markup
valid, but that can be dealt with in a separate bug/issue and isn't
relevant here.

(In my view, any presentational tag that's commonly used and is simply
a handy shorthand for CSS rules should be made valid.  That means <u>
and <tt> should be; <blink> should not because it's not commonly
wanted, and is arguably extremely unwanted; <font> and <center> and
align= should not because they don't follow the CSS model.  This
position is entirely consistent, and could be achieved by adding just
a couple more tags beyond <u>, maybe <tt> and <big>.)

The fact is, <b> is presentational markup too.  The tag name stands
for "bold"; everyone uses it if and only if they want bold text; and
the specification says that <b> is to be used for "spans of text whose
typical typographic presentation is boldened", so it's defined solely
in terms of whether you want it to look bold.  Yes, you can quibble
that "b { font-weight: normal; font-variant: small-caps }" would be
correct according to the official definition, but that's a purely
academic point, because nobody in his right mind is going to do that,
ever.  Just because there might be someone out there who's using <b>
(or <i> or <sub> or <sup> or <small> or anything like that) for an
effect other than the normal one, and the spec technically allows
this, doesn't mean that it's somehow less presentational in the real
world.  The fact that the definition allows room to use <b> other than
for bolding things is not meaningful -- it's playing word games in an
attempt to make it not look like it's presentational, because of a
long-standing crusade in the web standards community against
presentational markup.  In *practice* it still means "bold", just like
it always did.

Presentational markup is not bad per se.  Some typographical effects
are commonly required but have no particular meaning.  Sometimes
authors just want some text to be bold or italic or underlined, and
don't want to have to reason about *why* they want it in some abstract
fashion.  Almost everyone who edits a rich-text format, HTML included,
does so in a WYSIWYG editor, because presentation does not
require reasoning about anything you can't see before your eyes.
Everyone can understand the difference between "this makes things
bold" and "this makes things italic".  But would *you* be able to tell
when you should use something that "represents stress emphasis of its
contents" vs. "represents strong importance for its contents", if you
didn't already know one was <em> and one was <strong>?

So the real use-case here is presentation, and that's a completely
valid use-case.  Without <u>, we have to use <span
style="text-decoration: underline"> or <span class=u> or something
like that.  Since underlining is one of the commonest stylistic
effects available, there's no reason to declare an existing,
fully-functioning shortcut verboten.  <u> has a demonstrably useful
effect, and it results in much shorter markup than equally
presentational alternatives.
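
To put rough numbers on the length difference, here is a minimal
sketch that counts the characters of markup each alternative needs
around the same text.  The sample word and the helper name are
assumptions chosen for illustration; the three markup forms are the
ones named above.

```python
# Three ways to underline the same text, as named in the paragraph above.
# The sample text "word" is an arbitrary choice for illustration.
TEXT = "word"

variants = {
    "u element": f"<u>{TEXT}</u>",
    "span with class": f"<span class=u>{TEXT}</span>",
    "span with inline style": (
        f'<span style="text-decoration: underline">{TEXT}</span>'
    ),
}


def overhead(markup: str, text: str = TEXT) -> int:
    """Return the number of markup characters beyond the text itself."""
    return len(markup) - len(text)


# Markup overhead per variant, independent of how long the text is.
overheads = {name: overhead(markup) for name, markup in variants.items()}

for name, extra in overheads.items():
    print(f"{name}: {extra} characters of markup")
```

Under these assumptions the overhead is 7 characters for <u>, 21 for
the class-based <span> (not counting the stylesheet rule it also
needs), and 48 for the inline-style <span>.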

On the other hand, if <b> and <i> are actually semantic and not
presentational, then so is <u>.  The proposed text closely follows the
pattern set by <b> and <i>'s current definitions.  Even if it were the
case that "a span of text to be stylistically offset from the normal
prose" is already represented by <i>, there's no reason given why we
can't have two elements with the same semantics.

I would like to specifically agree with this quote from the change proposal:

"Inconsistent application of rationales leads to very poor language
design, confusing authors ("why is X possible but not the almost
identical Y?" is a common question in such cases)."

Confusing authors is an important concern here, but it's one that
speaks in favor of making <u> conforming.  Authors are all familiar
with word processors and other formatting systems in which bold,
italic, and underline are all prominently available right next to each
other.  Allowing only bold and italics, but not underlining, is sure
to be extremely confusing to authors.  For the sake of consistency and
meeting author expectations, it's important that <b>, <i>, and <u> all
have the same conformance status.

As for underlines being confusing on the web because of links, I
agree, but authors still want them.  This is clear when you
go to any WYSIWYG web editor, since in my experience, they all have
underline buttons.  Nobody has provided any reasoning to suggest that
making <u> invalid will discourage the use of *underlining* -- if we
look at how major web applications are written, it seems much more
likely that people will just switch to <span
style="text-decoration: underline"> or <span class=u>.  I know that when the
issue of moving MediaWiki to future HTML versions came up once a few
years ago, then-lead developer Brion Vibber said something to the
effect of "If some future version of HTML is stupid enough to ban <b>,
we'll just automatically rewrite it to <span style='font-weight:
bold'> to keep the validator happy."
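
The kind of automatic rewrite described there could look something
like the following.  This is only a sketch under stated assumptions,
not MediaWiki's actual code: the function name and the tag-to-style
table are hypothetical, and the regular expression only handles bare
<b> and <u> tags with no attributes.

```python
import re

# Hypothetical table mapping presentational tags to equivalent inline CSS.
STYLE_FOR_TAG = {
    "b": "font-weight: bold",
    "u": "text-decoration: underline",
}

# Matches bare opening or closing <b>/<u> tags (no attributes handled).
_TAG_RE = re.compile(r"<(/?)(b|u)>", re.IGNORECASE)


def rewrite_presentational(html: str) -> str:
    """Replace bare <b>/<u> tags with <span> elements carrying inline styles."""

    def replace(match: re.Match) -> str:
        closing, tag = match.group(1), match.group(2).lower()
        if closing:
            return "</span>"
        return f'<span style="{STYLE_FOR_TAG[tag]}">'

    return _TAG_RE.sub(replace, html)


sample = "an <u>underlined</u> word"
rewritten = rewrite_presentational(sample)
print(rewritten)
print(len(sample), "->", len(rewritten), "characters")
```

The output renders the same way as the input but is longer, and it is
no less presentational than the markup it replaced.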

In my experience, this approach is ubiquitous.  The application
developers who care about validity only care insofar as the validator
is happy, and will not be deterred from outputting uglier markup if
necessary.  Look at how many sites diligently wrote <script
type="text/javascript"> for so many years despite the fact that the
attribute had no effect in any browser, just because the validator
told them to.  The same goes for any other useless element or
attribute, like <html>/<head> with no attributes, xmlns="", etc., etc.
Authors who care about validation are not deterred by having to write
longer markup to achieve the same effect.

So ruling out <u> will only discourage the use of the <u> element
itself, not the use of underlining.  Given that underlining will
happen anyway, there's no reason to make markup more bloated so that
web pages are harder to read.

Responding to Ian Hickson's change proposal:

"The length savings argument is bogus because the alternative to <u>
is not <span style="blablabla"> but simple an appropriate semantic

This is simply wrong, because the most important use-case under
consideration is <u> as a presentational element.  No semantic element
is appropriate, so the correct alternative is <span>.

"The best practice (for accessibility, maintainability, and semantic
analysis) is widely recognised to be the separation of semantics and
styles, which argues against presentational markup such as in this

No evidence or reasoning is provided to back this statement up.  We
are not told who has "widely recognised" this, and more importantly,
we aren't told why.  I would say that it's widely recognized that
inline markup has several major disadvantages, and relying primarily
on CSS is the only reasonable way to write sites these days, but I
dispute the claim that *all* presentational markup is harmful.

Moreover, making <u> invalid would not reduce presentational markup,
it would just shift people to using <span>, as I explain above.  There
is no evidence or reasoning provided to support the implicit claim
that making <u> invalid will reduce the use of presentational markup.

Overall, we have clear author interest in being able to underline text
as a purely presentational matter, as suggested by every WYSIWYG
editor I can remember seeing.  Even if making <u> invalid would
actually discourage presentational markup (which it will not), the
reasons for wanting to discourage presentational markup so absolutely
are entirely theoretical and unsubstantiated.  Since it is undisputed
that many authors want presentational markup for underlining, there
are no grounds for accepting the disputed claim that all
presentational markup is bad in the absence of specific evidence or
reasoning.  Of course, the issue could be reopened at a later date if
anyone produces specific evidence or reasoning that <u> will have some
kind of negative effect, without having to invoke a disputed general

Received on Friday, 1 April 2011 22:15:42 UTC