Working Group Decision on ISSUE-144 conforming-u

The decision follows.  The chairs made an effort to explicitly address
all arguments presented in the Change Proposals on this topic in
addition to arguments posted as objections in the poll.

*** Question before the Working Group ***

There is a basic disagreement in the group as to whether or not the
HTML5 specification should allow the <u> element to be conforming.  The
result was an issue, two change proposals, and a straw poll for
objections:

http://www.w3.org/html/wg/tracker/issues/144
http://www.w3.org/html/wg/wiki/ChangeProposals/UShouldBeConforming
http://lists.w3.org/Archives/Public/public-html/2011Mar/0643.html
http://www.w3.org/2002/09/wbs/40318/issue-144-objection-poll/results

== Uncontested observations:

The "<u> should be conforming" Change Proposal made the following
observations:

* If <u> is conforming, authors will have an excuse not to use
   appropriate markup for applying underlines. (e.g. insertion,
   emphasis, etc.)

* Use cases such as proper noun marks in Chinese and misspelled words
   are pretty rare.

* Underlines are confusing on the web because of links, but authors
   still want them

* Making HTML a semantic language rather than a presentational language
   is a design goal of the language

None of these were decisive.  There were people who supported the "<u>
should be conforming" Change Proposal even after taking these facts
into consideration.  The fact that they were acknowledged up front was
appreciated.

=== Objections on the basis of consistency

We start off with this claim:

   We have to take a holistic approach to the language or we will be
   forced to create a "compromised by committee" language that is
   internally inconsistent

As cited by the "<u> should be conforming" Change Proposal, the onus is
on Working Group members to identify specific inconsistencies that any
proposal would create, and to object on that basis.  In this case, the
lack of objections is clearly not a problem, as there were a number of
statements made on this topic.  A representative sample of such
statements:

   If we were to address the use case of content generated by an
   authoring agent, the same argument should be applied to <font>,
   <big>, <layer>, <blink>, <tt>, <center>, align="", etc, yet nobody is
   making such a case, suggesting that this rationale is not being
   consistently applied.  Inconsistent application of rationales leads
   to very poor language design, confusing authors ("why is X possible
   but not the almost identical Y?" is a common question in such cases).

   Confusing authors is an important concern here, but it's one that
   speaks in favor of making <u> conforming.  Authors are all familiar
   with word processors and other formatting systems in which bold,
   italic, and underline are all prominently available right next to
   each other.  Allowing only bold and italics, but not underlining, is
   sure to be extremely confusing to authors.  For the sake of
   consistency and meeting author expectations, it's important that <b>,
   <i>, and <u> all have the same conformance status.

   the editor refuses to define the <u> element in similar way and
   claimed that <u> is far more presentational than <b> and <i> without
   giving details

   <u> is more comparable to other tags that were declared valid, such
   as <b>, <i>, <s>, and so on. It's more inconsistent to leave it
   invalid than to make it valid.

   For italics, bold and strike-through, there are <i>, <b> and <s>. It
   is weird to pretend that <u> isn't likewise available

   Fundamentally, I think having <b>, <i> but not <u> is a better
   inconsistency then the inconsistency of where use cases are strong or
   not (it gives normal Web authors a big surprise), so I don't think
   any argument based on consistency applies.

In all, we find that there is no consensus on which change would result
in the most consistency, and furthermore there is no consensus on what
would confuse authors least.  As such, any and all objections within
the scope of this issue on the basis of consistency were found to be
weak.

=== Objections on the basis of presentational markup

Next, we have a number of arguments on the basis of what markup is to
be considered 'presentational' and what markup is to be considered
'semantic'.  Again, a representative sample of the arguments:

   What matters as to whether it's presentational is what the spec
   defines it as.

   The best practice (for accessibility, maintainability, and semantic
   analysis) is widely recognised to be the separation of semantics and
   styles, which argues against presentational markup such as in this
   proposal.

   No evidence or reasoning is provided to back this statement up.  We
   are not told who has "widely recognised" this, and more importantly,
   we aren't told why.  I would say that it's widely recognized that
   inline markup has several major disadvantages, and relying primarily
   on CSS is the only reasonable way to write sites these days, but I
   dispute the claim that *all* presentational markup is harmful.

   Presentational markup is not bad per se.  Some typographical effects
   are commonly required but have no particular meaning.  Sometimes
   authors just want some text to be bold or italic or underlined, and
   don't want to have to reason about *why* they want it in some
   abstract fashion.  WYSIWYG editors are the only way that almost
   anyone edits any rich-text format, including HTML, because
   presentation does not require reasoning about anything you can't see
   before your eyes.  Everyone can understand the difference between
   "this makes things bold" and "this makes things italic".  But would
   *you* be able to tell when you should use something that "represents
   stress emphasis of its contents" vs. "represents strong importance
   for its contents", if you didn't already know one was <em> and one
   was <strong>?  ... So the real use-case here is presentation, and
   that's a completely valid use-case.

   I strongly object to the positive effect section. Deprecating <u>
   won't help Web authors migrate <u>s to other semantic markup as
   text-level semantic elements are very hard to choose from (what's the
   difference between <em> and <strong>?), while deprecating <center>
   has a value because the section can be main content (no markup), or
   <section>/<header>/<footer>/etc.

Again, we do not find agreement.  While we find that there is agreement
to discourage unpopular presentational features of the language,
different members of the working group appear to have differing
standards as to where one draws the line on what use cases are common
enough to be standardized anyway.

   Accessibility: semantics are easier to map to media-specific
   presentations (e.g. speech synthesis) than are media-specific styles
   (e.g. visual styles) because to map a media-specific style to another
   medium's styles one has to first determine the meaning of the styles,
   which is an unsolved  computer science artificial intelligence
   problem. For example, does the underline indicate importance, which
   should be mapped to a more deliberate speech pattern, or is it merely
   an aethetic effect, which should not map to anything? Does it
   indicate a link, which should be clearly denoted  (e.g. with audio
   icons), or does it indicate a stress emphasis, which should merely be
   mapped to a slightly altered voice?

This argument would be a strong argument if it also covered styling
added using CSS.  Lacking a proposal for how authoring tools should
deal with underlining and lacking evidence that such authors and
authoring tools would be willing to migrate to such, this argument was
considered weak.

   Maintainability: Should the author (or the author's employer/client)
   decide that actually underlining all the headings was a mistake and
   they should instead be italics, the change can be trivially
   implemented if the markup is semantic rather than stylistic: simply
   change headings to be italics rather than underlined. If, instead, a
   stylistic element is used within the pages each time an underline is
   required, the author is going to have to go through every part of
   every page changing just the underlines that correspond to headings.
   This would take orders of magnitude more time. Given this, separating
   semantic markup from styles is therefore the best practice for
   maintainability.

It is uncontested that there is a trade-off here.  Clearly authors and
authoring tools are making a different trade-off en masse to date.
Lacking a proposal for an alternative and evidence that authors and
authoring tools would be willing to migrate to such, this argument was
considered weak.

   Semantic analysis: As with accessibility, the ability for a computer
   to distinguish underline when used for a proper name mark, when used
   to indicate a hyperlink, when used to indicate emphasis, when used to
   indicate italics in a manuscript, when used to indicate a spelling
   error, and so forth, requires artificial intelligence at the cutting
   edge of natural language research (or beyond). To allow semantic
   analysis to be performed by those who do not have access to the
   latest and greatest research, and indeed to enable semantic analysis
   to be done at all in many cases given the state of this research, the
   input markup must include at least basic hints as to the meaning
   implied by the presentation. As such, separating semantic markup from
   styles is therefore the best practice for enabling semantic analysis.

This argument would be a strong argument if it also covered styling
added using CSS.  Lacking a proposal for how authoring tools should
deal with underlining and lacking evidence that such authors and
authoring tools would be willing to migrate to such, this argument was
considered weak.

   HTML is a media-independent semantic markup language

This is an assertion over which we have not found consensus within the
working group.  Again, there is general agreement to encourage and
prefer semantic markup whenever possible.  The point of disagreement is
whether or not it is possible in the specific case of <u>.

   The use case of "stylistically offset" is already entirely handled by
   <i>.

"stylistically offset" was not the use case presented, and the
assertion that <i> is an adequate substitution semantically for <u> is
disputed.

=== Objections on the basis of Use Cases

The primary assertion in dispute concerns use cases.  In one case the
assertion is:

   There is no new use cases addressed by the <u> element

vs

   The use cases of this element are mainly content generated by
   authoring tools

In support of the Change Proposal to "make the <u> element conforming",
we have the following list of products which make use of this element:
Internet Explorer (tested in 9 RC), Firefox (tested in 4b11), Chrome
(tested in 10.0.642.2 dev), Safari (tested in 5.0.3), Opera (tested in
11.00), Thunderbird 3.17, Word 2002, GMail, and OpenOffice.org 3.2.0.
Notably, with Firefox, emitting the <u> element is an option and not
the default.  This list was not intended to be complete.

While the rationale as to why these tools may emit such markup was
missing and/or speculative (example: length savings), no evidence was
provided that any of these tools has any interest in changing the
markup produced.

Furthermore, evidence was provided that the <u> element is the sixth
most commonly used phrase element on the internet.  The fact that this
element is evidently popular leads us to conclude that when a large
percentage of the authoring community weigh the value of the <u>
element against the "widely recognized best practice (for
accessibility, maintainability, and semantic analysis)" of "separation
of semantics and styles", they feel that the previously cited
consistency with <b> and <i> is more important to them than this best
practice.

Finally, lacking any evidence that there is an alternative to the <u>
element that a substantial portion of the cited tools above would be
willing to migrate to, objections on the basis of impact to existing
tools were taken to be stronger than arguments on the basis of
architectural purity.

At this point, we have a strong objection to the Change Proposal that
"There is no new use cases addressed by the <u> element", so we turn to
evaluate the remaining objections to the "<u> should be conforming"
Change Proposal.

=== Objections to the "<u> should be conforming" Change Proposal

First we have an objection based on user confusion:

   An underlined text which is not a hyperlink confuses the user in
   his/her browsing experience.

   The <u> element causes confusion and poor readability as "...users
   are trained to click on underlined things"; Making the "u" element
   conforming would encourage underlining text that is not a link;
   Preserving underlines exclusively for use on links is particularly
   important for people with low vision [7], color blindness, and
   monochrome displays.

This is uncontested and is a valid objection.  However, the fact that
nobody proposed removing "text-decoration:underline" from CSS makes
this objection less compelling.  Despite this, it would still be
considered a strong objection if there were any evidence that making
this non-conforming would cause authoring tools to stop providing the
option to underline text.  From the poll:

   Nobody has provided any reasoning to suggest that making <u> invalid
   will discourage the use of *underlining* -- if we look at how major
   web applications are written, it seems much more likely that people
   will just switch to <span style="text-decoration:bold"> or <span
   class=u>.

Firefox (cited above) is a concrete example of a tool that would sooner
change the markup used to signal underlining to one that does not
trigger validation errors than remove the option to underline
entirely.  WikiMedia was listed as a single example of a tool that does
not provide an option for underlining.

Onto the next objection:

   Underlining cuts through the descenders of some text characters,
   which can interfere with on-screen readability. Underlining text can
   clutter and make reading difficult.

As nobody is proposing eliminating underlining of links, this objection
was treated as weaker than the objection that underlined text can be
confused with links.

   underlining is not a common typographic effect except for
   indicating hyperlinks

We have a study which indicates that <u> is the sixth most popular
phrase element.  Lacking any evidence that underlines are not common,
this objection was not given any weight.

   The first hit for the word "underlining" on Google: ...explicitly
   indicates that italics and underlining are equivalent.

The fact that italics and underlining are considered to be equivalent
in formal English is a valid objection.  We also have evidence that
underlining is considered a distinct punctuation mark in formal
Chinese.  However, lacking any evidence that there is an alternative to
the <u> element that a substantial portion of the cited tools above
would be willing to migrate to, objections on the basis of impact to
existing tools were taken to be stronger than arguments on the
basis of linguistic purity.

   <u> as proposed has essentially the same meaning as <i>, and there is
   no length saving between the 'u' and 'i'.

The assertion is disputed, and at best constitutes a weak objection.

Additionally we have objections on the basis of backwards
compatibility, potential use as "fallback styling", and requiring
conformance checkers to produce errors that would mask ones that are
far more important.  The strength of these claims was not evaluated
further as they were objections to the "There is no new use cases
addressed by the <u> element." Change Proposal and thus were not needed
in order to identify the proposal that draws the weakest objections.

*** Decision of the Working Group ***

Therefore, the HTML Working Group hereby adopts the "<u> should be
conforming" Change Proposal for ISSUE-144:

   http://www.w3.org/html/wg/wiki/ChangeProposals/UShouldBeConforming

Of the Change Proposals before us, this one has drawn the weaker
objections.

== Next Steps ==

Bug 10838 is to be REOPENED and marked as WGDecision.

Since the prevailing Change Proposal does call for a spec change, the
editor is hereby directed to make the changes in accordance with the
change proposal.  Once those changes are complete, ISSUE-144 is to be
marked as CLOSED.

We further wish to comment on this statement made in the Change
Proposal that "<u> should be conforming":

   If the Working Group decides to make <u> valid, and the editor
   believes this harms consistency, he is free to make other markup
   conforming to restore consistency

We continue to encourage all members of the working group, including
the editor, to advocate for changes they believe in.  This includes
noting potential inconsistencies and proposing changes which would
resolve them.  Reviewing the survey and the change proposals, we do not
find that the argument for consistency was made.  Furthermore, at this
point in the development of HTML5 we strongly discourage making changes
to the draft that were not proposed and for which the group has not
found consensus.

== Appealing this Decision ==

If anyone strongly disagrees with the content of the decision and would
like to raise a Formal Objection, they may do so at this time. Formal
Objections are reviewed by the Director in consultation with the Team.
Ordinarily, Formal Objections are only reviewed as part of a transition
request.

== Revisiting this Issue ==

This issue can be reopened if new information comes up. Examples of
possible relevant new information include:

* Evidence that there is an alternative to the <u> element that a
   substantial portion of the authoring tools would be willing to
   migrate to.  Ideally this would either include the majority of the
   tools cited by the "<u> should be conforming" Change Proposal or an
   explanation as to why the list of tools provided is a better list to
   base a decision on.

* Evidence that a substantial portion of the authoring tools would be
   willing to drop underlining entirely.  Again, ideally this would
   either include the majority of the tools cited by the "<u> should be
   conforming" Change Proposal or an explanation as to why the list of
   tools provided is a better list to base a decision on.

=== Arguments not considered:

These objections were linked from the survey:

   The fact is, <b> is presentational markup too.

This "fact" is disputed. The editor points out that the <b> element is
not just "bold"; it has a definition that applies to multiple media in
a way that lets an author clearly distinguish when this element should
be used vs other elements such as <i>, <strong>, <dfn>, et al.

   The specification says that <b> is to be used for "spans of text
   whose typical typographic presentation is boldened", so it's defined
   solely in terms of whether you want it to look bold.

Again, this "fact" is disputed. The editor points out that it isn't
defined "solely in terms of whether you want it to look bold", since
the mention of bold is literally only one of 3 non-normative examples
of the actual definition.

   Yes, you can quibble that "b { font-weight: normal; font-variant:
   small-caps }" would be correct according to the official definition,

Again, this "fact" is disputed.

   The fact that the definition allows room to use <b> other than for
   bolding things is not meaningful -- it's playing word games in an
   attempt to make it not look like it's presentational

Assessing whether or not "word games" are being played here is not
necessary in order to identify the proposal that generates the weakest
objections.

   They also have "text color" buttons, but we don't support that.

This is outside of the scope of this issue.  It does indirectly apply
to the consistency argument above, but not in a way that would change
the overall conclusion.

   Authors who care about validation are not deterred by having to write
   longer markup to achieve the same effect.

This is a disputed generalization that we did not have to evaluate in
depth in order to identify the proposal that draws the weakest
objection.

   Transitioning authors away from using <u> for purely presentational
   effects should be an educational effort; evolution not revolution -
   http://www.w3.org/TR/html-design-principles/#evolution-not-revolution

We are looking for objections to the proposals presented.  At best this
would support the decision.

   I have never heard arguments against stylistic markup and stylistic
   attributes when it comes to SVG. But when it comes to HTML, then
   there are all kinds of purity arguments with regard to which elements
   to use

Comparison to SVG is off-topic.  Feel free to identify specific
arguments as weak.  But lacking in specifics, this objection was not
evaluated further.

Additionally, we have a number of quotes from experts who mention that
separating presentation from content is a "best practice".  The
specific arguments were addressed in this decision.  The credentials of
the individuals making the arguments were not evaluated.

Received on Friday, 8 April 2011 18:13:51 UTC