Re: HTML5 Authoring Conformance Study from Aryeh Gregor on 2010-03-22 (public-html@w3.org from March 2010)

From: Aryeh Gregor <Simetrical+w3c@gmail.com>
Date: Mon, 22 Mar 2010 16:48:54 -0400
To: Maciej Stachowiak <mjs@apple.com>, Henri Sivonen <hsivonen@iki.fi>
Cc: Shelley Powers <shelley.just@gmail.com>, HTMLwg WG <public-html@w3.org>, Sam Ruby <rubys@intertwingly.net>
Message-ID: <7c2a12e21003221348o4f6cc98blb39ac3a73234cd5b@mail.gmail.com>

On Sun, Mar 21, 2010 at 10:09 AM, Maciej Stachowiak <mjs@apple.com> wrote:
> I think those are good questions. I am personally more interested in asking
> them about specific requirements than in general form. It seemed like Aryeh
> at least felt that there may be good reasons for conformance requirements
> besides interoperability, namely helping the author by spotting likely
> mistakes, and user vigilance for such issues as nonstandard behavior.

Discussion here is mostly focusing on validators as lint-like tools,
to help authors (or possibly users) in some concrete way.  I think
there are at least two other reasons for many current authoring
requirements: maintaining language coherency, and changing author
behavior.

First, trying to keep the language coherent.  Take duplicate id's.
Let's say the spec fully described the behavior of id's, without
telling you that they're supposed to be unique.  Then someone could
read the HTML5 spec and not realize that id's are even *supposed* to
be unique.

Conceptually, an id is *meant* to be a unique identifier.  Yes, it's
possible to have duplicates, but it creates all sorts of weird
effects.  The most comprehensible way for the spec to describe the
behavior of id is to say that it's *supposed* to be unique.  Then you
can understand the behavior: it behaves as you would expect a unique
identifier to behave, with undesired effects if there's a duplicate.
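
As a rough illustration of the lint-style check this implies, here is a
minimal sketch using Python's standard html.parser module.  The names
IdCollector and duplicate_ids are made up for this example and are not
taken from the spec or from any validator.

```python
from collections import Counter
from html.parser import HTMLParser


class IdCollector(HTMLParser):
    """Collects every id attribute value seen on a start tag."""

    def __init__(self):
        super().__init__()
        self.ids = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; names are lowercased.
        for name, value in attrs:
            if name == "id" and value is not None:
                self.ids.append(value)


def duplicate_ids(markup):
    """Return the sorted id values that occur more than once (hypothetical helper)."""
    collector = IdCollector()
    collector.feed(markup)
    collector.close()
    counts = Counter(collector.ids)
    return sorted(value for value, n in counts.items() if n > 1)


if __name__ == "__main__":
    sample = '<p id="intro">a</p><div id="nav"></div><span id="intro">b</span>'
    print(duplicate_ids(sample))
```

A check like this only reports the duplicates; it says nothing about how
a browser behaves when it meets them.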

The spec would become harder to understand and use if it weren't clear
that duplicate id's are not supposed to happen.  Similarly,
presentational elements are currently segregated into their own
section, and it's made clear that these are from a bygone era and CSS
is the preferred way to do things.  This, too, is important to
understanding why HTML works as it does.  And so on.

Of course, this concern is unconnected to what a validator reports.
The spec could just as well say "Note: The value is expected to be
unique" instead of "The value must be unique".  But a lot of things
that are currently authoring conformance requirements do need to be in
the spec in some form, even if just informatively.

There's at least one further reason to have a conformance requirement:
to try to change author behavior.  Some authors and users, rightly or
wrongly, view W3C validation as important, and will (respectively) try
to get pages to validate, or complain if they don't.  (In my
experience, people who just browse websites don't complain if pages
don't validate, but many people who download and install web apps do.
These are customers, for commercial web apps.)

This is why the alt attribute is required on images: to try to get
authors to write more accessible web pages.  It's also part of why
presentational attributes are prohibited (rightly or wrongly).
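
For concreteness, a requirement like this is easy to check mechanically.
The following is a minimal sketch, again using Python's standard
html.parser module; MissingAltFinder and images_missing_alt are
hypothetical names, and the check looks only for the presence of the
attribute, not for whether the text is any good.

```python
from html.parser import HTMLParser


class MissingAltFinder(HTMLParser):
    """Records the src of every <img> start tag that has no alt attribute."""

    def __init__(self):
        super().__init__()
        self.missing = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attributes = dict(attrs)
        # alt="" counts as present; only a wholly absent attribute is flagged.
        if "alt" not in attributes:
            self.missing.append(attributes.get("src"))


def images_missing_alt(markup):
    """Return the src values of images lacking alt (hypothetical helper)."""
    finder = MissingAltFinder()
    finder.feed(markup)
    finder.close()
    return finder.missing


if __name__ == "__main__":
    sample = '<img src="a.png"><img src="b.png" alt=""><img src="c.png" alt="Chart">'
    print(images_missing_alt(sample))
```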

I think this kind of conformance requirement is important.  For
example, authors should be discouraged from making up new element and
attribute names, because if authors made up their own element and
attribute names, they could conflict with future specs.  If everyone had
used <header> instead of <div class="header">, HTML5 could not have
defined the new element with conformance requirements different from
<div>, because it would break sites.  Without the idea that <div
class="header"> is more "correct" than <header>, we might have run
into this problem in real life.

These conformance requirements do really need to be MUST requirements
in the spec to be effective.  And I do think we need some of them.
But possibly fewer than we have now.

Received on Monday, 22 March 2010 20:49:27 UTC