Re: AuthConfReq: Presentational Markup from Sam Ruby on 2010-03-31 (public-html@w3.org from March 2010)

From: Sam Ruby <rubys@intertwingly.net>
Date: Wed, 31 Mar 2010 07:35:39 -0400
To: Maciej Stachowiak <mjs@apple.com>
CC: Henri Sivonen <hsivonen@iki.fi>, Jonas Sicking <jonas@sicking.cc>, HTML WG <public-html@w3.org>
Message-ID: <4BB3338B.9090709@intertwingly.net>

On 03/31/2010 05:17 AM, Maciej Stachowiak wrote:
>
> On Mar 31, 2010, at 1:51 AM, Henri Sivonen wrote:
>
>> "Jonas Sicking" <jonas@sicking.cc> wrote:
>>
>>> My personal opinion, which I think I stated a long time ago when Rob
>>> Sayre first brought up this topic, is that I'd prefer to get rid of
>>> all the authoring conformance requirements.
>>>
>>> There simply is too much controversy for too little value to make
>>> this worth it for us. Instead we should leave it up to lint tools
>>> to create best practices.
>>>
>>> This not only saves us a bunch of work, it also gives lint tool
>>> authors more freedom to develop whichever practices that they deem
>>> suitable.
>>
>> Leaving the definition of document conformance criteria to "lint tool"
>> developers doesn't remove the need to think through the definition. It
>> just makes it Someone Else's Problem.
>
> Someone Else may even still have to be this Working Group, since one of
> our charter success criteria is "Availability of authoring tools and
> validation tools". I do not think a tautology machine would be a
> meaningful fulfillment of this success criterion.
>
>>
>> I'd still expect markup generator developers to want the criteria to
>> be such that the output of their products isn't considered to be in
>> error by lint tools. If this expectation is correct, there's still a
>> need to come to mutual agreement on the rules--i.e. to standardize
>> them. Kicking that standardization work out of this WG would only have
>> the benefit that developers of products that are exclusively markup
>> consumers wouldn't have to watch how the document conformance sausage
>> is made. But even browsers these days are also producers: For example,
>> the contentEditable feature set should probably be informed by
>> document conformance criteria.
>
> I would also add that at least a subset of the document conformance
> criteria are important for documents to be interoperable with user
> agents. Even if every fully HTML5 compliant browser handles arbitrary
> octet sequences in an interoperable way, we still have to consider:
>
> - Constructs which are known to result in non-interoperable behavior in
> legacy UAs. And by "legacy" I mean "all the ones available now".
> - Constructs which are known to result in non-interoperable behavior for
> non-browser classes of content consumers, even ones that are fully HTML5
> compliant, but either do not implement all of the optional error
> handling, or operate in streaming mode.
> - Constructs which may stress weird corner cases even in largely
> compliant UAs, and therefore are more likely to fall into buggy and
> non-interoperable territory.
>
> In other words, any content we label "conforming" needs to interoperate
> with not just fully compliant and fully capable HTML5 browsers, but also
> implementations that are not fully compliant or fully capable in various
> predictable ways. This is a simple application of Postel's Law. I used
> to think that we could get away with removing all authoring conformance
> requirements, but the above considerations have changed my mind.
>
> I now believe that requirements that address the above points are the
> bare minimum position that is at all tenable.
>
> Granted, this is a considerably smaller set than the full set of
> conformance criteria in the spec currently. But I think this has to be
> the minimum baseline, rather than no document requirements whatsoever.

From the spec[1]:

1) The use of presentational elements leads to poorer accessibility

Doesn't meet that bar.

2) Higher cost of maintenance

Doesn't meet that bar.

3) Higher document sizes

Doesn't meet that bar.

4) Unintuitive error-handling behavior

Does meet that bar.

5) Errors with optional error recovery

Needs further discussion.  The description mentions "bizarre and 
convoluted", and any such examples meet that bar.  Following is an 
example where there are two parse errors that are neither bizarre nor 
convoluted:

   <!DOCTYPE html><title>x</title><span><div></div>

6) Errors where the error-handling behavior is not compatible with 
streaming user agents

Meets that bar, and not just because of streaming user agents: such
examples also fall under the "Unintuitive error-handling behavior" case
(#4) above.

7) Errors that can result in infoset coercion

Needs further discussion.  Those who wish to use XML user agents may
want to outlaw consecutive dashes in comments or apply some other
technique for dealing with them, but this use case doesn't affect most
authors, and such restrictions will be routinely ignored.
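
For concreteness, here is a minimal sketch of what a lint check for this
case might look like. It assumes the XML rule that a comment body may not
contain "--" or end with "-"; the function name and the regex-based
comment matching are illustrative assumptions, not anything the spec or
an existing validator defines.

```python
import re
from typing import List

# Naive comment matcher: good enough for a sketch, not a real HTML tokenizer
# (it will also match comment-like text inside <script>, for instance).
COMMENT_RE = re.compile(r"<!--(.*?)-->", re.DOTALL)


def comments_unsafe_for_xml(markup: str) -> List[str]:
    """Return the bodies of comments that an XML serializer could not emit as-is."""
    unsafe = []
    for match in COMMENT_RE.finditer(markup):
        body = match.group(1)
        # XML forbids "--" inside a comment and a body ending in "-".
        if "--" in body or body.endswith("-"):
            unsafe.append(body)
    return unsafe
```

A check like this would only matter to authors who round-trip through XML
tooling, which is the point made above.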

8) Errors that result in disproportionally poor performance

Needs further discussion.  The example given in the spec (essentially an
extreme variation on the example I gave in #5 above) needs more
justification.

9) Errors that help authors avoid fragile syntax constructs

Needs further discussion.  The example given in the spec (unescaped 
ampersands in URIs) does not meet the bar.
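
To make the construct under discussion concrete, the following is a rough
sketch of how a lint tool might flag ampersands in an attribute value that
do not begin a character reference. The function name and the exact regex
are assumptions for illustration; a real checker would follow the spec's
tokenizer rules rather than a pattern like this.

```python
import re
from typing import List

# An "&" that is NOT followed by something shaped like a character reference
# (&name; &#123; &#x1F;). This is a simplification of the real parsing rules.
BARE_AMPERSAND_RE = re.compile(
    r"&(?!(?:#\d+|#[xX][0-9a-fA-F]+|[A-Za-z][A-Za-z0-9]*);)"
)


def unescaped_ampersands(attribute_value: str) -> List[int]:
    """Return the offsets of ampersands that look unescaped in the value."""
    return [m.start() for m in BARE_AMPERSAND_RE.finditer(attribute_value)]
```

For a value such as "/search?q=a&lang=en" this reports the single
ampersand, while the escaped form "/search?q=a&amp;lang=en" reports
nothing.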

10) Errors that protect authors from security attacks

Meets that bar.

11) Cases where the author's intent is unclear

Meets that bar.

12) Cases that are likely to be typos

The example given in the spec (<capton>) meets that bar.  Distributed 
extensibility, however, is still an open question.  To address this, 
extensions need to be clearly and obviously identifiable as such.  This 
is addressed in a number of the existing change proposals.
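
As an illustration of the typo case, a lint tool could compare unknown
element names against the known vocabulary and report near misses. The
sketch below uses Python's difflib for this; KNOWN_ELEMENTS is a tiny
illustrative subset rather than the real element list, and
suggest_element is a hypothetical name.

```python
import difflib
from typing import Optional

# Illustrative subset only; a real tool would carry the full vocabulary.
KNOWN_ELEMENTS = ["caption", "table", "span", "div", "title", "option"]


def suggest_element(name: str) -> Optional[str]:
    """Return the closest known element name if `name` looks like a typo."""
    if name in KNOWN_ELEMENTS:
        return None  # A known element is not a typo.
    # get_close_matches returns candidates ordered best-first, or an empty list.
    matches = difflib.get_close_matches(name, KNOWN_ELEMENTS, n=1)
    return matches[0] if matches else None
```

A name that is close to a known element gets a suggestion; a name that
resembles nothing known gets none, which is where the question of how
extensions identify themselves comes in.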

13) Errors that allow for new syntax in future

The example given in the spec (attribute-like syntax in end tags) meets
this criterion, even though I believe it is highly unlikely that such
syntax will ever be adopted in HTML.  I would lump this in the "Cases
where the author's intent is unclear" category above, and actually expand
it.  A document that starts with two consecutive less-than signs is not
conforming.  Any document that matches the sniffing algorithm as a JPEG
is non-conforming.
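
The two examples in the previous paragraph can be sketched as a simple
byte-level check. This assumes the usual JPEG signature (the bytes FF D8
FF) stands in for "matches the sniffing algorithm as a jpeg"; the function
name is hypothetical and the check is far narrower than a real sniffing
algorithm.

```python
JPEG_SIGNATURE = b"\xff\xd8\xff"  # Leading bytes of a JPEG file.


def intent_is_unclear(document: bytes) -> bool:
    """Return True for the two 'unclear intent' cases named above."""
    if document.startswith(b"<<"):
        return True  # Two consecutive less-than signs at the very start.
    if document.startswith(JPEG_SIGNATURE):
        return True  # Would be sniffed as an image, not as markup.
    return False
```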

- Sam Ruby

[1] http://dev.w3.org/html5/spec/Overview.html#conformance-requirements-for-authors
Received on Wednesday, 31 March 2010 11:36:15 UTC