Re: Bug 7034

On 03/23/2010 07:30 AM, Henri Sivonen wrote:
> On Mar 23, 2010, at 00:39, Maciej Stachowiak wrote:
>
>> Another possibility is to change parsing such that "...&copy=..."
>> is not a hazard. I believe you had some evidence that this would
>> fix more sites than it would break. It seems like it would also
>> have the benefit of allowing us to make authoring rules more
>> lenient in a beneficial way, without at the same time introducing
>> undue complexity.
>
> Making named character references no longer be named character
> references when the terminating character is '=' might be
> worthwhile.
>
> However, it seems this wouldn't address Sam's stated problem. If we
> changed "...&copy=..." processing for the future, there'd be deployed
> software treating it differently. This would count as an interop
> problem, so there'd be all the more reason to make validators whine
> about it under the criterion of focusing on interop problems.
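For concreteness, the hazard under discussion can be sketched with
Python's html.unescape, which applies the HTML5 named character
reference rules in text context (the query string below is a made-up
example, not from any cited page):

```python
from html import unescape

# "&copy" is one of the legacy named character references that HTML
# parsers recognize even without a trailing semicolon, so a raw "&"
# in a query string can silently become U+00A9 (the copyright sign).
raw = "?page=1&copy=2"          # hypothetical unescaped query string
print(unescape(raw))            # the "&copy" turns into "\u00a9"

# Escaping the ampersand as "&amp;" avoids the hazard entirely.
escaped = "?page=1&amp;copy=2"
print(unescape(escaped))        # round-trips to "?page=1&copy=2"
```

(The attribute-value exception being proposed -- treating "&copy="
as plain text when the next character is "=" -- would make the first
case safe without requiring the escaping shown in the second.)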

I'll ask you to be careful when attributing statements to others in 
general, and to me in particular.

I'd like to understand the rationale for the Authoring Conformance 
Requirements -- it makes a big difference.  In particular, depending 
on what the intent is, either a bug should be filed to remove the 
mandatory requirement to escape ampersands in URIs, or a bug should 
be filed to add a mandatory requirement to explicitly close all open 
non-void elements.

To further the discussion, I suggested interop as one possible 
criterion.  If I gave the impression that this criterion was absolute 
or non-negotiable, I apologize.

As to the observation that there are a large number of named entities 
to be remembered and avoided, my observation is that the number is 
small enough that the possibility of collisions is ignorable in 
practice.  People can and will ignore this, and will almost never see 
a problem.  Character encoding, on the other hand, involves a much 
larger set of byte sequences that are invalid, is a problem that is 
routinely seen on the web, and is only reported when an actual problem 
exists.  It is consistency in criteria that I am seeking to understand.
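To illustrate that asymmetry: only a handful of legacy named 
references are collision hazards, while invalid byte sequences in a 
declared encoding are plentiful and easy to produce.  A sketch, using 
UTF-8 and a few made-up byte strings:

```python
# Three byte sequences that are invalid in UTF-8: a continuation byte
# cannot start a character, a two-byte lead must be followed by a
# continuation byte, and a three-byte sequence cannot be truncated.
for bad in (b"\x80abc", b"\xc3(", b"\xe2\x82"):
    try:
        bad.decode("utf-8")
        print(bad, "decoded OK")
    except UnicodeDecodeError as e:
        print(bad, "is not valid UTF-8:", e.reason)
```

Every one of these raises a decoding error, which is why encoding 
problems are routinely caught and reported, whereas entity collisions 
almost never surface.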

Just so that there is no confusion, I do have a preference.  I would 
prefer that the spec be written in ways that cause validators to 
report only actual instances of problems, and cases where it seems 
likely that the markup that is present is not intentional.  (I'll 
acknowledge up front that with any automated software there is always 
the possibility of false positives.)

As a start, if the number of spec compliance issues on google.com were 
to be reduced to zero (either by changes to that page or by changes to 
the spec), I think that would represent forward progress, and we could 
repeat that process with other sites.  Simultaneously, we can define one 
or more sets of best practices, and ways that authors can opt in to 
which sets of best practices they intend to follow.

And I will note that both the omnibus bug report[1] and a very 
specific bug report[2] have been downgraded to P3.  Given that 
whatever we come up with has the distinct possibility of affecting a 
large number of open issues, I think that's rather unfortunate.  I 
don't think it is fair to put this entirely in the editor's lap, and 
I think that opening this up for change proposals would be a 
reasonable next step.  That being said, I don't want anybody to claim 
that such proposals were out of order, so I would like to request 
that these bugs either be raised to a high priority, or that we agree 
to solicit change proposals (and agree to hold these specific bugs in 
abeyance while that is done).

- Sam Ruby

[1] http://www.w3.org/Bugs/Public/show_bug.cgi?id=7034
[2] http://www.w3.org/Bugs/Public/show_bug.cgi?id=7468

Received on Tuesday, 23 March 2010 13:20:25 UTC