- From: Sam Ruby <rubys@intertwingly.net>
- Date: Tue, 23 Mar 2010 09:19:38 -0400
- To: Henri Sivonen <hsivonen@iki.fi>
- CC: Maciej Stachowiak <mjs@apple.com>, Philip Taylor <pjt47@cam.ac.uk>, HTMLwg WG <public-html@w3.org>
On 03/23/2010 07:30 AM, Henri Sivonen wrote:
> On Mar 23, 2010, at 00:39, Maciej Stachowiak wrote:
>
>> Another possibility is to change parsing such that "...&copy=..."
>> is not a hazard. I believe you had some evidence that this would
>> fix more sites than it would break. It seems like it would also
>> have the benefit of allowing us to make authoring rules more
>> lenient in a beneficial way, without at the same time introducing
>> undue complexity.
>
> Making named character references no longer be named character
> references when the terminating character is '=' might be
> worthwhile.
>
> However, it seems this wouldn't address Sam's stated problem. If we
> changed "...&copy=..." processing for the future, there'd be deployed
> software treating it differently. This would count as an interop
> problem, so there'd be all the more reason to make validators whine
> about it under the criterion of focusing on interop problems.

I'll ask you to be careful when attributing statements to others in general, and to me in particular.

I'd like to understand the rationale for Authoring Conformance Requirements -- it makes a big difference. In particular, depending on what the intent is, either a bug should be filed to remove the mandatory requirement to escape ampersands in URIs, OR a bug should be filed to add a mandatory requirement to explicitly close all open non-void tags.

To further the discussion, I suggested interop as one possible criterion. If I gave the impression that this criterion was absolute or non-negotiable, I apologize.

As to the observation that there is a large number of named entities to be remembered and avoided, it is my observation that the number is small enough that the possibility of collisions is ignorable in practice. People can and will ignore this, and will almost never see a problem.
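For illustration, the "&copy" hazard under discussion can be reproduced with Python's html.unescape, which applies the HTML5 named-character-reference rules for text content (a sketch only; browsers apply an additional exception inside attribute values, and the example URL is hypothetical):

```python
from html import unescape

# An unescaped ampersand in a query string: the author meant a
# parameter named "copy", but "&copy" is a legacy named character
# reference (no trailing semicolon required), so it decodes to the
# copyright sign when the URL appears in text content.
url_in_text = "http://example.org/page?section=1&copy=2"
print(unescape(url_in_text))
# -> http://example.org/page?section=1\u00a9=2 (the "copy" parameter is gone)

# Escaping the ampersand, as the authoring requirements mandate,
# round-trips safely:
escaped = "http://example.org/page?section=1&amp;copy=2"
print(unescape(escaped))
# -> http://example.org/page?section=1&copy=2
```

This is why the spec's mandatory escaping rule exists, and why Maciej's proposal -- treating a reference terminated by '=' as literal text -- would remove the hazard for this common case.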
The problem of character encoding, on the other hand, involves a much larger set of invalid byte sequences, is routinely seen on the web, and is only reported when an actual problem exists. It is consistency in criteria that I am seeking to understand.

Just so that there is no confusion, I do have a preference. I would prefer that the spec be written in ways that cause validators to report only actual instances of problems, and cases where it seems likely that the markup that is present is not intentional. (I'll acknowledge up front that with any automated software there is always the possibility of false positives.)

As a start, if the number of spec-compliance issues on google.com were reduced to zero (either by changes to that page or by changes to the spec), I think that would represent forward progress, and we could repeat the process with other sites. Simultaneously, we can define one or more sets of best practices, and ways for authors to opt in to whichever sets of best practices they intend to follow.

And I will note that both the omnibus bug report[1] and a very specific bug report[2] have been downgraded to P3. Given that whatever we come up with has the distinct possibility of affecting a large number of open issues, I think that's rather unfortunate. I don't think it is fair to put this entirely on the editor's lap, and I think that opening this up for change proposals would be a reasonable next step. That being said, I don't want anybody to claim that such proposals were out of order: so I would like to request that these bugs either be raised to a high priority, or that we agree to solicit change proposals (and agree to keep these specific bugs in abeyance while that is done).

- Sam Ruby

[1] http://www.w3.org/Bugs/Public/show_bug.cgi?id=7034
[2] http://www.w3.org/Bugs/Public/show_bug.cgi?id=7468
Received on Tuesday, 23 March 2010 13:20:25 UTC