- From: Al Gilman <asgilman@iamdigex.net>
- Date: Fri, 29 Nov 2002 10:03:44 -0500
- To: www-qa@w3.org
Recently there was an incident where I complained about something on the XML Conference website:

http://lists.w3.org/Archives/Public/www-validator/2002Nov/0127.html

Now, I do concur with Bjoern's analysis that technically a value of html:a.name of '#wai' is legal per the HTML 4.01 Transitional DTD. But for webmasters, this is operationally the wrong answer: the answer to the wrong question. And the answer to the right question is readily generated by the validator tool, if we just made the appropriate DTD available as an option. The webmaster in question, presented with the evidence in my complaint, immediately fixed the problem.

Using '#wai' as the html:a.name token is an HCI usability disaster (a concrete sketch follows below); and the sense of the HTML WG is that the things that can be used in the #fragment clause of a URI-reference to an HTML document (XHTML included) should be restricted to the Name production. This is a change from CDATA, which is what html:a.name had been up through HTML 4.01. See XHTML 1.0 for where this reform was instituted, I believe.

There are human-factors and de facto interoperability reasons (see the evidence about Netscape and Lynx results) why those webmasters who will take the trouble to exercise a syntax check against their content should check instances of html:a.name against the requirements of the NAME production and not pass general CDATA, even if they want to limit their feature set to what is processed similarly enough across many browsers, including NN4. There are very few downsides to enforcing this more restrictive syntax on html:a.name, and it is as easy to check your hypertext against the criteria on the safe side as it is to check it against the letter-of-the-spec profile.

The actual industrial processes separate the activities of webmasters in "scrubbing content before it goes live" from the activities of browser makers in maintaining and publishing user agents. These two communities need to know what reasonable emit and accept filters look like. Taking a hard line for the principle that these two filters ought to be the same thing is not listening to our customers. It flies in the face of the learnings of the quality revolution. This is a matter of applying the so-called DRUMS principle of being strict in what you emit and lax in what you accept.

In the present case we are dealing with #fragment tokens that do get interpreted and created by people a certain amount in the overall information flow of the Web, so the human confusion that arises from allowing an initial '#' character in this token is germane. This string is not just for machines to process.

Where I am going is that the W3C should consider maintaining (as a living document) one or more Best Current Practice DTDs as an alternative to the one which exactly agrees with the specification as published in the past. I do not believe that this needs errata to the specification. The specification can be left as is and the BCP checking profile can still be adjusted to avoid known current interoperability problems in the field, as has been done with the extraction of the core of CSS features when there were a lot of interoperability problems with CSS.

I am posting this idea here and not to the validator list because it is a question of W3C attitudes toward specifications, not just the mechanical maintenance of the validator itself. In an ideal world the W3C would be fully committed to measuring the quality of its impact, and the QA group would get all the same visibility and participation as the TAG. But we aren't there yet.
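To make the '#wai' case concrete before going further: the markup and URL below are my own illustration, not a quotation from the conference pages (those are in the thread cited above). The page carried an anchor of roughly this shape:

    <a name="#wai"> ... </a>

The HTML 4.01 Transitional DTD accepts that, because name on A is declared there as CDATA. But any link that wants to land on that anchor has to carry the '#' twice, once as the fragment delimiter and once inside the fragment itself, something like

    http://www.example.org/program.html#%23wai

(the unescaped form '##wai' is not even a well-formed URI-reference), and authors and readers alike will drop one of the two '#' characters nearly every time. Under a Name-style rule, a fragment token begins with a letter and continues with letters, digits, hyphens, underscores, colons, and periods (the XML Name production is slightly broader, but neither admits '#'), so '#wai' is rejected and the author is steered to the unambiguous

    <a name="wai"> ... </a>

referenced as program.html#wai.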
Somehow we have to get the W3C enterprise to take a more empirical approach to characterizing the level of interoperability actually achieved, and perhaps more importantly what it is that succeeds in interoperating, and not just keep trying to define an emit filter and an accept filter by the same utterance.

The validator technology would be used much more by webmasters if we were publishing a BCP DTD that was a rough consensus estimate of what it is safe to assume will actually interoperate successfully with a variety of browsers (see the P.S. below for a sketch). This is not a normative statement; it is a descriptive consolidation of workarounds for known current problems in the continuity of operation of the web-in-the-large. Or at least I claim that if we went to the WWW, or took a scientific sampling of webmasters, that is the answer that would come back.

To organize an effective campaign to promote the use of orthodox markup, we need something much like an "accept filter" that is on the loose side of the spec. This is a rough envelope of the markup practices present in the content that a typical browser will need to cope with in order for its user not to reject it as useless. We need a catalog of the differences between actual common practice and W3C writ in order to do the necessary triage as to where we should fight and where we should switch, as well as to organize our business case for why they should switch where we choose to fight.

I am not sure that the browser makers would consense that they want to share enough information to build the 'accept' model. That is something that the HTML WG probably has the right participation to answer. But I do suspect that the webmasters of the world would be delighted to share problem experience and work to extract the core of what mostly works into a conservative "emit filter."

I should qualify 'conservative' in that regard. This is what was discussed above: that which is largely safe in the hands of the existing user agents, expressed as a profile of markup utilization. And this has to "eliminate the middleman" of W3C writ. It has to name names as to language constructs and processors. I don't mean that we shouldn't be checking and tracking what the specifications say with regard to trouble incidents. But the binary relationship between content example and processor example is sufficient to get started, and must be preserved somehow in what is available to the customers of this collection.

Al
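P.S. For the flavor of the delta a BCP DTD might carry, strictly as a sketch and not a worked proposal: HTML 4.01 Transitional declares the attribute in question roughly as

    name        CDATA          #IMPLIED  -- named link end --

and a best-current-practice copy of that DTD could tighten the one declaration to

    name        NMTOKEN        #IMPLIED  -- named link end --

NMTOKEN is not exactly the Name production (it tolerates a leading digit, for one), but it is the nearest hook the DTD machinery offers short of ID, and it is enough to make the same validator webmasters already run flag a value like '#wai', without touching the published specification at all.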
Received on Friday, 29 November 2002 10:01:28 UTC