- From: <jgraham@opera.com>
- Date: Tue, 26 May 2009 20:01:09 +0000
- To: Larry Masinter <masinter@adobe.com>
- Cc: HTML WG <public-html@w3.org>
Larry Masinter wrote:
> The document is currently over 930 pages when printed
> "letter" size. The first complaint I get from implementors
> wanting to review the specification is that it is
> unreviewable: too long, too complex, too difficult to
> review individual sections, too difficult to find
> the definition of terms, or where things are used.

Who are these implementors? What are they (planning on) implementing? Is it possible to encourage them to speak for themselves? If there are really significant issues with the document they should, of course, be fixed, but not at the expense of making the document incomplete.

> My calling out this section was part of the review of the
> use of pseudo-code algorithmic specifications. It is
> well known that it is difficult to verify whether an
> algorithm produces expected results,

Can you provide pointers to back this up, please?

> and even more
> difficult to determine whether two algorithms produce
> equivalent results, which someone wishing to test
> conformance would have to do.

My understanding is that conformance is typically ascertained by running testsuites rather than by attempting formal proofs of the equivalence between two algorithms. So the difficulty associated with doing the latter seems inconsequential to the current discussion. Admittedly there is some burden on a developer who wants to achieve reasonable certainty that their implementation has the same behavior as the spec text. However, this difficulty exists in any scheme where the description in the spec does not translate directly into production code. I am far from convinced that an initially algorithmic style makes this more of a problem.

> Expressing normative
> requirements in terms of sets of constraints which the
> results of the implementation must satisfy is far
> preferable from the point of view of validation,
> testing, and document review.

Do you have some evidence to back that up?
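
To illustrate the testsuite point, here is a minimal sketch in Python. Everything in it is hypothetical and not taken from any spec: a toy "collapse whitespace" behaviour, one implementation that follows a step-by-step description literally, one that is written idiomatically, and a shared list of cases that both are run against. Passing the cases shows agreement on those inputs only; it is not a proof of equivalence, which is exactly why no such proof is needed to test conformance.

```python
def collapse_stepwise(text):
    """Literal, step-by-step style: walk the input one character at a time."""
    out = []
    pending_space = False
    for ch in text:
        if ch.isspace():
            # Remember a separator only if some output has already been emitted,
            # so leading whitespace is dropped.
            pending_space = bool(out)
        else:
            if pending_space:
                out.append(" ")
                pending_space = False
            out.append(ch)
    # A separator still pending here is trailing whitespace and is discarded.
    return "".join(out)


def collapse_idiomatic(text):
    """Production-style implementation that looks nothing like the steps above."""
    return " ".join(text.split())


# Hypothetical shared testsuite: (input, expected output) pairs.
CASES = [
    ("", ""),
    ("abc", "abc"),
    ("  a   b  ", "a b"),
    ("a\n\tb", "a b"),
    ("   ", ""),
]


def conforms(implementation):
    """True if the implementation gives the expected output for every case."""
    return all(implementation(given) == expected for given, expected in CASES)


if __name__ == "__main__":
    for impl in (collapse_stepwise, collapse_idiomatic):
        print(impl.__name__, "passes" if conforms(impl) else "FAILS")
```

The same CASES list can be reused unchanged for any further implementation, whatever its internal structure.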

My experience is that an algorithmic style of specification makes it rather easy to determine what the expected behaviour is. It has worked well for me when implementing various parts of the HTML 5 specification (parsing, table structure + headers, outline, microdata), although I was generally not interested in producing an optimal implementation. It has also worked well for me when QAing implementations of other specifications that use a similar algorithmic style, e.g. ECMAScript (as an aside, I will note that the presentation of the algorithms can, of course, make a big difference; the switch from a goto-based style in ECMAScript 3 to a loop-based style in ECMAScript 5 significantly improved the readability of the spec). Indeed, for many things I have no idea how one would convey the equivalent semantics in a non-algorithmic style (it is worth noting that informative text specifying the intended output of the algorithm is often helpful; maybe more of this would address your primary concern?).

> If we are concerned about whether the document can be
> reviewed, then an algorithmic normative section that
> is also lengthy is even more egregious. Being
> "precise" in this way may be counter-productive,
> if no one is really capable of evaluating the
> precision.

The idea that no one can evaluate these sections is demonstrably false; there already exist implementations of most of the algorithms in HTML 5. For example, there is an implementation of the outline algorithm at [1]. The source code [2] contains extensive comments taken from the spec text. In writing the program the author was able to review the spec (e.g. [3]). Since there is a requirement that the WG produce testcases, we will produce testcases that verify the implementation matches the spec. Such an implementation is helpful for other people looking to understand the spec.
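
As a rough illustration of how an algorithmic description turns into code that can be checked with testcases, here is a deliberately simplified Python sketch. It is not the HTML 5 outline algorithm and not the code at [2]: it ignores sectioning elements entirely and only nests a flat list of headings by rank. The names Section, build_outline and as_nested are made up for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Section:
    title: str
    rank: int  # 1 for h1 ... 6 for h6; 0 is reserved for the document root
    children: list = field(default_factory=list)


def build_outline(headings):
    """Nest (rank, title) pairs so that each heading becomes a child of the
    nearest preceding heading with a smaller rank number."""
    root = Section(title="(document)", rank=0)
    stack = [root]  # currently open sections, outermost first
    for rank, title in headings:
        # Step 1: close every open section whose rank number is greater than
        # or equal to the new heading's. The root (rank 0) is never closed.
        while stack[-1].rank >= rank:
            stack.pop()
        # Step 2: open a new section inside the innermost remaining one.
        section = Section(title=title, rank=rank)
        stack[-1].children.append(section)
        stack.append(section)
    return root.children


def as_nested(sections):
    """Convert sections to plain (title, children) tuples for easy comparison."""
    return [(s.title, as_nested(s.children)) for s in sections]


if __name__ == "__main__":
    headings = [(1, "A"), (2, "B"), (3, "C"), (2, "D"), (1, "E")]
    print(as_nested(build_outline(headings)))
```

A testcase for such a function is just an input list of headings paired with the expected nested structure, so the same cases can be run against any other implementation of the same steps.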

Of course there are other ways to review the spec than by implementation, but for some parts testcases and implementations are the only reasonable approach. The parsing section is an example of this; it needs to match what implementations are prepared to ship. The only way to determine if it actually does match what implementations are prepared to ship is to implement it in mass-market implementations and see where it fails on the existing corpus of web documents.

> While there may be other applications which want a common definition
> of "outline", those other applications
> have their own requirements, ways of determining
> conformance, and constraints which this working group
> is not in a position to review. There is no way of
> determining conformance, for example. There are no
> requirements for "outline" against which this particular
> outline algorithm can be reviewed.

I really don't understand this position. The outline is a property of the document; its logical structure. Explicitly marking such structure in documents is commonplace; the HTML 5 spec itself is a good example. Saying that HTML shouldn't define how the elements of the document form a logical structure is rather like saying that it shouldn't define the meaning of <em> because different consumers might want to interpret it differently. The utility of outline tools is in their ability to present that logical structure in a way that is useful to the end user, e.g. as a table of contents, as a sidebar, or as a set of position-dependent navigation commands in a voice browser. Regardless of the specific presentational form chosen by a given tool, the underlying document structure is something that all tools should agree on. That is what this section defines.

> In any case, if the "outline" specification has
> no application within HTML itself, then it can
> be put in a separate document and processed
> independently.

I find the concept of defining the core semantics of HTML in a document outside HTML itself to be utterly bizarre.

[1] http://gsnedders.html5.org/outliner/
[2] http://hg.gsnedders.com/anolis/file/b6d93515d41e/anolislib/processes/outliner.py
[3] http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2008-June/015083.html
Received on Tuesday, 26 May 2009 20:01:51 UTC