Re: "Outline" algorithm (document length and complexity)

On Wed, May 27, 2009 at 3:10 AM, Larry Masinter <masinter@adobe.com> wrote:

> The document is currently over 930 pages when printed
> "letter" size. The first complaint I get from implementors
> wanting to review the specification is that it is
> unreviewable: too long, too complex,


I agree it's unfortunately long and complex, but I think that's because what
it describes (the behaviour of HTML) is large and complex. As an implementor
I'm actually amazed by how concisely some extremely complex behaviours have
been described. It would be great if the spec was shorter, but I see no
reason to believe it can be made significantly shorter without sacrificing
completeness (or reducing its scope).

> too difficult to
> review individual sections,
>

Identifying common concepts and reusing them across the spec is one of the
spec's great contributions. If you only want to read one section it may be
an impediment, but if you read (or implement) the whole thing, it's a great
boon.

> too difficult to find
> the definition of terms, or where things are used.


Most terms link to their definitions, and right-clicking on the term shows
you where it's used. This is much better than most other specs I've seen.

> My calling out this section was part of the review of the
> use of pseudo-code algorithmic specifications. It is
> well known that it is difficult to verify whether an
> algorithm produces expected results, and even more
> difficult to determine whether two algorithms produce
> equivalent results, which someone wishing to test
> conformance would have to do.


Verifying that your code matches the spec's algorithm can be relatively easy
if you're able to have your code closely resemble the spec's algorithm.
Otherwise, verifying that your code matches the spec's algorithm and
verifying that your code satisfies certain declarative properties are about
the same level of difficulty, and verifying that the spec's algorithm
satisfies certain declarative properties is a bit easier.

> Expressing normative
> requirements in terms of sets of constraints which the
> results of the implementation must satisfy is far
> preferable from the point of view of validation,
> testing, and document review.


It can be, if the set of constraints is smaller than the definition of the
algorithm. In other situations declarative constraints can be equivalent, e.g.
if you end up just describing an algorithm using declarative constraints, or
even worse, if they describe an algorithm in an obscure way. For example,
even in programming language theory, where there is an abundance of
mathematical sophistication, you see a lot of use of operational semantics
and even definitional interpreters.
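
To illustrate what a definitional interpreter looks like, here is a minimal
sketch in Python for a made-up arithmetic language. The names Num, Add, Mul
and evaluate are hypothetical and chosen only for this example; the point is
that the evaluator itself acts as the definition of what each form means.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Num:
    value: int


@dataclass(frozen=True)
class Add:
    left: "Expr"
    right: "Expr"


@dataclass(frozen=True)
class Mul:
    left: "Expr"
    right: "Expr"


# Hypothetical toy language: integers, addition and multiplication only.
Expr = Union[Num, Add, Mul]


def evaluate(expr: Expr) -> int:
    """Give the meaning of an expression by running it."""
    if isinstance(expr, Num):
        return expr.value
    if isinstance(expr, Add):
        return evaluate(expr.left) + evaluate(expr.right)
    if isinstance(expr, Mul):
        return evaluate(expr.left) * evaluate(expr.right)
    raise TypeError(f"unknown expression: {expr!r}")
```

A purely declarative definition of the same language would have to say the
same three things, one per form, so it would not come out any shorter.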

I think it would be great if we could identify more concise and declarative
ways to write parts of the specification that don't sacrifice completeness
or precision. For example, I've argued with Ian that the storage mutex can
be eliminated in favour of a general serializability requirement. But you
can't just assume that there must be a better way to do it across the board
and therefore the current text is a big mistake.

I actually think that in some places, where important declarative properties
emerge from an algorithm in the spec in a subtle way, the spec should state
both the properties and the algorithm. That would make it clear that a
violation of the properties is a bug in the spec, not a behaviour loophole
that should be implemented or exploited. That would make the spec even
longer, but again, the spec should be as short as possible and no shorter.
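
As a rough sketch of stating both, consider a toy heading-nesting algorithm.
This is NOT the spec's outline algorithm; build_outline, flatten and
satisfies_properties are hypothetical names, and the two properties are ones
I picked for illustration. The algorithm says how to build the result; the
properties say what must hold for any result, so a violation would point to
a bug in the algorithm.

```python
from typing import List, Tuple

# A node is (rank, title, children); rank 1 is the most important heading.
Node = Tuple[int, str, list]


def build_outline(headings: List[Tuple[int, str]]) -> List[Node]:
    """Algorithmic definition: nest each heading under the nearest earlier
    heading that has a strictly smaller rank number."""
    roots: List[Node] = []
    stack: List[Node] = []  # chain of currently open sections
    for rank, title in headings:
        node: Node = (rank, title, [])
        # Close every open section that does not outrank this heading.
        while stack and stack[-1][0] >= rank:
            stack.pop()
        (stack[-1][2] if stack else roots).append(node)
        stack.append(node)
    return roots


def flatten(nodes: List[Node]) -> List[Tuple[int, str]]:
    """Preorder walk of the outline, returning (rank, title) pairs."""
    out = []
    for rank, title, children in nodes:
        out.append((rank, title))
        out.extend(flatten(children))
    return out


def satisfies_properties(headings, outline) -> bool:
    """Declarative properties, stated independently of the algorithm:
    1. a preorder walk of the outline yields the headings in document order;
    2. every child has a strictly larger rank number than its parent."""
    def children_deeper(nodes) -> bool:
        return all(
            all(child[0] > rank for child in children)
            and children_deeper(children)
            for rank, _title, children in nodes
        )
    return flatten(outline) == list(headings) and children_deeper(outline)
```

A test suite can then run build_outline on arbitrary heading sequences and
check satisfies_properties on each result, without caring how a particular
implementation arrives at the outline.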

Rob
-- 
"He was pierced for our transgressions, he was crushed for our iniquities;
the punishment that brought us peace was upon him, and by his wounds we are
healed. We all, like sheep, have gone astray, each of us has turned to his
own way; and the LORD has laid on him the iniquity of us all." [Isaiah
53:5-6]

Received on Tuesday, 26 May 2009 22:05:01 UTC