OT: Effective Spec Writing [was RE: Large content size value] from Travis Snoozy (Volt) on 2007-01-02 (ietf-http-wg@w3.org from January to March 2007)

From: Travis Snoozy (Volt) <a-travis@microsoft.com>
Date: Tue, 2 Jan 2007 10:44:36 -0800
To: "Roy T. Fielding" <fielding@gbiv.com>
CC: "ietf-http-wg@w3.org Group" <ietf-http-wg@w3.org>
Message-ID: <86EDC3963F04D546BED8996F77D290F6049D117DE6@NA-EXMSG-C138.redmond.corp.microsoft>

Roy T. Fielding said:

<snip>

> > For reference, the size of the file is 4,368,281,840 bytes, 4GiB is
> > 4,294,967,295 bytes, and the difference is 73,314,545 bytes (the
> > value of Content-Length + 1). The actual GET returns a 501/Not
> > Supported, but the erroneous HEAD reply is still Bad and Wrong.
>
> You should file it with the other protocol errors in IIS.
>

Trust me, I'm trying to figure out how to do that :).

> > Not that it's a surprise; these are the _exact_ problems that I
> > predicted would show up, based solely on what the spec said. Go
> > figure. More digging in more products will very likely uncover
> > similar issues (and not just in Content-Length, but anywhere where
> > 1*DIGIT is present).
>
> That is complete nonsense.  The spec does not say "Fail to use any
> common sense or valid software engineering techniques while reading
> untrusted network input." Nor does it say "Failure to recognize and handle
> integer field values larger than the expected integer size is okay."
>
> Professional software developers are expected to know better and be
> able to use their own judgement. They don't need a standard to tell
> them it is a bug.

This isn't actually my original problem -- I want the spec to tell me how
the error case should be handled, esp. on the client, and esp. in regard to
connection handling. However, I've got more than $0.02 on the topic of
quality spec writing, and I've already drifted this far off-topic onto it,
so forgive me if I indulge a bit in my response :).
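
For concreteness, the numbers in the HEAD reply quoted above are what you
get when a length is squeezed through a 32-bit unsigned integer. Here is a
minimal sketch in Python of that wraparound, plus a range-checked parse of a
1*DIGIT field. The 32-bit width is my assumption about the cause, and the
names (wrap_uint32, parse_content_length, max_value) are hypothetical, not
taken from the spec or from any product.

```python
# Assumption: the length was stored in a 32-bit unsigned integer somewhere.
UINT32_MODULUS = 2**32


def wrap_uint32(value: int) -> int:
    """Return the value as a 32-bit unsigned integer would hold it."""
    return value % UINT32_MODULUS


def parse_content_length(field: str, max_value: int = 2**63 - 1) -> int:
    """Parse a 1*DIGIT field value and reject anything out of range.

    max_value is a hypothetical implementation limit; the spec names none.
    """
    # 1*DIGIT means one or more ASCII digits: no sign, no spaces.
    if not field or any(c not in "0123456789" for c in field):
        raise ValueError("field value is not 1*DIGIT")
    # Drop leading zeros so the digit count reflects the magnitude.
    digits = field.lstrip("0") or "0"
    # Reject absurdly long input before converting it at all.
    if len(digits) > len(str(max_value)):
        raise OverflowError("field value exceeds the supported range")
    value = int(digits)
    if value > max_value:
        raise OverflowError("field value exceeds the supported range")
    return value


file_size = 4_368_281_840
print(wrap_uint32(file_size))             # what a 32-bit counter reports
print(parse_content_length("4368281840")) # what a checked parse returns
```

The point of the second function is only that the out-of-range case is
detected and reported rather than silently truncated; what a client should
then do with the connection is the part I'd like the spec to answer.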


The bar for "professional" software developers is not exactly high (which
leads to a completely different and much lengthier rant). Expecting that
*every person who reads the spec* can make competent interpretations and
judgments at every opportunity the spec gives them (and there are a *lot* of
opportunities) ignores the reality of the spec's end-users.

Let me ask, because I honestly don't know: how many HTTP/1.1 features can we
not use because implementations are just too broken? Are -all- the
implementers in these cases (if they exist) inept or incompetent? Or is the
spec not tight enough? Are these not the same types of issues that
required the HTTP/1.1 respin after HTTP/1.0?

It's not just in this spec, either: examining *any* spec is an excellent way
to determine what implementers are likely to get wrong, and in turn to
figure out how to break them. This is as great a technique for the black
hats as it is for the white: if we can read and re-read the spec until we
can't find anything that could easily be mis-implemented, we've improved
life not only for the implementers (the end-users of the spec), but also for
the ultimate end-users of the protocol.

Finally, there are the "political" and "business" reasons why it has to be
the spec that is overtly uptight. Businesses are interested in getting stuff
out the door fast -- quality too often winds up as an afterthought. In order
to get good, solid implementations out of this situation, a specification
needs to do many things:

1. Be correct. Implementers are blind, and will implement bugs as spec'd.

2. Be consistent. Implementers have not gotten any more sighted since (1).

3. Be understandable. Implementers that don't understand will guess, and
   they tend to guess wrong.

4. Be simple (KISS). Implementers have more places to screw up if either the
   document itself is difficult to read (too many cross-references,
   unlabeled cross-references, baroque verbiage, etc.), or the algorithms
   the document is specifying are "tricky" in some way.

5. Be implementation-oriented. The spec should make it possible for one
   person to read-and-implement the spec in sequence, and for many people to
   be assigned discrete sections to implement in parallel. Requiring that
   everyone read the entire spec (possibly multiple times) to get enough of
   a grasp to start implementing is unrealistic.


If our goal as spec writers is to promote interoperability, we need to take
these constraints into consideration when we're writing the spec; otherwise
we're just ensuring interop on paper at best. IMHO, HTTP/1.1 fails on points
3, 4, and 5 (with problems of types 1 and 2 being about average for a spec
this size), and I think that unduly limits the effectiveness of the spec in
practice.

To conclude, my point is that we shouldn't be writing this spec for
ourselves, for academics, for people who enjoy reading specs, for competent
and well-reasoned engineers, or for people who have infinite time and
patience. It should be written for the people who really have to implement
this -- people who have never seen the spec before, who may be mediocre at
engineering, and who just want to get the implementation over with as soon
as humanly possible (if not sooner).


Thanks,

-- Travis
Received on Tuesday, 2 January 2007 18:45:02 UTC