Re: DTD Confusion from James J. Hunt on 2001-02-05 (ietf-dav-versioning@w3.org from January to March 2001)

From: James J. Hunt <jjh@allerton.de>
Date: Mon, 05 Feb 2001 23:59:34 +0000
To: "Geoffrey M. Clemm" <geoffrey.clemm@rational.com>
CC: ietf-dav-versioning@w3.org
Message-ID: <3A7F3E66.8B7AAEA2@allerton.de>
Dear Goeff,


>
>    The tendency of the
>    authors of the DeltaV document to write ANY in every place that one
>    would like to be able to the protocol is both unnecessary and unhelpful.
>
> The purpose of ANY in the protocol usually indicates that there is
> no DTD construct that usefully describes the syntactic constraints
> on what elements can appear, and therefore those constraints are
> described in English text.

ANY is needed for the case when the possible responses are not known in
advance.  This is the case with the content of dead properties.

>
>    In particular, a DTD is similar to a BNF for a language: a parser need
>    not use the BNF directly to parse the language, but It still gives
>    implementers a way of precisely describing the language.
>
> DTD's provide a very limited vocabulary for describing the syntax.
> In the case of the versioning protocol, we mostly need the ability
> to say "at most one of each of the following element types can
> appear, in any order".  Unfortunately, there is no DTD construct
> to express this, so we use (very) simple English text to describe
> this constraint.

As long as the number of such options are finite, they can be described in
DTD.  One must list all possible combinations.  It may not be pretty, but it
does work.  Of course, if you mix such options with elements that can appear
without limit on their number, one can not express it precisely with a DTD,
but then it is not a context free grammar.  You can not expect any reasonable
grammar to describe such a mixture exactly.  The other reasonable alternative
is to specify the grammar without limit on occurrence, then define the
behavior so that only the first (or last) will have effect.

Example

<!ELEMENT foo ((A | B | C | D)*)>

With the description:  only at most one occurrence of  each of B and D are
used; subsequent occurrence are to be ignored.  This is still more useful than

<!ELEMENT foo ANY>

which tells me absolutely nothing about what foo may contain.

>
>    Of course, this
>    would be contrary to the spirit of this protocol.  However, validation
>    does not work this way.  If a client sends a message based on a new DTD,
>    it causes no problem.  The client must send a copy or a reference of the
>    new DTD with its message.  The server has two choices: it may fetch the
>    new DTD and validate against that DTD, or the server may simply check
>    for well-formedness. In both cases, the server need not and should not
>    reject the request, so long as the request matches the given DTD.  If it
>    does not match, however, something is in fact wrong with the request.=20
>    Even in this case, the server is still permitted to recover from the
>    error.  The same applies in reverse for responses from the client.
>
> Yes, and this illustrates the very limited value of including a DTD
> (or a reference to a DTD) in a message.  If the message is valid (wrt
> the DTD), the DTD has no effect on how the message is processed.  If
> the message isn't something the receiver can process, it will be
> rejected whether it is valid or not.  So the only case where a DTD
> matters is where the client can process the message, but it doesn't
> match the DTD that was sent.  As you state above, the recipient may
> well want to process the message anyway in this case, and ignore the
> fact that the message didn't match the DTD that it came with.
>

When a message fails, it is helpful for debugging to know whether a valid
message failed because it was not understood, or if a tag was present that is
proprietary.   This can help when tracing interoperability problems between a
client and server from different vendors.

>
> Because of this, our implementations will not send DTD's (or references
> to DTD's) with our messages, nor will we process any DTD's (or references
> to DTD's) that are included.  What matters to us is whether we can
> process the message, not whether it matches some DTD.
>

This does not speak well for interoperability.

>
>    We opine that a client or a server should be able to send messages that
>    conform to the DTD that it uses.   This does not inhibit communication
>    with peers that use another version of the protocol.
>
> I agree, as long as we do not require that a DTD (or a reference to a DTD)
> be sent, and we do not require that a DTD (or a reference to a DTD) be
> processed by the receiver of the message.
>

I opine that the DTD helps one reason about the protocol.  It collects
everything in one place, and helps eliminate duplications (like using the tags
add and remove for labels, when they are already define differently for
propertyupdate).  The larger the protocol become, the more important it is.

You obviously disagree, but I think a defined grammar also makes it easier to
debug interoperability problems.

>
>    Each new version
>    must have its own DTD, but this should be a good basis for understanding
>    what the protocol actually is in any given version and how it changes
>    from one version to the next.  A well designed DTD should also help
>    insure that new version are, in fact, upwardly compatible.
>
> The fact that new variants are syntactically compatible with old
> variants is rather trivial to verify, with or without the use of
> DTD's.  The hard part is making sure that the semantics is upwardly
> compatible.  But in any case, DTD's do not help when they do not have
> the capability of describing the syntactic constraints of a particular
> type of message (as is the case for the versioning messages).

This is only true if you can determine if a (foreign) client or server
actually implements some interface.

>
> Cheers,
> Geoff

A better description language than DTD is certainly desirable, but DTD is not
so useless that it is not worth using.

Sincerely,
James
Received on Monday, 5 February 2001 17:59:42 UTC