[SpecLite] Managing Normative References from Bjoern Hoehrmann on 2004-07-13 (www-qa@w3.org from July 2004)

From: Bjoern Hoehrmann <derhoermi@gmx.net>
Date: Tue, 13 Jul 2004 08:55:11 +0200
To: www-qa@w3.org
Message-ID: <41aa64e7.1905102963@smtp.bjoern.hoehrmann.de>
Hi,

  Something that seems to come up quite often in Working Groups is how
to normatively reference other specifications and how the reference
affects the specification, in particular what happens if the referenced
specification or parts thereof are changed, updated, obsoleted,
superseded, rescinded, replaced, etc.

For example, a specification states

  "The value of the attribute is a URI as in [RFC2396]".

What does that mean for e.g. these examples:

  * ...="http://www.example.org/#foo"
  * ...="http://[3ffe:2a00:100:7031::1]/"
  * ...="http://666.666.666.666/"
  * ...="foo"
  * ...="http://www.example.org/~björn"

Are these currently legal? Will this change once RFC2396bis "obsoletes"
RFC 2396? I would say that the first example is illegal, as it uses a
URI Reference as opposed to only a URI; in my opinion, RFC 2396 clearly
distinguishes between those constructs. Others sometimes disagree about
that. Whoever is right, it would have been much better if the
specification said

  "The value of the attribute is a URI Reference
   as defined in section 4 of [RFC2396]".

as that would not allow any argument about it. The second example is
a bit tricky: RFC 2396 does not include support for IPv6 literals. The
syntax was introduced in RFC 2732, which does *not* update RFC 2396,
even though it is commonly referred to as doing so, e.g. XML 1.0 Third
Edition states

[...]
  Definition: The SystemLiteral is called the entity's system
  identifier. It is meant to be converted to a URI reference (as defined
  in [IETF RFC 2396], updated by [IETF RFC 2732]), as part of the
  process of dereferencing it to obtain input for the XML processor to
  construct the entity's replacement text.
[...]

This is also where things get even trickier than above. The definition
clearly refers to URI References, so

  <!DOCTYPE example PUBLIC "..." "http://www.example.org/#foo">
  <example/>

would be allowed as "http://www.example.org/#foo" is a legal URI
Reference. But it is not; the specification points out,

[...]
  It is an error for a fragment identifier (beginning
  with a # character) to be part of a system identifier.
[...]

Well, I complained about this misuse of terminology and the XML Core WG
told me [1] that using the term "URI Reference" is necessary here as it
allows absolute and relative references, which, I conclude, the term
"URI" does not. So it seems that according to their interpretation the
text proposed first would exclude the fourth example as it uses a
relative identifier.
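
To make the distinctions above concrete, here is a minimal sketch in
Python that splits each of the five example values with the regular
expression given in Appendix B of RFC 2396 and reports the features the
argument turns on. The function name describe() and the keys it returns
are made up for illustration. Note that the regular expression only
splits a value into components; it does not decide whether the value is
legal under any particular reading of "URI".

```python
import re

# Component-splitting regular expression from RFC 2396, Appendix B.
# It matches any string; it is a splitter, not a validator.
URI_REFERENCE = re.compile(
    r"^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?"
)


def describe(value: str) -> dict:
    """Report the features of a value that the discussion turns on."""
    m = URI_REFERENCE.match(value)
    scheme, authority, fragment = m.group(2), m.group(4), m.group(9)
    return {
        # No scheme means the value is a relative reference.
        "relative": scheme is None,
        # A "#..." part makes it a URI Reference rather than a bare URI.
        "fragment": fragment is not None,
        # Bracketed hosts are the RFC 2732 IPv6 literal syntax
        # (simplified check: ignores any userinfo before the host).
        "ipv6_literal": authority is not None and authority.startswith("["),
        # Characters outside US-ASCII are not part of the URI syntax.
        "non_ascii": any(ord(ch) > 127 for ch in value),
    }


EXAMPLES = [
    "http://www.example.org/#foo",
    "http://[3ffe:2a00:100:7031::1]/",
    "http://666.666.666.666/",
    "foo",
    "http://www.example.org/~björn",
]

for value in EXAMPLES:
    print(value, describe(value))
```

A specification that spells out which of these four properties it
permits leaves far less room for the disagreements described here than
one that just says "a URI as in [RFC2396]".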

And actually I was a bit too fast in considering the DOCTYPE example
above as not allowed. It is allowed: the document is, if the processor
is able to locate the DTD either by using the public identifier or by
recovering from the erroneous system identifier, both well-formed and
valid. And yet it has errors. That's an "interesting" problem for
authors of conformance tools: they would have to write software that
says

  The document is well-formed.
  The document is valid.

  Error: ...

which would likely confuse users...
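
A rough sketch of what such a tool would have to model: errors that
are tracked separately from well-formedness and validity. Everything
here (the Report class, check_system_identifier(), render()) is
hypothetical, and the well_formed and valid flags are assumed to come
from a real XML processor, which this sketch does not include.

```python
from dataclasses import dataclass, field


@dataclass
class Report:
    well_formed: bool   # assumed to be supplied by a real XML processor
    valid: bool         # likewise
    errors: list[str] = field(default_factory=list)


def check_system_identifier(system_id: str) -> list[str]:
    """Flag a fragment identifier in a system identifier as an error."""
    if "#" in system_id:
        return ["system identifier contains a fragment identifier"]
    return []


def render(report: Report) -> str:
    """Format the report the way the message above describes it."""
    lines = []
    if report.well_formed:
        lines.append("The document is well-formed.")
    else:
        lines.append("The document is not well-formed.")
    if report.valid:
        lines.append("The document is valid.")
    else:
        lines.append("The document is not valid.")
    if report.errors:
        lines.append("")
        lines.extend(f"Error: {message}" for message in report.errors)
    return "\n".join(lines)


report = Report(
    well_formed=True,
    valid=True,
    errors=check_system_identifier("http://www.example.org/#foo"),
)
print(render(report))
```

The point of keeping three separate fields is that none of them implies
the others, which is exactly what makes the output hard to explain to
users.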

The third example is also tricky: the grammar of RFC 2396 allows it, but
I am not sure how it is supposed to be implemented. Anyway, a Validator
would probably not consider the document valid regardless of this issue.
Until RFC2396bis joins the scene, which prohibits this syntax. What
would that mean? Would my legal content become non-conforming? Would my
implementation that supports the syntax become non-conforming? Or would
it be necessary that the specification gets updated to consider RFC
2396bis?
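
The effect of swapping the referenced grammar can be sketched as
follows. The loose check follows the IPv4address production of RFC
2396, which asks only for four runs of digits separated by dots. The
strict check is an assumption standing in for a successor that also
limits each part to the range 0..255; it is not copied from any
particular draft, and both function names are invented for this
example.

```python
import re

# IPv4address as in RFC 2396: 1*digit "." 1*digit "." 1*digit "." 1*digit
LOOSE_IPV4 = re.compile(r"[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+")


def ipv4_ok_loose(host: str) -> bool:
    """Accept any four dot-separated runs of digits."""
    return LOOSE_IPV4.fullmatch(host) is not None


def ipv4_ok_strict(host: str) -> bool:
    """Assumed stricter successor rule: each part must also be <= 255."""
    return ipv4_ok_loose(host) and all(
        int(part) <= 255 for part in host.split(".")
    )


for host in ["192.0.2.1", "666.666.666.666"]:
    print(host, ipv4_ok_loose(host), ipv4_ok_strict(host))
```

Under these assumptions "666.666.666.666" passes the first check and
fails the second, so the same content changes status depending on which
document the reference is taken to mean.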

But let us not drift too much from the original issue. So far we have
seen examples of unclear specification text and disagreement about the
interpretation of common terminology. Another problem arises when the
specification not only makes unclear references, but also duplicates
the content of other specifications.

An example would again be the XML 1.0 Recommendation. The "Name"
production that is used to define legal syntax for e.g. element names is
defined in terms of a concrete list of Unicode code points, Unicode 2.0
at the time of publication of the Recommendation. The XML 1.1
Recommendation states:

[...]
  Characters not present in Unicode 2.0 may already be used in XML 1.0
  character data. However, they are not allowed in XML names such as
  element type names, attribute names, enumerated attribute values,
  processing instruction targets, and so on. In addition, some
  characters that should have been permitted in XML names were not, due
  to oversights and inconsistencies in Unicode 2.0.

  The overall philosophy of names has changed since XML 1.0. Whereas XML
  1.0 provided a rigid definition of names, wherein everything that was
  not permitted was forbidden, XML 1.1 names are designed so that
  everything that is not forbidden (for a specific reason) is permitted.
  Since Unicode will continue to grow past version 4.0, further changes
  to XML can be avoided by allowing almost any character, including
  those not yet assigned, in names.
[...]
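
The "everything that is not forbidden is permitted" approach in the
quoted text can be illustrated with the NameStartChar production of XML
1.1, which is a short list of broad code point ranges rather than an
enumeration of characters known to one Unicode version. The sketch
below copies those ranges; the function name is_name_start_char() is
made up for this example, and it covers only the first character of a
name.

```python
# NameStartChar ranges as listed in the XML 1.1 Recommendation.
NAME_START_RANGES = [
    (0x3A, 0x3A),        # ":"
    (0x41, 0x5A),        # A-Z
    (0x5F, 0x5F),        # "_"
    (0x61, 0x7A),        # a-z
    (0xC0, 0xD6),
    (0xD8, 0xF6),
    (0xF8, 0x2FF),
    (0x370, 0x37D),
    (0x37F, 0x1FFF),
    (0x200C, 0x200D),
    (0x2070, 0x218F),
    (0x2C00, 0x2FEF),
    (0x3001, 0xD7FF),
    (0xF900, 0xFDCF),
    (0xFDF0, 0xFFFD),
    (0x10000, 0xEFFFF),
]


def is_name_start_char(ch: str) -> bool:
    """True if the single character ch may start an XML 1.1 name."""
    code = ord(ch)
    return any(low <= code <= high for low, high in NAME_START_RANGES)


# U+1200 ETHIOPIC SYLLABLE HA was added to Unicode after version 2.0.
for ch in ["a", "\u1200", "1", "-"]:
    print(f"U+{ord(ch):04X}", is_name_start_char(ch))
```

Because the ranges are defined by code point rather than by what a
given Unicode version had assigned, a later Unicode release does not
require a change to this table.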

Despite its name, XML 1.0 was not designed with extensibility in mind;
if it had been, XML 1.1 would probably not exist, as the benefit would
likely be considered too small compared to the cost of having two XML
standards, which is another problem.

  http://www.w3.org/mid/48457920-A9B9-11D8-9806-000A95BD86C0@bea.com

lists some of the problems that emerged due to XML 1.1. A simpler
problem would be just whether

  <?xml version="1.1" encoding="utf-8"?>
  <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
      "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
  <html xmlns="http://www.w3.org/1999/xhtml">
  <head>
  <title></title>
  </head>
  <body>
  <p>...</p>
  </body>
  </html>

is a strictly conforming document. The XHTML 1.0 Second Edition
Recommendation refers to XML 1.0 many times, but section 3.1.1 only
states

[...]
  A Strictly Conforming XHTML Document is an XML document that requires
  only the facilities described as mandatory in this specification. Such
  a document must meet all of the following criteria:
[...]

Is an XML 1.1 document an "XML document"? And is the XML 1.0
Recommendation a normative reference of the XHTML 1.0 Second Edition
Recommendation? Appendix E of XHTML 1.0 SE is "informative", so it seems
that it is not. The HTML Working Group however assured me that the lack
of normative references is an error. What the normative references are
they still have to say...

What a mess.

For references to Unicode and ISO 10646 the I18N Core WG tries to clean
things up a little and provides

  http://www.w3.org/TR/2004/WD-charmod-20040225/#sec-RefUnicode

detailed information for editors on how to reference those documents,
just like Unicode provides

  http://www.unicode.org/unicode/standard/versions/#Citations

similar material. That's good. I want more of that. Hmm, it seems that
SpecLite only has

  http://www.w3.org/TR/qaframe-spec/#reference

[...]
  B.3 Make a list of normative (and non-normative) references

  Good Practice: Start now and keep adding to it as you go.
[...]

That's a bit insufficient... Say I am an editor; how do I

  * make a reference so that updates do affect my specification
  * make a reference so that updates do not affect my specification

When should I choose which option? Are there differences between the
various specification production facilities (IETF, W3C, ISO, etc.) that
one should be aware of? I should be aware that normative references
might always turn into a point of extensibility in my specification.
I should also know what happens e.g. if my specification references
spec X and spec Y replaces X but does not include all the features
spec X included: what does that mean for my specification? And such.
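
One way to picture the two options is as a flag on the reference
itself. The following is only a sketch: the NormativeReference class,
the follow_successors flag, and the SUCCESSORS table are all invented
for illustration, and no standards body publishes its "obsoletes"
relations in this form.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class NormativeReference:
    document: str                    # e.g. "RFC2396"
    section: Optional[str] = None    # e.g. "4", to pin down the construct
    follow_successors: bool = False  # do updates affect my specification?


# Hypothetical table of which document replaces which.
SUCCESSORS = {"RFC2396": "RFC2396bis"}


def effective_document(ref: NormativeReference,
                       successors: dict = SUCCESSORS) -> str:
    """Return the document the reference means under its stated policy."""
    doc = ref.document
    if not ref.follow_successors:
        return doc                   # fixed reference: updates are ignored
    seen = {doc}
    while doc in successors and successors[doc] not in seen:
        doc = successors[doc]        # floating reference: follow the chain
        seen.add(doc)
    return doc


fixed = NormativeReference("RFC2396", section="4")
floating = NormativeReference("RFC2396", section="4", follow_successors=True)
print(effective_document(fixed))
print(effective_document(floating))
```

The sketch also shows the open problem: the section number "4" is only
meaningful for the original document, so a floating reference has no
guarantee that the successor still contains the referenced feature.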

It also seems to be good practice to avoid duplication of normative
content. If I state in my specification that feature X is defined in
some other specification and works like $how_it_works, and the other
specification is changed so that $how_it_works and that specification
contradict each other, it is not clear how it works for my
specification. So I should either state that the normative definition
is in that reference and $how_it_works is informative, or the other way
round.

And going back to the original RFC 2396 example, it might be best to
choose verbosity and state e.g.

  "The value of the attribute is a relative or absolute URI
   Reference with an optional fragment identifier as defined
   in section 4 of [RFC2396] as updated by [RFC2732]."
 
possibly stating something about updates like

  "... or their successor(s)"

or like

  "Implementations also MAY conform to their successor(s)"

and such. I would like "QA Specification Guidelines" to discuss such
things to a reasonable extent; please make that happen :-)

[1] http://lists.w3.org/Archives/Public/xml-editor/2001JulSep/0012.html

regards.
Received on Tuesday, 13 July 2004 02:56:05 UTC