References and Modularity

Dear TAG,

I sent the message below to the chairs' list a couple days ago. Dan 
pointed out to me that this seemed to be TAG territory, and indeed you 
have been discussing referencing documents.

It's quite conceptually handwavy in its arguments, but hopefully it can 
help start a discussion on how we manage dependencies.


Dear chairs,

as you well know, there have over the years been rather extensive
discussions about the way in which a specification may reference another.

The common lore on this issue is that you can only reference stable
documents, or documents that are at most one degree of maturity behind
your own. In truth, that requirement is not a solid one: Process allows
the Director to decide whether (s)he feels the way a specification
handles references is satisfactory or not. That said, "satisfactory" is
a fuzzy concept and while it remains undefined the natural tendency of
cautious stakeholders will be to reach for the strict interpretation in
order to be on the safe side.

This, however, is quickly and increasingly becoming a problem. As a
strict policy, it was manageable when each group handled a document or
two, where dependencies across them were pretty scarce, and where we
didn't have that many things baking as a community. Nowadays, we have
more documents progressing on the Rec track than the sum total of
Recommendations we've released since the beginning. Most groups have
several (often interlinked) documents, some have literally dozens. And a
lot of these documents tend to be heavily related to one another.

To give an example, I've been going through the references that the HTML
specification has to other documents in order to assess its
seaworthiness in terms of referential stability.

In there, there are 26 references to documents that aren't Recs. Of those:

 - 1 is a PR and 4 are CRs. Those are probably okay.

 - 11 are WDs. Some of those are dead; some of those haven't moved in a
long while; some are active but clearly on a long time frame; and some
are specs that have been announced as "done any minute now" for years.
Some might be done when HTML gets to Rec (in 2014), but all of them?
That's not happening.

 - 4 are EDs or CG drafts that have simply never been published, and 6
are WHATWG specs. Again, these can all be shepherded through Rec track,
but pushing them all to Rec in 18 months might be stretching practicality.
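
As a sanity check on those numbers, here is a quick tally. The counts are the ones listed above; the dictionary keys and variable names are just labels I made up for the sketch.

```python
# Non-Rec references in the HTML specification, by status.
# Counts are taken from the list above; the labels are illustrative only.
non_rec_references = {
    "PR": 1,
    "CR": 4,
    "WD": 11,
    "ED or CG draft": 4,
    "WHATWG": 6,
}

total = sum(non_rec_references.values())

# The statuses described above as "probably okay".
probably_okay = non_rec_references["PR"] + non_rec_references["CR"]

# Everything else would need work under the strict interpretation.
needs_attention = total - probably_okay
```

That gives 26 references in total, of which 5 are probably okay and 21 are not covered by the strict reading of the rule.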

That's just HTML. Many, many other specifications are in the exact same
spot (albeit with smaller numbers). We can multiply exemptions (like the
one we have about referencing HTML itself) but that's just patching
things up clumsily.

So we need a plan. The plan I have is to define what can be considered a
"satisfactory" reference in terms sufficiently clear that we only need
to reach for strictly stable references when it is absolutely required,
and can be predictably flexible in other cases. I don't claim to have a
perfect layout, but I think it's good enough for discussion amongst
smart people (that'd be all y'all).

Note that my approach takes no sides on the living standards debate. My
experience is that different stakeholders benefit from different
approaches, and we should enable them all. LS specs are great if your
browser self-updates every few weeks; they pretty much blow if you're
paying for software upgrades and the standard shifts from beneath your
feet after you've paid (e.g. a lot of AT software is expensive).

Let's look at the details of what can happen when a reference changes.

A change in a referenced specification could be anywhere in the range
from harmless to debilitating.

If you're a Recommendation for an XPointer scheme and the XPointer
Framework kicks the bit bucket, then you're going to have a bad time.
(In such an extreme case though, I'd have to ask how the multiple
implementations criterion was ever satisfied.)

At the other end of the spectrum, to take a concrete example, if you're
the HTML specification, you're referencing Microdata, and Microdata
starts pining for the fjords, then very little of consequence happens.

Why? Because in the common case, the way in which HTML integrates
Microdata is like this (I'm paraphrasing): "If a <meta> element has an
itemprop attribute from [MICRODATA], then it may also appear in body

With Microdata gone, all that we have is a set of conditions in the HTML
specification that can simply never happen. But they don't break
anything. Sure enough, it's dead wood, and we'd like to avoid that, but
interoperability is not affected. We phase out that text in a revision
of the specification, and no one gets hurt. It's not ideal, but if we
take the long view (which we have to) every specification above a
given size is going to turn out to have *some* dead wood over a scale of
a decade or two. We can handle that.

Then there are the greyer area situations.

One is the case in which Microdata doesn't die, but evolves to change
the syntax of the content of the itemprop attribute. Again, in this
case, I would argue that it is not a problem (though more of a problem
than in the death case). As far as HTML is concerned as a specification,
the interface points do not need to change. *If* we have arranged things
so that specification separation does have some form of sensible mapping
to implementation separation, then we'll be fine; if, however, there is
no sane way that one may burn HTML into silicon without also bringing in
the unbaked Microdata then we have a problem that requires some discussion.

Then there is the darker grey case. Microdata doesn't die, and evolves
to change the *name* of the itemprop attribute. This is a pretty bad
scenario, as the interface breaks.
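
To make the three cases a bit more tangible, here is a minimal sketch in Python. It assumes the HTML side only ever looks at the *name* of the attribute; the function name, the dictionary representation of attributes, and the renamed attribute "newprop" are all hypothetical and appear in neither specification.

```python
def meta_allowed_in_body(attributes: dict) -> bool:
    """Hypothetical HTML-side rule: a <meta> element may appear in body
    content if it carries an itemprop attribute. Only the attribute NAME
    is inspected; the value is Microdata's business."""
    return "itemprop" in attributes


# Death case: nothing produces itemprop any more, so the condition simply
# never fires. Dead wood, but nothing breaks.
dead_case = meta_allowed_in_body({"name": "description"})

# Grey case: the syntax of the value changes but the name does not.
# The interface point is untouched and the HTML rule behaves as before.
old_syntax = meta_allowed_in_body({"itemprop": "name"})
new_syntax = meta_allowed_in_body({"itemprop": "some:future:syntax"})

# Darker grey case: the attribute is renamed ("newprop" is a made-up name).
# The HTML rule silently stops matching, i.e. the interface is broken.
renamed = meta_allowed_in_body({"newprop": "name"})
```

The only thing the two sides share here is a single name, which is why a change to that name hurts in a way that a change to the value syntax does not.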

From the above I think we can try to extract a few principles:

 - When there is a hard dependency such that it is meaningless to
consider that Referring Rec could pass sensible CR exit criteria without
Referred Rec also passing them, then the reference must be stable.

 - When there is a modular dependency, then we need to follow good
interface design principles:

   - The interface between the two specifications should have the
smallest possible surface.

   - The interface points must be stable. (In many cases that's just the
names not changing.)

   - Specifications should be split along sensible implementation lines.
(This is a nice-to-have for design, not a strict requirement.)

I'm not claiming that the above is enough to produce a decision tree for
transitioning specifications. But I hope that it can help clarify how
dependencies can be handled in a way that takes into account the reality
that we have far more strongly intermeshed dependencies than we had even
five years ago, without becoming a weasel way of just shipping specs
with no regard for the deployment and interoperability implications.

If it were a decision tree, it might look like this (subject to heavy
refinement), given an unstable reference:

Is it a hard dependency?
   Yes: wait for it
   No:
     Is the interface surface large?
       Yes: red flag
       No: probably okay
     Is the other group aware that it has interfaces that can't change?
       Yes: probably okay
       No: can't progress without their express agreement
     How distinct are implementations of both specifications?
       Very: probably okay
       Not much: (smaller) red flag
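
For those who prefer code to indented text, the same tree can be sketched as a small Python function. This is only an illustration, not a proposed tool. It assumes that the three follow-up questions apply when the dependency is *not* a hard one (i.e. it is a modular dependency, per the principles above), and every class, field, and function name in it is hypothetical.

```python
from dataclasses import dataclass


@dataclass
class UnstableReference:
    # Hypothetical field names; each mirrors one question in the tree.
    hard_dependency: bool
    large_interface_surface: bool
    other_group_aware: bool
    distinct_implementations: bool


def assess(ref: UnstableReference) -> list:
    """Return the verdicts the tree gives for one unstable reference."""
    # A hard dependency short-circuits everything else.
    if ref.hard_dependency:
        return ["wait for it"]

    # Modular dependency: answer each of the three questions in turn.
    verdicts = []
    verdicts.append(
        "red flag" if ref.large_interface_surface else "probably okay"
    )
    verdicts.append(
        "probably okay"
        if ref.other_group_aware
        else "can't progress without their express agreement"
    )
    verdicts.append(
        "probably okay" if ref.distinct_implementations else "(smaller) red flag"
    )
    return verdicts
```

For instance, `assess(UnstableReference(False, False, True, True))` yields three "probably okay" verdicts, whereas any reference with `hard_dependency=True` yields just "wait for it" no matter how the other questions are answered.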

Put differently, it's basically evaluating specs for modularity and
against the test of independent invention, allowing for looser
references the higher they rate against those principles. It's
imperfect, but you get the idea.

If we had something like the above agreed upon, I reckon that editors
and chairs could more easily make the judgement call of what referential
stability each reference in their specifications calls for.
They could then go into transition calls with their arguments prepared
on each unstable reference.


There is a strong tension, if not a strict incompatibility, between
modularity and stable references. Modularity requires some degree of
dynamic linking, or of independent layering. Stable references push in a
different direction. I think that we need flexible rules based on
expectation of good design, even if it means longer, harder transition

Robin Berjon - @robinberjon

Received on Wednesday, 29 May 2013 10:02:15 UTC