Re: References and Modularity from Marcos Caceres on 2013-06-15 (www-tag@w3.org from June 2013)

From: Marcos Caceres <w3c@marcosc.com>
Date: Sat, 15 Jun 2013 19:51:33 +0100
To: Larry Masinter <masinter@adobe.com>
Cc: "www-tag@w3.org" <www-tag@w3.org>
Message-ID: <A33B7A1EC97D4229AB9EE51773BBA37E@marcosc.com>
On Wednesday, June 12, 2013 at 10:31 PM, Larry Masinter wrote:

> Sorry this is long. I'd rather have a conversation about this.
>  
> If A --normative reference -> B and A and B are both 'living standards' being
> maintained by the same organization, then it's fine, an undated
> reference to B is no problem, because presumably updates to B are
> verified to not cause problems with the reference from A.  
>  
> So that's fine.
>  
> However, when the groups working on A and B aren't necessarily
> coordinating, every update to B can potentially change the  
> conformance requirements to implement A, but the watchers
> of A may not notice changes to B.  
>  
> So I think it's a matter of trust and coordination.

I agree.

> > I agree - editors are actually last in my chain of cares. My interest is that
> > implementers, reviewers, and users are always looking at the most stable/up-
> > to-date references.
>  
>  
>  
> We're trying to optimize the needs of end users first, by paving the
> way for those who deploy (install, service, support) the components of
> the web (browsers, servers, sites, search engines, data gatherers) and  
> who use tools, libraries, and other elements to create those.

Sure, but end users generally don't read specs, so specs are written for implementers and, hopefully, developers.
>  
> As part of that, we're creating a documented agreement of what is
> shared in common between all of the various communities of interest.
> The documented agreement is of concern to all of the community,
> with organizations like W3C and IETF and ECMA providing venues
> for discussion and consensus-building through the whole community
> by allowing representatives of the various constituencies to speak,
> propose alternatives, and to propose and effect changes that are
> important for the overall functioning.

Yes, this is a good summary of the standardization process.
>  
> Reviewers are like your QE department -- if they're going to test your software,
> you need to make sure you don't introduce new bugs when you try to fix
> old bugs, which requires a controlled "fork" where bug-fixes get made
> in the stable fork and new features get added to the "up to date" spec.
>  
> I'm having trouble understanding why this is controversial.
>  
> But to stabilize a specification, you have to stabilize all of the references in it.
>  
> I have no problem with an "Editor's Draft" having references that update,
> fine. It's an Editor's Draft. But when you get down to "Last Call", all of the referenced
> material should have some clear identification of which edition is to be reviewed --
> if only for making sure that the references are up to date enough.

Sure, but sometimes one can't wait (e.g., WebIDL, HTML5).    
> > To this point, it's why I've been nagging the W3C to make the /TR/ information
> > available in JSON and why I pushed for all specifications to be CORS enabled
> > (they now are!) - so we can dynamically scrape the metadata of specifications
> > to build up references.
>  
> Automatically building references might be OK, but how do you
> automatically add notices when normative references have changed
> since the last time you looked. The automation people are doing
> now with undated references skips the step of verifying the applicability
> of changes.

Implementers and users complain - tests start failing, etc. Complaints are a good thing.  
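
That said, a notice could be automated too. Here is a rough sketch, assuming a tool records a fingerprint of each referenced spec at the time someone last reviewed it. Every name and piece of data below is made up for illustration and is not the API of any existing tool:

```python
import hashlib


def fingerprint(text: str) -> str:
    """Return a SHA-256 hex digest of a spec's text."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def changed_references(reviewed: dict[str, str], current: dict[str, str]) -> list[str]:
    """List reference keys whose current text no longer matches the reviewed fingerprint.

    `reviewed` maps a reference key (e.g. "WEBIDL") to the fingerprint recorded
    when a human last checked it; `current` maps the same keys to the text
    fetched now. Keys missing from `current` are reported as well.
    """
    flagged = []
    for key, old_digest in sorted(reviewed.items()):
        text = current.get(key)
        if text is None or fingerprint(text) != old_digest:
            flagged.append(key)
    return flagged


# Hypothetical data: two references recorded at review time.
reviewed = {
    "UNICODE": fingerprint("characters v6.2"),
    "WEBIDL": fingerprint("interfaces draft 1"),
}
# Later, WEBIDL's text has changed while UNICODE's has not.
current = {
    "UNICODE": "characters v6.2",
    "WEBIDL": "interfaces draft 2",
}
needs_review = changed_references(reviewed, current)
```

This only flags that something changed. Deciding whether the change matters is still left to a person.
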
> > I also nagged the IETF about CORS-enabling their specs
> > list (they have a big XML file or something), but I never heard back from them.
>  
>  
> nagging people to do something without a clear rationale is rarely
> productive.

Are you assuming that I didn't explain what I needed the references for? Cause I'm pretty sure I did (pointed people to my bibliographical builder thingy). Anyway, it didn't happen.  

> > So: what I am trying to do at the moment is remove human error from
> > references - humans do a bad job at keeping information up to date, and this
> > punishes reviewers and users. So the best way to deal with the human
> > problems is to limit the amount of things the humans (including me) can get
> > wrong.
>  
>  
> Deciding whether changes are needed when the reference is updated
> requires human intervention. Automating the process of inserting the
> right metadata for a specific version is part of xml2rfc, and the database
> should be shared with W3C and respec, but automatically generated
> references should be to specific versions.

I think we have to agree to disagree. I'm more for letting the whole system of dependencies come tumbling down if tests and specs change. Stability is only achieved when things stop changing.  
> > > Secondly you are doing this optimization for the editor at the expense of losing
> > > crucial information for the review process, and one of the essential tasks for which
> > > standards groups have editors: to insure the integrity of the references.
> >  
> > In my model, the review process is continuous and forever ongoing hence the
> > references have to be tied to the latest and greatest (and the Editor/WG
> > deeply invested in monitoring changes to all specs they reference).
>  
>  
>  
> The editor edits. The editor is not the sole designer or the sole reviewer,
> of any normative requirement.

Agree. I hope you didn't think I was saying otherwise.  
> You review a spec for both whether the
> thing specified works (can be implemented) and whether it is well described
> (can be implemented from the spec).  
>  
> There's a standards process in various organizations; your model is incomplete.
> The "latest" is not always the "greatest". Deciding that requires judgment
> and consensus.


Maybe. More often than not, though, stuff gets added to fix stuff. Look at any change log for any spec.

Here is a typical example of a commit history for a spec (see how stuff generally gets improved):  

* Removed hard links, which were broken
* Fixed up markup.
* Merge branch 'typos_editorial' of https://github.com/..
* Consistant use of spec name
* fixed grammar
* fixed merge conflict, added full stops.
* Reintroduced error steps (closes #90)
* Patent disclosure URL was wrong one!


And so on…  

> > Anything
> > else is demonstrably fundamentally flawed: If one cites a WD version, and
> > that WD version gets updated the day after the spec is published, the
> > reference is useless. A reviewer/user can go and look at the now out of date
> > version, and conclude it makes sense, but in reality the cited WD has changed
> > and now the published spec would be wrong.
>  
>  
>  
> I think there are many flaws, including in the model you've proposed. I think
> there are some process and automation changes that can address the problems,
> and that universally switching to undated references breaks many processes --
> as many problems as it fixes.

I can only speak from my own experience. I've been burnt very badly by the date model. I also see things like HTML and the WHATWG specs using the model I'm discussing. This gives me confidence that the model is not broken.   
  
> >  
> > That thread concludes with:
> > "(Updating is good spec hygiene, and the RFC editor will make you update it
> > anyway. Or the IESG before it.)"
> >  
> > Which underscores my point, no?
>  
> Not at all. The updates to ABNF include some changes with the expectation
> that referring specifications would adapt.
>  
> > > http://www.ietf.org/mail-archive/web/json/current/msg00301.html
> >  
> > And this one:
> > "There are other i18n issues that we might want to clarify, but updating the
> > Unicode reference seems uncontroversial."
>  
> Yes, after evaluation. It's not controversial because someone actually looked.

I'm not saying not to look. Stuff will break if you don't look… but that's a good thing: Interoperability is a state, not something that is claimed.  
> > And even better followup from Tim Bray:
> > "So how about `string is a sequence of zero or more Unicode characters
> > defined in version 6.2 (or any subsequent version) of [UNICODE]`"
> >  
> > Yes, sir… "or any subsequent version" … i.e., the latest and greatest :)
>  
> Yes, that's what Tim proposed, but I think it is still subject to the
> caveat that if the Unicode Consortium decided to change what
> was called a "character", all of the specs that referenced it would
> need to be changed when they updated the version.


That's ok. Living Documents give you that ability.  
  
> > Note also the clever discarding of section numbers by Tim. Most impressive.
>  
>  
>  
> > > you may think these conversations are unnecessary, extra work and burden,
> > > but they're part of creating one of the things expected of a "standard" --
> >  
> >  
> >  
> > I think they are great, because they seem to completely validate what I am
> > saying.
>  
>  
>  
> If at time T I make a reference from A to B, I should say which version  
> of B I intended, as WELL as giving, if possible, my estimate of where I think
> it is likely to be able to find the latest version of B. But unless I control
> B or have confidence that B won't change in ways that mess up my spec,
> giving an undated reference to B is just reckless.

I would argue the opposite is reckless. If B changes from under you in incompatible ways, then there is not much you can do but to update your spec. You can't then turn around to your users and blame B. It's why we put into specs:  

"Implementors should be aware that this specification is not stable. Implementors who are not taking part in the discussions are likely to find the specification changing out from under them in incompatible ways."


> > > namely that it has been widely reviewed for consistency. And you can
> > > only insure wide review for consistency if there are no unreviewed
> > > changes during a 'last call' period.
> >  
> >  
> >  
> > You can review for consistency anytime/anywhere. Specs change especially
> > _after_ Last Call. CR is when specs are at their most volatile.
>  
>  
> I think if there are substantial changes to a document during Last Call,
> then Last Call needs to be repeated.

Sure.  

> > > Whenever A --normative reference -> B and B updates from B1 to B2,
> > > that you actually need to REVIEW whether the changes from B1 to B2
> > > require changes to the language in A.
> >  
> >  
> >  
> > This is true. In my model, I assume the Editor will continue to track all specs he
> > or she references. This is a fundamental part of maintaining a Living Standard.
>  
>  
>  
> The Editor isn't the only reviewer. I don't think any single person
> can track all of the specs of the open web platform.


I know of some crazies that do :) But you are right, and that's why we have a working group and a developer community. If no one cares about your spec, and it falls out of date, then no harm done. If lots of people care about your spec, and it suddenly falls out of date, you will be sure to hear about it.

> > > Your proposal to remove explicit dates and point to the 'latest version'
> > > not only removes the opportunity to do this review, it eliminates the
> > > important information of whether the updated reference has
> > > been checked for consistency with your use of the previous referenced
> > > spec.
> >  
> >  
> >  
> > Kinda, but it's assumed to be part of the Living Document process - and if found
> > to be an issue, one fixes either spec to resolve the issues.
> > > The other metadata (title, author/editors, organization publishing) are,
> > > in addition to the date, important clues -- when you chase a
> > > URL in A and get 404 Not Found -- as to how you might search for
> > > the intended specification anywhere, e.g., if it moved.
> >  
>  
>  
>  
> > I bet you can find it without those bits (i.e., only with title and URL). I would like
> > to see at least one example of where a reference in a W3C spec has gone 404
> > and you can't find it again with the title and the URL through Google or the web
> > archive. It seems like cargo cult behavior (copying from the dead tree academic
> > journal model) for no demonstrable reason. I'm open to being proved wrong.
>  
>  
>  
> I think calling it "cargo cult" is inappropriate. With a cargo cult, the system
> being followed never actually worked, it's blind following of process
> based on misunderstanding.


That's not from the original definition. See the original video here, particularly at 3:18:
http://www.youtube.com/watch?v=7yYD9ia-5_M

Airplanes landing was not a broken system.   

> The system of using dated references is embedded in our society even
> in the migration to digital media. Academic references to supporting evidence
> are required by all online resources. Wikipedia requires author, date, title
> http://en.wikipedia.org/wiki/Wikipedia:Citing_sources -- they're not
> "cargo cult", are they?

Yes. Especially if they can't prove the utility of:

name of the author(s)
title of the article within quotation marks
name of the website
date of publication
page number(s) (if applicable)


> I think I've tried to justify the "demonstrable reason" for having dated
> references as a way of indicating clearly which version of the document
> cited was reviewed widely -- for patents, implementability, performance,
> security. I think I've given two use cases and can give more of situations
> where the update to the cited document required some changes to the
> citing document before the update was correct.


Without actionable outcomes, it's just you and me talking to each other.
  
> > > Finally (as a minor point): specifications that don't change substantially
> > > sometimes get reorganized for clarity, but the reorganization plays
> > > havoc with references from other specs into the interior. So if you
> > > say "Section 7.2 of [UNICODE]" but Unicode gets reorganized,
> > > the new section might have a different section number.
> >  
> >  
> >  
> > Totally, that's another rule of mine: never ever cite section numbers - only
> > concepts or complete specifications (or tell the editor of the other spec to add
> > suitable id's to their document so you can hyperlink to them).
>  
>  
>  
> If you can get a clear agreement from the editor of a cited work
> to keep the integrity of the citation, that might work, to allow
> undated citations. But if there is no such agreement, or even if there
> is but it's not documented in B that A depends on B and that updates
> to B must coordinate with A -- you still have problems that are
> easily avoided by being more extensive in how you list citations.
>  
> Larry
Received on Saturday, 15 June 2013 18:52:06 UTC