Re: reference needed - versioned documents

On Mar 31, 2008, at 4:48 PM, Pat Hayes wrote:

> At 2:58 PM -0400 3/31/08, Jonathan Rees wrote:
>> On Mar 31, 2008, at 1:48 PM, Williams, Stuart (HP Labs, Bristol)  
>> wrote:
>> [...]
>>> Is the answer not evident from the references in Felix Sasaki's  
>>> response?
>>> A "latest version" URI, which identifies the most recently  
>>> published draft in a document series.
>> No. Nothing tells me anything about *which* document series is  
>> involved,
> er... the one all of whose drafts have the same title?

If you know that they all had the same title in the past, and will  
all have the same title in the future, and that there is only one  
such series, then you know more than I do. Evidence?  ... well even  
if you do have evidence, the fact that such evidence is hard to find  
helps to make my case.

>> or provides me with any invariants over the elements of the  
>> series, or tells me the process by which new drafts are produced,  
>> or even what the past drafts were.
> You can trace this through the 'previous version' links. In fact,  
> that might be the best way to define such a series: go to the  
> latest version, then iterate back through the previous versions.

This doesn't tell me anything about the publisher's intentions  
regarding future versions, and it's not even obvious to me that  
following the chain of links will get all previous versions.
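
To make that last worry concrete, here is a minimal sketch of the walk Pat describes. The draft names and the link table are hypothetical stand-ins that I made up for illustration, not real TR data, and a real crawler would have to fetch and parse each draft rather than consult a dict.

```python
from typing import Dict, List, Optional


def walk_previous_versions(
    latest: str, previous: Dict[str, Optional[str]]
) -> List[str]:
    """Follow 'previous version' links from `latest`.

    Returns the drafts reached, newest first. Stops at a missing link
    or at a draft already visited (a guard against link cycles).
    """
    chain: List[str] = []
    seen = set()
    current: Optional[str] = latest
    while current is not None and current not in seen:
        seen.add(current)
        chain.append(current)
        current = previous.get(current)  # None when there is no link
    return chain


# Hypothetical series: draft-3 links straight back to draft-1, so
# draft-2 was published but is not on the chain from the latest draft.
PREVIOUS: Dict[str, Optional[str]] = {
    "draft-4": "draft-3",
    "draft-3": "draft-1",
    "draft-2": "draft-1",
    "draft-1": None,
}

reached = walk_previous_versions("draft-4", PREVIOUS)
missed = sorted(set(PREVIOUS) - set(reached))
print(reached)  # drafts found by walking back from the latest version
print(missed)   # drafts that exist but were never reached
```

The walk only finds what the links happen to reach, and it says nothing at all about drafts that have not been published yet.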

>> So not only do I not know what the named entity *is*, I don't know  
>> much about it. There's little I can say about it that will be  
>> understood by someone reading what I say at an unknown future  
>> time. I don't know who is going to have written whatever the "most  
>> recently published draft" will be at the time my statements about  
>> it are read, or what the draft will be saying, or even what the  
>> draft will be about. The URI might be useful heuristically as a  
>> hyperlink ("see xxx to see a most recently published draft of the  
>> zzz working group's spec... probably") but I don't see how it's  
>> useful as a name to be used in discourse (e.g. RDF).
> Well, we can make some assertions about it, such as that it's a W3C  
> TR 'draft series', and it was begun on a certain date, and it's the  
> product of a certain WG, and so on.

The first I agree with. The second seems a bit slippery. And evidence  
that past and future drafts come from the same WG will, I think, be  
hard to come by - this is what I meant below by "detective work".

If you have a total inventory of all drafts, and you know the  
resource is a draft series, then you're in pretty good shape in  
understanding the resource. Maybe we have such an inventory for this  
draft series. But I think you'd have to be pretty sophisticated to  
determine that you do have your hands on all past versions and that  
there will be no more future versions.

>> If I do some detective work I may be able to figure out invariants  
>> such as the series's subject matter or working group affiliation  
>> (and the WG's charter), and if I do a *lot* of detective work I  
>> might find some piece of email or some minutes that say how the  
>> URI is going to be used, but before I get to that point I will  
>> have decided that it's not worth the effort to try to use or  
>> understand that URI.
> Why is this URI any worse in this respect than, say, the URI which  
> identifies the Working Group itself?

Probably isn't - depends on whether there's a publication that tells  
me how to use the latter. I'm just saying that if no one has come out  
and told me what the URI is supposed to name, or how it is supposed  
to be used, then I'm going to be reluctant to use it in assertions.

>> If I'm unfortunate enough to find that someone else has used it in  
>> communication with me, then I'll have to make assumptions (e.g.  
>> that the draft they were talking about is close enough to the one  
>> I see) or enter into dialog with them (which draft are you talking  
>> about? or what invariants do you know about the series that I  
>> don't know?) or attempt to verify what they say (since it is  
>> probably very easy to be wrong in making statements about things  
>> like this).
> You seem (?) to be presuming that one can make useful assertions  
> only about actual documents, but I don't see the rationale for this  
> assumption.

Just the opposite. I'm saying the URI "owner" can and should make  
useful statements about the named resource, but generally doesn't,  
and without these useful statements *I* can't make useful statements  
about what's named because I don't know what's named. I can do as  
many GETs as I like, and I still won't know anything. The W3C's  
statements about its TR URIs qualify as useful statements, but for  
the undated URI I don't think they go far enough to let the URI  
be a good citizen of the semantic web.

It just seems a wasted opportunity since it would be nice to be able  
to make statements - either about what's in a document, or what it's  
about, or what's invariant over a series of 'drafts', or about the  
process that generates a 'draft series'. The document case is easy -  
you can just look at the document - or would be if we had a standard  
way to say that a URI names a document (unchanging) and if there were  
a standard way to discover such an assurance (thus the Link header  
discussion). The case of a changing or CN-varying document is harder  
but commitments saying that all drafts will be about such-and-such,  
or will have so-and-so as an author, ought to be valuable.
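
As a rough sketch of what such a commitment could look like if it were written down and checkable: the class names, fields, and sample drafts below are all hypothetical, and this is not an existing W3C or RDF vocabulary, just an illustration of "an invariant over a series".

```python
from dataclasses import dataclass
from typing import FrozenSet, List


@dataclass(frozen=True)
class Draft:
    date: str
    subject: str
    authors: FrozenSet[str]


@dataclass(frozen=True)
class SeriesCommitment:
    """What the URI owner promises about every draft in the series."""
    subject: str          # all drafts will be about this
    required_author: str  # all drafts will have this author


def violations(commitment: SeriesCommitment, drafts: List[Draft]) -> List[str]:
    """Return the dates of drafts that break the published commitment."""
    return [
        d.date
        for d in drafts
        if d.subject != commitment.subject
        or commitment.required_author not in d.authors
    ]


# Hypothetical commitment and drafts, for illustration only.
commitment = SeriesCommitment(subject="zzz spec", required_author="A. Editor")
drafts = [
    Draft("2007-06-01", "zzz spec", frozenset({"A. Editor"})),
    Draft("2008-01-15", "zzz spec", frozenset({"A. Editor", "B. Coauthor"})),
    Draft("2008-03-20", "something else", frozenset({"B. Coauthor"})),
]

broken = violations(commitment, drafts)
print(broken)  # dates of drafts that do not honor the commitment
```

With a commitment like this published by the "owner", a statement about the series can be read against the invariant rather than against whichever draft happened to be retrieved.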

Maybe not valuable enough. I would not be too bothered if authors of  
RDF frequently determined that there was not enough value in making  
declarative statements about changing documents to justify the effort  
it would take to formulate or use the statements - that "resource  
description" only pays off for unchanging documents and for non- 
documents, and in the situation when you are privileged to be the URI  
"owner" (since then you get to express what you mean for the URI to  
name). If I come across RDF that seems to say something about a  
changing document, and the URI has no published policy around what it  
names, I think I will generally assume that the statement is probably  
about whatever document someone saw when dereferencing that URI, and  
I'll be even more cautious than usual since the document at that URI  
might have changed since the RDF was written - or the URI might even  
be meant to name a different document, or something that's not a  
document... hmm, maybe you know what I'm talking about.


Received on Tuesday, 1 April 2008 19:14:57 UTC