Re: Reviewing Momento - http://mementoweb.org/guide/rfc/ID/

From: Herbert Van de Sompel <hvdsomp@gmail.com>
Date: Wed, 24 Nov 2010 09:11:45 -0700
To: Tim Berners-Lee <timbl@w3.org>
Message-Id: <B9820AD0-AA58-49A7-AEEA-BD26DCD410BE@gmail.com>
Cc: Tim's Notes <timbl+notes@w3.org>, w3t Team <w3t@w3.org>, TAG List <www-tag@w3.org>
Dear Tim,

Thanks for your interest in Memento and for your feedback on the first version of the Internet Draft. Michael Nelson, Robert Sanderson, and I discussed your comments, and our responses are inserted inline in your feedback below.

Greetings

Herbert Van de Sompel

On Nov 21, 2010, at 10:38, Tim Berners-Lee <timbl@w3.org> wrote:

> Reviewing Momento - http://mementoweb.org/guide/rfc/ID/
> Incomplete notes from a plane.
> 
> General complaint applies to many specs:
> 
> The spec uses the words "RECOMMENDED" as in RFC2616 which immediately flags a 
> concern that it does not define a protocol cleanly, in the sense of describing (a) what
> the parties do and (b) what good that brings.    Whenever a spec uses MAY or RECOMMENDED
> it is in fact defining >1 protocol, one more powerful than the other.  It is wise to 
> explain not just a slider scale from 0.0 to 1.0 of how "MUST" something is, but
> to enumerate the different protocols with their conformance and what they deliver.
> 
> Example: 
> 	The link: rel=timegate header is only RECOMMENDED.
> 	If you don't do it, Section 3.0 breaks.
> 	Therefore, for the protocol in section 3.0, rel=timegate is mandatory.
> 
> 

We chose the weaker RECOMMENDED over the stronger MUST because of our concern for how Memento functions in partially compliant scenarios, and to reflect the fact that the framework does not break when a "timegate" link is not provided. If the server doesn't put rel="timegate" in a response, the client substitutes its own timegate value (see section 3.1.2.3). That section also explains that the rel="timegate" link from the server is only a suggestion; the client is free to ignore it for its own reasons (trust, better knowledge based on extensive discovery, etc.).

Furthermore, it is possible that the server is generally Memento-aware, but simply can't (or won't) make a TimeGate recommendation for a particular URI.  We think a MUST would limit the server's flexibility in this situation.
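
To make the fallback concrete, here is a minimal client-side sketch in Python. It is an illustration under assumptions, not text from the ID: the function name, the default TimeGate URI, and the simplified Link header parsing (which ignores "<" inside quoted parameters) are all hypothetical.

```python
import re

# Hypothetical client-preferred TimeGate prefix; not a value taken from the ID.
DEFAULT_TIMEGATE = "http://example.org/timegate/"


def find_timegate(link_header, original_uri, default=DEFAULT_TIMEGATE):
    """Return the server-suggested TimeGate URI, or a client-chosen fallback.

    link_header is the raw value of the HTTP "Link" header, or None if the
    server sent none. Parsing is deliberately simplified for illustration.
    """
    if link_header:
        # Each link-value looks like: <URI>; rel="timegate"; other="param"
        for uri, params in re.findall(r'<([^>]*)>([^<]*)', link_header):
            m = re.search(r'\brel\s*=\s*(?:"([^"]*)"|([^\s;,]+))', params)
            if m:
                # rel may carry several space-separated relation types.
                rels = (m.group(1) or m.group(2) or "").split()
                if "timegate" in rels:
                    return uri
    # No rel="timegate" link: the client substitutes its own TimeGate.
    return default + original_uri
```

A client that distrusts the server's suggestion could skip the lookup entirely and always use its own default, which is the freedom described above.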

> Maybe the spec needs to be factored into two specs, a timegate discovery protocol and a memento
> protocol?
> 

The ID sort of follows this separation already: section 3 is about negotiation, and section 4 is about discovery.  And while they're different, they're also interlinked enough that we thought they would read better as a single document.

> Step 2: The entity-header of the response from URI-R includes an HTTP "Link" header with a Relation Type of "timegate" pointing at a TimeGate (URI-G) for the Original Resource.
> 
> Just using the present tense like "includes" is actually more effective than an over-exuberance 
> with RFC2616 capitals.
> 
> 
> 
> 2.1.1
> 
> "The presence of a Memento-Datetime header and associated value for a given resource constitutes a promise that the resource is stable and that its state will no longer change." --> is a  a:FixedResource
> where 
> @prefix a: <http://www.w3.org/2007/gen/ont#>.
> 
> 
> Or in the tabulator link vocabulary
> 
> { ?m log:uri [ is link:requestedURI of [ link:Response [ httph:momento-decline [] ] ] ] } => { ?m a a:FixedResource }.
> 

This is definitely an oversight. We are aware of the ontology, as we have used the notions of TimeGeneric and TimeSpecific resources in various talks and papers.  We agree that a Memento is a FixedResource, and will add wording and a reference to that effect in the next version of the ID.


> 
> "Similarly, if an application is mirroring the resource at a different URI, it SHOULD retain the resource's Memento-Datetime header and value if mirroring the resource does not include a meaningful change to the resource's state. For example, this behavior allows duplicating a Web archive at a new location while preserving the Memento-Datetime values of the archived resources."
> 
> So what breaks if another location does not preserve the Memento-Datetime header?
> Suppose a different value is given, or none at all?
> Presumably if the mirror is pointed to as a timegate for R, then everything breaks.
> If it isn't, nothing breaks. Is that right?

Nothing will break if no Memento-Datetime is provided. A similar situation occurs today with web archives that are made Memento-compliant by proxy rather than natively. Such archives do not provide a Memento-Datetime header, yet Memento clients handle this gracefully, guided by the fact that they were redirected to the archive by a confirmed TimeGate.

What we want to avoid, however, is a mirror changing the value of a Memento-Datetime header that the archive does provide; that is, we want the value to be sticky. Nothing would technically break if a mirror changed the value, but it would be like rewriting history. We propose to make the language in the ID clearer by saying that a mirror that wants to support Memento MUST NOT assign a new Memento-Datetime value.
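
As an illustration of the "sticky" behavior, a mirror could build its response headers roughly as in the Python sketch below. The function name and the plain-dictionary representation of headers are assumptions made for this example, not anything the ID prescribes.

```python
def carry_over_memento_datetime(source_headers, mirror_headers):
    """Build a mirror's response headers with a sticky Memento-Datetime.

    Both arguments are plain dicts of header name -> value (an assumption
    for this sketch). The mirror never assigns a Memento-Datetime of its
    own: the value is either copied unchanged from the source or absent.
    """
    # Drop any Memento-Datetime the mirror itself tried to set.
    out = {k: v for k, v in mirror_headers.items()
           if k.lower() != "memento-datetime"}
    # Copy the source's value verbatim (header names are case-insensitive).
    for name, value in source_headers.items():
        if name.lower() == "memento-datetime":
            out["Memento-Datetime"] = value
    return out
```

If the source archive sends no Memento-Datetime, the mirror sends none either, which matches the by-proxy case above where clients still cope.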

>  
> 
> 2.1.1.1
> 
> "The q-value approach is not supported for Memento's datetime negotiation because it is well-suited for negotiation over a discrete space of mostly predictable values, not for negotiation over a continuum of unpredictable datetime values."  
> Algorithm?  How to determine best fit?
> 
> 
> " Not using an interval indicator is equivalent with expressing an infinite interval around the preferred datetime."
> 
> 
> But what is the algorithm for determining the best fit?

The ID states that the selection algorithm is an internal matter for the server, and suggests some possible approaches (based on existing practice). The ID also states that the server should be consistent in its use of an algorithm. 
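
For concreteness, one possible selection approach is "closest in time to the requested datetime". The Python sketch below illustrates only that one option; the function name and the (URI, datetime) pair representation are assumptions for this example, and a server is free to use a different algorithm.

```python
from datetime import datetime, timezone


def select_closest_memento(mementos, requested):
    """Pick the Memento whose datetime is closest to the requested one.

    mementos is a list of (uri, datetime) pairs; requested is a datetime.
    On a tie, the pair that appears first in the list wins, so a server
    that keeps its list in a fixed order answers consistently.
    """
    if not mementos:
        return None
    return min(mementos, key=lambda m: abs(m[1] - requested))


# Usage with made-up archive URIs:
archive = [
    ("http://arxiv.example.net/web/2001/page", datetime(2001, 9, 11, tzinfo=timezone.utc)),
    ("http://arxiv.example.net/web/2005/page", datetime(2005, 3, 1, tzinfo=timezone.utc)),
]
best = select_closest_memento(archive, datetime(2002, 1, 1, tzinfo=timezone.utc))
```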

Now, we had previously discussed allowing the server to convey which algorithm it used. The technical implementation we had in mind was:
* Use TCN header
* Use optional fields of TCN header to convey the algorithm name
* Predefine 2 or so algorithms and associated names to use in the header

However, we are not convinced that significant added value would result from the added complexity. We think that, in most cases, the client will be happy enough to actually have found a Memento. Also, a client can check the value of Memento-Datetime to see whether it is pleased with the server's choice of a Memento. 
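
A client-side check of this kind could look like the following Python sketch. The function name and the idea of a fixed tolerance are assumptions for illustration; the ID does not define what "pleased" means for a client.

```python
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime


def is_acceptable(memento_datetime_header, requested, tolerance):
    """Return True if the returned Memento is close enough to the request.

    memento_datetime_header is the raw Memento-Datetime value in HTTP date
    format (e.g. "Tue, 11 Sep 2001 20:36:10 GMT"); requested must be a
    timezone-aware datetime; tolerance is a timedelta chosen by the client.
    """
    actual = parsedate_to_datetime(memento_datetime_header)
    return abs(actual - requested) <= tolerance


# Usage: the client asked for noon on 11 Sep 2001 and accepts +/- one day.
wanted = datetime(2001, 9, 11, 12, 0, tzinfo=timezone.utc)
ok = is_acceptable("Tue, 11 Sep 2001 20:36:10 GMT", wanted, timedelta(days=1))
```

A client that is not satisfied could then fall back to a TimeMap, or try a different TimeGate.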

We are very interested in further ideas in this regard.

> 
> 
> 3.0
> 
> Step 4: The entity-header of the response from URI-G includes a "Location" header pointing at the URI of a Memento (URI-Mj) for the Original Resource. In addition, the entity-header contains an HTTP "Link" header with a Relation Type of "original" pointing at the Original Resource, and an HTTP "Link" header with a Relation Type of "timemap" pointing at a TimeMap (URI-T). Also HTTP Links pointing at various Mementos are provided using the "memento" Relation Type, as specified in Section 2.2.1.4.
> 
> The time map header is transferred but not used in this protocol 3.0

Not sure we understood the comment. Section 4.1 gives an example of a TimeMap request and response.  TimeMaps are introduced to support batch discovery of Mementos and TimeGates. Did you have something else in mind?


> 
> ________________________________________
> 
> 
Received on Wednesday, 24 November 2010 16:12:02 GMT
