Re: ACTION- 541: Jeni to help Dan pull together terminology on Deep Linking from John Kemp on 2011-04-07 (www-tag@w3.org from April 2011)

From: John Kemp <john@jkemp.net>
Date: Thu, 7 Apr 2011 18:11:11 -0400
To: Larry Masinter <masinter@adobe.com>
Cc: Jeni Tennison <jeni@jenitennison.com>, Noah Mendelsohn <nrm@arcanedomain.com>, "www-tag@w3.org List" <www-tag@w3.org>
Message-Id: <1E29F85E-A8BB-4F92-93E6-DAC8586AEE96@jkemp.net>

Larry,

I agree with your statements about the functional properties of archives, and agree also that they are different from those offered by what is commonly known as a cache.

But, for the purposes of a document talking about "deep linking", are the differences not smaller than the similarities?

See below:

On Apr 7, 2011, at 5:16 PM, Larry Masinter wrote:

> An archive and a cache share some properties, but their purpose is so fundamentally different that it would be incorrect to call an archive a "kind of cache" in a document whose purpose is to be clear about terminology.

Just to be specific, the properties which are listed in this document (https://docs.google.com/View?id=dgnh4s67_2cv7hf5c7#Caching_0002013307809961562_43) are as follows:

> • links within an HTML page may be rewritten so that they point to pages that are also served by the cache or distributor
> • JavaScript and CSS files may be combined and compressed to provide speedier access
> • banners may be added within an HTML page to highlight that the representation is a copy of an original somewhere else
> • files may be compressed or converted to different formats
> • wholly new documents may be created that bring together information from multiple different sources

Are any of these properties not present in Web archives? I think the basic point is that content held in an archive may be altered from the form in which the "origin" server presented it (as is possible with other forms of caching), and that seems quite possible to me.

> 
> They both may contain copies of material that is or was otherwise available at the "true" origin server.
> 
> The purpose of a cache is to improve network performance by not re-transferring data over lower-bandwidth or higher-latency links.
> The purpose of an archive is to ensure long-term accessibility of web material that otherwise becomes unavailable over time.
> 
> Caches can be cleared at any time. Stale cache material becomes less valuable over time. Cached content should follow the same security and access control model as the original content for which it is a cache.
> 
> Clearing an archive destroys the value of an archive. Old archived material becomes more valuable over time. Archived information is likely to have different security and access control policies from the original content. In fact, some archives might even hold material in "escrow" and only provide archived content after a delay.

Which of the properties you note in the preceding paragraphs are relevant to the "deep linking" discussion, and how are they relevant?

Regards,

- John

> 
> Larry
> --
> http://larry.masinter.net
> 
> 
> -----Original Message-----
> From: www-tag-request@w3.org [mailto:www-tag-request@w3.org] On Behalf Of Jeni Tennison
> Sent: Thursday, April 07, 2011 12:46 PM
> To: Noah Mendelsohn
> Cc: www-tag@w3.org List
> Subject: Re: ACTION- 541: Jeni to help Dan pull together terminology on Deep Linking
> 
> Noah,
> 
> On 7 Apr 2011, at 18:32, Noah Mendelsohn wrote:
>> On 3/31/2011 1:51 PM, Henry S. Thompson wrote:
>>>  *cache*: to store a copy of something hosted elsewhere and
>>>  likewise make it available
>> 
>> I feel like this is missing something along the lines of:
>> 
>> *cache*: to store a copy of something hosted elsewhere and likewise make it available. Cached copies are typically created to improve performance or availability, and are usually not managed for long-term stability.
>> 
>> I don't love that, but I feel your original definition misses the point of a cached copy: it's typically an optimization, and crucially, nothing in the system should behave differently if that copy disappears, except for perhaps being slower or less able to respond in the face of network partition.
> 
> In the document as it stands, I class archives as a form of cache, and of course there are archives, such as the internet archive or the UK government's web continuity project, in which the cached copies *are* maintained for long-term stability.
> 
> Jeni
> -- 
> Jeni Tennison
> http://www.jenitennison.com
> 
> 
>

Received on Thursday, 7 April 2011 22:11:37 UTC