Re: Link rot in Supreme Court decisions from Herbert van de Sompel on 2013-09-26 (www-tag@w3.org from September 2013)

From: Herbert van de Sompel <hvdsomp@gmail.com>
Date: Thu, 26 Sep 2013 03:22:10 -0600
To: ashok.malhotra@oracle.com
Cc: Larry Masinter <masinter@adobe.com>, Karl Dubost <karl@la-grange.net>, Mark Nottingham <mnot@mnot.net>, "www-tag@w3.org WG" <www-tag@w3.org>, Herbert van de Sompel <hvdsomp@gmail.com>
Message-ID: <CAOywMHeQXprOjKSx5JJJkpcvG8Wk85V4QzfogY2tJMz6n7WY2A@mail.gmail.com>

On Wed, Sep 25, 2013 at 5:41 PM, Ashok Malhotra
<ashok.malhotra@oracle.com> wrote:
> Hi Larry:
> Let's frame the discussion a bit differently.
> The memento proposal does not solve the link permanence problem.
> But does it help the link rot problem?
>
> I must admit I'm on the fence about this because there are other
> solutions to the link versioning issue.  W3C and IETF have conventions
> for document versioning, etc.

Memento does not compete with existing resource versioning methods. It
uses them. To put it in terms of Tim Berners-Lee's discussion on
genericity of resources (http://www.w3.org/DesignIssues/Generic.html),
Memento provides a way to navigate from the URI of a time-generic
resource to the URI of a time-specific (version) resource that was
active at a certain point in time. Memento uses negotiation in the
time dimension to make this happen.

For example, with Memento, one can use the URI
http://en.wikipedia.org/wiki/Web_archiving and the date Mar 20 2012 to
arrive at the URI
http://en.wikipedia.org/w/index.php?title=Web_archiving&direction=prev&oldid=485347845
which is the version of the resource that was active on the specified
date. Similarly, one can use http://cnn.com and the date Mar 1 2006 to
arrive at http://web.archive.org/web/20060301005753/http://www.cnn.com/ .
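
To make the negotiation step concrete, here is a minimal sketch of how a
client might prepare such a request. It assumes the Accept-Datetime request
header from the Memento protocol and a TimeGate that accepts the original
URI appended to its own URI. The TimeGate prefix timegate.example.org and
the function name are placeholders of mine, not real endpoints or part of
any library. The sketch only builds the request; it does not send it.

```python
from datetime import datetime, timezone
from email.utils import format_datetime


def timegate_request(timegate_prefix, original_uri, when):
    """Return (url, headers) for a datetime-negotiation request.

    timegate_prefix is a hypothetical TimeGate base URI; the original
    (time-generic) URI is simply appended to it.
    """
    # Treat naive datetimes as UTC, then express the date in HTTP-date form.
    if when.tzinfo is None:
        when = when.replace(tzinfo=timezone.utc)
    stamp = format_datetime(when.astimezone(timezone.utc), usegmt=True)
    return timegate_prefix + original_uri, {"Accept-Datetime": stamp}


url, headers = timegate_request(
    "http://timegate.example.org/",  # placeholder, not a real TimeGate
    "http://en.wikipedia.org/wiki/Web_archiving",
    datetime(2012, 3, 20),
)
print(url)
print(headers)
```

The returned URL and headers can be handed to any HTTP client. A TimeGate
would normally answer with a redirect to the time-specific URI, which is
the navigation step described above.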

Does Memento help with the link rot problem? The first example in the
document I shared (http://mementoweb.org/missing-link/#example1) has a
rotten link to http://www.metamute.org/en/Radio-Playtime. Using the
accessdate provided in the example, 2010-02-09, Memento leads to
http://web.archive.org/web/20110312101108/http://www.metamute.org/en/Radio-Playtime
which is the archived version of the resource that is temporally
closest to the requested date.

The proposal in my document is about putting temporal context
information as attributes on the anchor element to allow a client to
travel to an appropriate version resource without affecting the link
to the time-generic resource.
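
As a rough illustration of what a client could do with such attributes,
the sketch below pulls the link target and a date attribute out of an
anchor element. The attribute name data-versiondate is only an assumption
made for this example; the document referenced above defines the actual
proposal. The href stays untouched, so the link to the time-generic
resource is preserved.

```python
from html.parser import HTMLParser

# Hypothetical attribute name, chosen for illustration only.
DATE_ATTR = "data-versiondate"


class TemporalLinkParser(HTMLParser):
    """Collect (href, date) pairs from anchor elements."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr = dict(attrs)
        if "href" in attr:
            # The date is None when the anchor carries no temporal context.
            self.links.append((attr["href"], attr.get(DATE_ATTR)))


sample = (
    '<p><a href="http://www.metamute.org/en/Radio-Playtime" '
    'data-versiondate="2010-02-09">Radio Playtime</a> and '
    '<a href="http://cnn.com">CNN</a></p>'
)
parser = TemporalLinkParser()
parser.feed(sample)
print(parser.links)
```

A client that finds a date next to a link can then use that date, together
with the unchanged href, for datetime negotiation.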

Cheers

Herbert

> All the best, Ashok
>
>
> On 9/25/2013 10:37 AM, Larry Masinter wrote:
>>
>> to archive documents for a long time, you need more than just link
>> stability. You need document formats designed for archive, and you need a
>> storage system that can guarantee integrity for the lifetime of the
>> information archived.
>>
>> My take on this was http://larry.masinter.net/0603-archiving.pdf -- your
>> answers might be different but the questions remain.
>>
>> Larry
>>
>>
>>> -----Original Message-----
>>> From: Herbert Van de Sompel [mailto:hvdsomp@gmail.com]
>>> Sent: Tuesday, September 24, 2013 2:58 PM
>>> To: Karl Dubost
>>> Cc: Mark Nottingham; www-tag@w3.org WG
>>> Subject: Re: Link rot in Supreme Court decisions
>>>
>>> Karl
>>>
>>> I couldn't agree more. If you find a moment, pls read the document
>>>
>>> http://mementoweb.org/missing-link/
>>>
>>> Herbert
>>>
>>> Sent from my iPhone
>>>
>>> On Sep 24, 2013, at 16:45, Karl Dubost <karl@la-grange.net> wrote:
>>>
>>>> Mark Nottingham [2013-09-23T21:50]:
>>>>>
>>>>> I'm sure some here will enjoy / be horrified by this.
>>>>> http://www.nytimes.com/2013/09/24/us/politics/in-supreme-court-opinions-clicks-that-lead-nowhere.html?partner=rss&emc=rss&utm_source=twitterfeed&utm_medium=twitter&_r=0
>>>>
>>>> horrified by the tracking link ;) :p
>>>> http://www.nytimes.com/2013/09/24/us/politics/in-supreme-court-opinions-clicks-that-lead-nowhere.html
>>>>
>>>> That said:
>>>>
>>>> * Identifier
>>>> * Actual content
>>>> * Duplication
>>>>   - Location
>>>>   - Rights
>>>>
>>>> Many things which are published on paper are extremely resistant to
>>>> time, because of the third property, i.e. being duplicated in many
>>>> places apart from each other. In addition to that, libraries have a
>>>> special status with regard to the law on keeping and circulating
>>>> copies of a work.
>>>>
>>>> On the Web, there are cache systems but not really with an archiving
>>>> policy. And we have the issue of most of the time what is identified
>>>> is what/where is stored. Practical for fresh information, catastrophic
>>>> for the fabric of time. In an aesthetics of the Web, rust is spreading
>>>> quite quickly.
>>>>
>>>> In addition to that, the identifier is dependent on the location owner
>>>> (domain name).
>>>>
>>>> --
>>>> Karl Dubost
>>>> http://www.la-grange.net/karl/
>>>>
>>>>
>>
>

-- 
Herbert Van de Sompel
Digital Library Research & Prototyping
Los Alamos National Laboratory, Research Library
http://public.lanl.gov/herbertv/

==

Received on Thursday, 26 September 2013 09:22:39 UTC