
Re: Link rot in Supreme Court decisions

From: Herbert van de Sompel <hvdsomp@gmail.com>
Date: Wed, 9 Oct 2013 18:13:54 -0600
Message-ID: <CAOywMHfnkffOQHdj1qcZCVqenSB7cBp-bYKAUTt5K-j_i+kgJQ@mail.gmail.com>
To: Larry Masinter <masinter@adobe.com>
Cc: "ashok.malhotra@oracle.com" <ashok.malhotra@oracle.com>, Karl Dubost <karl@la-grange.net>, Mark Nottingham <mnot@mnot.net>, "www-tag@w3.org WG" <www-tag@w3.org>
On Wed, Oct 9, 2013 at 4:17 PM, Larry Masinter <masinter@adobe.com> wrote:
> Memento makes much more sense and is much more compelling as a (proposed) standard interface supported by (or at least compatible with) the collective community of web archivers.
> Trying to isolate a solution to "link rot" as a separate, independent problem from archiving just weakens the concept, and invites people to think the two problems are readily separable.
> When you say you want a link to be "permanent", whatever it is you want to preserve seems to be highly correlated to whatever it is you thought the link meant in the first place.
>
> (permanent link to a page that tells you the weather of the day you look, permanent link to a page that tells you TODAY's weather on the date when the link was made, for example.)
>

Thanks, Larry.

I don't think the proposal expressed in
http://mementoweb.org/missing-link/ isolates link rot from the
archiving problem. Rather, it proposes an approach that allows links as
we use them to be loosely coupled with archives and resource
versioning systems for applications in those pockets of the web where
an increased degree of persistence is considered important. I
understand this is, for example, the case in Wikipedia.

The Wikipedia page http://en.wikipedia.org/wiki/Border_collie has the
following external references that are explicitly marked as dead
links:

47. "Border Collie". Justusdogs.com.au. Retrieved 2010-09-13.[dead link]

50. Fastest Car Window Opened by a Dog [dead link]
www.guinnessworldrecords.com. Retrieved 2007-08-12.

The linked resources are respectively:

=> For (47): http://www.justusdogs.com.au/dog-pages/dog-breeds/538/border-Collie.cfm

=> For (50): http://www.guinnessworldrecords.com/records/natural_world/fantastic_pets/fastest_car_window_opened_by_a_dog.aspx

and they are very dead indeed.

Using a Memento client, say the Memento extension for Chrome available
at https://chrome.google.com/webstore/detail/memento-time-travel/jgbfpjledahoajcppakbgilmojkaghgm?hl=en&gl=US,
these dead links can be resurrected:

=> For (47), manually set the time travel date to the "retrieved" date
listed in the reference - 2010-09-13 - and end up at the archived
version http://web.archive.org/web/20100325111147/http://www.justusdogs.com.au/dog-pages/dog-breeds/538/border-collie.cfm

=> For (50), manually set the time travel date to the "retrieved" date
listed in the reference - 2007-08-12 - and end up at the archived
version http://web.archive.org/web/20070716001208/http://www.guinnessworldrecords.com/records/natural_world/fantastic_pets/fastest_car_window_opened_by_a_dog.aspx
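The archived URLs above carry their capture time in the path. As a rough illustration only, here is a Python sketch that splits such a URL into its capture time and the original URL. It assumes the 14-digit YYYYMMDDhhmmss timestamp form seen in the two Internet Archive URLs above, read as UTC; the function name split_wayback_url is made up for this example and is not part of any Memento tool.

```python
import re
from datetime import datetime, timezone

# Assumption: URLs look like the two Internet Archive examples above,
# i.e. http://web.archive.org/web/<14-digit timestamp>/<original URL>.
WAYBACK_RE = re.compile(r"^https?://web\.archive\.org/web/(\d{14})/(.+)$")


def split_wayback_url(url):
    """Return (capture_time, original_url) for an archived URL.

    Hypothetical helper for illustration; raises ValueError when the
    URL does not match the assumed pattern.
    """
    match = WAYBACK_RE.match(url)
    if match is None:
        raise ValueError("not a recognised archived URL: " + url)
    stamp, original = match.groups()
    # Interpret the timestamp as UTC (an assumption of this sketch).
    captured = datetime.strptime(stamp, "%Y%m%d%H%M%S").replace(tzinfo=timezone.utc)
    return captured, original


if __name__ == "__main__":
    captured, original = split_wayback_url(
        "http://web.archive.org/web/20100325111147/"
        "http://www.justusdogs.com.au/dog-pages/dog-breeds/538/border-collie.cfm"
    )
    print(captured.isoformat(), original)
```

This makes it easy to see how far the archived version is from the "retrieved" date: for (47) the capture is from March 2010, several months before 2010-09-13.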

Part of the proposal in http://mementoweb.org/missing-link/ is about
conveying the "retrieved" date expressed in the references 47 and 50
in a machine-actionable manner:

=> For (47): <a
href="http://www.justusdogs.com.au/dog-pages/dog-breeds/538/border-Collie.cfm"
versiondate="2010-09-13">...</a> instead of <a
href="http://www.justusdogs.com.au/dog-pages/dog-breeds/538/border-Collie.cfm">...</a>

=> For (50): <a
href="http://www.guinnessworldrecords.com/records/natural_world/fantastic_pets/fastest_car_window_opened_by_a_dog.aspx"
versiondate="2007-08-12">...</a> instead of <a
href="http://www.guinnessworldrecords.com/records/natural_world/fantastic_pets/fastest_car_window_opened_by_a_dog.aspx">...</a>
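To make "machine-actionable" concrete, here is a minimal Python sketch of how a client might read the proposed attribute. It uses only the standard library HTML parser; the class and function names are invented for this example, and versiondate is the attribute name from the proposal, not an existing HTML attribute.

```python
from html.parser import HTMLParser


class VersionedLinkParser(HTMLParser):
    """Collect (href, versiondate) pairs from <a> elements.

    Illustrative only: 'versiondate' is the attribute proposed in the
    message above; links without it are recorded with None.
    """

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr_map = dict(attrs)  # attribute names arrive lower-cased
        href = attr_map.get("href")
        if href is not None:
            self.links.append((href, attr_map.get("versiondate")))


def extract_versioned_links(html):
    """Hypothetical helper: return all (href, versiondate) pairs in html."""
    parser = VersionedLinkParser()
    parser.feed(html)
    parser.close()
    return parser.links


if __name__ == "__main__":
    sample = (
        '<a href="http://www.justusdogs.com.au/dog-pages/dog-breeds/538/border-Collie.cfm" '
        'versiondate="2010-09-13">Border Collie</a>'
    )
    for href, versiondate in extract_versioned_links(sample):
        print(versiondate, href)
```

A link without the attribute simply yields None for the date, so existing markup keeps working unchanged.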

Doing so conveys the date intended by the link, which as such is
informative. Doing so also supports retrieval by Memento-capable
clients of a version that is temporally as close as possible to the
versiondate.
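As a sketch of what a client could do with the versiondate, the Python below turns it into a value for the Accept-Datetime request header that the Memento protocol uses for datetime negotiation, and picks the capture nearest to that date from a list of known capture times. Treating the date as midnight UTC is an assumption of this sketch, the function names are invented for illustration, and a real client would get its list of captures from an archive rather than from a hard-coded list.

```python
from datetime import datetime, timezone
from email.utils import format_datetime


def versiondate_to_datetime(versiondate):
    """Parse a YYYY-MM-DD versiondate as midnight UTC (an assumption)."""
    return datetime.strptime(versiondate, "%Y-%m-%d").replace(tzinfo=timezone.utc)


def accept_datetime_header(versiondate):
    """Format the versiondate as an HTTP-date for an Accept-Datetime header."""
    return format_datetime(versiondate_to_datetime(versiondate), usegmt=True)


def closest_snapshot(versiondate, snapshot_times):
    """Return the capture time nearest to the versiondate.

    snapshot_times must be a non-empty iterable of timezone-aware
    datetimes; min() raises ValueError if it is empty.
    """
    target = versiondate_to_datetime(versiondate)
    return min(snapshot_times, key=lambda t: abs(t - target))


if __name__ == "__main__":
    print("Accept-Datetime:", accept_datetime_header("2010-09-13"))
    captures = [
        datetime(2009, 1, 1, tzinfo=timezone.utc),               # made-up capture
        datetime(2010, 3, 25, 11, 11, 47, tzinfo=timezone.utc),  # capture cited for (47)
        datetime(2012, 1, 1, tzinfo=timezone.utc),               # made-up capture
    ]
    print(closest_snapshot("2010-09-13", captures).isoformat())
```

Note that "closest" here means smallest absolute time difference, so the chosen capture may predate or postdate the versiondate.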

Cheers

Herbert




> IMNSHO
>
> Larry
> --
> http://larry.masinter.net
>
>
>> Admittedly, these sources of prior resource versions do not cover all
>> prior versions of all resources. But there's a significant body of
>> prior resource versions out there. For example, the Internet Archive
>> is said to currently contain 335 billion archived web resources. To
>> put it differently, there's a significant body of URIs out there for
>> which machine-actionable temporal information added to a link, as
>> proposed in the document I shared, would be useful rather than
>> useless. Hence, it would be nice to see a discussion that is more
>> about that aspect of the reference rot problem that is addressed in
>> the document I shared, and less about those aspects that the document
>> has no proposal for and for which it relies on ongoing international
>> efforts pertaining to web archiving.
>>
>> Cheers
>>
>> Herbert
>>
>>
>> --
>> Herbert Van de Sompel
>> Digital Library Research & Prototyping
>> Los Alamos National Laboratory, Research Library
>> http://public.lanl.gov/herbertv/
>>
>> ==



-- 
Herbert Van de Sompel
Digital Library Research & Prototyping
Los Alamos National Laboratory, Research Library
http://public.lanl.gov/herbertv/

==
Received on Thursday, 10 October 2013 00:14:26 UTC
