Re: Firefox addin to replace 404 pages with archived pages from wayback machine

From: Herbert Van de Sompel <hvdsomp@gmail.com>
Date: Fri, 5 Aug 2016 11:27:06 -0600
Message-ID: <CAOywMHe=4FzF+s6N=5M_HeMfKa816WmGrvgU05dT8W1_Lu5WaA@mail.gmail.com>
To: "www-tag@w3.org" <www-tag@w3.org>
Cc: Herbert Van de Sompel <hvdsomp@gmail.com>

Hi all,

I would like to take the opportunity to mention a few things in this
regard:

(*) The plug-in uses the Internet Archive's Wayback collection to find old
pages, routinely called Mementos in the web archiving community. Note that
there are many more web archives around the world and that an aggregator
service exists; see http://timetravel.mementoweb.org. This aggregator also
supports the Memento "Time Travel for the Web" protocol (RFC 7089) and
exposes APIs that allow looking for Mementos across many archives; see
http://timetravel.mementoweb.org/guide/api/ . In order to get broader web
archive coverage, the plug-in could use this aggregator service.
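
As a rough sketch of what such a lookup could look like, the Python below
builds (without sending) the two kinds of request a client might make
against the aggregator: an RFC 7089 datetime negotiation with a TimeGate,
using the Accept-Datetime header, and a JSON API lookup. The /timegate/ and
/api/json/ URL patterns and both function names are assumptions made for
illustration; the API guide linked above is the authority on the actual
endpoints.

```python
from datetime import datetime, timezone
from email.utils import format_datetime

# Base URL of the aggregator mentioned above.
AGGREGATOR = "http://timetravel.mementoweb.org"


def timegate_request(original_url: str, when: datetime):
    """Return (url, headers) for an RFC 7089 datetime negotiation.

    ASSUMPTION: the aggregator's TimeGate lives under /timegate/<URI>.
    `when` must be a timezone-aware datetime.
    """
    when_utc = when.astimezone(timezone.utc)
    # RFC 7089 expects an RFC 1123 date in GMT in the Accept-Datetime header.
    headers = {"Accept-Datetime": format_datetime(when_utc, usegmt=True)}
    return f"{AGGREGATOR}/timegate/{original_url}", headers


def json_api_url(original_url: str, when: datetime) -> str:
    """Return a JSON API lookup URL for the Memento closest to `when`.

    ASSUMPTION: the JSON API pattern is /api/json/<YYYYMMDDhhmmss>/<URI>.
    """
    stamp = when.astimezone(timezone.utc).strftime("%Y%m%d%H%M%S")
    return f"{AGGREGATOR}/api/json/{stamp}/{original_url}"


if __name__ == "__main__":
    moment = datetime(2016, 8, 5, 17, 27, 6, tzinfo=timezone.utc)
    url, headers = timegate_request("http://example.com/", moment)
    print(url)
    print(headers)
    print(json_api_url("http://example.com/", moment))
```

A plug-in that sees a 404 could issue the first request and follow the
redirect to whichever archive holds the closest Memento, rather than
querying a single archive.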

(*) In the Hiberlink project, we studied approaches to ameliorate the link
rot problem. One outcome was the notion of Robust Links, basically an
approach to decorate links in HTML as a means to allow revisiting linked
content when a link has died or the linked content has changed. The
link decoration uses HTML5 data- attributes, which can be made actionable
using simple JavaScript.
- For a motivation regarding Robust Links, see
http://robustlinks.mementoweb.org/about/
- For the Link Decoration spec, see http://robustlinks.mementoweb.org/spec/
- For an example paper that shows Robust Links at work, see
http://dx.doi.org/10.1045/november2015-vandesompel
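
To make the link decoration idea concrete, here is a minimal Python sketch
that renders a decorated anchor. The attribute names data-versionurl and
data-versiondate are my reading of the Link Decoration spec and should be
checked against it; the function name and the example URLs are purely
hypothetical.

```python
from html import escape


def robust_link(href: str, text: str, version_url: str, version_date: str) -> str:
    """Render an <a> element decorated with archival information.

    ASSUMPTION: data-versionurl and data-versiondate are the attribute
    names used for the archived copy and its capture date; verify them
    against the Link Decoration spec before use.
    """
    attrs = {
        "href": href,                      # the original (live) URL
        "data-versionurl": version_url,    # URL of an archived snapshot
        "data-versiondate": version_date,  # date the link was intended for
    }
    rendered = " ".join(
        f'{name}="{escape(value, quote=True)}"' for name, value in attrs.items()
    )
    return f"<a {rendered}>{escape(text)}</a>"


if __name__ == "__main__":
    # Hypothetical URLs, for illustration only.
    print(
        robust_link(
            "http://example.com/page",
            "a page",
            "http://archive.example.org/20160805/http://example.com/page",
            "2016-08-05",
        )
    )
```

The href still points at the original resource, so the link behaves
normally for any client that ignores the extra attributes; a script that
understands them can offer the snapshot when the original is gone or has
changed.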

Cheers

Herbert

On Fri, Aug 5, 2016 at 10:56 AM, Brian Kardell <bkardell@gmail.com> wrote:

>
>
> On Fri, Aug 5, 2016 at 12:23 PM, Noah Mendelsohn <nrm@arcanedomain.com>
> wrote:
>
>> See [1].
>>
>> I thought this might be of some interest to the TAG. Seems to me that
>> this is OK insofar as the addin is a modification to a user agent, and is
>> presumably activated only with the user's consent.
>>
>> Nonetheless, this seems to embody a slightly skewed view of Web protocols:
>> if I as a URI authority serve a new or updated page, your browser will do
>> what I intend and show the user that new content. If I delete a page, the
>> browser will not honor that deletion, but will show content anyway. This
>> seems to me just a bit of a slippery slope. A 404 is just as meaningful in
>> Web protocols (no such page) as a 200 IMO.
>>
>> I'm not proposing that the TAG do anything about this or devote
>> significant time to it right now, just pointing it out in case it's of
>> interest.
>>
>> Thank you.
>>
>> Noah
>>
>>
>> [1] http://gadgets.ndtv.com/apps/news/firefox-will-try-to-show-you-saved-archive-of-a-page-instead-of-404-error-869482
>>
>>
>
> Noah,
>
> The UA actually shows a prompt when it encounters a 404 if there is a
> version in Wayback[1].  It seems to me that both Wayback and the UA are
> acting entirely within their appropriate boundaries, do they not? Your
> deletion is indeed honored, but if someone archived the page, it is indeed
> archived.  If you set up your server not to be archived, it shouldn't be
> (though it really still could be).  If the UA offers help in finding that
> copy, that seems not much different from all sorts of browser features
> (like a search toolbar).  Am I misunderstanding something?
>
>
> [1] https://testpilot-prod.s3.amazonaws.com/experiments_experimenttourstep/d/a/dafca30f93dadf7f13cc48d389e08f84_image_1470245154_0851.jpg
>
>
> --
> Brian Kardell :: @briankardell
>



-- 
Herbert Van de Sompel
Digital Library Research & Prototyping
Los Alamos National Laboratory, Research Library
http://public.lanl.gov/herbertv/
http://orcid.org/0000-0002-0715-6126

==
Received on Friday, 5 August 2016 18:35:09 UTC
