[whatwg] Link rot is not dangerous

From: Manu Sporny <msporny@digitalbazaar.com>
Date: Fri, 15 May 2009 14:24:50 -0400
Message-ID: <4A0DB372.60702@digitalbazaar.com>
Kristof Zelechovski wrote:
> I understand that there are ways to recover resources that disappear from
> the Web; however, the postulated advantage of RDFa "you can go see what it
> means" simply does not hold. 

This is a strawman argument; more below...

> All this does not imply, of course, that RDFa is no good.  It is only
> intended to demonstrate that the postulated advantage of the CURIE
> lookup is wishful thinking.

That train of logic seems to falsely conclude that if something does not
hold true 100% of the time, then it cannot be counted as an advantage.

Example:

Since the postulated advantage of RAID-5 is that a disk array is
unlikely to fail due to a single disk failure, and since it is possible
for more than one disk to fail before a recovery is complete, one cannot
call running a disk array in RAID-5 mode an advantage over not running
RAID at all (because failure is possible).

or

Since the postulated advantage of CURIEs is that "you can go see what it
means" and it is possible for a CURIE-defined URL to be unavailable, one
cannot call it an advantage because it may fail.

There are two flaws in the premises and reasoning above, for the CURIE case:

- It is assumed that for something to be called an 'advantage', it
  must hold true 100% of the time.
- It is assumed that most proponents of RDFa believe that "you can go
  see what it means" holds at all times - one would have to be very
  deluded to believe that.

> The recovery mechanism, Web search/cache,
> would be as good for CURIE URL as for domain prefixes.  Creating a redirect
> is not always possible and the built-in redirect dictionary (CURIE catalog?)
> smells of a central repository. 

Why does having a file sitting on your local machine that lists
alternate vocabulary files for CURIEs smell of a central repository?
Perhaps you're assuming that the file would be managed by a single
entity? If so, it wouldn't need to be, and that was not what I was proposing.

> Serving the vocabulary from the own domain is not always possible, e.g. in
> case of reader-contributed content, 

This isn't clear; could you please clarify what you mean by
"reader-contributed content"?

> and only guarantees that the vocabulary
> will be alive while it is supported by the domain owner.

This case and its solution were already covered previously. Again - if
the domain owner disappears, the domain disappears, or the domain owner
doesn't want to cooperate for any reason, one could easily set up an
alternate URL and instruct the RDFa processor to re-direct any
discovered CURIEs that match the old vocabulary to the new
(referenceable) vocabulary.
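
As a rough illustration of that redirect idea, here is a minimal Python
sketch. It is not the API of any existing RDFa processor: the function
names, the REDIRECTS catalog and both example URLs are invented for this
example. It expands a CURIE with the document's prefix map, then rewrites
the result if it falls under a vocabulary listed in a local catalog.

```python
# Hypothetical local catalog: old vocabulary URL prefix -> alternate URL prefix.
# Both URLs are made up for illustration.
REDIRECTS = {
    "http://old.example.org/vocab#": "http://mirror.example.net/vocab#",
}


def expand_curie(curie, prefixes):
    """Expand 'prefix:reference' into a full URL using the given prefix map."""
    prefix, _, reference = curie.partition(":")
    return prefixes[prefix] + reference


def redirect(url, redirects=REDIRECTS):
    """Rewrite a URL whose vocabulary has moved; leave any other URL untouched."""
    for old, new in redirects.items():
        if url.startswith(old):
            return new + url[len(old):]
    return url


# Prefix map as it might be declared in a document (assumed for the example).
prefixes = {"ex": "http://old.example.org/vocab#"}
resolved = redirect(expand_curie("ex:title", prefixes))
print(resolved)
```

The catalog here is just a dictionary that anyone could keep in a local
file, so nothing about it requires a single managing entity. The rewritten
URL is only where the processor would go to look up the vocabulary.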

> (WHATWG wants HTML documents to be readable 1000 years from now.)  

Is that really a requirement? What about external CSS files that
disappear? External JavaScript files that disappear? External SVG files
that disappear? All those have something to do with the document's
human/machine readability. Why is HTML5 not susceptible to link rot in
the same way that RDFa is susceptible to link rot?

Also, why 1000 years? That seems a bit arbitrary. =P

> It is not always practical either as it could confuse URL-based 
> tools that do not retrieve the resources referenced.

Could you give an example of this that wouldn't be a bug in the
dereferencing application? How could a non-dereferenceable URL "confuse
URL-based tools"?

-- manu

-- 
Manu Sporny
President/CEO - Digital Bazaar, Inc.
blog: A Collaborative Distribution Model for Music
http://blog.digitalbazaar.com/2009/04/04/collaborative-music-model/
Received on Friday, 15 May 2009 11:24:50 UTC