Re: naive question: why prefer absolute URIs to # URIs for linked data? from Martin J. Dürst on 2011-08-30 (www-tag@w3.org from August 2011)

From: Martin J. Dürst <duerst@it.aoyama.ac.jp>
Date: Tue, 30 Aug 2011 10:11:33 +0900
To: Ivan Herman <ivan@w3.org>
CC: Jonathan Rees <jar@creativecommons.org>, www-tag@w3.org, Manu Sporny <msporny@digitalbazaar.com>, Harry Halpin <hhalpin@ibiblio.org>, Ian Davis <me@iandavis.com>
Message-ID: <4E5C38C5.8010100@it.aoyama.ac.jp>

Hello Ivan,

On 2011/08/29 20:47, Ivan Herman wrote:

> in my understanding that is related to the follow-your-nose principle. If I see a URI for a, say, predicate, I may want to follow that URI and get some information. That predicate (or class or whatever) is rarely alone, it may be part of a vocabulary.
>
> If the URI is of the form http://blabla#blah, that means that I, typically, have a large vocabulary file at http://blabla and #blah is somewhere there. So if I dereference http://blabla#blah, I will get the full vocabulary and I will have to locate the specific element #blah to do something with it (as a caller). If the vocabulary is very large, that might be a pain.
>
> If the URI is of the form http://blabla/blah, and I dereference it then I can expect to get only the information I am looking for.

That's indeed a potential problem, but not an absolute argument against
using fragment identifiers. Quite the contrary: if I'm following my
nose to dereference http://blabla#blah, there is a reasonable chance
that I will subsequently want, or have, to dereference other terms
in the http://blabla vocabulary, such as http://blabla#foo and
http://blabla#bar.

If I have already downloaded http://blabla, then these are essentially
immediately available, whereas if I have to access http://blabla/foo and
http://blabla/bar separately, it will take some time (network access is
orders of magnitude slower than local memory access).
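
As a rough illustration of this point, here is a minimal Python sketch
that counts how many distinct documents a client would have to retrieve
for a set of term URIs. It relies on the fact that the fragment is
stripped before a request is made, so all # URIs of one vocabulary map
to the same document. The function name count_fetches and the two URI
lists are made up for the example; no network access is performed.

```python
from urllib.parse import urldefrag

def count_fetches(uris):
    """Return the number of distinct documents needed to resolve `uris`.

    Hypothetical helper: a document is identified by the URI with its
    fragment removed, which is what a client actually requests.
    """
    documents = set()
    for uri in uris:
        document, _fragment = urldefrag(uri)  # split off "#..." part
        documents.add(document)
    return len(documents)

# Example URIs taken from the discussion above.
hash_uris = ["http://blabla#blah", "http://blabla#foo", "http://blabla#bar"]
slash_uris = ["http://blabla/blah", "http://blabla/foo", "http://blabla/bar"]

print(count_fetches(hash_uris))   # one shared vocabulary document
print(count_fetches(slash_uris))  # one document per term
```

With the # URIs, the second and third lookups can be served from the
copy already in memory; with the / URIs, each term costs its own
network round trip unless the server does something more clever.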

So the only conclusion from this seems to be a good practice: "When 
creating vocabularies, organize them in reasonable-sized chunks of 
closely related terms to avoid having to download a huge document or a 
large number of very tiny documents."

Of course, the optimum size and organization of the chunks will depend a
lot on network bandwidth and latency and on the actual set of terms
needed, but what's important here is not to reach the optimum, but to
avoid completely inefficient edge cases (every term in a separate
document even if closely related, or a single huge file of stuff that is
rarely needed together). That's good because the chunking only needs to
be roughly right once: reorganizing it later, even infrequently, would
be a bad idea.
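
The trade-off can be sketched with a very crude cost model: each
document costs one round trip of latency plus its size divided by the
bandwidth. Everything in the Python snippet below is an assumption made
for illustration only (sequential requests, 100 ms latency, 125,000
bytes/s bandwidth, 500 bytes per term, a client needing 10 closely
related terms); it ignores caching, compression, and HTTP overhead.

```python
def download_time(doc_sizes, latency_s, bandwidth_bytes_per_s):
    """Total time to fetch documents one after another (crude model)."""
    return sum(latency_s + size / bandwidth_bytes_per_s for size in doc_sizes)

# Hypothetical numbers, chosen only to make the edge cases visible.
LATENCY = 0.1          # seconds per request
BANDWIDTH = 125_000    # bytes per second (about 1 Mbit/s)
TERM_SIZE = 500        # bytes of description per term
NEEDED_TERMS = 10      # closely related terms the client needs

# Edge case 1: one huge file with 1000 terms, most of them unneeded.
one_huge_file = download_time([1000 * TERM_SIZE], LATENCY, BANDWIDTH)

# Edge case 2: every term in its own tiny document.
one_doc_per_term = download_time([TERM_SIZE] * NEEDED_TERMS, LATENCY, BANDWIDTH)

# Middle ground: one chunk of 50 related terms containing the 10 needed.
one_chunk = download_time([50 * TERM_SIZE], LATENCY, BANDWIDTH)

print(f"huge file:    {one_huge_file:.2f} s")
print(f"doc per term: {one_doc_per_term:.2f} s")
print(f"one chunk:    {one_chunk:.2f} s")
```

Under these assumed numbers the moderately sized chunk beats both edge
cases, and it keeps doing so over a fairly wide range of latency and
bandwidth values, which is why hitting the exact optimum matters less
than staying away from the extremes.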

In any case, these considerations clearly seem to support the use of 
fragment identifiers (into reasonable-sized files), rather than to be an 
argument against fragment identifiers.

Regards,    Martin.

Received on Tuesday, 30 August 2011 01:12:33 UTC