Re: INSEE releases OWL ontology and RDF data for geographical entities from Richard Newman on 2006-08-06 (semantic-web@w3.org from August 2006)

From: Richard Newman <r.newman@reading.ac.uk>
Date: Sat, 5 Aug 2006 17:39:04 -0700
To: "Xiaoshu Wang" <wangxiao@musc.edu>
Cc: Semantic Web <semantic-web@w3.org>
Message-Id: <110B17BF-95D8-4258-AF9B-568CCA23816A@reading.ac.uk>
>> And then to look up http://rdf.insee.fr/geo#code_commune , a
>> consumer just GETs http://rdf.insee.fr/geo as usual; then
>> when they want to look up another term such as
>> http://rdf.insee.fr/geo#subdivision,
>> they can save a round trip because they already have it.
>>
>> Using a / namespace has a higher cost for the producer
>> (redirects) and for the consumer (one GET per term rather
>> than one GET for the ontology).
>
> I actually disagree with this a little bit.  To save a roundtrip
> requires caching on the consumer's side.  The same can/should also be
> done with the slash URI.

No, it can't.

Retrieving

http://example.com/ont#one
http://example.com/ont#two
http://example.com/ont#three

makes three requests for "http://example.com/ont", which is the
resource: the client strips the fragment before sending the request.
That resource is very likely to be cached in at least one place
along the line; hopefully within the tool itself, for one.

Retrieving

http://example.com/ont/one
http://example.com/ont/two
http://example.com/ont/three

requires three separate requests for three separate resources, and an
HTTP client *does not know* that it should -- *perhaps* -- retrieve
"http://example.com/ont" to get a description of those three terms.

The web architecture does not support retrieval of terms from the  
entire hierarchy.
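
To make the difference concrete, here is a minimal sketch in Python. It
does no real HTTP; it only simulates a cache keyed by the URL that would
actually go on the wire. The helper name count_fetches is made up for
illustration.

```python
from urllib.parse import urldefrag

def count_fetches(term_uris):
    """Count the fetches a URL-keyed cache would need for these terms."""
    cache = set()
    fetches = 0
    for uri in term_uris:
        # The fragment is never sent to the server, so strip it first.
        target = urldefrag(uri).url
        if target not in cache:
            cache.add(target)
            fetches += 1
    return fetches

hash_terms = [
    "http://example.com/ont#one",
    "http://example.com/ont#two",
    "http://example.com/ont#three",
]
slash_terms = [
    "http://example.com/ont/one",
    "http://example.com/ont/two",
    "http://example.com/ont/three",
]

hash_fetches = count_fetches(hash_terms)    # one shared resource
slash_fetches = count_fetches(slash_terms)  # three distinct resources
print(hash_fetches, slash_fetches)
```

Nothing in the slash URIs themselves tells the cache that
"http://example.com/ont" might describe all three terms.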

> But, in order to do so, there are a few things that need to be
> clarified.
>
> First, should W3C clearly define a policy regarding what can be a
> legit namespace from a URI?
>
> The reason for me to raise this question is this.  Unlike hash URI,
> the namespace URI can not be "inferred" or "guessed" from a URI
> itself.  For instance, a URI of http://foo.com/bar can be constructed
> with namespace http://foo.com/, http://foo.com/b or http://foo.com/ba
> coupled with local ID of "bar", "ar" and "r", respectively.

It depends. QNames have interesting rules:

http://www.w3.org/2001/tag/issues.html
http://www.w3.org/2001/tag/doc/qnameids

I've seen some tools where a namespace prefix not ending in '#' or  
'/' will effectively have '#' appended by default -- i.e., the  
namespace is a resource in and of itself, and the concatenation  
operator is not simply string concatenation. (I suspect this is in  
some document somewhere, but I don't have a reference to hand.)
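
As an illustration only (not the API of any particular tool), the
behaviour described above could look like the sketch below. The function
name expand and the splitting loop are assumptions made up for this
example. Under such a rule, only one of the three candidate splits of
http://foo.com/bar reproduces the original URI.

```python
def expand(namespace: str, local: str) -> str:
    """Join namespace and local name; insert '#' unless the namespace
    already ends in '#' or '/'. Hypothetical helper, for illustration."""
    if namespace.endswith(("#", "/")):
        return namespace + local
    return namespace + "#" + local

uri = "http://foo.com/bar"
base = "http://foo.com/"

# Every way of cutting the URI after the authority's trailing slash.
splits = [(uri[:i], uri[i:]) for i in range(len(base), len(uri))]

# Keep the splits that round-trip under the '#'-appending rule.
round_trips = [(ns, local) for ns, local in splits if expand(ns, local) == uri]

print(splits)
print(round_trips)
```

With plain string concatenation all three splits would round-trip, which
is exactly the ambiguity raised in the quoted question.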

As a side point: the fact that you're attempting to compute a  
'namespace URI' from a resource illustrates that you're thinking in  
terms of canonical documents at some root that you can find by  
inspecting the URI. Don't think like this. The Semantic Web is a  
graph, and HTTP requests to some URIs will return some fragments of  
graph. Anything more than that is mostly wishful thinking. Those  
fragments can be dynamically generated, empty, or they might be 404s  
instead.

> Shouldn't there be a policy governing this?  For instance, to say
> that only the first one is legit?
>
> Second, should dereferencing a URI retrieve all the statements under
> the same namespace?  I.e., should dereferencing http://foo.com/bar
> retrieve the same document as dereferencing http://foo.com/bar2,
> assuming they share the same namespace?

That is surely up to the content provider -- whoever owns foo.com  
should return RDF (or HTML!) for each request that is sufficient to  
describe each resource. If this happens to be the same chunk of RDF,  
then so be it.

I would expect that each request should return different triples,  
perhaps with some being shared. Some of those triples will directly  
support the kind of discoverability you would like. (E.g., definedBy,  
seeAlso...)
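
A rough sketch of how a consumer might use those triples, assuming the
response has already been parsed into (subject, predicate, object)
tuples. The sample triples and the function name follow_up_links are
invented for illustration; rdfs:isDefinedBy and rdfs:seeAlso are the
real RDFS properties.

```python
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
DISCOVERY_PREDICATES = {RDFS + "isDefinedBy", RDFS + "seeAlso"}

def follow_up_links(term, triples):
    """Return the URIs a description of `term` points to for more data."""
    return sorted(
        {o for s, p, o in triples if s == term and p in DISCOVERY_PREDICATES}
    )

# Hypothetical triples returned when dereferencing the first term.
triples = [
    ("http://example.com/ont/one", RDFS + "label", "one"),
    ("http://example.com/ont/one", RDFS + "isDefinedBy", "http://example.com/ont"),
    ("http://example.com/ont/one", RDFS + "seeAlso", "http://example.com/ont/two"),
    ("http://example.com/ont/two", RDFS + "isDefinedBy", "http://example.com/ont"),
]

links = follow_up_links("http://example.com/ont/one", triples)
print(links)
```

Here the link to the defining document comes from the data, not from
inspecting the shape of the URI.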

> Inferred from the arguments so far, the answer should be "no",
> because otherwise slash URIs would be no different from hash URIs
> except for the added server-side complexity.
>
> Then, i.e., if the answer to question 2 is "no", what is the use of
> the namespace if it only serves as a shorthand for URIs?  What would
> I get when I dereference the namespace of a slash URI?

There is no namespace, at least not in the way you're thinking about
it. I think if you turned every piece of RDF you've ever seen into
N-Triples, which supports no prefixing shorthand, then your
interpretation of the issues would be slightly different.

It's all just URIs, some of which happen to be defined together in
one representation returned from one resource (e.g.,
http://xmlns.com/foaf/0.1/).
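
To illustrate the point about prefixes being mere shorthand, here is a
small sketch that expands prefixed names into a full N-Triples line. The
"ex" prefix, the sample statement and the helper names are assumptions
for this example, and the sketch only handles URI objects, not literals.

```python
PREFIXES = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "foaf": "http://xmlns.com/foaf/0.1/",
    "ex": "http://example.com/people#",  # hypothetical namespace
}

def to_uri(qname: str) -> str:
    """Expand 'prefix:local' by plain concatenation with the prefix URI."""
    prefix, local = qname.split(":", 1)
    return PREFIXES[prefix] + local

def ntriple(s: str, p: str, o: str) -> str:
    """Render one statement with every term written out as a full URI."""
    return "<%s> <%s> <%s> ." % (to_uri(s), to_uri(p), to_uri(o))

line = ntriple("ex:alice", "rdf:type", "foaf:Person")
print(line)
```

Once expanded, nothing in the output records where a "namespace" ended
and a "local name" began.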

Any form of URI inspection, use of prefixes, etc. to discover useful  
RDF sources is a heuristic, admittedly typically supported by best  
practices. Attempting to do that with 'slash URIs' is just one point  
where the heuristic starts to diverge from the architecture.

-R
Received on Sunday, 6 August 2006 00:39:20 UTC