Re: In defence of 404 Re: INSEE releases OWL ontology and RDF data for geographical entities from Harry Halpin on 2006-08-05 (semantic-web@w3.org from August 2006)

From: Harry Halpin <hhalpin@ibiblio.org>
Date: Sat, 05 Aug 2006 02:13:19 +0100
To: Dan Connolly <connolly@w3.org>
Cc: Bernard Vatant <bernard.vatant@mondeca.com>, semantic-web@w3.org, Eric van der Vlist <vdv@dyomedea.com>, Franck Cotton <franck.cotton@insee.fr>
Message-ID: <44D3F0AF.7090206@ibiblio.org>
Ah, the infamous and never-ending hash vs. slash debate rears its head
yet again :)

Dan's method is nice because you get a distinct URI for your
non-information resource but can still retrieve a document or ontology
using a separate URI (i.e. the non-hashed racine) for free. However,
this view gets into a bit of trouble if you return *both an HTML document*
and an ontology and you don't support conneg. Bernard is arguing that you
can get more control if you have multiple slash URIs that directly
resolve to documents. To me, the main advantage/disadvantage of this
would be allowing different terms to return *different* human-readable
documents/RDF. This may sound like an engineering nightmare. Are there
any tools that make this easier besides just reconfiguring your 404?
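
To make the difference concrete, here is a rough Python sketch, not
anybody's actual server setup. It shows that a client strips the
fragment from a hash URI before doing the GET (so the racine comes for
free), while a slash URI is requested as-is, and it shows a very naive
form of conneg for choosing between HTML and RDF/XML. The function names
document_uri and pick_media_type are made up for illustration, and the
Accept parsing is deliberately simplified (no wildcards, no specificity
rules).

```python
from urllib.parse import urldefrag


def document_uri(term_uri: str) -> str:
    """Return the URI a client would actually request over HTTP.

    The fragment (the part after '#') is never sent to the server,
    so a hash URI always resolves via its non-hashed racine.
    """
    racine, _fragment = urldefrag(term_uri)
    return racine


def pick_media_type(accept: str,
                    offered=("text/html", "application/rdf+xml")) -> str:
    """Hypothetical, simplified conneg: the offered media type with the
    highest q-value in the Accept header wins; the first offered type
    is the fallback when nothing matches."""
    best, best_q = offered[0], 0.0
    for part in accept.split(","):
        fields = [f.strip() for f in part.split(";")]
        media, q = fields[0], 1.0
        for field in fields[1:]:
            if field.startswith("q="):
                try:
                    q = float(field[2:])
                except ValueError:
                    q = 0.0
        if media in offered and q > best_q:
            best, best_q = media, q
    return best


# Hash URI: the term and the document get distinct URIs.
print(document_uri("http://www.w3.org/2000/01/rdf-schema#subClassOf"))
# Slash URI: the term URI itself is what gets requested.
print(document_uri("http://rdf.insee.fr/geo/COM_80078"))
# An RDF-aware client versus a browser asking for the same racine.
print(pick_media_type("application/rdf+xml, text/html;q=0.5"))
print(pick_media_type("text/html"))
```

Without something like pick_media_type on the server, the racine can
only ever return one of the two representations, which is the trouble
described above.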

My guess is it's a preference right now more than anything, and the W3C
has rather carefully not written anything in stone quite yet. I would
recommend people interested in this issue read the rather healthy debate
we had at the "Identity, Reference, and the Web" workshop at WWW2006 [1]
and the new summary [2]. SemWeb people need to think long and hard over
this one, and most importantly, make some *tools* to make the job of
maintaining semantic web data at URIs easy. Regardless, now with
multiple ways of embedding RDF in HTML such as RDF/A[4] and GRDDL[3], I
think we can make progress on the issue of mixing human and
machine-readable documents. At the IRW workshop people were talking
about starting an XG to decide best practices about these very issues,
which would be a good idea. Leveraging the power of URIs is important to
make the SemWeb work.

[1]http://www.ibiblio.org/hhalpin/irw2006
[2]http://www.ibiblio.org/hhalpin/irw2006/summary.html
[3]http://www.w3.org/2001/sw/grddl-wg
[4]http://www.w3.org/TR/xhtml-rdfa-primer/

Dan Connolly wrote:
> On Fri, 2006-08-04 at 23:53 +0200, Bernard Vatant wrote:
> [...]
>   
>> And actually, this should be the general situation in SW publication :
>> there is no authoritative, definitive, complete, description of a
>> resource, packaged in one file, with a single access point.
>>     
>
> Yes, there are authoritative descriptions in the Semantic Web.
> Perhaps not complete definitions, but for a URI such
> as
>   http://www.w3.org/2000/01/rdf-schema#subClassOf
> any document you get back by doing an http GET
> of http://www.w3.org/2000/01/rdf-schema
> is authoritative. That's how the web works: we all agree
> that if you lease/buy a domain name, you get to say what
> the URIs starting with http:// and that domain mean, and
> we agree that if you run a web server and serve up
> documents there, they are authoritative w.r.t. the meanings
> of those URIs.
>
> Anybody else is free to say things about rdfs:subClassOf,
> but the document that W3C serves up at
> http://www.w3.org/2000/01/rdf-schema says that rdfs:subClassOf
> is an rdf:Property, and if some other document 
> says that it's not an rdf:Property, that other document should
> be considered in error.
>
>
>   
>>  So, the best a URI can do, when its referent is not an accessible
>> thing, and its main purpose is identifying the resource in
>> distributed descriptions, if one wants to make sense of it through
>> http protocol - since it's an http URI after all - is to get access to
>> some information like : "Sorry, what you try to access by this URI is
>> not an accessible resource. But its description can be found in RDF
>> files X, Y, Z, ...".
>>     
>
> That's not the best we can do.
> If you use URIs of the form DOC#TERM for non-information
> resources, then the information resource DOC can
> say things like { <#TERM> rdf:type geo:City }.
>
>
>   
>>  And the more I think about it, the more I think that the 404 page
>> that you get through http://rdf.insee.fr/geo/COM_80078 is close to
>> that. Agreed, the current message displayed on the page is suboptimal,
>> independently of the fact that it is in French, but replace it by the
>> quote I suggest above, and it makes much more sense than any fragment
>> identifier. 
>>     
>
> Really? It doesn't appeal to me at all.
>
>   
>> Maybe in contradiction with what I wrote in a previous message, where
>> I suggested that maybe we could have kept the # namespace for the
>> ontology, I think now that this argument holds for ontology elements
>> as well. Granted, we have now published a single ontology file
>> containing a description of e.g., http://rdf.insee.fr/geo/Commune. But
>> next year we can have another version, or another ontology defining
>> the same entity, with the same URI, at another level of detail, and
>> which the publisher would not like to see merged with the previous
>> one.
>>     
>
> Hmm... I can't imagine why not. Care to elaborate?
>
>   
>>  There again, packaging considerations naturally lead to define
>> several files containing partial descriptions of the same resource.
>>     
>
> It seems very unnatural to me to use anything other than a single
> static file for the case of an ontology with just a few dozen terms.
> Maybe a handful of content-negotiated static files. But not more than
> that.
>
>   
>> I'm well aware this is highly controversial and certainly not in tune
>> with the TAG recommendation on httpRange14 issue. 
>>
>> Bernard
>>
>> [1]
>> http://www.ibiblio.org/hhalpin/irw2006/presentations/HayesSlides.pdf
>>     
>
>   


-- 
		-harry

Harry Halpin,  University of Edinburgh 
http://www.ibiblio.org/hhalpin 6B522426
Received on Saturday, 5 August 2006 01:13:30 UTC