AW: URIs for Concepts: Best Practices from Leo Sauermann on 2004-04-20 (public-esw-thes@w3.org from April 2004)

From: Leo Sauermann <leo@gnowsis.com>
Date: Tue, 20 Apr 2004 11:09:24 +0200
To: "'Kal Ahmed'" <kal@techquila.com>, "'David Menendez'" <zednenem@psualum.com>
Cc: "'Miles, AJ (Alistair)'" <A.J.Miles@rl.ac.uk>, <public-esw-thes@w3.org>, <public-esw@w3.org>
Message-ID: <000501c426b7$2d593c30$0c01a8c0@Bundeslade>
This discussion is coming up from time to time. Its also called
"Identity crisis" or "Uri crisis"

Some interesting articles are:
-rfc2396 (uri)
- http://www.w3.org/2002/11/dbooth-names/dbooth-names_clean.htm 
- http://www.xml.com/pub/a/2002/09/11/deviant.html
- http://www.w3.org/DesignIssues/HTTP-URI 

My opinion is to use Http URIs because:
- they are unique
- you can optionally put some content at the place the uri identifies
(be it RDF or HTML)

another guy behind this approach is Patrick Stickler and his URIQA.

A good concept to think about when using Uris to identify more than one
thing is to "Seperate by Ontology"

You can use a single resource uri and annotate it with different
triples, when you want to describe the "Web resource" aspect of the uri,
use a web ontology, when you want to describethe "dog-concept" aspect,
use the dog/concept ontology. They won't mix up, namespaces do the
seperation. And a single resource can have more than one type. voila.

Ad1: Don't use HASH identifiers if you can avoid them!
http://test/doh#hello
may come to your web server as:
http://test/doh
--> whoops.

greetings
Leo Sauermann



> -----Ursprüngliche Nachricht-----
> Von: public-esw-request@w3.org 
> [mailto:public-esw-request@w3.org] Im Auftrag von Kal Ahmed
> Gesendet: Dienstag, 20. April 2004 09:19
> An: David Menendez
> Cc: Miles, AJ (Alistair); 'public-esw-thes@w3.org'; 
> 'public-esw@w3.org'
> Betreff: Re: URIs for Concepts: Best Practices
> 
> 
> 
> On Mon, 2004-04-19 at 22:22, David Menendez wrote:
> > Miles, AJ (Alistair)  writes:
> > 
> > > I wanted to consult you all on this matter.  I have 
> agreement from 
> > > the EEA to publish the GEMET environmental thesaurus in 
> the SKOS/RDF 
> > > format.  The next step is to work out with them the URIs 
> they wish 
> > > to assign to their thesaurus and concepts.  I'm not sure what to 
> > > recommend to them on this matter.
> > 
> > Dan Brickley's Wordnet vocabulary service[1] at xmlns.com 
> seems like a 
> > useful model. Essentially, each concept is given a 
> (non-fragmentary) 
> > URI which, if dereferenced, returns a description of the 
> concept. Mr 
> > Brickley's system only returns RDF/XML presently, but there's no 
> > reason it couldn't also return HTML or something else via content 
> > negotiation.
> > 
> > [1] http://xmlns.com/2001/08/wordnet/
> > 
> > > I thought to use an http:// based URI base (e.g.
> > > http://www.eionet.eu.int/GEMET) and then add the id 
> number of each 
> > > concept (e.g. http://www.eionet.eu.int/GEMET#204).
> > 
> > That works, but my preference would be for something like 
> > <http://eionet.eu.int/GEMET/204>. In practice, using a fragment ID 
> > means that an HTTP request to a term's URI will return 
> nothing or else 
> > a description of the entire vocabulary, which I'm guessing 
> is pretty 
> > large.
> > 
> I think that this practice would certainly work much better 
> with PSI/PSID constructs than the fragmentary approach - one 
> resource per concept is probably a best practice that the 
> Published Subjects TC should recommend.
> 
> > > A first question is, is it OK to use http: URIs for 
> concepts?  Sorry 
> > > to drag this old chestnut up again, but I need some clear 
> answer on 
> > > best practices for this.  Are we not at all concerned 
> that the same 
> > > URI may identify both a thesaurus concept and a 
> resolveable network 
> > > resource (i.e. the file containing the RDF data)?
> > 
> > It would be confusing for a URI to identify a thesaurus 
> concept and an 
> > RDF file. The key, as I see it, is the idea that the response to an 
> > HTTP Get is a representation of the resource, not the 
> resource itself. 
> > The fact that <http://xmlns.com/wordnet/1.6/Dog> returns an RDF/XML 
> > document, doesn't mean that it identifies that particular document. 
> > If, for some reason, you wanted to talk about that RDF/XML document 
> > instead of the word "Dog", you would need to use a blank node or a 
> > different URI.
> > 
> It is certainly true that content negotiation gives you the 
> problem of talking about the descriptive resource as opposed 
> to the described thing. That is a strong argument against 
> content negotiation for RDF / XTM resources. However, there 
> are still two other options:
> 
> 1) Embed the RDF / TM markup in its XML form. Then use an 
> rdf:ID attribute or XTM id attribute so that the reference to the
RDF/XTM would be > <http://xmlns.com/wordnet/1.6/Dog#foo>
> 
> 2) Use a profile of 
> XLink to link to the RDF / TM resource that describes the 
> concept, and make it completely separate. e.g. 
<http://xmlns.com/wordnet/1.6/Dog/dog.rdf>

> Not everyone agrees with this position.
> 
I don't think that a position can ever be established which everyone 
will agree with :-)

> > What do you think of info: based URIs for concepts?
> 
> >From an RDF perspective, it's just as good. From a web perspective, 
> >it's
> less useful because it can't be dereferenced.

I tend to agree. I tend to consider the use of URIs for subject
identification as being divided into three categories:
1) The URI resolves to the subject being described
2) The URI resolves to a description of the subject being described
3) The URI is used as a pure, unresolvable identifier

I think (2) gives the greatest possibility for interchange of semantics
if the resource addressed by the URI is human-readable - at some point
the processing of semantics has to be transferred from SW machinery to
wet-ware.

Cheers,

Kal
-- 
Kal Ahmed <kal@techquila.com>
techquila
Received on Tuesday, 20 April 2004 05:09:39 UTC