Re: URIs for Concepts: Best Practices from David Menendez on 2004-04-23 (public-esw-thes@w3.org from April 2004)

From: David Menendez <zednenem@psualum.com>
Date: Fri, 23 Apr 2004 01:32:52 -0400
To: Charles McCathieNevile <charles@w3.org>
Cc: Kal Ahmed <kal@techquila.com>, "'public-esw-thes@w3.org'" <public-esw-thes@w3.org>
Message-id: <r02000200-1033-AA74D88E94E711D8AF23000393758032@[10.0.1.5]>

Charles McCathieNevile writes:

> cc- public-esw (this is relevant now to the thesaurus, and is relevant
> to the best practices working group)
> 
> On Tue, 20 Apr 2004, Kal Ahmed wrote:
> 
> >
> >On Mon, 2004-04-19 at 22:22, David Menendez wrote:
> >>
> >> That works, but my preference would be for something like
> >> <http://eionet.eu.int/GEMET/204>. In practice, using a fragment ID
> >> means that an HTTP request to a term's URI will return nothing or
> >> else a description of the entire vocabulary, which I'm guessing is
> >> pretty large.
> >>
> >I think that this practice would certainly work much better with
> >PSI/PSID constructs than the fragmentary approach - one resource per
> >concept is probably a best practice that the Published Subjects TC
> >should recommend.
> 
> Something I have done is use http://example.org/terms? as a
> namespace, so the URI resolves to a query (which says to cache it by
> good practice, since it doesn't change anything to run the query) and
> since that means you can do something different for the "bare URI".

Sure, from an RDF perspective, <http://example.org/term?204> is just as
good as <http://example.org/term/204>. I prefer the latter, because it
feels simpler and more flexible to me.

In practice, any decent web server would allow you to internally rewrite
"/term/204" to "/cgi/term?204". 

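As a rough illustration of that kind of internal rewrite, here is a
minimal Python sketch. The function name and the digits-only pattern are
assumptions made for the example; a real server would normally express
this in its own rewrite configuration rather than in application code.

```python
import re

# Hypothetical rule: a path-style concept URI such as "/term/204" is
# served internally by a CGI script addressed as "/cgi/term?204".
_TERM_PATH = re.compile(r"/term/(\d+)")


def rewrite_term_path(path: str) -> str:
    """Return the internal CGI path for a term path; pass others through."""
    match = _TERM_PATH.fullmatch(path)
    if match is None:
        return path  # not a term URI, leave it untouched
    return f"/cgi/term?{match.group(1)}"
```

The public URI stays <http://example.org/term/204>; only the server
knows that a script answers it.
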
> Of course there is no reason not to configure foo#bar to return just
> the relevant RDF if that's the content-type requested, but people
> don't.

The problem with using <http://example.org/foo#bar> is that the HTTP
request will have the form "GET /foo", not "GET /foo#bar", so the server
never sees the fragment. For a small dataset being served from a flat
file, fragment IDs are probably the best way to go, but if there's any
server-side scripting involved, it seems better to go with
non-fragmented URIs.
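
To make the point concrete, the following Python sketch builds the
request line a client would send for a given URI. The helper name
request_line is invented for this example; urlsplit is the standard
library parser.

```python
from urllib.parse import urlsplit


def request_line(uri: str) -> str:
    """Build the HTTP/1.1 request line a client would send for `uri`."""
    parts = urlsplit(uri)
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query
    # parts.fragment is dropped on purpose: the client keeps the
    # fragment to itself and never puts it on the wire.
    return f"GET {target} HTTP/1.1"
```

With this, <http://example.org/foo#bar> and <http://example.org/foo>
yield the same request line, while <http://example.org/term?204> and
<http://example.org/term/204> each yield a distinct one that a script
can act on.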

> >> It would be confusing for a URI to identify a thesaurus concept
> >> and an RDF file. The key, as I see it, is the idea that the
> >> response to an HTTP Get is a representation of the resource, not
> >> the resource itself. The fact that
> >> <http://xmlns.com/wordnet/1.6/Dog> returns an RDF/XML document,
> >> doesn't mean that it identifies that particular document. If, for
> >> some reason, you wanted to talk about that RDF/XML document
> >> instead of the word "Dog", you would need to use a blank node or a
> >> different URI.
> 
> that's why I like using .../Dog#term

I'd rather do <http://xmlns.com/wordnet/1.6/Dog> for the concept and
<http://xmlns.com/wordnet/1.6/Dog.rdf> for the RDF/XML document.

To be *really* unambiguous, you also have to consider the possibility of
changes over time. I've always felt that the best way to refer to a
representation retrieved via HTTP would be something like:

  [ a Representation
  ; source <http://xmlns.com/wordnet/1.6/Dog>
  ; date "2004-04-23T01:28:00Z"
  ]

You can also add properties like format, e-tag, last-modified, and the
content (in base64 if necessary).
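
As a sketch of how such a description might be generated at retrieval
time, the Python function below emits the same blank-node shape. The
function name is hypothetical, and Representation, source, and date are
simply the illustrative terms used above, not terms from any published
vocabulary.

```python
from datetime import datetime, timezone


def describe_representation(source: str, retrieved: datetime) -> str:
    """Render a blank node describing one retrieved representation.

    `retrieved` must be timezone-aware; it is normalised to UTC.
    """
    if retrieved.tzinfo is None:
        raise ValueError("retrieved must be timezone-aware")
    stamp = retrieved.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    return (
        "[ a Representation\n"
        f"; source <{source}>\n"
        f'; date "{stamp}"\n'
        "]"
    )
```

Calling it with the concept URI and the time of the GET gives one
description per retrieval, so two fetches of the same URI on different
days stay distinguishable.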

Alternately, if I'm correct in thinking that HTTP lets you use
Content-ID, you could identify specific representations using cid: URIs.
-- 
David Menendez <zednenem@psualum.com> <http://www.eyrie.org/~zednenem/>

Received on Friday, 23 April 2004 01:33:27 UTC