rdf:nodeID, rdf:ID and rdf:about from Miles, AJ (Alistair) on 2004-11-17 (public-esw-thes@w3.org from November 2004)

From: Miles, AJ (Alistair) <A.J.Miles@rl.ac.uk>
Date: Wed, 17 Nov 2004 11:28:04 -0000
To: "'public-esw-thes@w3.org'" <public-esw-thes@w3.org>
Message-ID: <350DC7048372D31197F200902773DF4C05E50D4D@exchange11.rl.ac.uk>
This is a general answer to the question: in an RDF/XML document describing
a thesaurus, should I use rdf:about, rdf:ID or rdf:nodeID, to identify
concepts?

First some preamble ...

An RDF document (in any syntax) describes a set of 'nodes'.  A node can be
identified by a URI.  If a node does not have a URI, it is a 'blank node'.
 
In an RDF/XML document, both rdf:about and rdf:ID give the *URI* of a node. 

The difference between them is that rdf:ID can only contain a bare name (an
XML NCName such as "foo"), which is read as the *relative* URI "#foo",
whereas rdf:about can contain either a *relative* or an *absolute* URI.

Just in case there is any confusion (Dave Beckett had to explain this to me
a while ago :) ...

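The following RDF/XML document:

<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
         xml:base="http://example.org/rdf">
  <rdf:Description rdf:ID="foo">
    <rdfs:label>A node with a URI</rdfs:label>
  </rdf:Description>
</rdf:RDF>

... is equivalent to:

<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
         xml:base="http://example.org/rdf">
  <rdf:Description rdf:about="#foo">
    <rdfs:label>A node with a URI</rdfs:label>
  </rdf:Description>
</rdf:RDF>

... is equivalent to:

<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#">
  <rdf:Description rdf:about="http://example.org/rdf#foo">
    <rdfs:label>A node with a URI</rdfs:label>
  </rdf:Description>
</rdf:RDF>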
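
For anyone who wants to check the equivalence by hand, here is a rough
Python sketch of how the three forms above resolve to one URI.  It uses only
urllib.parse.urljoin from the standard library; the two helper function names
and the "elsewhere.example" base are made up for illustration, and this is
not how any particular RDF parser is implemented.

```python
from urllib.parse import urljoin


def uri_from_rdf_id(base: str, name: str) -> str:
    """rdf:ID="name" is treated as the relative reference "#name"."""
    return urljoin(base, "#" + name)


def uri_from_rdf_about(base: str, reference: str) -> str:
    """rdf:about may be relative (resolved against base) or absolute."""
    return urljoin(base, reference)


BASE = "http://example.org/rdf"  # the xml:base used in the examples above

first = uri_from_rdf_id(BASE, "foo")
second = uri_from_rdf_about(BASE, "#foo")
# An absolute URI is unaffected by the base; this base is a made-up address.
third = uri_from_rdf_about(
    "http://elsewhere.example/doc.rdf", "http://example.org/rdf#foo"
)

print(first)
print(second)
print(third)
```

All three calls should give the same URI, which is the point of the three
documents above.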

Personally, I don't use rdf:ID.

In an RDF/XML document, you can identify a *blank node* by an rdf:nodeID
attribute.  The rdf:nodeID identifier has no meaning outside the scope of
the document.
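
To make the scoping point concrete, here is a small Python sketch (standard
library only).  It reads two tiny RDF/XML documents that both happen to use
the nodeID "c1" and keeps the resulting blank nodes apart by pairing each
nodeID with the document it came from.  The document names, the nodeID "c1",
the labels and the blank_nodes helper are all invented for the illustration;
a real RDF toolkit handles this for you in its own way.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS_NS = "http://www.w3.org/2000/01/rdf-schema#"

# Two documents built from one template, both using nodeID "c1" (made up).
DOC_TEMPLATE = """<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#">
  <rdf:Description rdf:nodeID="c1">
    <rdfs:label>{label}</rdfs:label>
  </rdf:Description>
</rdf:RDF>"""


def blank_nodes(doc_name, rdf_xml):
    """Return {(doc_name, nodeID): label} for each labelled blank node."""
    root = ET.fromstring(rdf_xml)
    nodes = {}
    for desc in root.iter("{%s}Description" % RDF_NS):
        node_id = desc.get("{%s}nodeID" % RDF_NS)
        label = desc.find("{%s}label" % RDFS_NS)
        if node_id is not None and label is not None:
            # The document name is part of the key: nodeIDs are local.
            nodes[(doc_name, node_id)] = label.text
    return nodes


merged = {}
merged.update(blank_nodes("doc-a", DOC_TEMPLATE.format(label="Agriculture")))
merged.update(blank_nodes("doc-b", DOC_TEMPLATE.format(label="Zoology")))

for key, label in sorted(merged.items()):
    print(key, label)
```

The two "c1" nodes stay distinct after merging, because the identifier only
means something inside its own document.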

So the original question is best rephrased as: should I use URIs to identify
concepts in an RDF description of a thesaurus, and if not, when and why not?


To answer that question:

If there is some 'thing' (such as a thesaurus concept) that you want to be
able to refer to from lots of different places, then it needs to have a URI
(or an 'inverse-functional property' ... but that's another email).
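
As a taster for the inverse-functional property route, here is a minimal
Python sketch of the rule spelled out in the quoted message further down:
two nodes that share a value for an inverse functional property are taken to
be the same thing.  Triples are plain tuples, the blank node labels
("_:gcl446", "_:gcl999", "_:index1") and the example.org indicator are made
up, and same_as_pairs is a hypothetical helper, not part of any reasoner.

```python
def same_as_pairs(triples, inverse_functional):
    """Apply: (?x ?p ?v), (?y ?p ?v), ?p inverse functional -> ?x sameAs ?y."""
    subjects_by_value = {}
    for subject, prop, value in triples:
        if prop in inverse_functional:
            subjects_by_value.setdefault((prop, value), []).append(subject)

    pairs = set()
    for subjects in subjects_by_value.values():
        for i, x in enumerate(subjects):
            for y in subjects[i + 1:]:
                if x != y:
                    pairs.add(frozenset((x, y)))
    return pairs


SUBJECT_INDICATOR = "skos:subjectIndicator"
GCL_TERM = "http://www.govtalk.gov.uk/schemasstandards/gcl.asp?term=446"

triples = [
    # Blank node from a thesaurus description (label is made up).
    ("_:gcl446", SUBJECT_INDICATOR, GCL_TERM),
    # A different concept with a different, hypothetical indicator.
    ("_:gcl999", SUBJECT_INDICATOR, "http://example.org/some-other-indicator"),
    # Blank node from somebody else's indexing data (label is made up).
    ("_:index1", SUBJECT_INDICATOR, GCL_TERM),
    ("_:index1", "rdfs:label", "made-up label"),
]

pairs = same_as_pairs(triples, {SUBJECT_INDICATOR})
print(sorted(sorted(pair) for pair in pairs))
```

Only the two nodes sharing the same subject indicator are paired; the third
concept is left alone.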

However, very few thesauri have been published with URIs for their concepts
(this is something we should strongly push for!!).  The concepts do not have
'official' URIs.  

It is quite reasonable to make up your own URIs in your own namespace for
concepts in someone else's thesaurus.  If you don't think the thesaurus
owners are going to publish URIs in the near future or ever, then making up
your own URIs is probably a good idea.  But if you think the owner of a
thesaurus can be persuaded to publish some 'official' URIs, then probably
best to wait.

When some thing has been given more than one URI, this is called 'URI
aliasing'.  There is nothing wrong with URI aliasing.  However, the more
aliases a thesaurus concept has, the harder it is to maintain referential
integrity.  This is a social issue.  If by our actions we can encourage the
adoption within as wide a community as possible of just one URI for some
thing, well we just make life easier for ourselves. 

In the interim, it is quite possible to publish an RDF description of
someone else's thesaurus concepts, without using any URIs.  Examples are at
[1] and [2].  Although you won't be able to refer to the concepts from other
RDF descriptions (i.e. this is not true 'semantic web' yet) you can still
use the URIless descriptions for data exchange.  I.e. you can load the data
into any application that understands RDF and use it locally.  

But I'd like to emphasise this: we should vigorously encourage thesaurus
owners to publish URIs for their concepts!

Hope that helps,

Al.

[1] http://isegserv.itd.rl.ac.uk/skos/gcl/
[2] http://isegserv.itd.rl.ac.uk/skos/apais/

---
Alistair Miles
Research Associate
CCLRC - Rutherford Appleton Laboratory
Building R1 Room 1.60
Fermi Avenue
Chilton
Didcot
Oxfordshire OX11 0QX
United Kingdom
Email:        a.j.miles@rl.ac.uk
Tel: +44 (0)1235 445440



> -----Original Message-----
> From: Houghton,Andrew [mailto:houghtoa@oclc.org]
> Sent: 12 November 2004 20:08
> To: Miles, AJ (Alistair)
> Subject: RE: RDF description of the Government Category List
> 
> 
> 
> I think I understand...  The rdf:nodeID is used locally in 
> the document for referencing, such as broader, narrower, and 
> related terms.  But to make assertions outside the document, 
> you give a skos:subjectIndicator which acts as the proxy to 
> find the correct skos:Concept.
> 
> So what happens when the skos:subjectIndicator is used in 
> multiple concepts?  Does that mean that you just related 
> multiple concepts?  I'm thinking that it is possible to have 
> the same skos:subjectIndicator for multiple concepts in a 
> single skos:ConceptScheme or multiple concepts from different 
> skos:ConceptScheme.  What does that mean, in regard to what 
> you said below?  Or... must a skos:subjectIndicator be 
> version and concept specific?
> 
> The question in my mind is: is it better to use rdf:nodeID, 
> rdf:ID, or rdf:about for your identifiers.  Why would you use 
> one over the other?  Maybe your answer to this question is 
> something that should also go in the SKOS guide.  Right now 
> I'm trying to decide how best to do this in the ADL Feature 
> Type Thesaurus since it doesn't have concept URIs.  Are 
> there any restrictions on the content for rdf:nodeID?  For 
> example, rdf:ID needs to conform to an XML NCName and 
> rdf:about needs to conform to a URI.
> 
> I also think that you might want to include the example 
> below, including the SKOS description and how someone would 
> reference the SKOS description, in the SKOS guide.  I'm 
> thinking the full example would go in the Subject-Based 
> Indexing with SKOS Concept section.
> 
> 
> Thanks, Andy.
> 
> > -----Original Message-----
> > From: Miles, AJ (Alistair) [mailto:A.J.Miles@rl.ac.uk] 
> > Sent: 12 November, 2004 14:40
> > To: Houghton,Andrew
> > Subject: RE: RDF description of the Government Category List
> > 
> > Hi Andy,
> > 
> > rdf:nodeID is there so you can assert statements between 
> > blank nodes within the same document.  But as you say you 
> > can't use them to reference from outside the document.
> > 
> > This is handy especially for when we want to identify things 
> > via an inverse functional property (such as foaf:homepage or 
> > skos:subjectIndicator), but don't want to give a reasoner 
> > excess work.  
> > 
> > So within the document that contains the RDF description of 
> > the GCL, all blank nodes refer to each other via an rdf:nodeID 
> > attribute.  All blank nodes also have a value for an inverse 
> > functional property, which you can use to refer to the nodes 
> > from outside the document.
> > 
> > So for example, I could do some subject indexing of the web 
> > with the GCL and state here that ...
> > 
> > <rdf:RDF>
> >   <rdf:Description
> >       rdf:about="http://www.basc.org.uk/content/accessshooting">
> >     <skos:subject>
> >       <skos:Concept>
> >         <skos:subjectIndicator
> >             rdf:resource="http://www.govtalk.gov.uk/schemasstandards/gcl.asp?term=446"/>
> >       </skos:Concept>
> >     </skos:subject>
> >   </rdf:Description>
> > </rdf:RDF>
> > 
> > The blank skos:Concept node in the above RDF description will 
> > be merged with the blank node from the GCL RDF description 
> > with the same value for a subjectIndicator property (by an 
> > OWL reasoner or a simple rule reasoner with a rule to support 
> > owl:InverseFunctionalProperty).
> > 
> > N.B. the implication of owl:InverseFunctionalProperty is:
> > 
> > (?x,?p,?v)
> > (?y,?p,?v)
> > (?p,rdf:type,owl:InverseFunctionalProperty)
> > ->
> > (?x,owl:sameAs,?y)
> > 
> > ... I'm pretty sure that's right.
> > 
> > Does that make sense?  
> > 
> > Al.
> > 
> > > -----Original Message-----
> > > From: Houghton,Andrew [mailto:houghtoa@oclc.org]
> > > Sent: 12 November 2004 16:59
> > > To: Miles, AJ (Alistair)
> > > Subject: RE: RDF description of the Government Category List
> > > 
> > > 
> > > 
> > > Could you answer some simple RDF questions?  After reading the RDF
> > > documents, a number of times, I just don't understand the difference
> > > between rdf:ID and rdf:nodeID and why you would use one over the
> > > other.  It also seems that if I put rdf:ID on something, that I can
> > > address that RDF description as document#id (or... xml:base#id).  But,
> > > it doesn't seem that you can do that with rdf:nodeID.  So if you
> > > cannot reference a rdf:nodeID outside the document, then what use is
> > > it to specify a rdf:nodeID on an skos:Concept?
> > > 
> > > 
> > > Thanks, Andy.
> > > 
> > > 
> > > > -----Original Message-----
> > > > From: public-esw-thes-request@w3.org 
> > > > [mailto:public-esw-thes-request@w3.org] On Behalf Of Miles, 
> > > > AJ (Alistair)
> > > > Sent: 12 November, 2004 11:44
> > > > To: 'public-esw-thes@w3.org'
> > > > Cc: Stella Dextre Clarke (E-mail)
> > > > Subject: RDF description of the Government Category List
> > > > 
> > > > 
> > > > Hi all,
> > > > 
> > > > I've done a SKOS/RDF description of the GCL, which you can get from:
> > > > 
> > > > http://isegserv.itd.rl.ac.uk/skos/gcl/
> > > > 
> > > > N.B. this is just an experimental version, done with nodeIDs 
> > > > and subjectIndicators pointing to the GovTalk website.  
> > > > 
> > > > Al.
> > > > 
> > > 
> > 
>
Received on Wednesday, 17 November 2004 11:28:38 UTC