Blank nodes for concepts. from Miles, AJ (Alistair) on 2004-02-05 (public-esw-thes@w3.org from February 2004)

From: Miles, AJ (Alistair) <A.J.Miles@rl.ac.uk>
Date: Thu, 5 Feb 2004 18:03:50 -0000
To: "'public-esw-thes@w3.org'" <public-esw-thes@w3.org>
Message-ID: <350DC7048372D31197F200902773DF4C049440D9@exchange11.rl.ac.uk>

Hi,

A couple of people have picked up that in the examples in the documents [1]
[2] [3] I've encoded concepts as blank nodes, without URIs.  This email
addresses why I chose to do that.

My thinking is as follows.  We allow three methods for uniquely identifying
a concept:

	a.  The URI for the concept.
	b.  A combination of the concept's prefLabel and the URI of the
thesaurus to which it belongs.
	c.  A combination of the concept's externalID and the URI of the
thesaurus to which it belongs.

So, the following are all valid globally unique concept declarations:
----
<soks:Concept rdf:about="http://foo.com/examplethes/aconcept"/>
----
<soks:Concept>
	<soks:prefLabel>Bangers & Mash</soks:prefLabel>
	<rdfs:isDefinedBy rdf:resource="http://foo.com/examplethes"/>
</soks:Concept>
----
<soks:Concept>
	<soks:externalID>A00456</soks:externalID>
	<rdfs:isDefinedBy rdf:resource="http://foo.com/examplethes"/>
</soks:Concept>
----

I'll get to why in a minute.

I also then thought, rather than giving every concept and <rdfs:isDefinedBy>
property to indicate membership of some conceptual scheme, why not allow
people to subclass the <soks:Concept> class?  

So, for example, you could define the class:
----
<rdfs:Class rdf:about="http://foo.com/thesaurus/Concept">
	<rdfs:subClassOf
rdf:resource="http://www.w3c.rl.ac.uk/2003/11/21-skos-core#Concept"/>
	<rdfs:comment>This is the class of all concepts from the foo.com
thesaurus.</rdfs:comment>
</rdfs:Class>
----
Which would then allow globally unique concept declarations such as the
following:
----
<foo:Concept>
	<soks:prefLabel>Bangers & Mash</soks:prefLabel>
</foo:Concept>
----
<foo:Concept>
	<soks:externalID>A00456</soks:externalID>
</foo:Concept>
----

OK, so why bother?  

1.  It makes for better-looking RDF encodings (this is a serious point, as
it may help reduce the uptake hurdle - how many times have you heard people
groan that RDF looks like gobbledegook because of all the URIs?  Also
remember many potential users are from totally non sem-web environments,
e.g. english heritage.  RDF is a new and complicated beast to them.)

2.  It may not be appropriate to give a URI to a concept that is part of
some thesaurus that has been defined by an authority outside the semantic
web world.  So until the authority itself gives its own concepts URIs, we
can still make statements about them using reference-by-description.

On the down side ...

1.  Someone has to write a bit of reasoning code to equate all blank nodes
with the same prefLabel/rdfs:isDefinedBy property values, and run it over
the data before publishing it.

Where I fall on the matter:  in the short term use URIs to identify
concepts, so can work in a world without any reasoning required.  In the
slightly longer term look into allowing the blank-node style encodings, and
support the little bit of reasoning required with some code.

What does everyone think?

Al.


[1] http://www.w3c.rl.ac.uk/SWAD/deliverables/8.1.html
[2] http://www.w3c.rl.ac.uk/SWAD/deliverables/8.3.html
[3] http://www.w3c.rl.ac.uk/SWAD/deliverables/8.4.html

CCLRC - Rutherford Appleton Laboratory
Building R1 Room 1.60
Fermi Avenue
Chilton
Didcot
Oxfordshire OX11 0QX
United Kingdom

Email:        a.j.miles@rl.ac.uk
Telephone: +44 (0)1235 445440

Received on Thursday, 5 February 2004 13:04:07 UTC