- From: Stella Dextre Clarke <sdclarke@lukehouse.demon.co.uk>
- Date: Wed, 6 Apr 2005 18:38:50 +0100
- To: "'Mark van Assem'" <mark@cs.vu.nl>, "'Miles, AJ (Alistair)'" <A.J.Miles@rl.ac.uk>, <public-esw-thes@w3.org>
Just a quick note of support for Mark's comments. The thesauri I work
with, and those provided by my clients, are usually (perhaps always)
held in a database, but the database is not usually relational.
And I agree you have to be careful about conversion. You can't assume
editors will have used even the standard tags in exactly the way ISO
2788 envisages.
Stella
*****************************************************
Stella Dextre Clarke
Information Consultant
Luke House, West Hendred, Wantage, Oxon, OX12 8RR, UK
Tel: 01235-833-298
Fax: 01235-863-298
SDClarke@LukeHouse.demon.co.uk
*****************************************************
-----Original Message-----
From: public-esw-thes-request@w3.org
[mailto:public-esw-thes-request@w3.org] On Behalf Of Mark van Assem
Sent: 06 April 2005 10:21
To: Miles, AJ (Alistair); public-esw-thes@w3.org
Subject: Re: Quick Guide to Publishing a Thesaurus on the Semantic Web
Hi Alistair,
> How about the following after the 'Expressing a Thesaurus in RDF'
> section (feel free to hack or add suggestions):
>
> ---
> Section: Generating an RDF Representation of a Thesaurus
How do you feel about "Converting a Thesaurus to an RDF Representation"?
Personally I'm fond of the word "converting"; to me, "generating"
suggests that the conversion is straightforward. In any case we have to
choose one word and stick with it in the text. I don't know if it makes
a difference to anyone; if it doesn't, then "generating" is fine.
> Most thesauri are stored and managed via a relational database. The
> best method for generating an RDF representation of a thesaurus from the
> contents of a relational database will depend on both the technologies
> deployed and the database schema, and is beyond the scope of this
> document.
Are you sure about this (most thesauri in rel. db)? Maybe we can try to
generalize over formats to avoid any discussion, e.g.
"Most thesauri are stored in a relational database, XML file, or a text
format as described in [ISO standards]. The best method for generating
an RDF representation of a thesaurus from its original format will
depend on both the technologies deployed and the schema. A description
of recommended practices for conversion is beyond the scope of this
document."
> If an XML representation of the thesaurus is already available, then
> an RDF/XML representation using SKOS Core may be generated via an XSLT
> transformation. The design of this transformation will depend on the
> original XML format, and care must be taken to ensure sensible output.
It would be great if we could show a small example of the difficulties
that can be encountered, i.e. of what can go wrong. I can't think of
one related to UKAT (which would be best, because it relates to what the
reader has already read), but maybe this is something:
"For example, if the thesaurus contains terms in a language other than
its main language (e.g. French synonyms of English terms), the correct
language tags should be attached to these terms."
or maybe:
"For example, some thesauri include term descriptions under the heading
"scope note" which are actually definitions. These should be translated
to skos:definition instead of skos:scopeNote. Careful consideration of
both the intended usage of the thesaurus and the SKOS Core Vocabulary
is required to ensure a consistent conversion."
(actually this last one may be confusing, because I'm not sure whether
this is also the case for UKAT, which in the Guide is converted to
skos:scopeNote instead of skos:definition)?
Or maybe an example reminding people not to forget to add a
ConceptScheme definition, which is not in the original source.
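
To make the three candidate pitfalls above concrete (language tags,
scope notes that are really definitions, and the missing ConceptScheme),
here is a minimal sketch in Python. It is only an illustration under
stated assumptions: the record layout, the example.org URIs and the
`definition_ids` set (IDs an editor has reviewed and flagged as
definitions) are all hypothetical, and the output is simple
N-Triples-style lines rather than the RDF/XML used in the Guide.

```python
SKOS = "http://www.w3.org/2004/02/skos/core#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Hypothetical source record: the field names are illustrative only
# and do not correspond to any real thesaurus format.
RECORD = {
    "id": "c1",
    "term": ("Cats", "en"),
    "synonyms": [("Chats", "fr")],  # French synonym of an English term
    "scope_note": "Small domesticated carnivorous mammals.",
}


def uri(value):
    """Wrap a URI in angle brackets, N-Triples style."""
    return "<" + value + ">"


def literal(text, lang):
    """Build a language-tagged literal, escaping backslashes and quotes."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return '"' + escaped + '"@' + lang


def convert(record, base, scheme_uri, definition_ids):
    """Return (subject, predicate, object) triples for one record."""
    concept = uri(base + record["id"])
    scheme = uri(scheme_uri)
    term, term_lang = record["term"]
    triples = [
        # The scheme is not in the source, so it is added explicitly.
        # (A real conversion would emit this triple once, not per record.)
        (scheme, uri(RDF_TYPE), uri(SKOS + "ConceptScheme")),
        (concept, uri(RDF_TYPE), uri(SKOS + "Concept")),
        (concept, uri(SKOS + "inScheme"), scheme),
        (concept, uri(SKOS + "prefLabel"), literal(term, term_lang)),
    ]
    # Each synonym keeps its own language tag, not the main language.
    for text, lang in record["synonyms"]:
        triples.append((concept, uri(SKOS + "altLabel"), literal(text, lang)))
    # A "scope note" becomes skos:definition only if an editor flagged it.
    note = record.get("scope_note")
    if note:
        prop = "definition" if record["id"] in definition_ids else "scopeNote"
        triples.append((concept, uri(SKOS + prop), literal(note, term_lang)))
    return triples


if __name__ == "__main__":
    result = convert(
        RECORD, "http://example.org/thes/", "http://example.org/thes", {"c1"}
    )
    for s, p, o in result:
        print(s, p, o, ".")
```

The choice between skos:definition and skos:scopeNote is deliberately
left to a reviewed list rather than guessed from the text, since the
source tags cannot be assumed to follow the standard usage.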
> How about if I add a skos:inScheme arc to the graph instead?
This might leave the reader wondering what this inScheme is about
(without text). But I'm all for keeping graph and RDF/XML as consistent
as possible.
Hope this is useful,
Mark.
--
Mark F.J. van Assem - Vrije Universiteit Amsterdam
mark@cs.vu.nl - http://www.cs.vu.nl/~mark
Received on Wednesday, 6 April 2005 17:38:59 UTC