Re: Quick Guide to Publishing a Thesaurus on the Semantic Web

From: Mark van Assem <mark@cs.vu.nl>
Date: Wed, 06 Apr 2005 11:20:43 +0200
Message-ID: <4253A9EB.4020704@cs.vu.nl>
To: "Miles, AJ (Alistair)" <A.J.Miles@rl.ac.uk>, public-esw-thes@w3.org

Hi Alistair,

> How about the following after the 'Expressing a Thesaurus in RDF' section (feel free to hack or add suggestions): 
> ---
> Section: Generating an RDF Representation of a Thesaurus

How do you feel about "Converting a Thesaurus to an RDF Representation"? 
Personally I'm fond of the word "converting"; to me, "generating" 
suggests that the conversion is straightforward. In any case we have to 
choose one word and stick with it in the text. If it makes no difference 
to anyone else, then "generating" is fine.

> Most thesauri are stored and managed via a relational database.  The best method for generating an RDF representation of a thesaurus from the contents of a relational database will depend on both the technologies deployed and the database schema, and is beyond the scope of this document.  

Are you sure about this (that most thesauri are in a relational 
database)? Maybe we can generalize over formats to avoid any 
discussion, e.g.

"Most thesauri are stored in a relational database, XML file, or a text 
format as described in [ISO standards]. The best method for generating 
an RDF representation of a thesaurus from its original format will 
depend on both the technologies deployed and the schema. A description 
of recommended practices for conversion is beyond the scope of this 
document. "

> If an XML representation of the thesaurus is already available, then an RDF/XML representation using SKOS Core may be generated via an XSLT transformation.  The design of this transformation will depend on the original XML format, and care must be taken to ensure sensible output.  

It would be great if we could show a small example of the difficulties 
that can be encountered, i.e. what can go wrong. I can't think of 
one related to UKAT (which would be best, because it relates to what the 
reader has already read), but maybe something like this:

"For example, if the thesaurus contains terms in a language other than 
its main language (e.g. French synonyms of English terms), the correct 
language tags should be attached to these terms."

or maybe:

"For example, some thesauri include term descriptions under the heading 
"scope note" which are actually definitions. These should be converted 
to skos:definition instead of skos:scopeNote. Careful consideration of 
both the intended usage of the thesaurus and the SKOS Core Vocabulary 
is required to ensure a consistent conversion."

(actually this last one may be confusing, because I'm not sure whether 
this is also the case for UKAT, which in the Guide is converted to 
skos:scopeNote instead of skos:definition)?

Or maybe an example reminding people not to forget to add a 
ConceptScheme definition, which is not present in the original source.
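
To make these three pitfalls concrete, here is a minimal sketch in Python 
(standard library only). The source XML layout, the element and attribute 
names (thesaurus, term, label, synonym, note, kind, lang), the example.org 
URIs, and the helper names convert, literal and to_ntriples are all 
hypothetical, invented for illustration; a real thesaurus export will look 
different. It attaches language tags, maps notes marked as definitions to 
skos:definition rather than skos:scopeNote, and adds the ConceptScheme 
(plus skos:inScheme) that the source does not state explicitly.

```python
import xml.etree.ElementTree as ET

SKOS = "http://www.w3.org/2004/02/skos/core#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Hypothetical source format, assumed only for this sketch.
SOURCE = """
<thesaurus id="http://example.org/thes">
  <term id="c1">
    <label>Cats</label>
    <synonym lang="fr">Chats</synonym>
    <note kind="definition">Small domesticated felines.</note>
    <note>Use for domestic cats only.</note>
  </term>
</thesaurus>
"""

def literal(text, lang=None):
    """Format a plain N-Triples literal, with an optional language tag."""
    escaped = (text.replace("\\", "\\\\")
                   .replace('"', '\\"')
                   .replace("\n", "\\n"))
    return f'"{escaped}"@{lang}' if lang else f'"{escaped}"'

def convert(xml_text, default_lang="en"):
    """Return (subject, predicate, object) triples for the sample format."""
    root = ET.fromstring(xml_text)
    scheme = root.get("id")
    # The scheme is implicit in the source, so it is added explicitly here.
    triples = [(scheme, RDF_TYPE, f"<{SKOS}ConceptScheme>")]
    for term in root.findall("term"):
        concept = f"{scheme}/{term.get('id')}"
        triples.append((concept, RDF_TYPE, f"<{SKOS}Concept>"))
        triples.append((concept, SKOS + "inScheme", f"<{scheme}>"))
        for label in term.findall("label"):
            lang = label.get("lang", default_lang)
            triples.append((concept, SKOS + "prefLabel",
                            literal(label.text, lang)))
        for syn in term.findall("synonym"):
            # Foreign-language synonyms keep their own language tag.
            lang = syn.get("lang", default_lang)
            triples.append((concept, SKOS + "altLabel",
                            literal(syn.text, lang)))
        for note in term.findall("note"):
            # Notes flagged as definitions do not become scope notes.
            is_def = note.get("kind") == "definition"
            prop = "definition" if is_def else "scopeNote"
            triples.append((concept, SKOS + prop,
                            literal(note.text, default_lang)))
    return triples

def to_ntriples(triples):
    """Serialize the triples as N-Triples, one statement per line."""
    return "\n".join(f"<{s}> <{p}> {o} ." for s, p, o in triples)

if __name__ == "__main__":
    print(to_ntriples(convert(SOURCE)))
```

An XSLT stylesheet would need to make the same three decisions; Python is 
used here only to keep the sketch short and self-contained.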

> How about if I add a skos:inScheme arc to the graph instead?

This might leave the reader wondering what this inScheme is about 
(if there is no accompanying text). But I'm all for keeping graph and 
RDF/XML as consistent as possible.

Hope this is useful,

  Mark F.J. van Assem - Vrije Universiteit Amsterdam
        mark@cs.vu.nl - http://www.cs.vu.nl/~mark
Received on Wednesday, 6 April 2005 09:20:47 UTC
