
Re: [SKOS] languages and scripts

From: Jakob Voss <jakob.voss@gbv.de>
Date: Fri, 09 Feb 2007 12:12:45 +0100
Message-ID: <45CC572D.6020602@gbv.de>
To: public-esw-thes@w3.org

Hi Bernard,

> I think your approach of making a combination of tags a narrower concept
> of each component is better than mine, and I would be indeed happy to
> get rid of owl classes altogether, but nevertheless capture in some way
> that #zh is a primary language, #Hant a script (right?) and #HK a region.
> Soo ... maybe we could simply define a "tagType" attribute to flag the
> simple tags (lang is the default namespace, whatever that will be)
> 
> <skos:Concept rdf:about='#zh'>
>  <skos:prefLabel>zh</skos:prefLabel>
>  <skos:altLabel>Chinese</skos:altLabel>
>  <lang:tagType rdf:resource="#PrimaryLanguage"/>
>  <dc:date>2005-10-16</dc:date>
> </skos:Concept>
> 
> Typing without classes is certainly better in this case than subclassing
> skos:Concept, because otherwise we will have a quite weird conceptScheme
> with concepts in different classes, with common narrower concepts with
> none of those. Very bizarre ...

The "tagType" also looks weird to me. Alternative solutions:

1. Subclassing of skos:Concept (you can still use simple skos:Concept)
2. No differences in classes (this is implied by 1. if you do inferencing)
3. Put regions and scripts in ConceptSchemes of their own

I prefer option 3. BCP 47 is based on ISO 15924 (scripts) and
ISO 3166 + UN M.49 (regions), so both should be defined in Schemes of
their own and linked to by BCP 47 anyway. By the way, I'm working on a
detailed paper with a proposal for how to encode countries and regions
in SKOS, based on ISO 3166. This is less easy than it looks because
countries regularly change (separate, join, rename... ;-)
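
A minimal sketch of what option 3 could look like in practice: a tag is
split into its components, and each component is looked up in a scheme of
its own. The scheme names ("lang-scheme", "iso15924-scheme",
"iso3166-scheme") and the function split_tag are assumptions made up for
this illustration, and the pattern only covers the simple
language[-Script][-REGION] shape, not the full BCP 47 grammar.

```python
import re

# Hypothetical scheme identifiers (assumption): real scheme URIs are not
# decided anywhere in this thread.
SCHEMES = {
    "language": "lang-scheme",
    "script": "iso15924-scheme",
    "region": "iso3166-scheme",
}

# Simplified pattern: language[-Script][-REGION].
# Variants, extensions and private-use subtags are deliberately ignored.
TAG_RE = re.compile(
    r"^(?P<language>[A-Za-z]{2,3})"
    r"(?:-(?P<script>[A-Za-z]{4}))?"
    r"(?:-(?P<region>[A-Za-z]{2}|[0-9]{3}))?$"
)


def split_tag(tag):
    """Return (scheme, subtag) pairs for each component present in the tag."""
    match = TAG_RE.match(tag)
    if match is None:
        raise ValueError("unsupported tag shape: %r" % tag)
    pairs = []
    for kind in ("language", "script", "region"):
        value = match.group(kind)
        if value is not None:
            pairs.append((SCHEMES[kind], value))
    return pairs


if __name__ == "__main__":
    for tag in ("zh-Hant-HK", "en-US", "fr-CA"):
        print(tag, split_tag(tag))
```

With separate schemes, a region concept such as HK could then be maintained
(and versioned when countries change) independently of the language tags
that point to it.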

> I now look at my example document
> http://en.wikipedia.org/wiki/Quebec_French, written in en-US about fr-CA
> to see how it flies:
> 
>  <foaf:Document rdf:about="http://en.wikipedia.org/wiki/Quebec_French">
>    <dc:language rdf:resource="&lang;en-US"/>
>    <dc:subject rdf:resource="&lang;fr-CA"/>
>  </foaf:Document>
> 
> Not bad. We have in your model
> lang:en-US   skos:broader   lang:en
> lang:en-US   skos:broader   lang:US
> 
> So I will find my document by looking under either "lang:en" or
> "lang:US" indexes. Seems to fly well indeed. :-)

You could also index a document with
"some-unknown-language-in-Latin-Script" or "some-canadian-language" :-)
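
The lookup discussed above can be sketched as a small index that follows
skos:broader links. This is only an illustration under assumptions: the
BROADER table is hand-written from the examples in this thread, the
document identifiers are made up, and ancestors and build_index are
hypothetical helper names, not part of any SKOS tool.

```python
from collections import defaultdict

# skos:broader links taken from the examples in this thread (assumption:
# a combined tag is narrower than each of its components).
BROADER = {
    "en-US": ["en", "US"],
    "fr-CA": ["fr", "CA"],
    "zh-Hant-HK": ["zh", "Hant", "HK"],
}


def ancestors(concept, broader=BROADER):
    """Collect every concept reachable from `concept` via broader links."""
    seen = set()
    stack = [concept]
    while stack:
        current = stack.pop()
        for parent in broader.get(current, []):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def build_index(doc_tags):
    """Map each concept to the documents tagged with it or a narrower one."""
    index = defaultdict(set)
    for doc, tags in doc_tags.items():
        for tag in tags:
            index[tag].add(doc)
            for parent in ancestors(tag):
                index[parent].add(doc)
    return index


if __name__ == "__main__":
    # Hypothetical documents with their dc:language values.
    docs = {
        "Quebec_French": ["en-US"],
        "latin-script-doc": ["Latn"],  # language unknown, script known
        "canadian-doc": ["CA"],        # language unknown, region known
    }
    index = build_index(docs)
    for concept in sorted(index):
        print(concept, sorted(index[concept]))
```

The document tagged en-US shows up under "en" and under "US", while a
document tagged only with a script or a region is still indexed under that
single component.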

Greetings,
Jakob

Received on Friday, 9 February 2007 11:12:53 GMT