Re: Could ISO-639 languages be defined as skos concepts?

From: Bernard Vatant <bernard.vatant@mondeca.com>
Date: Fri, 22 Dec 2006 09:37:51 +0100
Message-ID: <458B995F.9060207@mondeca.com>
To: Felix Sasaki <fsasaki@w3.org>
Cc: Sue Ellen Wright <sellenwright@gmail.com>, Gerhard Budin <gerhard.budin@univie.ac.at>, Addison Phillips <addison@yahoo-inc.com>, Mark Davis <mark.davis@jtcsv.com>, Thomas Baker <baker@sub.uni-goettingen.de>, public-esw-thes@w3.org

Hi Felix

Thanks for jumping in.

> I'm trying to understand what you want to achieve: Is it URIs for
> language values, e.g. http://www.w3.org/2004/02/skos/language#en-US ?
>   
Indeed. The whole point is to identify and represent languages as 
concepts, so that we can make RDF assertions about them, beyond 
their use as "tags".

> I don't think that it is feasible to have everything after "#" as an
> URI, since RFC 4646 or its successor define a grammar for language tags.
>   
Do you mean there is a technical issue that prevents building valid URIs 
out of language tags?
Note that although a single # namespace is the first idea that comes to 
mind, it's not the only option.
It could just as well be http://www.w3.org/2004/02/skos/language/en/US or
even an opaque URI such as http://www.w3.org/2004/02/skos/language#1234
In any case, subtag elements and other properties such as the revision 
date will be explicitly attached as properties. You can't rely on the URI 
string to carry semantics. This is a "Semantic Web Axiom" :-)
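
To make the three options concrete, here is a minimal sketch in Python.
The base URI is the one floated in this thread and is only a proposal;
the property names ("tag", "languageSubtag", "regionSubtag") and the
language[-region] tag shape are assumptions made for illustration.
Whatever the URI looks like, the subtags are stated as separate
properties rather than read back from the URI string.

```python
# Base URI taken from this thread; it is a proposal, not a published namespace.
BASE = "http://www.w3.org/2004/02/skos/language"


def hash_uri(tag):
    """Single '#' namespace: the tag is the fragment identifier."""
    return f"{BASE}#{tag}"


def path_uri(tag):
    """Slash style: one path segment per subtag."""
    return BASE + "/" + "/".join(tag.split("-"))


def opaque_uri(tag, ids):
    """Opaque style: an arbitrary number, remembered in the 'ids' dict."""
    return f"{BASE}#{ids.setdefault(tag, 1234 + len(ids))}"


def describe(uri, tag):
    """State the subtags as explicit (subject, property, value) triples.

    Property names are hypothetical. Naive assumption: the tag has the
    shape language[-region]; real BCP 47 tags can carry more subtags.
    """
    parts = tag.split("-")
    triples = [(uri, "tag", tag), (uri, "languageSubtag", parts[0])]
    if len(parts) > 1:
        triples.append((uri, "regionSubtag", parts[1]))
    return triples


if __name__ == "__main__":
    ids = {}
    for uri in (hash_uri("en-US"), path_uri("en-US"), opaque_uri("en-US", ids)):
        for triple in describe(uri, "en-US"):
            print(triple)
```

All three URIs end up with the same explicit description, which is why
the choice between them is a publication question, not a semantic one.
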
> That is, you cannot have a finite set of URIs built out of that.
>   
Sorry, I don't get the point. What do you mean by a "finite set"? 
Could you expand on that?
> Have you thought of registering an XPointer scheme at W3C? E.g.
> something like "language()" which can be used e.g. in
> http://www.w3.org/2004/02/skos/language#(en-US) . You would have to
> define that the scheme data "()" contains a BCP 47 identifier.
>   
I think I see what you have in mind, but remember that RDF is not mainly 
about the structure of a published XML document, but about the semantics 
of URIs.
Besides the language values themselves, and even before them, we need a 
namespace for the ontology: the "Language" class, the different "subtag" 
properties, etc.
And defining a namespace depends more or less on how the vocabulary is 
published.
See e.g. http://www.w3.org/TR/swbp-vocab-pub/
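
As a rough illustration of what such a vocabulary would be used for,
the sketch below (Python) reads records in the "%%"-separated
"Field: value" layout of the IANA language subtag registry and emits
N-Triples. Both example.org namespaces, the "Language" class URI and
the property names derived from the field names are hypothetical, the
sample records are only illustrative, and folded continuation lines in
the real registry are not handled.

```python
# Hypothetical namespaces: nothing here is an agreed or published vocabulary.
VOCAB = "http://example.org/lang-vocab#"
LANG = "http://example.org/language/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Illustrative records in the registry's "%%"-separated layout.
SAMPLE = """\
%%
Type: language
Subtag: en
Description: English
Added: 2005-10-16
Suppress-Script: Latn
%%
Type: language
Subtag: fr
Description: French
Added: 2005-10-16
Suppress-Script: Latn
"""


def parse_records(text):
    """Split the text on '%%' lines into dicts of field name -> list of values."""
    records, current = [], {}
    for line in text.splitlines():
        if line.strip() == "%%":
            if current:
                records.append(current)
            current = {}
        elif ":" in line:
            name, _, value = line.partition(":")
            current.setdefault(name.strip(), []).append(value.strip())
    if current:
        records.append(current)
    return records


def to_ntriples(record):
    """Describe one 'language' record as N-Triples lines."""
    subject = f"<{LANG}{record['Subtag'][0]}>"
    lines = [f"{subject} <{RDF_TYPE}> <{VOCAB}Language> ."]
    for name, values in record.items():
        if name == "Type":
            continue  # already expressed by the rdf:type triple
        # "Suppress-Script" -> "suppressScript" (hypothetical property name)
        prop = name[0].lower() + name[1:].replace("-", "")
        for value in values:
            literal = value.replace("\\", "\\\\").replace('"', '\\"')
            lines.append(f'{subject} <{VOCAB}{prop}> "{literal}" .')
    return lines


if __name__ == "__main__":
    for record in parse_records(SAMPLE):
        if record.get("Type") == ["language"]:
            print("\n".join(to_ntriples(record)))
```

The conversion itself is mechanical; the open questions are the class
and property URIs, and who hosts them.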

Hope that helps, and that we are not talking past each other.

Regards

Bernard
> Felix
>
> Bernard Vatant wrote:
>   
>> Sue Ellen
>>
>> Thanks for all this. I will munch over it and try to come up with
>> something by the first week of January, when everybody is out of the
>> bubbles ... :-)
>>
>> Bernard
>>
>> Sue Ellen Wright a écrit :
>>     
>>> Hi, All,
>>> Indeed, I suspect that lots of people would be delighted if someone
>>> wants to go ahead with this for SKOS, provided that no one has already
>>> started such a project. Rather than searching for IANA, you want to
>>> reference IETF BCP 47, which will be your permanent ID reference for
>>> the Language Tags. My contacts on BCP 47 are Felix Sasaki, Addison
>>> Phillips, and Mark Davis, but as noted, they may possibly be off line
>>> right now, as many people are. On the ISO side, Gerhard Budin is the
>>> Chair of ISO TC 37/SC 2, whose WG 2 is responsible for the 639 family
>>> of standards. I know that he shares my view that any new initiatives
>>> in this area should be oriented toward the set of codes and the syntax
>>> rules contained in the current IETF RFC 4645, 4646 and 4647, taking
>>> into consideration any successor recommendations of the IETF. (There
>>> is, for instance, a current effort to update the recently approved
>>> RFCs to bring documents into compliance with the new ISO 639-3, which
>>> essentially identifies the SIL Ethnologue codes as the extended codes
>>> for comprehensive identification of languages.) Also bear in mind (I
>>> probably said this in another email) that when it comes to xml:lang,
>>> we need to concern ourselves with language tags per IETF, not just
>>> language codes alone.
>>>  
>>> Sorry I'm not coming up with the absolute final answer here, but
>>> sooner or later, one of the IETF guys will check his mail!
>>> Best regards
>>> Sue Ellen
>>>
>>>  
>>> On 12/21/06, Bernard Vatant <bernard.vatant@mondeca.com> wrote:
>>>
>>>     Sue Ellen
>>>     > I think you are absolutely right about this not being a significant
>>>     > task: the main issue is to get a variety of people from a number of
>>>     > communities of practice to agree on a single approach.
>>>     Sure enough. But we could at least help by proposing one. :-)
>>>     > SKOS would certainly be one avenue. There may be others, and in the
>>>     > end, we may need more than one flavor in order to conform to
>>>     > requirements in a given environment, which is OK as long as we
>>>     > can map successfully back and forth.
>>>     Yes, this is a good use case for mapping, either SKOS-to-SKOS
>>>     mapping, or mapping from some RDF dialect to another. You know it's
>>>     one of my favourite topics.
>>>     > I'm hoping that sooner or later one of the guys for W3C will weigh
>>>     > into this discussion and let us know whether they are already
>>>     > addressing this issue.
>>>     I've been searching the W3C I18n Activity
>>>     http://www.w3.org/International/ which looks to me like the place
>>>     where such things should happen, but at first sight there seems to
>>>     be no connection between this activity and the SW activity. I will
>>>     investigate further.
>>>     > It's a bad time of year to hope to catch everybody monitoring their
>>>     > email!
>>>     Indeed. By the way, Happy Xmas to all :-)
>>>
>>>     Bernard
>>>     > There will be an ISO TC 37 meeting in January where we'll be
>>>     > addressing issues regarding our own metadata registry, and this
>>>     > will surely come up.
>>>     > Best regards
>>>     > Sue Ellen
>>>     >
>>>     > On 12/21/06, Bernard Vatant <bernard.vatant@mondeca.com> wrote:
>>>     >
>>>     >     Hi Sue Ellen
>>>     >
>>>     >     Thanks for your insights. Do you have pointers to the
>>>     >     discussions you mention, and/or any contact with people
>>>     >     taking part in them who would see some interest in
>>>     >     RDF-ization of those resources? (assuming such a class
>>>     >     definition is satisfiable).
>>>     >     Actually when one looks at
>>>     >     http://www.iana.org/assignments/language-subtag-registry
>>>     >     the technical task of migrating its content into RDF, as
>>>     >     long as a relevant vocabulary is defined, is quite trivial.
>>>     >     After that it's mainly a political issue. :-)
>>>     >     But there is a point that has not been answered so far in my
>>>     >     original question. Would SKOS be a relevant format for such a
>>>     >     representation?
>>>     >
>>>     >     Bernard
>>>     >
>>>     >
>>>     >     Sue Ellen Wright a écrit :
>>>     >     > Hi, All,
>>>     >     > There are serious discussions going on concerning the IETF
>>>     >     > language tag subtag registry and the ISO implementations of
>>>     >     > the 639 family of codes, so I think it makes sense to
>>>     >     > coordinate any efforts in this direction with the folks
>>>     >     > working on those two sets of standards. IETF RFC 4647 spells
>>>     >     > out means for matching codes, but it would make things a lot
>>>     >     > simpler if we had a more or less standard format for
>>>     >     > representing them in RDF.
>>>     >     > Bye for now
>>>     >     > Sue Ellen
>>>     >     >
>>>     >     >
>>>     >     > On 12/20/06, Thomas Baker <baker@sub.uni-goettingen.de> wrote:
>>>     >     >
>>>     >     >
>>>     >     >     On Mon, Dec 18, 2006 at 06:54:18PM +0100, Bernard Vatant wrote:
>>>     >     >     > ISO-639 languages are used in XML and in RDF, and in
>>>     >     >     > SKOS, via their codes, used as values of the xml:lang
>>>     >     >     > attribute.
>>>     >     >     > But for various applications, it would be interesting
>>>     >     >     > to define those languages as proper RDF resources.
>>>     >     >     >
>>>     >     >     > So far, the only attempt to do so I've found in RDF is
>>>     >     >     > http://downlode.org/rdf/iso-639/ and the description it
>>>     >     >     > provides is quite basic.
>>>     >     >     ...
>>>     >     >
>>>     >     >     > So, we have public concepts, a lot of data to mine, we
>>>     >     >     > have use cases; all we need is a namespace to which we
>>>     >     >     > can append ISO 639 codes to forge URIs.
>>>     >     >     > Who is likely to host and maintain that namespace?
>>>     >     >     > http://www.w3.org/2004/02/skos/language# ?
>>>     >     >     > http://purl.org/dc/language/ ?
>>>     >     >     ...
>>>     >     >     > Since I think we can wait for quite a while before ISO
>>>     >     >     > delivers such a thing in its own namespace - and I would
>>>     >     >     > be happy to be proven wrong here - I wonder what kind of
>>>     >     >     > initiative could move this thing forward.
>>>     >     >     > Is it DCMI's intention to define those instances in its
>>>     >     >     > own namespace (Tom, any clues on that?).
>>>     >     >
>>>     >     >     Well, I agree with the need :-)
>>>     >     >
>>>     >     >     Several years ago, we considered opening a DCMI service
>>>     >     >     for the "registration" of URIs identifying controlled
>>>     >     >     vocabularies for use as encoding schemes in metadata.
>>>     >     >     While the demand for such a service was clear, the
>>>     >     >     project did not look maintainable, sustainable, and
>>>     >     >     scalable.
>>>     >     >
>>>     >     >     Unless URIs are coined "once and for all" and "with no
>>>     >     >     guarantees" (and how useful is that?), it is not clear
>>>     >     >     how such a namespace host should operate over time.  The
>>>     >     >     impulse to "just do it" comes up against hard questions.
>>>     >     >     Even just maintaining URIs for entities in a separately
>>>     >     >     maintained ISO standard would involve a significant
>>>     >     >     commitment.
>>>     >     >
>>>     >     >     Tom
>>>     >     >
>>>     >     >     --
>>>     >     >     Tom Baker - tbaker@tbaker.de - baker@sub.uni-goettingen.de
>>>     >
>>>     >
>>>
>>> -- 
>>> Sue Ellen Wright
>>> Institute for Applied Linguistics
>>> Kent State University
>>> Kent OH 44242 USA
>>> sellenwright@gmail.com <mailto:sellenwright@gmail.com>
>>> swright@kent.edu <mailto:swright@kent.edu>
>>> sewright@neo.rr.com <mailto:sewright@neo.rr.com>
>>>   
>>>       
>
>
>
>   

-- 

Bernard Vatant
Knowledge Engineering
----------------------------------------------------
Mondeca
3, cité Nollez 75018 Paris France
Web:    www.mondeca.com <http://www.mondeca.com>
----------------------------------------------------
Tel:       +33 (0) 871 488 459
Mail:     bernard.vatant@mondeca.com <mailto:bernard.vatant@mondeca.com>
Blog:    Leçons de Choses <http://mondeca.wordpress.com/>
Received on Friday, 22 December 2006 08:38:08 GMT
