
RE: [Ltru] RE: [Fwd: Language Ontology]

From: Debbie Garside <debbie@ictmarketing.co.uk>
Date: Wed, 25 Apr 2007 10:43:48 +0100
To: <daviddalby@linguasphere.info>, "'Elisa F. Kendall'" <ekendall@sandsoft.com>, "'Misha Wolf'" <Misha.Wolf@reuters.com>
Cc: "'WWW International'" <www-international@w3.org>, "'Semantic web list'" <semantic-web@w3.org>, <Gauri.Salokhe@FAO.ORG>, "'LTRU Working Group'" <ltru@ietf.org>
Message-ID: <E1Hge2Y-0001bI-TF@maggie.w3.org>
I will not respond to personal attacks in public or private or indeed enter
into any further dialog with you where it is obvious that your motives are
to discredit rather than to participate in this forum.  Do not misinterpret
this response as being through a lack of knowledge of the subject matter.
That is my last word. I will respond to you no further. Please also desist
from emailing me privately.


From: David Dalby [mailto:daviddalby@linguasphere.info] 
Sent: 25 April 2007 10:08
To: 'Debbie Garside'; 'Elisa F. Kendall'; 'Misha Wolf'
Cc: 'WWW International'; 'Semantic web list'; Gauri.Salokhe@FAO.ORG; 'LTRU
Working Group'
Subject: RE: [Ltru] RE: [Fwd: Language Ontology]

It is unfortunate that a business-person well-versed in ICT, in marketing
techniques and in the workings of ISO, should make such an ungracious and
ill-informed remark about the important standard ISO 3166-1.  Such a comment
is particularly unhelpful in a field requiring international co-operation
and linguistic precision, since it is made by a representative of the
British Standards Institution and of the team in charge of the related ISO
639 standard.


The argument that incomplete data are “not good data” is of course nonsense.
ISO 3166 has made an important step forward in making available for the
first time standardised data on the administrative use of specific languages
at the level of national states.  To propose the deletion of that data, on
the basis of a single (ill-chosen) example, leads one to ponder the motives
for such a proposal.  


I hope that this working group may be informed at once of all the other
reasons which are prompting the UK to make such an extraordinary request, in
the form of D.Garside’s proposed ISO NWIP (New Work Item Proposal).


At present, the only accusation (based on inadequate understanding of a
complex situation) is that ISO 3166-1 “shows only two Administrative
Languages for India where there are at least twenty-two”.  In fact, Hindi
and English are the languages used for the federal administration of India
(and are thus relevant to the listing of administrative languages in ISO
3166-1) whereas the many other official languages are used either at the
level of individual states or union territories, or in communications
between those individual states (or territories) and the central government
(and will thus be relevant to the further listing of administrative
languages in ISO 3166-2, covering sub-divisions of countries).   I hope that
any member of the working group will correct me, if my summary of the Indian
situation is itself too simplified.


David Dalby




Dr David Dalby



L’Observatoire linguistique / The Linguasphere Observatory




SA34 0XT


From: Debbie Garside [mailto:debbie@ictmarketing.co.uk] 
Sent: 23 April 2007 23:55
To: 'Elisa F. Kendall'; 'Misha Wolf'
Cc: 'WWW International'; 'Semantic web list'; Gauri.Salokhe@FAO.ORG; 'LTRU
Working Group'
Subject: [Ltru] RE: [Fwd: Language Ontology]


Please be very careful with the use of the "Administrative Language"
information from ISO 3166-1.  It is incomplete and therefore not good data.


For example, it shows only two "Administrative Languages" for India where
there are at least twenty-two.  I am hoping that this information will be
taken out of the standard in the near future.  I am currently writing an ISO
NWIP for a revision of ISO 3166-1 which will include a proposal for the
deletion of this data.


Best regards


Debbie Garside

Editor ISO DIS 639-6

www.geolang.com




From: www-international-request@w3.org
[mailto:www-international-request@w3.org] On Behalf Of Elisa F. Kendall
Sent: 23 April 2007 18:25
To: Misha Wolf
Cc: Gauri.Salokhe@FAO.ORG; WWW International; Semantic web list; LTRU
Working Group
Subject: Re: [Fwd: Language Ontology]

Hi Misha,

We are very aware of it, and have been following the work, but I failed to
mention it in the email.  I should say that our ontology was developed for
offline use in an internal system, as an initial requirement.  Having said
that, if you look at the RFCs, they only describe tags, not an RDF
vocabulary or OWL ontology.  Our approach is compatible with the RFCs but
adds capabilities that support co-reference resolution, for example, in
the target application.



Misha Wolf wrote:

This sounds very worrying as you don't seem to be aware of BCP 47.





From: www-international-request@w3.org
[mailto:www-international-request@w3.org] On Behalf Of Elisa F. Kendall
Sent: 23 April 2007 17:32
To: Gauri.Salokhe@FAO.ORG
Cc: 'WWW International'; Semantic web list
Subject: Re: [Fwd: Language Ontology]

Hi Gauri,

We've done this for some of our government customers, using essentially the
second approach you cite.  We're also in the process of relating the
ontology to another one we've built to represent ISO 3166, which includes
the administrative languages used by countries and non-sovereign territories
represented in that standard.

If you can hang out for a few days, we (Sandpiper) are just finalizing a
version that includes both ISO 639-1 and 639-2. The approach is more of a
hybrid of the two you present, based on customer needs.  It includes a
fragment of ISO 1087, and also some inverse relations since there is a
one-to-one correspondence between languages and codes.  We elected to create
a 'Language' class, rather than 'LanguageCode', which we reuse in other
applications; classes for Alpha-2Code and Alpha-3Code are subclasses of
CodeElement, from ISO 5127, with instances of these codes as first class
individuals. We use literals (via datatype properties) to represent the set
of English, French, and (in the case of 639-1) indigenous names.  We've also
created subclasses of Alpha-3Code to support distinctions between
bibliographic and terminologic, collective, and special identifiers, with
individual and macrolanguages to support 639-3.  A subsequent release will
include all of the languages described in ISO 639-3, as well as additions to
support at least some of the subtagging that Dan mentions, fyi.  Our intent
is to publish it on a new portal that will become part of a new service
offered by the Ontology PSIG in the OMG, since we've been asked to publish
several ontologies in recent RFPs.  I'll be happy to send our preliminary
version when it's "baked and tested", and follow up with an announcement of
the new portal (where a revision using OMG URIs will be posted) once that's
available.  It may be a couple of months before we're ready to make that
announcement, but we're hoping that the service will be useful to many of us
in the Semantic Web community.
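
[Editorial sketch] The structure described above (a Language class, code classes derived from CodeElement, codes as first-class individuals, and an inverse relation from code back to language) might look roughly like the following in plain Python. This is only an illustration under stated assumptions: the names Registry, add and denotes, and the field names, are hypothetical and do not come from the Sandpiper ontology, and the one-to-one correspondence between codes and languages is taken as stated in the message.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class CodeElement:
    """A code as a first-class individual (hypothetical Python stand-in)."""
    value: str


class Alpha2Code(CodeElement):
    """Two-letter code, as in ISO 639-1."""


class Alpha3Code(CodeElement):
    """Three-letter code, as in ISO 639-2."""


@dataclass
class Language:
    # Names are held as plain literals, mirroring the datatype properties.
    english_name: str
    french_name: str
    alpha2: Optional[Alpha2Code] = None
    alpha3: Optional[Alpha3Code] = None


class Registry:
    """Keeps the inverse relation (code -> language) and enforces that
    each code denotes at most one language."""

    def __init__(self) -> None:
        self._by_code: Dict[CodeElement, Language] = {}

    def add(self, language: Language) -> None:
        for code in (language.alpha2, language.alpha3):
            if code is None:
                continue
            if code in self._by_code:
                raise ValueError(f"code {code.value!r} is already assigned")
            self._by_code[code] = language

    def denotes(self, code: CodeElement) -> Language:
        # Inverse lookup: from a code individual back to its language.
        return self._by_code[code]


registry = Registry()
registry.add(Language("English", "anglais", Alpha2Code("en"), Alpha3Code("eng")))
print(registry.denotes(Alpha3Code("eng")).english_name)
```

Because the code classes are distinct types, Alpha2Code("en") and a hypothetical Alpha3Code("en") would not collide in the registry.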

Best regards,


Dan Brickley wrote: 

Forwarding from the Dublin Core list, in case folk here can advise. 

Gauri, one thing I'd suggest as useful would be to take the concepts
implicit in RFC 4646, 

see also

...and in particular the subtag mechanism, script, region, variant etc. 

It would be great to have those expressed explicitly. 
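
[Editorial sketch] To make the subtag structure concrete, the following Python fragment splits a tag into language, script, region and variant subtags. It is a deliberately simplified reading of the RFC 4646 syntax: extended language subtags, extensions, private-use and grandfathered tags are ignored, and nothing is checked against the subtag registry. The names LanguageTag and parse_tag are hypothetical.

```python
import re
from typing import NamedTuple, Optional, Tuple


class LanguageTag(NamedTuple):
    language: str
    script: Optional[str]
    region: Optional[str]
    variants: Tuple[str, ...]


# Simplified pattern: language[-script][-region](-variant)*
_TAG = re.compile(
    r"(?P<language>[A-Za-z]{2,3})"
    r"(?:-(?P<script>[A-Za-z]{4}))?"
    r"(?:-(?P<region>[A-Za-z]{2}|[0-9]{3}))?"
    r"(?P<variants>(?:-(?:[A-Za-z0-9]{5,8}|[0-9][A-Za-z0-9]{3}))*)"
)


def parse_tag(tag: str) -> LanguageTag:
    """Split a tag into its subtags, using conventional casing."""
    match = _TAG.fullmatch(tag)
    if match is None:
        raise ValueError(f"not handled by this simplified pattern: {tag!r}")
    script = match.group("script")
    region = match.group("region")
    variants = tuple(v.lower() for v in match.group("variants").split("-") if v)
    return LanguageTag(
        language=match.group("language").lower(),
        script=script.title() if script else None,
        region=region.upper() if region else None,
        variants=variants,
    )


print(parse_tag("sl-Latn-IT-nedis"))
```

Each field of the resulting tuple corresponds to one of the concepts that an explicit vocabulary would need to name.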






Language Ontology


"Salokhe, Gauri (KCEW)"  <mailto:Gauri.Salokhe@FAO.ORG>


Mon, 23 Apr 2007 17:28:39 +0200






Dear All, 
We are working on creating an ontology for languages. The need came up as we
tried to convert our XML metadata files into OWL. In our metadata (XML)
records, we have three types of occurrences of language information. 
<dc:language scheme="ags:ISO639-1">En</dc:language>
<dc:language scheme="dcterms:ISO639-2">eng</dc:language>
We have two options for modelling the language ontology:
1) Create a Language class, assign a URI to each language instance, and attach
all the lexical variations and ISO codes to it as datatype properties, as follows:
|_ Class:Language
      |_ Instance:URI1
              |_ rdfs:label xml:lang="en" English
              |_ rdfs:label xml:lang="es" Inglés
              |_ rdfs:label xml:lang="it" Inglese
              |_ rdfs:label xml:lang="fr" Anglais
              |_ etc.
              |_ property:hasISO639-1Code  en (string)
              |_ property:hasISO639-2Code  eng (string)
              |_ etc.
      |_ Instance:URI2
      |_ Instance:URI3
      |_ Instance:URI4
2) Create classes called Language and LanguageCode, and link instances of
Language to instances of LanguageCode, as follows:
|_ Class:Language
      |_ Instance:URI1
              |_ property:hasCode  en  (link to the en instance of Class ISO639-1 below)
              |_ property:hasCode  eng  (link to the eng instance of Class ISO639-2 below)
|_ Class:LanguageCode
      |_ SubClass ISO639-1
              |_ Instance:en
              |_ Instance:fr
              |_ etc.
      |_ SubClass ISO639-2
              |_ Instance:eng
              |_ Instance:fra
              |_ etc.
      |_ etc.
Does anyone have similar experience with modelling in OWL? Any suggestions
as to which model is better (and more extensible)? Does an ontology already
exist that we can reuse?
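
[Editorial sketch] The practical difference between the two options can be seen by writing each as a small set of subject-predicate-object triples and asking the same question of both: which language does the code "eng" identify? The Python below uses plain tuples rather than an RDF library. The namespace http://example.org/lang# and the helper function names are hypothetical; the class, property and instance names follow the two outlines above.

```python
from typing import List, Set, Tuple

Triple = Tuple[str, str, str]

EX = "http://example.org/lang#"  # hypothetical namespace, not from the thread


def option1_triples() -> List[Triple]:
    """Option 1: codes are string literals attached to the language."""
    english = EX + "URI1"
    return [
        (english, "rdf:type", EX + "Language"),
        (english, "rdfs:label", '"English"@en'),
        (english, "rdfs:label", '"Anglais"@fr'),
        (english, EX + "hasISO639-1Code", "en"),
        (english, EX + "hasISO639-2Code", "eng"),
    ]


def option2_triples() -> List[Triple]:
    """Option 2: codes are individuals, linked to the language via hasCode."""
    english = EX + "URI1"
    return [
        (EX + "ISO639-1", "rdfs:subClassOf", EX + "LanguageCode"),
        (EX + "ISO639-2", "rdfs:subClassOf", EX + "LanguageCode"),
        (EX + "en", "rdf:type", EX + "ISO639-1"),
        (EX + "eng", "rdf:type", EX + "ISO639-2"),
        (EX + "fra", "rdf:type", EX + "ISO639-2"),  # no language linked yet
        (english, "rdf:type", EX + "Language"),
        (english, EX + "hasCode", EX + "en"),
        (english, EX + "hasCode", EX + "eng"),
    ]


def languages_by_literal_code(triples: List[Triple], prop: str, code: str) -> Set[str]:
    # Option 1 lookup: the caller must know which property holds the code.
    return {s for (s, p, o) in triples if p == EX + prop and o == code}


def languages_by_code_instance(triples: List[Triple], code: str) -> Set[str]:
    # Option 2 lookup: a single hasCode property covers every code scheme.
    return {s for (s, p, o) in triples if p == EX + "hasCode" and o == EX + code}


print(languages_by_literal_code(option1_triples(), "hasISO639-2Code", "eng"))
print(languages_by_code_instance(option2_triples(), "eng"))
```

Under these assumptions, option 1 needs one property per code scheme, while option 2 adds a new scheme as a new subclass of LanguageCode without touching the Language instances.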
Thank you, 

Received on Wednesday, 25 April 2007 09:44:01 UTC
