- From: Felix Sasaki <fsasaki@w3.org>
- Date: Tue, 24 Apr 2007 15:51:35 +0900
- To: "Elisa F. Kendall" <ekendall@sandsoft.com>
- CC: Debbie Garside <md@ictenterprise.co.uk>, 'WWW International' <www-international@w3.org>, 'Semantic web list' <semantic-web@w3.org>, 'LTRU Working Group' <ltru@ietf.org>
Hello Elisa,
Elisa F. Kendall wrote:
> Hi Debbie,
>
> Thanks for the warning. We did know that it was incomplete, but are
> interested in representations of place names in local languages, so
> having a structure for capturing this information, even if incomplete,
> is useful.
Debbie might expect me to point you to this: CLDR [1] already has such
a structure, and it is filled with region (and other) names
in many "locales". See an excerpt of locale display names for English below:
<ldml>
  <identity>
    [...]
    <language type="en"/>
  </identity>
  <localeDisplayNames>
    <languages>
      <language type="de">German</language>
      [...]
    </languages>
    <scripts>
      <script type="Latn">Latin</script>
      [...]
    </scripts>
    <territories>
      <territory type="DE">Germany</territory>
      [...]
    </territories>
    <variants>
      <variant type="1901">Traditional German orthography</variant>
      <variant type="1996">German orthography of 1996</variant>
      [...]
    </variants>
  </localeDisplayNames>
</ldml>
You might want to see whether this is useful for your efforts.
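
As a rough illustration of how such locale display names could be read programmatically, here is a minimal Python sketch using only the standard library. The embedded document is a cut-down copy of the excerpt above with the "[...]" elisions dropped, and the function name display_names is made up for this example; real CLDR files are much larger and carry many more sections.

```python
import xml.etree.ElementTree as ET

# Cut-down copy of the English excerpt; the "[...]" elisions are dropped
# so that the text parses as XML. This is not a complete CLDR file.
LDML = """
<ldml>
  <identity><language type="en"/></identity>
  <localeDisplayNames>
    <languages><language type="de">German</language></languages>
    <scripts><script type="Latn">Latin</script></scripts>
    <territories><territory type="DE">Germany</territory></territories>
    <variants>
      <variant type="1901">Traditional German orthography</variant>
      <variant type="1996">German orthography of 1996</variant>
    </variants>
  </localeDisplayNames>
</ldml>
"""


def display_names(ldml_text):
    """Return (locale, {category: {code: display name}}) for one document.

    Hypothetical helper for this sketch, not part of any CLDR tooling.
    """
    root = ET.fromstring(ldml_text)
    locale = root.find("identity/language").get("type")
    names = {}
    # Each child of localeDisplayNames (languages, scripts, ...) holds
    # elements whose "type" attribute is the code being named.
    for group in root.find("localeDisplayNames"):
        names[group.tag] = {item.get("type"): item.text for item in group}
    return locale, names


locale, names = display_names(LDML)
print(locale, names["territories"]["DE"])  # en Germany
```

Loading one such file per locale would give the place names in each local language, keyed by the same region codes.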
Regards, Felix.
[1] http://unicode.org/cldr/index.html
> We're also looking at other government and research community
> resources to assist with both structure and content. If you have
> suggestions for references, that would be helpful.
>
> Best regards,
>
> Elisa
>
> Debbie Garside wrote:
>> Please be very careful with the use of the "Administrative Language"
>> information from ISO 3166-1. It is incomplete and therefore not good
>> data.
>>
>> For example, it shows only two "Administrative Languages" for India
>> where there are at least twenty-two. I am hoping that this
>> information will be taken out of the standard in the near future. I
>> am currently writing an ISO NWIP for a revision of ISO 3166-1 which
>> will include a proposal for the deletion of this data.
>>
>> Best regards
>>
>> Debbie Garside
>> Editor ISO DIS 639-6
>> www.geolang.com <http://www.geolang.com>
>>
>> ------------------------------------------------------------------------
>> *From:* www-international-request@w3.org
>> [mailto:www-international-request@w3.org] *On Behalf Of *Elisa F.
>> Kendall
>> *Sent:* 23 April 2007 18:25
>> *To:* Misha Wolf
>> *Cc:* Gauri.Salokhe@FAO.ORG; WWW International; Semantic web
>> list; LTRU Working Group
>> *Subject:* Re: [Fwd: Language Ontology]
>>
>> Hi Misha,
>>
>> We are very aware of it, and have been following the work, but I
>> failed to mention it in the email. I should say that our
>> ontology was developed for offline use in an internal system, as
>> an initial requirement. Having said that, if you look at the
>> RFCs, they only describe tags, not an RDF vocabulary or OWL
>> ontology. Our approach is compatible with the RFCs but adds
>> capabilities that support co-reference resolution, for example,
>> in the target application.
>>
>> Best,
>>
>> Elisa
>>
>> Misha Wolf wrote:
>>> This sounds very worrying as you don't seem to be aware of BCP 47.
>>>
>>> Misha
>>>
>>> ------------------------------------------------------------------------
>>> *From:* www-international-request@w3.org
>>> [mailto:www-international-request@w3.org] *On Behalf Of *Elisa
>>> F. Kendall
>>> *Sent:* 23 April 2007 17:32
>>> *To:* Gauri.Salokhe@FAO.ORG
>>> *Cc:* 'WWW International'; Semantic web list
>>> *Subject:* Re: [Fwd: Language Ontology]
>>>
>>> Hi Gauri,
>>>
>>> We've done this for some of our government customers, using
>>> essentially the second approach you cite. We're also in the
>>> process of relating the ontology to another one we've built to
>>> represent ISO 3166, which includes the administrative languages
>>> used by countries and non-sovereign territories represented in
>>> that standard.
>>>
>>> If you can hang out for a few days, we (Sandpiper) are just
>>> finalizing a version that includes both ISO 639-1 and 639-2. The
>>> approach is more of a hybrid of the two you present, based on
>>> customer needs. It includes a fragment of ISO 1087, and also
>>> some inverse relations since there is a one-to-one
>>> correspondence between languages and codes. We elected to
>>> create a 'Language' class, rather than 'LanguageCode', which we
>>> reuse in other applications; classes for Alpha-2Code and
>>> Alpha-3Code are subclasses of CodeElement, from ISO 5127, with
>>> instances of these codes as first class individuals. We use
>>> literals (via datatype properties) to represent the set of
>>> English, French, and, in the case of 639-1, indigenous names.
>>> We've also created subclasses of Alpha-3Code to support
>>> distinctions between bibliographic and terminologic, collective,
>>> and special identifiers, with individual and macrolanguages to
>>> support 639-3. A subsequent release will include all of the
>>> languages described in ISO 639-3, as well as additions to
>>> support at least some of the subtagging that Dan mentions, fyi.
>>> Our intent is to publish it on a new portal that will become
>>> part of a new service offered by the Ontology PSIG in the OMG,
>>> since we've been asked to publish several ontologies in recent
>>> RFPs. I'll be happy to send our preliminary version when it's
>>> "baked and tested", and follow up with an announcement of the
>>> new portal (where a revision using OMG URIs will be posted) once
>>> that's available. It may be a couple of months before we're
>>> ready to make that announcement, but we're hoping that the
>>> service will be useful to many of us in the Semantic Web community.
>>>
>>> Best regards,
>>>
>>> Elisa
>>>
>>> Dan Brickley wrote:
>>>>
>>>> Forwarding from the Dublin Core list, in case folk here can
>>>> advise.
>>>>
>>>> Gauri, one thing I'd suggest as useful would be to take the
>>>> concepts implicit in RFC 4646,
>>>>
>>>> http://www.rfc-editor.org/rfc/rfc4646.txt
>>>> see also
>>>> http://www.w3.org/International/articles/language-tags/Overview.en.php
>>>>
>>>>
>>>> ...and in particular the subtag mechanism, script, region,
>>>> variant etc.
>>>>
>>>> It would be great to have those expressed explicitly.
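
One way to make the subtag structure mentioned here explicit is to split a tag into its parts before modelling them. The Python sketch below is deliberately simplified: it assumes only language, script, region and variant subtags, and leaves out the extlang, extension, private-use and grandfathered forms that RFC 4646 also defines. The name parse_tag is invented for this example.

```python
import re

# Simplified pattern: language, optional script, optional region, variants.
# Assumption: extlang, extensions, private use and grandfathered tags
# are out of scope for this sketch.
TAG = re.compile(
    r"^(?P<language>[A-Za-z]{2,3})"
    r"(?:-(?P<script>[A-Za-z]{4}))?"
    r"(?:-(?P<region>[A-Za-z]{2}|[0-9]{3}))?"
    r"(?P<variants>(?:-(?:[A-Za-z0-9]{5,8}|[0-9][A-Za-z0-9]{3}))*)$"
)


def parse_tag(tag):
    """Split a language tag into named subtags, using conventional casing."""
    match = TAG.match(tag)
    if match is None:
        raise ValueError(f"not handled by this simplified parser: {tag!r}")
    script = match.group("script")
    region = match.group("region")
    return {
        "language": match.group("language").lower(),
        "script": script.title() if script else None,
        "region": region.upper() if region else None,
        # The variants group looks like "-1996-foo"; drop the empty head.
        "variants": match.group("variants").lower().split("-")[1:],
    }


print(parse_tag("de-Latn-DE-1996"))
```

Each key of the returned dictionary corresponds to one kind of subtag that an ontology could represent as its own class or property.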
>>>>
>>>> cheers,
>>>>
>>>> Dan
>>>>
>>>> ------------------------------------------------------------------------
>>>>
>>>> Subject:
>>>> Language Ontology
>>>> From:
>>>> "Salokhe, Gauri (KCEW)" <Gauri.Salokhe@FAO.ORG>
>>>> Date:
>>>> Mon, 23 Apr 2007 17:28:39 +0200
>>>> To:
>>>> DC-GENERAL@JISCMAIL.AC.UK
>>>>
>>>>
>>>> Dear All,
>>>>
>>>> We are working on creating an ontology for languages. The need came up as we
>>>> tried to convert our XML metadata files into OWL. In our metadata (XML)
>>>> records, we have three types of occurrences of language information.
>>>>
>>>> <dc:language scheme="ags:ISO639-1">En</dc:language>
>>>> <dc:language scheme="dcterms:ISO639-2">eng</dc:language>
>>>> <dc:language>English</dc:language>
>>>>
>>>>
>>>> We have two options for modelling the language ontology:
>>>>
>>>> 1) Create a class for each language, assign URI to it and add all the other
>>>> lexical variations, ISO codes (create datatype property) as follows:
>>>>
>>>> OWL:Thing
>>>>   |_ Class:Language
>>>>        |_ Instance:URI1
>>>>             |_ rdfs:label xml:lang="en" English
>>>>             |_ rdfs:label xml:lang="es" Inglés
>>>>             |_ rdfs:label xml:lang="it" Inglese
>>>>             |_ rdfs:label xml:lang="fr" Anglais
>>>>             |_ etc.
>>>>             |_ property:hasISO639-1Code en (string)
>>>>             |_ property:hasISO639-2Code eng (string)
>>>>             |_ etc.
>>>>        |_ Instance:URI2
>>>>        |_ Instance:URI3
>>>>        |_ Instance:URI4
>>>>
>>>> 2) Create Classes called Language and Language code and make links between
>>>> instances of Language and Language Codes as follows:
>>>>
>>>>
>>>> OWL:Thing
>>>>   |_ Class:Language
>>>>        |_ Instance:URI1
>>>>             |_ property:hasCode en (link to the en instance of Class
>>>>                ISO639-1 below)
>>>>             |_ property:hasCode eng (link to the eng instance of Class
>>>>                ISO639-2 below)
>>>>
>>>>   |_ Class:LanguageCode
>>>>        |_ SubClass ISO639-1
>>>>             |_ Instance:en
>>>>             |_ Instance:fr
>>>>             |_ etc.
>>>>        |_ SubClass ISO639-2
>>>>             |_ Instance:eng
>>>>             |_ Instance:fra
>>>>             |_ etc.
>>>>        |_ etc.
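
Option 2 can be sketched the same way, again in plain Python rather than OWL. Here the codes are objects in their own right, so a link can be followed from a code back to its language. The Registry class, its method names and the example.org URI are assumptions made for this illustration.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class LanguageCode:
    """A code as a first-class individual; frozen so it can be a dict key."""
    scheme: str  # e.g. "ISO639-1" or "ISO639-2"
    value: str


@dataclass
class Language:
    uri: str
    codes: list = field(default_factory=list)  # the hasCode links


class Registry:
    """Hypothetical helper that keeps both directions of the link."""

    def __init__(self):
        self._by_code = {}

    def link(self, language, code):
        language.codes.append(code)     # Language -> LanguageCode
        self._by_code[code] = language  # LanguageCode -> Language

    def language_for(self, scheme, value):
        return self._by_code[LanguageCode(scheme, value)]


registry = Registry()
english = Language(uri="http://example.org/lang/URI1")  # hypothetical URI
registry.link(english, LanguageCode("ISO639-1", "en"))
registry.link(english, LanguageCode("ISO639-2", "eng"))

print(registry.language_for("ISO639-2", "eng").uri)
```

Compared with option 1, this costs one extra object per code but lets other data refer to a code directly and reach the language from it.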
>>>>
>>>> Does anyone have similar experience with modelling in OWL? Any suggestions on
>>>> which model is better (and more extensible)? Does an ontology already exist that
>>>> we can reuse?
>>>>
>>>> Thank you,
>>>> Gauri
>>>>
>>>
>>>
Received on Tuesday, 24 April 2007 06:51:51 UTC