- From: Felix Sasaki <fsasaki@w3.org>
- Date: Mon, 14 Jul 2008 22:07:26 +0900
- To: Sandro Hawke <sandro@w3.org>
- CC: Axel Polleres <axel.polleres@deri.org>, Ivan Herman <ivan@w3.org>, "Phillips, Addison" <addison@amazon.com>, Jie Bao <baojie@cs.rpi.edu>, "public-owl-wg@w3.org" <public-owl-wg@w3.org>, "public-i18n-core-comments@w3.org" <public-i18n-core@w3.org>, "public-rif-comments@w3.org" <public-rif-comments@w3.org>, Boris Motik <boris.motik@comlab.ox.ac.uk>
Sandro Hawke wrote:
> Axel:
>
>>> So, why could a lang: datatype hierarchy not simply state that the
>>> hierarchy is defined *implicitly*. We don't need to list this
>>> hierarchy explicitly, but could just define:
>>>
>>> <i>lang:tag1</i> is a supertype of <i>lang:tag2</i> if and only if
>>> <i>tag1</i> is a prefix of <i>tag2</i>, where both <i>tag1</i> and
>>> <i>tag2</i> are both valid language tags, following [BCP 47].
>>>
>>> Maybe, I am oversimplifying things here, but I really don't understand
>>> the deep problem with this approach - which probably there is, but I'd
>>> appreciate if someone could point me explicitly.
>>>
>
> Felix:
>
>> I'm looking at http://tools.ietf.org/html/draft-ietf-ltru-4646bis-16 ,
>> the currently planned revision of BCP 47. See esp.
>> http://tools.ietf.org/html/draft-ietf-ltru-4646bis-16#section-2.2.8
>>
>> Many of the grandfathered tags have been superseded by the subsequent
>> addition of new subtags: each superseded record contains a
>> Preferred-Value field that ought to be used to form language tags
>> representing that value. For example, the tag "art-lojban" is
>> superseded by the primary language subtag 'jbo'.
>>
>> That is, for the language tags "art-lojban" and "jbo" there is no
>> hierarchy. The language tags express the same language.
>>
>> Another issue is with so-called Macro languages and extended language
>> subtags, see
>> http://tools.ietf.org/html/draft-ietf-ltru-4646bis-16#section-4.1.2
>> I can't explain these concepts in detail here, but the problem with the
>> notion of "a longer sub tag = deeper hierarchy" arises here:
>>
>> [
>> Each encompassed language's subtag SHOULD be used as the primary
>> language subtag. For example, a document in Mandarin Chinese
>> would be tagged "cmn" (the subtag for Mandarin Chinese) in
>> preference to "zh" (Chinese).
>> o If compatibility is desired or needed, the encompassed subtag MAY
>> be used as an extended language subtag. For example, a document
>> in Mandarin Chinese could be tagged "zh-cmn" instead of either
>> "cmn" or "zh".
>> ]
>>
>> That is, Mandarin Chinese could be tagged as "zh-cmn" or "cmn" or "zh".
>> Again you have no clear "length to hierarchy" relation.
>>
>> Addison can provide more examples and can judge if my concerns here are
>> valid.
>>
>
> It seems to me that we can use datatypes like this and simply refer to
> other specs for what the sub-type and equivalent-type relations are.
>
> But, imagining a better future, ...
>
> It would be nice (but doesn't seem necessary) for W3C to publish these
> relations in machine-usable form.

The subtag registry has a lot of relations available. E.g. the relation
between "art-lojban" and "jbo" is described in the entry for
"art-lojban":

%%
Type: grandfathered
Tag: art-lojban
Description: Lojban
Added: 2001-11-11
*Preferred-Value: jbo
Deprecated: 2003-09-02
Comments: replaced by ISO code jbo*
%%

To provide access for XML processing, various formats of the registry
have been created in XML, see
http://www.langtag.net/registries.html
I think it would be useful to have it also available in RDF, for
Semantic Web processing.

> Since I'm on vacation, I'm just
> going to wonder about two things rather than look them up like I
> should. :-)
> - Does XSD give us a way to do that for data types?
>

You mean "publishing the relations"? You could define a hierarchy of
simple types, e.g. with "en-US" being subordinate to "en". Though you
would run again into the "language tags are generative and hard to
enumerate as types" issue.

> - Can we do it with OWL by treating datatypes as
> properties?

I think you can, though the enumeration issue would probably be the
same.

> It seems clear to me that
> if bar is a datatype:
> "foo"^^bar == [ bar foo ]
> ie bar is a property where the domain is the lexical space and
> the range is the value space. Read "xs:int" as "the integer
> value serialized in this string". If the RDF or OWL semantics
> don't allow it, then we'd have to back off to
> "foo"^^bar == [ bar2 foo ]
> where there's a one-to-one correspondence between bar and bar2.
>
> That would allow people with a decent semantic web engine (which doesn't
> know anything about BCP 47)
> to query for lang=en and get results which
> were lang=en-US.
>

Sounds very reasonable to me.

Felix
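
The concern with the pure prefix rule discussed in this message can be made concrete with a small Python sketch. It is only an illustration under stated assumptions: the function names (`is_prefix`, `canonical`, `matches`) and both tables are hypothetical, "prefix" is read as subtag-wise rather than character-wise, and the tables contain nothing beyond the two relations mentioned in the thread ("art-lojban" is superseded by "jbo"; "cmn" is encompassed by "zh"). A real implementation would load these relations from the subtag registry.

```python
# Hypothetical tables holding only the relations quoted in this thread.
PREFERRED_VALUE = {"art-lojban": "jbo"}   # grandfathered tag -> Preferred-Value
ENCOMPASSED_BY = {"cmn": "zh"}            # encompassed language -> macrolanguage


def subtags(tag):
    """Split a language tag into lower-cased subtags."""
    return tag.lower().split("-")


def is_prefix(tag1, tag2):
    """The plain prefix rule: tag1's subtags are a leading run of tag2's."""
    a, b = subtags(tag1), subtags(tag2)
    return a == b[:len(a)]


def canonical(tag):
    """Apply Preferred-Value, then drop a macrolanguage prefix ("zh-cmn" -> "cmn")."""
    tag = PREFERRED_VALUE.get(tag.lower(), tag.lower())
    parts = tag.split("-")
    if len(parts) > 1 and ENCOMPASSED_BY.get(parts[1]) == parts[0]:
        parts = parts[1:]
    return "-".join(parts)


def matches(query, tag):
    """Prefix rule on canonical forms, extended by the macrolanguage table.

    Treating a macrolanguage query as covering its encompassed languages
    is an assumption of this sketch, not something the thread settles.
    """
    q, t = canonical(query), canonical(tag)
    if is_prefix(q, t):
        return True
    parts = subtags(t)
    macro = ENCOMPASSED_BY.get(parts[0])
    return macro is not None and is_prefix(q, "-".join([macro] + parts[1:]))


if __name__ == "__main__":
    for query, tag in [("en", "en-US"), ("jbo", "art-lojban"),
                       ("cmn", "zh-cmn"), ("zh", "cmn")]:
        print(query, tag, "prefix:", is_prefix(query, tag),
              "with registry relations:", matches(query, tag))
```

In this sketch the plain prefix rule handles "en" versus "en-US" but misses the other three pairs, which only match once the registry relations are consulted.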
Received on Monday, 14 July 2008 13:08:31 UTC