Re: I18N issues an OWL2

> > So, why could a lang: datatype hierarchy not simply state that the
> > hierarchy is defined *implicitly*. We don't need to list this
> > hierarchy explicitly, but could just define:
> >
> > <i>lang:tag1</i> is a supertype of <i>lang:tag2</i> if and only if
> > <i>tag1</i> is a prefix of <i>tag2</i>, where <i>tag1</i> and
> > <i>tag2</i> are both valid language tags, following [BCP 47].
> >
> > Maybe, I am oversimplifying things here, but I really don't understand
> > the deep problem with this approach - which probably there is, but I'd
> > appreciate if someone could point me explicitly.

> I'm looking at ,
> the currently planned revision of BCP 47. See esp.
>  Many of the grandfathered tags have been superseded by the subsequent
>    addition of new subtags: each superseded record contains a Preferred-
>    Value field that ought to be used to form language tags representing
>    that value.  For example, the tag "art-lojban" is superseded by the
>    primary language subtag 'jbo'.
> That is, for the language tags "art-lojban" and "jbo" there is no
> hierarchy. The language tags express the same language.
> Another issue is with so-called Macro languages and extended language
> subtags, see
> I can't explain these concepts in detail here, but the problem with the
> notion of "a longer sub tag = deeper hierarchy" arises here:
> [
> Each encompassed language's subtag SHOULD be used as the primary
> language subtag. For example, a document in Mandarin Chinese
> would be tagged "cmn" (the subtag for Mandarin Chinese) in
> preference to "zh" (Chinese).
> o If compatibility is desired or needed, the encompassed subtag MAY
> be used as an extended language subtag. For example, a document
> in Mandarin Chinese could be tagged "zh-cmn" instead of either
> "cmn" or "zh".
> ]
> That is, Mandarin Chinese could be tagged as "zh-cmn" or "cmn" or "zh".
> Again you have no clear "length to hierarchy" relation.
> Addison can provide more examples and can judge if my concerns here are
> valid.

It seems to me that we can use datatypes like this and simply refer to
other specs for what the sub-type and equivalent-type relations are.
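
To make the two relations concrete, here is a minimal Python sketch. The
function names and the small PREFERRED table are hypothetical, made up for
this illustration: the table holds only the two cases quoted above, not the
real registry, and no BCP 47 validity checking is attempted.

```python
# Hypothetical sketch: names and data are illustrative, not from any spec.

# Equivalences taken only from the examples quoted above.
PREFERRED = {
    "art-lojban": "jbo",  # grandfathered tag superseded by 'jbo'
    "zh-cmn": "cmn",      # extended-language form of Mandarin
}


def subtags(tag):
    """Split a tag into lowercase subtags (tags are case-insensitive)."""
    return tag.lower().split("-")


def is_prefix_supertype(tag1, tag2):
    """The proposed implicit rule: tag1's subtags are a prefix of tag2's."""
    a, b = subtags(tag1), subtags(tag2)
    return a == b[:len(a)]


def canonical(tag):
    """Map a tag to its preferred form using the small table above."""
    return PREFERRED.get(tag.lower(), tag.lower())


def same_language(tag1, tag2):
    """Equivalent-type relation: both tags have the same preferred form."""
    return canonical(tag1) == canonical(tag2)
```

With this, is_prefix_supertype("en", "en-US") holds, but the prefix rule
alone says nothing about "art-lojban" and "jbo"; only the separate
equivalence lookup relates them, which is the point raised in the quoted
message.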

But, imagining a better future, ...

It would be nice (but doesn't seem necessary) for W3C to publish these
relations in machine-usable form.   Since I'm on vacation, I'm just
going to wonder about two things rather than look them up like I
should.  :-)   
    -  Does XSD give us a way to do that for data types? 
    -  Can we do it with OWL by treating datatypes as
       properties?   It seems clear to me that 
            if bar is a datatype:
               "foo"^^bar == [ bar foo ] 
       ie bar is a property where the domain is the lexical space and
       the range is the value space.  Read "xs:int" as "the integer
       value serialized in this string".  If the RDF or OWL semantics
       don't allow it, then we'd have to back off to
               "foo"^^bar == [ bar2 foo ]
       where there's a one-to-one correspondence between bar and bar2.

That would allow people with a decent semantic web engine (which doesn't
know anything about BCP 47) to query for lang=en and get results which
were lang=en-US.
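
A rough Python sketch of that query, assuming the datatypes-as-properties
reading works out. Everything here is hypothetical: the sample literals, the
"lang:" names, and the SUB_PROPERTY_OF table stand in for relations that
would be published somewhere in machine-usable form. The code itself only
follows those published links and has no knowledge of BCP 47.

```python
# Hypothetical data: (literal text, datatype-treated-as-property) pairs.
LITERALS = [
    ("colour", "lang:en-GB"),
    ("color", "lang:en-US"),
    ("Farbe", "lang:de"),
]

# Assumed published sub-property statements, as a generic engine might
# load them; the engine does not derive these from the tag syntax.
SUB_PROPERTY_OF = {
    "lang:en-GB": "lang:en",
    "lang:en-US": "lang:en",
}


def super_properties(prop):
    """Yield prop and every property reachable via SUB_PROPERTY_OF."""
    seen = set()
    while prop is not None and prop not in seen:
        seen.add(prop)
        yield prop
        prop = SUB_PROPERTY_OF.get(prop)


def query(prop):
    """Return literal texts whose property is prop or a sub-property of it."""
    return [text for text, p in LITERALS if prop in super_properties(p)]
```

Here query("lang:en") picks up both the en-GB and en-US literals, while
query("lang:en-US") returns only the en-US one.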

    -- Sandro

Received on Monday, 14 July 2008 11:50:23 UTC