Re: Datatypes and xml:lang from Patrick Stickler on 2002-02-06 (w3c-rdfcore-wg@w3.org from February 2002)

From: Patrick Stickler <patrick.stickler@nokia.com>
Date: Wed, 06 Feb 2002 10:39:34 +0200
To: Jeremy Carroll <jjc@hplb.hpl.hp.com>, Dave Beckett <dave.beckett@bristol.ac.uk>, Sergey Melnik <melnik@db.stanford.edu>
CC: RDF Core <w3c-rdfcore-wg@w3.org>
Message-ID: <B886B866.D570%patrick.stickler@nokia.com>

On 2002-02-05 15:23, "ext Jeremy Carroll" <jjc@hplb.hpl.hp.com> wrote:

> 
>>> Dave Beckett wrote:
>>>> 
>>>> Can the TDL / S authors say something about where they see how the
>>>> xml:lang attribute will appear in the data type models.
>> 
> 
> I think there is a choice.
> 
> 
> Version 1 (lang-string)
>  works for both S and TDL
> 
>  wherever we have been talking about a string, we now talk about a
> lang-string.
>  This is a pair of an optional language tag and a string and has been
> explored in depth at:
> http://lists.w3.org/Archives/Public/w3c-rdfcore-wg/2001Sep/0341.html

Hmmm... Does this approach allow the literal to be
"paired" with more than one language tag?

If not, then it can't work.

A literal does not belong to only one language. It's
just a string, and that string may be a member of
the set of lexical tokens of more than one language.

Is "pan" English? Spanish? Klingon?

In reality, it's just the literal "pan". It is context
(just as with typed data literals) that provides its
interpretation.

We need a solution that allows a given literal to be
qualified by language separately in each context of use.
There are no absolute, global associations between
literals and languages.
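
To illustrate the point, here is a minimal sketch in Python. The
resource names (ex:doc1, ex:doc2), the choice of dc:subject, and the
tuple layout are all hypothetical and only for illustration. It
contrasts a global literal-to-language table, which can hold only one
language per string, with a per-occurrence association that keeps both.

```python
# Hypothetical data: two separate occurrences (contexts) of the same
# literal string, each qualified with its own language tag.
occurrences = [
    ("ex:doc1", "dc:subject", "pan", "en"),
    ("ex:doc2", "dc:subject", "pan", "es"),
]

# A global association keyed on the literal alone: the second
# assignment silently overwrites the first.
global_lang = {}
for _subject, _prop, literal, lang in occurrences:
    global_lang[literal] = lang

# A per-context association keyed on the occurrence: both survive.
context_lang = {
    (subject, prop, literal): lang
    for subject, prop, literal, lang in occurrences
}

# Collect every language that some context attaches to "pan".
langs_for_pan = sorted(
    lang for (_s, _p, literal), lang in context_lang.items()
    if literal == "pan"
)

print(global_lang["pan"])
print(langs_for_pan)
```

The global table ends up with a single tag for "pan", while the
per-context table retains one tag per occurrence.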

> 
> Version 2 (lang-triple)
>  IMO works better with TDL, because of tidiness.
> 
>  xml:lang goes into the triple structure
>  (i.e. we decide M&S got it wrong)

This, I think, works better.

> 
> I have a preference for 2 if we can live with the charter issues.
> My understanding of Sergey's message:
> http://lists.w3.org/Archives/Public/w3c-rdfcore-wg/2002Feb/0008.html
> is that in S, the lang-triple needs to modify a bNode, which at least in
> S-B, needs to be autogenerated.

I think this would be the same representation in the
converged TDL/S-P model.

I.e.

   <rdf:Description rdf:ID="MyBook">
      <dc:title xml:lang="en" rdf:value="The Tao of Poo"/>
   </rdf:Description>

gives us

  MyBook dc:title _:1 .
  _:1 rdf:value "The Tao of Poo" .
  _:1 xml:lang _:2 .
  _:2 rdf:value "en" .

which, combined with the presumed range constraint

  xml:lang rdfs:range xsd:lang .

entails

  MyBook dc:title _:1 .
  _:1 rdf:value "The Tao of Poo" .
  _:1 xml:lang _:2 .
  _:2 rdf:value "en" .
  _:2 rdf:dtype xsd:lang .

Note that the bNode preserves the context of
the language attribution. Other contexts may
associate the same literal with different languages.

Eh?

[Note that the literal contraction transformation
means that, in the graph/triples, the only property
that will actually have a literal as its value is
rdf:value.]
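
The expansion and the range entailment in the example above can be
sketched in a few lines of Python. This is only an illustration under
stated assumptions: the function names expand and apply_range are
hypothetical, triples are plain tuples of strings, and the range
constraint on xml:lang is the presumed one, not something any spec
defines.

```python
from itertools import count


def expand(subject, prop, value, lang, fresh):
    """Expand one property carrying rdf:value and xml:lang into
    bNode-based triples (hypothetical helper, not a real parser)."""
    value_node = f"_:{next(fresh)}"  # bNode holding the literal in context
    lang_node = f"_:{next(fresh)}"   # bNode holding the language tag
    return [
        (subject, prop, value_node),
        (value_node, "rdf:value", f'"{value}"'),
        (value_node, "xml:lang", lang_node),
        (lang_node, "rdf:value", f'"{lang}"'),
    ]


def apply_range(triples, ranges):
    """For each triple whose property has a (presumed) range, add an
    rdf:dtype triple on the object node."""
    result = list(triples)
    for _s, p, o in triples:
        if p in ranges:
            inferred = (o, "rdf:dtype", ranges[p])
            if inferred not in result:
                result.append(inferred)
    return result


# bNode labels are numbered from 1, as in the listing above.
triples = expand("MyBook", "dc:title", "The Tao of Poo", "en", count(1))

# Presumed constraint: xml:lang rdfs:range xsd:lang .
entailed = apply_range(triples, {"xml:lang": "xsd:lang"})

for s, p, o in entailed:
    print(f"{s} {p} {o} .")
```

Each call to expand draws fresh bNode labels, so a second occurrence
of the same string gets its own node and can carry a different
language.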

--

Additional issue:

I think it is also necessary to clarify the relationship
between xml:lang and xsd:lang, as it seems that the latter
is the range of the former, but this has never been
explicitly defined. The only documented relationship is
that the value/lexical space of both is based on the
ISO language codes standard.

Perhaps there is no official relation, and no range
constraint such as above should be presumed?

Patrick

--

Patrick Stickler              Phone: +358 50 483 9453
Senior Research Scientist     Fax:   +358 7180 35409
Nokia Research Center         Email: patrick.stickler@nokia.com

Received on Wednesday, 6 February 2002 03:38:51 UTC