
possible way out of maze? [was: Re: xml:lang [was Re: Outstanding Issues ]]

From: Pat Hayes <phayes@ai.uwf.edu>
Date: Wed, 27 Feb 2002 11:22:06 -0600
Message-Id: <p0510140ab8a2bd7e909d@[]>
To: Frank Manola <fmanola@mitre.org>
Cc: w3c-rdfcore-wg@w3.org
>At the risk of further complicating this discussion, let me give my
>interpretation, for what it's worth, of the M&S material in question,
>[[ (P221) The xml:lang attribute may be used as defined by [XML] to
>associate a language with the property value. There is no specific data
>model representation for xml:lang (i.e., it adds no triples to the data
>model); the language of a literal is considered by RDF to be a part of
>literal. An application may ignore language tagging of a string. All RDF
>applications must specify whether or not language tagging in literals is
>significant; that is, whether or not language is considered when
>string matching or other processing.]]
>I'd first observe that the XML spec cites as an example of using
>xml:lang distinguishing between
><p xml:lang="en-GB">What colour is it?</p> and
><p xml:lang="en-US">What color is it?</p>
>Then a few observations based on P221:
>1.  "There is no specific data model representation for xml:lang (i.e.,
>it adds no triples to the data
>model)".  That is, the lang attribute isn't explicitly reflected in the
>"data model" *as triples*
>2.  The problem is interpreting what "the language of a literal is
>considered by RDF to be a part of the
>literal" means. Brian says it means that a literal is really
>(effectively) a pair.  Patrick says the language is non-existent in the
>RDF graph.
>3.  P221 also says: "All RDF applications must specify whether or not
>language tagging in literals is
>significant; that is, whether or not language is considered when
>performing string matching or other processing."  [Note:  RDF
>application, not XML application].  If the language tagging is not
>available in what an RDF application processes, this doesn't appear to
>make any sense;  the application would have nothing to consider.  If an
>RDF application always processes an XML serialization, things would be
>OK.  But if an RDF application only processes triples (not an XML
>serialization), it seems to me we need to do one of two things:
>a.  dispense with most, if not all, of P221:  not just the part that
>says that the language is considered part of the literal, but also the
>part that talks about RDF applications possibly considering language
>tagging in string matching and other processing.
>b.  accept that the language information is *somehow* there in the
>literal (although the M&S doesn't say how).  Effectively, that sounds
>like a pair.
>[actually, maybe there's a c.:  change what we mean by "RDF

Let me suggest a possible way out of this maze. It's the kind of thing 
that a mathematician would say, so maybe it won't be acceptable, but 
here goes.

Literals are strings. However, an app might decide that what counts 
as the 'same' string for inference purposes might be 
language-sensitive, so that the UK-spelling string "What colour is 
it?" might be allowed to match, i.e. to be the 'same as', the 
US-spelling string "What color is it?". Such language-sensitive 
matching would require an application which used it to maintain 
language tags associated with literals, but those tags are invisible 
to RDF, and are not considered to be part of the RDF graph syntax. If 
an RDF application uses language-sensitive matching then it will be 
able to draw more conclusions than one which does not, for example

ex:Nigel  ex:believes "colour is red" .
ex:BillyBob ex:believes "color is red" .

might have the consequence

ex:Nigel  ex:believes _:x .
ex:BillyBob ex:believes _:x .

with language-sensitive string matching, but would not if simple 
string matching were used.
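
A rough sketch of that difference, in Python. Everything here is an 
assumption made for illustration: the spelling table, the side table of 
language tags kept by the application (outside the graph), and the 
function names are hypothetical, and Nigel's literal is taken to use the 
UK spelling.

```python
# Hypothetical spelling table mapping UK spellings to US ones. Not part of RDF.
UK_TO_US = {"colour": "color"}

# The graph itself holds plain strings only.
triples = [
    ("ex:Nigel", "ex:believes", "colour is red"),
    ("ex:BillyBob", "ex:believes", "color is red"),
]

# Language tags are kept by the application, outside the graph (assumption).
lang_tags = {
    triples[0]: "en-GB",
    triples[1]: "en-US",
}


def match_key(triple, language_sensitive):
    """Return the key used to decide whether two object literals are the 'same'."""
    literal = triple[2]
    if not language_sensitive:
        return literal  # simple string matching: the string is its own key
    if lang_tags.get(triple) == "en-GB":
        # Fold UK spellings onto US ones, word by word.
        return " ".join(UK_TO_US.get(word, word) for word in literal.split())
    return literal


def share_object(graph, language_sensitive):
    """True if every triple's object matches under the chosen rule."""
    keys = {match_key(t, language_sensitive) for t in graph}
    return len(keys) == 1


print(share_object(triples, language_sensitive=True))
print(share_object(triples, language_sensitive=False))
```

Only when share_object holds could the application replace both objects 
with the same blank node _:x.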

In mathematical terms, a literal is in general an equivalence class 
of strings, but the criteria that determine equivalence are under the 
RDF hood. And if there isn't anything under the hood, then every class 
just has a single string in it.
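
To make the equivalence-class reading concrete, here is a minimal Python 
sketch. The normalizing rule fold_spelling is a made-up stand-in for 
whatever an application keeps "under the hood"; with no rule supplied, 
the identity is used and every class is a singleton.

```python
from collections import defaultdict


def equivalence_classes(strings, normalize=None):
    """Partition strings into classes that share the same normal form."""
    if normalize is None:
        # Nothing under the hood: each string is equivalent only to itself.
        normalize = lambda s: s
    classes = defaultdict(set)
    for s in strings:
        classes[normalize(s)].add(s)
    return list(classes.values())


def fold_spelling(s):
    # Hypothetical rule: treat "colour" and "color" as the same word.
    return s.replace("colour", "color")


strings = ["What colour is it?", "What color is it?", "Is it red?"]

simple = equivalence_classes(strings)                # three singleton classes
folded = equivalence_classes(strings, fold_spelling)  # two classes
```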

This would I think allow Brian to preserve his code with a clear 
conscience, but also would avoid the issues that arise from saying 
that languages were anything like properties. (?)


PS. One case which this might not handle well would be where the one 
string means different things in different languages. Are there any 
cases like that?

IHMC					(850)434 8903   home
40 South Alcaniz St.			(850)202 4416   office
Pensacola,  FL 32501			(850)202 4440   fax
Received on Wednesday, 27 February 2002 12:22:13 UTC
