Re: Denotation of datatype values from Patrick Stickler on 2002-04-18 (w3c-rdfcore-wg@w3.org from April 2002)

From: Patrick Stickler <patrick.stickler@nokia.com>
Date: Thu, 18 Apr 2002 19:46:14 +0300
To: Pat Hayes <phayes@ai.uwf.edu>
CC: RDF Core <w3c-rdfcore-wg@w3.org>
Message-ID: <B8E4D306.135CF%patrick.stickler@nokia.com>
On 2002-04-18 0:38, "ext Pat Hayes" <phayes@ai.uwf.edu> wrote:


>> If RDF Datatyping cannot provide a consistent and unambiguous
>> interpretation resulting in a specific datatype value, then
>> we're just wasting our time.
> 
> No no no. This is a misunderstanding. Some idioms provide only
> lexical form checking, other idioms provide unambiguous denotations
> of datatype values. Both are needed (by different user communities,
> maybe, but needed nevertheless.) Some people want both, some people
> want one without the other, some people want to be able to remain
> agnostic. The various idioms provide for all possibilities. Insisting
> that any rational person MUST be involved with finding datatype
> values, and the (for example) the Dublin Core style of using RDF is
> just wasting time, is both wrong and also confusing to the reader.

I see the only point in constraining literals to the lexical forms
of a given datatype to be so that there is a consistent interpretation
of those literals as denoting specific datatype values.

While some idioms do provide denotations of those datatype values
and the inline idiom does not, that does not mean that the purpose
of the inline idiom is not to identify a datatype value.

The purpose of *all* datatyping idioms is to identify datatype values.

If an idiom doesn't have that purpose, then it is not a datatyping
idiom. If the inline idiom doesn't have that purpose, then it
should not be called a datatyping idiom.

And BTW, I consider an idiom that places a literal in the lexical
space of a datatype to have identified a datatype value. How could
it not?! So I consider the inline idiom to be a fully valid datatyping
idiom.

That *doesn't* mean that the literal node denotes the datatype value,
but that ultimately, some application should understand at some level
that the RDF graph is talking about the actual datatype value in relation
to that particular property, even if that datatype value is not denoted
in the graph itself.

I.e.

   literal node all by itself          = literal
   literal node combined with datatype = datatype value

the latter does *not* change the meaning of the literal
itself, no more than 2 + 1 = 3 changes the meaning of 2
just because if combined with 1 we get 3.

The datatyping idioms have a meaning that is the sum of their
parts -- the literal and the datatype -- and that total meaning
does not change the meaning of their component parts.

I tried to ask a question a week ago: does a datatype value have to
have an explicit denotation in the graph, by a single node, in order
for it to be expressed/identified/communicated by the graph?

I was told, by a combination of silence and limited comments, that
no, it does not.

You seem to be arguing that it does. That if an RDF graph is going
to express/identify/communicate a particular datatype value, that
value must have denotation in the graph, or at least explicit
definition in the MT.

I considered it to be an important and pivotal question then,
and I still do. And the answer to that question seems to be
at the root of all of the misunderstanding and disagreement
about what meaning the idioms capture and how to define the
MT to that end.

My answer to this question is no, it does not. In that a literal
node and a datatype associated with that literal node *together*
can represent/express/identify/communicate a specific datatype
value even if that value has no explicit denotation in the graph
by a single node. And I consider that to be the ultimate goal of RDF
Datatyping, to communicate datatype values.

Here's an analogy:

Datatype Idiom:       value = foo("x");
Lexical Form Idiom:   function = foo; value = *function("x");
Inline Idiom:         function = foo; *function("x");

Now, in the first two cases, the result of executing the
function 'foo' with the argument "x" is stored in the variable
'value'. In the last case, the result is not stored, but it
is still expressed.

In all cases, the function is executed and the result is
obtained.

It is exactly the case with the datatyping idioms. It it
analogous to

Datatype Idiom:       bnode = xsd:integer("10");
Lexical Form Idiom:   datatype = xsd:integer; bnode = *datatype("10");
Inline Idiom:         datatype = xsd:integer; *datatype("10");

In all cases, the lexical to value mapping is defined and the
datatype value is identified. That's the whole point of
the datatyping idioms.

You seem to say that only the bnode idioms actually identify
the datatype value, because the value has explicit denotation;
and that the inline idiom does not identify a datatype value
but only constrains the literal to the lexical space of a datatype.

But in order to test if the literal is a member of the datatype,
you must -- and this is crucial -- attempt to map it to a datatype
value, and see if you get an actual value! And thus, to say a
literal is a lexical form of a datatype is to identify the value
that it represents, presuming it is a valid lexical form.

Thus, I find your intepretation of the inline idiom as *only*
constraining literals to valid lexical forms to be both
useless insofar as the practical needs of datatyping are concerned
(which is to communicate datatype values), and actually
fail to understand how you can say that identifying a literal
as a lexical form of a datatype does not indirectly yet
unambiguously and unquestionably identify the datatype value
which is represented by that lexical form.

The expression L2V(I(ddd))("LLL") identifies a specific
datatype value, no?

And all three idioms define L2V(I(ddd))("LLL"), no?

And the bnode idioms further make the assignment
I(ccc) = L2V(I(ddd))("LLL"), i.e. they fix the datatype
value to the bnode, no?

Thus when I say that the datatyping
MT provides a datatype value interpretation for all
three idioms, I just don't get why you say that's not
correct. If that's not what the MT is saying, then
that's what the MT *should* be saying, IMHO, and
that is certainly what I thought the MT was saying
when I voted in favor of the "stake in the ground".

>> If a given approach to reaching that goal results in contradictions,
> 
> Ignoring datatype values while being concerned with lexical forms is
> not being involved in a  contradiction. It might be bad style, or
> bone-headed, but some of our customers are like that.

I think you have misunderstood the motivations for the inline
idiom, or then I have. I understand the issue such that
it is the presence of the blank node that is considered
cumbersome -- and more from the point of view of the serialization,
not the graph representation, and it's not just the desire to
only restrict the literal to valid lexical forms of a datatype.

(and as I've said before, the only point to restricting literals
to the lexical forms of a datatype is to assert an interpretation
in terms of that datatype)

I expect that the same datatype clash occurs for both of the
following cases:

Case 1:

   ex:age rdfd:datatype xsd:integer .
   ex:age rdfd:datatype xsd:string .
   Jane ex:age "10" .

Case 2:

   ex:age rdfd:datatype xsd:integer .
   ex:age rdfd:datatype xsd:string .
   Jane ex:age _:x .
   _:x rdfd:lex "10" .

Note that datatype clashes happen only above the RDF level.
RDF cannot know that the two datatypes map the lexical
form to different values, and thus cannot know that in
the latter case, the blank node is assigned two values.

And since the interpretation of the inline idiom and
rdfd:datatype assertion *together* as identifying a
datatype value happens above the RDF level, the clash
also occurs.

(and of course, it's important to note that datatype
clashes do not mean software crashes ;-) but rather that
there are conflicting interpretations being asserted by
the datatypes globally associated with the lexical
forms, and an application needs to deal with it in
some fashion)

If people do not wish to constrain the datatyping
interpretation of ex:age values to a given datatype
or datatypes, then they shouldn't make any rdfd:datatype
assertions, for *any* of the idioms, since they apply
to *all* of the idioms.

Patrick

--
               
Patrick Stickler              Phone: +358 50 483 9453
Senior Research Scientist     Fax:   +358 7180 35409
Nokia Research Center         Email: patrick.stickler@nokia.com
Received on Thursday, 18 April 2002 14:25:08 UTC