Re: Denotation of datatype values from Patrick Stickler on 2002-04-11 (w3c-rdfcore-wg@w3.org from April 2002)

From: Patrick Stickler <patrick.stickler@nokia.com>
Date: Thu, 11 Apr 2002 10:29:12 +0300
To: Jeremy Carroll <jjc@hplb.hpl.hp.com>, RDF Core <w3c-rdfcore-wg@w3.org>
Message-ID: <B8DB15F8.12DFC%patrick.stickler@nokia.com>
On 2002-04-10 22:13, "ext Jeremy Carroll" <jjc@hplb.hpl.hp.com> wrote:

> 
> Patrick:
>> But it seems that no'one (myself included) feels that the
>> answer is 'yes' (apart from perhaps Jeremy, though he has
>> not responded to this question directly) so I'll drop it.
> 
> Whilst it is good of you to keep thinking of me, I have decided that I am
> too far from the group consensus to contribute positively to the process.
> 
> I have lost quite what the "question" refered to above was, but it seems
> that the e-mail has settled on a rational point (which I disagree with).
> 
> In some contexts the literal string "25" in the model theory denotes the
> literal string "25" but "according to our shared understanding" there is a
> corresponding value of 25. The 25 does not occur in the model theory, but
> the application is expected to use it.
> 
> Personally I detect doublethink. If the specification indicates that the
> application should use 25 when it receives "25" then wherever the official
> line at which the model theory stops there is an implicit and
> ill-articulated extended conceptual model which has the value 25 in it. That
> is the conceptual model that the specifications expect the application
> writer to use (and document authors ...).

Well it may be ill-articulated (I'm working on that ;-) but it's meant to
be explicit (and this thread is very useful in nailing down where the WD
fails to communicate).

The section on "Datatyped Literals" attempts to both explicitly define
this conceptual model and also states that it is the focus of RDF
Datatyping.

The goal of RDF Datatyping is to provide applications with Datatyped Literal
pairings.

The application you reference does not recieve "25". Rather, it recieves
the datatyped literal pairing <xsd:integer, "25">,
and there are three different idioms by which that datatyped literal can
be represented. If the application only gets "25", then it can't derive
any datatype value from that *insofar as* the RDF knowledge is concerned.
(it might guess that it is some kind of number, but it won't be able
to know for sure; it could be a monthday, or in octal notation, etc.).

I do admit that this conceptual model of datatyped literal pairings is
at a level above the graph MT proper, and the MT and datatyping MT apply
only to the idioms than to what the idioms fully represent. A datatyped
literal pairing has no explicit definition in the MT nor any explicit
denotation in the graph. The implicit representation in the idioms
is the closest you get to it at that level.

Another difference between the datatyped literal conceptual level and
the MT graph level is whether the value itself has explicit
denotation in the graph, and in the case of the inline idiom it doesn't
yet at the datatyping interpretation level it does (given complete
idioms) which is why it seems that there is "doublethink" going on.
But just because the value doesn't have denotation in the graph doesn't
mean that any "doublethink" is going on. These are, after all, different
levels -- and the higher level is not changing the interpretation
of the lower level, it simply clarifies it, making explicit what
is implicit at the lower level.

Alot has to do with perspective and expectation. If you are
expecting the property to always have a datatype value, and you
are looking at the MT level, you will then be disappointed to
find a literal. If, however, you are looking at the datatyping
level, then you will find an unambiguous representation of
a value (the datatyped literal pairing), which may still
disappoint you, if you want the actual explicit value. At the
highest level, then, with full understanding of the datatype
you will finally get the actual value you are looking for.

The MT defines the graph syntax and certain constraints on the
datatyping idioms. Those idioms serve to communicate datatyped
literal pairings (though the implicit/global idioms without
rdfd:range constraints will be incomplete) and the datatyping
MT helps to achieve consistency in the interpretation of those
idioms in deriving datatyped literal pairings from them.

An application which is concerned with datatyped values will look
for the idioms that will provide it with the datatyped literal
pairings, and if it groks the datatype in question, will be
understand which is the value identified by the datatyped literal
pairing.

Applications which are only concerned with the literal graph
syntax will have a solid MT by which to interpret the idioms,
along with the rest of the graph, in terms of the MT.

I don't see a problem to having these different levels. In fact, an RDF
API could provide separate access at all three levels shown in my
"Levels of Interpretation" section. At the first level, property
values such as ex:age are simply graph nodes; either literal nodes
or non-literal nodes. At the second level, property values are
extra-RDF conceptual objects denoting datatyped literal pairings
(the idioms in the literal graph are hidden). At the third level,
property values are the actual values, in the native, system-specific
representation (presuming the API groks the datatypes in question
and the system is able to provide a native representation).

The separation of the MT/Idiom level from the conceptual datatyped
literal pairing level also gives us more liberty in the future
for defining new idioms (e.g. if literals become subjects) with
little to no impact to the higher levels based on the abstraction
of datatyped literal pairings.

While we may want to re-examine extending the Datatyping MT to
provide some explicit definition of datatyped literal pairings,
the separation of levels between idioms and those pairings is,
I think, a good thing.

> My earlier argument that the
> datatyping proposal was non-monotonic is refuted up to the model theory.
> This is from the clarity that there is only "25" and not 25. But the
> argument still stands as far as the conceptual model goes.

I appreciate this point, really.

But I don't see any real non-monotonicity
here, given we are talking about two distinct levels of interpretation,
not the same level of intepretation.

Neither level is non-monotonic, and the datatyping level only clarifies
but does not change the knowledge at the lower idiom level, so the
transition between levels also is not non-monotonic.

The complaint may then really be about having the different levels
of interpretation, rather than addressing everything at the MT level.

My question which started this thread was really trying to determine
if these different levels were going to cause any problems to anyone
(e.g. DAML, OWL, etc.) and if so, whether we should do anything about
it. The present concensus (by measure of silence ;-) is that it is
not a problem.

> It seems that the clarity of this thread that one idiom delivers 25 whereas
> the other delivers "25" should be made clear in the document,

I do not agree that any of the idioms are delivering "25". A given
property may have a value that is the literal node "25", but the *idioms*
deliver datatyped literal pairings (or are incomplete and underdefined).
The idioms are more than just the property value node in isolation.

At the level of the MT, a property may have either a literal node
denoting a lexical form or a non-literal node denoting a value, to which
is attached a literal node denoting the lexical form. But all idioms
produce the equivalent datatyping interpretation -- which is a datatyped
literal pairing.

It does appear that the datatyping MT may not fully extend to that
point, and perhaps it should. Pat?

The example above of an RDF API that provides access at the level of
the datatyping interpretation is, I think, useful here. It would never
just return a property value of "25". It would return a datatyped
literal pairing -- which, if the idiom was incomplete, may itself
be missing the datatype -- but it would not return just the graph
node that is the object of the statement, because the idiom is more
than just that single graph node and whether that node is a literal
or a non-literal node, it is insufficient by itself for determining
which actual datatype value is meant.

> and an
> explicit admission that the disideratum of interoperability between the
> global and local type mechanism has not been met.

I believe this desideratum has been met. Interoperability has to do
with semantic conflicts between the idioms, and the present proposal
does not reflect any such conflicts. All idioms involving the same
literal and datatype produce interpretations which are semantically
equivalent, and the introduction of any idiom involving a given literal
and datatype does not modify the interpretation of any other idiom
having the same literal and datatype. The "Levels of Interpretation"
illustrations could also be used to illustrate this semantic
interoperability.

Cheers,

Patrick

--
               
Patrick Stickler              Phone: +358 50 483 9453
Senior Research Scientist     Fax:   +358 7180 35409
Nokia Research Center         Email: patrick.stickler@nokia.com
Received on Thursday, 11 April 2002 03:26:35 UTC