Re: RDFCore WG: Datatyping documents from Patrick Stickler on 2002-02-04 (www-rdf-logic@w3.org from February 2002)

From: Patrick Stickler <patrick.stickler@nokia.com>
Date: Mon, 04 Feb 2002 22:03:06 +0200
To: "ext Peter F. Patel-Schneider" <pfps@research.bell-labs.com>
CC: <D.M.Steer@lse.ac.uk>, RDF Logic <www-rdf-logic@w3.org>
Message-ID: <B884B59A.D292%patrick.stickler@nokia.com>
On 2002-02-04 20:43, "ext Peter F. Patel-Schneider"
<pfps@research.bell-labs.com> wrote:

> From: Patrick Stickler <patrick.stickler@nokia.com>
> Subject: Re: RDFCore WG: Datatyping documents
> Date: Mon, 04 Feb 2002 20:18:24 +0200
> 
>> On 2002-02-04 19:53, "ext Peter F. Patel-Schneider"
>> <pfps@research.bell-labs.com> wrote:
>> 
>>> From: Patrick Stickler <patrick.stickler@nokia.com>
>>> Subject: Re: RDFCore WG: Datatyping documents
>>> Date: Mon, 04 Feb 2002 19:26:35 +0200
>>> 
>>>> On 2002-02-04 17:52, "ext Damian Steer" <D.M.Steer@lse.ac.uk> wrote:
>>>> 
>>>>> TDL's method, which doesn't require those clauses, appears much more
>>>>> troublesome. <"0.0",0> != <"0",0> is a typical problem.
>>>> 
>>>> This is a problem with all datatyping proposals that RDF could
>>>> consider, since RDF cannot escape non-canonical lexical forms
>>>> and thus more than one lexical form can denote the same value
>>>> for a given datatype.
>>>> 
>>>>> This is hardly an original thought (it was discussed on Friday), but
>>>>> could somebody explain why TDL does this? I can see hope for the
>>>>> 'almost a function' approach, but not for the lexical-value pairs.
>>>> 
>>>> Well, not to disparage Jeremy's efforts at providing an MT for
>>>> TDL (which I am not capable of doing myself, and for which I am
>>>> very grateful to Jeremy), the particular approach he took, that
>>>> of the lexical-value pairing, is not exactly the same as the
>>>> basic concept behind TDL. That concept is, I think, more along
>>>> the lines of your 'almost a function' approach: it pairs the
>>>> lexical form (literal) with the URI of the datatype as a basis
>>>> for interpretation, rather than a lexical form and a value.
>>> 
>>> The problem mentioned above has everything to do with the denotation of
>>> Unicode nodes, and nothing to do with lexical forms.
>>> 
>> 
>> I'm not quite sure what you're saying here. Do you mean
>> that a Unicode node is not a lexical form?
> 
> Well I may have been confused by the use of the term lexical form for what
> I thought used to be called literal nodes and are called Unicode nodes in
> the TDL document.
> In any case, by Unicode node I mean what used to be called a literal node,
> as is used in the TDL MT.  If that is the same as lexical form, then please
> revise my above statement to something like
> 
> The problem mentioned above has everything to do with the
> denotation of Unicode nodes, and nothing to do with multiple or
> non-canonical lexical forms.

Fair enough. I take this to mean that whether or not the mapping
from a Unicode node to a value is N:1 or 1:1 is irrelevant so
long as it is not N:N or 1:N, right?
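
To illustrate the distinction being asked about, here is a minimal sketch in Python. It is not taken from the TDL document; the function name `is_functional` and the sample pairs are assumptions made for illustration. It treats a lexical-to-value mapping as a set of pairs and checks that no lexical form maps to more than one value, which holds for 1:1 and N:1 mappings but fails for 1:N and N:N.

```python
from typing import Hashable, Iterable, Tuple


def is_functional(pairs: Iterable[Tuple[str, Hashable]]) -> bool:
    """Return True if every lexical form is paired with exactly one value."""
    seen = {}
    for lexical, value in pairs:
        # setdefault records the first value seen for this lexical form
        # and returns it on later encounters for comparison.
        if seen.setdefault(lexical, value) != value:
            return False
    return True


# N:1 -- two lexical forms, one value: still usable for interpretation.
many_to_one = {("10", 10), ("010", 10)}

# 1:N -- one lexical form, two values: interpretation is ambiguous.
one_to_many = {("10", 10), ("10", "ten")}
```

Under these assumptions, `is_functional(many_to_one)` holds and `is_functional(one_to_many)` does not.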

>>> I don't think that you can claim that the TDL model theory is where the
>>> mistake is.   All that this part of the formal TDL model theory is
>>> reflecting is the wordings
>>> 
>>> ... a datatype class corresponds to its map, a set of pairs of
>>> lexical strings and their corresponding values.
>>> 
>>> An interpretation maps each Unicode node to some literal-value
>>> pair.
>>> 
>>> As long as that wording is in the TDL document, and is reflected in the
>>> formal model theory then it *IS* TDL.  The example pictures cannot override
>>> these ``facts on the ground''.
>> 
>> Forgive me for not being clear. I forget that not all are privy
>> to the history of the TDL proposal.
>> 
>> That wording is part of the model theory, not part of the
>> original concept of TDL pairings.
> 
> All we have (now) to go by is the TDL document, which has your name on it.

Quite so. There are some references to the documents
on which it was based, but I guess it's fair not to require
or expect folks to look at those.
 
>> I understand that some folks have examined TDL solely on the basis
>> of the MT presented, and consider the rest of the verbiage to be
>> just so much hand waving and babbling, but the MT was an attempt
>> at capturing the idea that the identity of a datatype provides
>> the necessary context for interpreting a given lexical form
>> and that with only the pair of lexical form (literal) and datatype
>> identity (URIref) it is possible to derive a single consistent
>> value in the value space of that datatype -- i.e. that a TDL
>> pairing has a 1:1 correspondence to a mapping between the lexical
>> and value space.
> 
> Well, yes, and that may be consistent with the diagrams in the TDL
> document, but the diagrams don't really say how any of that works, so we
> are left only with the model theory.  A simple example of how TDL works in
> practice, with an example like
> 
> age rdfs:range xsd:integer .
> John age "10".

In this example, which uses the global/implicit
idiom, you have a literal "10" which may be interpreted
as a member of the lexical space of xsd:integer. Thus,
we have a TDL pairing ("10", xsd:integer). This pairing,
given the MT definitions of datatypes -- namely that
a datatype has a lexical space, a value space, and an
N:1 mapping from the lexical space to the value
space -- unambiguously denotes the mapping ("10", 10).

Now, if we take the local/explicit idiom

   John age _:1 .
   _:1 rdf:value "10" .
   _:1 rdf:type xsd:integer .

again we have the literal "10" and again the same TDL
pairing ("10", xsd:integer) which denotes the mapping
("10", 10).

Insofar as the graph syntax is concerned, "10" is just
a literal (a Unicode node). It is the context of the
datatype, provided by the global or local idiom, that
defines the TDL pairings that are the basis for
interpretation.
 
Thus, it is the TDL pairing, not the literal/Unicode node
that denotes the value.
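
The two idioms above can be sketched in Python as follows. This is only an illustration under stated assumptions, not the TDL model theory: the names `Literal`, `tdl_pairings`, `denote`, and `LEXICAL_TO_VALUE` are hypothetical, triples are plain tuples, and Python's `int()` is a rough stand-in for the xsd:integer lexical-to-value mapping.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Set, Tuple, Union


@dataclass(frozen=True)
class Literal:
    """A literal (Unicode) node; it carries only a lexical form."""
    lexical_form: str


Node = Union[str, Literal]
Triple = Tuple[str, str, Node]
Pairing = Tuple[str, str]  # (lexical form, datatype URIref)

# Assumed registry: datatype URIref -> N:1 lexical-to-value mapping.
# int() only approximates the xsd:integer mapping.
LEXICAL_TO_VALUE: Dict[str, Callable[[str], object]] = {"xsd:integer": int}


def tdl_pairings(graph: List[Triple]) -> Set[Pairing]:
    """Collect (literal, datatype) pairings from both idioms."""
    ranges = {s: o for s, p, o in graph if p == "rdfs:range"}
    types = {s: o for s, p, o in graph if p == "rdf:type"}
    found: Set[Pairing] = set()
    for s, p, o in graph:
        if not isinstance(o, Literal):
            continue
        if p == "rdf:value" and s in types:   # local/explicit idiom
            found.add((o.lexical_form, types[s]))
        elif p in ranges:                      # global/implicit idiom
            found.add((o.lexical_form, ranges[p]))
    return found


def denote(pairing: Pairing) -> object:
    """Map a pairing to a value via the datatype's lexical-to-value mapping."""
    lexical_form, datatype = pairing
    return LEXICAL_TO_VALUE[datatype](lexical_form)


GLOBAL_IDIOM: List[Triple] = [
    ("age", "rdfs:range", "xsd:integer"),
    ("John", "age", Literal("10")),
]

LOCAL_IDIOM: List[Triple] = [
    ("John", "age", "_:1"),
    ("_:1", "rdf:value", Literal("10")),
    ("_:1", "rdf:type", "xsd:integer"),
]
```

In this sketch both graphs yield the single pairing ("10", "xsd:integer"), and `denote` maps that pairing to the integer 10. The literal node itself is never given a value; only the pairing is.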

> 
> (inventing some syntax so that we can refer to literal nodes) would be most
> instructive.

No extra syntax is needed. A literal in the graph is a literal node.
In and of itself it denotes a literal -- though its datatyping
interpretation is not necessarily a "string".

Or perhaps I'm not completely understanding you here.

>> I believe we are in agreement. So I'm not sure what
>> the key point of this discussion now is.
> 
> The *key* point is that the TDL document makes the denotation of a literal
> node be a pair consisting of a Unicode string and a value.  This is just
> wrong.

Fine. Fair enough. Then the flaw that you see is with the MT and
not with the TDL concept itself (though, again, I appreciate that
many folks are not interested in entertaining any concept not
expressed in some MT).

> So, I guess the question is whether you agree with this.

I do agree that the lexical-value pairing employed by the
present TDL MT is not the fundamental idea underlying the TDL
proposal, but merely a mechanism employed by one interpretation
of TDL pairings, and that there are known issues with that
interpretation -- though my understanding is that those issues
have been or are being addressed.

The TDL concept is not inextricably bound to the present TDL MT
and there are other proposals as to how the MT for TDL could
be expressed.

> If you do not,
> then I expect you to retract the TDL document as it now stands.

Although I sincerely value your opinion and input a great
deal, even if I did not agree with you (which I think I do),
I would not retract the TDL proposal, per se, but would rather
continue to support the refinement/correction of its MT.

All of the proposals are works in progress, and all have issues
to be addressed. We are not allowing only "perfect" and "pristine"
proposals to remain on the table for discussion and consideration.
RDF Core is, after all, a *working* group, no? ;-)

Cheers,

Patrick

--
               
Patrick Stickler              Phone: +358 50 483 9453
Senior Research Scientist     Fax:   +358 7180 35409
Nokia Research Center         Email: patrick.stickler@nokia.com
Received on Monday, 4 February 2002 15:03:32 UTC