Re: RDFCore WG: Datatyping documents

From: Patrick Stickler <patrick.stickler@nokia.com>
Subject: Re: RDFCore WG: Datatyping documents
Date: Mon, 04 Feb 2002 20:18:24 +0200

> On 2002-02-04 19:53, "ext Peter F. Patel-Schneider"
> <pfps@research.bell-labs.com> wrote:
> 
> > From: Patrick Stickler <patrick.stickler@nokia.com>
> > Subject: Re: RDFCore WG: Datatyping documents
> > Date: Mon, 04 Feb 2002 19:26:35 +0200
> > 
> >> On 2002-02-04 17:52, "ext Damian Steer" <D.M.Steer@lse.ac.uk> wrote:
> >> 
> >>> TDL's method, which doesn't require those clauses, appears much more
> >>> troublesome. <"0.0",0> != <"0",0> is a typical problem.
> >> 
> >> This is a problem with all datatyping proposals that RDF could
> >> consider, since RDF cannot escape non-canonical lexical forms
> >> and thus more than one lexical form can denote the same value
> >> for a given datatype.
> >> 
> >>> This is hardly an original thought (it was discussed on Friday), but
> >>> could somebody explain why TDL does this? I can see hope for the
> >>> 'almost a function' approach, but not for the lexical-value pairs.
> >> 
> >> Well, not to disparage Jeremy's efforts at providing an MT for
> >> TDL (which I am not capable of doing and for which I am very
> >> very grateful to Jeremy for his contributions), the particular approach
> >> he took, that of the lexical-value pairing, is not exactly the
> >> same as the basic concept behind TDL, which is more I think
> >> along the lines of your 'almost a function' approach, and pairs
> >> the lexical form (literal) with the URI of the datatype as
> >> a basis for interpretation rather than a lexical form and a
> >> value.
> > 
> > The problem mentioned above has everything to do with the denotation of
> > Unicode nodes, and nothing to do with lexical forms.
> >
> 
> I'm not quite sure what you're saying here. Do you mean
> that a Unicode node is not a lexical form?

Well, I may have been confused by the use of the term "lexical form" for
what I thought used to be called literal nodes, which are called Unicode
nodes in the TDL document.
In any case, by Unicode node I mean what used to be called a literal node,
as is used in the TDL MT.  If that is the same as lexical form, then please
revise my above statement to something like

	The problem mentioned above has everything to do with the
	denotation of Unicode nodes, and nothing to do with multiple or
	non-canonical lexical forms.

> > I don't think that you can claim that the TDL model theory is where the
> > mistake is.   All that this part of the formal TDL model theory is
> > reflecting is the wordings
> > 
> > ... a datatype class corresponds to its map, a set of pairs of
> > lexical strings and their corresponding values.
> >
> > An interpretation maps each Unicode node to some literal-value
> > pair.
> >
> > As long as that wording is in the TDL document, and is reflected in the
> > formal model theory then it *IS* TDL.  The example pictures cannot override
> > these ``facts on the ground''.
> 
> Forgive me for not being clear. I forget that not all are privy
> to the history of the TDL proposal.
> 
> That wording is part of the model theory, not part of the
> original concept of TDL pairings.

All we have (now) to go by is the TDL document, which has your name on it.

> I understand that some folks have examined TDL solely on the basis
> of the MT presented, and consider the rest of the verbiage to be
> just so much hand waving and babbling, but the MT was an attempt
> at capturing the idea that the identity of a datatype provides
> the necessary context for interpreting a given lexical form
> and that with only the pair of lexical form (literal) and datatype
> identity (URIref) it is possible to derive a single consistent
> value in the value space of that datatype -- i.e. that a TDL
> pairing has a 1:1 correspondence to a mapping between the lexical
> and value space.

Well, yes, and that may be consistent with the diagrams in the TDL
document, but the diagrams don't really say how any of that works, so we
are left only with the model theory.  A simple example of how TDL works in
practice, with an example like

	age rdfs:range xsd:integer .
	John age "10" .
	
or even

	John age "10":1 .
	"10":1 rdf:type xsd:integer .

(inventing some syntax so that we can refer to literal nodes) would be most
instructive.


> I believe that to a great extent the MT provided by Jeremy, and
> for which I am aware you provided input, did capture most of
> that idea, but not without running into some issues regarding
> entailment and compatibility with tidy literals (both issues
> which, I believe, have been resolved -- though work continues
> on improving the MT).

Sure, but, again, entailment and tidiness are independent of what the
denotation of a literal node is.

> Again, I am very grateful to Jeremy's significant contributions
> in providing the MT for TDL and feel that the TDL proposal is a
> far better proposal because of it.
> 
> > And yes, if anyone is counting, I view this as a fatal flaw in TDL.
> 
> I'm sorry, what is the fatal flaw?
> 
> > When I say 
> > 
> > age rdfs:range xsd:integer .
> > John age "10" .
> > 
> > (the XML Schema extension of) the RDF model theory had [...] better have a
> > denotation of age relationship between the denotation of John and the
> > integer 10 in *every* model for these two triples.  Anything else is just
> > plain wrong.
> 
> I believe we are in agreement. So I'm not sure what
> the key point of this discussion now is.

The *key* point is that the TDL document makes the denotation of a literal
node be a pair consisting of a Unicode string and a value.  This is just
wrong.
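
To make the point concrete, here is a minimal sketch in Python.  It is not
taken from the TDL document: the function names are invented for
illustration, and Python's Decimal is used only as a stand-in for some
decimal-like datatype whose lexical-to-value map sends both "0.0" and "0"
to zero.  It contrasts a denotation that is a (string, value) pair with a
denotation that is just the value.

```python
from decimal import Decimal


def lexical_to_value(lexical: str) -> Decimal:
    """Hypothetical lexical-to-value map for a decimal-like datatype."""
    return Decimal(lexical)


def denote_as_pair(lexical: str) -> tuple[str, Decimal]:
    """Denotation as a (Unicode string, value) pair, as in the wording under dispute."""
    return (lexical, lexical_to_value(lexical))


def denote_as_value(lexical: str) -> Decimal:
    """Denotation as the value alone."""
    return lexical_to_value(lexical)


# Two lexical forms that map to the same value.
pair_a = denote_as_pair("0.0")
pair_b = denote_as_pair("0")

# The pairs differ because their string components differ.
print(pair_a == pair_b)  # False

# The values alone are equal.
print(denote_as_value("0.0") == denote_as_value("0"))  # True
```

Under the pair reading, two literal nodes with different strings can never
denote the same thing, even when the datatype maps both strings to one
value.  Under the value reading they can.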

So, I guess the question is whether you agree with this.  If you do not,
then I expect you to retract the TDL document as it now stands.

> patrick

Peter F. Patel-Schneider
Bell Labs Research

Received on Monday, 4 February 2002 14:40:05 UTC