RE: RDF specifications from Peter F. Patel-Schneider on 2001-12-12 (www-rdf-interest@w3.org from December 2001)

From: Peter F. Patel-Schneider <pfps@research.bell-labs.com>
Date: Wed, 12 Dec 2001 09:34:27 -0500
To: Patrick.Stickler@nokia.com
Cc: www-rdf-interest@w3.org
Message-Id: <20011212093427M.pfps@research.bell-labs.com>
From: Patrick.Stickler@nokia.com
Subject: RE: RDF specifications (was RE: Cutting the Patrician datatype knot)
Date: Wed, 12 Dec 2001 15:35:54 +0200

> > -----Original Message-----
> > From: Peter F. Patel-Schneider 
> > Sent: 10 December, 2001 10:09
> > 
> > From: Patrick.Stickler@nokia.com
> > Date: Sat, 8 Dec 2001 00:51:30 +0200 
> > 
> > > > For example, for XML Schema, the interface could pass a pair like
> > > > <integer,10> or even <integer,"10"> instead of <decimal with 0
> > > > fractionDigits union string,"010">.  This would be much easier for
> > > > applications to handle than requiring them to understand all of
> > > > XML Schema constructed datatypes.
> > > 
> > > I believe this is very similar to what I am presently arguing, that
> > > data typing insofar as RDF is concerned is simply capturing the 
> > > pairing of lexical form and data type identifier, and that there may
> > > be various ways to do that, such as globally via rdfs:range or
> > > locally via an anonymous node with rdf:value and rdf:type values. 
> > 
> > Not at all.  I don't see how our approaches could be any more 
> > different.  
> > 
> > The description above talks solely about the interface between an RDF
> > system and RDF applications, not about how the RDF system itself works.
> > However, the above interface requires that the RDF system pass only
> > primitive XML Schema datatypes to applications.  It thus *requires* that
> > the RDF system has some deep understanding of XML Schema datatypes.  (I
> > also argue that this understanding can be embodied in an XML Schema
> > datatypes implementation, which is called by the RDF part of the system.)
> 
> But what you are then talking about is a specific RDF API, not RDF
> itself. 

How can RDF exist without some sort of API?  As I have argued earlier,
without an API I can exhibit a box that accepts any input and does nothing,
and it will act exactly the same as an RDF box.  If you want the RDF API to
be full access to the RDF graph, then all I am suggesting is that there
*is* a way of providing access that does not expose all the datatype
constructors to applications.
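
To make the interface idea concrete, here is a minimal Python sketch of how
an RDF system might hide a constructed datatype behind a simple pair.
Everything in it is hypothetical: the names parse_integer, parse_string and
to_simple_pair are illustrative only, the union is modeled as an ordered
list of member parsers, and the integer lexical form is approximated by a
regular expression rather than taken from an XML Schema implementation.

```python
import re

# Approximate lexical form of an integer (an assumption for this sketch,
# not the full XML Schema definition).
_INTEGER = re.compile(r"[+-]?[0-9]+")


def parse_integer(lexical):
    """Return ("integer", value) if the string looks like an integer, else None."""
    if _INTEGER.fullmatch(lexical):
        return ("integer", int(lexical))
    return None


def parse_string(lexical):
    """Every lexical form is acceptable as a string."""
    return ("string", lexical)


def to_simple_pair(member_parsers, lexical):
    """Resolve a union datatype by trying its member types in order."""
    for parse in member_parsers:
        pair = parse(lexical)
        if pair is not None:
            return pair
    raise ValueError("no member type accepts %r" % lexical)


# Hypothetical stand-in for "decimal with 0 fractionDigits union string".
INTEGER_OR_STRING = [parse_integer, parse_string]

print(to_simple_pair(INTEGER_OR_STRING, "010"))  # prints ('integer', 10)
print(to_simple_pair(INTEGER_OR_STRING, "ten"))  # prints ('string', 'ten')
```

The application only ever sees the resolved pair; the union and the facet
stay inside the RDF system.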

> If a given RDF API wishes to execute the mappings of typed
> literals (based on pairings defined by the RDF graph) and pass
> canonical internalized representations of the values to applications
> which share the same value space representations as the API, fine,
> in fact, that would be expected, but that's not RDF. 

How is this not RDF?  RDF has to be able to pass data to applications
somehow.  How is this to be done?

> That's out of
> scope insofar as how we are going to capture that data typing
> knowledge in the RDF graph itself, in a way that is independent of
> any data typing scheme or any particular representation used by
> any given platform.
> 
> I will reiterate, RDF should not use XML Schema data types natively
> in the graph.

I just don't understand this approach.  If RDF is going to have datatypes
at all, then it has to have some understanding of the datatypes, otherwise
entailment cannot be performed.



A formalism that allows various datatyping schemes to be incorporated
has to solve all the problems faced by a formalism that incorporates
just one particular datatyping scheme, and more.

Consider an example from ML:

If I build a function to sum an integer list, then I need to provide the
code for this, perhaps

   fun sum l = List.foldl op+ 0 l;

If I (only) want to provide a function that collapses a list using an
arbitrary other function (of the right type), then I could use

  fun rollup plus l = List.foldl plus 0 l;

Note particularly that (rollup op+) is essentially the same as sum.

If you want to have a method for incorporating arbitrary datatyping schemes
in RDF then it had better be the case that the application of the RDF
``functor'' to a particular datatyping scheme is RDF with datatypes.

Otherwise you have accomplished nothing.

In particular, how are you going to determine whether 

	age rdfs:range xsd:integer .
	John age "10" .

entails

	John age "010" .
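
A sketch of what answering this question involves, written in Python in the
spirit of rollup above: an entailment check parameterized by a datatyping
scheme.  The names (make_entails, XSD_INTEGER_SCHEME), the tuple encoding of
triples, and the use of Python's int as the lexical-to-value mapping are all
assumptions made for illustration; none of this comes from an RDF
specification or from PDU.

```python
def make_entails(lexical_to_value):
    """Build an entailment check from a datatyping scheme.

    lexical_to_value maps a datatype name to a function from lexical form
    to value.  An empty mapping gives purely syntactic comparison.
    """

    def entails(graph, goal):
        if goal in graph:
            return True
        # Collect the rdfs:range declarations found in the graph.
        ranges = {s: o for (s, p, o) in graph if p == "rdfs:range"}
        subject, prop, literal = goal
        # Unknown or undeclared datatype: compare lexical forms as they are.
        to_value = lexical_to_value.get(ranges.get(prop), lambda lex: lex)

        def value_of(lexical):
            try:
                return to_value(lexical)
            except ValueError:
                return None  # ill-typed literal: no value in this sketch

        wanted = value_of(literal)
        if wanted is None:
            return False
        return any(
            s == subject and p == prop and value_of(o) == wanted
            for (s, p, o) in graph
        )

    return entails


# Hypothetical scheme: int() approximates the xsd:integer mapping.
XSD_INTEGER_SCHEME = {"xsd:integer": int}

GRAPH = [("age", "rdfs:range", "xsd:integer"), ("John", "age", "10")]
GOAL = ("John", "age", "010")

print(make_entails({})(GRAPH, GOAL))                  # prints False
print(make_entails(XSD_INTEGER_SCHEME)(GRAPH, GOAL))  # prints True
```

With no scheme supplied the answer is "no"; the answer becomes "yes" only
once something supplies the lexical-to-value mapping for xsd:integer.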



> > ...
> > > So, to recap:
> > > 
> > > Do we really need anything more than the definition of the
> > > pairing and at least two idioms for global and local definition
> > > of such pairings? I say no.
> > 
> > > Do we really want the MT itself to say whether the pairings 
> > > (xsd:integer,"10") and (xsd:integer,"010") actually denote the 
> > > same value or not? I think not.
> > 
> > > Or do we rather leave such questions up to applications that 
> > > "know about" the specific data types and are able to determine
> > > such things? Yes, definitely.
> > 
> > In this case RDF does not have datatypes.  
> 
> BINGO! RDF itself should not *have* datatypes. It should provide
> a generic mechanism for allowing data types to be associated with
> literals (or really, any resource).

But then you haven't done anything.  If you truly believe this, then why
are you arguing for PDU?


> The RDF Core WG is not tasked to add XML Schema data types to RDF, IMO,
> but to provide a consistent, clear, and well defined means to use
> XML Schema data types in a way that applications which are XML Schema
> aware can recognize the typing knowledge and apply XML Schema 
> mechanisms accordingly.

Well I don't see how this goal can be realized.  Perhaps you have some way
of achieving this goal without adding a datatyping mechanism to RDF, but I
do not believe that I have seen it.

> RDF should not have any native data types itself, and should be fully
> neutral with regards to data typing schemes, supporting other schemes
> as well as it supports XML Schema data types.

This is your reading of how to proceed.  I do not see how this can work
and, further, I do not see how it is consistent with 

	RDF Schema must use and build upon XML Schema datatypes to the
	fullest extent that is practical and appropriate. [W3C RDF Core
	Working Group Charter]

> The PDU proposal accomplishes this fully, and is also the way folks
> are doing data typing in RDF now. And it works.

I dispute all three of these claims.  First, I have recently sent you a
message concerning difficulties in PDU.  Second, PDU places restrictions on
data typing that are not enforced by current RDF and, I expect, are not
followed by many uses of data typing in RDF.  Third, PDU does not provide a
full account of what datatypes mean in RDF.

> Yes, you may wish to have API layers above RDF that make it easier
> for certain types of applications to deal with typed literals, but
> that is not what "data typing in RDF" should provide (IMMHO).

And I disagree, strongly.

> > All that it has are some
> > unspecified conventions 
> 
> The point is to have them explicitly and officially specified.
> 
> > on passing information that may or may not be
> > datatype information on to downstream applications.  
> 
> Exactly, because RDF cannot know what that "downstream" application
> is, and what its internal data type scheme/representations are.

But then, again, what do datatypes mean?  RDF needs to know so that it can
do entailment.

> > Because RDF does not understand the conventions no RDF document
> > should mention them.
> 
> I don't follow your argument here.

If something is not defined by RDF then no RDF document should talk about
that thing.  (For example, if datatypes are not defined in RDF then no RDF
document should be talking about datatyping.)

> > > We still need to work through some issues of class relations
> > > and the semantics of e.g. rdfs:subClassOf and rdfs:subPropertyOf
> > > with regards to data types (lexical vs. value spaces) but those
> > > are issues that will have to be addressed no matter what idiom
> > > is used to define pairings.
> > 
> > If you don't provide a meaning for datatypes, then you can't determine
> > how they interact with other parts of RDF, and thus there is nothing
> > that can be done.
> 
> The point is for RDF to define a meaning for data types that is 
> independent of specific knowledge about value space representations
> or actual mappings -- leaving these to applications which "know"
> about the data typing scheme in question.
> 
> It's no different than XML just capturing <p> and leaving it to
> the application to know what that means.
> 
> For RDF, we're just saying something like
> 
>    <typedDataLiteral>
>       <xsd:integer>10</xsd:integer>
>    </typedDataLiteral>
> 
> but you could just as well have
> 
>    <typedDataLiteral>
>       <foo:count>10</foo:count>
>    </typedDataLiteral>
> 
> The PDU approach is actually analogous to the treatment that typed
> literals receive in XML. It's up to the application (possibly an
> API) to interpret the data. We don't have
> 
>    <typedDataLiteral>
>       10
>    </typedDataLiteral>
> 
> which is just presumed to be an xsd:integer. I.e. XML itself does not
> have built in data types. Why should RDF?

Because RDF has a nobler goal than XML --- meaning.  XML can get away
without providing a display meaning for <p> because XML does not inhabit
the space where display meanings are defined.  All XML does is provide a
syntax so that applications that want to pass documents with display
meaning can do this more easily.  However, if you want to have a datatyping
scheme for RDF you need to know enough about the scheme to provide an
RDF meaning for datatyping constructs, one that is sufficient to determine
what RDF entailment is on these constructs.

And, by the way, XML did make changes to <p>.  Just try to pass an
old-style HTML document through an XML parser and see how many errors you
get.
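
This is easy to try.  The sketch below uses Python's standard
xml.etree.ElementTree parser; the two sample documents are invented for
illustration, and the exact error reported depends on the underlying parser.

```python
import xml.etree.ElementTree as ET

# Old-style HTML: <p> elements are opened but never closed.
OLD_HTML = "<html><body><p>first paragraph<p>second paragraph</body></html>"

# The same content with every <p> explicitly closed, as XML requires.
WELL_FORMED = (
    "<html><body><p>first paragraph</p><p>second paragraph</p></body></html>"
)


def is_well_formed(text):
    """Return True if the XML parser accepts the text, False otherwise."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


print(is_well_formed(OLD_HTML))     # prints False
print(is_well_formed(WELL_FORMED))  # prints True
```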


> > I really do not understand how it is possible to include in RDF
> > something whose meaning is explicitly not given in RDF.  How can this
> > generate a consistent definition of RDF?
> 
> The meaning will be explicit and consistent. It is similar to an
> expression that is not evaluated, but which fully and unambiguously
> denotes the value. Thus "2+5" is not actually the value '7' but
> it unambiguously denotes that value. Likewise (Literal,DataType)
> unambiguously denotes a value in the value space of a data type.
> Where and when that "expression" is evaluated to achieve the value
> is up to the application layers above the RDF graph.

If "2+5" denotes '7' then RDF needs to know it so that it can determine
that 

	John age "2+5" .

entails

	John age "7" .
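
As a small Python illustration, assume a hypothetical datatype whose lexical
forms are sums of integers.  The function names below are made up for this
sketch.  Comparing the lexical forms alone cannot establish the entailment;
something has to evaluate them.

```python
def expression_value(lexical):
    """Value denoted by a lexical form such as "2+5" or "7".

    Only '+' between integers is supported; anything else raises ValueError.
    """
    return sum(int(part) for part in lexical.split("+"))


def same_value(a, b):
    """True if two lexical forms denote the same value."""
    return expression_value(a) == expression_value(b)


print("2+5" == "7")            # prints False: the strings differ
print(same_value("2+5", "7"))  # prints True: both denote 7
```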


> Cheers,
> 
> Patrick


Peter F. Patel-Schneider
Bell Labs Research
Received on Wednesday, 12 December 2001 09:35:09 UTC