datatypes [was Re: Argh!] from Peter F. Patel-Schneider on 2001-10-29 (w3c-rdfcore-wg@w3.org from October 2001)

From: Peter F. Patel-Schneider <pfps@research.bell-labs.com>
Date: Mon, 29 Oct 2001 15:07:36 -0500
To: melnik@db.stanford.edu
Cc: phayes@ai.uwf.edu, w3c-rdfcore-wg@w3.org
Message-Id: <20011029150736H.pfps@research.bell-labs.com>
From: Sergey Melnik <melnik@db.stanford.edu>
Subject: Re: Argh!
Date: Mon, 29 Oct 2001 11:49:17 -0800

> "Peter F. Patel-Schneider" wrote:
> > 
> > > You are right, according to the DAML/OIL schema, "2" is supposed to be a
> > > nonNegativeInteger. Following Pat's proposal, "2" could denote many
> > > different things like numbers, shoe sizes and weights in pounds
> > > depending on the context. In my opinion, this ambiguity is
> > > counterproductive and is a heavy burden for interoperability.
> > 
> > Arghhhh!  Pat's proposal does not introduce any extra ambiguity.  Pat's
> > proposal is quite compatible to the DAML+OIL way of doing things.  Pat's
> > proposal produces a single denotation for literals that are the object of
> > DAML+OIL cardinality properties.
> > 
> > Please, please, please, if you are going to argue against Pat's proposal
> > get the facts right!
> 
> I'd be glad to and I'm sorry if I don't. Of course, you are right in
> that the immediate interpretation (by means of 'I') of each literal
> symbol is some uniquely determined entity in DD. Then, in your/Pat's
> suggestion,  some other mapping (IDT, if I remember right) assigns each
> of these uniquely determined entities a set of other entities. So what's
> the difference? Effectively, a literal symbol is mapped to a *set* of
> entities in DD by a composition of mappings.

Close (in my model theory a literal is mapped to a set of entities by a
union of mappings, Pat's model theory may be slightly different), but that
is no more ``ambiguous'' than a blank-node proposal.  The ambiguity is just
in a different place.

> Please prove me wrong - maybe that's just a big misunderstanding. I
> brought up several examples in
> http://lists.w3.org/Archives/Public/w3c-rdfcore-wg/2001Oct/0530.html and
> illustrated graphically my understanding of how they should be
> interpreted. I'm still waiting for Pat's output of MT DT section, in
> which he will presumably provide an illustration of an alternative
> approach. How would the examples I referred to look like?

Unfortunately, all the examples in your message use a different way of
doing literals than Pat's/my way.  My way, in particular, does not allow
types to be directly associated with literals; you have to use range
restrictions instead.  (Modifications of my model theory could have closer
typing, such as via

	<foo xsi:type="xsd:int">570</foo>
or
	<foo xsi:type="xsd:float">0570</foo>

but this would NOT be handled by an xsi:type link in the RDF graph.) 

> > > If literals may denote everything you like (and many things at once), I
> > > don't see why we need resources/URIs any more. We could do just fine
> > > with literals. For example, literal "Peter" could denote a person,
> > > sometimes Peter Pan, another time Peter The Great (even in the same
> > > graph!). Literal "2" in the above example could well denote Peter The
> > > Great, too.
> > 
> > This is soo wrong that I don't know how any reasonable person could even
> > think of it.  Literals are constrained in their interpretation.
> > Non-literals are less constrained than literals.  The only differences
> > between literals and non-literals is that a literal has a ``print-string''
> > that is used to restrict what it can denote and a non-literal can have a
> > label, which has no real import in a tidy RDF graph.
> 
> So? Literals are constrained by what? By properties defined in some
> schemas? So why can't "Peter" denote Peter Pan (or Patel-Schneider,
> given the context ;) ?

[Warning:  In the following, I will be a bit loose in terminology and
syntax.  No ambiguities should result, however!]

My model theory states that a literal "Peter", as in 

	peter name "Peter".

can only denote something that has an XML Schema datatype lexical
representation that is the literal "Peter".  

If Peter Pan is not in the value space of any XML Schema datatype then it
cannot be the denotation of a literal.  peter, of course, could denote
Peter Pan.  Only if you were using a datatype scheme that had Peter Pan as
one of its literal values, and a lexical representation of that literal
value was "Peter", could you have "Peter" denote Peter Pan.

Further, if you have 

	name rdfs:range xsd:string .
	peter name "Peter" .

then the name of peter has to be the string ``Peter''.

More interesting is the following situation:

	peter age "07" .
	age rdfs:range xsd:integer .

	susan shoe-size "07" .
	shoe-size rdfs:range xsd:string .

Then, indeed, the two occurrences of "07" have different denotations, one
being the integer 7 and the other being the string ``07''.  Similarly, if
all you have is

	mary phone "5824471" .

then you don't know whether mary's phone is the decimal 5 824 471 or the
string ``5824471'' or even the floating point number 5824471 (which is
different, in XML schema, from the decimal 5 824 471).  There are even
other interpretations for mary's phone (such as a URI, I think).
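
The behaviour described above can be sketched roughly in Python. This is
only an illustration under strong assumptions: the DATATYPES table, the
denotations function and the use of Python's built-in parsers as stand-ins
for XML Schema lexical-to-value mappings are all hypothetical and far
looser than the real XML Schema rules. With a range restriction, a literal
gets one denotation; without one, it gets the union over every datatype
whose lexical space contains it.

```python
from decimal import Decimal, InvalidOperation

# Hypothetical stand-ins for XML Schema lexical-to-value mappings.
# Python's parsers accept more (and less) than the real lexical spaces.
DATATYPES = {
    "xsd:string": str,
    "xsd:integer": int,
    "xsd:decimal": Decimal,
    "xsd:float": float,
}

def denotations(lexical, range_type=None):
    """Return the set of (datatype, value) pairs a literal could denote.

    If range_type is given (as with an rdfs:range restriction), only that
    datatype is tried. Otherwise the result is the union over all known
    datatypes. Values are tagged with the datatype name so that the float
    5824471 stays distinct from the decimal 5824471; this also separates
    xsd:integer from xsd:decimal, which is a simplification.
    """
    candidates = [range_type] if range_type else list(DATATYPES)
    result = set()
    for name in candidates:
        try:
            result.add((name, DATATYPES[name](lexical)))
        except (ValueError, InvalidOperation):
            # The string is not in this datatype's lexical space.
            pass
    return result

# peter age "07" with age rdfs:range xsd:integer
age = denotations("07", "xsd:integer")
# susan shoe-size "07" with shoe-size rdfs:range xsd:string
shoe_size = denotations("07", "xsd:string")
# mary phone "5824471" with no range restriction at all
phone = denotations("5824471")
# "Peter" is only in the lexical space of xsd:string in this table
peter = denotations("Peter")
```

In this sketch "Peter" can denote only the string, because no other
datatype in the table has "Peter" as a lexical representation.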

> > If it makes you feel better, you could use a different (but equivalent)
> > graph where a literal maps into two nodes.  One node, corresponding to the
> > node in Pat's RDF graphs, would be *like* a blank node.  The other node
> > would be the ``print-string'' of the first node.  The model theory would
> > then constrain the interpretation of the first kind of node by having a
> > built-in interpretation of ``print-string''.
> 
> The last sentence stops me from feeling better. If you change it to "The
> property that connects the blank node and the ``print-string'' node
> constrains the interpretation of the blank node", I would totally
> subscribe under the above paragraph.

In my paragraph above ``print-string'' is the relationship between the two
nodes.  I didn't want to call it a property.

> > Why use Pat's RDF graphs instead of these graphs?  Two reasons:
> > 1/ Pat's RDF graphs are closer to RDF M&S.
> 
> Graphs are graphs, and are as close to graphs, as any other graphs. I
> don't buy that Pat's *interpretation* of those graphs is closer to M&S -
> I believe the opposite.

How can you say this?

The first example in M&S is

	http://www.w3.org/Home/Lassila Creator "Ora Lassila" .

Further, I see many RDF ``documents'' that have things like

	john name "John" .
	john age "07" .

In both cases the literal is the object of the relationship and no other
node is required.

Requiring a blank literal value node and a ``print-string'' between the
literal value node and its lexical form would go against both M&S and
common RDF practice.

> > 2/ The ``print-string'' edge would be subject to lots of misunderstandings.
> 
> This is where I vehemently disagree. The edge determines the
> interpretation of the blank node, it represents a mapping (possibly
> partial) between a lexical space and a value space. XML Schema defines
> such mappings in a quite precise fashion. IMO, the edge is what makes
> datatyping work!

If you employ the ``print-string'' edge, you have to disallow or provide
interpretations for several constructions that use it, including
1/ multiple print-string edges from the same literal value node
2/ print-string edges from non-literal nodes
Both of these possibilities give rise to potential misunderstanding.
The details can be worked out, and the misunderstandings can be reduced by
suitable wording in the new RDF documents, but why bother when the
``print-string'' edge is not necessary?  
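
To make the two problem cases concrete, here is a minimal sketch in Python
of a check that a graph using such an edge would need. Everything in it is
an assumption for illustration: triples are plain tuples of strings, the
edge is labelled "print-string", and blank nodes are recognised by a "_:"
prefix. It is not part of any RDF specification or proposal.

```python
from collections import defaultdict

PRINT_STRING = "print-string"  # hypothetical edge label

def check_print_string_edges(triples):
    """Find the two problematic constructions in a list of triples.

    Returns (multi, non_blank):
      multi     - subjects with more than one distinct print-string
      non_blank - print-string subjects that are not blank nodes
    """
    lexical_forms = defaultdict(set)
    for subject, predicate, obj in triples:
        if predicate == PRINT_STRING:
            lexical_forms[subject].add(obj)
    multi = {s for s, forms in lexical_forms.items() if len(forms) > 1}
    non_blank = {s for s in lexical_forms if not s.startswith("_:")}
    return multi, non_blank

triples = [
    ("john", "age", "_:v1"),
    ("_:v1", PRINT_STRING, "07"),
    ("_:v1", PRINT_STRING, "7"),     # case 1: two print-strings, one node
    ("john", PRINT_STRING, "John"),  # case 2: print-string on a labelled node
]
multi, non_blank = check_print_string_edges(triples)
```

Detecting the cases is the easy part; the sketch says nothing about what
interpretation, if any, they should receive.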

> > Peter F. Patel-Schneider
> 
> Sergey

peter
Received on Monday, 29 October 2001 15:08:13 UTC