RE: Literals (Re: model theory for RDF/S) from Patrick.Stickler@nokia.com on 2001-10-02 (www-rdf-logic@w3.org from October 2001)

From: <Patrick.Stickler@nokia.com>
Date: Tue, 2 Oct 2001 16:33:02 +0300
To: pfps@research.bell-labs.com
Cc: drew.mcdermott@yale.edu, www-rdf-logic@w3.org
Message-ID: <2BF0AD29BC31FE46B788773211440431621539@trebe003.NOE.Nokia.com>
> -----Original Message-----
> From: ext Peter F. Patel-Schneider 
> [mailto:pfps@research.bell-labs.com]
> Sent: 02 October, 2001 15:41
> To: Stickler Patrick (NRC/Tampere)
> Cc: drew.mcdermott@yale.edu; www-rdf-logic@w3.org
> Subject: RE: Literals (Re: model theory for RDF/S)
> > It is true that int:5 and int:05 would technically constitute
> > different URIs and hence different resources, but that's how
> > RDF does things, eh? Different URI, different resource. I'm
> > sure we don't want to shift that foundational pillar...  ;-)
> 
> If RDF treats int:5 and int:05 differently then it doesn't understand
> integers.

Correct. 

RDF does not understand integers. It doesn't even understand URIs! 

It no more needs to understand the difference between int:5 and
int:0005 than it does the difference between "5" and "00005".

> > Insofar as a generalized, consistent, global representation
> > for a given data type, though, one would expect that there would
> > be constraints defined which prohibit semantically vacuous variant
> > forms, such as above. So yes, you bring up a very valid requirement
> > for e.g. an int: scheme, that we wouldn't get int:00000000005, etc.
> > but that's an issue for the particular scheme, not the methodology of
> > URI encoded literals itself, I think (apart from specifying it as
> > an expected quality of every such scheme to not have semantically
> > vacuous variant forms).
> 
> So then you need special-purpose parsers for each scheme. More than that,
> you need at most one literal for each value in the datatype, which is
> certainly not the normal way of doing things.

Stay tuned. Generalized solution on the way... (I'd rather get the full
doc done than toss out fragments that may be misunderstood, etc.)
 
> > And on a practical level, one would not necessarily expect URI
> > encoded literals to act as the subject of statements, or to
> > serve as indirect identifiers of other resources, even if technically
> > they could be coerced to do so (and regardless of whether they
> > were guaranteed to be free of semantically vacuous variant forms).
> 
> Even if they are just objects of triples, you can run into problems.
> 
> Consider
> 
> #Susan #favorite-integer int:05 .
> #Susan #favorite-integer int:5 .
> 
> how is an RDF query system supposed to respond when asked about Susan's
> favorite integers?

Good point. But again, that has nothing to do with the proposed encoding
of literals as URIs. The very same problem exists with

 #Susan #favorite-integer "05" .
 #Susan #favorite-integer "5" .
 
Right?
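
To make the parallel concrete, here is a minimal Python sketch. It assumes a
toy triple store that compares every node as an opaque string; the
`objects_of` helper and the set-of-tuples store are illustrative assumptions,
not any RDF API, and int: is the hypothetical scheme under discussion. Under
those assumptions both encodings give the query two distinct answers:

```python
# Toy triple store: every node is compared as an opaque string.
# The int: scheme is the hypothetical one discussed in this thread.
uri_triples = {
    ("#Susan", "#favorite-integer", "int:05"),
    ("#Susan", "#favorite-integer", "int:5"),
}
literal_triples = {
    ("#Susan", "#favorite-integer", '"05"'),
    ("#Susan", "#favorite-integer", '"5"'),
}


def objects_of(triples, subject, predicate):
    """Return the sorted objects of all triples matching subject and predicate."""
    return sorted(o for s, p, o in triples if s == subject and p == predicate)


uri_answers = objects_of(uri_triples, "#Susan", "#favorite-integer")
literal_answers = objects_of(literal_triples, "#Susan", "#favorite-integer")

# Neither store knows that the two spellings denote the same integer,
# so each query reports two favorite integers.
print(uri_answers)
print(literal_answers)
```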

In any case, with literals as URIs, one can enforce that e.g. "05" is
not legal and thus achieve a more robust system (even if the only
action is to signal an error and/or disregard the statement).
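
A rough sketch of what such enforcement might look like, in Python. The
canonical form chosen here (a lone "0", or an optional minus sign followed by
digits with no leading zero) is an assumption made for illustration, as are
the names `CANONICAL_INT`, `is_canonical` and `load`; no actual int: scheme
definition is implied.

```python
import re

# Assumed canonical form for the hypothetical int: scheme:
# "0", or an optional minus sign followed by digits with no leading zero.
CANONICAL_INT = re.compile(r"int:(0|-?[1-9][0-9]*)")


def is_canonical(uri):
    """True if the URI matches the assumed canonical int: form."""
    return CANONICAL_INT.fullmatch(uri) is not None


def load(triples):
    """Split triples into accepted ones and ones with a non-canonical int: object."""
    accepted, rejected = [], []
    for triple in triples:
        obj = triple[2]
        if obj.startswith("int:") and not is_canonical(obj):
            rejected.append(triple)  # signal an error and/or disregard
        else:
            accepted.append(triple)
    return accepted, rejected


accepted, rejected = load([
    ("#Susan", "#favorite-integer", "int:05"),
    ("#Susan", "#favorite-integer", "int:5"),
])
print("accepted:", accepted)
print("rejected:", rejected)
```

Note that the check is purely lexical: the loader still never computes with
integers, it only refuses spellings the scheme declares illegal.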

Stay tuned...  (the more I write here the slower I write "there")...

> > URIs to an RDF processor are just opaque, globally unique identifiers.
> > An RDF processor does not, and should not have to understand anything
> > about any URI, insofar as the semantics of the URI or URI scheme themselves
> > are concerned (I stress that last clause of the assertion, re-read as
> > needed).
> 
> Again, if the RDF processor does not understand something about the URI
> scheme then it has not captured anything about the URI scheme.  If these
> strange URIs are not given some sort of theory, and the part of that theory
> that makes a difference to RDF is not followed by an RDF processor, then
> you have not captured the meaning of the URI scheme.  (You may think that
> RDF is so expressively impoverished that it doesn't need to know about any
> part of the theory, but this is just not so.)

I probably both agree and disagree with you here, but simply
based on where lines are drawn and how things are layered.
 
> Not at all.  URLs have the (potentially strange) characteristic that two
> different URLs that map to the same ``place'' are different objects.  There is
> nothing theoretically wrong with this way of looking at URLs.  You may say
> that it is incorrect, and if you convince enough people some W3C
> recommendations may have to change, but this view of URLs is internally
> consistent.

But RDF does not use URLs as URLs, it uses them as globally unique
*opaque* strings. Some higher processing layer interacting with
RDF encoded data may understand URLs as URLs and do something
*special* with those, but that's beyond the scope of RDF proper.
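
The layering can be illustrated with a small Python sketch. The example.org
URLs and the `url_key` helper are made up for illustration; the only library
behaviour relied on is that `urllib.parse.urlsplit` exposes a lowercased
`hostname`. The first comparison is the opaque-string view, the second is what
a URL-aware higher layer might choose to do:

```python
from urllib.parse import urlsplit

# Two illustrative URLs that differ only in the case of the host name.
a = "http://example.org/doc"
b = "http://EXAMPLE.org/doc"

# Opaque-string view: different strings, hence different resources.
same_as_opaque = (a == b)


def url_key(url):
    """Comparison key used by a hypothetical URL-aware layer."""
    parts = urlsplit(url)
    # hostname is lowercased by the standard library.
    return (parts.scheme, parts.hostname, parts.path)


# URL-aware view: the parsed components compare equal.
same_as_url = (url_key(a) == url_key(b))

print(same_as_opaque, same_as_url)
```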

(anyone feel free to jump in here and correct me if I'm wrong)

> ...A DAML+OIL (March 2001) processor has to understand a portion of
> XML Schema, not just the syntax but also the semantics.

Really? I thought it just borrowed the URI defined identity of
the XML Schema data types. I once asked if a DAML parser had
to also include that subset of functionality of XML Schema
for dealing with data types -- particularly user defined
data types, and was told a clear "no". Not that it couldn't,
but it didn't have to.

Maybe things have changed?

Do you know of any DAML system that presently does, or at least plans to?

> It is true that you can make a consistent view of all this from this
> ``RDF'' viewpoint, but you do have to be a bit careful.  In particular, if
> you want to allow RDF to be consistent with different URI schemes, you have
> to modify the "one-URI, one-Resource" philosophy to a "one-URI, possibly
> one-Resource".

Maybe, but that is yet another issue.

> ...it does mean that RDF moves further away from the XML / XML
> Schema way of representing literals and datatypes.

Though I'm all for standardization (heck that's why I
spend so much time working on standards stuff) I'm not
so sure that we want to necessarily adopt XML Schema
data types as the fundamental data type framework for
the entire web. Maybe. Maybe not. But in any case, that
is still yet another issue ;-)

Even if we adopted XML Schema data types, we could benefit
from URI encoded literals. The two are not inter-dependent.
 
> ... but if you stick with this philosophy I don't
> think that you can claim to be representing anything besides uninterpreted
> URIs.

Or rather, it's a question of at what functional layer
you wish to add that "patch" and address the URI equivalence
issue.

> Moreover, there are certain consequences of this approach that need
> to be analyzed.

Most certainly. Blast shields up. Impulse power, slow.

Cheers,

Patrick

--
Patrick Stickler                      Phone:  +358 3 356 0209
Senior Research Scientist             Mobile: +358 50 483 9453
Nokia Research Center                 Fax:    +358 7180 35409
Visiokatu 1, 33720 Tampere, Finland   Email:  patrick.stickler@nokia.com
Received on Tuesday, 2 October 2001 09:33:39 UTC