RE: Literals (Re: model theory for RDF/S) from Patrick.Stickler@nokia.com on 2001-10-08 (www-rdf-logic@w3.org from October 2001)

From: <Patrick.Stickler@nokia.com>
Date: Mon, 8 Oct 2001 11:31:46 +0300
To: phayes@ai.uwf.edu
Cc: www-rdf-logic@w3.org
Message-ID: <2BF0AD29BC31FE46B78877321144043114C01C@trebe003.NOE.Nokia.com>
> I meant being able to tell just by looking at the literal itself 
> whether one had a numeral (in some base) or a string (in some 
> language) or whatever

OK. Now I think I see where/why the disconnect has been happening... 

It all boils down to strong versus weak data typing, and how early
on you catch your errors. (Thanks to Ora's comments about the
development history of RDF, which helped clarify that in my head.)

RDF, by default, assumes weak data typing -- but I am looking for
a means to introduce strong data typing (or at least the explicit
hooks for it) which would allow for more generalized validation of
terminal data values being defined in my knowledge base.

Let's (please) not go into the (often religious) debate over
strong versus weak data typed languages -- but merely look at this
methodology as a means to make RDF-based SW applications "neutral"
with regard to strong versus weak data typing.

Those who prefer weak data typing can just use untyped (or at
best implicitly typed) literal strings, and those who prefer
strong data typing can use e.g. URI encoded typed data values.

It should be noted -- without falling into the weak vs. strong typing
debate per se -- that most eBusiness and data management systems rely
on strong data typing -- and from that perspective, having to go from
what is *known* as e.g. a non-negative integer to a 'string' and trust that
such knowledge will be properly inferred and restored by some SW
agent, or guess about a representation of a unit of measure and hope
some SW agent understands the same encoding, seems IMO to be very sloppy,
careless, and dangerous.
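
To make the "catch your errors early" point concrete, here is a rough
sketch in Python. It is only an illustration: the scheme name
"x-nonnegint" and the function names are made up for this example and
are not part of RDF or of any registered URI scheme. The idea is that
a value which carries its type explicitly can be validated when it
enters the knowledge base, rather than being guessed at later.

```python
# Hypothetical validator for a made-up "x-nonnegint" scheme (illustration only).
def _non_negative_integer(lexical):
    # Accept only ASCII digits, so "-5", "", and "abc" are all rejected.
    if not (lexical.isascii() and lexical.isdigit()):
        raise ValueError(f"not a non-negative integer: {lexical!r}")
    return int(lexical)


# Map of hypothetical scheme names to the functions that validate them.
VALIDATORS = {"x-nonnegint": _non_negative_integer}


def parse_typed_value(uri):
    """Validate a 'scheme:lexical-form' value and return the typed value."""
    scheme, sep, lexical = uri.partition(":")
    if not sep or scheme not in VALIDATORS:
        # An untyped string carries no type information to check against.
        raise ValueError(f"no explicit, known type in {uri!r}")
    return VALIDATORS[scheme](lexical)
```

With this, parse_typed_value("x-nonnegint:25") yields the integer 25,
while "x-nonnegint:-5" is rejected on entry and the bare string "25"
is flagged as carrying no type at all.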

And also, being able to deduce the data type of a literal simply by
its lexical form is unlikely to be feasible for any arbitrary set of 
data types with lexical intersections, and thus cannot be considered
IMO a reasonable approach in contexts where data types matter.
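
A small sketch of what I mean by lexical intersections, in Python. The
three lexical spaces below are simplified assumptions chosen for the
example (they are not taken from any particular data type
specification), and the names are hypothetical. The same literal falls
into several of them and denotes a different value in each, so the
lexical form alone cannot settle the type.

```python
import re

# Simplified, assumed lexical spaces: (pattern, conversion to a value).
LEXICAL_SPACES = {
    "binary integer": (re.compile(r"[01]+"), lambda s: int(s, 2)),
    "decimal integer": (re.compile(r"[0-9]+"), lambda s: int(s, 10)),
    "hexadecimal integer": (re.compile(r"[0-9A-Fa-f]+"), lambda s: int(s, 16)),
}


def interpretations(literal):
    """Return every type whose lexical space contains the literal,
    together with the value the literal would denote under that type."""
    return {
        name: to_value(literal)
        for name, (pattern, to_value) in LEXICAL_SPACES.items()
        if pattern.fullmatch(literal)
    }
```

Here interpretations("10") returns three candidates with the values
2, 10 and 16, and of course "10" is also a perfectly good string.
Only "ff" happens to be unambiguous among these three types.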

> >and what that buys us, in general. I may just be looking
> >to closely...
> >
> >>  >Granted, in order to have "extra" knowledge about the actual
> >>  >data type used, we need to either interpret the URI scheme (which
> >>  >is outside the scope of the MT) or employ mechanisms such as
> >>  >the (now unfortunately deprecated) rdf:aboutEachPrefix, e.g.
> >>
> >>  If aboutEachPrefix was only used in this way it wouldnt be so bad,
> >>  but it got all mixed up with containers.
> >
> >Could not one decouple its troublesome use with containers yet
> >retain it for useful stuff like making statements about all instances
> >of a given URI scheme?
> 
> Yes, though just as a political/pedagogic point it would probably be 
> better to trash it and invent a new name for that use.

Fair enough -- and in fact, I'd rather opt for a mechanism which is
not tied to the parsing of a single instance, but rather provides
global, persistent knowledge about all members of a given class, 
regardless of where/when they are defined. After all, constructs
like aboutEach or aboutEachPrefix must be defined for every 
single serialization -- and thus fail to act as global knowledge about
a class of resources.

Regards,

Patrick

--
Patrick Stickler                      Phone:  +358 3 356 0209
Senior Research Scientist             Mobile: +358 50 483 9453
Nokia Research Center                 Fax:    +358 7180 35409
Visiokatu 1, 33720 Tampere, Finland   Email:  patrick.stickler@nokia.com
Received on Monday, 8 October 2001 04:32:47 UTC