Re: Syntax-level typing (was Re: A data typing proposal) from Patrick Stickler on 2002-08-06 (w3c-rdfcore-wg@w3.org from August 2002)

From: Patrick Stickler <patrick.stickler@nokia.com>
Date: Tue, 06 Aug 2002 11:31:51 +0300
To: ext Sergey Melnik <melnik@db.stanford.edu>
CC: "R.V.Guha" <guha@guha.com>, RDF Core <w3c-rdfcore-wg@w3.org>
Message-ID: <B9756427.199AA%patrick.stickler@nokia.com>
On 2002-08-06 10:53, "ext Sergey Melnik" <melnik@db.stanford.edu> wrote:

> Patrick Stickler wrote:
> 
>> On 2002-08-02 14:28, "ext Sergey Melnik" <melnik@db.stanford.edu> wrote:
>> 
>> 
>>> R.V.Guha wrote:
>>> 
>>> 
>>>> ...
>>>> The simplest thing I can think of is to say that the literal always
>>>> denotes the string, unless there is an explicit xsd attribute which
>>>> specifies some other data type. Life just becomes so much simpler ...
>>>> 
>>> Let me elaborate a bit on the above. If what comes below is not what
>>> Guha had in mind, I apologize; call it syntax-level typing, anyway.
>>> 
>>> The simplest thing to do might be to make the primitive XSD datatypes
>>> part of RDF abstract syntax and tackle an extensible generic typing
>>> scheme later on (in WebOnt or RDF 2.0).
>>> 
>>> In essence, we could assume that typed values can be referred to
>>> directly in the graph, without using their lexical forms. So, we simply have
>>> 
>>> Jenny --age--> (int)5
>>> 
>>> where (int)5 is a literal, just like "5" is another one. URIs like
>>> xsd:integer denote the class of integers (as defined in XSD), so that
>>> 
>>> age --rdfs:range--> xsd:integer
>>> 
>>> has the expected effect.
>>> 
>>> Typed literals as used above would be opaque to RDF; their
>>> interpretation would be fixed. An extended serialization syntax needs to be
>>> used to distinguish (int)5 from "5". For RDF/XML we could simply use the
>>> XSD syntax, e.g.:
>>> 
>>> <age xsi:type="xsd:integer">5</age>
>>> 
>>> It would be the task of the parser to look at the xsi:type declaration
>>> and generate the correct triples. Other RDF syntaxes (e.g., NTriples)
>>> would have to design their own means of encoding typed values.
>>> 
>>> All idioms that we've been discussing go away. Later on, other ways of
>>> referring to typed literals (e.g., using our idioms or URI-schemes) can
>>> be developed along with an extensible type system for RDF, which would
>>> allow defining derived types etc.
>>> 
>>> The syntax-level typing sketched above does not require (but of course,
>>> can live with) untidiness. In fact, typed literals like (int)5 can be
>>> mapped directly to say Java built-in types.
>>> 
>>> Sergey
>>> 
>> 
>> Well, ahem, this was basically the URV idiom that I tossed on the table
>> nearly a year ago.
>> 
>> And it requires no changes to RDF whatsoever. Just use a URI to denote
>> the typed literal which denotes the value in question. Done.
>> 
>> Cf. http://ietf.org/internet-drafts/draft-pstickler-val-01.txt
> 
> 
> Right, it is very close to the URV typing, except that typed values are not
> required to have URIs (pro: no URIs have to be standardized (yet), con:
> concrete syntaxes need to be adjusted).

Well, you really will end up with the equivalent of URIs, even if you
adjust the concrete syntax, because you'll need globally unambiguous
identifiers for the datatypes (I'm presuming that RDF won't have any
built in types such as 'int' in your examples) so whether you write

   (http://www.w3.org/2001/XMLSchema#integer)10

or

   val:(http://www.w3.org/2001/XMLSchema%23integer)10

is not much of a difference. And the latter requires no mods to RDF.
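
To make the comparison concrete, here is a minimal Python sketch of
building and taking apart a val: URI. It is inferred only from the
examples in this message, not from the val: draft itself. The names
make_val_uri and parse_val_uri are hypothetical, and percent-encoding
the lexical form (not just the '#' in the datatype URI) is an
assumption.

```python
import re
from urllib.parse import quote, unquote

# Shape assumed from the examples above: val:(<escaped datatype URI>)<lexical form>
_VAL_PATTERN = re.compile(r"^val:\(([^)]*)\)(.*)$", re.DOTALL)


def make_val_uri(datatype_uri: str, lexical: str) -> str:
    """Build a val: URI; '#' in the datatype URI becomes %23."""
    # Keep ':' and '/' readable; everything else reserved is percent-encoded.
    escaped_type = quote(datatype_uri, safe=":/")
    # Assumption: the lexical form is percent-encoded as well.
    escaped_lexical = quote(lexical, safe="")
    return f"val:({escaped_type}){escaped_lexical}"


def parse_val_uri(uri: str) -> tuple[str, str]:
    """Return (datatype URI, lexical form) recovered from a val: URI."""
    match = _VAL_PATTERN.match(uri)
    if match is None:
        raise ValueError(f"not a val: URI: {uri!r}")
    return unquote(match.group(1)), unquote(match.group(2))


XSD_INTEGER = "http://www.w3.org/2001/XMLSchema#integer"
uri = make_val_uri(XSD_INTEGER, "10")
print(uri)  # val:(http://www.w3.org/2001/XMLSchema%23integer)10
print(parse_val_uri(uri))
```

Since the result is an ordinary URI string, it can sit in the object
(or subject) position of a triple without any parser changes.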

If you're just using non-URI strings as the types, i.e. (SSS)LLL where
there can be ambiguity about what SSS denotes, then that just
transfers all the issues of untidy/ambiguous semantics of literals
to the type names themselves. I hardly see how that is beneficial.

And whether one allows syntactic sugar to permit qnames rather
than URIs is a secondary issue. E.g. one could define for N3
that 

   (xsd:integer)10

is just shorthand for, and corresponds to, the URI

   val:(http://www.w3.org/2001/XMLSchema%23integer)10

etc.
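
A rough Python sketch of that qname sugar, under stated assumptions:
the prefix table, the function name expand_sugar, and the exact
"(prefix:local)lexical" pattern are all hypothetical choices made for
illustration, not anything defined by N3 or by the val: draft.

```python
import re

# Hypothetical prefix table; a real parser would take this from the document.
PREFIXES = {"xsd": "http://www.w3.org/2001/XMLSchema#"}

# Matches the sugared form "(prefix:local)lexical".
_SUGAR_PATTERN = re.compile(r"^\(([A-Za-z_][\w.-]*):([\w.-]+)\)(.*)$", re.DOTALL)


def expand_sugar(term: str, prefixes: dict[str, str] = PREFIXES) -> str:
    """Expand "(xsd:integer)10" into the corresponding val: URI."""
    match = _SUGAR_PATTERN.match(term)
    if match is None:
        raise ValueError(f"not a sugared typed literal: {term!r}")
    prefix, local, lexical = match.groups()
    if prefix not in prefixes:
        raise KeyError(f"unknown prefix: {prefix!r}")
    datatype_uri = prefixes[prefix] + local
    # '#' must be escaped so it is not read as a fragment delimiter.
    return "val:(" + datatype_uri.replace("#", "%23") + ")" + lexical


print(expand_sugar("(xsd:integer)10"))
# val:(http://www.w3.org/2001/XMLSchema%23integer)10
```

The expansion is purely textual, so it can live in the parser and leave
the graph model untouched.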

Likewise, one may also define (if they choose to, either via
an included mechanism in RDF datatyping or as a local convention)
that the following

   :s :p "LLL" .
   :p rdfs:range <DDD> .
   <DDD> rdf:type rdfs:Datatype .

entails

   :s :p <val:(DDD)LLL> .

etc.

Which of course, err, ahem, brings us all the way back to my
original proposal to the WG way back when...  ;-)
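
The range-based entailment described above can be sketched in Python
as follows. This is only an illustration under assumptions: triples
are modeled as plain tuples, the Literal class and the function name
entail_val_triples are hypothetical, and the sketch covers just this
one rule, not RDF entailment in general.

```python
from dataclasses import dataclass

RDF_TYPE = "rdf:type"
RDFS_RANGE = "rdfs:range"
RDFS_DATATYPE = "rdfs:Datatype"


@dataclass(frozen=True)
class Literal:
    """Hypothetical stand-in for an untyped literal "LLL"."""
    lexical: str


def entail_val_triples(triples):
    """Derive (:s :p <val:(DDD)LLL>) for each literal-valued triple
    whose property has a declared datatype DDD as its rdfs:range."""
    # Everything declared to be an rdfs:Datatype.
    datatypes = {s for (s, p, o) in triples
                 if p == RDF_TYPE and o == RDFS_DATATYPE}
    # Map each property to the datatypes named as its range.
    ranges = {}
    for s, p, o in triples:
        if p == RDFS_RANGE and o in datatypes:
            ranges.setdefault(s, set()).add(o)
    derived = set()
    for s, p, o in triples:
        if isinstance(o, Literal):
            for datatype in ranges.get(p, ()):
                escaped = datatype.replace("#", "%23")
                derived.add((s, p, "val:(" + escaped + ")" + o.lexical))
    return derived


XSD_INTEGER = "http://www.w3.org/2001/XMLSchema#integer"
graph = {
    (":s", ":p", Literal("10")),
    (":p", RDFS_RANGE, XSD_INTEGER),
    (XSD_INTEGER, RDF_TYPE, RDFS_DATATYPE),
}
print(entail_val_triples(graph))
```

Without the rdf:type rdfs:Datatype statement nothing is derived, which
mirrors the three premises listed above.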

Now, if folks want to avoid a new URI scheme and add the URV
mechanism into the syntax proper of RDF, I'm all for that, but
that seems just as large a change to RDF syntax as allowing literals
as subjects, and since one can accomplish the same result with
a val: URI or similar, the latter seems much more workable a
solution for the time being.

Cheers,

Patrick

--
               
Patrick Stickler              Phone: +358 50 483 9453
Senior Research Scientist     Fax:   +358 7180 35409
Nokia Research Center         Email: patrick.stickler@nokia.com
Received on Tuesday, 6 August 2002 04:31:57 UTC