Re: literals and typing

>I've been taking a look at the current MT draft and the proposed 
>changes for datatyping. I've been trying to understand what it might 
>be like to work with from an implementation standpoint. Any MT gurus 
>out there want to take a look at the following dump of my 
>thinking and shoot it full of holes where appropriate?
>
>First, I don't think it presents any great mental hurdle for anyone 
>(including programmers) to understand that the interpretation of a 
>particular literal is dependent upon its context. We all speak 
>languages after all where this is true (the same word often denotes 
>different things with the sense only discernible from context). It's 
>only in the land of the uri that this is in theory not true (by 
>definition). That said, I initially thought it would be painful for 
>a reasoning system used in real world applications not to have local 
>datatyping. After some playing around, I realized that it might very 
>well be painful for a _reasoning system_ but not so much of a 
>problem if _rdf_ didn't.
>
>I'm going to use rdfql below (the sql-like query language used in 
>RDF Gateway). A quick primer: {} surround triples, [] surround 
>uris, single quotes('') surround literals, triple order is PSO.
>
>Today, if we run:
>     insert {[age] [fido] '9'} into xxx
>followed by
>     select ?a using xxx where {[age] [fido] ?a}
>the query returns a value for ?a of 9, type literal
>
>I would expect that regardless of further datatype information this 
>would always be the response.
>
>But given:
>     {[rdfs:range] [age] [xsd:integer]}
>and
>     infer {[typedValue] ?l ?v} from {[rdf:type] ?l [xsd:integer]} 
>          and ?v=int(?l)
>
>where the function int() would be provided by the reasoning system 
>to convert decimally encoded literals to integers.
>
>I would expect that
>     select ?v using xxx where {[age] [fido] ?a} and {[typedValue] ?a ?v}
>would return a value for ?v of 9, type integer
>
>Probably would be clearer if rdfql had a function definition syntax 
>so we could say
>     ?v=typedValue(?l) where {[rdf:type] ?l [xsd:integer]} and ?v=int(?l)
>and so
>     select ?v using xxx where {[age] [fido] ?a} and ?v=typedValue(?a)
>just to take the conversion from literals to typed values firmly out 
>of the rdf triple space
>
>
>It seems that it is the job of the reasoning system to actually 
>extract the literal value from the literal

Well....not exactly. I see the job of the RDF reasoning system being 
to figure out what datatype scheme is supposed to be associated with 
the literal (if that needs figuring out, cf. the options below), and 
to check the result for consistency (e.g. if the literal is said to 
have one datatype but the range of the property is a different, 
incompatible type, then something is wrong). But actually figuring 
out the 'real value' of the literal is something else's job: 
something that has access to the innards of the datatyping scheme. 
For example, I wouldn't expect an RDF engine to know all the details 
of XML datatyping, only to be able to consistently record a datatype, 
infer it where necessary, and deliver the results to something that 
does know the details. As far as RDF is concerned, 'xsd:integer' is 
just a URI that gets handed to some engine that knows about 
XML Schema, maybe along with a literal that is known to be of that 
type.
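
A minimal sketch of that division of labour, in Python rather than 
rdfql. Everything named here (Literal, DATATYPE_HANDLERS, 
infer_datatype, typed_value) is hypothetical and not part of any RDF 
draft or toolkit. The 'RDF side' only records and infers a datatype 
URI and checks consistency; the handler table stands in for the 
separate engine that knows the innards of the datatyping scheme. 
Treating any two distinct datatype URIs as incompatible is a 
simplifying assumption.

```python
from dataclasses import dataclass

RDFS_RANGE = "rdfs:range"


@dataclass(frozen=True)
class Literal:
    """A literal is just a string label; it carries no value of its own."""
    label: str


# Stand-in for the external engine that understands each datatype scheme.
# To the RDF side, "xsd:integer" is only a key used for the hand-off.
DATATYPE_HANDLERS = {"xsd:integer": int}


def infer_datatype(triples, prop, declared=None):
    """Return the datatype URI associated with prop, or None if unknown.

    Collects rdfs:range statements about prop, plus an optional datatype
    declared directly for the literal, and raises if they disagree.
    Assumption: two different URIs always count as incompatible.
    """
    candidates = {o for (s, p, o) in triples if s == prop and p == RDFS_RANGE}
    if declared is not None:
        candidates.add(declared)
    if len(candidates) > 1:
        raise ValueError(f"inconsistent datatypes for {prop}: {sorted(candidates)}")
    return next(iter(candidates), None)


def typed_value(triples, subject, prop):
    """Find the literal, work out its datatype URI, then delegate parsing."""
    for (s, p, o) in triples:
        if s == subject and p == prop and isinstance(o, Literal):
            datatype = infer_datatype(triples, prop)
            if datatype is None:
                return o.label  # no datatype known: stays a plain label
            return DATATYPE_HANDLERS[datatype](o.label)
    return None


# Triples in S P O order.
triples = [
    ("age", RDFS_RANGE, "xsd:integer"),
    ("fido", "age", Literal("9")),
]
print(typed_value(triples, "fido", "age"))  # 9 (an int)
```

Without the range statement the same query just hands back the label 
'9', which matches the expectation that the plain-literal answer never 
changes.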

>(though RDF describes the rules - expressed in type and range 
>relationships - by which this can be done). And it doesn't appear 
>that it would be difficult to implement in practice.
>
>One difficulty might be if you wanted to express in RDF equivalence 
>conditions based upon literals (i.e. two names denote the same 
>person if they are associated with the same email address through 
>property foaf:mbox). But I guess that's just the situation - either 
>RDF has knowledge of built-in intrinsic datatypes or it doesn't. 
>(and RDF doesn't do equivalence so...)
>
>Thoughts?

I think I agree with most of the above, modulo the above comment. (I 
presume that you are taking 'literal' to be synonymous with 'string', 
i.e. a kind of label, a syntactic entity, right?) However, I think the 
current Big Issue over how best to do datatyping is really about how 
'far apart' the literal and its datatyping information are allowed to 
be. The chief proposals on the table are:

X: very close indeed, in fact part of the same label, so a 'literal' 
looks like "xsd:integer:10" (Patrick Stickler)

S: pretty damn close, in that the datatype links a bNode representing 
the value of the literal to the literal itself, i.e. one writes (sorry, 
I use N-Triples S P O ordering)
aaa eg:prop _:x .
_:x xsd:integer "10" . (Sergey Melnik)

P: arbitrarily far away, provided that RDFS can make the connection, 
eg by specifying a property range to be a datatype.  (Me and Peter 
Patel-Schneider).

Only the last of these requires the rather elaborate extensions to 
the model theory that are being contemplated, and the RDF Core WG is still 
debating the options, which is why the new version of the MT is not 
yet released.
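
To make the three placements concrete, here is a rough Python sketch 
that recovers the same (datatype, label) pair from each encoding. The 
Lit class and the resolve_x / resolve_s / resolve_p functions are 
made-up names for illustration only; the P example borrows rdfs:range 
on eg:prop as the connecting statement, and the X resolver assumes the 
label splits at its last colon.

```python
from typing import NamedTuple


class Lit(NamedTuple):
    """Marks an object as a literal label rather than a URI or bNode."""
    label: str


def resolve_x(triples, s, p):
    """X: the datatype is part of the label itself, e.g. "xsd:integer:10"."""
    for (s2, p2, o) in triples:
        if s2 == s and p2 == p and isinstance(o, Lit):
            datatype, _, label = o.label.rpartition(":")
            return datatype, label
    return None


def resolve_s(triples, s, p):
    """S: the property points at a bNode; a datatype arc links it to the literal."""
    for (s2, p2, node) in triples:
        if s2 == s and p2 == p:
            for (s3, datatype, o) in triples:
                if s3 == node and isinstance(o, Lit):
                    return datatype, o.label
    return None


def resolve_p(triples, s, p):
    """P: the datatype sits elsewhere, here as an rdfs:range on the property."""
    ranges = [o for (s2, p2, o) in triples if s2 == p and p2 == "rdfs:range"]
    for (s2, p2, o) in triples:
        if s2 == s and p2 == p and isinstance(o, Lit) and ranges:
            return ranges[0], o.label
    return None


# The same fact written three ways, in S P O order.
x_triples = [("aaa", "eg:prop", Lit("xsd:integer:10"))]
s_triples = [("aaa", "eg:prop", "_:x"), ("_:x", "xsd:integer", Lit("10"))]
p_triples = [("aaa", "eg:prop", Lit("10")), ("eg:prop", "rdfs:range", "xsd:integer")]

for resolve, triples in [(resolve_x, x_triples), (resolve_s, s_triples), (resolve_p, p_triples)]:
    print(resolve(triples, "aaa", "eg:prop"))  # ('xsd:integer', '10') each time
```

Note that only resolve_p has to look at a statement that does not 
mention the literal or its subject at all, which is where the extra 
model-theoretic machinery comes in.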

BTW, you are right that RDF doesn't do equivalence. Let us be 
thankful for small mercies.

Pat


-- 
---------------------------------------------------------------------
IHMC					(850)434 8903   home
40 South Alcaniz St.			(850)202 4416   office
Pensacola,  FL 32501			(850)202 4440   fax
phayes@ai.uwf.edu 
http://www.coginst.uwf.edu/~phayes

Received on Thursday, 8 November 2001 23:05:05 UTC