Re: ISSUE: Malformed literals and non-lexical literals from Pat Hayes on 2006-08-19 (public-rdf-dawg@w3.org from July to September 2006)

From: Pat Hayes <phayes@ihmc.us>
Date: Fri, 18 Aug 2006 18:23:12 -0700
To: Bijan Parsia <bparsia@cs.man.ac.uk>
Cc: RDF Data Access Working Group <public-rdf-dawg@w3.org>
Message-Id: <p0623094fc10c1249d7c2@[192.168.1.6]>
>On Aug 17, 2006, at 8:58 AM, Pat Hayes wrote:
>
>>>I'm breaking this issue out and consolidating it.
>>>
>>>Under RDF, RDFS, and the D variants interpretations, some things 
>>>which are spelled like literals do not denote literals, and 
>>>some things which are not spelled like literals do denote literals.
>>
>>Terminology check. "literal" is a *syntactic* term in RDF. Literal 
>>values are what they normally denote,



>>>	:x where
>>>		:x rdf:type rdf:XMLLiteral. (RDF)
>>>	_:b where
>>>		 _:b rdf:type rdf:XMLLiteral (RDF, but _:b doesn't 
>>>stand for a particular literal)
>>>	:x where
>>>		:x rdf:type rdfs:Literal (RDFS)
>>
>>That is really all you need. The others all entail one of the above.
>
>I didn't follow exactly. You mean these three are all you need?

I meant, all the other cases entail one of these. Actually I was 
wrong, because there has to be a bnode version of the RDFS case also.


>>>
>>>I'll also note that:
>>>	"-5"^^xsd:positiveInteger < "5"^^xsd:positiveInteger
>>>
>>>should be an error, not false (I think).
>>
>>That seems correct, indeed.
>
>So, < is on the value space.

Well, yes; but this is a delicate point. If it's possible to figure 
out the denotation (or, enough about the denotation to compute an 
answer/error) from the terms alone, which in this case it is 
(assuming that we know enough about the datatypes), then we can treat 
value-space tests as one variety of syntax-space tests. That is, we 
can say that the above "really" means

' "-5"^^xsd:positiveInteger ' << ' "5"^^xsd:positiveInteger '

where the << between literals is *defined* in terms of the numerical 
ordering of the corresponding literal values: L1<<L2 just when I(L1) 
numerically-< I(L2) in any xsd-interpretation I.
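
As a rough illustration of this definition (a sketch only, not anything 
from the spec), << can be written as a test on the literal terms that 
computes each value from the lexical form and datatype alone, and 
reports an error when no value exists. The names Literal, value_of, 
literal_less_than and LocalEvaluationError are made up for this example, 
and only three integer datatypes are covered.

```python
import re
from dataclasses import dataclass

XSD = "http://www.w3.org/2001/XMLSchema#"


class LocalEvaluationError(ValueError):
    """No value can be computed for the term from its surface form alone."""


@dataclass(frozen=True)
class Literal:
    lexical: str   # the lexical form, e.g. "-5"
    datatype: str  # the datatype IRI


# Illustrative subset: value-range checks for a few integer datatypes.
_INTEGER_FORM = re.compile(r"[+-]?[0-9]+")
_IN_VALUE_SPACE = {
    XSD + "integer": lambda n: True,
    XSD + "positiveInteger": lambda n: n > 0,
    XSD + "negativeInteger": lambda n: n < 0,
}


def value_of(lit: Literal) -> int:
    """Compute I(lit) from the term alone, or raise if it has no value."""
    in_value_space = _IN_VALUE_SPACE.get(lit.datatype)
    if in_value_space is None:
        raise LocalEvaluationError(f"datatype not handled: {lit.datatype}")
    if not _INTEGER_FORM.fullmatch(lit.lexical):
        raise LocalEvaluationError(f"not an integer form: {lit.lexical!r}")
    n = int(lit.lexical)
    if not in_value_space(n):
        raise LocalEvaluationError(
            f"{lit.lexical!r} is not a legal form for {lit.datatype}"
        )
    return n


def literal_less_than(l1: Literal, l2: Literal) -> bool:
    """L1 << L2: defined via the numerical order of the literal values."""
    return value_of(l1) < value_of(l2)
```

With these definitions, comparing "-5"^^xsd:positiveInteger against 
"5"^^xsd:positiveInteger raises LocalEvaluationError rather than 
returning False, which is the behaviour argued for above.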

Why go through this torture? Because it gives a single framework in 
which the above makes sense but also isBnode(T) makes sense, i.e. we 
can meaningfully apply tests which look at the surface form. This 
also gives us more design options. For example, we could allow 
single-argument numerical tests (nonzero, positive) but not allow 
binary ones (<, as above), without actually being semantically 
incoherent. (I'm not recommending that particular decision, only saying 
we would have the option.)

>Which is fine but then
>
>	:x < "5"^^rdf:positiveInteger
>
>Where in the graph
>	:x rdf:type xsd:negativeInteger.
>
>should be true.

Well, maybe not, since it's impossible in this case to determine the 
truth locally. That is, if I am just presented with _:x and the 
literal, I have no idea what the ordering is. So a bnode isn't << a 
literal under any circumstances. I guess what this amounts to is 
treating the answer bindings as meaningful, but only *in isolation*, 
not as part of a larger graph, for the tests to be able to use 
semantic criteria.
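
A minimal sketch of what "in isolation" could mean in practice, under 
the assumption that the test receives only the two bound terms and 
never the graph. The classes BNode, IRI and IntLiteral and the function 
local_less_than are hypothetical names for this example, and IntLiteral 
is assumed to carry a well-formed integer lexical form.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class BNode:
    label: str


@dataclass(frozen=True)
class IRI:
    value: str


@dataclass(frozen=True)
class IntLiteral:
    lexical: str  # assumed to be a well-formed integer lexical form


Term = Union[BNode, IRI, IntLiteral]


class LocalEvaluationError(Exception):
    """The ordering cannot be decided from the two terms alone."""


def local_less_than(a: Term, b: Term) -> bool:
    # The graph is deliberately not a parameter: type triples such as
    # ":x rdf:type xsd:negativeInteger" cannot influence the result.
    for term in (a, b):
        if not isinstance(term, IntLiteral):
            raise LocalEvaluationError(f"no local value for {term!r}")
    return int(a.lexical) < int(b.lexical)
```

Here local_less_than(BNode("x"), IntLiteral("5")) raises an error 
whatever the graph says about the bnode, while two literals are 
compared by value.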

>>>This goes with the weird case:
>>>	:x rdf:type xsd:positiveInteger.
>>>	:y rdf:type xsd:negativeInteger.
>>>
>>>Let ?x/:x and ?y/:y:
>>>	what does ?x > ?y evaluate to? Even if we think that 
>>>isLiteral is a special case, it's hard to see that the comparisons 
>>>should be (as I said before). Especially as some of them are 
>>>detectable for the lexical form of the binding.
>>
>>Not sure I follow your reasoning here.
>
>In the telecon, before you arrived, it was suggested that :x > :y 
>with the above graph shouldn't return "true". The only other 
>reasonable return would be "error". But then < is only sometimes 
>sensitive to the value (i.e., where the value is derivable from the 
>lexical form

Right, that's what I'm suggesting above. This position seems to me to 
be both coherent and practical.
>). (If we decide on that then it makes the looking back to the graph 
>for redundancy extra strange.)

Well, there are different kinds of redundancy. I think the 
bnode/URIref kind is rather special in RDF, and can legitimately be 
treated as a special case.

>There are obviously several choices to make it all well specified. 
>isLiteral(?x) clearly should be decided by the lexical form (as 
>opposed to ?x rdf:type rdfs:Literal), though it could be weird that 
>{?x rdf:type rdfs:Literal FILTER isLiteral(?x)} returns no answer 
>against :x rdf:type rdfs:Literal. I can live with it, but it needs 
>to be explained.

The RDF WG spent a lot of time on this very point, and indeed the 
class rdfs:Literal has a very unfortunate name, but it was thought to 
be too late to change the RDFS namespace. The documentation in the 
specs does draw attention to this point several times. We should hit 
this button a few more times, indeed.
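
To make the point concrete, here is a toy sketch (not a SPARQL 
implementation) of why the pattern plus FILTER returns nothing against 
that graph. Terms are modelled as tagged tuples, an encoding chosen only 
for this example; is_literal and subjects_of_type are hypothetical 
helper names.

```python
RDF_TYPE = "rdf:type"
RDFS_LITERAL = "rdfs:Literal"

# Terms are tagged tuples: ("iri", name), ("bnode", label)
# or ("literal", lexical form).
graph = {
    (("iri", ":x"), ("iri", RDF_TYPE), ("iri", RDFS_LITERAL)),
}


def is_literal(term):
    """Syntactic test: looks only at the kind of term, not at the graph."""
    return term[0] == "literal"


def subjects_of_type(graph, cls):
    """Bindings for ?x in the pattern { ?x rdf:type cls } by simple matching."""
    return [
        s for (s, p, o) in graph
        if p == ("iri", RDF_TYPE) and o == ("iri", cls)
    ]


# The pattern binds ?x to the IRI :x ...
matches = subjects_of_type(graph, RDFS_LITERAL)
# ... and the filter then rejects it, because :x is not spelled as a literal.
answers = [term for term in matches if is_literal(term)]
```

The graph says that :x denotes something in the class rdfs:Literal, but 
the term :x is an IRI, so matches has one binding and answers is empty.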

>We might want to add value sensitive type functions. Or just leave 
>that for the BGP.
>
>We could make the operators sensitive to the value only, or to the value 
>that can be derived "locally" from the lexical form alone.

We seem to have converged on this notion. I would vote for this 
option. And the spec should LIST them explicitly, in full detail, as 
what exactly 'local' means might be a matter of debate.

>That's probably the biggest decision to make. And my example with 
>malformed -5 and the neg and pos integer are perfect test cases for 
>distinguishing.
>

Yup.

Pat
-- 
---------------------------------------------------------------------
IHMC		(850)434 8903 or (650)494 3973   home
40 South Alcaniz St.	(850)202 4416   office
Pensacola			(850)202 4440   fax
FL 32502			(850)291 0667    cell
phayesAT-SIGNihmc.us       http://www.ihmc.us/users/phayes
Received on Saturday, 19 August 2006 01:23:37 UTC