Re: ISSUE: Malformed literals and non-lexical literals from Bijan Parsia on 2006-08-17 (public-rdf-dawg@w3.org from July to September 2006)

From: Bijan Parsia <bparsia@cs.man.ac.uk>
Date: Thu, 17 Aug 2006 09:26:34 +0100
To: Pat Hayes <phayes@ihmc.us>
Cc: RDF Data Access Working Group <public-rdf-dawg@w3.org>
Message-Id: <EC95D8F3-E496-4BA4-BCA3-24CF16AB9261@cs.man.ac.uk>

On Aug 17, 2006, at 8:58 AM, Pat Hayes wrote:

>> I'm breaking this issue out and consolidating it.
>>
>> Under RDF, RDFS, and the D variants interpretations, some things  
>> which are spelled like literals do not denote literals, and  
>> some things which are not spelled like literals do denote literals.
>
> Terminology check. "literal" is a *syntactic* term in RDF. Literal  
> values are what they normally denote,

Well, with plain literals, the literal values are themselves. It's  
also confusing because of the relation to rdfs:Literal, which is all  
the more reason for me not to abuse the terminology this way.

> and the class of all literal values is rdfs:Literal. '  
> "foodle"^^xsd:number ' is a literal. That particular literal  
> denotes a non-literal value in any D-interpretation with D  
> containing xsd:number.

Yes.

> I think what you mean is, some literals do not denote literal  
> values, and some non-literals (URIrefs and bnodes) may denote  
> literal values. The first was a design decision, the second is  
> semantically inevitable, unless we made RDF into a sorted logic.

Yes. Sorry for the terminological slop.

>> Beside each line I indicate the semantics under which the  
>> condition holds using the following key: RDF for at least RDF  
>> interpretations, RDFS for at least RDFS, *(D) for at least under  
>> (* +  D-entailment
>
> Which D? I'm guessing you mean the XSD version.

I mean D interpretations given the set of XSD literals explicitly  
mentioned in the SPARQL document.

>> ), and obviously, the datatypes have to be part of the datatype  
>> theory.
>>
>> Literals:
>
> None of these are literals. They are all non-literals which denote  
> literal values.

Yes, sorry.

>> 	:x where
>> 		:x rdf:type rdf:XMLLiteral. (RDF)
>> 	_:b where
>> 		 _:b rdf:type rdf:XMLLiteral (RDF, but _:b doesn't stand for a  
>> particular literal)
>> 	:x where
>> 		:x rdf:type rdfs:Literal (RDFS)
>
> That is really all you need. The others all entail one of the above.

I didn't follow exactly. You mean these three are all you need? I  
wasn't aiming for exhaustiveness but a variety of examples.

>> 	:x where
>> 		:x :p :y. :p rdfs:domain rdfs:Literal (or rdf:Literal)  
>> (symmetric with range) (RDFS)
>> 	:x where
>> 		:x rdf:type xsd:integer (RDF(D))
>
> These 'where's aren't really needed. Any URIref *could* denote a  
> literal value. The conditions you note - and they really all boil  
> down to membership in rdf:XMLLiteral (RDF) or rdfs:Literal (RDFS) -  
> are what would *entail* that the term denotes a literal value.

Well, I'm trying to make a connection between the triples and what  
might appear in a binding. Just trying to make it easy to read for  
people coming in.

>> The rest should be obvious. I may have missed some.
>
> Yes, you can do it with subproperty and subclass as well.
>
>> Non-literals:
>> 	"<"^^rdf:XMLLiteral (RDF)
>> 	"a"^^xsd:integer (RDF(D))
>> 	"-5"^^xsd:positiveInteger (RDF(D))
>> (there are range and domain things analogous to the RDFS ones)
>>
>> (The latter are non-literals because they are illformed.
>
> No, they *are* literals. They don't denote literal values.

Yep. Slop. Sorry.

>> Illformed literals denote a non-literal element of the domain.
>
> Quite.
>
>> )
>>
>> <http://www.w3.org/TR/rdf-mt/#RDFINTERP>
>> """ The third condition requires that ill-typed XML literals  
>> denote something other than a literal value: this will be the  
>> standard way of handling ill-formed typed literals."""
>>
>> If I understand the current intention of the current document,  
>> isLiteral returns false for all the above literals and true for  
>> all the above non-literals, even under datatype semantics (which  
>> is the only other semantics mentioned in the document).
>
> My understanding is quite different, that isLiteral is a purely  
> syntactic check on the form of the binding term. In other words, it  
> means exactly what it says. Similarly for isBnode, of course, which  
> would be meaningless or incoherent if understood as a predicate on  
> denotations.

Yes.
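
A minimal Python sketch of the distinction agreed on here, offered as an illustration and not as anything from the SPARQL draft. The term classes, the `LEXICAL_SPACE` table and `denotes_literal_value` are hypothetical names, and the two regular expressions are simplified stand-ins for the real XSD lexical spaces.

```python
import re
from dataclasses import dataclass
from typing import Optional

XSD = "http://www.w3.org/2001/XMLSchema#"


@dataclass(frozen=True)
class IRI:
    value: str


@dataclass(frozen=True)
class BNode:
    label: str


@dataclass(frozen=True)
class Literal:
    lexical: str
    datatype: Optional[str] = None  # None models a plain literal


# Hypothetical, simplified lexical-space checks for two XSD datatypes.
LEXICAL_SPACE = {
    XSD + "integer": re.compile(r"[+-]?[0-9]+"),
    XSD + "positiveInteger": re.compile(r"\+?0*[1-9][0-9]*"),
}


def is_literal(term):
    """Purely syntactic: is the term spelled as a literal?"""
    return isinstance(term, Literal)


def denotes_literal_value(term):
    """Semantic question, answered only where the term's form settles it.

    Returns True or False for literals of known datatypes, and None when
    the form alone cannot decide (IRIs, bnodes, unknown datatypes).
    """
    if not isinstance(term, Literal):
        return None  # an IRI or bnode *could* denote a literal value
    if term.datatype is None:
        return True  # plain literals denote themselves
    pattern = LEXICAL_SPACE.get(term.datatype)
    if pattern is None:
        return None
    return pattern.fullmatch(term.lexical) is not None


ill_typed = Literal("-5", XSD + "positiveInteger")
# ill_typed is a literal (syntax) that does not denote a literal value.
```

Under this reading, `is_literal` never has to look at the graph or at the datatype map, which is what makes it well defined for bnodes and IRIs too.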

>> I find that counterintuitive at the very least. Perhaps we could  
>> wave away the literals above by constraining isLiteral to only the  
>> lexical form, but under RDF semantics "<"^^rdf:XMLLiteral *is not  
>> a literal*.
>
> No, it really is a literal. See above.

Yep. Doesn't denote a literal.

>>  (Is simple + datatype a possible combination? In which case all  
>> the RDF weird literals/non-literals will also be in Simple(D)).
>>
>> I'll also note that:
>> 	"-5"^^xsd:positiveInteger < "5"^^xsd:positiveInteger
>>
>> should be an error, not false (I think).
>
> That seems correct, indeed.

So, < is on the value space. Which is fine but then

	:x < "5"^^xsd:positiveInteger

Where in the graph
	:x rdf:type xsd:negativeInteger.

should be true.

>> This goes with the weird case:
>> 	:x rdf:type xsd:positiveInteger.
>> 	:y rdf:type xsd:negativeInteger.
>>
>> Let ?x/:x and ?y/:y:
>> 	what does ?x > ?y evaluate to? Even if we think that isLiteral  
>> is a special case, it's hard to see that the comparisons should be  
>> (as I said before). Especially as some of them are detectable for  
>> the lexical form of the binding.
>
> Not sure I follow your reasoning here.

In the telecon, before you arrived, it was suggested that :x > :y  
with the above graph shouldn't return "true". The only other  
reasonable return would be "error". But then < is only sometimes  
sensitive to the value (i.e., where the value is derivable from the  
lexical form). (If we decide on that then it makes the looking back  
to the graph for redundancy extra strange.)

There are obviously several choices to make it all well specified.  
isLiteral(?x) clearly should be decided by the lexical form (as  
opposed to ?x rdf:type rdfs:Literal), though it could be weird that  
{ ?x rdf:type rdfs:Literal FILTER isLiteral(?x) } returns no answer  
against :x rdf:type rdfs:Literal. I can live with it, but it needs to  
be explained.
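
To make the odd-looking case concrete, here is a small Python sketch under simple matching. The `match_subjects` helper and the prefixed-name strings are assumptions made for the illustration, not part of any SPARQL implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IRI:
    value: str


@dataclass(frozen=True)
class Literal:
    lexical: str


RDF_TYPE = IRI("rdf:type")
RDFS_LITERAL = IRI("rdfs:Literal")

# The graph contains the single triple  :x rdf:type rdfs:Literal .
graph = {(IRI(":x"), RDF_TYPE, RDFS_LITERAL)}


def match_subjects(graph, predicate, obj):
    """Bindings for ?x in the pattern { ?x predicate obj }."""
    return [s for (s, p, o) in graph if p == predicate and o == obj]


def is_literal(term):
    """Syntactic check on the binding, as discussed above."""
    return isinstance(term, Literal)


# { ?x rdf:type rdfs:Literal } binds ?x to the IRI :x ...
bindings = match_subjects(graph, RDF_TYPE, RDFS_LITERAL)
# ... and FILTER isLiteral(?x) then rejects it, because :x is not a literal.
answers = [x for x in bindings if is_literal(x)]
```

The pattern succeeds and the filter removes the only solution, so the query has no answers even though the graph says :x denotes a literal value.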

We might want to add value sensitive type functions. Or just leave  
that for the BGP.

We could make the operators sensitive to the value only, or to the  
value that can be derived "locally" from the lexical form alone.

That's probably the biggest decision to make. And my examples with  
the malformed -5 and the neg and pos integers are perfect test cases  
for distinguishing.
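
One way to see how the test cases separate the two options is the following Python sketch. Everything in it is an assumption made for illustration: values are modelled as integer intervals, the `RANGES` table covers only three datatypes, and the "value" policy is approximated by reading rdf:type triples from the graph instead of doing real D-entailment.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class IRI:
    value: str


@dataclass(frozen=True)
class Literal:
    lexical: str
    datatype: str


# Hypothetical table: (lower, upper) bounds of each datatype's value space.
RANGES = {
    "xsd:integer": (-math.inf, math.inf),
    "xsd:positiveInteger": (1, math.inf),
    "xsd:negativeInteger": (-math.inf, -1),
}


class ComparisonError(Exception):
    """Raised when a comparison cannot be evaluated to true or false."""


def bounds(term, graph=None):
    """Interval of possible values for a term.

    graph=None is the "local" policy: only the lexical form is consulted.
    With a graph, rdf:type triples on IRIs are consulted as well.
    """
    if isinstance(term, Literal):
        try:
            value = int(term.lexical)
            lo, hi = RANGES[term.datatype]
        except (ValueError, KeyError):
            raise ComparisonError(f"no value for {term!r}")
        if not lo <= value <= hi:
            raise ComparisonError(f"ill-typed literal {term!r}")
        return (value, value)
    if graph is not None:
        for (s, p, o) in graph:
            if s == term and p == "rdf:type" and o in RANGES:
                return RANGES[o]
    raise ComparisonError(f"no value derivable for {term!r}")


def less_than(a, b, graph=None):
    (a_lo, a_hi), (b_lo, b_hi) = bounds(a, graph), bounds(b, graph)
    if a_hi < b_lo:
        return True
    if a_lo >= b_hi:
        return False
    raise ComparisonError("comparison not determined")


five = Literal("5", "xsd:positiveInteger")
malformed = Literal("-5", "xsd:positiveInteger")
x, y = IRI(":x"), IRI(":y")
neg_graph = {(x, "rdf:type", "xsd:negativeInteger")}
weird_graph = {(x, "rdf:type", "xsd:positiveInteger"),
               (y, "rdf:type", "xsd:negativeInteger")}
```

In this sketch the malformed -5 is an error under either policy, while `less_than(x, five)` is an error locally but true once `neg_graph` is consulted, and `less_than(y, x, weird_graph)` (that is, ?x > ?y) is true only under the graph-sensitive policy.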

Sorry for letting my abuse of terminology get me a bit confused about  
isLiteral.

"""3.4 Matching Values and RDF D-entailment
RDF defines D-Entailment where extra semantic conditions are allowed  
for datatypes. When matching RDF literals in graph patterns, the  
datatype lexical-to-value mapping may be reflected into the  
underlying RDF graph, leading to additional matches where it is known  
that two literals are the same value. RDF semantics does not require  
this of all RDF graphs."""

This is confusing and doesn't specify what it needs to. RDF semantics  
doesn't require it for any graph except with regard to XMLLiterals,  
if by "RDF semantics" we mean "RDF interpretations". But RDF+D wrt a  
set of datatypes does apply to all graphs, just in some graphs it  
makes no difference (i.e., if there are no typed literals or  
statements involving literals.) I think we need to say what the  
behavior is with and without D entailment.
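
As a rough Python sketch of what "with and without D entailment" could mean for matching literals: the function names and the `datatypes` dictionary (datatype name to lexical-to-value mapping) are hypothetical, and only xsd:integer is modelled.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Literal:
    lexical: str
    datatype: str


def xsd_integer(lexical):
    """Simplified lexical-to-value mapping for xsd:integer."""
    if not re.fullmatch(r"[+-]?[0-9]+", lexical):
        raise ValueError(f"not in the lexical space: {lexical!r}")
    return int(lexical)


def same_term(a, b):
    """Without D-entailment: literals match only if they are the same term."""
    return a == b


def known_same_value(a, b, datatypes):
    """With D-entailment for `datatypes`: also match well-typed literals
    of recognised datatypes whose values are equal."""
    if a == b:
        return True
    if a.datatype not in datatypes or b.datatype not in datatypes:
        return False
    try:
        return datatypes[a.datatype](a.lexical) == datatypes[b.datatype](b.lexical)
    except ValueError:
        return False  # an ill-typed literal has no literal value to compare


one = Literal("1", "xsd:integer")
padded_one = Literal("01", "xsd:integer")
D = {"xsd:integer": xsd_integer}
```

With `D` supplied, "1" and "01" match; with an empty datatype map, or with `same_term`, they do not. For a graph with no typed literals the two functions agree, which is the sense in which D-entailment applies to every graph but makes no difference in some.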


Cheers,
Bijan.
Received on Thursday, 17 August 2006 08:26:42 UTC