Re: ISSUE: Inconsistent graphs (and illformed literals) from Pat Hayes on 2006-08-17 (public-rdf-dawg@w3.org from July to September 2006)

From: Pat Hayes <phayes@ihmc.us>
Date: Wed, 16 Aug 2006 23:24:07 -0700
To: Bijan Parsia <bparsia@cs.man.ac.uk>
Cc: RDF Data Access Working Group <public-rdf-dawg@w3.org>
Message-Id: <p06230942c109a8800cde@[192.168.1.6]>
>I hope this is an easy point to resolve.
>
>In 
><http://lists.w3.org/Archives/Public/public-rdf-dawg/2006JulSep/0101>, 
>I pointed out this phrase:
>
>See <http://www.w3.org/TR/rdf-mt/#RDFINTERP>
>
>""""""An ill-typed literal does not in itself constitute an
>inconsistency, but a graph which entails that an ill-typed literal
>has rdf:type rdfs:Literal, or that an ill-typed XML literal has
>rdf:type rdf:XMLLiteral, would be inconsistent.""""""
>
>However, RDF does not allow us to express such inconsistency, since 
>we can neither have a literal in a subject position, nor can we 
>equate a URI with a particular literal, nor can we fix the range of 
>a BNode to illtyped literals (though in RDFS we can). (At least, I 
>couldn't think of how to write an RDF-inconsistent graph...Pat?)

Right, RDF just *is* consistent; you can always construct a Herbrand 
interpretation. http://www.w3.org/TR/rdf-mt/#defherbinterp

>And of course datatyped interpretations allow for more 
>inconsistency. OWL, of course, allows for much more.
>
>Simple interpretations (and I believe RDF interpretations) never 
>result in inconsistency.

Right.

>Hmm. Do I have an example of an inconsistent RDFS without D Entailment graph?

Yes, because the XMLLiteral datatype is incorporated into the RDF and 
RDFS semantics, so you don't need D-entailment to get 
malformed-literal contradictions. There is .

>I guess it doesn't matter since we sanction D entailment. RDF + D 
>entailment + our sanctioned datatypes can definitely have 
>inconsistent graphs:

Indeed.

>	:x rdf:type xsd:positiveInteger.
>	:x rdf:type xsd:negativeInteger.
>
>(See <http://www.w3.org/TR/rdf-mt/#DTYPEINTERP> for example of other 
>inconsistent, typically RDFS, graphs.)
>
>(Ooh, there is some trickiness when dealing with illtyped literals 
>and the comparison operators. For example, is 
>isLiteral("adfadf"^^xsd:integer) true? I think not by D-Entailment, 
>since it's an ill-formed literal and
>
>	 """The condition also requires that an ill-typed literal, 
>where the literal string is not in the lexical space of the 
>datatype, not denote any literal value. Intuitively, such a name 
>does not denote any value, but in order to avoid the semantic 
>complexities which arise from empty names, the semantics requires 
>such a typed literal to denote an 'arbitrary' non-literal value."""
>
>So it denotes a *non*-literal value, thus it isn't a literal, thus 
>the isLiteral is false!

Right. And you don't need D-entailment for the XMLLiteral case, e.g. 
isLiteral("<"^^rdf:XMLLiteral) is false even in RDF; at least, it is 
if isLiteral means the same as being in the class of literal 
*values*. Even an ill-formed literal is of course a literal, 
considered as a piece of syntax.
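
To make the syntax/value distinction concrete, here is a minimal sketch in Python. It assumes a typed literal is modelled as a (lexical form, datatype) pair and covers only xsd:integer; the function names are hypothetical and belong to no SPARQL implementation.

```python
import re

XSD_INTEGER = "http://www.w3.org/2001/XMLSchema#integer"

# Lexical space of xsd:integer: an optional sign followed by decimal digits.
_INTEGER_LEXICAL = re.compile(r"[+-]?[0-9]+")


def is_literal_syntax(lexical_form, datatype):
    """Any typed literal is a literal when considered as a piece of syntax."""
    return True


def denotes_literal_value(lexical_form, datatype):
    """True only if the lexical form lies in the datatype's lexical space.

    An ill-typed literal denotes an arbitrary non-literal value, so it is
    not in the class of literal values. Only xsd:integer is handled here
    (an assumption made to keep the sketch short).
    """
    if datatype == XSD_INTEGER:
        return _INTEGER_LEXICAL.fullmatch(lexical_form) is not None
    raise ValueError("datatype not covered by this sketch: " + datatype)
```

On this reading "adfadf"^^xsd:integer passes the syntactic test but fails the value test, which is the sense in which isLiteral would be false.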

>Tricky to get around this if literals are supposed to "give" their 
>values from the graph.)

Well, you could introduce a special 'value' to be the 'error' case, 
represented by a URI sparql:badLiteral, say. That would be consistent 
with the RDF semantics treatment. (Don't make it be a simple string, 
though.) Hmm, on reflection you might need several of them, so use 
sparql:badLiteral#nnnn
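
One way the sparql:badLiteral#nnnn idea might look, as a rough sketch only: the class name, the four-digit counter, and the choice to give the same ill-typed literal the same URI every time are all assumptions, not anything proposed in the thread beyond the URI pattern itself.

```python
class BadLiteralMinter:
    """Hands out a distinct error URI for each distinct ill-typed literal."""

    def __init__(self):
        # Maps (lexical form, datatype) to the URI already assigned to it.
        self._assigned = {}

    def uri_for(self, lexical_form, datatype):
        key = (lexical_form, datatype)
        if key not in self._assigned:
            # Number the error values in order of first appearance.
            number = len(self._assigned) + 1
            self._assigned[key] = "sparql:badLiteral#%04d" % number
        return self._assigned[key]
```

Because the result is a URI and not a plain string, a binding to it cannot be mistaken for a literal value.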

>Personally, I think inconsistent graphs by RDF, datatype, or 
>RDF/Datatype, or RDFS or RDFS/Datatype interpretations are rather 
>rare.

They would usually be signals of coding errors when writing a literal 
string, or attaching the wrong datatype.

>But it's worth settling the corner case.
>
>Inconsistent graphs entail everything, thus entail all answers. 
>That's obviously silly.

Indeed. But one could argue that it is acceptable to return the 
answers generated by a (maximal?) consistent subgraph of an 
inconsistent graph. After all, consider the case where a large, 
useful, corpus happens to get one little bug introduced into it 
because of a typo, or someone not having read the XSD spec properly. 
It seems a bit draconian to require it to effectively shut itself 
down when this is detected.
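
As an illustration of what "maximal consistent subgraph" could mean operationally, here is a brute-force Python sketch. The consistency check is a toy assumption: it only detects one resource typed as both xsd:positiveInteger and xsd:negativeInteger, the clash used as an example in this thread. The search is exponential in the number of triples, so it is for illustration only.

```python
from itertools import combinations

RDF_TYPE = "rdf:type"

# Toy assumption: the only clash detected is this pair of disjoint datatypes.
DISJOINT = [frozenset({"xsd:positiveInteger", "xsd:negativeInteger"})]


def is_consistent(triples):
    """False if some subject is typed with both members of a disjoint pair."""
    types = {}
    for s, p, o in triples:
        if p == RDF_TYPE:
            types.setdefault(s, set()).add(o)
    return not any(pair <= ts for ts in types.values() for pair in DISJOINT)


def maximal_consistent_subgraphs(triples):
    """Return every consistent subset not contained in a larger consistent one."""
    triples = list(set(triples))
    found = []
    # Try the largest subsets first so smaller ones can be skipped when covered.
    for size in range(len(triples), -1, -1):
        for subset in combinations(triples, size):
            candidate = frozenset(subset)
            if any(candidate < bigger for bigger in found):
                continue
            if is_consistent(candidate):
                found.append(candidate)
    return found
```

For a graph holding the two clashing type triples plus one unrelated triple, this yields two maximal subgraphs, each dropping one of the clashing triples and keeping the unrelated one.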

>There are two additional choices:
>
>	1) An inconsistent graph returns no answers
>	2) A query on an inconsistent graph returns an error, oh, 
>"inconsistent graph".
>
>I personally prefer the latter.

I like the idea of sending an error in these cases, but not of 
sending no answers.

>Note, just as an aside, I think this shows that the original idea of 
>just doing graph matching and then doing graph matching against 
>"virtual graphs" is not, by itself, a sufficient way of 
>specification, even aside from the other problems it has with 
>scaling to OWL. There are other side conditions that one must check.
>
>Oh, there is a third choice, though I tend not to take it seriously:
>	3) An inconsistent graph returns the set of answers that 
>graph matching against a graph generated by forward chaining 
>application of the entailment rules + some hacking to avoid BNode 
>proliferation.

Well, 'hacking' is tendentious. We have to do some 'hacking' to do 
this in *any* scheme: all the elaborate machinery of scoping sets and 
so on in the E-entailment definition is exactly this kind of 
'hacking', phrased in mathematical terminology. I think this way of 
phrasing the conditions is actually quite useful and effective when 
it can be used, and should be taken more seriously.

>3 can be used as a way to detect contradictions, since there is a 
>(large, disjunctive) query that should test for datatype clashes. But 
>clearly this particular incompleteness isn't sanctioned by the 
>semantics, and we have to be *very* careful about specifying the 
>rules and how they are applied to assure interoperability.
>
>It wouldn't be that hard to come up with a paraconsistent reading of 
>all this so as to get some version of 3 (i.e., "useful answers" out 
>of the graph). For example, we could sanction all answers following 
>from every maximal consistent subgraph of the inconsistent graph.

Ah, indeed we could :-).

>  It would still be good to distinguish between some answers that 
>don't follow because the triples *aren't there* and some answers not 
>following because they depend on *inconsistent* triples.

Well, we could return a binding to a sparql:badLiteral URI to signal 
the presence of the error. This does not cover all possible cases, 
but it will cover most of the actual cases. These inconsistencies are 
all of a very special kind and their source can usually be traced to 
a particular typed literal.

Pat

>
>Cheers,
>Bijan.


-- 
---------------------------------------------------------------------
IHMC		(850)434 8903 or (650)494 3973   home
40 South Alcaniz St.	(850)202 4416   office
Pensacola			(850)202 4440   fax
FL 32502			(850)291 0667    cell
phayesAT-SIGNihmc.us       http://www.ihmc.us/users/phayes
Received on Thursday, 17 August 2006 06:24:34 UTC