Re: [TF-ENT] Condition C2 modifications from Ivan Herman on 2010-02-25 (public-rdf-dawg@w3.org from January to March 2010)

From: Ivan Herman <ivan@w3.org>
Date: Thu, 25 Feb 2010 07:14:50 +0100
To: Birte Glimm <birte.glimm@comlab.ox.ac.uk>
CC: Axel Polleres <axel.polleres@deri.org>, SPARQL Working Group <public-rdf-dawg@w3.org>
Message-ID: <4B86155A.8080209@w3.org>
For the sake of completeness let me add one more argument that can be
used as a 'con' argument to the modified (C2). If I use the query

SELECT ?ind ?cl WHERE { ?ind rdf:type ?cl }

with the data

x a A;
  pred something .

Under the current C2 the answer will be

[?ind->x,?cl->A].

However, with the modified (proposed) C2, we will also get

[?ind->pred,?cl->rdf:Property], [?pred->rdf:type,?cl->rdf:Property]

with RDFS entailment. This is because, well, this is the way RDFS
entailment is defined and those come out of the axiomatic triples. The
fact that we get those extra triples motivated us to make (C2) a bit
more restrictive (those solutions would not appear with the current C2
or with the one modified by Axel).

While this may be a 'con' argument for the modified C2, I believe it is
actually a 'pro' argument. As I said yesterday on the phone: this is the
way RDFS is defined, and it is not SPARQL's job to restrict perfectly
valid solutions. by adding suitable FILTER-s to the query the user can
filter out unwanted results easily (essentially by filtering out those
that are in the RDF(S) namespace). Ie, I regard that, actually, as an
extra argument to adopt the modified (C2).

Cheers

Ivan

On 2010-2-24 21:02 , Birte Glimm wrote:
> Hi Axel,
> here a few arguments for why I prefer the proposed/modified C2 version
> and how that relates to data values.
> 
> Current C2:
> (C2) Each variable x that occurs in the subject position of a triple in
>        BGP is such that sk(μ(x)) occurs in sk(SG).
> 
> This C2 is not enough for OWL RDF-Based Semantics since as soon as I
> have an individual in my queried graph, e.g., just by having a triple
> such as
> ex:a ex:b ex:c .
> this entails
> ex:a owl:topDataProperty "1"^^xsd:integer .
> ex:a owl:topDataProperty "2"^^xsd:integer .
> ex:a owl:topDataProperty "3"^^xsd:integer .
> etc. And a query with BGP
> ex:a ?x ?y
> has infinite answers. Similar effects happen with number restrictions
> (owl:minCardinality etc). You could strengthen the condition such that
> it applies to variable in both the subject and object position. Even
> with conditions on subject and object positions, that C2 is not
> sufficient if we assume that even for an inconsistent graph SG is
> defined as graph equivalent to the active graph. E.g., if I have
> ex:a ex:b "<"^^rdf:XMLLiteral .
> ex:b rdfs:range rdfs:Literal .
> then I should explicitly say that SG is not graph equivalent and the
> answers are not defined by the spec, but they are system dependant and
> systems have to guarantee finiteness. Obviously systems will just work
> as if there was no inconsistency and you'll only notice the difference
> once the inconsistency is encountered.
> 
> Since systems don't necessarily check consistency, it is, however,
> natural to assume that they just use the definition of answers and the
> conditions indiscriminately. As soon as I allow SG to be graph
> equivalent to AG also for inconsistent AGs, I can, however, also
> imagine an evil system that just says "No matter what, SG=AG" and
> starts generating all kind of shit:
> ex:a ex:aa ex:a .
> ex:a ex:aaa ex:a .
> ex:a ex:aaaa ex:a .
> and both subject and object occur in the input and that's all
> entailed. My worry is just that I have to prove finiteness in any case
> and that's not so easy in this situation. Even if subject and object
> bindings are limited, you still have to be sure that under no
> circumstances you can have infinite predicates. I also don't like that
> in the presence of inconsistencies everything is left to the system. I
> had one variant, where I defined SG for an inconsistent AG as
> "repaired" AG, where malformed XML literals are replaced by
> well-formed ones, but that is also not nice since we don't want to
> return corrected values as answers and it is not really possible to
> repair inconsistencies in OWL Full.
> 
> I then used the condition that you proposed, where C2 applies to
> variables in all positions. That gives you an easy proof of finiteness
> and works also for AG graph equivalent to SG in any case, but it has
> some not nice side effects in particular for OWL RDF Based
> Semantics/OWL RL. E.g., you have
> ex:a ex:b ex:c .
> and a query with BGP:
> ex:a ?pred ex:a
> which has no answer. Then you add
> ex:somethingelse owl:sameAs ex:somethingelse .
> and for the same query you suddenly get
> ?pred/owl:sameAs
> which is not nice since the added triple had nothing to do with ex:a.
> 
> This is what lead to the proposed C2:
> (C2) For each variable x in V(BGP), sk(μ(x)) occurs in sk(SG) or in Vocab.
> where vocab is defined as the reserved vocabulary for the entailment
> regime (e.g., the RDF vocabulary for RDF entailment) minus terms of
> the form rdf:_n with n in {1, 2, …}.
> 
> Here you can safely say that SG is graph equivalent to AG in any case
> and the condition works for all RDF based regimes. The proof that
> answers will be finite is easy in any case because the vocabulary from
> which answers are taken is finite, because for integers and other data
> values you still get a restriction of infinite answers because data
> values are not in vocab.
> 
> So the question is whether there are objections to the changed C2 and
> if so, why and what would be a better alternative.
> 
> Birte
> 

-- 

Ivan Herman, W3C Semantic Web Activity Lead
Home: http://www.w3.org/People/Ivan/
mobile: +31-641044153
PGP Key: http://www.ivan-herman.net/pgpkey.html
FOAF   : http://www.ivan-herman.net/foaf.rdf
vCard  : http://www.ivan-herman.net/HermanIvan.vcf
Attachments

application/pkcs7-signature attachment: S/MIME Cryptographic Signature
Received on Thursday, 25 February 2010 06:15:03 UTC