Re: [TF-ENT] RDFS entailment regime proposal from Birte Glimm on 2009-09-28 (public-rdf-dawg@w3.org from July to September 2009)

From: Birte Glimm <birte.glimm@comlab.ox.ac.uk>
Date: Mon, 28 Sep 2009 12:28:31 +0100
To: Chimezie Ogbuji <ogbujic@ccf.org>
Cc: SPARQL Working Group <public-rdf-dawg@w3.org>
Message-ID: <492f2b0b0909280428hb725cddvd0ece71f1407e44e@mail.gmail.com>
[snip]

> Hey Birte.  I have a few comments, focusing mostly on the technical content
> rather than style, wording, etc.
>
>> [..] The only additional answers from RDFS compared to RDF are some axiomatic
>> triples, plus any IRI used as a property will end up as part of an answer to ?p
>> rdf:type rdf:Property.
>
> The initial language here seems to suggest ruling out answers that, when
> substituted into the pattern instance, result in graphs which can be derived
> from SG, axiomatic triples, and the application of the RDFS entailment
> rules.  Later on, however, the RDFS entailment relation is used in defining
> the solution mappings for the query answers.

Hm, I don't think I get your point here. The sentence you cite is just
an explanation of why we provide only an RDFS entailment regime and
not also an RDF entailment regime. Should I rephrase that?

In general, the set of answers that, when substituted into the pattern
instance, result in graphs which can be derived from SG, axiomatic
triples, and the application of the RDFS entailment rules can be
infinite and we cannot change that and that is fine. So, yes, I use
RDFS entailment to define what possible answers are under an RDFS
entailment regime. What I suggest is to NOT return all (possibly
infinitely many) answers, but just a subset of them that is guaranteed
to be finite. To do this, I proposed a restriction that, for example,
limits the axiomatic triples that are returned to those whose subject
occurs in the signature. Does that clarify your issue? If not, can you
try to rephrase what your concern is?
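
As a rough illustration of that restriction, here is a minimal Python
sketch. It is not taken from the proposal: it assumes triples are plain
tuples of prefixed-name strings, it looks only at the axiomatic triples
of the form rdf:_n rdf:type rdf:Property, and the function names are
made up for this example.

```python
import re
from itertools import count, islice

RDF_TYPE = "rdf:type"
RDF_PROPERTY = "rdf:Property"
# Container membership properties rdf:_1, rdf:_2, ...
MEMBERSHIP = re.compile(r"rdf:_[1-9][0-9]*")


def all_membership_axioms():
    """Unrestricted case: one axiomatic triple per rdf:_n, without end."""
    for n in count(1):
        yield (f"rdf:_{n}", RDF_TYPE, RDF_PROPERTY)


def signature(triples):
    """All names used as subject, predicate, or object in the triples."""
    return {term for triple in triples for term in triple}


def restricted_membership_axioms(sig):
    """Keep only the axiomatic triples whose subject occurs in sig."""
    return {
        (name, RDF_TYPE, RDF_PROPERTY)
        for name in sig
        if MEMBERSHIP.fullmatch(name)
    }


# Hypothetical data: a container with two members.
data = [("ex:bag", "rdf:_1", "ex:a"), ("ex:bag", "rdf:_2", "ex:b")]

# The unrestricted generator never terminates, so only peek at a prefix.
first_three = list(islice(all_membership_axioms(), 3))

# The restricted set is finite because the signature is finite.
axioms = restricted_membership_axioms(signature(data))
print(sorted(axioms))
```

Since the signature of a finite graph is finite, the restricted set is
finite as well, no matter how many rdf:_n axioms exist in principle.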

>> There are a few design choices that I have listed at the end of the
>> section on RDFS, which could be an alternative to the currently
>> proposed way of restricting the answer sets to a finite size. Please
>> let me know if you prefer any of them or have any other suggestions.
>
>> .. if μ(v) is a blank node, then μ(v) occurs in the scoping graph SG
>
> So, is the general intuition here that answers where subject terms are Blank
> nodes must "refer" to a priori blank nodes in the SG?

Well, that is one use of this restriction. Blank nodes that were not
there a priori can, however, also lead to infinitely many answers in
other ways. For example, take the triple
ex:a ex:b ex:c .
as your data. This triple entails
ex:a ex:b _:exc1 .
for some blank node _:exc1 allocated to <ex:c>. Now you can further
derive the triple
ex:a ex:b _:exc2 .
for some blank node _:exc2 allocated to _:exc1. And from there
ex:a ex:b _:exc3 .
for some blank node _:exc3 allocated to _:exc2. And from there...

Which of the blank nodes would you allow as a binding for ?x in the query
SELECT ?x WHERE { ex:a ex:b ?x }

Under the restriction that I propose you would get { ?x-><ex:c> } and
that is it. None of the blank nodes occurred in the input, so they can
be used in the derivation of consequences, but they will not show up
in answers. Is that ok with you or do you have an alternative proposal?
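
The effect of the blank node restriction on this example can be
sketched in a few lines of Python. This is only an illustration under
simplifying assumptions: terms are plain strings, blank nodes are
recognised by the "_:" prefix, and the helper names are invented for
this sketch.

```python
from itertools import count, islice


def derived_objects(obj):
    """Yield obj, then an endless chain of fresh blank nodes.

    Each blank node stands for one further derivation step as in the
    example: _:exc1 is allocated to obj, _:exc2 to _:exc1, and so on.
    """
    yield obj
    for n in count(1):
        yield f"_:exc{n}"


def is_blank(term):
    return term.startswith("_:")


def admissible(term, scoping_graph_terms):
    """A blank node may be a binding only if it occurs in the scoping graph."""
    return not is_blank(term) or term in scoping_graph_terms


scoping_graph = [("ex:a", "ex:b", "ex:c")]
sg_terms = {term for triple in scoping_graph for term in triple}

# Candidate bindings for ?x in { ex:a ex:b ?x }; the chain is infinite,
# so only the first few candidates are inspected here.
candidates = list(islice(derived_objects("ex:c"), 5))
answers = [{"?x": term} for term in candidates if admissible(term, sg_terms)]
print(answers)
```

However long a prefix of the chain is inspected, only the binding for
ex:c passes the filter, because none of the derived blank nodes occurs
in the scoping graph.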

>> Ideally,
>> a monotonic behavior for queries would be very nice, but that is not
>> easy to achieve when solution sequences are possibly infinite without
>> restrictions.
>
> Well, isn't monotonic usually a characteristic of an entailment
> relationship? In this case it is not the RDFS entailment relationship that
> is monotonic (in the sense of the word I'm used to, anyways), but rather
> there is a difference in answers based on whether or not pattern
> substitutions involve terms in some combination of either the signature of
> the SG or the query.
>
> But, how does
>
> ASK { rdf:_1 rdf:type rdf:Property }
>
> provide an answer via the entailment regime if the possible solutions for a
> BGP under an entailment relationship is defined only WRT the variables in
> the query but there are none in this query?

Well, I think using the term monotonicity might be misleading. What I
would want is the following: when I ask the Boolean query
ASK { rdf:_1 rdf:type rdf:Property }
against the empty graph, I would get true since this is an axiomatic
triple and rdf:_1 occurs in the signature of the query. There is no
variable, but in that case I just check RDFS entailment of the given
triple. If I then go on to ask
SELECT ?x WHERE { ?x rdf:type rdf:Property }
again against the empty graph, I get an empty answer set. That is not
what I expect. I just learned that the triple rdf:_1 rdf:type
rdf:Property is entailed (ASK query), but then I get no answer when I
replace rdf:_1 with a variable. That is what happens under the current
definition and that is not nice.
What I thought is that one could define the relevant signature as only
the names (IRIs, literals, blank nodes) from the queried data set and
ignore the names from the query. For RDFS that would result in a more
intuitive behavior, I think. For OWL that wouldn't work due to
nominals, but that is a different issue.
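
The ASK/SELECT mismatch described above can be reproduced in a toy
model. The Python sketch below is deliberately simplified and is not
the definition from the proposal: "entailment" here covers only
asserted triples plus the axioms rdf:_n rdf:type rdf:Property (all
other axiomatic triples and all RDFS rules are ignored), and the
function names are hypothetical.

```python
import re

MEMBERSHIP = re.compile(r"rdf:_[1-9][0-9]*")


def names(triples):
    """Names occurring in the triples, ignoring variables such as ?x."""
    return {t for triple in triples for t in triple if not t.startswith("?")}


def entailed(triple, data):
    """Toy entailment: asserted triples plus rdf:_n rdf:type rdf:Property."""
    s, p, o = triple
    is_axiom = (
        MEMBERSHIP.fullmatch(s) is not None
        and (p, o) == ("rdf:type", "rdf:Property")
    )
    return triple in data or is_axiom


def ask(pattern, data):
    """Variable-free pattern: just check entailment of the triple."""
    return entailed(pattern, data)


def select_subject(pattern, data):
    """Bind the subject variable only to names from the data or the query."""
    _, p, o = pattern
    sig = names(data) | names([pattern])
    return sorted(x for x in sig if entailed((x, p, o), data))


empty = set()
ask_result = ask(("rdf:_1", "rdf:type", "rdf:Property"), empty)
select_result = select_subject(("?x", "rdf:type", "rdf:Property"), empty)
print(ask_result, select_result)

# Once rdf:_1 is used in the data, both query forms agree on it.
bag = {("ex:bag", "rdf:_1", "ex:a")}
select_with_data = select_subject(("?x", "rdf:type", "rdf:Property"), bag)
print(select_with_data)
```

In this model the ASK query succeeds against the empty graph while the
SELECT query returns nothing, because rdf:_1 is in the signature of the
first query but not of the second.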

> I like the 2nd option amongst the alternative design choices. It requires
> that if the user makes the mistake of giving a query that would normally
> give infinite answers (the use of patterns that match against axiomatic
> triples being one example), they must provide a restriction on the solution
> set.  Having no control over the returned answer seems reasonable given the
> 'unsafe' nature of the query.

Ok, I'll keep that in mind.

Thanks for the comments,
Birte

> ----------------------
> Chime (chee-meh) Ogbuji (oh-bu-gee)
> Heart and Vascular Institute (Clinical Investigations)
> Architect / Informatician
> Cleveland Clinic (ogbujic@ccf.org)
> Ph.D. Student Case Western Reserve University
> (chimezie.thomas-ogbuji@case.edu)



-- 
Dr. Birte Glimm, Room 306
Computing Laboratory
Parks Road
Oxford
OX1 3QD
United Kingdom
+44 (0)1865 283529
Received on Monday, 28 September 2009 11:29:06 UTC