Re: [TF-ENT] Entailment regimes doc update

From: Ivan Herman <ivan@w3.org>
Date: Fri, 19 Feb 2010 12:30:31 +0100
Message-ID: <4B7E7657.10809@w3.org>
To: Birte Glimm <birte.glimm@comlab.ox.ac.uk>
CC: SPARQL Working Group <public-rdf-dawg@w3.org>, Chimezie Ogbuji <ogbujic@ccf.org>, Axel Polleres <axel.polleres@deri.org>


On 2010-2-19 11:57 , Birte Glimm wrote:
> On 19 February 2010 08:26, Ivan Herman <ivan@w3.org> wrote:
>> I am not sure, procedurally, what the next step is. Would we need a nod
>> from the WG to make these changes on your draft, or would you just go
>> ahead and do it? It affects the (C2) condition on previously published
>> entailment regimes, too.
> 
> You know the official process much better than I do I guess ;-)
> 

In general, such changes are to be discussed with the group, but this
case is special because this is still a 'time permitting' thingy...

>> Personally, I would prefer some other 'go-ahead' reactions from all
>> parties interested... You can also raise this briefly on the next telco
>> if there is time enough (noting that this document still has the
>> 'time-permitting' label attached to it...)
> 
> How about putting that on the agenda of our (RIF) entailment regimes
> teleconf. At least there we'll have a bigger group and everybody cares
> about entailments. If we all agree, then I would just go ahead and
> change it unless there are official requirements that require a thumbs
> up from the whole group.
> 

Discussing it on the telco is an excellent idea.

Cheers

I.

> Birte
> 
>> Ivan
>>
>> On 2010-2-18 13:18 , Birte Glimm wrote:
>>> [snip]
>>>>> I am not 100% happy. I think it is better because I can imagine tools
>>>>> that allow you to choose between getting all triples and getting only
>>>>> those that use the input vocabulary. Thinking even more about this,
>>>>> here is another refinement that might capture even better what most
>>>>> users would expect.
>>>>> (C2) For each variable x in V(BGP), sk(μ(x)) occurs in sk(SG) or in Vocab.
>>>>> (C3) For each triple s p o in P(BGP), either s or o occurs in sk(SG).
>>>>>
>>>> So... I wonder whether we are not going down the wrong route here. The
>>>> _only_ requirement we have is to ensure finiteness. (C2) ensures that,
>>>> but we could, maybe, make it even more lax by concentrating only on the
>>>> rdf:_i properties and disallow any of those in the conclusion that do
>>>> not appear in the original graph. (This is what ter Horst does.). That
>>>> might ensure finiteness.
>>>
>>> C2 is already that lax at the moment, since we said Vocab is the reserved
>>> vocabulary for the regime you are using minus the rdf:_i, and variable
>>> bindings come either from there or from the queried graph. So if the
>>> graph talks about rdf:_3, then rdf:_3 is allowed. Ter Horst goes even
>>> a little bit further, if I remember correctly, by taking the largest n
>>> such that rdf:_n occurs in the input and allowing all rdf:_m with m<=n.
>>>
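The two treatments of the rdf:_i properties described above can be compared in a minimal sketch. It assumes IRIs are plain strings; the function names are hypothetical and come neither from the entailment regimes draft nor from ter Horst's work.

```python
import re

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
# Matches container membership properties rdf:_1, rdf:_2, ... (no leading zero).
_MEMBER = re.compile(re.escape(RDF_NS) + r"_([1-9][0-9]*)$")


def member_index(iri):
    """Return n if iri is rdf:_n, otherwise None."""
    match = _MEMBER.match(iri)
    return int(match.group(1)) if match else None


def allowed_exact(graph_iris):
    """Reading of C2 above: only the rdf:_i that occur in the queried graph."""
    return {iri for iri in graph_iris if member_index(iri) is not None}


def allowed_up_to_max(graph_iris):
    """Reading attributed to ter Horst above: all rdf:_m with m <= largest n in the input."""
    indices = [n for n in map(member_index, graph_iris) if n is not None]
    if not indices:
        return set()
    return {f"{RDF_NS}_{m}" for m in range(1, max(indices) + 1)}
```

Both sets are finite for a finite input graph, which is the only property the discussion above relies on.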
>>>> The result set will be large? Sure it will. But this _is_ what the
>>>> RDF/OWL semantics dictates, isn't it, so why would we want to second
>>>> guess the user? Does he/she want to control the size of the output?
>>>> Well, that is why, in their infinite wisdom, the authors of SPARQL
>>>> invented FILTER-s...:-)
>>>
>>> That's a good point. It might be a bit slower to first add all the
>>> results and then remove them again, but in the end there are not that
>>> many. (More than a user might have expected, but in total numbers not
>>> many).
>>>
>>>> If I look at the particular example that you gave, here is a modified
>>>> version:
>>>>
>>>> SELECT ?ind ?class
>>>> WHERE {
>>>>  ?ind a ?class .
>>>>  FILTER(
>>>>     !regex( str(?ind), "^http://www.w3.org/2000/01/rdf-schema#" ) &&
>>>>     !regex( str(?ind), "^http://www.w3.org/1999/02/22-rdf-syntax-ns#")
>>>>  )
>>>> }
>>>>
>>>> with the data:
>>>>
>>>> ex:a a ex:C .
>>>>
>>>> and, I believe, with RDFS entailment we would get what we want.
>>>
>>> True.
>>>
>>>> So I believe we may want to examine an alternative route:
>>>>
>>>> - define (C2) to be the absolute strict minimum to ensure a finite solution
>>>> - look at the existing FILTER possibilities to see if we can handle
>>>> those common use cases. We may want to propose some shortcuts, like the
>>>> one above, ie, some sort of an operation which says
>>>>
>>>> inNamespace( ?x, URI )
>>>>
>>>> which means that ?x as a term starts with the URI.
>>>>
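The proposed inNamespace( ?x, URI ) shortcut amounts to a string-prefix test on the term. The following is a minimal sketch of that reading, applied to solutions the way the FILTER in the query above is; it assumes solutions are dictionaries of IRI strings, and the names in_namespace and drop_reserved are hypothetical, not part of SPARQL.

```python
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS_NS = "http://www.w3.org/2000/01/rdf-schema#"


def in_namespace(term, namespace):
    """True if the term, taken as a string, starts with the namespace URI."""
    return term.startswith(namespace)


def drop_reserved(solutions, var="ind", namespaces=(RDF_NS, RDFS_NS)):
    """Keep only solutions whose binding for var lies outside the given namespaces."""
    return [
        solution
        for solution in solutions
        if not any(in_namespace(solution[var], ns) for ns in namespaces)
    ]
```

With one solution binding ind to an ex: term and the rest binding it to RDF or RDFS vocabulary, only the ex: solution survives, which matches the intent of the regex-based FILTER.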
>>>> What do you (and others!) think?
>>>
>>> I think that could work out very nicely :-)
>>>
>>> Birte
>>>
>>>> I have the gut feeling it will also help in defining the RIF
>>>> alternative, b.t.w.
>>>>
>>>> Ivan
>>>>
>>>>
>>>>> Here, C2 is very lax and it allows almost all axiomatic triples, but
>>>>> still guarantees finiteness of the answers in all cases. Now C3
>>>>> basically says that if the instantiated BGP has nothing to do with
>>>>> your data, then omit it. I omit the predicate as an option in C3
>>>>> since predicates such as rdf:type occur in most graphs (possibly by
>>>>> implicit triples as in OWL) and will then let through most axiomatic
>>>>> triples. Answers that are then filtered out are basically axiomatic
>>>>> triples that you didn't mention at all. This could be relaxed to (not
>>>>> sure which one makes more sense, needs more thinking)
>>>>> (C3) There is at least one triple s p o in P(BGP) such that either s
>>>>> or o occurs in sk(SG).
>>>>>
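Conditions C2 and C3 as stated above can be checked mechanically for a single candidate solution. The sketch below is a simplification under stated assumptions: terms are plain strings, variables start with "?", skolemization (sk) is ignored because no blank nodes are involved, and P(BGP) is read as the BGP instantiated with the solution. All function names are hypothetical.

```python
def terms_of(graph):
    """All terms occurring anywhere (subject, predicate, object) in the graph."""
    return {term for triple in graph for term in triple}


def instantiate(bgp, mu):
    """Replace each variable in the BGP by its binding in mu."""
    return [tuple(mu.get(term, term) for term in triple) for triple in bgp]


def satisfies_c2(bgp, mu, graph, vocab):
    """C2: every variable binding occurs in the graph or in the reserved vocabulary."""
    known = terms_of(graph) | set(vocab)
    variables = {t for triple in bgp for t in triple if t.startswith("?")}
    return all(mu[v] in known for v in variables)


def satisfies_c3(bgp, mu, graph, strict=True):
    """C3: subject or object of each (strict) or of some (relaxed) triple occurs in the graph."""
    present = terms_of(graph)
    hits = [s in present or o in present for s, _p, o in instantiate(bgp, mu)]
    return all(hits) if strict else any(hits)


# Example taken from the discussion: SELECT ?type WHERE { rdf:type a ?type }
bgp = [("rdf:type", "rdf:type", "?type")]
mu = {"?type": "rdf:Property"}
vocab = {"rdf:type", "rdf:Property"}  # tiny stand-in for the RDF(S) vocabulary
graph = [("ex:b", "rdf:type", "ex:c")]
```

For an empty graph both variants of satisfies_c3 return False for this query, since nothing occurs in the graph.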
>>>>> Going back to the examples, if we have the query:
>>>>> SELECT ?ind ?class WHERE { ?ind a ?class }
>>>>> and data:
>>>>> [[
>>>>> ex:a a ex:b .
>>>>> ]]
>>>>> as in your generated output, then you would get under OWL RL
>>>>> ?ind/ex:a, ?class/ex:b
>>>>>
>>>>> For the query
>>>>> SELECT ?r WHERE { ex:a ?r ex:a }
>>>>> over
>>>>> [[
>>>>> ex:a ex:b ex:c .
>>>>> ]]
>>>>> you would get
>>>>> ?r/owl:sameAs
>>>>> no matter whether there is ex:something owl:sameAs ex:something or not.
>>>>>
>>>>> It is still possible to generate artificial examples, but less so. E.g.,
>>>>> SELECT ?type WHERE { rdf:type a ?type }
>>>>> over the empty graph gives you nothing since C3 cannot be satisfied in
>>>>> an empty graph. If we query over
>>>>> [[
>>>>> ex:b a ex:c.
>>>>> ]]
>>>>> we get
>>>>> ?type/rdf:Property
>>>>> C2 holds because rdf:Property is in vocab (assuming we do RDF(S) entailment)
>>>>> C3 holds because the instantiated BGP is rdf:type a rdf:Property and
>>>>> the subject rdf:type occurs in its abbreviated form in your data. That
>>>>> seems a nicer compromise. If we don't exclude axiomatic triples, we
>>>>> get infinite answers, and if we exclude some, we get some non-local
>>>>> side effects, but that cannot be totally avoided.
>>>>>
>>>>>> Well... this is clearly not our decision. To be formal, we should
>>>>>> definitely flag that as an issue to be discussed. In some ways, the
>>>>>> question is: is it better to have many potential responses (ie, the user
>>>>>> will have to filter things out) or a small number though some expected
>>>>>> results will not be returned (only via ugly tricks).
>>>>>
>>>>> Yes.
>>>>> I would rather have too many than too few. Tools will hopefully
>>>>> provide a way to be configured such that they hide what I don't want
>>>>> to see, while still giving me the chance to see it if I want to.
>>>>>
>>>>> [snip]
>>>>>>>                                         Another would be to define
>>>>>>> something like "extensible entailment regimes", where the entailment
>>>>>>> regime is a combination of one of the defined semantics plus some
>>>>>>> rules/axioms that the endpoint will always apply, e.g., the SKOS
>>>>>>> axioms are always assumed to be present.
>>>>>>
>>>>>> I think the answer is: this is where RIF comes in. That gives me the
>>>>>> necessary flexibility (and I can always simulate OWL 2 RL level with a
>>>>>> RIF rule set).
>>>>>
>>>>> Yes. That would be nice.
>>>>>
>>>>>> B.t.w.: a point here for the future RIF discussion:
>>>>>>
>>>>>>  - we have something for OWL 2 RL
>>>>>>  - we will have something for RIF (hopefully)
>>>>>>  - there is a RIF document that, essentially, defines a RIF rule set for
>>>>>> OWL 2 RL
>>>>>>
>>>>>> Surely the result of the entailment should be the same whether I use the
>>>>>> OWL 2 RL entailment definition or the RIF one and use the official rule
>>>>>> set...
>>>>>
>>>>> Yes. It would not be nice at all if that were not the case.
>>>>>
>>>>> Cheers,
>>>>> Birte
>>>>>
>>>>>> Cheers
>>>>>>
>>>>>> Ivan
>>>>>>
>>>>>>> Birte
>>>>>>>
>>>>>>>>> We might still want to exclude axiomatic triples unless they occur in
>>>>>>>>> the input because they potentially add lots of answers that you don't
>>>>>>>>> really want.
>>>>>>>>
>>>>>>>> See above. But if I am wrong, I do not mind that either.
>>>>>>>>
>>>>>>>>> Another possibility would be to apply C2 only to variables in subject
>>>>>>>>> and object position and allow anything in the predicate position. That
>>>>>>>>> still can cause counterintuitive side effects, but many more cases are
>>>>>>>>> covered. In that case, inconsistencies need extra care because in
>>>>>>>>> principle this would still allow infinite answers if the given graph
>>>>>>>>> is inconsistent and we just assume the scoping graph to be equivalent
>>>>>>>>> to the queried graph no matter what.
>>>>>>>>
>>>>>>>> I guess you had that in one of the first drafts and we did not really
>>>>>>>> like it:-(
>>>>>>>> [snip]
>>>>>>>>>>
>>>>>>>>>> Sigh.
>>>>>>>>>
>>>>>>>>> Sigh too.
>>>>>>>>
>>>>>>>> :-)
>>>>>>>>
>>>>>>>> Ivan
>>>>>>>>
>>>>>>>>> Birte
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Ivan
>>>>>>>>>>
>>>>>>>>>> [1]
>>>>>>>>>> http://www.w3.org/TR/2009/REC-owl2-rdf-based-semantics-20091027/#Appendix:_Axiomatic_Triples_.28Informative.29
>>>>>>>>>> [2]
>>>>>>>>>> http://www.w3.org/TR/2009/REC-owl2-syntax-20091027/#Entity_Declarations_and_Typing
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On 2010-2-16 18:42 , Birte Glimm wrote:
>>>>>>>>>>> Hi all,
>>>>>>>>>>> I have committed a new version of the entailment regimes document:
>>>>>>>>>>> http://www.w3.org/2009/sparql/docs/entailment/xmlspec.xml
>>>>>>>>>>>
>>>>>>>>>>> There is now a description of the OWL RDF-Based Semantics incl. the
>>>>>>>>>>> OWL 2 RL profile. The OWL 2 RL profile can also be used with Direct
>>>>>>>>>>> Semantics, so I have added that there too. Further I have added a
>>>>>>>>>>> section about aggregates with RDF(S) entailment, addressing at least
>>>>>>>>>>> parts of Axel's comments (no owl:sameAs discussion yet for
>>>>>>>>>>> aggregation). I also defined the behaviour for inconsistent graphs
>>>>>>>>>>> more clearly because the previous spec didn't define the scoping graph
>>>>>>>>>>> in the case of inconsistencies. It was rather assumed that the scoping
>>>>>>>>>>> graph is still equivalent to the active graph, so that systems can
>>>>>>>>>>> just use the graph as is modulo bnode renaming, but that allowed
>>>>>>>>>>> infinite answers for inconsistent graphs. I now use Axel's suggestion
>>>>>>>>>>> for condition C2 and require not only bindings for variables in
>>>>>>>>>>> subject position to occur in the input, but require this for all
>>>>>>>>>>> variables. This also solves the OWL RDF-Based semantics problem where
>>>>>>>>>>> you can have infinite answers from owl:topDataProperty, which relates
>>>>>>>>>>> an individual to all data values. Now all RDF-Based regimes (RDF,
>>>>>>>>>>> RDFS, OWL 2 RDF-Based (for OWL Full and OWL RL)) use the same
>>>>>>>>>>> definitions, which is nice IMO.
>>>>>>>>>>>
>>>>>>>>>>> Birte
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> --
>>>>>>>>>>
>>>>>>>>>> Ivan Herman, W3C Semantic Web Activity Lead
>>>>>>>>>> Home: http://www.w3.org/People/Ivan/
>>>>>>>>>> mobile: +31-641044153
>>>>>>>>>> PGP Key: http://www.ivan-herman.net/pgpkey.html
>>>>>>>>>> FOAF   : http://www.ivan-herman.net/foaf.rdf
>>>>>>>>>> vCard  : http://www.ivan-herman.net/HermanIvan.vcf
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>>
>>>
>>>
>>> --
>>> Dr. Birte Glimm, Room 306
>>> Computing Laboratory
>>> Parks Road
>>> Oxford
>>> OX1 3QD
>>> United Kingdom
>>> +44 (0)1865 283529
>>>
>>
>>
>>
> 
> 
> 

-- 

Ivan Herman, W3C Semantic Web Activity Lead
Home: http://www.w3.org/People/Ivan/
mobile: +31-641044153
PGP Key: http://www.ivan-herman.net/pgpkey.html
FOAF   : http://www.ivan-herman.net/foaf.rdf
vCard  : http://www.ivan-herman.net/HermanIvan.vcf



Received on Friday, 19 February 2010 11:27:57 GMT
