Re: Proposed change to the OWL-2 Direct Semantics entailment regime from Guido Vetere on 2010-12-10 (public-rdf-dawg@w3.org from October to December 2010)

From: Guido Vetere <gvetere@it.ibm.com>
Date: Fri, 10 Dec 2010 16:38:38 +0100
To: Bijan Parsia <bparsia@cs.man.ac.uk>
Cc: public-rdf-dawg@w3.org
Message-ID: <OFFB1964F4.7C21EA91-ONC12577F5.00511F71-C12577F5.0055EF1E@it.ibm.com>
Bijan Parsia <bparsia@cs.man.ac.uk> 
10/12/2010 13.54

To
Guido Vetere/Italy/IBM@IBMIT
cc
public-rdf-dawg@w3.org
Subject
Re: Proposed change to the OWL-2 Direct Semantics entailment regime






> (Sorry for the delay responding. Classes and illness hit :))
> 
> On 6 Dec 2010, at 20:58, Guido Vetere wrote:

> > > Bijan Parsia <bparsia@cs.man.ac.uk> 
> > 
> > > > We don't use SPARQL as a query language (we adopt a datalog-style 
> > > > syntax instead) but we might support (some) SPARQL as a front-end 
> > > > in the future, as long as it does not miss relevant features. I 
> > > > cannot tell if making all variables distinguished would definitely 
> > > > prevent us from covering some relevant use case.
> > > 
> > > That's unfortunate. It would be really helpful to find some cases 
> > > where users would *notice* the difference. Best is when they would 
> > > rely on non-distinguished variables.
> > 
> > Maybe the average user would hardly understand the difference, but 
> > we know about that difference, we know that it may show up in some 
> > cases, and we should understand (in advance) whether it matters for 
> > customers. I cannot honestly point to concrete cases we 
> > experimented with, because we didn't consider how our queries would 
> > have been answered without the features we actually support. But we 
> > can try to go through some of the use cases we've run and make some 
> > simulations. If so, we'll get back to you. 
> 
> That would be very amazing. If there's anything I can do to 
> facilitate this, please let me know. I'd be happy to drudge through 
> some examples

We managed to look at some concrete examples from what we've been 
experimenting with at Banca d'Italia this year. Basically, they want to 
integrate statistical information coming from different sources, mostly 
through a specific XML format (SDMX), and combine it with other data 
conforming to a custom model (called MATRIX). Hence, we developed an 
ontology, defined mappings to both metamodels, and delivered a 
solution based on our DL-Lite reasoner. It gathers data from the 
different sources and materializes it into a knowledge base, where users 
can freely issue conjunctive queries (there's a web-based GUI to help 
them browse the ontology and build these queries). 

The information we gather is incomplete, but the ontology is powerful 
enough to drive some useful inferences. For instance, we have that a 
"bond" can be "issued_by" an "institution", and that a "government bond" 
is a bond issued by a "ministry" institution. The KB might not know, for 
some government bond instance, which specific ministry issued it, and yet 
it can answer the query { x | bond(x), issued_by(x,y), ministry(y) } 
completely, because y is not distinguished and need not be bound to a 
known individual.
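
A minimal sketch of the bond example in Python may make the role of the 
non-distinguished variable y easier to see. It is only an illustration 
under stated assumptions, not the reasoner described in this message: the 
individuals b1, b2 and acme, the "_gen_" prefix for generated names, and 
the helper functions are all hypothetical, and the two axioms are applied 
by a single naive materialization pass rather than by DL-Lite query 
rewriting.

```python
# Hypothetical toy KB. Class and role names follow the bond example;
# the individuals b1, b2 and acme are invented for illustration.
facts = {
    ("government_bond", "b1"),      # issuing ministry unknown
    ("bond", "b2"),
    ("issued_by", "b2", "acme"),
    ("institution", "acme"),        # acme is not known to be a ministry
}


def materialize(facts):
    """Apply two assumed axioms once:
    (1) every government_bond is a bond;
    (2) every government_bond is issued_by some ministry.
    When no ministry is known, a generated name stands in for it."""
    kb = set(facts)
    for f in facts:
        if f[0] != "government_bond":
            continue
        x = f[1]
        kb.add(("bond", x))
        has_known_ministry = any(
            g[0] == "issued_by" and g[1] == x and ("ministry", g[2]) in facts
            for g in facts
        )
        if not has_known_ministry:
            y = "_gen_" + x  # generated name, not a user-visible individual
            kb.add(("issued_by", x, y))
            kb.add(("ministry", y))
            kb.add(("institution", y))
    return kb


def answers(kb, y_must_be_named):
    """Evaluate q(x) :- bond(x), issued_by(x, y), ministry(y).
    If y_must_be_named is True, y is treated as distinguished and may
    only bind to individuals that are not generated names."""
    result = set()
    for f in kb:
        if f[0] != "issued_by":
            continue
        _, x, y = f
        if ("bond", x) not in kb or ("ministry", y) not in kb:
            continue
        if y_must_be_named and y.startswith("_gen_"):
            continue
        result.add(x)
    return result


kb = materialize(facts)
print(answers(kb, y_must_be_named=False))  # {'b1'}
print(answers(kb, y_must_be_named=True))   # set()
```

With y left non-distinguished, b1 is returned even though no ministry is 
named for it; if every variable had to bind to a named individual, the 
same query would return nothing on this data.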
 

> > > I hope I don't give offense by asking the following clarificatory 
> > > question: Do you really mean variables which range over unnamed 
> > > individuals, or do you just mean variables which are projected away 
> > > (in the Datalog world, these coincide as there are no unnamed 
> > > individuals; hence my question)? 
> > 
> > We don't have unnamed individuals, we may have generated names. 
> 
> But you do have existential restrictions? But no bNodes? (I.e., you 
> don't generally work with RDF data?)

In fact, we don't work with RDF data.


> > > > provided that supporting this feature would be up to implementers.
> > > > After all, not all SPARQL features are going to be supported by all 
> > > > implementers, I guess. 
> > 
> > > The problem is that nondistinguished variables get prohibitively 
> > > harder as you raise the expressivity of the logic. Ideally, we 
> > > would like to make it easy for a user to port a query from an RDF 
> > > engine to an OWL QL engine to an OWL DL engine and get compatible 
> > > results (if the engines all support all of SPARQL). Nondistinguished
> > > variables make that impossible.
> > 
> > I understand your point. Maybe this is naive, but why not consider 
> > a sort of SPARQL layering, much like OWL does? 
> 
> In a sense, the many entailment regimes we have achieve such a 
> layering, or close to it. The current issue is how complicated to 
> make that layering and what properties we want of the layering.
> 
> I think the best (from a user understandability perspective) is that:
>    1) A user can use any entailment regime with (just about) any 
>       dataset. 
>    2) As many queries as possible are admitted by all regimes. 
>       (I.e., it's as close to "one query language" as we can make it)
>    3) If a user uses a more expressive regime (with some fudging) 
>       they get at least all the answers they got with a less expressive 
>       regime, and possibly more.
>    4) Differences between queries, data, and regime produce 
>       different answers in a way that is easily understood by users of 
>       all regimes.
> 
> With the current entailment regime, we have one OWL regime that 
> adheres to 1, 2, and 3 for all DL profiles (e.g., OWL DL, QL, EL, 
> and RL) and is pretty close to that for DL vs. Fullish (e.g., RDF, 
> RDFS, and OWL Full) regimes.
> 
> If we introduce nondistinguished variables, we will have to have at 
> least two regimes: one for QL, EL, and maybe RL, and one for OWL DL. 
> The OWL DL version will not allow many of the queries that the other 
> one does. 

> 
> If we don't skolemize bNodes, we will have to forbid lots of 
> datasets (e.g., with cyclic patterns of bNodes) and queries, and the 
> answer set we get in some cases with RDF will be *smaller* than what 
> we get with OWL QL, EL, or RL.
> 
> We could keep the current regime and then introduce two more, but 
> that seems even more complicated.
> 
> My current preference is to keep the single regime and see what 
> extensions sort themselves out.

> > > On the flip side, would you find it extremely burdensome to add 
> > > nondistinguished variables as an extension? I see you already depart
> > > from the OWL spec by imposing the UNA, would this departure 
> > > discourage you from implementing SPARQL at all?
> > 
> > Not at all, the point is that we would miss a feature that we 
> > would otherwise support. 
> 
> Of course. But we need to determine whether standardizing the 
> feature at this time is worth the trade-off of complicating the 
> overall story and interop.

Of course, I warmly encourage you to support our features. In any case, I 
wish you all the best!


Cordiali Saluti, Best Regards,

Guido Vetere
Manager & Research Coordinator, IBM Center for Advanced Studies Rome
-----------------------
IBM Italia S.p.A.
via Sciangai 53, 00144 Rome, 
Italy
-----------------------
mail:     gvetere@it.ibm.com
phone: +39 06 59662137
mobile: +39 335 7454658

    
IBM Italia S.p.A.
Sede Legale: Circonvallazione Idroscalo - 20090 Segrate (MI)
Cap. Soc. euro 384.506.359,00
C. F. e Reg. Imprese MI 01442240030 - Partita IVA 10914660153
Società soggetta all'attività di direzione e coordinamento di 
International Business Machines Corporation

(Salvo che sia diversamente indicato sopra / Unless stated otherwise 
above)
Received on Friday, 10 December 2010 15:39:16 UTC