Re: worries about useMentionOp and how queries relate to rules and proofs from Eric Prud'hommeaux on 2005-02-05 (www-archive@w3.org from February 2005)

From: Eric Prud'hommeaux <eric@w3.org>
Date: Sat, 5 Feb 2005 02:14:43 -0500
To: Pat Hayes <phayes@ihmc.us>
Cc: Dan Connolly <connolly@w3.org>, www-archive@w3.org, Jos De Roo <jos.deroo@agfa.com>
Message-ID: <20050205071443.GA10422@w3.org>
On Fri, Feb 04, 2005 at 01:28:12PM -0600, Pat Hayes wrote:
> >
> >
> >Let's add some data to query (I think just querying the schema info
> >slightly obscures the problem):
> >
> >	_:somebody	  foaf:homePage <dansHomePage#topic>.
> >> >	<dansHomePage#topic> owl:sameAs _:somebody .
> >
> >and ask a question:
> >	CONSTRUCT *
> >	    WHERE { (?who foaf:homePage <dansHomePage#topic>) }
> >
> >A complete OWL reasoner must know that
> >	<dansHomePage#topic> foaf:homePage <dansHomePage#topic>.
> >Just for giggles, it could add
> >	_:x foaf:homePage <dansHomePage#topic>.
> >	_:y foaf:homePage <dansHomePage#topic>.
> >	_:y foaf:homePage <dansHomePage#topic>.
> >	_:z foaf:homePage <dansHomePage#topic>.
> 
> Well, it could, yes. So could an RDF reasoner, for that matter, since 
> an inference to change a blanknodeID is valid in any version of RDF. 
> But this particular example is artificial, since according to the 
> SPARQL scoping rules these answers are all the same answer, so this 
> engine is just repeating itself.
> 
> >Does anything but a query like
> >	CONSTRUCT *
> >	    WHERE { (?who foaf:homePage <dansHomePage#topic>) }
> >	      AND isURI(?who)
> >
> >keep the reasoner from reporting an endless series of equivalent bNode
> >solutions?
> 
> I think the SPARQL rules already stop that, if you say that you don't 
> want repeated answers.  But look, what's to stop the answering engine 
> from inventing a string of URIs like mine:uri1 mine:uri2... and adding
> 
> mine:uri1 owl:sameAs <dansHomePage#topic> .
> mine:uri2 owl:sameAs mine:uri1 .
> etc.
> 
> to the graph? There is no way to be totally secure against getting 
> silly stuff back from a truly brain-damaged, or maybe malicious, 
> answering engine.
> 
> >Is it logically equivalent to substitute a bNode for any
> >URI in the graph?  It seems that OWL would not worry about this
> >limitless enumeration.
> 
> Yes, and indeed it should not, and almost any real OWL or RDF 
> answering engine would not do this (though it might accidentally 
> repeat itself when using a large graph, if the graph contains 
> redundancies.)
> 
> > > >and by definition
> >> >	isURI(<dansHomePage#topic>)
> >> >but not
> >> >	isURI(_:somebody)
> >
> >If isURI is a constraint on a set of bindings of nodes/literals to
> >variables, and each node/literal is only one of URI, bNode, Literal,
> >then it seems like we're fine. If owl:sameAs makes some node both a
> >URI and a bNode, then I don't understand owl:sameAs (a definite
> >possibility).
> 
> Well, not sure what you mean by 'makes some node both'. One can 
> assert a sameAs between a bnode and a URI, as you did above. This 
> isn't a problem, surely.

Since I think this is the crux of this issue, I will expound...

SPARQL queries over RDF data, so any data that isn't expressed in
triples is not our problem.

(1)	_:somebody        foaf:homePage <dansHomePage>
results in a single triple with a bNode subject.

(2)	<dansHomePage#topic> owl:sameAs _:somebody .
results in the obvious triple. I'm trying to see what comes out of
OWL closure over this data. I believe it is only

(3)	<dansHomePage#topic> foaf:homePage <dansHomePage>
which makes (1) redundant and forgettable. If, however, I'm confused,
and OWL inference changes the _:somebody node to be both a bNode and
the resource <dansHomePage#topic>, then SPARQL is oversimplifying
the data model over which it queries.
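
To make the closure question concrete, here is a minimal Python sketch
of the one owl:sameAs consequence at issue. It is not an OWL reasoner:
the tuple encoding of triples and the helper name sameas_closure are
assumptions made for illustration, and only subject/object substitution
under owl:sameAs is modelled.

```python
# Triples are plain (subject, predicate, object) tuples of strings.
# Terms starting with "_:" stand for bNodes; all other terms are URIs here.
SAME_AS = "owl:sameAs"

def sameas_closure(triples):
    """Hypothetical helper: substitute sameAs-equal terms until a fixpoint."""
    closed = set(triples)
    changed = True
    while changed:
        changed = False
        # owl:sameAs is symmetric, so collect the pairs in both directions.
        pairs = {(s, o) for s, p, o in closed if p == SAME_AS}
        pairs |= {(o, s) for s, o in pairs}
        for a, b in pairs:
            for s, p, o in list(closed):
                if p == SAME_AS:
                    continue  # keep the sketch small: leave sameAs triples alone
                new = (b if s == a else s, p, b if o == a else o)
                if new not in closed:
                    closed.add(new)
                    changed = True
    return closed

data = {
    ("_:somebody", "foaf:homePage", "<dansHomePage>"),         # triple (1)
    ("<dansHomePage#topic>", "owl:sameAs", "_:somebody"),      # triple (2)
}
inferred = sameas_closure(data) - data
print(inferred)
```

Under these assumptions the only new triple is (3). The closure adds a
triple; it does not relabel _:somebody as a URI, nor
<dansHomePage#topic> as a bNode.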

Expressing this SPARQL query in RDF may bring up use/mention issues,
but I don't see how asking the question about what node types are in
the RDF graph runs the risk of confusing a reference to a node with
the node itself.


> >Also, I'm not sure why this is a use/mention problem rather than a
> >potential over-simplification of the RDF model.
> 
> The 'predicate' in isURI(x) refers to the syntax of the expression 
> substituted for the variable, not to whatever that expression 
> denotes. To this extent it is semantically different from an RDF 
> pattern with a variable in it. It's meta-RDF rather than RDF.

Agreed. Just as SQL steps outside of relational algebra (UNIQUE, GROUP
BY), I'm happy doing that in SPARQL.
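
A rough sketch of that "outside the algebra" reading, in Python: the
pattern match produces bindings, and isURI is then a test on the kind of
term bound to a variable, not on what the term denotes. The class names
URI and BNode and the helpers match and constrain are hypothetical, made
up for this illustration; they are not SPARQL or cwm APIs.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class URI:
    value: str

@dataclass(frozen=True)
class BNode:
    label: str

def is_uri(term):
    # Looks only at the kind of term, never at what the term denotes.
    return isinstance(term, URI)

def match(graph, predicate, obj):
    """Return one binding of ?who per triple matching (?who predicate obj)."""
    return [{"who": s} for (s, p, o) in graph if p == predicate and o == obj]

def constrain(solutions, var, test):
    """Keep only the solutions whose binding for var passes the test."""
    return [sol for sol in solutions if test(sol[var])]

page = URI("dansHomePage")
graph = [
    (BNode("somebody"), URI("foaf:homePage"), page),           # triple (1)
    (URI("dansHomePage#topic"), URI("foaf:homePage"), page),   # triple (3)
]
solutions = match(graph, URI("foaf:homePage"), page)
uri_only = constrain(solutions, "who", is_uri)
print(uri_only)
```

The match step yields two solutions, one per triple; the constraint
keeps only the one whose ?who is written as a URI. No triple is added or
removed by the constraint.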

For fun, let's imagine the predicates log:isURI and log:isBnode.
(1)	_:somebody        foaf:homePage <dansHomePage> .
+	_:somebody        log:isBnode   1 .
+	foaf:homePage     log:isURI     1 .
+	<dansHomePage>    log:isURI     1 .

(2)	<dansHomePage#topic> owl:sameAs _:somebody .
+	<dansHomePage#topic> log:isURI     1 .
+	owl:sameAs           log:isURI     1 .

OWL inference:
(3)	<dansHomePage#topic> foaf:homePage <dansHomePage>

{ ?who foaf:homePage ?page. 
  ?who log:isURI 1 } => { (?who ?page) a answer }
would give you
  ( <dansHomePage#topic> <dansHomePage> ) a answer .

For that monotonic fuzzy feeling,
{ ?who foaf:homePage ?page. 
  ?who log:isBnode 1 } => { (?who ?page) a answer }
*could* give you
  ( _:1 <dansHomePage> ) a answer .

because <dansHomePage#topic> foaf:homePage <dansHomePage> 
implies _:1                  foaf:homePage <dansHomePage>

Eliminating logically redundant bNodes could have a parallel operation
that looks for things that could be bNodes on the matching side. The
use cases aren't well-served by this extra inference. The one I see all
the time is the variance in the object of dc:creator.

	<annot1>   dc:creator    <dansHomePage#topic>.
	<annot2>   dc:creator    _:creator2.
	_:creator2 rdf:type      foaf:Person.
	_:creator2 foaf:homePage <dansHomePage#topic>.

If I were looking for the pages where the creator had put in some sort
of structure to describe themselves, I would be tempted to ask cwm:

{ ?page dc:creator ?creator.
  ?creator log:isBnode 1. } => { ( ?page ) a answer. }

at which point the query engine dutifully infers that both could be
considered bNodes:

	( <annot1> ) a answer.
	( <annot2> ) a answer.

Fat load of good that did me. In the end, I think I either want to
pull isBnode out of SPARQL because it imposes a peculiar inference
burden and gives answers that people won't expect, or, put in some
text that says that the constraints clause is a filter on results,
does not imply inference, is non-monotonic, is fattening, leads to
heart disease, etc. The latter option addresses the construction:

  AND !(isURI(?creator) || isLiteral(?creator))
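
The two readings can be compared side by side on the dc:creator data
above. This is a minimal Python sketch under stated assumptions: terms
are plain strings classified by their spelling, and the function names
filter_reading and entailment_reading are invented for the
illustration. Neither is how any particular engine is implemented.

```python
# Terms are strings: "_:" prefix = bNode, leading '"' = literal, else URI.
def is_bnode(term):
    return term.startswith("_:")

def is_literal(term):
    return term.startswith('"')

def is_uri(term):
    return not (is_bnode(term) or is_literal(term))

GRAPH = [
    ("<annot1>", "dc:creator", "<dansHomePage#topic>"),
    ("<annot2>", "dc:creator", "_:creator2"),
    ("_:creator2", "rdf:type", "foaf:Person"),
    ("_:creator2", "foaf:homePage", "<dansHomePage#topic>"),
]

def creators(graph):
    """Bindings for the pattern (?page dc:creator ?creator)."""
    return [{"page": s, "creator": o} for s, p, o in graph if p == "dc:creator"]

def filter_reading(graph):
    # Constraint as a filter on results: test the term that actually matched.
    return [b["page"] for b in creators(graph)
            if not (is_uri(b["creator"]) or is_literal(b["creator"]))]

def entailment_reading(graph):
    # Constraint as something to infer: any object could be generalised
    # to a bNode, so every match counts as "could be a bNode".
    return [b["page"] for b in creators(graph)]

print(filter_reading(GRAPH))
print(entailment_reading(GRAPH))
```

With the filter reading only <annot2> comes back, which is the answer
the question was after; with the entailment reading both annotations
come back.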

-- 
-eric

office: +81.466.49.1170 W3C, Keio Research Institute at SFC,
                        Shonan Fujisawa Campus, Keio University,
                        5322 Endo, Fujisawa, Kanagawa 252-8520
                        JAPAN
        +1.617.258.5741 NE43-344, MIT, Cambridge, MA 02144 USA
cell:   +81.90.6533.3882

(eric@w3.org)
Feel free to forward this message to any list for any purpose other than
email address distribution.
Received on Saturday, 5 February 2005 07:14:44 UTC