- From: Pat Hayes <phayes@ihmc.us>
- Date: Sat, 5 Feb 2005 10:29:19 -0600
- To: Eric Prud'hommeaux <eric@w3.org>
- Cc: Dan Connolly <connolly@w3.org>, www-archive@w3.org, Jos De Roo <jos.deroo@agfa.com>
>On Fri, Feb 04, 2005 at 01:28:12PM -0600, Pat Hayes wrote:
>> >
>> >Let's add some data to query (I think just querying the schema info
>> >slightly obscures the problem):
>> >
>> > _:somebody foaf:homePage <dansHomePage#topic>.
>> > <dansHomePage#topic> owl:sameAs _:somebody .
>> >
>> >and ask a question:
>> > CONSTRUCT *
>> > WHERE { (?who foaf:homePage <dansHomePage#topic>) }
>> >
>> >A complete OWL reasoner must know that
>> > <dansHomePage#topic> foaf:homePage <dansHomePage#topic>.
>> >Just for giggles, it could add
>> > _:x foaf:homePage <dansHomePage#topic>.
>> > _:y foaf:homePage <dansHomePage#topic>.
>> > _:y foaf:homePage <dansHomePage#topic>.
>> > _:z foaf:homePage <dansHomePage#topic>.
>>
>> Well, it could, yes. So could an RDF reasoner, for that matter, since
>> an inference to change a blanknodeID is valid in any version of RDF.
>> But this particular example is artificial, since according to the
>> SPARQL scoping rules these answers are all the same answer, so this
>> engine is just repeating itself.
>>
>> >Does anything but a query like
>> > CONSTRUCT *
>> > WHERE { (?who foaf:homePage <dansHomePage#topic>) }
>> > AND isURI(?who)
>> >
>> >keep the reasoner from reporting an endless series of equivalent bNode
>> >solutions?
>>
>> I think the SPARQL rules already stop that, if you say that you don't
>> want repeated answers. But look, what's to stop the answering engine
>> from inventing a string of URIs like mine:uri1 mine:uri2... and adding
>>
>> mine:uri1 owl:sameAs <dansHomePage#topic> .
>> mine:uri2 owl:sameAs mine:uri1 .
>> etc.
>>
>> to the graph? There is no way to be totally secure against getting
>> silly stuff back from a truly brain-damaged, or maybe malicious,
>> answering engine.
>>
>> >Is it logically equivalent to substitute a bNode for any
>> >URI in the graph? It seems that OWL would not worry about this
>> >limitless enumeration.
>>
>> Yes, and indeed it should not, and almost any real OWL or RDF
>> answering engine would not do this (though it might accidentally
>> repeat itself when using a large graph, if the graph contains
>> redundancies.)
>>
>> >> >and by definition
>> >> > isURI(<dansHomePage#topic>)
>> >> >but not
>> >> > isURI(_:somebody)
>> >
>> >If isURI is a constraint on a set of bindings of nodes/literals to
>> >variables, and each node/literal is only one of URI, bNode, Literal,
>> >then it seems like we're fine. If owl:sameAs makes some node both a
>> >URI and a bNode, then I don't understand owl:sameAs (a definite
>> >possibility).
>>
>> Well, not sure what you mean by 'makes some node both'. One can
>> assert a sameAs between a bnode and a URI, as you did above. This
>> isn't a problem, surely.
>
>Since I think this is the crux of this issue, I will expound...
>
>SPARQL queries over RDF data, so any data that isn't expressed in
>triples is not our problem.

Agreed.

>
>(1) _:somebody foaf:homePage <dansHomePage>
>results in a single triple with a bNode subject.
>
>(2) <dansHomePage#topic> owl:sameAs _:somebody .
>results in the obvious triple. I'm trying to see what comes out of
>OWL closure over this data. I believe it is only
>
>(3) <dansHomePage#topic> foaf:homePage <dansHomePage>
>which makes (1) redundant and forgettable.

Right, well put.

>If, however, I'm confused,
>and OWL inference changes the _:somebody node

No, inference engines never change anything: they only add things. So the OWL engine might add 3 to {1,2}, but it won't change (or indeed, if it is strictly an inference engine, delete) either 1 or 2.

>to be both a bNode and
>the resource <dansHomePage#topic>, then SPARQL is oversimplifying
>the data model over which it queries.

But it isn't. A node can't be both a bnode and a URIref. Those are exclusive syntactic categories in RDF.
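To make the "only add, never change" point concrete, here is a minimal Python sketch. It assumes triples are plain tuples of strings, that blank nodes are the terms starting with "_:", and that a hypothetical helper same_as_closure implements only a tiny fragment of OWL (symmetry of owl:sameAs plus substitution in subject and object position). It is not how any real OWL engine is built. Note that this toy closure also yields some trivial owl:sameAs triples alongside (3).

```python
SAME_AS = "owl:sameAs"

def is_bnode(term):
    # Purely syntactic test: a term is a bnode or it is not, never both.
    return term.startswith("_:")

def same_as_closure(graph):
    """Return graph plus triples derived by substituting owl:sameAs equals.

    Hypothetical helper: iterates to a fixpoint and only ever adds triples.
    """
    closed = set(graph)
    while True:
        pairs = {(s, o) for (s, p, o) in closed if p == SAME_AS}
        pairs |= {(o, s) for (s, o) in pairs}          # sameAs is symmetric
        new = set()
        for a, b in pairs:
            for (s, p, o) in closed:
                if s == a:
                    new.add((b, p, o))                 # substitute in subject
                if o == a:
                    new.add((s, p, b))                 # substitute in object
        if new <= closed:
            return closed
        closed |= new

data = {
    ("_:somebody", "foaf:homePage", "<dansHomePage>"),        # (1)
    ("<dansHomePage#topic>", SAME_AS, "_:somebody"),          # (2)
}
closed = same_as_closure(data)
triple_3 = ("<dansHomePage#topic>", "foaf:homePage", "<dansHomePage>")  # (3)
```

Here triple_3 ends up in closed, while (1) and (2) are still there unchanged, and _:somebody is still a bnode as far as is_bnode is concerned.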
>
>Expressing this SPARQL query in RDF may bring up use/mention issues,
>but I don't see how asking the question about what node types are in
>the RDF graph runs the risk of confusing a reference to a node with
>the node itself.

That question doesn't do it, but what does (arguably) do it is asking a query about the node type rather than expressing the query as an RDF pattern.

>
>> >Also, I'm not sure why this is a use/mention problem rather than a
>> >potential over-simplification of the RDF model.
>>
>> The 'predicate' in isURI(x) refers to the syntax of the expression
>> substituted for the variable, not to whatever that expression
>> denotes. To this extent it is semantically different from an RDF
>> pattern with a variable in it. It's meta-RDF rather than RDF.
>
>Agreed. Just as SQL steps outside of relational algebra (UNIQUE, GROUP
>BY), I'm happy doing that in SPARQL.

OK, but I think that is the nub of the issue for Dan C.

>For fun, let's imagine the predicates log:isURI and log:isBnode.

Well, hang on, I have trouble imagining that. If this really is a PREDICATE then its truth-value is determined by the denotation of its arguments, not the form of its arguments. To illustrate, here's a logically valid inference pattern, expressed in RDF:

aaa bbb ccc .
|=
_:x bbb ccc .

In other words, if A is B'd to C, then something is B'd to C. Now, however, try this with log:isURI:

aaa rdf:type log:isURI
|=?=
_:x rdf:type log:isURI

or maybe

aaa log:isURI 1
|=?=
_:x log:isURI 1

Seems to me that if log:isURI really is a predicate, this ought to be true: after all, it's the inference from 'A is a URI' to 'something is a URI', which seems hard to argue with. But I bet that with what you intended log:isURI to mean, the conclusion here would be false, right? Because you don't mean it's a logical predicate over things denoted, you mean it's a meta-predicate over things displayed in the triple itself, i.e. a predicate on the syntax rather than a predicate on the meaning.
Mention instead of use. I don't mean to imply that meta-predicates like this are incoherent, but they certainly do behave differently in reasoning than normal predicates: they don't even obey the same logical rules (unless you make the quotation explicit). So mixing them freely with normal predicates with no, er, protection, quickly gets things very confused (see: http://fisher.osu.edu/~tomassini_1/whotext.html.)

>(1) _:somebody foaf:homePage <dansHomePage> .
>+ _:somebody log:isBnode 1 .
>+ foaf:homePage log:isURI 1 .
>+ <dansHomePage> log:isURI 1 .
>
>(2) <dansHomePage#topic> owl:sameAs _:somebody .
>+ <dansHomePage#topic> log:isURI 1 .
>+ owl:sameAs log:isURI 1 .
>
>OWL inference:
>(3) <dansHomePage#topic> foaf:homePage <dansHomePage>

Ah, but now there are also some other things in the closure, including all the existential generalizations on URIrefs:

_:y owl:sameAs _:somebody .
_:y log:isURI 1 .
_:y foaf:homePage <dansHomePage> .

(subgraph derived from (2) and (3) by RDF rule SE2 with _:y allocated to <dansHomePage#topic>, cf. http://www.w3.org/TR/rdf-mt/#simpleRules)

>
>{ ?who foaf:homePage ?page.
> ?who log:isURI 1 } => { (?who ?page) a answer }
>would give you
> ( <dansHomePage#topic> <dansHomePage> ) a answer }

Sure, but it will also give you what you don't want, if you apply it to the actual closure, because you will get a bnode binding for ?who.

Now, I know I'm being deliberately awkward here, since you could insist on a kind of limited closure which does not introduce bnodes. That might work, for a while. But some inference engines need to generate bnodes (applying rules SE1 and 2, in effect) to get to perfectly legitimate conclusions; and it seems kind of tacky to say that they shouldn't do this when it is perfectly valid, and they would be conformant in doing it. (Note, I'm not here talking about silly repeats of this kind of inference, as in your first example. Just doing it once screws up log:isURI.)
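The effect of a single existential generalization can be sketched in a few lines of Python. This is only an illustration under stated assumptions: triples are tuples of strings, the functions se2, rule_answers and is_uri are hypothetical names, and se2 is a simplified reading of the subject-position rule in the RDF Semantics document, applied once for one term.

```python
def se2(graph, term, bnode):
    """Sketch of rule se2: copy every triple whose subject is `term`,
    replacing that subject with the blank node `bnode`."""
    return set(graph) | {(bnode, p, o) for (s, p, o) in graph if s == term}

def rule_answers(graph):
    """Match { ?who foaf:homePage ?page . ?who log:isURI 1 } against triples,
    treating log:isURI as an ordinary predicate asserted in the graph."""
    return {(s, o) for (s, p, o) in graph
            if p == "foaf:homePage" and (s, "log:isURI", "1") in graph}

def is_uri(term):
    # Syntactic check on a binding (only applied to ?who here, never to literals).
    return not term.startswith("_:")

closure = {
    ("_:somebody", "foaf:homePage", "<dansHomePage>"),             # (1)
    ("_:somebody", "log:isBnode", "1"),
    ("<dansHomePage#topic>", "owl:sameAs", "_:somebody"),          # (2)
    ("<dansHomePage#topic>", "log:isURI", "1"),
    ("<dansHomePage#topic>", "foaf:homePage", "<dansHomePage>"),   # (3)
}
generalized = se2(closure, "<dansHomePage#topic>", "_:y")
answers = rule_answers(generalized)
filtered = {(who, page) for (who, page) in answers if is_uri(who)}
```

After the one generalization step, answers contains a binding of ?who to _:y, because "_:y log:isURI 1" is now a triple in the graph. The variable filtered, which tests the form of the binding instead of looking for a triple, keeps only the <dansHomePage#topic> answer.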
However, if you were to go to the other extreme, and say that you wanted to target the query against the actual raw RDF graph, with NO inferences or alterations or additions done to it, then I think this kind of query could make sense; it's just a graph-matching query, and things like isBnode and isURI can be checked in any particular graph as syntactic conditions on variable bindings. But you just have to keep in mind that things all fall to pieces if you try to think of these as genuine RDF predicates, or when you mix these kinds of query with inference. Same applies to UNSAID, which makes perfect sense as a direct graph query, but totally screws up inference (and is screwed up by it).

>
>For that monotonic fuzzy feeling,
>{ ?who foaf:homePage ?page.
> ?who log:isBnode 1 } => { (?who ?page) a answer }
>*could* give you
> ( _:1 <dansHomePage> ) a answer }
>
>because <dansHomePage#topic> foaf:homePage <dansHomePage>
>implies _:1 foaf:homePage <dansHomePage>
>
>Eliminating logically redundant bNodes could have a parallel operation
>that looks for things that could be bNodes on the matching side. The
>use cases aren't well-served by this extra inference. The one I see all
>the time is the variance in the object of dc:creator.
>
> <annot1> dc:creator <dansHomePage#topic>.
> <annot2> dc:creator _:creator2.
> _:creator2 rdf:type foaf:Person.
> _:creator2 foaf:homePage <dansHomePage#topic>.
>
>If I were looking for the pages where the creator had put in some
>sort of structure to describe themselves, I would be tempted to ask
>cwm:
>
>{ ?page dc:creator ?creator.
> ?creator log:isBnode 1. } => { ( ?page ) a answer. }
>
>at which point the query engine dutifully infers that both could be
>considered bNodes:
>
> ( <annot1> ) a answer.
> ( <annot2> ) a answer.
>
>Fat load of good that did me.

Well, true, but I'm tempted to ask in reply, whose fault is that? The reasoner isn't doing anything illogical or wrong; you just aren't talking to it in its language.
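The contrast between the two extremes can be sketched with the dc:creator data from the quoted message. The Python below is an assumption-laden toy: triples are string tuples, structured_creator_pages and se1 are hypothetical names, and se1 is a simplified reading of the object-position generalization rule, applied once.

```python
def is_bnode(term):
    # Syntactic condition on a variable binding.
    return term.startswith("_:")

def structured_creator_pages(graph):
    """Graph-matching query: pages whose dc:creator object is a blank node."""
    return {s for (s, p, o) in graph if p == "dc:creator" and is_bnode(o)}

def se1(graph, term, bnode):
    """Sketch of rule se1: copy every triple whose object is `term`,
    replacing that object with the blank node `bnode`."""
    return set(graph) | {(s, p, bnode) for (s, p, o) in graph if o == term}

raw = {
    ("<annot1>", "dc:creator", "<dansHomePage#topic>"),
    ("<annot2>", "dc:creator", "_:creator2"),
    ("_:creator2", "rdf:type", "foaf:Person"),
    ("_:creator2", "foaf:homePage", "<dansHomePage#topic>"),
}

# Query against the raw graph, no inference: only <annot2> matches.
raw_answers = structured_creator_pages(raw)

# Same query after one valid generalization step: <annot1> matches as well.
inferred = se1(raw, "<dansHomePage#topic>", "_:g")
inferred_answers = structured_creator_pages(inferred)
```

Against the raw graph the syntactic test picks out <annot2> alone, which is the answer the questioner wanted. Once a perfectly valid inference has been applied, the same test reports both annotations.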
>In the end, I think I either want to
>pull isBnode out of SPARQL because it imposes a peculiar inference
>burden and gives answers that people won't expect, or, put in some
>text that says that the constraints clause is a filter on results

Nice phrasing

>,
>does not imply inference, is non-monotonic, is fattening, leads to
>heart disease, etc.

Right, that's what I think we should do. It's like the difference between banning poisons, or insisting on putting them in clearly labelled bottles. I'm for the latter. What we definitely shouldn't do, however, is pretend they are food.

Pat

-- 
---------------------------------------------------------------------
IHMC                    (850)434 8903 or (650)494 3973   home
40 South Alcaniz St.    (850)202 4416   office
Pensacola               (850)202 4440   fax
FL 32502                (850)291 0667   cell
phayes@ihmc.us          http://www.ihmc.us/users/phayes
Received on Saturday, 5 February 2005 16:28:52 UTC