Re: AW: {Disarmed} Re: blank nodes (once again)

On Mar 24, 2011, at 10:13 AM, Sandro Hawke wrote:

> On Thu, 2011-03-24 at 09:45 -0500, Pat Hayes wrote:
>> Michael, greetings.
>> 
>> Of course you are right. Which is why it would probably not be useful or practical to *change* the interpretation of blank nodes in RDF. On the other hand, it might be useful to define a simplified version of RDF which simply does not have blank nodes in it. They really are of very little practical use. 
> 
> I don't know of real data about this, and I may not be representative,
> but I know when I write RDF by hand, and when I write software which
> constructs RDF, I often find it easier to use blank nodes than to think
> about what to name every item referred to in my content.

Agreed. Which is why we would probably need to provide some kind of interface tool which autoselects a 'new' 'blank' URI when we don't want to have to be bothered inventing one or even thinking about it. That is, we could, um, recommend that interface designers supply this auto-skolemizing functionality :-)
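
As a rough illustration of the kind of helper an interface could supply, here is a minimal Python sketch. The class name `UriMinter`, the prefix `http://example.org/genid/` and the `ex:` terms are all made up for the example; nothing here is a proposed standard.

```python
import itertools

class UriMinter:
    """Hands out a fresh URI each time the author declines to invent one."""

    def __init__(self, prefix):
        self.prefix = prefix                 # hypothetical skolem namespace
        self._counter = itertools.count(1)   # 1, 2, 3, ...

    def new(self):
        # Each call returns a URI that this minter has not returned before.
        return "%sn%d" % (self.prefix, next(self._counter))

minter = UriMinter("http://example.org/genid/")

# The author never types a name for the item; the tool picks one.
item = minter.new()
triples = [
    ("ex:order42", "ex:hasItem", item),
    (item, "ex:quantity", '"3"'),
]
```

A counter keeps the sketch readable; a real tool would presumably want something that cannot collide across sessions.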

> 
> As I pointed out earlier, I think blank nodes are a convenience for the
> speaker and an inconvenience for the (machine) listener.   

Exactly.

> Since they're
> also a convenience for the listener when the listener is human

Well, they do seem that way at first blush. But I note that Cyc, for example, has been using automatically generated full Skolem functions (way more complicated than this case) in its axioms for years now, and humans do manage, with practice, to process the resulting horrendous formulae. Humans are the most adaptable part of the system :-)

> , and
> right now so much RDF isn't really being used by machines, I think the
> sense of them as an overall convenience has persisted.
> 
> 
>> Regrettable as it may be, there is now a large (and growing) community of RDF users who really do not care very much about OWL or RIF, certainly do not care a jot for the distinctions between the various species of OWL, use SPARQL only as an RDF version of SQL, and have absolutely no use for blank nodes and strongly advise their peers to avoid using them. The patterns of reasoning exemplified by blank node scoping are of no interest to them whatsoever. If anything, existential generalization is a nuisance, rather than a useful inference. They would be very happy with RDF engines which flag blank nodes as errors or (better) automatically skolemize them. 
> 
> I think there's a whole lot to be said for automatically Skolemizing
> them.   To do it well requires some work, but I think it's feasible for
> many kinds of deployment.
> 
> In particular, I think the system which first exposes the RDF content on
> the Web should be the one which Skolemizes it, since it knows what URL
> prefix to use.   (If there isn't one such system, then Skolemizing is a
> problem.)   This system has the interesting challenge of minimizing
> changes if/when it re-reads modified content destined for the same URL.
> That's the most interesting problem in this space, to me....
> 
> To rephrase that problem: given similar RDF graphs G1 and G2, and a
> labeling of the blank nodes in G1 to produce G1', how do you produce a
> labeling of the blank nodes in G2, G2', such that the differences
> between G1' and G2' are as small as the differences between G1 and G2?
> 
> In practice, imagine I have a hand authored page of turtle with maybe
> 150 triples, much of it lists.  I click "publish" and it gets Skolemized
> and published at URL U.  Then I change my mind about something, make a
> tiny edit, click "re-publish" and it gets Skolemized again, and the new
> version gets published at U.   If someone is watching U, I want them to
> see that only a little change was made.  A naive (uuid) Skolemization
> would make the change look huge, as every blank node got an entirely new
> label.   

OK, let me change the scenario a little. Suppose that ALL the actual RDF is skolemized. There is *no such thing* as unSkolemized RDF with blank nodes in it. The "blank node ids" are purely in the GUI editor, generated for you to see on the screen: they are just a graphic blurring device to obscure these ugly skolemURIs from your tender human sight. Now, when you compose and then publish the RDF, it has URIs in it (even if you can't see them, they are there). When you read it back in and edit it, it still has those skolemized URIs in it.  (When you look at it, you might see different bnodeIDs, of course, just like with SPARQL results, since these ids are generated by your GUI.) When you re-publish it, it still has the URIs it always had in it (unless you have edited them). 
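
A small Python sketch of this scenario, under assumptions of my own: triples are plain tuples of strings, and the prefix `http://example.org/genid/` (hypothetical) marks the automatically generated URIs. The stored graph always contains URIs; the bnode-style labels exist only in what the editor renders.

```python
# Hypothetical prefix marking URIs that were generated automatically.
SKOLEM_PREFIX = "http://example.org/genid/"

def display_labels(triples, prefix=SKOLEM_PREFIX):
    """Assign screen-only labels (_:b0, _:b1, ...) to skolem URIs,
    in order of first appearance. Nothing stored is changed."""
    labels = {}
    for s, _p, o in triples:
        for term in (s, o):
            if term.startswith(prefix) and term not in labels:
                labels[term] = "_:b%d" % len(labels)
    return labels

def render(triples, prefix=SKOLEM_PREFIX):
    """What the GUI shows: skolem URIs blurred into bnode-style labels."""
    labels = display_labels(triples, prefix)
    return [tuple(labels.get(t, t) for t in triple) for triple in triples]

# The published graph: it already contains URIs, visible or not.
published = [
    ("ex:doc", "ex:author", SKOLEM_PREFIX + "a1"),
    (SKOLEM_PREFIX + "a1", "ex:name", '"Sandro"'),
]

# A tiny edit through the GUI adds one triple; the existing URIs stay put.
edited = published + [(SKOLEM_PREFIX + "a1", "ex:worksOn", "ex:RDF")]

# Someone watching the URL sees exactly the triples that changed.
changed = set(published) ^ set(edited)
```

Because the URIs are never regenerated, the set of changed triples is as small as the edit itself.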

Does this work OK?  I know that you geeks who hand-edit using ed or bbedit or some other Neanderthal text editor might have to actually look at these URIs, but then surely you are used to seeing URIs in RDF by now, aren't you?  

Of course, what would be even nicer would be a GUI which hides all the RDF list machinery completely and lets you write things like 

:whatever :property [:this :that :theOther]

I'm sure someone will be able to write such a thing eventually :-)

> 
>> The one possible exception I can see is the use of bnodes to encode OWL syntax, using the RDF list construction. Clearly, one does not want to have OWL/RDF entailments ruined because a list has been given a name. This might require some special conventions; but in practice, again, this use of RDF has never been seriously intended to be used by RDF inference engines. Rather, this 'encoding' of OWL uses RDF as a serialization mechanism to move OWL around the Web via RDF portals. If we were to make this explicit, we could isolate this from RDF entailment regimes altogether. Which now that I think about it, might be a very good idea. 
> 
> Is there something in the OWL specs that says OWL doesn't work (or that
> we're no longer in DL) if the nodes composing the lists are not blank?

Not actually a statement, but the entailments would not work in the same way. (To see why, consider some OWL/RDF and skolemize it two different ways. These two are identical OWL but do not entail one another in RDF.) Yes, this would be a problem. The OWL/RDF spec would have to be re-written. However, if we follow the idea below, it would become a lot simpler. The OWL/RDF spec would basically just describe how to translate the OWL/RDF syntax into OWL abstract syntax, and then refer to that OWL for the semantics. This is by far the easiest and most elegant way to proceed in any case, as it lets the OWL people define their logic without worrying about RDF encodings. We would have to define a 'manchester syntax' version of OWL-Full, but that has in effect already been done (e.g., see http://www.ihmc.us/users/phayes/cl/sw2scl.html )
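
To make the "skolemize it two different ways" point concrete, here is a minimal Python sketch. It assumes triples are tuples of strings, uses made-up names (`ex:sk1` and so on) for the skolem URIs, and relies on the fact that simple entailment between graphs without blank nodes comes down to containment.

```python
def rename(triples, names):
    """Skolemize by replacing each blank node label with the URI in `names`."""
    return {tuple(names.get(t, t) for t in triple) for triple in triples}

def ground_entails(g, h):
    """For graphs without blank nodes, g simply entails h iff h is a subset of g."""
    return h <= g

# A union class written in OWL/RDF, using an RDF list of two members.
owl_rdf = [
    ("_:c", "rdf:type", "owl:Class"),
    ("_:c", "owl:unionOf", "_:l1"),
    ("_:l1", "rdf:first", "ex:A"),
    ("_:l1", "rdf:rest", "_:l2"),
    ("_:l2", "rdf:first", "ex:B"),
    ("_:l2", "rdf:rest", "rdf:nil"),
]

# Two skolemizations that differ only in the (hypothetical) names chosen.
first = rename(owl_rdf, {"_:c": "ex:sk1", "_:l1": "ex:sk2", "_:l2": "ex:sk3"})
second = rename(owl_rdf, {"_:c": "ex:sk7", "_:l1": "ex:sk8", "_:l2": "ex:sk9"})

# Same OWL expression, but neither ground graph contains the other.
first_entails_second = ground_entails(first, second)
second_entails_first = ground_entails(second, first)
```

Both flags come out false, which is the sense in which the two copies fail to entail one another in RDF.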

> That would be a problem.
> 
> Is the isolation you're talking about any different from "dark triples"?

Same basic idea, yes. That idea seems in retrospect to have been an opportunity missed, IMO. OWL/RDF used RDF entailment to imitate datastructures encoding OWL syntax. We managed it, but it was like dancing in a full lotus position. This idea really does not have a future, so it might make sense to codify something more workable. Just turning the RDF semantics off in places, and so freeing up the RDF syntax to be a kind of ur-LISP, seems like the simplest and most future-oriented way to proceed. Then the fact that one copy of an OWL expression does not RDF-entail the other is completely unimportant, because RDF entailment wouldn't be the tool that an OWL/RDF parser would use to figure this out.  (It would be a bit like two copies of the same LISP S-expression using different absolute addresses: so what?) 

BTW, the 'meaning' of this darkened RDF would be determined by the specifications associated with the URI of the property of the triple, in this vision of how RDF would work.   Except that lists would 'belong' to the spec that owns the property which applies to their first member, so that OWL would have semantic jurisdiction over anything in any list that is the value of any OWL property, and so on, and similarly for RIF. (I haven't checked the RIF syntax in detail, but I presume this would work out similarly?)

Pat


> 
>   -- Sandro
> 
> 
>> Pat
>> 
>> PS. other comments added in-line below.
>> 
>> 
>> On Mar 24, 2011, at 8:59 AM, Michael Schneider wrote:
>> 
>>> Hi all!
>>> 
>>> Consider this: If you treat blank nodes in the way currently specified in the RDF spec, that is, as *locally scoped* to their containing graph, then it makes a clear difference whether their semantics is that of existential variables or that of (skolem) constants when it comes to logical consequence and, hence, to reasoning. 
>>> 
>>> For example, given the following two graphs:
>>> 
>>>  G1 = { 
>>>      ex:s1 ex:p1 ex:o1 . 
>>>      ex:s2 ex:p2 _:x . 
>>>   }
>>> 
>>>  G2 = { 
>>>      ex:s2 ex:p2 _:x . 
>>>   }
>>> 
>>> Under current RDF simple entailment with existential semantics and local scope for blank nodes, G1 obviously entails G2. But if you modify RDF simple entailment to interpret blank nodes as /constants/, while still keeping them /local/ to their graphs, then this becomes a /non/-entailment.
>> 
>> But nobody has suggested that particular combination.
>> 
>>> The reason is that, on the one hand, now being constants, the two occurrences of the name "_:x" in the two graphs each denote some individual in the universe of discourse but, on the other hand, since the "_:x" constants are local to their respective graphs, there are interpretations under which they denote /different/ individuals. Just as different names within the same graph may denote different individuals. In fact, if you merged G1 and G2, the blank nodes would need to be renamed (that's essentially what locality of blank nodes is all about!), leading to (modulo blank node identifiers):
>>> 
>>>   G12 = { 
>>>       ex:s1 ex:p1 ex:o1 . 
>>>       ex:s2 ex:p2 _:y . 
>>>       ex:s2 ex:p2 _:z . 
>>>   }
>>> 
>>> For comparison, under (current) existential semantics of blank nodes in RDF simple entailment, the merged graph G12 semantically implies both G1 and G2. In fact, G12 is even semantically equivalent to G1, i.e., G12 contains redundant triples. However, this would no longer be the case when blank nodes are seen as /local constants/. In this case, G12 would be free of redundancy (all constants "ex:o1", "_:y" and "_:z" can be interpreted pairwise differently), and from this it becomes clear that G1 cannot imply G12. Further (and probably more surprisingly), neither G1 nor G2 semantically follows from G12, although G12 has been created by merging the two original graphs. The reason is basically the same as for why G2 does not follow from G1 (although there is no sharing of blank node identifiers between G12 and G1 as there was between G1 and G2, but this doesn't make a difference under the local scope assumption). 
>>> 
>>> So, under local constant view, the three graphs are semantically largely unrelated, while under local existential view, they are largely related. I'd call this a sensible difference!
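
The comparison above can be checked mechanically. The following Python sketch is only one way to model it: existential entailment is tested by brute force (look for a mapping of the conclusion's blank nodes into the premise), and the "local constant" reading is modelled, as an assumption of this sketch, by renaming each graph's blank nodes apart and then testing containment.

```python
from itertools import product

def is_blank(term):
    return term.startswith("_:")

def terms(graph):
    """All subjects and objects occurring in the graph."""
    return {t for (s, _p, o) in graph for t in (s, o)}

def entails_existential(g, h):
    """Simple entailment with blank nodes as existential variables:
    g entails h if some mapping of h's blank nodes to terms of g
    turns h into a subset of g."""
    blanks = sorted(t for t in terms(h) if is_blank(t))
    candidates = sorted(terms(g))
    for choice in product(candidates, repeat=len(blanks)):
        m = dict(zip(blanks, choice))
        if {(m.get(s, s), p, m.get(o, o)) for (s, p, o) in h} <= g:
            return True
    return False

def entails_local_constants(g, h):
    """Hypothetical 'local constant' reading: each graph's blank nodes are
    private constants, so they are renamed apart before comparing."""
    def localize(graph, tag):
        return {tuple(tag + t if is_blank(t) else t for t in triple)
                for triple in graph}
    return localize(h, "h") <= localize(g, "g")

G1 = {("ex:s1", "ex:p1", "ex:o1"), ("ex:s2", "ex:p2", "_:x")}
G2 = {("ex:s2", "ex:p2", "_:x")}
G12 = {("ex:s1", "ex:p1", "ex:o1"),
       ("ex:s2", "ex:p2", "_:y"),
       ("ex:s2", "ex:p2", "_:z")}

existential = {
    "G1=>G2": entails_existential(G1, G2),
    "G12=>G1": entails_existential(G12, G1),
    "G12=>G2": entails_existential(G12, G2),
    "G1=>G12": entails_existential(G1, G12),
}
local_constants = {
    "G1=>G2": entails_local_constants(G1, G2),
    "G12=>G1": entails_local_constants(G12, G1),
    "G12=>G2": entails_local_constants(G12, G2),
    "G1=>G12": entails_local_constants(G1, G12),
}
```

Under these assumptions every entry of `existential` is true and every entry of `local_constants` is false, matching the "largely related" versus "largely unrelated" contrast. The brute-force search is exponential in the number of blank nodes and is meant for toy graphs only.
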
>>> 
>>> Not only for reasoners would such a change from an existential to a constant view have considerable consequences (a reasoner that innocently infers G2 from G1 would be unsound ("broken") w.r.t. the changed RDF semantics, which would probably hit most if not all existing RDF(S) reasoners). Also SPARQL would be affected, including SPARQL 1.1. For example, the current Working Draft of SPARQL 1.1 has a nice example on blank nodes in query results:
>>> 
>>>   <http://www.w3.org/TR/2010/WD-sparql11-query-20101014/#BlankNodesInResults>
>>> 
>>> The given result sets and the discussion in the cited section are only really justified under the assumption of blank nodes having /local scope/ and /existential/ semantics. If one switched to /globally/ scoped /constants/ (as is the case for URIs), then the query result should consist of the original blank node names "_:a" and "_:b" and no others, which clearly conflicts with the result sets and the discussion in the cited section.
>> 
>> Indeed. But look how much effort is expended to explain carefully how local bnode identifiers don't act like global names, and now add in the amount of confusion and implementation difficulty this causes. Would it not be better if this simply were eliminated? That query would still *work* if the RDF used anonymous URIs. Any patterns of node identity in the queried data would still be visible in the query results (which is really all that matters here). Everything would work perfectly, in fact, without needing this explanation. So yes, the SPARQL documents would need a little editing, but this would consist chiefly of deleting unnecessary material. 
>> 
>>> And if one switched to /locally/ scoped /constants/, then the result set should be empty, following the explanation I gave above - again much different from the cited section.
>>> 
>>> So, if the RDF WG intends to make any changes to the RDF spec concerning the syntactic and semantic properties of blank nodes, then it should also consider notifying the SPARQL WG, so that they can update their current working drafts accordingly. Of course, this would require a major update that breaks backwards-compatibility with SPARQL 1.0. It would also have a strong effect on the new SPARQL 1.1 entailment regimes (http://www.w3.org/TR/sparql11-entailment/), at least for the entailment regimes based on RDF, RDFS and the OWL 2 RDF-Based Semantics, since these are all defined with respect to the original model theories defined in the current RDF Semantics spec, or with respect to the OWL 2 RDF-Based Semantics spec (http://www.w3.org/TR/owl2-rdf-based-semantics/), which itself is based on the current RDF Semantics spec, i.e., they all depend on existential blank node semantics. 
>>> 
>>> And, as we mention OWL 2, this standard should then perhaps also be revised (or at least its future successors should be changed in a backwards-incompatible way in order to conform to the changed RDF spec). This would have particularly strong consequences for OWL 2 Full (which uses the mentioned OWL 2 RDF-Based Semantics as its semantics), as this language is fully based on the current RDF Semantics spec and additionally includes definitions that heavily assume that blank nodes are seen as existentially quantified variables. This was even more strongly the case for OWL 1 Full, but still is the case for OWL 2 Full (ask, if you are interested in further explanation). But also OWL 2 DL, even though its semantics is /not/ based on the RDF Semantics, still has a notion of "anonymous individuals", which are represented by blank nodes in the RDF mapping of OWL 2, and which happen to be interpreted as existentially quantified variables as well. So, should OWL 2 DL also be changed, or would a further drifting apart of RDF and OWL DL be ok for everybody?
>>> 
>>> And let's also not forget RIF, at least the specification of RIF-RDF combinations in <http://www.w3.org/TR/2010/REC-rif-rdf-owl-20100622/>. The definition of "satisfaction" of a RIF-RDF combination by a "common-RIF-RDF interpretation"  reuses, for the RDF part, the specification of "RDF satisfaction" as provided by the current RDF semantics specification - that is, it makes use of the existential semantics for blank nodes occurring in the RDF graphs in a RIF-RDF combination.
>>> 
>>> And also let's not forget about all the books and papers that have been written on the topic, software that has been created, projects, conferences, companies... 
>>> 
>>> It appears to me that a little change in the semantics of blank nodes would go a long way... :->
>>> 
>>> Cheers,
>>> Michael
>>> 
>>> ________________________________________
>>> From: semantic-web-request@w3.org [semantic-web-request@w3.org] on behalf of Graham Klyne [GK-lists@ninebynine.org]
>>> Sent: Thursday, 24 March 2011 10:08
>>> To: Dieter Fensel
>>> Cc: Enrico Franconi; Pat Hayes; Hugh Glaser; Mark Wallace; Alan Ruttenberg; Reto Bachmann-Gmuer; Ivan Shmakov; <semantic-web@w3.org>
>>> Subject: Re: {Disarmed} Re: blank nodes (once again)
>>> 
>>> FWIW, my recollection of the working group discussions followed a similar path:
>>> that bNodes don't fundamentally add expressive power when making assertions
>>> about the world.  I.e. that Skolemization achieves the same effect.  I think it
>>> was mainly the convenience (maybe not for logicians!) argument that carried the day.
>>> 
>>> But I do recall some discussion also about the use of RDF expressions as
>>> patterns, a kind of query, in which their logical interpretation might vary.  If
>>> that viewpoint once had any merit, I suspect it has been rather overtaken by the
>>> subsequent standardization of SPARQL.
>>> 
>>> I know that I find bNodes convenient when constructing RDF, but also I have
>>> found them problematic when implementing inference machinery (by reason of
>>> unclear intermediate scope boundaries).  One implementation strategy I'd probably
>>> use in future is to replace all bNodes internally by some form of unique
>>> identifier (maybe a UUID URI), then map back to bNode when serializing a graph.
>>> 
>>> So, yes, it is then just a syntactic convenience.  But not one I'd necessarily
>>> choose to forego.
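
A minimal Python sketch of that implementation strategy, assuming triples are tuples of strings and bNodes are written `_:label`. The function names `internalize` and `externalize` are invented for the example; `uuid.uuid4().urn` from the standard library yields a `urn:uuid:` identifier.

```python
import uuid

def internalize(triples):
    """Swap each bNode label for a urn:uuid: identifier. Returns the
    rewritten triples plus the table needed to map back later."""
    to_uri = {}

    def swap(term):
        if term.startswith("_:"):
            if term not in to_uri:
                to_uri[term] = uuid.uuid4().urn
            return to_uri[term]
        return term

    return [(swap(s), p, swap(o)) for s, p, o in triples], to_uri

def externalize(triples, to_uri):
    """Map the internal identifiers back to bNode labels for serialization."""
    to_label = {uri: label for label, uri in to_uri.items()}
    return [tuple(to_label.get(t, t) for t in triple) for triple in triples]

graph = [
    ("ex:alice", "ex:knows", "_:someone"),
    ("_:someone", "ex:name", '"Bob"'),
]

internal, table = internalize(graph)      # inference machinery sees only URIs
serialized = externalize(internal, table) # bNodes reappear on the way out
```

The inference code then never has to reason about bNode scope, and the round trip returns the graph as it was written.
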
>>> 
>>> #g
>>> --
>>> 
>>> Dieter Fensel wrote:
>>>> Dear all,
>>>> 
>>>> I am not sure it is useful to add another comment and I also
>>>> only partially understand the contents of the flow of emails
>>>> on this issue. However, I will try, at the risk of looking like a fool.
>>>> 
>>>> 1) bnodes are a trick to avoid thinking about useful names
>>>> in situations where you do not really care about them,
>>>> and are used, e.g., in implementing lists in RDF. Obviously
>>>> they are not really needed but make life easier.
>>>> 
>>>> 2) Logicians entered the place and started to interpret them as
>>>> existentially quantified variables. This is not wrong (since they
>>>> are statements about something that exists and has a certain
>>>> property); however, it is a somewhat heavy way to interpret a
>>>> simple syntactic shortcut.
>>>> 
>>>> I do not think that RDF wants to forbid interpreting them as names;
>>>> it is only that one does not care about the specific name. Maybe a
>>>> straightforward way is to think about them as unique constants, i.e.,
>>>> to use the idea of skolemization. I think this is also in line with a
>>>> proposal of Pat, a down-sized version of the Jos & Enrico paper, and
>>>> in sync with [1].
>>>> 
>>>> Alternatively one may simply recommend not using them (or
>>>> reading these thousand emails before using them).
>>>> 
>>>> Obviously, I may have missed the point, I may violate the charter, and I
>>>> should read 1000 emails more carefully.  Btw, I do not think that the
>>>> discussion is uninteresting, but it obviously indicates a problem.
>>>> 
>>>> [1] G. Yang and M. Kifer: Reasoning about Anonymous Resources
>>>> and Meta Statements on the Semantic Web, J. Data Semantics, 2003: 69-97.
>>>> 
>>>> 
>>>> 
>>>> At 21:33 20.03.2011, Enrico Franconi wrote:
>>>> 
>>>>> On 18 Mar 2011, at 22:14, Pat Hayes wrote:
>>>>> 
>>>>>> As a fallback, I am thinking of writing up a spec-like document
>>>>> defining 'ground RDF', to show how much simpler everything is when you
>>>>> don't have them. It would cover RDF, RDFS, OWL and SPARQL. What do you
>>>>> think?
>>>>> 
>>>>> In [1] we have formally explored this case.
>>>>> --e.
>>>>> 
>>>>> [1] Jos de Bruijn, Enrico Franconi, Sergio Tessaris (2005). Logical
>>>>> Reconstruction of normative RDF. Proc. of the Workshop on OWL
>>>>> Experiences and Directions (OWLED 2005), Galway, Ireland, November
>>>>> 2005. <http://www.inf.unibz.it/~franconi/papers/owled-05.pdf>
>>>> 
>>> 
>>> --
>>> Dipl.-Inform. Michael Schneider
>>> Research Scientist, Information Process Engineering (IPE)
>>> Tel  : +49-721-9654-726
>>> Fax  : +49-721-9654-727
>>> Email: michael.schneider@fzi.de
>>> WWW  : http://www.fzi.de/michael.schneider
>>> ==============================================================================
>>> FZI Forschungszentrum Informatik an der Universität Karlsruhe
>>> Haid-und-Neu-Str. 10-14, D-76131 Karlsruhe
>>> Tel.: +49-721-9654-0, Fax: +49-721-9654-959
>>> Stiftung des bürgerlichen Rechts
>>> Stiftung Az: 14-0563.1 Regierungspräsidium Karlsruhe
>>> Vorstand: Dipl. Wi.-Ing. Michael Flor, Prof. Dr. rer. nat. Ralf Reussner,
>>> Prof. Dr. rer. nat. Dr. h.c. Wolffried Stucky, Prof. Dr. rer. nat. Rudi Studer
>>> Vorsitzender des Kuratoriums: Ministerialdirigent Günther Leßnerkraus
>>> ==============================================================================
>>> 
>> 
>> ------------------------------------------------------------
>> IHMC                                     (850)434 8903 or (650)494 3973   
>> 40 South Alcaniz St.           (850)202 4416   office
>> Pensacola                            (850)202 4440   fax
>> FL 32502                              (850)291 0667   mobile
>> phayesAT-SIGNihmc.us       http://www.ihmc.us/users/phayes
>> 
>> 
>> 
>> 
>> 
>> 
>> 
> 
> 
> 


Received on Thursday, 24 March 2011 17:15:21 UTC