Re: blank nodes (once again) from Michael Schneider on 2011-03-25 (semantic-web@w3.org from March 2011)

From: Michael Schneider <schneid@fzi.de>
Date: Fri, 25 Mar 2011 00:00:52 +0000
To: Sandro Hawke <sandro@w3.org>, Pat Hayes <phayes@ihmc.us>
CC: Graham Klyne <GK-lists@ninebynine.org>, Dieter Fensel <dieter.fensel@sti2.at>, Enrico Franconi <franconi@inf.unibz.it>, Hugh Glaser <hg@ecs.soton.ac.uk>, Mark Wallace <mwallace@modusoperandi.com>, Alan Ruttenberg <alanruttenberg@gmail.com>, Reto Bachmann-Gmuer <reto.bachmann@trialox.org>, Ivan Shmakov <oneingray@gmail.com>, Ivan Shmakov <ivan@main.uusia.org>, "<semantic-web@w3.org>" <semantic-web@w3.org>
Message-ID: <D951F012D98783438CFE1553243F2D4F021F9D@ex-ms-1a.fzi.de>

Hi Sandro!

Sandro Hawke wrote:

>In particular, I think the system which first exposes the RDF content on
>the Web should be the one which Skolemizes it, 

(Oh Dear...)

>since it knows what URL
>prefix to use.   (If there isn't one such system, then Skolemizing is a
>problem.)   This system has the interesting challenge of minimizing
>changes if/when it re-reads modified content destined for the same URL.
>That's the most interesting problem in this space, to me....
>
>To rephrase that problem: given similar RDF graphs G1 and G2, and a
>labeling of the blank nodes in G1 to produce G1', how do you produce a
>labeling of the blank nodes in G2, G2', such that the differences
>between G1' and G2' are as small as the differences between G1 and G2?

I don't understand this. What do you mean by "similar RDF graphs"? And what exactly is G1'? A skolemized version of G1? And how do you want to measure the "differences between G1' and G2'"? That all sounds, erm, slightly vague to me...

In the easiest case, "similar" means semantically equivalent. If both G1 and G2 contain existentially interpreted blank nodes, and all of them have been replaced by skolem constants in G1' and G2', then most likely G1' and G2' will /not/ be equivalent (aka "similar") anymore. How "much" they "differ" depends entirely on the form of the original graphs G1 and G2 (maybe one can compare G1' and G2' via a pair of maximal subgraphs that are semantically equivalent, or whatever; I still don't know what "difference" means here). It does not depend on some "specially optimized" form of skolemization. In any case, skolemization means that you replace existentially quantified variables by fresh constants, i.e. constants not used anywhere else so far. How these constants are labeled is entirely irrelevant.

>In practice, imagine I have a hand authored page of turtle with maybe
>150 triples, much of it lists.  I click "publish" and it gets Skolemized
>and published at URL U.  Then I change my mind about something, make a
>tiny edit, click "re-publish" and it gets Skolemized again, and the new
>version gets published at U.   If someone is watching U, I want them to
>see that only a little change was made.  A naive (uuid) Skolemization
>would make the change look huge, as every blank node got an entirely new
>label.

There is nothing "naive" about this approach. If you skolemize the same RDF graph twice and compare the two resulting graphs, then by the very meaning of skolemization all skolem constants introduced in the first graph are different from all skolem constants introduced in the second graph. So, yes, the change will look huge, and the more blank nodes the original graph contains, the bigger it will look. Of course, the original graph /structure/ will be retained in a sense.
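To make this concrete, here is a minimal sketch in Python. It assumes a graph is just a set of (subject, predicate, object) string triples with blank nodes written as "_:label"; the `skolemize` helper and the `http://example.org/genid/` prefix are made up for illustration and are not taken from any spec or library.

```python
import uuid

# Hypothetical prefix for skolem constants; any prefix you control would do.
PREFIX = "http://example.org/genid/"

def is_blank(term):
    return term.startswith("_:")

def skolemize(graph):
    """Replace every blank node in `graph` by a fresh constant.

    The same blank node gets the same constant within one call,
    but every call mints entirely new constants.
    """
    fresh = {}

    def sk(term):
        if not is_blank(term):
            return term
        if term not in fresh:
            fresh[term] = PREFIX + uuid.uuid4().hex
        return fresh[term]

    # Blank nodes can only occur as subject or object of a triple.
    return {(sk(s), p, sk(o)) for (s, p, o) in graph}

def skolem_constants(graph):
    return {t for (s, _, o) in graph for t in (s, o) if t.startswith(PREFIX)}

graph = {
    ("_:a", "ex:knows", "_:b"),
    ("_:b", "ex:name", '"Bob"'),
    ("ex:alice", "ex:name", '"Alice"'),
}

first = skolemize(graph)
second = skolemize(graph)

# Only the triple without blank nodes survives unchanged in both versions.
unchanged = first & second
# The two runs share no skolem constants at all.
shared = skolem_constants(first) & skolem_constants(second)
```

Every triple that mentioned a blank node shows up as "changed" between `first` and `second`, although both have the same shape as the original graph.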

Question: Do you believe that you can simply skolemize the existentially quantified variables in a logical formula and /everything/ will be fine afterwards? Whether skolemization is "harmless", "mostly harmless", or a very bad idea depends heavily on what one plans to do with the graphs. For example, if you are interested in whether a given RDF graph (under some given entailment regime) is semantically consistent or not, then you can safely skolemize the graph, since skolemization preserves satisfiability: the skolemized graph is equisatisfiable with the original. But skolemization does /not/ lead to logically equivalent graphs. So, if you want to know whether a graph G1 entails G2 or not, then you really shouldn't ask the same question for the skolemized versions G1' and G2', since with good probability an entailment that holds between G1 and G2 won't hold anymore between G1' and G2'. With your above proposal that "the system which first exposes the RDF content on the Web should be the one which Skolemizes it", entailment checking (inferencing) on the Web would be pretty much doomed.
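A small sketch of the entailment point, again in Python and again under assumptions: graphs are sets of string triples, blank nodes are written "_:label", and only simple entailment is checked (G1 simply entails G2 if some instance of G2 is a subgraph of G1). The brute-force `simply_entails` function and the constants `ex:sk1` and `ex:sk2` are hypothetical, chosen by hand to stand in for freshly minted skolem constants.

```python
from itertools import product

def is_blank(term):
    return term.startswith("_:")

def simply_entails(g1, g2):
    """Brute-force simple entailment check for tiny graphs.

    Returns True if the blank nodes of g2 can be mapped to terms of g1
    such that every mapped triple of g2 occurs in g1.
    """
    bnodes = sorted({t for (s, _, o) in g2 for t in (s, o) if is_blank(t)})
    terms = sorted({t for (s, _, o) in g1 for t in (s, o)})
    for choice in product(terms, repeat=len(bnodes)):
        m = dict(zip(bnodes, choice))
        if all((m.get(s, s), p, m.get(o, o)) in g1 for (s, p, o) in g2):
            return True
    return False

# "Somebody knows Bob", stated twice with different blank node labels.
g1 = {("_:x", "ex:knows", "ex:bob")}
g2 = {("_:y", "ex:knows", "ex:bob")}

# Skolemized versions; sk1 and sk2 stand for fresh, distinct constants.
g1_sk = {("ex:sk1", "ex:knows", "ex:bob")}
g2_sk = {("ex:sk2", "ex:knows", "ex:bob")}

before = simply_entails(g1, g2)         # entailment between the originals
after = simply_entails(g1_sk, g2_sk)    # same question after skolemization
one_way = simply_entails(g1_sk, g2)     # skolemized graph vs. original
```

Here `before` holds and `after` does not: G1' says nothing about `ex:sk2`. The skolemized graph still entails the original (`one_way`), but the converse entailment is lost, which is exactly why the two are not logically equivalent.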

Cheers,
Michael

--
Dipl.-Inform. Michael Schneider
Research Scientist, Information Process Engineering (IPE)
Tel  : +49-721-9654-726
Fax  : +49-721-9654-727
Email: michael.schneider@fzi.de
WWW  : http://www.fzi.de/michael.schneider

Received on Friday, 25 March 2011 00:01:29 UTC