- From: Jeremy Carroll <jjc@hplb.hpl.hp.com>
- Date: Mon, 8 Oct 2001 12:27:26 +0100
- To: <w3c-rdfcore-wg@w3.org>
This message is a bit muddled, sorry in advance. The main points are: + rdf entailment is NP complete + finding a canonical graph from a given RDF graph is NP complete + equivalence of RDF graphs is NP complete ======================== We have this thread about whether a graph is a set or a bag. http://lists.w3.org/Archives/Public/w3c-rdfcore-wg/2001Oct/thread.html#61 I fear it is the tip of an iceberg. My understanding is that we have: two concrete syntaxes: RDF/XML and N-triple one abstract syntax: a graph (see MT). one model: see Pat's MT ('one' in "one model" means "one model theoretic articulation of RDF"). So the "graphs are sets?!" thread was triggered by my observation that an ntriple file with two identical triples A: <uri> <pred> <uri2> . <uri> <pred> <uri2> . was equivalent to one with only one: B: <uri> <pred> <uri2> . This was intended as a harmless sort of comment :-(. It was motivated by the graph abstract syntax. Under the model theory, we would say that two graphs are equivalent if they mutually entail one another. This would mean - wait for it Aaron :) - that C: <uri> <pred> <uri2> . <uri> <pred> _:bnode . was another version of the same RDF model (if not the same RDF graph!). From a model theoretic point of view A B and C are all equivalent. This follows from the bnode being taken as an existentially qualified variable. I assume that it would, in the fullness of time, be desirable to determine a canonical RDF graph for any particular equivalence class of RDF graphs, for instance, the equivalence class containing graphs corresponding to A, B, and C would have canonical form B. This paragraph refers to the graph not to a serialization of the graph. The canonicalization of the serialization of a graph is known to be graph isomorphism complete. If we consider an extension of C being: D: <uri> <pred> <uri2> . <uri> <pred> _:bnode . _:bnode <pred> _:bnode2 . then we can no longer ignore the second triple, since it is saying something more substantive than the second triple in C. ================= Observations: rdf entailment as stated in the model theory requires in the worst case an NP complete computation (proof: an arbitrary subgraph isomorphism can be encoded in rdf entailment - but not in RDF/XML). rdf graph canonicalization requires an NP complete computation. This is intended to mean solely at the level of the graph the abstract syntax, before we serialize the graph itself. Canonical graph serialization is graph isomorphism complete. two rdf graphs are equivalent if they entail one another. this also is NP complete - i.e. it is (thought to be) worse than graph isomorphism. ============== I am unclear as to whether these are theoretical difficulties that can safely be ignored; or real practical difficulties that should be addressed by fundamental changes to the model theory. Jeremy
Received on Monday, 8 October 2001 07:27:48 UTC