- From: Jeremy Carroll <jjc@hplb.hpl.hp.com>
- Date: Mon, 8 Oct 2001 12:06:47 +0100
- To: <w3c-rdfcore-wg@w3.org>
Pat or Jeremy wrote: > > > > > > > >> One note: the current "equality" tests are computational > less expensive than > > >> an rdf entailment test. Technically it is the difference > between Graph > > >> Isomorphism and SubGraph Isomorphism. GI is thought to be > strictly between P > > >> and NP where SubGraph Iso is known to be NP. > Sergey: > The requirement to check for graph isomorphism in order to determine > equality sounds just horrible! A simple implementation (which, due to > lazyness may be favored by many developers) might require O(N!) > operations in the number of bNodes. This is prohibitive even for N=15 > bNodes in the two graphs. In an HP internal version of Jena we did have such a simple implementation: the test case that got me writing a better version had N=12. Sergey: > 2. As far as I remember, RDF/XML serialization does not allow cycles > made entirely of bNodes. In this case, only tree isomorphism is > required, which is logspace (easier than polynomial). Would this be an > option? > I think this is correct .... I think this is an observation about the tests, and not part of the normative specification. RDF/XML serialization does allow a forest (i.e. many trees) but they are well-behaved in terms of their bNode production. Ignoring reification and distributed referents, each bNode is the object of at most one triple, and there are no cycles consisting entirely of bNodes. For checking RDF/XML parsers it is OK to have an equality checker that makes use of such details. My own view is that an adequately written (sub)graph isomorphism should run fast enough on good data (such as waht we would expect from RDF/XML), while avoiding some of the catastrophic holes on data that has come from some other source. (Such as the pathological test cases that I use!). Jeremy
Received on Monday, 8 October 2001 07:21:08 UTC