RE: A proposal for entailment tests

Pat or Jeremy wrote:
> >
> > >
> > >>  One note: the current "equality" tests are computational
> less expensive than
> > >>  an rdf entailment test. Technically it is the difference
> between Graph
> > >>  Isomorphism and SubGraph Isomorphism. GI is thought to be
> strictly between P
> > >>  and NP where SubGraph Iso is known to be NP.
>

Sergey:
> The requirement to check for graph isomorphism in order to determine
> equality sounds just horrible! A simple implementation (which, due to
> lazyness may be favored by many developers) might require O(N!)
> operations in the number of bNodes. This is prohibitive even for N=15
> bNodes in the two graphs.

In an HP internal version of Jena we did have such a simple implementation:
the test case that got me writing a better version had N=12.

Sergey:
> 2. As far as I remember, RDF/XML serialization does not allow cycles
> made entirely of bNodes. In this case, only tree isomorphism is
> required, which is logspace (easier than polynomial). Would this be an
> option?
>

I think this is correct ....
I think this is an observation about the tests, and not part of the
normative specification.

RDF/XML serialization does allow a forest (i.e. many trees) but they are
well-behaved in terms of their bNode production. Ignoring reification and
distributed referents, each bNode is the object of at most one triple, and
there are no cycles consisting entirely of bNodes. For checking RDF/XML
parsers it is OK to have an equality checker that makes use of such details.
My own view is that an adequately written (sub)graph isomorphism should run
fast enough on good data (such as waht we would expect from RDF/XML), while
avoiding some of the catastrophic holes on data that has come from some
other source. (Such as the pathological test cases that I use!).


Jeremy

Received on Monday, 8 October 2001 07:21:08 UTC