# Re:Matching RDF models + anon nodes

Date: Wed, 18 Jul 2001 11:00:17 +0200
Message-ID: <3B555021.804A9382@mathematik.uni-osnabrueck.de>
To: Jeremy Carroll <jjc@hplb.hpl.hp.com>

```
Hi Jeremy,

I like your approach using standard algorithms from
graph theory. I will read the paper in detail and
come back to that later.

I would like RDFCore to recognize this approach.
It shows how one can leverage existing graph
theory.

However, there is no formal definition of RDF graphs
in the specification. (I made a quick shot at [1]).
From section 5:

"This specification shows three representations
of the data model; as 3-tuples (triples), as a graph,
and in XML. These representations have equivalent
meaning."

That means we have four things:
1) data model
2) triples
3) graphs
4) XML

As mentioned before [2] the data model is *not* just
a set of triples. Triples are just one representation
of the data model.

There are two basic problems:

1. Only one of these representations is formally
defined: XML.

2. What does "These representations have equivalent
meaning." really mean?

My personal view on both:

1. All of these representations should be formally
defined in the RDF specification. I think one should
use NTriples to formally define 'triples'. But one should
also formally define RDF graphs! I would like to offer
help here.

2. There should be explicitly given mappings (in the mathematical
sense) between the representations. (Currently, there is only
one: from XML to triples.) The sentence "These representations
have equivalent meaning." should be changed to "There are
well-defined mappings between the representations".

RDFCore must decide if these representations should really be
"equivalent" in the sense that every term in one representation
must be expressible in all others. If yes, then the data model
is redundant and can be omitted. It would be implicitly given by
the mappings, which would be bijections in this case. If no, it
should be explicitly mentioned which terms of the data model
can be expressed in a given representation.

Example: A resource is part of the data model, but can't
be expressed in the triple representation. A resource can
be expressed in XML:
and in the graph representation. A literal is part of
the data model, but can't be expressed in XML (and in the
triple representation). A literal can be expressed in a
graph.

Regards,
Stefan

[1]
http://lists.w3.org/Archives/Public/www-rdf-interest/2001Jun/0008.html
[2]
http://lists.w3.org/Archives/Public/www-rdf-interest/2001Jul/0028.html

Jeremy Carroll wrote:
>
> One of the improvements in Jena-1-1-0
> http://www-uk.hpl.hp.com/people/bwm/rdf/jena/
> is a matching algorithm that can tell if two models are the same.
>
> The algorithm aligns the anonymous resources; so that two files, identical
> except for the order of statements will compare equal.
>
> I've written up the algorithm used, the first draft is available at:
>
> http://www-uk.hpl.hp.com/people/jjc/tmp/matching.pdf
>
> It's based on a standard algorithm from graph theory.
>
> It could also be useful for deeper notions of equivalence (e.g. after we
> have decided that certain pairs of URI's actually refer to the same
> resource).
>
> Any feedback, including stuff like typos and spelling errors, as well as
> more profound comments, would be welcome. I plan to take the doc to a second
> final version in three weeks time, when I will post a technical report
> number and a non-transitory URL.
>
> enjoy
>
> Jeremy Carroll
> HP Labs
```
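To make the idea in the quoted message concrete, here is a minimal sketch of what "two models are the same up to anonymous resources" could mean. It is not the algorithm from the Jena paper: it simply tries every renaming of anonymous nodes, which is only practical for a handful of them. The triple encoding (plain string tuples), the `_:` prefix for anonymous nodes, the `ex:` property names, and the function names are all assumptions made for illustration.

```python
from itertools import permutations

def is_blank(term):
    # Assumption: anonymous nodes are written "_:label"; everything else is fixed.
    return term.startswith("_:")

def blank_nodes(triples):
    # Collect anonymous nodes appearing as subject or object, in a stable order.
    return sorted({t for s, _, o in triples for t in (s, o) if is_blank(t)})

def rename(triples, mapping):
    # Apply a renaming of anonymous nodes; other terms are left untouched.
    return {(mapping.get(s, s), p, mapping.get(o, o)) for s, p, o in triples}

def models_match(a, b):
    """Return True if some bijection between the anonymous nodes of a and b
    turns the triple set a into the triple set b (brute force, factorial time)."""
    a, b = set(a), set(b)
    blanks_a, blanks_b = blank_nodes(a), blank_nodes(b)
    if len(a) != len(b) or len(blanks_a) != len(blanks_b):
        return False
    for perm in permutations(blanks_b):
        if rename(a, dict(zip(blanks_a, perm))) == b:
            return True
    return False

# Same statements, different anonymous-node labels.
m1 = {("_:a", "ex:name", "Jeremy"), ("_:a", "ex:knows", "_:b"), ("_:b", "ex:name", "Stefan")}
m2 = {("_:y", "ex:knows", "_:x"), ("_:x", "ex:name", "Stefan"), ("_:y", "ex:name", "Jeremy")}
print(models_match(m1, m2))
```

Because the models are held as sets, statement order plays no role in the comparison; only the labelling of anonymous nodes has to be aligned. A practical implementation would replace the exhaustive search with a graph-isomorphism technique such as the one the quoted message refers to.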
Received on Wednesday, 18 July 2001 05:15:15 UTC
