- From: Stefan Kokkelink <skokkeli@mathematik.uni-osnabrueck.de>
- Date: Wed, 18 Jul 2001 11:00:17 +0200
- To: Jeremy Carroll <jjc@hplb.hpl.hp.com>
- CC: www-rdf-interest@w3.org
Hi Jeremy,

I like your approach of using standard algorithms from graph theory. I will read the paper in detail and come back to it later.

I would like RDFCore to recognize this approach. It shows how one can leverage existing graph theory. However, there is no formal definition of RDF graphs in the specification. (I made a quick shot at one in [1].)

From section 5:

"This specification shows three representations of the data model; as 3-tuples (triples), as a graph, and in XML. These representations have equivalent meaning."

That means we have four things:

1) data model
2) triples
3) graphs
4) XML

As mentioned before [2], the data model is *not* just a set of triples. Triples are just one representation of the data model.

There are two basic problems:

1. Only one of these representations is formally defined: XML.
2. What does "These representations have equivalent meaning." really mean?

My personal view on both:

1. All of these representations should be formally defined in the RDF specification. I think one should use NTriples to formally define 'triples'. But one should also formally define RDF graphs! I would like to offer help here.

2. There should be explicitly given mappings (in the mathematical sense) between the representations. (Currently, there is only one: from XML to triples.) The sentence "These representations have equivalent meaning." should be changed to "There are well-defined mappings between the representations".

RDFCore must decide whether these representations should really be "equivalent" in the sense that every term in one representation must be expressible in all others.

If yes, then the data model is redundant and can be omitted. It would be implicitly given by the mappings, which would be bijections in this case.

If no, it should be explicitly stated which terms of the data model can be expressed in a given representation.

Example:

A resource is part of the data model, but can't be expressed in the triple representation. A resource can be expressed in XML:

<rdf:Description about="URI"/>

and in the graph representation.

A literal is part of the data model, but can't be expressed in XML (or in the triple representation). A literal can be expressed in a graph.

Regards,
Stefan

[1] http://lists.w3.org/Archives/Public/www-rdf-interest/2001Jun/0008.html
[2] http://lists.w3.org/Archives/Public/www-rdf-interest/2001Jul/0028.html

Jeremy Carroll wrote:
>
> One of the improvements in Jena-1-1-0
> http://www-uk.hpl.hp.com/people/bwm/rdf/jena/
> is a matching algorithm that can tell if two models are the same.
>
> The algorithm aligns the anonymous resources; so that two files, identical
> except for the order of statements will compare equal.
>
> I've written up the algorithm used, the first draft is available at:
>
> http://www-uk.hpl.hp.com/people/jjc/tmp/matching.pdf
>
> It's based on a standard algorithm from graph theory.
>
> It could also be useful for deeper notions of equivalence (e.g. after we
> have decided that certain pairs of URI's actually refer to the same
> resource).
>
> Any feedback, including stuff like typos and spelling errors, as well as
> more profound comments, would be welcome. I plan to take the doc to a second
> final version in three weeks time, when I will post a technical report
> number and a non-transitory URL.
>
> enjoy
>
> Jeremy Carroll
> HP Labs
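
To make the matching problem in the quoted message concrete, here is a rough sketch of what "the same up to anonymous resources" could mean for two sets of triples. It is not the algorithm from Jeremy's paper and not Jena's code: it is a brute-force search over all renamings, which is only usable for a handful of anonymous nodes. The function names (same_model, blank_nodes, is_blank), the "_:" prefix for anonymous resources, and the "ex:" names are all assumptions made for illustration, and literals and URIs are flattened to plain strings.

```python
from itertools import permutations


def is_blank(term):
    # Assumption: anonymous resources are written with a "_:" prefix.
    return term.startswith("_:")


def blank_nodes(triples):
    # Collect the anonymous resources that occur anywhere in the triples.
    return sorted({t for triple in triples for t in triple if is_blank(t)})


def same_model(a, b):
    """Return True if the two triple sets are equal up to a one-to-one
    renaming of anonymous resources. Statement order is irrelevant
    because both inputs are treated as sets."""
    a, b = set(a), set(b)
    if len(a) != len(b):
        return False
    blanks_a, blanks_b = blank_nodes(a), blank_nodes(b)
    if len(blanks_a) != len(blanks_b):
        return False
    # Brute force: try every bijection between the two sets of blanks.
    for perm in permutations(blanks_b):
        renaming = dict(zip(blanks_a, perm))
        renamed = {tuple(renaming.get(t, t) for t in triple) for triple in a}
        if renamed == b:
            return True
    return False


# Two "files" with the same statements in a different order and with
# differently named anonymous resources.
m1 = [("_:x", "ex:name", "Jeremy"), ("ex:doc", "ex:author", "_:x")]
m2 = [("ex:doc", "ex:author", "_:b1"), ("_:b1", "ex:name", "Jeremy")]
# Same shape of statements, but the author and the named thing differ.
m3 = [("ex:doc", "ex:author", "_:b1"), ("_:b2", "ex:name", "Jeremy")]

print(same_model(m1, m2))
print(same_model(m1, m3))
```

The search is factorial in the number of anonymous resources, which is exactly why an approach based on established graph-theoretic algorithms is attractive for real models.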
Received on Wednesday, 18 July 2001 05:15:15 UTC