- From: Giovanni Tummarello <giovanni@wup.it>
- Date: Mon, 30 May 2005 22:21:27 +0200
- To: semantic-web@w3.org, semanticweb@yahoogroups.com
********************
RDFContext Tools 0.1
http://www.dbin.org/RDFContextTools.php
27/5/2005

By Giovanni Tummarello, Christian Morbidoni
http://semedia.deit.univpm.it
Part of the DBin project http://www.dbin.org

INTRODUCTION
********************

This API gives a way to attach "context" information to pieces of an RDF model by adding triples to the model itself. This is similar to reification but at a different, coarser level.

These tools use the concept of MSG (Minimal Self-contained Graph) [1]. Given a triple, the MSG that contains it is composed of that triple plus, recursively, for each blank node involved, all the triples connected to it. An MSG therefore has a boundary consisting entirely of URIs or literals. An MSG is also the minimum "piece" of an RDF graph that can be transferred to another peer while still allowing the original graph to be incrementally reconstructed.

One of the nice properties of MSGs is that they uniquely partition an RDF graph, independently of which triple you begin decomposing the graph from.

This (actually very simple) theory allows a number of interesting "operations", most of which are directly supported by this API:

* Digital signatures on pieces of RDF graphs (MSGs), stored in the graph itself
  This is useful in scenarios where "bits" of information are requested remotely and it is desirable to merge them with existing RDF while retaining knowledge about who said what. This is even more useful in scenarios where information can be served by peers other than the original author, possibly "information collectors" or aggregators: signatures on an MSG will be verifiable independently of the pre-existing content of the graph the MSG is merged into.

* Address groups of statements without quoting them explicitly
  Given a hash function, MSGs can be deterministically hashed once properly canonicalized (an implementation of Carroll's canonicalization procedure is included in this library).
  This is useful for several purposes, among which are revocation without quotation, RDF-based challenge-response operations (e.g. will communicate only if already know ... ) and possibly efficient rsync-like synchronization of RDF.

* Generic "context" for pieces of the graph
  This is a generalization of the use of digital signatures on MSGs. The same node to which signatures are attached can in general be used to apply other contextual information such as authorship, date, color, temperature etc.

Issues:

1) Since this methodology uses reification as a way to attach the signature to the MSGs, it is subject to the issues typical of this standard RDF construct. In particular, care should be taken when using the proposed method with OWL Full reasoners, as the owl:sameAs property might cause substitutions inside MSGs. RDFS inference presents similar problems, as new triples resulting from schema entailments and involving blank nodes could be automatically added by the RDFS triple store (thus usually invalidating the signatures on the MSG).
Since RDFS reasoning is usually needed, in the DBin platform we differentiate between the repository where "raw" data is exchanged and those where reasoning happens. At the P2P level a "raw" repository is used, where MSGs are stored and served unchanged if requested. Based on the MSG signatures, contexts and specific local rules, the application will then decide whether to also merge the raw MSGs into the higher level repositories (e.g. those used for RDFS, OWL and/or rules reasoning).

2) By MSG definition and RDF Semantics, the structure of existing MSGs will not be affected by the insertion of new ones. While this property enables our RDF digital signature scheme, care must be taken when inserting the same MSG twice. Although RDF Semantics states that parts of the graph which have identical interpretations should not be duplicated (thanks J. Carroll for pointing this out!), existing toolkits will usually duplicate the MSG when inserting it twice.
The MSG class implements a hasSemanticsOf(MSG) method, which is useful to avoid this. A simple (but not the most efficient) example that prevents duplicate MSGs from being inserted into the triple store:

    // Assumes incomingMSGs is an MSG[] and i indexes the incoming MSG
    // currently being examined (the enclosing loop is not shown).
    Graph ourGraph = db.getGraph();
    RDFN ourRDFN = new RDFN(ourGraph, /* URI INSIDE THE INCOMING MSG */);
    MSG[] ourMSGs = ourRDFN.getComposingMSGs();
    boolean merge = true;
    for (int j = 0; j < ourMSGs.length; j++) {
        // Skip the merge if an equivalent MSG is already in our graph.
        if (incomingMSGs[i].hasSemanticsOf(ourMSGs[j])) {
            merge = false;
            break;
        }
    }
    if (merge) {
        db.addXMLRDF(incomingMSGs[i].getRDFXML());
    }

Note that this example uses the RDFN concept (all the MSGs surrounding a given URI).

3) Triple overhead.
Worst case: in the case of ground statements, the MSG "context node" is actually the reification node. This means that for each triple that is signed, at least 4 (reification, but could be done with 3) + 2 (certificate + hash) triples will be added.
Typical case: in DBin, a typical MSG is composed of more than 20 triples; this leads to an overhead of 25% to 30%.

For a better explanation of the definitions, properties and issues, please see:

[1] G. Tummarello, C. Morbidoni, P. Puliti, F. Piazza, "RDF signing supporting resource centric requests", Proceedings of the Poster track, ESWC 2005.
http://semedia.deit.univpm.it/submissions/ESWC2005_Poster/ESWC2005_signignRDF.pdf

RELEASE NOTES
********************

This is release 0.1; the code is to be considered beta and/or experimental. It is used inside DBin where, so far, it appears to be doing a good job.

RDFContext Tools requires the following libraries in the classpath:

Jena Framework and related:
  commons-logging.jar
  icu4j.jar
  jakarta-oro-2.0.5.jar
  jena.jar
  xercesImpl.jar
Log4j:
  log4j-1.2.8.jar
BouncyCastle APIs:
  bcpg-jdk14-125.jar
  bcprov-jdk14-125.jar

All of these are included in the release file (RDFTrusttoolkit/lib), except Jena and its related libraries, which are available in a single file (jenaLibs.zip) at dbin.org -> RDFContextTools.
Place these jars in the RDFTrusttoolkit/lib directory to be able to run the examples.
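
For readers who want to see the MSG definition from the introduction as code before running the samples, here is a minimal, self-contained sketch. It does not use this library or Jena: the Triple class, the msgOf method and the convention that blank nodes are strings starting with "_:" are all assumptions made for illustration only. It collects the start triple and then, recursively, every triple that touches one of its blank nodes; URIs and literals stop the expansion, which is what gives the MSG its boundary.

```java
import java.util.*;

public class MsgSketch {
    /** Plain-string triple; hypothetical stand-in for a real RDF triple class. */
    static final class Triple {
        final String s, p, o;
        Triple(String s, String p, String o) { this.s = s; this.p = p; this.o = o; }
        @Override public boolean equals(Object x) {
            if (!(x instanceof Triple)) return false;
            Triple t = (Triple) x;
            return s.equals(t.s) && p.equals(t.p) && o.equals(t.o);
        }
        @Override public int hashCode() { return Objects.hash(s, p, o); }
        @Override public String toString() { return s + " " + p + " " + o; }
    }

    /** Assumption of this sketch: blank nodes are written "_:label". */
    static boolean isBlank(String term) { return term.startsWith("_:"); }

    /** Returns the MSG that contains 'start' within 'graph'. */
    static Set<Triple> msgOf(Triple start, Collection<Triple> graph) {
        Set<Triple> msg = new LinkedHashSet<>();
        Deque<Triple> todo = new ArrayDeque<>();
        msg.add(start);
        todo.push(start);
        while (!todo.isEmpty()) {
            Triple t = todo.pop();
            for (String node : new String[] { t.s, t.o }) {
                if (!isBlank(node)) continue; // URIs and literals form the boundary
                for (Triple other : graph) {
                    boolean touches = other.s.equals(node) || other.o.equals(node);
                    if (touches && msg.add(other)) todo.push(other);
                }
            }
        }
        return msg;
    }

    public static void main(String[] args) {
        List<Triple> g = Arrays.asList(
            new Triple("ex:alice", "ex:knows", "_:b1"),
            new Triple("_:b1", "ex:name", "\"Bob\""),
            new Triple("_:b1", "ex:address", "_:b2"),
            new Triple("_:b2", "ex:city", "\"Ancona\""),
            new Triple("ex:alice", "ex:age", "\"30\""));
        System.out.println(msgOf(g.get(3), g).size()); // 4: all triples linked through _:b1 and _:b2
        System.out.println(msgOf(g.get(4), g).size()); // 1: a ground triple is its own MSG
    }
}
```

Starting from any of the first four triples yields the same set of four, which illustrates the partitioning property: the result does not depend on which triple the decomposition begins from.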

Running the sample code:
---------------------------
In the "samplecode" folder there are two basic examples illustrating the API. You can run them using the .bat files or the equivalent command line.

SigningMSG
---------------
Shows how to create an MSG object from a graph where a digital signature already exists. Once the MSG is created, the signature is checked and another one is attached to it.

RevokingMSG
----------------
By means of the signature process it is possible to remotely "revoke" an MSG that, for example, was previously issued to another peer. This example uses the signature hash value of the MSG as an inverse functional property to find the MSG to be revoked. The revocation itself is a digitally signed MSG containing the hash value of the MSG to be revoked. The revocation policy implemented in this example is: "remove the MSG if the same public keys are used to sign both the revocation and the MSG itself" (only the author can revoke his/her annotations). More sophisticated policies can easily be implemented (revocation might come from "group moderators" etc.).

LICENCE
----------------
This library is distributed under the terms of the LGPL licence <http://www.opensource.org/licenses/lgpl-license.php>.

Acknowledgments
----------------
Gratitude goes to Fabio Panaioli for part of the implementation and to J. Carroll for the suggestions.
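
P.S. The revocation policy described under RevokingMSG ("remove the MSG if the same public keys are used to sign both the revocation and the MSG itself") can be sketched with the standard java.security classes alone. This is not the code of the RevokingMSG sample and it does not use this library or BouncyCastle: the Signed class, the "revoke:" payload prefix, the in-memory store keyed by hash, and the choice of SHA-256 with RSA are all assumptions made for illustration. The string passed to hashOf stands in for a canonicalized MSG serialization.

```java
import java.nio.charset.StandardCharsets;
import java.security.*;
import java.util.*;

public class RevocationSketch {
    static final String PREFIX = "revoke:"; // hypothetical marker, not part of the library

    /** A payload together with the signer's public key and the signature over the payload. */
    static final class Signed {
        final String payload; final PublicKey key; final byte[] sig;
        Signed(String payload, PublicKey key, byte[] sig) {
            this.payload = payload; this.key = key; this.sig = sig;
        }
    }

    /** Hash of a (supposedly canonical) MSG serialization, Base64-encoded. */
    static String hashOf(String canonicalMsg) throws GeneralSecurityException {
        byte[] d = MessageDigest.getInstance("SHA-256")
                .digest(canonicalMsg.getBytes(StandardCharsets.UTF_8));
        return Base64.getEncoder().encodeToString(d);
    }

    static Signed sign(String payload, KeyPair kp) throws GeneralSecurityException {
        Signature s = Signature.getInstance("SHA256withRSA");
        s.initSign(kp.getPrivate());
        s.update(payload.getBytes(StandardCharsets.UTF_8));
        return new Signed(payload, kp.getPublic(), s.sign());
    }

    static boolean verifies(Signed x) throws GeneralSecurityException {
        Signature s = Signature.getInstance("SHA256withRSA");
        s.initVerify(x.key);
        s.update(x.payload.getBytes(StandardCharsets.UTF_8));
        return s.verify(x.sig);
    }

    /** Removes the MSG named by the revocation only if both carry valid signatures from the same key. */
    static boolean applyRevocation(Map<String, Signed> store, Signed rev)
            throws GeneralSecurityException {
        if (!rev.payload.startsWith(PREFIX) || !verifies(rev)) return false;
        String hash = rev.payload.substring(PREFIX.length()); // the hash locates the MSG
        Signed target = store.get(hash);
        if (target == null || !verifies(target)) return false;
        if (!Arrays.equals(target.key.getEncoded(), rev.key.getEncoded())) return false;
        store.remove(hash);
        return true;
    }

    public static void main(String[] args) throws Exception {
        KeyPairGenerator gen = KeyPairGenerator.getInstance("RSA");
        gen.initialize(2048);
        KeyPair author = gen.generateKeyPair();
        KeyPair other = gen.generateKeyPair();
        String hash = hashOf("<canonical serialization of an MSG>");
        Map<String, Signed> store = new HashMap<>();
        store.put(hash, sign(hash, author)); // the signed MSG, indexed by its hash
        System.out.println(applyRevocation(store, sign(PREFIX + hash, other)));  // false
        System.out.println(applyRevocation(store, sign(PREFIX + hash, author))); // true
    }
}
```

The revocation signs a distinct payload ("revoke:" plus the hash) rather than the bare hash, so that the original signature on the MSG cannot be replayed as a revocation. A "group moderators" policy would replace the key comparison with a lookup in a set of trusted keys.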
Received on Monday, 30 May 2005 20:22:38 UTC