- From: Giovanni Tummarello <giovanni@wup.it>
- Date: Mon, 30 May 2005 22:21:27 +0200
- To: semantic-web@w3.org, semanticweb@yahoogroups.com
********************
RDFContext Tools 0.1
http://www.dbin.org/RDFContextTools.php
27/5/2005
By Giovanni Tummarello, Christian Morbidoni
http://semedia.deit.univpm.it
Part of the DBin project
http://www.dbin.org
INTRODUCTION
********************
This API gives a way to attach "context" information to pieces of an RDF
model by adding triples to the model itself. This is similar to
reification but at a different, coarser, level.
These tools use the concept of MSG (Minimal Self-contained Graph) [1].
Given a triple, the MSG that contains it is composed of that triple
plus, recursively, for each blank node involved, all the triples
connected to it. An MSG therefore has a boundary consisting entirely of
URIs or literals. An MSG is also the minimum "piece" of an RDF graph
that can be transferred to another peer while still allowing the
original graph to be incrementally reconstructed.
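The MSG definition above can be sketched in a few lines of Java. This is
only an illustration, not the code of this library: it does not use
Jena, triples are plain string lists, and the names MsgSketch, triple,
isBlank, queueBlanks and msgOf, as well as the convention that blank
node labels start with "_:", are assumptions made for this example.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class MsgSketch {
    // A triple is a 3-element list [subject, predicate, object].
    static List<String> triple(String s, String p, String o) {
        return Arrays.asList(s, p, o);
    }

    // Sketch-only convention: blank node labels start with "_:".
    static boolean isBlank(String term) {
        return term.startsWith("_:");
    }

    // Queue the blank nodes of t (subject or object) not seen before.
    static void queueBlanks(List<String> t, Set<String> seen, Deque<String> todo) {
        for (String term : Arrays.asList(t.get(0), t.get(2))) {
            if (isBlank(term) && seen.add(term)) {
                todo.push(term);
            }
        }
    }

    // The MSG containing 'start': the triple itself plus, recursively,
    // every triple of the graph attached to a blank node reached so far.
    static Set<List<String>> msgOf(List<String> start, List<List<String>> graph) {
        Set<List<String>> msg = new LinkedHashSet<>();
        Set<String> seen = new HashSet<>();
        Deque<String> todo = new ArrayDeque<>();
        msg.add(start);
        queueBlanks(start, seen, todo);
        while (!todo.isEmpty()) {
            String blank = todo.pop();
            for (List<String> t : graph) {
                boolean attached = t.get(0).equals(blank) || t.get(2).equals(blank);
                if (attached && msg.add(t)) {
                    queueBlanks(t, seen, todo);
                }
            }
        }
        return msg;
    }

    public static void main(String[] args) {
        List<List<String>> graph = Arrays.asList(
            triple("ex:alice", "ex:knows", "_:b1"),
            triple("_:b1", "ex:name", "\"Bob\""),
            triple("_:b1", "ex:address", "_:b2"),
            triple("_:b2", "ex:city", "\"Ancona\""),
            triple("ex:alice", "ex:age", "\"30\""));
        // The first four triples are chained through _:b1 and _:b2.
        System.out.println(msgOf(graph.get(0), graph).size()); // 4
        // A ground triple is an MSG on its own.
        System.out.println(msgOf(graph.get(4), graph).size()); // 1
    }
}
```

Starting from any of the four triples linked through _:b1 and _:b2
yields the same set, which is what allows the result to be treated as a
single unit of the graph.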
One of the nice properties of MSGs is that they uniquely partition an
RDF Graph, independently of which triple you begin decomposing the graph
from. This (actually very simple) theory allows a number of interesting
"operations", most of which are directly supported by this API:
* Digital signatures on pieces of RDF Graphs (MSGs) stored in the graph
itself
This is useful in scenarios where "bits" of information are requested
remotely and it is desirable to merge them with existing RDF while
retaining knowledge about who said what. This is even more useful in
scenarios where information can be served by peers other than the
original author, possibly "information collectors" or aggregators:
signatures on an MSG will be verifiable independently of the
pre-existing content of the graph the MSG is merged into.
* Address groups of statements without quoting them explicitly
Given a hash function, MSGs can be deterministically hashed once
properly canonicalized (this library includes an implementation of
Carroll's canonicalization procedure). This is useful for several
purposes, among which are revocation without quotation, RDF-based
challenge-response operations (e.g. will communicate only if already
know ... ) and, possibly, efficient RDF rsyncing.
* Attach generic "context" to pieces of the graph
This is a generalization of the use of Digital Signatures on MSGs. The
same node to which signatures are attached can in general be used to
apply other contextual information like authorship, date, color,
temperature etc.
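As a rough illustration of hashing an MSG so that it can be addressed
without being quoted, the sketch below sorts the N-Triples lines of an
MSG and digests them. It is not the procedure implemented in this
library: it assumes the lines are already canonical (ground triples, or
blank nodes already relabelled by a canonicalization step such as
Carroll's), and the class name MsgHashSketch, the method hashOf and the
choice of SHA-256 are assumptions made for this example.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class MsgHashSketch {
    // Hash a set of already-canonical N-Triples lines. Sorting makes the
    // result independent of the order in which the triples were listed.
    static String hashOf(List<String> canonicalLines) {
        List<String> sorted = new ArrayList<>(canonicalLines);
        Collections.sort(sorted);
        MessageDigest digest;
        try {
            digest = MessageDigest.getInstance("SHA-256");
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is required to be present on every Java platform.
            throw new IllegalStateException(e);
        }
        for (String line : sorted) {
            digest.update(line.getBytes(StandardCharsets.UTF_8));
            digest.update((byte) '\n'); // separator between triples
        }
        StringBuilder hex = new StringBuilder();
        for (byte b : digest.digest()) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }

    public static void main(String[] args) {
        List<String> lines = new ArrayList<>();
        lines.add("<http://example.org/a> <http://example.org/p> \"x\" .");
        lines.add("<http://example.org/a> <http://example.org/q> \"y\" .");
        System.out.println(hashOf(lines));
    }
}
```

Two peers holding the same canonical MSG obtain the same value, so the
hash alone is enough to refer to the MSG, for instance in a revocation.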
Issues:
1) Since this methodology uses reification as a way to attach the
signature to the MSGs, it is subject to the issues typical of this
standard RDF construct. In particular, care should be taken when using
the proposed method with OWL Full reasoners, as the owl:sameAs property
might cause substitutions inside MSGs. RDFS inference presents similar
problems, as new triples involving blank nodes, resulting from schema
entailments, could be added automatically by an RDFS triple store (thus
usually invalidating the signatures on the MSGs). Since RDFS reasoning
is usually needed, the DBin platform differentiates between the
repository where "raw" data is exchanged and those where reasoning
happens. At the P2P level a "raw" repository is used, where MSGs are
stored and served unchanged on request. Based on the MSG signatures,
contexts and specific local rules, the application then decides whether
to also merge the raw MSGs into the higher-level repositories (e.g.
those used for RDFS, OWL and/or rules reasoning).
2) By MSG definition and RDF Semantics, the structure of existing MSGs
will not be affected by insertion of new ones. While this property
enables our RDF digital signature schema, care must be taken when
inserting the same MSG twice. Although RDF Semantics states that parts
of the graph which have identical interpretations should not be
duplicated (thanks J. Carroll for pointing this out!), existing toolkits
will usually duplicate the MSG when inserting it twice. The MSG class
implements a hasSemanticsOf(MSG) method, which is useful to avoid this.
A simple (but not the most efficient) way to prevent duplicate MSGs
from being inserted into the triple store:
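|
// incomingMSGs (an MSG[]), the index i and db are assumed to come from
// the surrounding code, which is not shown here.
Graph ourGraph = db.getGraph();
// RDFN: all the MSGs in our graph surrounding the given URI
RDFN ourRDFN = new RDFN(ourGraph, URI INSIDE THE INCOMING MSG);
MSG[] ourMSGs = ourRDFN.getComposingMSGs();
boolean merge = true;
for (int j = 0; j < ourMSGs.length; j++) {
    // compare the incoming MSG with each MSG we already hold
    if (incomingMSGs[i].hasSemanticsOf(ourMSGs[j])) {
        merge = false; // an equivalent MSG is already stored
        break;
    }
}
if (merge) {
    db.addXMLRDF(incomingMSGs[i].getRDFXML());
}
|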
Note that this example uses the RDFN concept (all the MSGs surrounding a
given URI).
3) Triple overhead.
Worst case: for ground statements, the MSG "context node" is actually
the reification node. This means that for each signed triple at least
4 (for the reification, although 3 would suffice) + 2 (certificate +
hash) triples will be added.
Typical case: in DBin, a typical MSG is composed of more than 20
triples, which leads to an overhead of 25% to 30%.
For a better explanation of the definitions, properties and issues,
please see:
[1] G. Tummarello, C. Morbidoni, P. Puliti, F. Piazza, "RDF signing
supporting resource centric requests" Proceedings of the Poster track,
ESWC 2005.
http://semedia.deit.univpm.it/submissions/ESWC2005_Poster/ESWC2005_signignRDF.pdf
RELEASE NOTES
********************
This is release 0.1; the code is to be considered beta and/or
experimental. It is used inside DBin where, so far, it appears to be
doing a good job.
RDFContext Tools requires the following libraries on the classpath:
Jena Framework and related:
commons-logging.jar
icu4j.jar
jakarta-oro-2.0.5.jar
jena.jar
xercesImpl.jar
Log4j:
log4j-1.2.8.jar
BouncyCastle APIs:
bcpg-jdk14-125.jar
bcprov-jdk14-125.jar
All of these are included in the release file (RDFTrusttoolkit/lib),
except Jena and its related libraries, which are available as a single
file (jenaLibs.zip) at dbin.org -> RDFContextTools.
Place these jars in the RDFTrusttoolkit/lib directory to be able to run
the examples.
Running the sample code:
---------------------------
In the "samplecode" folder there are two basic examples illustrating the
API. You can run them using the .bat files or the equivalent command line.
SigningMSG
---------------
Shows how to create an MSG object from a graph where a digital signature
already exists. Once the MSG is created, the signature is checked and
another one is attached to it.
RevokingMSG
----------------
By means of the signature process it is possible to remotely "revoke" an
MSG that, for example, was previously issued to another peer. This
example uses the signature hash value of the MSG as an inverse
functional property to find the MSG to be revoked. The revocation
itself is a digitally signed MSG containing the hash value of the MSG
to be revoked. The revocation policy implemented in this example is:
"remove the MSG if the same public keys are used to sign both the
revocation and the MSG itself" (only the author can revoke his/her
annotations). More sophisticated policies can easily be implemented
(revocation might come from "group moderators", etc.).
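The "only the author can revoke" policy can be sketched as follows.
This is not the code of the RevokingMSG sample: it assumes signatures
have already been verified, represents public keys and MSG hashes as
plain strings, and the class RevocationSketch with its methods store,
contains and revoke is hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

class RevocationSketch {
    // MSG hash -> public key that signed the MSG (hypothetical store).
    private final Map<String, String> signerByMsgHash = new HashMap<>();

    void store(String msgHash, String signerKey) {
        signerByMsgHash.put(msgHash, signerKey);
    }

    boolean contains(String msgHash) {
        return signerByMsgHash.containsKey(msgHash);
    }

    // Apply a revocation that names its target by hash. The MSG is removed
    // only if the revocation is signed with the key that signed the MSG.
    boolean revoke(String targetHash, String revokerKey) {
        String authorKey = signerByMsgHash.get(targetHash);
        if (authorKey != null && authorKey.equals(revokerKey)) {
            signerByMsgHash.remove(targetHash);
            return true;
        }
        return false; // unknown MSG, or revocation not from the author
    }

    public static void main(String[] args) {
        RevocationSketch peer = new RevocationSketch();
        peer.store("hash-of-msg-1", "key-of-alice");
        System.out.println(peer.revoke("hash-of-msg-1", "key-of-bob"));   // false
        System.out.println(peer.revoke("hash-of-msg-1", "key-of-alice")); // true
    }
}
```

A moderator-based policy would only change the test inside revoke, for
example by also accepting keys from a list of trusted moderators.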
LICENCE
----------------
This library is distributed under the terms of the LGPL licence
<http://www.opensource.org/licenses/lgpl-license.php>.
Acknowledgments
----------------
Gratitude goes to Fabio Panaioli for part of the implementation and to
J. Carroll for the suggestions.
Received on Monday, 30 May 2005 20:22:38 UTC