- From: Alberto Reggiori <alberto@asemantics.com>
- Date: Thu, 26 Jun 2003 15:32:27 +0200
- To: www-rdf-dspace@w3.org
- Cc: Alberto Reggiori <alberto@asemantics.com>, staff@asemantics.com
On Thursday, June 26, 2003, at 02:11 PM, Butler, Mark wrote:

> Hi Dave
>
>> Non-standard extensions would be best avoided if you want SIMILE to be a full
>> participant in the semantic web.
>
> But to take this back to my original suggestion does this apply to quads? My
> understanding from Andy is that they are used by RDFStore and a number of
> RSS processors, and from Jeremy that although Jena 2 does not have a quads
> API it does actually use a quad data structure "under the hood". So although
> they are non-standard at the moment, people are using them, so should we
> really rule them out?

hi Mark,

in our interpretation of provenance/contexts in RDFStore we assumed that a statement represents a fact that is asserted as true in a certain context. The circumstance (e.g. space/time, situation or scope) in which the statement was made represents “contextual” information about the statement [1][2]. For example, when triples are being added to a graph it is often useful to be able to track where they came from (e.g. source Web site or Internet domain), how they were added, by whom, why, when (e.g. date), when they will expire (e.g. Time-To-Live) and so on. Such context (or provenance information) can be thought of as an additional dimension, orthogonal to the other 3 components of a statement. This concept is not part of the current RDF data model [3] and is referred to as “statement reification”.

From the application developer's point of view there is a clear need for such primitive constructs, to layer different levels of semantics on top of RDF that cannot be represented in the RDF triple space. Applications normally need to build meta-levels of abstraction over triples to reduce complexity and provide incremental and scalable access to information. For example, if a Web robot is processing and syndicating news coming from various on-line newspapers, there will be overlap.
An application may decide to filter the news based not only on a timeline or some other property, but perhaps to select only those sources that provide information with certain unique characteristics. This requires flagging triples as belonging to different contexts and then describing, in the RDF itself, the relationships between the contexts. At query time the application can then use this information to define a search scope that filters the results.

Another common use of provenance and contextual information is digitally signing RDF triples to provide a basic level of trust over the Semantic Web. In that case triples could be flagged, for example, with a PGP key to uniquely identify the source and its properties.

There have been several attempts [4][5][6][7] to formalize and use contexts and provenance information in RDF, but there is not yet common agreement on how to do it. It is also not completely clear how an application would benefit from this information. Jena 2 also seems to be taking some steps in that direction.

Our approach to modeling contexts and provenance has been simpler, and motivated by real-world RDF applications we have developed [8][9]. We found that an additional dimension to the RDF triple can be useful or even essential. Given that full-blown RDF reification can be cumbersome due to its verbosity and inefficiency, we developed a different modeling technique that flags or marks a given statement as belonging to one or more specific contexts.
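To make the "flag a statement as belonging to one or more contexts" idea concrete, here is a minimal sketch in Python. It is not the RDFStore Perl/C API: the class name ContextStore, its methods, and the example.org URIs are all made up for illustration, and the quad output only approximates an "N-Triples plus a fourth term" syntax.

```python
from collections import defaultdict


def _term(value):
    """Serialize a term: URIs in angle brackets, anything else as a plain literal.

    Assumption for this sketch: a term is a URI if it starts with a known scheme.
    """
    if value.startswith(("http://", "https://", "urn:")):
        return f"<{value}>"
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


class ContextStore:
    """Toy in-memory store: each triple is flagged with the contexts it was asserted in."""

    def __init__(self):
        # (subject, predicate, object) -> set of context URIs
        self._contexts = defaultdict(set)

    def add(self, s, p, o, context):
        self._contexts[(s, p, o)].add(context)

    def remove(self, s, p, o, context):
        key = (s, p, o)
        contexts = self._contexts.get(key)
        if contexts is None:
            return
        contexts.discard(context)
        if not contexts:
            # The triple is no longer asserted in any context.
            del self._contexts[key]

    def search(self, s=None, p=None, o=None, context=None):
        """Yield (s, p, o, context) quads; None acts as a wildcard."""
        for (ts, tp, to), contexts in self._contexts.items():
            if s is not None and ts != s:
                continue
            if p is not None and tp != p:
                continue
            if o is not None and to != o:
                continue
            for c in sorted(contexts):
                if context is None or c == context:
                    yield (ts, tp, to, c)

    def to_quads(self):
        """Serialize every statement as one line with the context as a fourth term."""
        return [
            f"{_term(s)} {_term(p)} {_term(o)} {_term(c)} ."
            for (s, p, o, c) in sorted(self.search())
        ]
```

The same triple can be added under two different source contexts and later removed from just one of them, which is the property that plain triples lack and that full reification provides only at the cost of four extra statements per triple.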
On the practical side, our Perl/C API allows adding, removing and searching triples in specific "spaces" or contexts, and serializing them back as Quads (a simple extension to the N-Triples syntax). At the moment we are about to implement a serialization of contexts back to RDF/XML (as Jan also suggested) via the rdf:ID reification stuff; at parse time we will just flag those triples (predicates) as "special" or asserted in a different context. In the past we used rdf:bagID to hack this functionality, but it has recently been dropped from the specs, as you probably noticed.

At the RDQL query level we allow a 4th component, a URI (resource), on triple-patterns to specify/select the context. The nice part is that subsequent triple-patterns can refine and select the variables bound to that 4th component to "unify" descriptions at different levels.

As an example, as presented at the WWW2003 devday, we have some demo queries using contexts available at

http://demo.asemantics.com/rdfstore/www2003/

The example database contains news scraped from most Italian newspapers, where each channel and news item is put into a specific source context. This allows us to filter results by date and by source, avoiding overlaps and clashing URLs (e.g. some newspapers recycle the same URL every day but with different HTML content). In particular, look at the last two queries (numbers 9 and 10), which use contextual information at the RDQL level. The very last one is pretty cool to me: it describes the 4th (context) component with a dc:date and then joins it into the other triple space.
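The join on the 4th component can be sketched as follows. This is plain Python, not RDQL syntax and not the demo's actual queries: the function names, the "?name" variable convention and the example.org data are assumptions made for illustration only.

```python
DC_TITLE = "http://purl.org/dc/elements/1.1/title"
DC_DATE = "http://purl.org/dc/elements/1.1/date"


def is_var(term):
    """Variables are written as strings starting with '?' (a convention of this sketch)."""
    return isinstance(term, str) and term.startswith("?")


def match_pattern(pattern, quad, bindings):
    """Unify one pattern with one quad; return the extended bindings, or None.

    A pattern has 3 or 4 terms. With only 3 terms, zip() stops before the
    context, so the pattern matches the statement in any context.
    """
    new = dict(bindings)
    for p_term, q_term in zip(pattern, quad):
        if is_var(p_term):
            # Bind the variable, or check it against its earlier binding.
            if new.setdefault(p_term, q_term) != q_term:
                return None
        elif p_term != q_term:
            return None
    return new


def query(quads, patterns):
    """Naive nested-loop join: every pattern refines the solutions found so far."""
    solutions = [{}]
    for pattern in patterns:
        solutions = [
            extended
            for bindings in solutions
            for quad in quads
            if (extended := match_pattern(pattern, quad, bindings)) is not None
        ]
    return solutions


# Hypothetical data: one URL reused on two days, each scrape in its own context,
# and each context described by a dc:date statement.
CTX_25 = "http://example.org/ctx/paper-a/2003-06-25"
CTX_26 = "http://example.org/ctx/paper-a/2003-06-26"
META = "http://example.org/ctx/meta"
QUADS = [
    ("http://example.org/news/today", DC_TITLE, "Headline A", CTX_25),
    ("http://example.org/news/today", DC_TITLE, "Headline B", CTX_26),
    (CTX_25, DC_DATE, "2003-06-25", META),
    (CTX_26, DC_DATE, "2003-06-26", META),
]

# ?ctx is bound as the 4th component of the first pattern and then reused
# as the subject of the second one, joining the two "levels".
RESULT = query(QUADS, [
    ("?item", DC_TITLE, "?title", "?ctx"),
    ("?ctx", DC_DATE, "2003-06-26"),
])
```

Here RESULT holds a single solution, the one whose ?title is "Headline B": the item URL is the same on both days, and only the date attached to the context tells the two scrapes apart.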
BTW: while at WWW2003 I had a chat with Matt Biddulph about his RSS codepiction code/demo, and he seems to have similar problems and solutions, using Jena with reification to mimic contextual information. To me that means this aspect is going to be fundamental for the success of the whole Semantic Web and of RDF systems, but yes, all this is not "standard" :-)

hope this helps

all the best

Alberto

[1] Graham Klyne, 13-Mar-2002, “Circumstance, provenance and partial knowledge - Limiting the scope of RDF assertions” http://www.ninebynine.org/RDFNotes/UsingContextsWithRDF.html
[2] John F. Sowa, “Knowledge Representation: Logical, Philosophical, and Computational Foundations”, Brooks Cole Publishing Co., ISBN 0-534-94965-7
[3] Patrick Hayes, “RDF Semantics” (W3C Working Draft 23 January 2003) http://www.w3.org/TR/rdf-mt/
[4] Graham Klyne, 18 October 2000, “Contexts for RDF Information Modelling” http://public.research.mimesweeper.com/RDF/RDFContexts.html
[5] Seth Russell, 7 August 2002, “Quads” http://robustai.net/sailor/grammar/Quads.html
[6] T. Berners-Lee, Dan Connolly, “Notation 3” http://www.w3.org/2000/10/swap/doc/Overview.html
[7] Dave Beckett, “Contexts Thoughts” http://www.redland.opensource.ac.uk/notes/contexts.html
[8] http://demo.asemantics.com/biz/isc/
[9] http://demo.asemantics.com/biz/lmn/

> I'd be interested in feedback here from Eric Miller and David Karger also?
>
> thanks
>
> Mark
Received on Thursday, 26 June 2003 09:33:00 UTC