- From: Gabe Beged-Dov <begeddov@jfinity.com>
- Date: Wed, 29 Nov 2000 14:18:01 -0800
- To: Stefan Kokkelink <skokkeli@mathematik.uni-osnabrueck.de>
- CC: "www-rdf-interest@w3.org" <www-rdf-interest@w3.org>
Stefan Kokkelink wrote:
>
> Gabe Beged-Dov wrote:

<snip />

> > As you say, I am proposing that we assume that a conformant parser
> > must generate the bags and reified statements. Once we take that step
> > we can then discuss how to provide straightforward and efficient API
> > and implementations based on a standard interpretation.
>
> I disagree here. In general there is no need to know
> about the XML structure since the XML serialization
> is meant for exchanging RDF models (at least that is
> my point of view ;-). If you look at the examples of
> M&S you won't find an RDF graph containing a reification
> or bagification unless bagID or propertyID are explicitly
> given. In my opinion a parser SHOULD provide a configuration
> setting that enforces a bagification for every rdf:Description
> element (if someone really is interested in the XML structure
> of the serialization...)

I am trying to achieve multiple goals with this interpretation of the M&S. The goals are:

- To have a single, consistent interpretation of what an RDF processor generates
- To bring in-band the various types of information that implementations (especially storage) are handling out-of-band
- To not lose any information that is contained in the source representation
- To be able to trace statings back to their occurrence (and also quotings, although that's less clear)
- To push a web-document-centric view of RDF
- To allow an entire RDF document set to be manipulated directly as a single graph

I distinguish between the ability to surface the raw triples that occurred in the source document and the ability to track all of the information contained in the source document. This is similar to the XML Infoset, and even more so to HyTime Groves (the full information) and Grove Plans (a filtered view of that information).

Here's a thought experiment. You have a streaming pipeline like this:

    source_doc -> normalizer -> infoset_gen -> MyStatement_gen -> triple_gen

The normalizer takes in the various syntactic variations and outputs an equivalent version in the basic syntax of the M&S. The infoset_gen takes this basic syntax and adds full reification labelling and any other necessary metadata. It emits this version of the source document as RDF/XML basic syntax. The MyStatement_gen generates an efficient, high-level API version of this annotated source document; this is discussed more below. Finally, the triple_gen is a filter that gives you an expansion into triples of some subset of the statement stream that was emitted by the MyStatement_gen module.

If you assume my interpretation of what information needs to be generated from a source document, the following MyStatement structure would convey that information:

    { BagID, StatementID, subject, predicate, object, isStated }

If RDF processors generated this sextuple rather than the current triple, they would generate no more "statements" than in current usage. In fact, they would generate a lot fewer in the face of explicit or requested reification. They would have all the information necessary to generate the "legacy" triple interface if an application wanted the raw view of the information. You could also control the triple filter to return only ground triples, etc.

I am working on an implementation based on these ideas that leverages the SAX2 filter architecture. It is based on David Megginson's RDFFilter. I hope to have something to share in the near future; two rough sketches follow below. If anyone would like to collaborate on this, I would love help.
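To make the sextuple concrete, here is a rough Java sketch. Every name in it (MyStatement, MyStatementHandler, StatedFilter) is hypothetical; it is meant to show the shape of the API I have in mind, not code from RDFFilter or any existing framework:

    /* Hypothetical types; nothing here comes from RDFFilter or an existing API. */

    /** One "stating": a statement plus the labels that tie it to its source. */
    class MyStatement {
        final String bagID;       // bag collecting the statings of the enclosing Description
        final String statementID; // URI naming this particular stating
        final String subject;
        final String predicate;
        final String object;
        final boolean isStated;   // false for quotings (reified but not asserted)

        MyStatement(String bagID, String statementID, String subject,
                    String predicate, String object, boolean isStated) {
            this.bagID = bagID;
            this.statementID = statementID;
            this.subject = subject;
            this.predicate = predicate;
            this.object = object;
            this.isStated = isStated;
        }

        /** The "legacy" triple view for downstream triple-based frameworks. */
        String[] asTriple() {
            return new String[] { subject, predicate, object };
        }
    }

    /** The event interface between MyStatement_gen and the application. */
    interface MyStatementHandler {
        void statement(MyStatement s);
    }

    /** A triple_gen-style filter: forwards only statings that were asserted. */
    class StatedFilter implements MyStatementHandler {
        private final MyStatementHandler next;
        StatedFilter(MyStatementHandler next) { this.next = next; }
        public void statement(MyStatement s) {
            if (s.isStated) next.statement(s); // quotings never reach the triple stream
        }
    }

The asTriple() view is what lets a triple-based framework consume the stream unchanged, and the filter shows one way triple_gen could be parameterized (only stated triples here; a ground-triples-only filter would have the same shape).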
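And here, under the same caveat (the stage classes are placeholders with the real logic omitted), is how the pipeline could be wired together using the standard SAX2 filter machinery:

    import org.xml.sax.InputSource;
    import org.xml.sax.XMLReader;
    import org.xml.sax.helpers.XMLFilterImpl;
    import org.xml.sax.helpers.XMLReaderFactory;

    /* Placeholder stages; the real logic is omitted. */
    class NormalizerFilter extends XMLFilterImpl { }  // syntactic variations -> basic M&S syntax
    class InfosetGenFilter extends XMLFilterImpl { }  // adds reification labelling and metadata

    /* Emits MyStatement sextuples (from the previous sketch) as it parses. */
    class MyStatementGen extends XMLFilterImpl {
        private final MyStatementHandler handler;
        MyStatementGen(MyStatementHandler handler) { this.handler = handler; }
        // startElement/endElement overrides would build and emit sextuples here.
    }

    class Pipeline {
        public static void main(String[] args) throws Exception {
            // Relies on the org.xml.sax.driver property to pick a parser.
            XMLReader parser = XMLReaderFactory.createXMLReader();

            NormalizerFilter normalizer = new NormalizerFilter();
            InfosetGenFilter infosetGen = new InfosetGenFilter();
            normalizer.setParent(parser);
            infosetGen.setParent(normalizer);

            // triple_gen: pass only asserted statings on to the application.
            MyStatementHandler app = new MyStatementHandler() {
                public void statement(MyStatement s) {
                    String[] t = s.asTriple();
                    System.out.println(t[0] + " " + t[1] + " " + t[2]);
                }
            };
            MyStatementGen stmtGen = new MyStatementGen(new StatedFilter(app));
            stmtGen.setParent(infosetGen);

            // Parsing cascades up the chain: stmtGen -> infosetGen -> normalizer -> parser.
            stmtGen.parse(new InputSource(args[0]));
        }
    }

Each stage consumes and re-emits SAX events, which is what lets the normalizer and infoset_gen stay independent of everything downstream.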
The implementation is currently visible as an empty project at SourceForge called RaPFiSH (RDF; A Parser Framework implemented using SAX2 Handlers :). Send me e-mail if you are interested.

Just to clarify: I know that there are a lot of RDF frameworks out there already, but they mostly focus on issues downstream from the parser and take the triple-is-king view of RDF processing. I am focusing on the upstream portion of the pipeline, and on this different approach of passing sextuples across the API between the parser and the application. I plan to be able to plug into existing frameworks through triple-based APIs like those that Sergey and others have developed.

> All the best,
> Stefan

Gabe

--
---------------------------
http://www.jfinity.com/gabe