- From: Holger Knublauch <holger@knublauch.com>
- Date: Fri, 22 Aug 2008 22:37:39 -0700
- To: Leigh Dodds <leigh@ldodds.com>, semantic-web at W3C <semantic-web@w3c.org>
- Cc: Paul Tyson <phtyson@sbcglobal.net>, ndw@nwalsh.com, Jeremy Carroll <jeremy@topquadrant.com>
> As its stands you can do some RDF/XML processing in XProc as it is:
>
> * use XSLT to generate RDF/XML from arbitrary XML documents
> * use the http request support to invoke a SPARQL endpoint and process
>   the results to add to the XML being processed
> * transform RDF/XML (probably constrained subsets)
> * extra data from RDF/XML (again, probably subsets)
> * using extension points to include native SPARQL or other operations
>   (e.g. putting RDF/XML into a store)

I think most people in the RDF community agree that operating directly on the RDF/XML syntax is not a perfect solution. For example, the same RDF triples can be rendered in many different ways, so relying on XSLT to do transformations is potentially very error-prone.

Having said this, I very much agree with Paul's statement that having some standard pipeline facilities for RDF would be a good thing. I would even take it further and claim that for certain application areas RDF might be a better starting point for a pipeline system than XML. In particular, I believe that RDF makes merging triples and linking data from multiple sources more natural than traditional "closed-world" systems such as XML and databases do.

At TopQuadrant we have therefore taken the approach of starting with RDF as the foundation of a pipeline language called SPARQLMotion [1]. Each module (step) in SPARQLMotion can take RDF triples as input and can create new RDF triples as output. In addition, modules can process variables, and these variables can (among other things) contain XML DOM trees. SPARQLMotion also provides a facility that we call Semantic XML [2] to round-trip between XML and RDF. The specifications of SPARQLMotion are themselves defined in publicly available RDF/OWL files, and graphical RDF/OWL editors such as TopBraid [3] can be used to edit the pipeline scripts. SPARQL is used to drive the behavior of most of the module types, for example to bind variables or to perform iterations.
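
As a rough illustration of the triples-in, triples-out idea, here is a minimal sketch in plain Python. It is not SPARQLMotion syntax or its API: the names `merge`, `add_inverse_knows` and `run_pipeline` are hypothetical, triples are modelled as simple string tuples, blank nodes are ignored, and treating foaf:knows as symmetric is only an assumption made for the example.

```python
from typing import Callable, Iterable, Set, Tuple

# A triple is (subject, predicate, object); a graph is a set of triples.
Triple = Tuple[str, str, str]
Graph = Set[Triple]
Module = Callable[[Graph], Graph]  # hypothetical type for one pipeline step


def merge(*graphs: Graph) -> Graph:
    """Merge graphs by set union; identical triples collapse into one."""
    merged: Graph = set()
    for g in graphs:
        merged |= g
    return merged


def add_inverse_knows(graph: Graph) -> Graph:
    """Example step: return the input plus an inverse triple per foaf:knows."""
    inferred = {(o, p, s) for (s, p, o) in graph if p == "foaf:knows"}
    return graph | inferred


def run_pipeline(graph: Graph, modules: Iterable[Module]) -> Graph:
    """Feed the output triples of each step into the next step."""
    for module in modules:
        graph = module(graph)
    return graph


if __name__ == "__main__":
    # Two made-up sources that overlap in one triple.
    source_a = {("ex:alice", "foaf:knows", "ex:bob")}
    source_b = {
        ("ex:alice", "foaf:knows", "ex:bob"),
        ("ex:bob", "foaf:name", "Bob"),
    }
    result = run_pipeline(merge(source_a, source_b), [add_inverse_knows])
    for triple in sorted(result):
        print(triple)
```

Because a graph here is just a set of triples, combining sources needs no schema alignment up front, and the order in which the sources are merged does not matter.
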
Judging from customer feedback such an RDF-based pipeline facility has a huge potential for solving information integration tasks and defining web services. At TopQuadrant we are using SPARQLMotion ourselves in various customer projects, including work for processing XML and generating XHTML documents for government agencies.

Regards,
Holger

[1] http://www.sparqlmotion.org
[2] http://composing-the-semantic-web.blogspot.com/2007/11/xmap-mapping-arbitrary-xml-documents-to.html
[3] http://www.topquadrant.com/topbraid/composer
Received on Saturday, 23 August 2008 05:38:41 UTC