Re: rdf in xproc from Holger Knublauch on 2008-08-23 (semantic-web@w3.org from August 2008)

From: Holger Knublauch <holger@knublauch.com>
Date: Fri, 22 Aug 2008 22:37:39 -0700
To: Leigh Dodds <leigh@ldodds.com>, semantic-web at W3C <semantic-web@w3c.org>
Cc: Paul Tyson <phtyson@sbcglobal.net>, ndw@nwalsh.com, Jeremy Carroll <jeremy@topquadrant.com>
Message-Id: <DA2E8FF8-01D1-415D-AF53-4089622BCFB1@knublauch.com>

> As it stands you can do some RDF/XML processing in XProc as it is:
>
> * use XSLT to generate RDF/XML from arbitrary XML documents
> * use the http request support to invoke a SPARQL endpoint and process
>  the results to add to the XML being processed
> * transform RDF/XML (probably constrained subsets)
> * extract data from RDF/XML (again, probably subsets)
> * use extension points to include native SPARQL or other operations
> (e.g. putting RDF/XML into a store)

I think most people in the RDF community agree that operating directly
on the RDF/XML syntax is not a perfect solution. For example, the same
RDF triples can be serialized in many different ways, so relying on
XSLT to do transformations is potentially very error-prone.
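
To make the serialization point concrete, here is a minimal Python
sketch. The example.org URIs and the toy_triples helper are made up for
illustration, and the helper only understands a tiny subset of RDF/XML
(it is not a real parser and does not distinguish literals from
resources). It shows two structurally different documents that reduce
to the same set of triples, which is what a template matching on
element structure would have to anticipate.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# Two RDF/XML documents that encode the same two triples (hypothetical data).
DOC_A = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
                 xmlns:ex="http://example.org/ns#">
  <rdf:Description rdf:about="http://example.org/alice">
    <rdf:type rdf:resource="http://example.org/ns#Person"/>
    <ex:name>Alice</ex:name>
  </rdf:Description>
</rdf:RDF>"""

DOC_B = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
                 xmlns:ex="http://example.org/ns#">
  <ex:Person rdf:about="http://example.org/alice" ex:name="Alice"/>
</rdf:RDF>"""


def uri(qname):
    """Turn ElementTree's '{namespace}local' form into a plain URI."""
    return qname[1:].replace("}", "", 1)


def toy_triples(xml_text):
    """Extract triples from a small subset of RDF/XML (illustration only)."""
    triples = set()
    for node in ET.fromstring(xml_text):
        subject = node.get(f"{{{RDF}}}about")
        # A typed node element is shorthand for an rdf:type triple.
        if node.tag != f"{{{RDF}}}Description":
            triples.add((subject, RDF + "type", uri(node.tag)))
        # Property attributes are shorthand for literal-valued properties.
        for attr, value in node.attrib.items():
            if not attr.startswith(f"{{{RDF}}}"):
                triples.add((subject, uri(attr), value))
        # Property elements carry a resource reference or literal text.
        for prop in node:
            obj = prop.get(f"{{{RDF}}}resource") or (prop.text or "")
            triples.add((subject, uri(prop.tag), obj))
    return triples


print(toy_triples(DOC_A) == toy_triples(DOC_B))  # True
```

An XSLT template written against the element layout of the first
document would not match the second, even though both carry the same
information.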

Having said this, I very much agree with Paul's statement that having
some standard pipeline facilities for RDF would be a good thing. I would
even take it further and claim that for certain application areas RDF
might be a better starting point for a pipeline system than XML. In
particular, I believe that RDF makes merging triples and linking data
from multiple sources more natural than with traditional "closed-world"
systems such as XML and databases.
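
As a rough illustration of why merging feels natural, the following
Python sketch models a graph as a plain set of (subject, predicate,
object) tuples. All URIs, source names, and helper functions are
invented for the example, and blank nodes (which need renaming in a
real merge) are ignored. Under those assumptions, merging two sources
is a set union, and shared URIs link the data without any extra
mapping step.

```python
# Triples as (subject, predicate, object) tuples; all URIs are made up.
EX = "http://example.org/"

# Hypothetical source 1: who works where.
crm = {
    (EX + "alice", EX + "name", "Alice"),
    (EX + "alice", EX + "worksFor", EX + "acme"),
}

# Hypothetical source 2: where companies are located.
registry = {
    (EX + "acme", EX + "locatedIn", "Berlin"),
    (EX + "alice", EX + "name", "Alice"),  # duplicate statement, harmless
}

# Merging is set union; identical triples collapse automatically.
merged = crm | registry


def objects(graph, subject, predicate):
    """Return all objects of triples matching subject and predicate."""
    return {o for s, p, o in graph if s == subject and p == predicate}


def employer_locations(graph, person):
    """Follow worksFor, then locatedIn, across the merged data."""
    return {
        location
        for employer in objects(graph, person, EX + "worksFor")
        for location in objects(graph, employer, EX + "locatedIn")
    }


print(employer_locations(merged, EX + "alice"))  # {'Berlin'}
```

Neither source alone can answer the question; the answer only appears
once the two sets are combined, because both use the same URI for the
company.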

At TopQuadrant we have therefore taken RDF as the foundation of a
pipeline language called SPARQLMotion [1].
Each module (step) in SPARQLMotion can take RDF triples as input and  
can create new RDF triples as output. In addition, modules can process  
variables, and these variables can (among other things) contain XML DOM
trees. SPARQLMotion also provides a facility that we call Semantic XML  
[2] to round-trip between XML and RDF. The specifications of  
SPARQLMotion are themselves defined in publicly available RDF/OWL  
files, and graphical RDF/OWL editors such as TopBraid [3] can be used  
to edit the pipeline scripts. SPARQL is used to drive the behavior of  
most of the module types, for example to bind variables or to perform  
iterations.
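
The general shape of such a pipeline can be sketched in a few lines of
Python. This is not SPARQLMotion syntax or its API: the step functions,
the run_pipeline helper, and the example data are all hypothetical, and
a Python comprehension stands in for the SPARQL query that would
normally drive a module. The sketch only shows the idea that each step
receives triples plus variable bindings and passes on a (possibly
extended) set of both.

```python
# Triples as (subject, predicate, object) tuples; all names are made up.
EX = "http://example.org/"


def import_people(triples, variables):
    """Source step: adds new triples to the stream."""
    new = {(EX + "alice", EX + "age", 34), (EX + "bob", EX + "age", 17)}
    return triples | new, variables


def bind_min_age(triples, variables):
    """Binding step: sets a variable for later steps to read."""
    return triples, {**variables, "min_age": 18}


def tag_adults(triples, variables):
    """Derivation step: creates triples from the input and a variable."""
    derived = {
        (s, EX + "status", "adult")
        for s, p, o in triples
        if p == EX + "age" and o >= variables["min_age"]
    }
    return triples | derived, variables


def run_pipeline(steps, triples=frozenset(), variables=None):
    """Feed the output of each step into the next one."""
    state = (set(triples), dict(variables or {}))
    for step in steps:
        state = step(*state)
    return state


triples, variables = run_pipeline([import_people, bind_min_age, tag_adults])
print((EX + "alice", EX + "status", "adult") in triples)  # True
```

Because every step has the same input and output shape, steps can be
reordered or reused freely, which is the property a pipeline language
relies on.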

Judging from customer feedback, such an RDF-based pipeline facility has
huge potential for solving information integration tasks and
defining web services. At TopQuadrant we are using SPARQLMotion
ourselves in various customer projects, including work on processing
XML and generating XHTML documents for government agencies.

Regards,
Holger

[1] http://www.sparqlmotion.org
[2] http://composing-the-semantic-web.blogspot.com/2007/11/xmap-mapping-arbitrary-xml-documents-to.html
[3] http://www.topquadrant.com/topbraid/composer

Received on Saturday, 23 August 2008 05:38:41 UTC