Returning un-escaped XML literals in SPARQL 1.1 XML results

Howdy Folks,

In response to the solicitation of suggestions for new features in SPARQL 1.1,
I would like to raise this horse from the dead for further beatings:

http://www.w3.org/2001/sw/DataAccess/issues#unescapedXml

This feature has significant utility in cases where a user wants
to store blocks of (well-formed!) XML in an RDF model, and then process
this XML in a pipelined context.  In my experience, this feature
makes it easier to gradually adopt RDF as part of an ongoing SOA
project.  The stored XML could be either document content or message
content.  I do not advocate retrieval of ill-formed legacy HTML through
this mechanism (which was a possibility raised in previous discussion).
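
To make the pipelining cost concrete, here is a rough Python sketch of what a
consumer has to do today, when an XMLLiteral comes back as escaped text inside
<literal>.  The sample document and the variable name "o" are made up for
illustration; only the sparql-results namespace is taken from the existing
results format.

```python
import xml.etree.ElementTree as ET

SR = "http://www.w3.org/2005/sparql-results#"
XH = "http://www.w3.org/1999/xhtml"

# Hypothetical result document: the XML literal arrives as escaped text.
ESCAPED_RESULTS = """<sparql xmlns="http://www.w3.org/2005/sparql-results#">
  <head><variable name="o"/></head>
  <results>
    <result>
      <binding name="o">
        <literal datatype="http://www.w3.org/1999/02/22-rdf-syntax-ns#XMLLiteral">&lt;xh:p xmlns:xh="http://www.w3.org/1999/xhtml"&gt;Contents of &lt;xh:em&gt;important&lt;/xh:em&gt; paragraph&lt;/xh:p&gt;</literal>
      </binding>
    </result>
  </results>
</sparql>"""

root = ET.fromstring(ESCAPED_RESULTS)       # first parse: the results document
literal = root.find(f".//{{{SR}}}literal")

# The literal has no child elements, only a string of markup...
child_count = len(list(literal))

# ...so the consumer needs a second parse before any XML tool can use it.
reparsed = ET.fromstring(literal.text)
emphasis = reparsed.find(f"{{{XH}}}em").text
```

The second ET.fromstring call is the step that un-escaped results would remove
from every stage of a pipeline.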

I also think it is now relevant to consider the impact on XProc integration,
as raised by Paul Tyson on 2009-03-04.   I say this without being well
versed in XProc, but based on the assumption that un-escaped XML results
are useful in any pipelined processing context.   I welcome clarifications
from the more XProc-savvy.

http://lists.w3.org/Archives/Public/public-rdf-dawg-comments/2009Mar/0004.html

The implementation might also be linked to the use of the ReturnFormatKeyword:

http://www.w3.org/2009/sparql/wiki/Feature:ReturnFormatKeyword

I know there is some complexity involved in embedding arbitrary
XML into the results stream.   It might be sensible to make xml-literal
results an optional feature (both in the sense that SPARQL implementors
are not required to implement it, and in the sense that SPARQL
users are not required to use it).  I would also support placing
restrictions on the XML content that can be returned this way,
e.g. to deal with some of the encoding issues discussed by Eric
Prud'hommeaux here:

http://lists.w3.org/Archives/Public/public-rdf-dawg/2007JulSep/0163.html

(Perhaps there's been some progress on c14n in recent months?)

Regarding XML schemas and implementation, one idea is that
the XML literal might come wrapped in a child element of <binding> called
<xml-literal>, whose content model is an xsd:any wildcard.
This means the overall SPARQL-results schema would not be
weakened for any results that do not happen to include <xml-literal>.

Example of well-formed XHTML content (we could just as well use WSDL,
a SOAP message, etc.):

<binding name="o">
   <xml-literal datatype="http://www.w3.org/1999/02/22-rdf-syntax-ns#XMLLiteral">
      <xh:p xmlns:xh="http://www.w3.org/1999/xhtml">Contents of <xh:em>important</xh:em> paragraph</xh:p>
   </xml-literal>
</binding>
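
As a sketch of how a consumer might read such a binding, the Python below
parses a results document once and hands back the wrapped elements directly.
It assumes the proposed <xml-literal> element lives in the existing
sparql-results namespace; that placement, the surrounding sample document,
and the helper name xml_literal_children are all assumptions made for
illustration, not part of any spec.

```python
import xml.etree.ElementTree as ET

SR = "http://www.w3.org/2005/sparql-results#"
XH = "http://www.w3.org/1999/xhtml"

# Hypothetical result document using the proposed <xml-literal> wrapper.
RESULTS = """<sparql xmlns="http://www.w3.org/2005/sparql-results#">
  <head><variable name="o"/></head>
  <results>
    <result>
      <binding name="o">
        <xml-literal datatype="http://www.w3.org/1999/02/22-rdf-syntax-ns#XMLLiteral">
          <xh:p xmlns:xh="http://www.w3.org/1999/xhtml">Contents of <xh:em>important</xh:em> paragraph</xh:p>
        </xml-literal>
      </binding>
    </result>
  </results>
</sparql>"""


def xml_literal_children(results_xml, var):
    """Return the elements wrapped by <xml-literal> for the given variable."""
    root = ET.fromstring(results_xml)
    found = []
    for binding in root.iter(f"{{{SR}}}binding"):
        if binding.get("name") != var:
            continue
        wrapper = binding.find(f"{{{SR}}}xml-literal")
        if wrapper is not None:
            # The children are already parsed elements; no second parse needed.
            found.extend(list(wrapper))
    return found


paragraphs = xml_literal_children(RESULTS, "o")
emphasis = paragraphs[0].find(f"{{{XH}}}em").text
```

Bindings that use the ordinary <uri>, <literal> or <bnode> children are simply
skipped by this helper, so a consumer that does not care about XML literals
would be unaffected.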

I hope these thoughts are useful!

peace,

Stu

Received on Tuesday, 10 March 2009 19:51:20 UTC