- From: Stu Baurmann <stub@logicu.com>
- Date: Thu, 04 Oct 2007 21:47:15 -0500
- To: Lee Feigenbaum <lee@thefigtrees.net>
- CC: public-rdf-dawg-comments@w3.org
Hi Lee,

I read through the group's discussion. I'm glad that some members agree
that this feature would be useful. I understand that there are some
design challenges which will be difficult to resolve cleanly and
quickly. I hope that this feature can be taken up as a scheduled design
task in some future round of the group's activity.

Until then, perhaps some implementors will take the opportunity to offer
raw XML output as a non-standard extension, with appropriate caveats
applied (e.g. "this extension only produces well-formed output if
conditions A, B, C hold on the stored XML fragment").

So, yes, I am satisfied, and thanks for considering my suggestion!

peace,

Stu

>> Hi Stu,
>>
>> My apologies for the long delay in responding to your comment.
>>
>> The Working Group discussed your comment on our mailing list and at
>> last week's teleconference[1]. While there is some measure of support
>> for the goals of your suggestion, the combination of schedule concerns
>> with a lack of a mature existing technical design led to the group
>> choosing not to add the possibility for unescaped XML literals in the
>> SPARQL XML result format.
>>
>> To note the issue and to help inform a potential future working group,
>> I opened and immediately postponed an issue[2] regarding unescaped XML
>> in the SPARQL XML result format.
>>
>> Please let us know if you are satisfied with this response to your
>> comment.
>>
>> Lee
>>
>> [1]
>> http://lists.w3.org/Archives/Public/public-rdf-dawg/2007JulSep/att-0175/25-dawg-minutes.html#item02
>>
>> [2] http://www.w3.org/2001/sw/DataAccess/issues#unescapedXml
>>
>> Stu Baurmann wrote:
>>>
>>> Howdy!
>>>
>>> (I have tried sending this before and it didn't seem to go through.
>>> Apologies if you get multiple copies).
>>>
>>> In September 2006 I posted a question about literal XML inclusion in
>>> SPARQL results to the jena-dev list, and Andy suggested that I post
>>> the issue here. Sorry it's taken me so long to do that.
>>> I don't see anywhere on this list that the subject has come up in the
>>> meantime, so perhaps it's still germane.
>>>
>>> When literal XML is stored inside an RDF model, it is in some cases
>>> desirable to fetch that content as part of a SPARQL XML result stream
>>> *without escaping*. For example, consider the storage of XHTML content
>>> within RDF literals. It seems reasonable (and works fine) to assert a
>>> triple like this:
>>>
>>> s = m:someDocument
>>> p = m:hasContent
>>> o = <xh:p xmlns:xh="http://www.w3.org/1999/xhtml">Contents of
>>> <xh:em>THE</xh:em> paragraph</xh:p>^^rdf:XMLLiteral
>>>
>>> Note that the datatype of the object node is rdf:XMLLiteral.
>>> Also note that I am only using XHTML as an example, and the literal
>>> block could be of any XML type.
>>>
>>> What I would like is to query a model containing this triple, and
>>> receive the results as SPARQL-XML, with the literal's contents simply
>>> included into the result stream as XML, so we would see output like
>>> this:
>>>
>>> <binding name="o">
>>>   <literal datatype="http://www.w3.org/1999/02/22-rdf-syntax-ns#XMLLiteral">
>>>     <xh:p xmlns:xh="http://www.w3.org/1999/xhtml">Contents of <xh:em>THE</xh:em> paragraph</xh:p>
>>>   </literal>
>>> </binding>
>>>
>>> (I have hacked up my copy of ARQ to do this, and it works great, for
>>> my purposes!).
>>>
>>> This is not the normal behavior of ARQ, however. ARQ will always
>>> escape the XML tag characters, turning angle brackets into entity
>>> references, and so on.
>>>
>>> I can see how some users might want that escaping, but it also seems
>>> reasonable to NOT want it. Turning the escaped tags back into parsed
>>> XML requires a consumer (who shares my assumptions and preferences) to
>>> serialize the result set document into a buffer and re-parse it from
>>> text, which is not fun or fast.
>>> From my use case (embedding RDF technology within an established
>>> content management application based on the Cocoon XML-pipeline
>>> framework), it is very nice to have the result set available as a
>>> single unbroken XML tree, which is immediately ready for downstream
>>> processing using XSLT. With this feature available, the embedding of
>>> small fragments of XML content within RDF models becomes quite
>>> attractive in some situations.
>>>
>>> To me it would be reasonable to control this behaviour ("to escape or
>>> not to escape") at the SPARQL query engine API level, probably by
>>> setting a flag on the ResultSet object. I proposed this on the
>>> jena-dev list (with the simple implementation that I had hacked up
>>> for my own use), and Andy gave a very comprehensive and thoughtful
>>> response indicating how this serialization issue relates to the
>>> design of the XML Schema for SPARQL results, the defined lexical form
>>> of the results in the spec, and concerns about reparsing the literals
>>> in downstream processes:
>>>
>>> http://tech.groups.yahoo.com/group/jena-dev/message/25395
>>>
>>> I understand the desire to keep schemas tight and not have gratuitous
>>> XSD:ANY's flying around. But, on the other hand, it seems to me that
>>> RDF+SPARQL users who choose to use the XMLLiteral datatype are
>>> essentially choosing to store arbitrary XML within their RDF, and
>>> they are tagging it as such. So, if we want to support that use case,
>>> allowing the <literal> return block to contain XML, and using the
>>> XSD:ANY schema type to implement the facility seems appropriate.
>>> I do see that there are some choices which would need to be made in
>>> the face of the limited expressiveness of the XML-schema standard,
>>> and I won't launch into a discussion of those details unless/until
>>> others are interested.
>>>
>>> I can also see that some apparent collision issues could arise if
>>> literal content uses the default namespace, but these don't appear
>>> insurmountable. Again, I won't launch into examples until others have
>>> had a chance to respond in general terms.
>>> (Perhaps this whole issue was already debated somewhere before).
>>>
>>> In summary: I think that if there was a way for SPARQL engines to
>>> (optionally) return XMLLiterals without escaping the tags (and
>>> preferably, without violating applicable standards), that would be
>>> peachy.
>>>
>>> sincerely,
>>>
>>> Stu Baurmann
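
For readers unfamiliar with the cost the original message describes, the re-parsing step a consumer faces with escaped output might look roughly like the sketch below. It is only an illustration using Python's standard library: the result document, the helper name reparse_xml_literals, and the assumption that each XMLLiteral holds exactly one root element are made up for this example and are not taken from ARQ or from the thread.

```python
import xml.etree.ElementTree as ET

SPARQL_NS = "http://www.w3.org/2005/sparql-results#"
XML_LITERAL = "http://www.w3.org/1999/02/22-rdf-syntax-ns#XMLLiteral"
XHTML_NS = "http://www.w3.org/1999/xhtml"

# Hypothetical result document in the escaped form: the literal's markup
# arrives as text, with angle brackets turned into entity references.
ESCAPED_RESULT = """
<sparql xmlns="http://www.w3.org/2005/sparql-results#">
  <head><variable name="o"/></head>
  <results>
    <result>
      <binding name="o">
        <literal datatype="http://www.w3.org/1999/02/22-rdf-syntax-ns#XMLLiteral">&lt;xh:p xmlns:xh="http://www.w3.org/1999/xhtml"&gt;Contents of &lt;xh:em&gt;THE&lt;/xh:em&gt; paragraph&lt;/xh:p&gt;</literal>
      </binding>
    </result>
  </results>
</sparql>
"""


def reparse_xml_literals(result_xml):
    """Return a parsed element for every XMLLiteral binding.

    Assumes each literal contains a single root element; a literal that
    holds several sibling nodes would need to be wrapped before parsing.
    """
    root = ET.fromstring(result_xml.strip())  # first parse: the result set
    fragments = []
    for literal in root.iter(f"{{{SPARQL_NS}}}literal"):
        if literal.get("datatype") == XML_LITERAL:
            # Second parse: the literal's text is itself an XML document.
            fragments.append(ET.fromstring(literal.text))
    return fragments


fragments = reparse_xml_literals(ESCAPED_RESULT)
paragraph = fragments[0]
print(paragraph.tag)
print(paragraph.find(f"{{{XHTML_NS}}}em").text)
```

With unescaped output, the second ET.fromstring call would be unnecessary, since the xh:p element would already be part of the tree produced by the first parse.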
Received on Friday, 5 October 2007 02:47:34 UTC