- From: Michael Brundage <xquery@comcast.net>
- Date: Wed, 14 Jan 2004 10:42:29 -0800
- To: David Carlisle <davidc@nag.co.uk>, XQuery Public Comments <public-qt-comments@w3.org>
Although interesting, none of your comments are specific to XQuery.

For example, consider embedding this text in an XML document:

<foo/>&<?foo?>

If you don't escape it first (and unescape it when parsing), then it would be interpreted as structure instead of text. If the parser interprets the processing instruction to mean something, then you've achieved XML injection.

Now consider embedding the XPath "<foo/>&<?foo?>" into the text of an XML document as you show below. The XPath is just a literal string, but the XML parser would treat this as a quote character, followed by an element, an entity, a p-i, and a quote character.

So these problems already exist for XPath, without even resorting to full-blown XQuery. Therefore, they can't be avoided without removing XPath from XQuery.

Also, embedding XQuery queries by wrapping them with CDATA sections clearly doesn't work when a query contains the sequence ]]> (in a string, a comment, as the terminator of a CDATA section, etc.). Although the embedding could work around this by splitting into separate CDATA sections at each occurrence of ]]>, I have yet to see an XML application that uses CDATA in this way account for this (and I can name several that break when you embed ]]>).

Just my $0.02,
michael

On 1/14/04 3:05 AM, "David Carlisle" <davidc@nag.co.uk> wrote:

> I believe that the approach outlined in this section is very dangerous
> and can easily lead to queries being accidentally or maliciously altered.
>
> It is unfortunate that XQuery misuses XML syntax for a non XML language
> (previous comments to this list on XML Query have suggested that it does
> not do that, but I assume now that the current syntax is fixed) However
> this means that great care needs to be taken when inserting XML Query
> fragments into XML documents.
>
> Even in this small section, three different embeddings are suggested,
> and no indication is given in the embedding syntax about which embedding
> has been used.
>
> Most problematic are situations where extracting the query using the
> wrong embedding produce a valid, but different, Xquery from the one
> intended.
>
> For example the second embedding shown
>
> <XQuery><![CDATA[for $i... let $j...where $x < $y...return...]]></XQuery>
>
>
> Could (if the XML parser used, reports CDATA sections) be extracted using
> the embedding used for the first example. the result would be that
> instead of getting an Xquery FOR expression, you get an Xquery CDATA
> constructor, this is a perfectly valid Xquery expression. Of course one
> might to be expected to use common sense to distinguish the cases, but
> machines are not too good at common sense, and in harder cases it would
> be harder to guess.
>
> Similar things occur with
>
> <XQuery> ("abc","xyz") </XqueryX>
>
> If you put that "trivial" embedding through an XML parser which reports
> that the content of the XQuery element is
> ("abc","xyz")
> then I don't see how you could reliably decide whether the original
> expression was a sequence of one or two strings.
>
> Other problems exist with things such as Xquery Comment constructors
> if these are embedded using the first scheme (ie just literally
> included) then whether or not the extracted query contains a comment
> constructor will depend on whether the XML parser used reports
> comments.
>
>
> My preferred solution would be to modify the Xquery syntax so that the
> first suggested embedding is always legal and safe, this means
> essentially not using unescaped <, modifying the rules for the timing of
> the expansion of character references, and using a different syntax for
> comment and pi constructors (as in xslt) however failing that: section 5
> should be dropped or replaced by a much more fully spec'd proposal that
> would allow Xqueries to be unambiguously and safely embedded in XML
> documents.
>
> David
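
The two embeddings discussed in the reply above (escaping the query as character data, and wrapping it in CDATA while splitting at each occurrence of ]]>) can be sketched roughly as follows. This is only an illustration under stated assumptions: the helper names embed_escaped, embed_cdata and extract are hypothetical, the <XQuery> wrapper element is taken from the quoted examples, and Python's standard-library parser stands in for whatever XML parser an application actually uses.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape


def embed_escaped(query: str) -> str:
    """Embed a query as escaped character data (hypothetical helper)."""
    # escape() rewrites &, < and > so the parser sees text, not markup.
    return "<XQuery>" + escape(query) + "</XQuery>"


def embed_cdata(query: str) -> str:
    """Embed a query in CDATA, splitting at each ']]>' (hypothetical helper)."""
    # Each ']]>' in the query becomes ']]' + section end + new section + '>',
    # so no single CDATA section ever contains the terminator sequence.
    safe = query.replace("]]>", "]]]]><![CDATA[>")
    return "<XQuery><![CDATA[" + safe + "]]></XQuery>"


def extract(xml_text: str) -> str:
    """Recover the query text; the parser undoes escaping and merges CDATA."""
    return ET.fromstring(xml_text).text or ""


if __name__ == "__main__":
    query = 'for $i in ("<foo/>&<?foo?>", "]]>") where $x < $y return $i'
    print(extract(embed_escaped(query)) == query)
    print(extract(embed_cdata(query)) == query)
```

Both helpers round-trip only because the extractor reads the element's text content after parsing. An extractor that works on the raw markup instead, as in the mismatched-embedding cases described in the quoted message, would still recover a different query.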
Received on Wednesday, 14 January 2004 13:45:37 UTC