Re: attribute value escaping

From: Mike Brown <mike@skew.org>
Date: Fri, 9 Jun 2000 18:59:28 -0600 (MDT)
Message-Id: <200006100059.SAA68112@skew.org>
To: xsl-list@mulberrytech.com
CC: xsl-editors@w3.org

Mike Kay wrote:
> The only characters I escape are non-ASCII characters (specifically,
> characters not in the range 32 to 126) plus space and "%".

I think the only room for leeway here is in the XSLT spec's statement that
the HTML output method outputs HTML that "conforms to" the Recommendation.
Since the HTML DTDs are part of the Recommendation, and they all state
that certain attribute values (like img src) must be URIs conforming to
RFC 2396, then the question is, should the XSL processor try to ensure
that such attribute values conform to RFC 2396?

If the answer is yes, then not only should non-ASCII characters be escaped
as per the XSLT spec, but spaces should be as well. "%" should be escaped
if it is not followed by a pair of hex digits from [0-9A-Fa-f]. This
of course is problematic because it assumes that anything that looks like
an escape sequence must be one. All other characters, including "`" as I
mentioned earlier in this thread, would not be safe to escape, per RFC
2396 sec. 2.4.2, which emphasizes that URIs are by definition "already
escaped".

If the answer is no, as I think it should be, then only the non-ASCII
characters should be escaped, and spaces, "%", etc. should be left alone,
because it is the document author's job to make the value a valid URI.
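
To make the difference between the two answers concrete, here is a rough
Python sketch of what each policy would do to an attribute value. The
function names are made up for illustration and are not taken from any
XSLT processor. I am also assuming that "non-ASCII" means code points
above 127 and that such characters are written out as %HH escapes of
their UTF-8 bytes, in the manner of HTML 4 appendix B.2.1.

```python
import re

# A "%" that is NOT followed by two hex digits, i.e. one that cannot be
# the start of an existing escape sequence.
_BARE_PERCENT = re.compile(r"%(?![0-9A-Fa-f]{2})")


def escape_non_ascii(value: str) -> str:
    """The "no" policy: touch only non-ASCII characters.

    Each character above code point 127 (an assumption about what
    "non-ASCII" covers) becomes the %HH escapes of its UTF-8 bytes.
    Spaces, "%" and everything else pass through unchanged.
    """
    out = []
    for ch in value:
        if ord(ch) > 127:
            out.append("".join(f"%{b:02X}" for b in ch.encode("utf-8")))
        else:
            out.append(ch)
    return "".join(out)


def escape_for_rfc2396(value: str) -> str:
    """The "yes" policy: also escape spaces and bare "%" characters.

    A "%" followed by two hex digits is assumed to be an escape already,
    which is exactly the guess that makes this policy questionable.
    """
    value = _BARE_PERCENT.sub("%25", value)  # must run before adding new escapes
    value = value.replace(" ", "%20")
    return escape_non_ascii(value)
```

With these definitions, a value like "100%.gif" is left alone by
escape_non_ascii but has its "%" rewritten as "%25" by escape_for_rfc2396,
while "a%20b" passes through both unchanged, whether or not the author
meant "%20" as an escape.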

Obviously this will not help the poor guy trying to write JHTML that
(apparently) allows an img src to be script data. Even if the HTML spec
allowed an img src to be script data, the XSLT spec's omission of "script
data"-type attribute values from the no-escaping-for-<script>-and-<style>
clause would still cause problems.

I am cc'ing xsl-editors@w3.org because I'd like to know the answer to the
yes/no question above, re: escaping URI-type attribute values in the HTML
output method, and because I would like to know if "script data"-type
attribute values should get the same no-escaping treatment as <script> and
<style> elements.

   - Mike
____________________________________________________________________
Mike J. Brown, software engineer at         My XML/XSL resources:
webb.net in Denver, Colorado, USA           http://www.skew.org/xml/
Received on Friday, 9 June 2000 20:59:36 GMT

This archive was generated by hypermail 2.2.0+W3C-0.50 : Monday, 7 December 2009 10:59:50 GMT