- From: Mike Brown <mike@skew.org>
- Date: Thu, 13 Feb 2003 11:39:14 -0700 (MST)
- To: "Kay, Michael" <Michael.Kay@softwareag.com>
- CC: Mike Brown <mike@skew.org>, public-qt-comments@w3.org
Kay, Michael wrote:
> > XSLT 2.0 section 20.3 says:
> >
> >   The html output method should not perform escaping for the
> >   content of
> >   the script and style elements.
> >
> > According to HTML 4.01 section B.3.2, "</" within the content
> > of a SCRIPT or STYLE element should be escaped.
>
> Thanks for alerting me to this (non-normative) appendix.

  'The following notes are informative, not normative. Despite the
  appearance of words such as "must" and "should", all requirements
  in this section appear elsewhere in the specification.'

To me this is saying "we're designating this section as informative
even though it reads like it's normative. The reason we're doing this
is because there's nothing new here; all requirements listed here are
also listed elsewhere."

So the appendix is just further explanation of requirements that are
more formally defined elsewhere. For example,

  "HTML is an SGML application..."
  (http://www.w3.org/TR/REC-html40/conform.html#h-4.2)

...therefore it is implicit that an HTML document must conform to
SGML's restrictions.

I don't think we are in disagreement over what constitutes "correct"
vs "widely-accepted but incorrect" HTML. I think you are just
misinterpreting my recommendations (that the spec acknowledge these
issues and encourage the output of correct HTML) as being a call to
make failure to output correct HTML be regarded as an error or
conformance issue for an XSLT processor. I'm sorry that I wasn't
clearer.

>
> HTML SCRIPT elements do cause some serious problems. The fact is, users
> frequently do produce the cited "illegal" example:
>
> <SCRIPT type="text/javascript">
> document.write ("<EM>This won't work</EM>")
> </SCRIPT>
>
> and they seem to get away with it. What's more, they produce it in two
> different ways, both of which generally work:
>
> (1) as an element:
>
> <xsl:template match="x">
> <SCRIPT type="text/javascript">
> document.write ("<EM>This won't work</EM>")
> </SCRIPT>
> </xsl:template>
>
> (2) as characters:
>
> <xsl:template match="x">
> <SCRIPT type="text/javascript">
> document.write ("<EM>This won't work</EM>")
> </SCRIPT>
> </xsl:template>

The fact that they get away with either of these is testament to the
lenience of HTML user agents in accepting what is, strictly speaking,
illegal HTML. It is much the same as with the characters #x80-#x9F
being disallowed, even by reference, according to HTML's SGML
declaration, yet being accepted in common practice.

The XSLT spec's failure to encourage HTML generators to produce
correct output just perpetuates the situation.

I feel that an XSLT processor should not be forbidden from producing
strictly conforming SGML HTML, rather than just widely-accepted tag
soup; it should be encouraged, but not required, to make the output
as correct as possible. "Strict in what you produce, lenient in what
you accept"... or something like that.

> >
> > It also indicates that unencodable characters appearing in a
> > SCRIPT or STYLE element "must" be escaped according to the
> > conventions of the script or style language's syntax -- they
> > cannot be written as character references. This is a bit
> > unreasonable to impose on an XSLT processor; nevertheless, it
> > should be acknowledged. Processors should be given the option
> > of performing such escaping, and a fallback behavior (of
> > emitting a character reference anyway, I presume) should be defined.
> >
>
> I think the right solution for XSLT is that we should pass on this
> requirement to use language-specific escaping to the user. We should include
> an example showing that the correct way to write the above is:
>
> (3)
> <xsl:template match="x">
> <SCRIPT type="text/javascript">
> document.write ("<EM>This will work<\/EM>")
> </SCRIPT>
> </xsl:template>
>
> I don't think we should make either (1) or (2) an error, even though they
> result in illegal HTML - there are many, many ways to produce illegal HTML
> as the output of an XSLT stylesheet.

If you do provide an example, then it should also demonstrate the
unencodable character issue. The paragraph quoted above is saying
that it is illegal to do something like this in a us-ascii encoded
document:

  <script type="text/javascript">
  document.write('¡Hola!');
  </script>

You must do this (I think this is right):

  <script type="text/javascript">
  document.write('\u00A1Hola!');
  </script>

I think it is probably safest to just recommend that any non-ASCII
characters in a script or style element be written according to the
script or style language's syntax.

Mike

-- 
  Mike J. Brown   |  http://skew.org/~mike/resume/
  Denver, CO, USA |  http://skew.org/xml/
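
For illustration only, here is a minimal sketch of the two language-level escapes discussed above ("</" written as "<\/", and non-ASCII characters written as \uXXXX). The function name escapeForScriptElement is hypothetical; it is not part of XSLT, HTML, or any processor. It assumes the text being escaped sits inside JavaScript string literals, where both escapes are harmless, and it needs padStart (ES2017).

```javascript
// Hypothetical helper (an assumption for illustration, not any processor's API).
// Rewrites script text so it can be placed in an HTML 4 SCRIPT element of a
// us-ascii encoded document. Assumes the affected characters occur inside
// JavaScript string literals.
function escapeForScriptElement(text) {
  return text
    // "</" would terminate the SCRIPT element's content, so write it as "<\/".
    .replace(/<\//g, "<\\/")
    // Write every non-ASCII UTF-16 code unit as a \uXXXX escape.
    .replace(/[^\x00-\x7F]/g, function (ch) {
      var hex = ch.charCodeAt(0).toString(16).toUpperCase();
      return "\\u" + hex.padStart(4, "0");
    });
}

// Usage: the two examples from this thread.
console.log(escapeForScriptElement('document.write ("<EM>This will work</EM>")'));
console.log(escapeForScriptElement("document.write('\u00A1Hola!');"));
```

A stylesheet author would apply something like this to the script text before it reaches the serializer, since the html output method does not escape SCRIPT content itself.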
Received on Thursday, 13 February 2003 13:39:25 UTC