Re: Issues arising from not reparsing

From: Simon Pieters <simonp@opera.com>
Date: Mon, 26 Oct 2009 18:14:50 +0100
To: "Ian Hickson" <ian@hixie.ch>
Cc: "Henri Sivonen" <hsivonen@iki.fi>, "HTMLWG WG" <public-html@w3.org>
Message-ID: <op.u2e3i0n2idj3kv@simon-pieterss-macbook.local>

On Mon, 26 Oct 2009 11:21:49 +0100, Simon Pieters <simonp@opera.com> wrote:

> On Sun, 25 Oct 2009 12:03:19 +0200, Simon Pieters <simonp@opera.com>  
> wrote:
>
>> Further research
>>
>> We could rerun the same data collection but this time with "scripting  
>> enabled" so that we can get data for <noscript>, and also include  
>> <script> so we could get more accurate results than the regexp  
>> searches, in order to find out whether the double escape algorithm can  
>> be tweaked somehow for better compat or less complexity.
>
> <Philip`> zcorpan_:  
> http://philip.html5.org/data/cdata-containing-self-close-with-script.txt
> <zcorpan_> Philip`: thanks!
> <zcorpan_>  
> http://simon.html5.org/dump/cdata-containing-self-close-with-script.xml
> <zcorpan_> 884 occurrences for script


> I'll have a look at the script occurrences and see if I can come to any  
> conclusion.

Having looked at the first 125 occurrences with my compat hat on, I didn't  
come up with anything to change in the spec.


There are some pages doing something like

<script>
<!--
...
	document.write('<SCRIPT LANGUAGE=VBScript\> \n');
	document.write('...');
	document.write('</SCRIPT\> \n');
}
//-->
</script>

However, it appears that all of those manage to close their outer escape  
properly (or do not use <!-- at all).

Similarly with <\/script>: looking at  
http://philip.html5.org/data/script-open-in-escape.txt again, there are no  
occurrences of <\/script>, even though  
<script><!--d.w('<script><\/script>');</script> would match the regexp.

This suggests that those aren't really problematic from the compat point  
of view.

But do we want to say that it's invalid to do something like:

<script><!-- d.w('<script><\/script>'); //--></script>
<script><!-- d.w('<script></script\>'); //--></script>
<script><!-- d.w('<script></scr'+'ipt>'); //--></script>

...? Technically they don't cause a direct problem; however, they would  
cause a problem as soon as the author forgets the -->. I'm leaning towards  
flagging these as potentially problematic and something that a validator  
should whine about. If I understand the ABNF correctly, the spec currently  
makes these invalid.
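
To make the "forgets the -->" case concrete, here is a rough Python
sketch of how the escape handling could be modelled. It is loosely based
on the script data escaped and double escaped tokenizer states as they
later appeared in the HTML specification, so it may not match the draft
discussed here exactly. The function name find_script_end and the three
state labels are made up for this illustration. It takes the text that
follows the opening <script> tag and reports the offset of the "</script"
that ends the element, or None if the script would run to end of file.

```python
import re

# Assumption: a tag name only counts when it is followed by
# whitespace, "/" or ">" (so "</script\>" and "<\/script>" do not match).
DELIM = r"[\t\n\f />]"
OPEN_TAG = re.compile(r"<script" + DELIM, re.IGNORECASE)
CLOSE_TAG = re.compile(r"</script" + DELIM, re.IGNORECASE)
ESCAPE_END = re.compile(r"-{2,}>")  # "-->", "--->", ...


def find_script_end(content):
    """Return the offset of the "</script" that ends the element, or None.

    `content` is the text following the opening <script> tag.
    Hypothetical helper: a simplified model, not a full tokenizer.
    """
    state = "data"  # one of "data", "escaped", "double"
    i = 0
    while i < len(content):
        escape_end = ESCAPE_END.match(content, i)
        close_tag = CLOSE_TAG.match(content, i)
        open_tag = OPEN_TAG.match(content, i)
        if state == "data" and content.startswith("<!--", i):
            state = "escaped"
            i += 2  # stay on the "--" so that "<!-->" closes at once
        elif state != "data" and escape_end:
            state = "data"  # "-->" leaves both escaped states
            i = escape_end.end()
        elif state == "double" and close_tag:
            state = "escaped"  # inner end tag only undoes the inner <script>
            i = close_tag.end()
        elif close_tag:
            return i  # end tag seen in "data" or "escaped": element ends
        elif state == "escaped" and open_tag:
            state = "double"  # "<script" inside "<!--" starts double escape
            i = open_tag.end()
        else:
            i += 1
    return None
```

For the three examples above, this sketch finds the final </script> in
each case. If the //--> is dropped from the first one, it returns None:
the inner <script> has switched to the double escaped state, so the
trailing </script> only drops back to the escaped state and the element
is never closed.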

-- 
Simon Pieters
Opera Software
Received on Monday, 26 October 2009 17:15:37 UTC
