[whatwg] Javascript: URLs as element attributes

On 2/9/11 10:12 PM, Ian Hickson wrote:
> On Mon, 15 Nov 2010, Boris Zbarsky wrote:
>> On 11/15/10 8:15 PM, Ian Hickson wrote:
>>>> Gecko's currently-intended behavior is to do what [the spec]
>>>> describes in all cases except:
>>>>
>>>>     <iframe src="javascript:">
>>>>     <object data="javascript:">
>>>>     <embed src="javascript:">
>>>>     <applet code="javascript:">
>>>
>>> What does it do for those cases if it doesn't match the spec?
>>
>> For <iframe> the behavior in Gecko currently is different in terms of
>> what the URI of the result document of javascript: is set to.
>
> How does it differ? As far as I can tell, it works the same as the spec
> says (the document.location is "about:blank" in the example above).

The example above doesn't actually return a document from the 
javascript: URI; it was a shorthand for a generic javascript: URI that 
does do that.

Try this:

   data:text/html,<body onload="alert(window[0].location)"><iframe src="javascript:''">

>> Note that there is some confusion here in terms of browsing contexts and
>> <object>, since <object> does expose a Document object sometimes (but not
>> others) and does participate in session history sometimes, I believe...  So
>> I'm not quite sure what behavior the spec calls for for <object>.
>
> It's defined; see the section on the <object> element.

I've read that section, in fact.  I couldn't make sense of what behavior 
it actually called for.  Has it changed recently (last few months) to 
become clearer such that rereading would be worthwhile?

>> At least in Gecko, the return value string is examined to see whether
>> all the charcode values are < 255.  If they are, then the string is
>> converted to a byte array by just dropping the high byte of every char.
>> So you can pretty easily generate image data this way.
>>
>> If any of the bytes are > 255, then the string is encoded as UTF-8
>> instead.
>
> Hm. This currently isn't specced; the spec just assumes the return value
> is text/html string data and doesn't say what encoding to use. Is there a
> good way to test this in the context of an <iframe>, where all the
> browsers do something with javascript:?

<body onload="alert(window[0].document.characterSet)"><iframe 
src="javascript:'\u0400'">

(can't be a data: URI in webkit, for what it's worth; seems to fail 
same-origin checks).

If I load that from file://, it alerts UTF-8 in Gecko, ISO-8859-1 in the
Webkit-based browsers I have here, empty string in Opera 11 (?).
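
For reference, the Gecko behavior described in the quoted text above can be 
sketched roughly as follows.  This is only an illustration of the rule as 
stated in this thread, not Gecko's actual code: the function name 
javascriptUrlResultToBytes and the shape of its return value are made up 
for the example, and treating a char code of exactly 255 as fitting in one 
byte is an assumption, since the description above only says "< 255" and 
"> 255".

```javascript
// Hypothetical helper illustrating the rule described in this thread.
// Not Gecko's real implementation; names and return shape are assumptions.
function javascriptUrlResultToBytes(result) {
  // Check whether every UTF-16 code unit fits in a single byte.
  // Assumption: a value of exactly 255 counts as fitting.
  let fitsInOneByte = true;
  for (let i = 0; i < result.length; i++) {
    if (result.charCodeAt(i) > 255) {
      fitsInOneByte = false;
      break;
    }
  }

  if (fitsInOneByte) {
    // Drop the (all-zero) high byte of every char, keeping the low byte.
    const bytes = new Uint8Array(result.length);
    for (let i = 0; i < result.length; i++) {
      bytes[i] = result.charCodeAt(i) & 0xff;
    }
    return { bytes: bytes, utf8: false };
  }

  // Otherwise encode the whole string as UTF-8.
  return { bytes: new TextEncoder().encode(result), utf8: true };
}
```

With this sketch, a return value like '\u0400' takes the UTF-8 path, while 
a string built only from char codes up to 255 comes out byte for byte, 
which is what makes generating image data possible.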

You could also do things like generate a document that links to a 
stylesheet with no encoding information and see what encoding the sheet 
is treated as.

If the question was whether it's possible to tell by black-box testing 
what the return string is actually treated as, not just what 
characterSet the resulting document reports, I'd have to do some more 
thinking.

-Boris

Received on Wednesday, 9 February 2011 19:52:54 UTC