Re: parsing URI (references) according to RFC 3986

On 6/19/2011 10:54 PM, Boris Zbarsky wrote:
> On 6/20/11 1:32 AM, Chris Weber wrote:
>> It seems a little scary if I understand what you're saying correctly.
>
> Why scary (other than the need to figure out whether this is needed for
> compat and if so specify it)? Are there security issues here that you see?
>

I spoke a bit too soon.  I don't see any security issue here.

> Note that this behavior seems somewhat interoperable in the 5 browsers
> you tested, and that the first test you ran (where you were testing the
> \ behavior) actually depended on it....
>
>> More scary perhaps is that when I test "file://c|/0110/foo" I see in the
>> DOM parsing that IE and Chrome both convert the "|" to the ":".
>
> Again, why is this scary, apart from the magic-ness of it all?

It raises alarm bells for me any time a character is transformed from 
one thing into another.  A classic example is in HTML when a U+FF1C 
FULLWIDTH LESS-THAN SIGN is transformed into a U+003C LESS-THAN SIGN. 
I'm not saying any browsers do this, but I've seen cases where an XSS 
filter did, which easily led to an exploit.
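
To make the fullwidth example concrete, here is a minimal Python sketch.
It assumes the filter's transformation was Unicode NFKC normalization and
that the filter checked for "<" before normalizing; both are my
assumptions for illustration, not a description of the actual filter I
saw.  The function name is hypothetical.

```python
import unicodedata


def naive_filter_then_normalize(text: str) -> str:
    """Hypothetical flawed filter: it rejects ASCII '<' first and only
    normalizes afterwards, so the check sees the untransformed input."""
    if "<" in text:
        raise ValueError("markup rejected")
    # NFKC maps U+FF1C FULLWIDTH LESS-THAN SIGN to U+003C LESS-THAN SIGN
    # (and U+FF1E to U+003E), which reintroduces the characters the check
    # above was meant to keep out.
    return unicodedata.normalize("NFKC", text)


# The payload contains no ASCII '<' or '>', so it passes the check.
payload = "\uFF1Cscript\uFF1Ealert(1)\uFF1C/script\uFF1E"
result = naive_filter_then_normalize(payload)
print(result)
```

Running the check after normalization instead of before would reject
this payload.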

In the case of a "|" transforming to ":" it scares me the same way: 
any security check performed on a path component before the 
transformation takes place could be bypassed.

Were you aware that IE and Chrome converted file://c|/foo to 
file:///c:/foo and file:///C:/foo respectively?  It was your test case 
after all :)
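
Here is a rough Python sketch of the kind of check-before-transform
problem I mean.  The rewrite function only emulates the IE result
described above (Chrome also upper-cases the drive letter); it is not
either browser's real algorithm, and both function names are made up for
illustration.

```python
import re


def emulate_pipe_to_colon(url: str) -> str:
    """Hypothetical emulation of the observed rewrite:
    file://c|/foo  ->  file:///c:/foo"""
    return re.sub(r"^file://([A-Za-z])\|/", r"file:///\1:/", url)


def drive_path_allowed(url: str) -> bool:
    """Hypothetical check that tries to block Windows drive paths by
    looking for '<letter>:/' after the scheme.  True means allowed."""
    return re.search(r"file:/+[A-Za-z]:/", url) is None


url = "file://c|/0110/foo"

# Check performed before the transformation: nothing looks like a drive.
allowed_before = drive_path_allowed(url)

# The consumer then rewrites "|" to ":" and ends up with a drive path.
transformed = emulate_pipe_to_colon(url)
allowed_after = drive_path_allowed(transformed)

print(allowed_before, transformed, allowed_after)
```

The same check gives opposite answers depending on whether it runs
before or after the rewrite, which is the risk.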

I understand this is known behavior on Windows:

http://stackoverflow.com/questions/5026585/file-uri-on-windows-server-2008-wont-accept-colon-but-needs-pipe

http://en.wikipedia.org/wiki/File_protocol#Windows_2

I can understand being liberal in accepting "|" characters in the path 
segment, even though RFC 3986 and 3987bis would have you percent-encode 
them to "%7C".  But I didn't realize that IE and Chrome would actually 
perform a transformation on the input in this way.
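
For comparison, percent-encoding leaves the "|" as data rather than
turning it into a different character.  A small sketch using Python's
standard urllib.parse.quote, with the path string from the test above as
sample input:

```python
from urllib.parse import quote, unquote

path = "c|/0110/foo"

# quote() leaves "/" unescaped by default and percent-encodes "|",
# which is not in its set of always-safe characters.
encoded = quote(path)
print(encoded)

# Decoding gives back the original "|", never a ":".
decoded = unquote(encoded)
print(decoded)
```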

Chris

Received on Monday, 20 June 2011 07:11:40 UTC