- From: Kristof Zelechovski <giecrilj@stegny.2a.pl>
- Date: Tue, 17 Jun 2008 18:18:55 +0200
1. Please elaborate how an extension of CSS would require a sanitizer update. 2. Please explain why using a dedicated tag with double parsing is easier for a Web developer than putting the code in an attribute. 3. Your quoting solution would not cause legacy browsers to show plain text; they would show HTML code, which is probably much worse than showing plain text. If you mean JavaScript can be used to extract plain text, I doubt it will be simple; there are probably lots of junctions where this procedure can derail. 4. Please explain why you consider network efficiency for legacy user agents essential. I believe that the correlation between efficiency and compatibility is negative in general. If that causes server overload, the server can discriminate content depending on the user agent; this is a temporary solution to an edge case and it should probably be acceptable. Besides, a blog page with 60 comments on it is rather hard to render and read so you should probably consider other display options in this case. 5. I am not sure why IFRAME content must be HTTP-secured if the containing page is. The specification does not impose such a restriction AFAIK. And, if you need to go secure, do not allow scribbling in the first place, right? Please take these points as a challenge, not as an attempt to let you down. I personally think your idea is worth considering. Chris -----Original Message----- From: whatwg-bounces@lists.whatwg.org [mailto:whatwg-bounces at lists.whatwg.org] On Behalf Of Frode Borli Sent: Tuesday, June 17, 2008 3:05 PM To: whatwg at lists.whatwg.org Subject: Re: [whatwg] Sandboxing to accommodate user generated content. I have been reading up on past discussions on sandboxing content, and I feel that it is generally agreed on that there should be some mechanism for marking content as "user generated". The discussion mainly appears to be focused on implementation. Please read my implementation notes at the end of this message on how we can include this function safely for both HTML 4 and HTML 5 browsers, and still allow HTML 4 browsers to function properly. My main arguments for having this feature (in one form or another) in the browser is: - It is future proof. Changes to browsers (for example adding expression support to css) will never again require old sanitizers to be updated. - It does not require much skill and effort from the web developer to safely sanitize user content. - Security bugs are fixed by browser vendors, and not by each web developer. In the discussions I find that backward compatability is absolutely the most important issue. Second is that it must be easy for web developers to use the features. The suggested solution of using an attribute on an <iframe> element for storing the user generated content has several problems; 1: The use of src= as a fallback means that style information will be lost and stylesheets must be loaded again. 2: The use of src= yields problems with iframe heights (since the src-url must be hosted on another server javascript cannot fix this) and HTML 4 browsers have no other method of adjusting the iframe height according to the content. 3: If you have a page that lists 60 comments on a blog, then the user agent would have to contact the server 60 times to fetch each comment. This again means that perl/php scripts have to be invoked 60 times for one page view - that is 61 separate database connections and session initializations. 4: For the fallback method of using src= for HTML 4 browsers to actually work, the fallback documents must be hosted on a separate domain name. This again means that a website using HTTPS must purchase and maintain two certificates. I do not believe this solution will ever be used. My solution: If we add a new element <htmlarea></htmlarea>, old browsers will run scripts, while new browsers will stop scripts and this is a major problem. If HTML 5 browsers require everything between <htmlarea></htmlarea> to be html entity escaped, that is < and > must be replaced with < and > respectively. If this is not done, HTML 5 browsers will issue a severe warning and refuse to display the page. Developers will quickly learn. HTML 4 browsers will never run scripts (since it will only see plain text). HTML 5 browsers will display rich text. It would be completely secure for both HTML 4 and HTML 5 browsers. A simple Javascript could clean up the HTML markup for HTML 4 browsers.. > I believe the idea to deal with this is to add another attribute to <iframe>, besides sandbox="" and seamless="" we already have for sandboxing. This attribute, doc="", would take a string of markup where you would only need to escape the quotation character used (so either ' or "). The fallback for legacy user agents would be the src="" attribute. -- Best regards / Med vennlig hilsen Frode Borli Seria.no Mobile: +47 406 16 637 Company: +47 216 90 000 Fax: +47 216 91 000 Think about the environment. Do not print this e-mail unless you really need to. Tenk miljo. Ikke skriv ut denne e-posten dersom det ikke er nodvendig.
Received on Tuesday, 17 June 2008 09:18:55 UTC