- From: Henri Sivonen <hsivonen@iki.fi>
- Date: Wed, 11 Mar 2009 10:20:34 +0200
- To: Ian Hickson <ian@hixie.ch>
- Cc: Doug Schepers <schepers@w3.org>, public-html@w3.org, www-svg <www-svg@w3.org>
On Mar 11, 2009, at 01:48, Ian Hickson wrote:

> On Tue, 10 Mar 2009, Doug Schepers wrote:
>> * For the case where an SVG file is inadvertently served as
>> 'text/html', the SVG WG proposes that if the parser encounters an
>> 'svg' element in the "before html" parse mode that no 'html' and
>> 'body' element be inserted above the 'svg' element. [...]
>
> Why would we want to support SVG files sent as text/html? Surely this
> is an error and should not be supported.

See http://lists.w3.org/Archives/Public/www-archive/2009Mar/0036.html
for a use case. (Note that the use case makes sense if <svg> as root is
conforming. I think it isn't worthwhile to add all the complexity of
<svg> root handling as mere error handling.)

> * I am concerned also that actually implementing this would consist of
> a significant change to the parsing algorithm, reaching across multiple
> insertion modes, affecting very sensitive things like quirks mode
> detection.

Indeed, I think we should stick to <!DOCTYPE html> to avoid complicating
the mode detection and to avoid making it more stateful across tokens.
However, I can see how <!DOCTYPE html><svg>... might be seen as
aesthetically unpleasing (even though such a construct would even be
allowed in well-formed XML!).

> I am also concerned that this would lead to very strange behavior for
> authors once they started relying on it. Consider for instance the
> difference between this:
>
> [BOM]<svg>...
>
> ...and:
>
> [BOM][BOM]<svg>...

I think issues of this nature are the strongest reason against.

>> * Ideally, the SVG WG would like the HTML tokenizer to be
>> case-preserving for attribute and element names.
>
> My understanding is that doing this would introduce an unacceptable
> performance penalty for implementations.

Indeed.
The case information is lost in the tokenizer early on:
http://hg.mozilla.org/users/mrbkap_mozilla.com/html5parsing/file/ed748ec71a6d/content/html/parser/src/nsHtml5Tokenizer.cpp#l655

Then the token interning function can work without caring about case:
http://hg.mozilla.org/users/mrbkap_mozilla.com/html5parsing/file/ed748ec71a6d/content/html/parser/src/nsHtml5ElementName.cpp#l58

The camelCase SVG tokens are shared between all parser instances:
http://hg.mozilla.org/users/mrbkap_mozilla.com/html5parsing/file/ed748ec71a6d/content/html/parser/src/nsHtml5ElementName.cpp#l507

Then much later SVG camelCase names are fixed based on the pre-interned
well-known tokens:
http://hg.mozilla.org/users/mrbkap_mozilla.com/html5parsing/file/ed748ec71a6d/content/html/parser/src/nsHtml5TreeBuilder.cpp#l3416

By the time the tree builder knows that the token creates an SVG
element, the case information is long lost, and the object the tree
builder works with is a pre-interned read-only object shared by all
parser instances in the process, and such shared objects can't have any
per-parser-instance data such as the original case on them.

I wouldn't want to undo this token interning mechanism just to be able
to send errors to the error console in Firefox. Also, I wouldn't want to
maintain significantly different code paths for different classes of
products (error-detecting and not error-detecting).

>> * The SVG WG requests that minimized and unquoted attribute values
>> raise parse errors when found on SVG elements. Rationale:
>> 1. Consistent with making incorrect xmlns attributes generate parse
>> error.
>> 2. Minimizing the number of documents which are conforming HTML whose
>> SVG fragments when copied to "image/svg+xml" are non-wellformed.
>
> This seems reasonable; what do other people think about this? (There
> have been requests that we make SVG-in-HTML support HTML-like attribute
> syntax.)

I think it makes sense to make it a conformance error if an SVG element
has an attribute xmlns whose value is not the SVG namespace URI.
However, I see no point in making the absence of the xmlns attribute an
error, when things can be made to work just fine without it.

I don't like the idea of making attributes have different errors for SVG
elements:
1) It would make text/html self-inconsistent where self-inconsistency is
easily avoidable.
2) It would complicate an error-reporting tokenizer.

Furthermore, doing the check only for tokens that will later result in
SVG elements would be complicated. The most reasonable implementation
would only query the 'in foreign' state, which would mean that the
errors would apply to MathML as well and to HTML elements that end up
breaking out of foreign content.

I think the issue of moving content from text/html to image/svg+xml by
copying and pasting raw source is a lost cause anyway. I think a browser
context menu item "Save as SVG Image..." or "View SVG source" would work
much better. For a solution that doesn't require a browser, I put
forward the HTML2XML command line tool that comes with the Validator.nu
HTML Parser.

Concretely, I see no practical reason why this demo should be
non-conforming:
http://hsivonen.iki.fi/test/moz/html5-parsing.html

-- 
Henri Sivonen
hsivonen@iki.fi
http://hsivonen.iki.fi/
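
As an illustration of the token interning argument made earlier in this message, here is a minimal Python sketch. It is not the code behind the linked nsHtml5*.cpp files; every name in it (ElementName, tokenize_tag_name, intern_element_name, tree_builder_name, and the small name table) is hypothetical and chosen only to show why the original case cannot be recovered once the tree builder learns that an element is SVG.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ElementName:
    """Read-only element name object (hypothetical stand-in for a pre-interned token)."""
    name: str             # lowercased spelling, as produced by the tokenizer
    camel_case_name: str  # spelling to use if the element turns out to be SVG


# Built once and shared by every parser instance (assumed, illustrative table).
_INTERNED = {
    n.lower(): ElementName(n.lower(), n)
    for n in ("svg", "foreignObject", "linearGradient", "clipPath", "div")
}


def tokenize_tag_name(raw: str) -> str:
    """Tokenizer step: fold ASCII uppercase to lowercase. The original case is dropped here."""
    return "".join(chr(ord(c) + 0x20) if "A" <= c <= "Z" else c for c in raw)


def intern_element_name(lowercased: str) -> ElementName:
    """Interning step: return the shared object for well-known names."""
    known = _INTERNED.get(lowercased)
    # Unknown names get a fresh object in this sketch; well-known ones are shared.
    return known if known is not None else ElementName(lowercased, lowercased)


def tree_builder_name(token: ElementName, in_svg: bool) -> str:
    """Tree builder step: restore the camelCase spelling only for SVG elements."""
    return token.camel_case_name if in_svg else token.name


# Two differently cased inputs end up as the very same shared object,
# so nothing on that object can record how either input was spelled.
a = intern_element_name(tokenize_tag_name("FOREIGNOBJECT"))
b = intern_element_name(tokenize_tag_name("foreignObject"))
```

In this sketch, preserving the source spelling would require either carrying an extra per-token string next to the shared object or giving up the sharing, which is the cost the message above argues against.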
Received on Wednesday, 11 March 2009 08:21:31 UTC