- From: Doug Schepers <schepers@w3.org>
- Date: Tue, 10 Mar 2009 18:57:29 -0400
- To: public-html@w3.org, www-svg <www-svg@w3.org>
Hi, HTML WG-

These are some of the opinions of the SVG WG on the topic of SVG-in-text/html, for consideration by the HTML WG. The opinions of individuals within the SVG WG differ; some favor a pure-XML approach, and some are more predisposed to a looser syntax, but in general, this is the state of our group consensus. We are happy to discuss specific details.

The requirements are based on consensus reached at the SVG WG Sydney F2F 2009, and in part from TPAC 2008. For reference and contrast, please see the old SVG in HTML proposal from the SVG WG. [1]

Requirements

1. HTML5 and SVG should make every effort to minimize the learning curve, pitfalls and other undesirable issues that content authors may encounter due to differences between SVG served as image/svg+xml and SVG in text/html, and when moving SVG between these two types of document. Insofar as is possible, content authors should be able to take a valid SVG document, paste its markup into an HTML document, and have it render as expected, with the SVG fragment's DOM identical to the DOM of the standalone SVG document when served as image/svg+xml. Content authors should not be burdened with unnecessary debugging, tweaking or cleanup steps in the common case of this simple process.

2. HTML5 should not place unnecessary barriers in the way of, or unnecessarily restrict, the future evolution of the SVG language. (Both working groups should coordinate to maximize compatibility between the two specifications and avoid standing on each other's toes, of course.)

In line with making the Open Web Platform as easy and pain-free to use as possible, the WG believes that, in general, when HTML5 parsers encounter SVG that would not be valid XML SVG, the SVG should be non-conforming, even though it would render.
The rationale is that validators and error consoles would flag and raise awareness of any issues in someone's HTML SVG that would stop them from copying the markup out to an XML file, and would thereby be another tool for reducing author pain when working with open formats like SVG and HTML.

Feedback

The following is feedback on the "foreign content" text that is currently commented out of the HTML5 draft by <!--XXXSVG ... --> comments. (These comments can be seen by loading pages from the parsing section of the HTML5 draft [2], running this Show XXXSVG comments bookmarklet [3], and then searching the pages for "XXXSVG".)

* The SVG WG is of the opinion that the contents of the SVG 'title' element should be RCDATA, and therefore would prefer that the HTML5 parsing algorithm not require conforming parsers to break out of foreign content mode and parse the element's content as HTML.

* The SVG WG feels that, on balance, it would be useful for the contents of SVG's 'desc' and 'foreignObject' elements to be parsed as HTML by default, and therefore does not object to the HTML5 draft requiring conforming parsers to break out of foreign content mode to parse the content of these elements. However, the SVG WG does have some concerns regarding adverse effects on extensibility. We also do not support the use of 'desc' as a container for fallback content, as has been suggested, though we do agree that a fallback mechanism for both SVG and HTML is a useful idea.

* The SVG WG recognizes that entities pose a particular challenge: undefined entity/character references won't work if SVG fragments are copied out of HTML, and DOCTYPE-defined entities (as are common in the output of some SVG authoring tools) could only work if those entity definitions are included in the file and are somehow recognized. The same problem could also occur in XHTML+SVG documents.
  In general, the SVG WG agrees that special-casing some entity handling is acceptable, and is happy to have a further dialog with implementers about this.

* For the 'font' element: If the HTML WG believes that the special handling of the <font> element is worth the extra implementation complexity in order to keep a minor fraction of existing HTML content from changing its rendering, then OK. (The SVG WG thinks it's good that the <font> element won't break out of foreign content mode for SVG for the most part.)

* There's a comment in the HTML5 spec:
  [[<!--XXXSVG need to define processing for </script> to match HTML5's </script> processing -->]]
  Could the HTML WG please clarify what is required with regard to that?

* In XML, CDATA sections are distinct from text, but in HTML it's all the same. This means that scripts that look at the structure of documents may not work. However, this is a minor issue that the SVG WG is willing to live with.

* The SVG WG is happy to see that XML and DOCTYPE declarations are ignored if found under the root element of the document. In that case they should have no effect (though it may be useful to discuss this in terms of the effect on entities declared in the DOCTYPE).

* The HTML5 draft defines a set of tag names for which the parser should break out of foreign content mode. The SVG WG would like to know the rationale for doing so for each of these tags.

* The SVG WG suggests, unless it is proven to break lots of content, adding character-encoding detection for SVG files served as "text/html" based on <?xml encoding="..."?>. There would still be an issue with UTF-8 SVG documents lacking an XML declaration; perhaps the fact that the first open tag encountered in the document is an <svg> tag could make the encoding guesser choose UTF-8 in this case?

* Ideally, the SVG WG would like the HTML tokenizer to be case-preserving for attribute and element names.

* The SVG WG requests that the SVG case-fixup table be removed from the draft.
  We believe that HTML5 should defer to the appropriate (SVG) specification(s), and that this is not something that HTML5 should define. If the tokenizer is required to be case-preserving, the table is no longer necessary.

* Going forward, the SVG WG recognizes that choosing all-lowercase attribute names would be helpful both for integration in HTML and if certain attributes are to become CSS properties. Choosing all-lowercase element names would also be preferred, although in some cases consistency would dictate that we introduce some new mixed-case element names. For example, if we introduced a new filter primitive element that didn't adhere to the "feSomethingOrOther" style, it would be confusing for authors.

* For the case where an SVG file is inadvertently served as 'text/html', the SVG WG proposes that if the parser encounters an 'svg' element in the "before html" parse mode, no 'html' and 'body' elements be inserted above the 'svg' element. Rather, we would prefer that the parser be required to simply insert the 'svg' element and switch to foreign content mode. (HTML5 could specify that documents with 'svg' as the root element are non-conforming so validators would flag this case.)

  There are at least two reasons for making this change. First, if parented by an implicit 'body' element, most SVG (specifically SVG that depends on the default value of 100% for the 'height' attribute on the 'svg' element) would then get a used height of 150px (the CSS 2.1 replaced element fallback height). This would result in SVG mistakenly or deliberately served as text/html rendering differently to the same SVG viewed locally or served as image/svg+xml. Second, accessing the 'document.documentElement' object is common in JavaScript in SVG, and such script assumes that this will be the 'svg' element and will not be prepared to encounter inserted parent 'html' and 'body' elements.
  This script would need to be changed if pasted in the middle of an HTML document, but we would be able to prevent breakage if the SVG were pasted as the whole document. Such documents should be in standards mode, regardless of whether they include the SVG DOCTYPE.

  We do have one unresolved issue with our request, however. If the parser encounters an HTML start tag that breaks out of foreign content mode, where would it "break out" to? (There's no <body> element to pop back to.)

* When SVG fragments in HTML are encountered, any invalid element or attribute casing should generate parse errors.

* The SVG WG is happy to see that unknown elements that are inside SVG fragments are inserted as SVG elements, but we'd like to see the casing of attributes and element names preserved.

* The SVG WG agrees that foreign content should not be allowed to imply start or end tags.

* The SVG WG requests that minimized and unquoted attribute values raise parse errors when found on SVG elements. Rationale:
  1. Consistency with making incorrect xmlns attributes generate parse errors.
  2. Minimizing the number of documents which are conforming HTML but whose SVG fragments, when copied to "image/svg+xml", are not well-formed.

* The SVG WG agrees that it may be useful to forgo namespace declarations for the SVG and XLink namespaces (as well as certain others, such as MathML). However, we believe that rather than hardcoding the namespace prefixes, those prefixes should default to that namespace. We are not suggesting at this time that namespace declarations should be able to override that default in HTML5, but some future revision of the language may specify that behavior, and hardcoding limits the potential for future extensibility solutions.

There are other issues on which we do not yet have consensus, and other considerations we believe are germane. These issues are always available for public feedback on the SVG WG wiki.
[4]

We are eager to discuss our feedback, and hope for a timely resolution.

[1] http://dev.w3.org/SVG/proposals/svg-html/svg-html-proposal.html
[2] http://www.whatwg.org/specs/web-apps/current-work/multipage/parsing.html
[3] http://bookmarklet.link/
[4] http://www.w3.org/Graphics/SVG/WG/wiki/SVG_in_text-html_2009

Regards-
-Doug Schepers, on behalf of the SVG WG
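
Illustration (a sketch, not part of the SVG WG's feedback): the copy-out concern behind the request about minimized and unquoted attribute values can be reproduced with any XML parser. The Python sketch below uses the standard library's xml.etree.ElementTree; the helper name is_wellformed_xml and the three sample fragments are hypothetical and chosen only for this example.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"


def is_wellformed_xml(markup: str) -> bool:
    """Return True if `markup` parses as well-formed XML, False otherwise."""
    try:
        ET.fromstring(markup)
    except ET.ParseError:
        return False
    return True


# Hypothetical fragments, as they might be copied out of a text/html page.
quoted = f'<svg xmlns="{SVG_NS}"><rect width="10" height="10"/></svg>'
unquoted = f'<svg xmlns="{SVG_NS}"><rect width=10 height=10 /></svg>'
minimized = f'<svg xmlns="{SVG_NS}"><rect hidden /></svg>'

for label, fragment in [("quoted", quoted),
                        ("unquoted", unquoted),
                        ("minimized", minimized)]:
    print(label, is_wellformed_xml(fragment))
```

An HTML tokenizer accepts all three attribute forms, but only the quoted one survives being saved as a standalone XML file, which is the situation a parse error in the HTML context would warn the author about.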
Received on Tuesday, 10 March 2009 22:57:45 UTC