- From: Henri Sivonen <hsivonen@iki.fi>
- Date: Tue, 16 Oct 2007 13:37:15 +0300
- To: Doug Schepers <schepers@w3.org>
- Cc: www-svg <www-svg@w3.org>, public-cdf@w3.org, "public-html@w3.org" <public-html@w3.org>
On Oct 13, 2007, at 23:20, Doug Schepers wrote:

> Henri Sivonen wrote (on 10/13/2007 10:43 AM):
>> Do you mean you'd like to bring in the complication of arbitrary
>> namespace prefixes?
>
> Not necessarily. I'm fine with imposing certain limitations on SVG
> content, assuming that it's a set of limitations that can be easily
> obeyed by authoring tools (and which, preferably, existing
> authoring tools abide by anyway).

It seems to me that using colonless element names is an easy
limitation for authoring tools to follow.

> The most important thing for me is that SVG fragments from an
> HTML+SVG (SVG-in-HTML) compound document could be extracted as
> standalone SVG documents; the second most important thing is that
> the most likely content from standalone SVG documents should work
> as an SVG fragment in HTML (this is second because I think it is
> likely that this will be the case, given existing SVG
> content-creation tools).

Do you mean the extraction from HTML should work on the source
copy-paste level as opposed to using a tool that incorporates an HTML
parser and an XML serializer? Even if the conforming case were
carefully specced to allow such copy-paste, content out there would
inevitably start to contain constructs that wouldn't be safe for
pasting into XML (just as content that tries to be
XHTML 1.0-as-text/html is now unsafe for pasting into XML on the
source level), so doing the extraction using a parser followed by a
serializer would be the safe way to go.

>> I'd like to make the following deviations from
>> SVG-as-XML syntax:
>> 1) I'd like to minimize the need of tokenizer parametrization to
>> toggling case folding behavior and, if we must, CDATA sections.
>
> Strictly speaking, CDATA sections are not required in SVG, but as
> you know, script will break in an XML parser if it doesn't escape
> its "<" and "&" characters.
> The majority of SVG authoring tools, I
> suspect, are not script-aware: they are just drawing apps that
> export to SVG; people savvy enough to be scripting can be expected
> to take precautions and read FAQs to resolve their problems there.
>
> Even drawing tools, though, are likely to use CSS, and may
> automatically enclose it in a CDATA section "just to be safe". It
> would be worthwhile to look at the survey of tools and see if they
> do this, and if so, if they can be encouraged to change this
> practice.
>
> I would prefer that CDATA be allowed, but it's not a deal-breaker.
> I confess I don't know why it's a problem in the HTML parser,
> though, if you care to explain.

Introducing CDATA sections wholesale into text/html (also into the
HTML parts of the document) would be a problem because new
CDATA-aware parsers and old CDATA-unaware parsers would give
incompatible parse trees and the incompatibility wouldn't even add
any expressiveness to the language.

As for introducing CDATA sections for <svg> subtrees only, there's
the issue of whether to be consistent with the surrounding HTML
syntax or with XML syntax. Copy-pasteability suggests supporting
XMLisms like CDATA sections and /> in <svg> subtrees. Consistency
with the surrounding HTML would suggest not supporting CDATA
sections.

The general problem with SVG <title>, <script>, <style> and
<textArea> is ensuring that they don't produce ungraceful results
when an SVG-in-text/html document is loaded in a legacy text/html
browser. It seems to me that authors who want to avoid <textArea>
rendering as HTML <textarea> in legacy browsers just have to avoid
<textArea> in SVG-in-text/html. <title> seems harmless enough when
the surrounding HTML already has a <title> of its own. In the case of
<style> and <script>, legacy browsers would try to treat them as HTML
<style> and <script>.
Parsing them the same way as HTML <style> and <script> in the case of
SVG-in-text/html would at least ensure that both old and new parsers
agree on when the elements end even when the script/style content
touches edge cases. On the other hand, having CDATA sections and not
having element-specific tokenization content models would be good for
copying and pasting from XML files. I can't say off-hand which
approach is the best.

> Most tools do include XML prologs and DOCTYPES in their SVG
> output... what effect will this have on a whole-file copy-paste
> into HTML, in terms of parsing?

You can't paste an XML declaration or a DOCTYPE in the middle of an
XHTML+SVG document, so from the conformance point of view I don't
think it is necessary to allow them to be pasted in the middle of
text/html. As for what should happen if you paste them in
nonetheless, I think the current behavior of the HTML5 parsing
algorithm is reasonable: the XML declaration turns into a comment
node and the doctype gets dropped.

>> Specifically, I think attribute tokenization should run the same
>> code as attribute tokenization for the HTML parts of text/html.
>
> Could you elaborate on that? What are the implications?

Unquoted attributes would be treated as in text/html in general. XML
attribute value normalization wouldn't be performed. (That is,
authors shouldn't rely on the parser discarding white space around
the value. Authors simply shouldn't put extra spaces in there. This
is already good advice with XML when the author doesn't know the
configuration of the receiving XML parser.) White space between the
close quote of a previous attribute and the name of the next
attribute wouldn't be required.

>> 2) I'd like to avoid supporting arbitrary namespace prefixes both
>> in order to sidestep issues in shipped IE versions and in order to
>> relieve authors of namespace syntax. (xlink: should probably be
>> considered non-arbitrary and hard-wired.)
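To make the attribute points above concrete, here is a rough sketch
in Python. It uses the standard library's html.parser as a stand-in
for HTML-style attribute tokenization (it is not an implementation of
the HTML5 parsing algorithm) and ElementTree (expat) for the XML
side. The helper names html_attrs and xml_attrs and the sample
markup are made up for illustration.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET


class AttrCollector(HTMLParser):
    """Record the attributes of each start tag as an HTML-style tokenizer sees them."""

    def __init__(self):
        super().__init__()
        self.seen = {}

    def handle_starttag(self, tag, attrs):
        self.seen[tag] = dict(attrs)


def html_attrs(source):
    """Return {tag: attributes} using HTML-style tokenization."""
    collector = AttrCollector()
    collector.feed(source)
    collector.close()
    return collector.seen


def xml_attrs(source):
    """Return {tag: attributes} from an XML parse, or None if not well-formed."""
    try:
        root = ET.fromstring(source)
    except ET.ParseError:
        return None
    return {el.tag: dict(el.attrib) for el in root.iter()}


# Unquoted values: accepted by the HTML-style tokenizer, fatal in XML.
unquoted = "<svg><rect width=10 height=5></rect></svg>"

# A line break inside one value and padding spaces around another.
padded = '<svg><path d="M0 0\nL9 9" id=" p "></path></svg>'

if __name__ == "__main__":
    print(html_attrs(unquoted)["rect"])
    print(xml_attrs(unquoted))
    print(repr(html_attrs(padded)["path"]["d"]))
    print(repr(xml_attrs(padded)["path"]["d"]))
    print(repr(xml_attrs(padded)["path"]["id"]))
```

In this sketch the XML parser replaces the literal line break in the
d value with a space, while the HTML-style tokenizer passes it
through untouched. Without a DTD declaring the attribute type, the
XML parser does not trim the padded id value either, which is why not
writing the extra spaces in the first place is the safe choice.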
>
> I think it's reasonable both to limit arbitrary namespace prefixes
> in HTML+SVG, and to hard-wire the XLink namespace. That
> SVG-fragment content will still work as expected in a standalone
> SVG UA, and most people trying to do clever things in namespaces
> will probably be using XHTML+SVG anyway.

OK.

>> The above trial balloon proposal is designed to optimize SVG
>> integration in text/html in *future* browsers in a way that would
>> create a namespace-aware DOM that current DOM-based SVG
>> implementations would grok immediately but would at the same time
>> remove namespace declaration syntax from the sight of authors. The
>> proposal specifically isn't designed to clone the colon-based
>> namespaces-in-text/html mechanism of IE. OTOH, it shouldn't
>> interfere with it, either, except perhaps for xlink:href, which
>> could be worked around by introducing href.
>
> I'm still on the fence about 'null:href'. Can you explain in
> detail why this is so problematic in HTML5 (especially given that
> SVG isn't natively supported in IE anyway)?

Perhaps special-casing xlink:href *only* isn't that bad, but
specifying new processing for names with colons *in general* carries
the risk of specifying something that's incompatible with what
happens when the syntax is fed to current IE. I've got an impression
that Microsoft doesn't want to change what they do with names that
contain colons, but I guess it is best if they comment on that. (I
don't currently have access to IE, so I can't test what exactly
happens with xlink:href.)

-- 
Henri Sivonen
hsivonen@iki.fi
http://hsivonen.iki.fi/
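As a rough illustration of the "parser followed by a serializer"
extraction mentioned earlier in this message, here is a minimal
Python sketch. It assumes the standard library's html.parser is an
acceptable stand-in for a text/html parser (it is not the HTML5
algorithm, and it lowercases all names, so camelCase SVG names such
as viewBox would need case correction in a real tool). SvgExtractor
and extract_svg are hypothetical names. The serializer writes the SVG
and XLink namespace declarations itself, mirroring the idea of
hard-wiring xlink: instead of relying on author-written declarations.

```python
from html.parser import HTMLParser
from xml.sax.saxutils import escape, quoteattr
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"
XLINK_NS = "http://www.w3.org/1999/xlink"


class SvgExtractor(HTMLParser):
    """Collect the first <svg> subtree and re-serialize it as XML."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.depth = 0      # nesting depth inside the <svg>; 0 means outside
        self.done = False   # set once the first <svg> subtree has closed
        self.out = []

    def handle_starttag(self, tag, attrs):
        if self.done:
            return
        if self.depth == 0:
            if tag != "svg":
                return
            # Drop author-written declarations and hard-wire both namespaces.
            attrs = [(n, v) for n, v in attrs if not n.startswith("xmlns")]
            attrs = [("xmlns", SVG_NS), ("xmlns:xlink", XLINK_NS)] + attrs
        self.depth += 1
        # quoteattr() quotes and escapes, so unquoted input becomes valid XML.
        rendered = "".join(f" {n}={quoteattr(v or '')}" for n, v in attrs)
        self.out.append(f"<{tag}{rendered}>")

    def handle_endtag(self, tag):
        # Assumes start and end tags inside the subtree are balanced.
        if self.done or self.depth == 0:
            return
        self.depth -= 1
        self.out.append(f"</{tag}>")
        if self.depth == 0:
            self.done = True

    def handle_data(self, data):
        if self.depth and not self.done:
            self.out.append(escape(data))


def extract_svg(html_source):
    """Return the first <svg> subtree of html_source as an XML string."""
    parser = SvgExtractor()
    parser.feed(html_source)
    parser.close()
    return "".join(parser.out)


sample = (
    '<p>Intro</p><svg width=100 height="100">'
    '<circle id=c r=40 fill=red /><use xlink:href="#c" x=50 />'
    "<text>a &amp; b</text></svg><p>Outro</p>"
)

if __name__ == "__main__":
    xml_text = extract_svg(sample)
    print(xml_text)
    ET.fromstring(xml_text)  # raises ParseError if the result is not well-formed
```

The sketch only tracks nesting depth, so mismatched tags inside the
subtree, or attribute names that are not legal XML names, would still
produce broken output; a real tool would build a tree first and
serialize from that.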
Received on Tuesday, 16 October 2007 10:37:47 UTC