Re: SVG in text/html (was: @role in SVG)

On Oct 12, 2007, at 19:09, Doug Schepers wrote:
> Henri Sivonen wrote (on 10/12/2007 7:23 AM):
>> We don't do inline SVG in text/html yet. Personally, I hope we'll  
>> get there. However, if we do, the main SVG complications will be  
>> the xlink mapping, the /> syntax and SVG-native camelCaps. I don't  
>> think it is a good idea to introduce more complications if we are  
>> already entertaining inline SVG in text/html as a possibility.
> Thanks for outlining the challenges to integrating SVG into text/ 
> html, from an HTML5 standpoint.  That's very helpful.
> I also want that to happen, and would like to facilitate that when  
> the time comes.  Also like you, I do have certain concerns about  
> how it's done.  I'll give you my viewpoint (which is not  
> necessarily shared by the rest of the SVG or CDF WGs).
> From a technical and market viewpoint (an odd pairing, perhaps), I  
> feel very strongly that SVG-in-HTML should maintain identical  
> markup syntax with standalone SVG (or SVG-in-XHTML, and probably X/ 
> HTML-in-SVG); any differences between the two syntaxes would be  
> actively harmful to SVG.

Do you mean you'd like to bring in the complication of arbitrary  
namespace prefixes? I'd like to make the following deviations from  
SVG-as-XML syntax:
  1) I'd like to minimize the need of tokenizer parametrization to  
toggling case folding behavior and, if we must, CDATA sections.  
Specifically, I think attribute tokenization should run the same code  
as attribute tokenization for the HTML parts of text/html.
  2) I'd like to avoid supporting arbitrary namespace prefixes both  
in order to sidestep issues in shipped IE versions and in order to  
relieve authors of namespace syntax. (xlink: should probably be  
considered non-arbitrary and hard-wired.)

More concretely, I've been thinking something like this might work:
  * Case folding in the tokenizer should be made conditional so that  
potentially camelCap names in <svg> subtrees would not be case-folded.
    - Issue: Should case folding be toggled on and off (in which case  
tokenizing "<svg " would happen in the case-folding state allowing  
"<SvG ") or should names be collected unfolded and then whole names  
conditionally case-folded (in which case we could require "<svg " to  
be in lower case)?
    - Issue 2: If the latter, to avoid expensively case-folding whole  
start tag tokens *including* attributes later on, the tokenizer  
should probably have to know about tag names that turn on the case- 
preserving mode before looking for attributes but the tree builder  
should be the part of the parser telling the tokenizer to switch back  
to the case folding mode. This would be ugly but probably necessary.
  * Start tag tokens should have a flag about the /> presence. The  
tree builder would ignore this for HTML elements but would pop  
immediately for SVG elements.
  * The <svg> element would establish "an SVG scope" in the tree  
builder. The <svg> start tag token would itself be handled in the  
HTML state of the tree builder so that the <svg> element would be  
subject to foster parenting.
  * When in an SVG scope, the tree builder would ignore the HTML tree  
building rules. This means that stray tags looking like HTML tags  
could not cause the tree builder to pop out of the SVG scope. While  
in the SVG scope, the tree builder would assign the SVG namespace URI  
to the element nodes it creates.
    - Issue: What to do if there is a prefixed element?
  * When in the SVG scope, a start tag token would unconditionally  
result in the corresponding element node being appended to the  
current node. (And if the /> flag is set on the token, the node would  
be popped immediately.)
  * When in the SVG scope, an end tag token would cause a  
corresponding element to be searched starting with the current node  
towards the start of the SVG scope (and no further). If an element  
were found in scope, the stack would be popped until that element got  
popped. If there were no such element in scope, the end tag would be  
ignored. Any outcome but a single pop would be a parse error.
  * When the current node is a foreignObject element in an SVG scope,  
the start tag token <html> would establish a "nested HTML scope". </ 
html>, <body> and </body> would act like "normal" tokens in a nested  
HTML scope. Specifically, any token other than </html> encountered in  
a nested HTML scope would be unable to break out of the nested HTML  
scope.
  * Attributes with the name "xlink:href" on the tokenization level  
would be reported by the tokenizer as local name "href" in the XLink  
namespace.
  * xmlns or xmlns:* attributes would have no meaning and would be  
non-conforming except xmlns="" and  
xmlns:xlink="" would be allowed as  
"talismans" on the <svg> start tag.
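
To make the start tag and end tag rules for the SVG scope more concrete, here is a rough Python sketch. It is only an illustration under assumptions, not a worked-out design: the names Node and SvgScopeBuilder are made up, foster parenting, foreignObject and the nested HTML scope are left out, and parse errors are merely counted.

```python
from dataclasses import dataclass, field

SVG_NS = "http://www.w3.org/2000/svg"  # the SVG namespace URI


@dataclass
class Node:
    """Hypothetical minimal element node: a name, a namespace, children."""
    name: str
    namespace: str
    children: list = field(default_factory=list)


class SvgScopeBuilder:
    """Hypothetical fragment of a tree builder, covering only the SVG scope."""

    def __init__(self, svg_root):
        # stack[0] is the <svg> element that established the scope.
        self.stack = [svg_root]
        self.errors = 0

    @property
    def in_scope(self):
        # The scope ends once the <svg> element itself has been popped.
        return bool(self.stack)

    def start_tag(self, name, self_closing=False):
        # Unconditionally append to the current node; HTML tree building
        # rules are ignored, so HTML-looking tags cannot break out.
        node = Node(name, SVG_NS)
        self.stack[-1].children.append(node)
        self.stack.append(node)
        if self_closing:
            self.stack.pop()  # the /> flag pops the node immediately
        return node

    def end_tag(self, name):
        # Search from the current node towards the start of the scope.
        for i in range(len(self.stack) - 1, -1, -1):
            if self.stack[i].name == name:
                if len(self.stack) - i != 1:
                    self.errors += 1  # more than a single pop
                del self.stack[i:]    # pop until that element is popped
                return True
        self.errors += 1  # no such element in scope: ignore the token
        return False
```

For example, with the stack at svg > g > p, an end tag for "div" is ignored, and an end tag for "g" pops both p and g. Each of these counts as one parse error.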

The above trial balloon proposal is designed to optimize SVG  
integration in text/html in *future* browsers in a way that would  
create a namespace-aware DOM that current DOM-based SVG  
implementations would grok immediately but would at the same time  
remove namespace declaration syntax from the sight of authors. The  
proposal specifically isn't designed to clone the colon-based  
namespaces-in-text/html mechanism of IE. OTOH, it shouldn't interfere  
with it, either, except perhaps for xlink:href, which could be worked  
around by introducing href.
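
The name handling on the tokenizer side of the list above could be sketched roughly like this in Python. The function name report_attribute and the preserve_case flag are made up for illustration. The sketch assumes the "collect names unfolded, then fold whole names conditionally" alternative, assumes ASCII-only folding, and assumes xlink:href is mapped regardless of scope, none of which is settled above.

```python
import string

XLINK_NS = "http://www.w3.org/1999/xlink"  # the XLink namespace URI

# ASCII-only case folding table (assumption: non-ASCII is left untouched).
_ASCII_FOLD = str.maketrans(string.ascii_uppercase, string.ascii_lowercase)


def report_attribute(raw_name, preserve_case):
    """Return (namespace, local_name) for an attribute name as collected.

    preserve_case would be True while the tokenizer is in the
    case-preserving mode used for <svg> subtrees.
    """
    name = raw_name if preserve_case else raw_name.translate(_ASCII_FOLD)
    if name == "xlink:href":
        # Hard-wired, non-arbitrary prefix: report as "href" in XLink.
        return (XLINK_NS, "href")
    # Everything else is reported without a namespace; arbitrary prefixes
    # get no special meaning and stay part of the name.
    return (None, name)
```

With this, "viewBox" stays "viewBox" inside an SVG subtree but becomes "viewbox" in the HTML parts, and authors never have to declare the xlink prefix to get a namespaced href in the DOM.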

The approach outlined above could be used for MathML as well.  
However, in that case, the tokenizer should probably be modified to  
switch to MathML entity tables when the tree builder is in a MathML  
scope.

> From a logistics standpoint, this work should be done in  
> coordination between the HTML, SVG, and CDF Working Groups.  All  
> have a vested interest in it, and each has a unique set of  
> perspectives, needs, and knowledge.  Perhaps we can begin talking  
> about it at the upcoming Tech Plenary.  We are all busy with other  
> things right now, but opening the dialog will prepare us for what  
> we'll need to consider going forward.

I agree it would make sense to talk about it at the Tech Plenary.

Henri Sivonen

Received on Saturday, 13 October 2007 14:45:01 UTC