Re: HTML5 and SVG

To mess up this nice summary I'll add some of my comments inline below here. :)

On Mon, 12 Jan 2009 01:17:02 +0100, Cameron McCormack <cam@mcc.id.au> wrote:

>
> We discussed SVG in HTML a little at the 11/12/2008 telcon.  Here is a
> summary of the points made distilled from the minutes, with my comments
> in brackets:
>
>   * Doug says that the goal “SVG should remain XML when inline in HTML”
>     should be a consequence of the later, more high level goals listed
>     in Erik’s mail.
>
>   * Erik says that neither his nor my earlier mail captures exactly what
>     we want from this goal, since there are certain kinds of XML
>     non-well-formednesses that would be reasonable to allow in HTML.

Would it help if we had a complete list of XML:isms that are unsupported in HTML, and a list of HTML:isms that are unsupported in XML? I think we have a partial list at the moment, from our proposal and the work that preceded it. We could try to summarize those and make a table of issues.

>   * Chris says that we shouldn’t be concerned about validity, just
>     well-formedness.  (Although in HTML, the term “validity” covers both
>     validity and well-formedness in the XML sense.)
>
>   * Erik says that we want parsing fixups to be marked as parse errors.
>     (I would like to know exactly what things would be classed as parse
>     errors here and what would be allowable syntax.)

I wouldn't necessarily consider missing attribute quotes a parse error[1] (also see further comments below), but cases where elements are automatically closed (if that happens to svg elements) should IMHO be clearly marked as parse errors.

>   * I say that we wouldn’t want to have any implied end tags, as that
>     would restrict our ability to extend the language.
>
>   * Chris says that for the “copying the XML SVG and pasting it into an
>     HTML document” use case, there was agreement at TPAC that the author
>     wouldn’t be copying the DOCTYPE.  (I assume that goes for the XML
>     declaration if it exists, too.)

Wasn't it that DOCTYPE:s may be copied, but that they have no effect on the document (essentially that they're ignored by the parser)?

For XML declarations, character encoding caveats apply of course (for copy-pasting between documents in different encodings). What about <?xml-stylesheet?> processing instructions? Also, we should expect that people are going to use <link> elements for pulling in external stylesheets that apply to svg content too, and that may make svg document fragments harder to export.

>   * Doug says that until there are tools that understand SVG in HTML,
>     authors will not understand how it exactly works (parses, etc.), and
>     that we will need to educate authors on what different syntax is
>     allowed in XHTML and HTML.
>
>   * Chris says that entities are a “kind of transitional phase”, and
>     that Unicode should just be used these days.  (I might disagree with
>     that; I think many authors will continue to use entities because
>     they are easier to type with a regular keyboard layout, and easier
>     to remember than numeric character references.)

1. The predefined entities in HTML/MathML don't need declaration in HTML but would in standalone SVG. 
2. What about the cases where e.g. Illustrator generates an svg doctype containing custom entities? (I would be willing to live with having the custom entities ignored, or in other words have the rendering be different from the standalone svg. Also it's rather trivial to convert such files so that they would work.)
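
To make the two entity cases above concrete, here is a rough sketch using only Python's standard library. It is an illustration under assumptions, not a model of any browser or of the HTML5 parsing algorithm: xml.etree.ElementTree stands in for "an XML parser", html.unescape stands in for HTML's built-in entity table, and the element content and the helper name parses_as_xml are made up for the example.

```python
import html
import xml.etree.ElementTree as ET


def parses_as_xml(markup):
    """Return True if the stand-in XML parser accepts the markup."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


# Case 1: a named entity that HTML predefines but XML does not.
undeclared = '<text>caf&eacute;</text>'

# The same character written as a numeric character reference.
numeric = '<text>caf&#233;</text>'

# Case 2: the entity is declared in an internal DTD subset, roughly the
# way an authoring tool might declare its own custom entities.
declared = (
    '<!DOCTYPE text [<!ENTITY eacute "&#233;">]>'
    '<text>caf&eacute;</text>'
)

print('undeclared entity accepted as XML:', parses_as_xml(undeclared))
print('numeric reference accepted as XML:', parses_as_xml(numeric))
print('declared entity expands to:', ET.fromstring(declared).text)
print('HTML entity table gives:', html.unescape('caf&eacute;'))
```

The point of the sketch is only that the same fragment can be fine in one syntax and a fatal error in the other, and that a doctype with entity declarations changes the result on the XML side. If the doctype is ignored in HTML, the custom entities would have to be expanded before pasting.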

>   * Anthony says that omitted attribute quotes are a bad idea because
>     with SVG’s complex attribute syntaxes, it wouldn’t always be clear
>     where the attribute value ends.  (I would say that this shouldn’t
>     matter, and would require just the same amount of thought from
>     authors as in HTML; i.e., if your attribute value has a space in it,
>     then you’ll need to use quotes around it.)

It's clear that HTML has the same problem, though to a lesser degree. I'd rather we allow omitted attribute quotes for SVG in HTML too, to keep it consistent.
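
A small sketch of what "not clear where the attribute value ends" looks like in practice. This is an assumption-laden illustration: Python's html.parser is a tolerant tokenizer but not an implementation of the HTML5 algorithm, xml.etree.ElementTree stands in for an XML parser, and the markup strings and helper names (AttrCollector, html_attrs, is_well_formed_xml) are invented for the example.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET


class AttrCollector(HTMLParser):
    """Record (tag, attributes) for every start tag the tokenizer reports."""

    def __init__(self):
        super().__init__()
        self.seen = []

    def handle_starttag(self, tag, attrs):
        self.seen.append((tag, dict(attrs)))


def html_attrs(markup):
    """Tokenize markup tolerantly and return the collected start tags."""
    parser = AttrCollector()
    parser.feed(markup)
    parser.close()
    return parser.seen


def is_well_formed_xml(markup):
    """Return True if the stand-in XML parser accepts the markup."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


# Space-less values: the tolerant tokenizer recovers what the author meant.
spaceless = '<circle r=10 fill=red></circle>'

# A path-data value with a space: the value of "d" stops at the first space
# and the remainder is treated as further (value-less) attributes.
with_space = '<path d=M0 0L10 10></path>'

print(html_attrs(spaceless))
print(html_attrs(with_space))
print('unquoted form well-formed XML:', is_well_formed_xml(spaceless))
print('quoted form well-formed XML:',
      is_well_formed_xml('<circle r="10" fill="red"></circle>'))
```

The space-less case behaves as an author would expect, while the path-data case silently loses most of the value. That is the same rule HTML authors already live with: a value containing a space needs quotes.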

>   * Doug says that lots of attributes can take space-less strings, and
>     that omitting the quotes here would be fine, and would be error
>     corrected.  I wondered whether it was worth classing these as parse
>     errors, since they are not parse errors for HTML elements.  Doug
>     says that it should be a parse error to help people who are trying
>     to do standalone SVG.

I'm also wondering whether these should be classed as parse errors, especially after reading what 'parse error' means in HTML5[1].

>   * Doug raises a concern that people will have SVG-like files (i.e.,
>     using the SVG in HTML syntax), which is served as text/html, and
>     which will be confused with regular SVG files, and will be found not
>     to work when loaded into an authoring tool.  Anthony says that an
>     artist would understand “right click → save as, then open the file”,
>     which could ameliorate the problem.

Right, that is a potential problem. However, it's not clear that svg in html would be used exactly like standalone files. Still, the ability to export something that is valid "image/svg+xml" would be good to have as a requirement.

>   * Chris says that if browsers do silent parsing correction that this
>     will cause SVG files to exist that are not well formed and thus
>     cannot be edited in standard authoring tools.  (Perhaps it’s just
>     quibbling, but this would cause HTML files to exist whose SVG
>     contents aren’t well formed XML.  I would expect content served
>     as image/svg+xml to still be parsed with an XML parser.)

That svg fragments can become "uneditable" is also true to some degree for XHTML+SVG files.

>   * Doug says that the above point is why accepting non-XML syntax
>     should be seen as a parse error rather than valid syntax, and that
>     there needn’t be a visible indication of error in the content, but a
>     log or a validator can point this out as a warning.  (Currently
>     browsers don’t seem to log HTML parse errors in their error
>     consoles.  I believe HTML parse errors are classified as errors and
>     not warnings in the two validators that I’ve used (the W3C one and
>     Henri’s one): do you still want to have these as a warning as
>     opposed to an error?  I am not convinced that making non-XML syntax
>     as a parse error will prevent the issue of proliferating non-XML
>     SVG content.)
>
>   * Doug says that he wouldn’t be surprised to see this kind of parse
>     error correction in XML within the next five years.
>
>   * Doug says that SVG requires quotes, and that he’s OK with no quotes
>     being parsed as long as it’s acknowledged as error correction.
>
> Please correct any of the above if I have misinterpreted the minutes.
>
>
> Since that telcon, there was also the news that the HTML WG’s proposal
> has been updated to allow SVG <font> elements, which is good news.
>
> I think it might help the discussion if someone were to summarise the
> characteristics of the HTML WG’s current proposal and send it to the
> list.  We could then look at the concrete differences between our
> current thoughts (which I believe aren’t written up, since the TPAC
> discussions) and that proposal, so that we can find points where we
> might converge.
>
> I would still appreciate replies to my earlier mail
> (http://www.w3.org/mid/20081209010141.GB25522@arc.mcc.id.au), as well as
> this one.  IMO we should try to hash out the arguments over mail, rather
> than during the limited telcon time, if possible.
>
> Thanks,
>
> Cameron
> ACTION-2383

Cheers
/Erik

[1] http://www.w3.org/TR/2008/WD-html5-20080610/single-page/#parse1

-- 
Erik Dahlstrom, Core Technology Developer, Opera Software
Co-Chair, W3C SVG Working Group
Personal blog: http://my.opera.com/macdev_ed

Received on Monday, 12 January 2009 11:30:37 UTC