Re: HTML5 and SVG

Hi, Folks-

Erik Dahlström wrote (on 1/12/09 6:29 AM):
> To mess up this nice summary I'll add some of my comments inline
> below here. :)
> 
> On Mon, 12 Jan 2009 01:17:02 +0100, Cameron McCormack <cam@mcc.id.au>
> wrote:
> 
>> 
>> We discussed SVG in HTML a little at the 11/12/2008 telcon.  Here
>> is a summary of the points made distilled from the minutes, with my
>> comments in brackets:
>> 
>> * Doug says that the goal “SVG should remain XML when inline in
>> HTML” should be a consequence of the later, more high level goals
>> listed in Erik’s mail.
>> 
>> * Erik says that neither his nor my earlier mail captures exactly
>> what we want from this goal, since there are certain kinds of XML 
>> non-well-formednesses that would be reasonable to allow in HTML.
> 
> Would it help if we had a complete list of XML:isms that are
> unsupported in HTML, and a list of HTML:isms that are unsupported in
> XML? I think we have a partial list at the moment, from our proposal
> and the work that preceded it. We could try to summarize those and
> make a table of issues.

Good idea.

We should also have a list of standalone-documentisms (such as the XML
prolog, DOCTYPE, entities, etc.) that we would not expect to work if
copy-pasted directly into an HTML document, and I suppose we'll need to
suggest what should happen if, say, the DOCTYPE *is* included.
Does that get error-corrected, or does the document simply fail?  If it
gets error-corrected, what's the point in disallowing it?
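
To make the standalone-documentisms concrete, here is a minimal sketch
(not a proposal for UA behaviour) of what a conversion step could do
before a paste: run the file through an XML parser and re-serialize it,
which drops the XML declaration and DOCTYPE and expands any custom
entities. The sample file, the "brand" entity, and the function name
strip_standalone_isms are all made up for illustration.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"

# Hypothetical standalone file: XML declaration, DOCTYPE, and a custom entity.
STANDALONE_SVG = """<?xml version="1.0"?>
<!DOCTYPE svg [
  <!ENTITY brand "#336699">
]>
<svg xmlns="http://www.w3.org/2000/svg" width="10" height="10">
  <rect width="10" height="10" fill="&brand;"/>
</svg>"""


def strip_standalone_isms(svg_source: str) -> str:
    """Return the root element re-serialized without prolog or DOCTYPE."""
    # Serialize the SVG namespace as the default namespace, not as "ns0:".
    ET.register_namespace("", SVG_NS)
    # The XML parser consumes the declaration and DOCTYPE and expands
    # entities declared in the internal subset.
    root = ET.fromstring(svg_source)
    # encoding="unicode" returns a str and writes no XML declaration.
    return ET.tostring(root, encoding="unicode")


fragment = strip_standalone_isms(STANDALONE_SVG)
print(fragment)
```

The output is only the <svg> element, with the entity reference already
replaced by its value, which is roughly what one would want to paste
into an HTML document.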


>> * Chris says that we shouldn’t be concerned about validity, just 
>> well-formedness.  (Although in HTML, the term “validity” covers
>> both validity and well-formedness in the XML sense.)
>> 
>> * Erik says that we want parsing fixups to be marked as parse
>> errors. (I would like to know exactly what things would be classed
>> as parse errors here and what would be allowable syntax.)
> 
> I wouldn't necessarily consider missing attribute quotes a parse
> error[1] (also see further comments below), but cases where elements
> are automatically closed (if that happens to svg elements) should
> IMHO be clearly marked as parse errors.

Personally, I think that anything that would cause a problem in a
standalone XML or SVG UA should be flagged as a parse error (and
reported in the error console).  This includes missing attribute values.
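
As a rough illustration of that rule of thumb, one could ask an ordinary
XML parser whether it would choke on a fragment, and treat any complaint
as something to flag. This is only a sketch using Python's standard
library; the helper name xml_parse_error is invented here, and a real UA
would of course report through its own error console.

```python
import xml.etree.ElementTree as ET


def xml_parse_error(fragment: str):
    """Return the XML parser's error message, or None if well-formed."""
    try:
        ET.fromstring(fragment)
    except ET.ParseError as exc:
        return str(exc)
    return None


samples = [
    '<circle r="5"/>',          # fine in standalone XML
    "<circle r=5/>",            # unquoted attribute value
    '<g><circle r="5"></g>',    # element left unclosed
]

for sample in samples:
    error = xml_parse_error(sample)
    status = "ok" if error is None else "parse error: " + error
    print(sample, "->", status)
```

Under this test, both the unquoted attribute and the unclosed element
would be flagged, even if an HTML parser quietly recovers from them.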


>> * I say that we wouldn’t want to have any implied end tags, as that
>>  would restrict our ability to extend the language.
>> 
>> * Chris says that for the “copying the XML SVG and pasting it into
>> an HTML document” use case, there was agreement at TPAC that the
>> author wouldn’t be copying the DOCTYPE.  (I assume that goes for
>> the XML declaration if it exists, too.)
> 
> Wasn't it that DOCTYPE:s may be copied, but that they have no effect
> on the document (essentially that they're ignored by the parser)?

Ah, that makes more sense.


> For XML declarations, character encoding caveats apply of course (for
> copy-pasting between documents in different encodings). What about
> <?xml-stylesheet? Also we should expect that people are going to use
> <link> elements for pulling in external stylesheets that apply to svg
> content too, and that may make svg document fragments harder to
> export.

I would expect that people will use <link> elements for stylesheets, and
I think we should add this to the core SVG language.  The PI never made
much sense to me, honestly, and prevents script from adding or changing
stylesheet references on the fly.  I also think that we should consider
adding @xlink:href to the <style> element, in the same way and with the
same processing model as the <script> element has in SVG.


>> * Doug says that until there are tools that understand SVG in HTML,
>>  authors will not understand how it exactly works (parses, etc.),
>> and that we will need to educate authors on what different syntax
>> is allowed in XHTML and HTML.
>> 
>> * Chris says that entities are a “kind of transitional phase”, and 
>> that Unicode should just be used these days.  (I might disagree
>> with that; I think many authors will continue to use entities
>> because they are easier to type with a regular keyboard layout, and
>> easier to remember than numeric character references.)
> 
> 1. The predefined entities in HTML/MathML don't need declaration in
> HTML but would in standalone SVG. 2. What about the cases where e.g.
> Illustrator generates an svg doctype containing custom entities? (I
> would be willing to live with having the custom entities ignored, or
> in other words have the rendering be different from the standalone
> svg. Also it's rather trivial to convert such files so that they
> would work.)

Is there a reason against adding those named character references to SVG
as well, referencing HTML5 [2]?  That would solve the problem of them
showing up in SVG content... though I don't know what problems it would
cause (other than backwards incompatibility).  I do wonder how useful
they are in general, even in HTML... there are 2137 of them, and I would
really be surprised to find myself using e.g. "&HilbertSpace;" U+0210B
(then again, I'm not a physicist).  With this expansive list of entities
(including the memorable "&vzigzag;" and catchy "&zigrarr;"), I reserve
a little skepticism that they are all that much more memorable and
useful than their Unicode code points... I think most of these were
included for MathML, fwiw.  (Aside: I see potential for a very geeky
drinking game here, with sad numbers of people committing this whole
list of named character references to memory.)
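
For what it's worth, the mismatch is easy to demonstrate. The sketch
below uses Python's standard library, whose html.entities.html5 table
carries the HTML5 named character references: it shows that a standalone
XML parser rejects an undeclared "&HilbertSpace;", and that rewriting
named references as numeric ones sidesteps the problem. The helper name
to_numeric_refs is hypothetical, and the regular expression is a
simplification, not the HTML5 tokenizer's rules.

```python
import re
import xml.etree.ElementTree as ET
from html.entities import html5

# XML predefines these five, so they are safe to leave alone.
XML_PREDEFINED = {"amp", "lt", "gt", "quot", "apos"}


def to_numeric_refs(text: str) -> str:
    """Rewrite HTML5 named references as numeric character references."""
    def replace(match):
        name = match.group(1)
        value = html5.get(name + ";")
        if name in XML_PREDEFINED or value is None:
            return match.group(0)  # leave predefined and unknown names as-is
        # Some names map to more than one code point, hence the join.
        return "".join("&#x%X;" % ord(ch) for ch in value)

    return re.sub(r"&([A-Za-z][A-Za-z0-9]*);", replace, text)


source = "<text>&HilbertSpace; &amp; friends</text>"

try:
    ET.fromstring(source)
    undeclared_ok = True
except ET.ParseError:
    undeclared_ok = False  # the entity is undefined in standalone XML

converted = to_numeric_refs(source)
parsed_text = ET.fromstring(converted).text
print(undeclared_ok, converted)
```

After conversion the same content parses as plain XML, with the
character at U+210B in place of the named reference.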


>> * Anthony says that omitted attribute quotes are a bad idea because
>>  with SVG’s complex attribute syntaxes, it wouldn’t always be clear
>>  where the attribute value ends.  (I would say that this shouldn’t 
>> matter, and would require just the same amount of thought from 
>> authors as in HTML; i.e., if your attribute value has a space in
>> it, then you’ll need to use quotes around it.)
> 
> It's clear that HTML has the same problem, though to a lesser degree.
> I'd rather we allow omitted attribute quotes for SVG in HTML too, to
> keep it consistent.
> 
>> * Doug says that lots of attributes can take space-less strings,
>> and that omitting the quotes here would be fine, and would be error
>>  corrected.  I wondered whether it was worth classing these as
>> parse errors, since they are not parse errors for HTML elements.
>> Doug says that it should be a parse error to help people who are
>> trying to do standalone SVG.
> 
> I'm also wondering, especially after reading what 'parse error' means
> in HTML5[1].

What exactly in your reading of the definition and cited examples of
parse errors gives you a contrary impression?  I skimmed it, and it
didn't seem like it would be categorically silly to indicate these as
parse errors in SVG, even where they may not be so in HTML5.
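
Anthony's concern about where an unquoted value ends can be shown with a
small sketch. It uses Python's html.parser, which is a tolerant HTML
tokenizer but not an implementation of the HTML5 parsing algorithm, so
treat the exact splitting as indicative only. The class and function
names are invented for this example.

```python
from html.parser import HTMLParser


class AttrCollector(HTMLParser):
    """Remember the attributes of the last start tag seen."""

    def __init__(self):
        super().__init__()
        self.attrs = []

    def handle_starttag(self, tag, attrs):
        self.attrs = attrs


def html_attrs(markup: str) -> dict:
    """Parse one tag and return its attributes as a dict."""
    parser = AttrCollector()
    parser.feed(markup)
    parser.close()
    return dict(parser.attrs)


quoted = html_attrs('<path d="M0 0L10 10">')
unquoted = html_attrs("<path d=M0 0L10 10>")
spaceless = html_attrs("<circle r=5>")

print(quoted["d"])     # the whole path data
print(unquoted["d"])   # only the part before the first space
print(spaceless["r"])  # space-less values survive without quotes
```

The unquoted path data is cut at the first space and the remainder is
read as further attributes, while a space-less value such as r=5 comes
through intact, which matches the "use quotes if there is a space" rule
of thumb from HTML.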


>> * Doug raises a concern that people will have SVG-like files (i.e.,
>>  using the SVG in HTML syntax), which is served as text/html, and 
>> which will be confused with regular SVG files, and will be found
>> not to work when loaded into an authoring tool.  Anthony says that
>> an artist would understand “right click → save as, then open the
>> file”, which could ameliorate the problem.
> 
> Right, that is a potential problem. However, it's not clear that svg
> in html would be used exactly like standalone files. But the ability
> to export something that is valid "image/svg+xml" would be good as a
> requirement.

Hixie seems to have indicated that he is not comfortable imposing a
requirement on UAs to do so, which means that such a mechanism, while
possible, is a bit theoretical wrt the HTML5 spec.


>> * Chris says that if browsers do silent parsing correction that
>> this will cause SVG files to exist that are not well formed and
>> thus cannot be edited in standard authoring tools.  (Perhaps it’s
>> just quibbling, but this would cause HTML files to exist whose SVG 
>> contents aren’t well formed XML.  I would expect content served as
>> image/svg+xml to still be parsed with an XML parser.)
> 
> That svg fragments can become "uneditable" is also true in some
> degree for XHTML+SVG files.
> 
>> * Doug says that the above point is why accepting non-XML syntax 
>> should be seen as a parse error rather than valid syntax, and that 
>> there needn’t be a visible indication of error in the content, but
>> a log or a validator can point this out as a warning.  (Currently 
>> browsers don’t seem to log HTML parse errors in their error 
>> consoles.  I believe HTML parse errors are classified as errors and
>>  not warnings in the two validators that I’ve used (the W3C one and
>>  Henri’s one): do you still want to have these as a warning as 
>> opposed to an error?  I am not convinced that making non-XML syntax
>>  as a parse error will prevent the issue of proliferating non-XML 
>> SVG content.)
>> 
>> * Doug says that he wouldn’t be surprised to see this kind of parse
>>  error correction in XML within the next five years.
>> 
>> * Doug says that SVG requires quotes, and that he’s OK with no
>> quotes being parsed as long as it’s acknowledged as error
>> correction.
>> 
>> Please correct any of the above if I have misinterpreted the
>> minutes.
>> 
>> 
>> Since that telcon, there was also the news that the HTML WG’s
>> proposal has been updated to allow SVG <font> elements, which is
>> good news.

Indeed.  Is this extended to <metadata> elements as well?


>> I think it might help the discussion if someone were to summarise
>> the characteristics of the HTML WG’s current proposal and send it
>> to the list.  

By "someone", do you mean someone from the SVG WG, or from the HTML WG?
If the former, I nominate you... ;P


>> We could then look at the concrete differences
>> between our current thoughts (which I believe aren’t written up,
>> since the TPAC discussions) and that proposal, so that we can find
>> points where we might converge.

Good idea.


>> I would still appreciate replies to my earlier mail 
>> (http://www.w3.org/mid/20081209010141.GB25522@arc.mcc.id.au), as
>> well as this one.  IMO we should try to hash out the arguments over
>> mail, rather than during the limited telcon time, if possible.

Fair enough.  Maybe we should do some wikiwork?

[2] http://www.w3.org/TR/2008/WD-html5-20080610/single-page/#named

Regards-
-Doug Schepers
W3C Team Contact, SVG and WebApps WGs

Received on Monday, 12 January 2009 18:22:51 UTC