Re: SVG Semantics Re: SVG and MathML in text/html from Charles McCathieNevile on 2008-09-29 (public-html@w3.org from September 2008)

From: Charles McCathieNevile <chaals@opera.com>
Date: Tue, 30 Sep 2008 09:27:50 +1000
To: "Maciej Stachowiak" <mjs@apple.com>
Cc: "public-html@w3.org" <public-html@w3.org>
Message-ID: <op.uh9ngoctwxe0ny@widsith.local>
On Mon, 29 Sep 2008 08:01:27 +1000, Maciej Stachowiak <mjs@apple.com>  
wrote:

> On Sep 27, 2008, at 10:28 PM, Charles McCathieNevile wrote:

>> On Mon, 17 Mar 2008 10:35:32 +1100, Maciej Stachowiak <mjs@apple.com>

>>> In conclusion, I think considerations of image semantics do not make  
>>> the case that only draconian syntax makes sense for SVG. I think both  
>>> choices, draconian and tolerant, should be available to authors and  
>>> tools.
>>
>> This conclusion seems a little simplistic given the real world cases

Hmmm: s/simplistic/wrong/ for the purpose of the ongoing discussion. My  
assertion related to a particular piece of the reasoning that led to the  
conclusion as stated earlier in the thread.

>> you outline. SVG has different kinds of error-handling for different  
>> kinds of error, and there is not some binary difference between  
>> tolerance and following draco - they are used to identify tendencies on  
>> a continuum.
>
> I'm not claiming that SVG's error handling is strictly draconian. I  
> would claim that it is a hybrid.

OK, so we agree on that. I think that's good.

> But you said that tolerant error handling was not a good idea for SVG  
> because "SVG is about images - having parts of an image not render can  
> drastically alter the semantics of the image."

Did I? My position is that introducing divergent error handling to SVG is  
a problem because the semantics of the image changing can have serious  
effects.

> However, some aspects of SVG's error handling are tolerant already, and  
> in fact have the effect of parts of the image not rendering. So clearly  
> this is not a showstopper.

Some level of tolerance is indeed not a showstopper. Nor, I argue, have the  
far stricter (relative to HTML) syntax requirements of SVG been a  
showstopper for SVG.

> Indeed, over time, SVG's error handling

I assume you meant to say something like "...has got looser". That would  
be simplifying the case to the point of distortion, since while the way it  
handles certain errors in SVG code has become looser, some XML-level  
errors that were permitted in ASV have not been permitted in Firefox, and  
Opera has made its handling of those errors progressively stricter.

The original looseness in Opera was due to a few clients' legacy content,  
and against the wishes of the SVG Working Group and community at large as  
far as we could tell. Tightening it up as our particular customers have  
got their legacy cleaned up has not caused problems either with them or at  
large.

> Thus, the evolution of the SVG spec itself strongly implies that  
> tolerant error handling is desirable for images in general, and SVG in  
> particular. The only draconian error handling left in SVG is that  
> inherited from XML, which defines the baseline serialization. I believe  
> it is dubious to claim that the severity of XML error handling is  
> intrinsic to SVG, when SVG itself abandoned such strictness at the SVG  
> (rather than XML) level. Or at least, your argument for it is undermined  
> by SVG itself.

The severity of error handling at XML level has been valuable in  
simplifying the parsing of SVG and, I assert, in ensuring interoperability  
- unlike HTML there is relatively little invalid content, and specific  
errors are slowly weeded out of the corpus by progressive tightening of  
the requirements. This has meant that authors learn to write correct code,  
which means there has been no ongoing need or desire for everyone to  
further complicate their SVG tools.
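
To make the contrast in error handling concrete, here is a minimal sketch in Python. It is an illustration only, not anything from the thread: `strict_parse` and `tolerant_tags` are hypothetical helper names, the strict side uses the standard-library XML parser, and the tolerant side uses Python's `html.parser`, which is lenient but is not the HTML5 parsing algorithm. The malformed sample simply leaves a `<circle>` element unclosed.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

WELL_FORMED = '<svg xmlns="http://www.w3.org/2000/svg"><circle r="5"/></svg>'
# Same markup, but <circle> is never closed: a well-formedness error in XML.
MALFORMED = '<svg xmlns="http://www.w3.org/2000/svg"><circle r="5"></svg>'


def strict_parse(markup):
    """Return the root element, or None if the XML parser rejects the input."""
    try:
        return ET.fromstring(markup)
    except ET.ParseError:
        return None


class TagCollector(HTMLParser):
    """Lenient parser: records each start tag and does not raise on bad nesting."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)


def tolerant_tags(markup):
    """Return the start tags seen by the lenient parser, in document order."""
    collector = TagCollector()
    collector.feed(markup)
    collector.close()
    return collector.tags
```

With these definitions, `strict_parse(MALFORMED)` yields `None` because the XML parser stops at the mismatched tag, while `tolerant_tags(MALFORMED)` still reports both elements. What the lenient side should then do with the unclosed element is exactly the kind of rule that has to be specified separately.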

[...]

>>> Perhaps you are arguing that we should offer the option of intermixing  
>>> the tolerant serialization of HTML and the draconian serialization of  
>>> SVG.

>> Yes, very roughly speaking, that is what I am suggesting (modulo the  
>> idea that there are many levels of tolerance).
>
> I don't believe you have made a good argument for why draconian error  
> handling at the serialization / surface syntax level is essential for  
> SVG, or why it is appropriate for the text/html serialization.

And I don't believe that I have seen a convincing argument for changing  
the existing form of SVG to require the more permissive and complex  
HTML-style parsing.

[...]
>> Except that in the real world, there is no apparent demand for a lot of  
>> tolerance in SVG markup,
>
> Evidence?

Evidence that there is not a lot of demand? The fact that our customers  
who insist on SVG don't even mention it (and only ever mentioned it for a  
few very specific errors they were sucked into by building for specific  
tools). The fact that it is a tiny topic in SVG community lists, and even  
where people who come from HTML or similar make a mistake and need to  
learn "the SVG way" it doesn't give rise to campaigns for looser handling.  
The fact that tools that have been more tolerant than the spec have become  
less so in significant ways.

The only kind of evidence I have seen that there is a demand for seriously  
changing the SVG syntax is a proposal and discussion in this context, and  
it appears that the community of people who produce and use SVG are not  
that interested in the extremely tolerant HTML model.

>> and there is an ecosystem built on the idea that the extreme tolerance  
>> available for HTML is neither necessary nor desirable.
>
> And there is an ecosystem built on the idea that the error tolerance of  
> HTML is essential to the success of the Web,

That would be the ecosystem of HTML, and while I agree that tolerance in  
HTML is essential to the ongoing success of the web, I am less convinced  
that it was tolerance itself that led to that success.

> and an ecosystem much larger than either of those based on not caring  
> much one way or the other but benefitting from error tolerance anyway.

I'm not sure where the evidence is that this ecosystem benefits from error  
tolerance, and I believe that there are costs to those benefits, which  
should be weighed.

> I would say the ecosystem you have mentioned is the least popular and  
> successful of the three. Nonetheless, HTML5 will cater to both.
>
>> Indeed, the major failure errors in Wikipedia examples, as identified  
>> by Henri, are less common than the cataclysmic failure of the image to  
>> appear at all.
>>
>> We believe that as well as being easier to implement (in browsers and  
>> authoring tools)
>
> As a browser engine implementor, and one who has directly dealt with  
> both the HTML and XML parsers in our engine, I strongly disagree that  
> the SVG WG proposal is easier for browsers to implement. Using a single  
> parser for HTML would be much easier than trying to switch between the  
> HTML and XML parsers midstream. Is there any browser implementor who  
> thinks otherwise?

Yes. I am not making this up myself; I am reporting the opinion of the  
people who build the Core of Opera and in particular those responsible for  
dealing with parsing HTML, XML and SVG and making them actually work. A  
simple part of the argument is that multiple parsers that pass stuff  
around are already part of a browser (at least HTML, XML, CSS, and  
Javascript are common to more or less all browsers).

> I also disagree that it is any easier for authoring tools. If SVG  
> authoring tools wish to directly import SVG graphics from text/html  
> documents, they have to implement an HTML5 parser anyway, as described  
> by Henri. I suspect that for them, too, it would be easier to stick with  
> one parser for HTML instead of trying to mode-switch partway.

No they don't. Under either proposal, unless they also want to handle  
HTML, they can do a very simple text extraction.
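
As a rough illustration of what such a text extraction could look like, here is a sketch in Python. It rests on assumptions that are not from the thread: that the embedded SVG keeps its existing XML syntax, that the fragment is delimited by literal `<svg ...>` and `</svg>` tags, and that no attribute value, comment or CDATA section contains a `>` or a stray `<svg` that would confuse the scan. The name `extract_svg_fragments` is hypothetical.

```python
import re
import xml.etree.ElementTree as ET

# Matches <svg ...>, a self-closing <svg .../>, or </svg>.
# The lookahead keeps names such as <svgfoo> or <svg:rect> from matching.
SVG_TAG = re.compile(r'<(/?)svg(?=[\s/>])[^>]*?(/?)>')


def extract_svg_fragments(html_text):
    """Return each outermost <svg>...</svg> span in html_text as a string."""
    fragments = []
    depth = 0
    start = None
    for match in SVG_TAG.finditer(html_text):
        closing, self_closing = match.group(1), match.group(2)
        if closing:
            if depth == 0:
                continue  # stray </svg> with no open element; ignore it
            depth -= 1
            if depth == 0:
                fragments.append(html_text[start:match.end()])
        elif self_closing:
            if depth == 0:
                fragments.append(match.group(0))
        else:
            if depth == 0:
                start = match.start()
            depth += 1  # nested <svg> elements are legal in SVG
    return fragments


def parse_fragments(html_text):
    """Hand each extracted fragment to an ordinary strict XML parser."""
    return [ET.fromstring(fragment) for fragment in extract_svg_fragments(html_text)]
```

The tool never interprets the surrounding HTML; it only locates the fragment boundaries and passes the text to the XML parser it already has. This works only as long as the fragment is well-formed XML, which is the case being argued for here.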

>> the existing SVG language rather than some version of it that adds a  
>> whole new set of parsing requirements, the real-world problem of  
>> enabling people to hand-code rubbish isn't a serious issue in the SVG  
>> world.
>
> The phrase "enabling people to hand-code rubbish" expresses a judgmental  
> point of view regarding authoring errors that I strongly disagree with.

Then you may have misinterpreted me, so let me clarify. In HTML, it is  
critical for a browser to be able to handle markup that doesn't remotely  
conform to the specifications, and thus it has been necessary to develop  
rules for parsing almost arbitrary content. In SVG, this situation hasn't  
arisen, so there has been no need to develop rules for trying to extract  
author intent from markup that does not follow the specification.

>> Given the relative scarcity of hand-authoring in SVG, tool coders  
>> become the most important authors of code, in terms of understanding  
>> the "priority of audiences" guideline that is sometimes tossed into  
>> this discussion.
>
> If tools authors would like to start round-tripping HTML that contains  
> SVG, they will need an HTML5 parser and serializer, and I believe that  
> for them just as for browser implementors a monolithic one will be  
> easier to work with than a mode-switching one.

This relies on the assumption that the tools need to handle both types of  
content themselves. I don't see how that is a valid assumption.

>> A substantial proportion of SVG already seems to be moved from one tool  
>> to another. Allowing a new syntax would mean breaking compatibility  
>> with the existing toolset
>
> Embedding in HTML at all will break compatibility, except in the "cut  
> and paste" case, in which case existing SVG syntax will work fine for  
> pasting into HTML.

And we would like SVG embedded within HTML in such a way that  
cut-and-pasting it into SVG tools will also work. If SVG-in-HTML is  
successful then breaking the existing tool chain seems unjustifiable. If  
it is not successful then encouraging an alternate incompatible syntax  
seems rather worse than simply unjustifiable.

[...]
> One could likewise argue that the only justification given for strict  
> XML-level error handling of SVG in HTML is that it will be a very common  
> use case for content authors to copy chunks of SVG in text form and  
> paste them into an SVG authoring tool. I would instead expect SVG  
> authoring tools to adapt and process HTML directly ...

I'd be surprised. That's not the impression I get of their plans and  
future directions (of course I have not talked to all of them, so I don't  
claim a complete survey).

> ... to extract and round-trip the SVG content, in which case I think a  
> monolithic HTML parsing algorithm will help them.

cheers

Chaals

-- 
Charles McCathieNevile  Opera Software, Standards Group
     je parle français -- hablo español -- jeg lærer norsk
http://my.opera.com/chaals   Try Opera 9.5: http://www.opera.com
Received on Monday, 29 September 2008 23:28:26 UTC