Re: several messages about New Vocabularies in text/html

From: Ian Hickson <ian@hixie.ch>
Date: Thu, 3 Apr 2008 19:24:20 +0000 (UTC)
To: Henri Sivonen <hsivonen@iki.fi>, Jeff Schiller <codedread@gmail.com>, Sam Ruby <rubys@us.ibm.com>
Cc: HTML WG <public-html@w3.org>, Public MathML mailing list <www-math@w3.org>, public-html-request@w3.org
Message-ID: <Pine.LNX.4.62.0804031842480.18949@hixie.dreamhostps.com>

On Thu, 3 Apr 2008, Henri Sivonen wrote:
> > 
> > It also fails in the case where someone (author A) using a new browser 
> > writes a page that uses this feature, and then someone (author B) 
> > using an old browser copies and pastes from A's page into his page, 
> > accidentally including a stray <svg> tag or <math> tag. His page looks 
> > fine to most users, but to the users of the new browser, the page is 
> > now horked.
> You could use that cargo cult scenario to stop *any* proposal that 
> allows MathML or SVG in text/html and has the property that the markup 
> looks enough like the markup people are used to so that XML-MathML or 
> XML-SVG can be pasted.

On Thu, 3 Apr 2008, Jeff Schiller wrote:
> >
> > Say the trigger is <newsyntax>. Now assume someone writes:
> >
> >  <p>foo <newsyntax> ... </newsyntax> bar </p>
> >
> > ...and that such a page works well in new browsers. Given how people 
> > copy and paste content on the Web, especially how people copy and 
> > paste _new_ syntax on the Web, even before it is implemented, it is 
> > very likely that someone will copy just the "foo" part, accidentally 
> > including the <newsyntax> bit:
> >
> >  <p>bla bla foo <newsyntax> bla bla </p>
> >
> > This will now effectively "poison" the <newsyntax> idea, since the 
> > pages that result from this cargo-cult copy-and-paste attitude will 
> > render badly in browsers that support the new syntax.
> Now I understand where you are coming from.  I don't think there's any 
> way to avoid the 'rendering badly' for all cases, I'm sorry.
> [<video> example]

The difference between <video> (and the handful of similar elements) and 
embedding a whole DOM tree in another namespace is that we can fix the 
other ones pretty easily. For example, we can turn <video> into an empty 
element if it is used before it is ready, at minimal cost. With the 
namespace ideas, we would have to change the entire processing model, 
which is far more expensive.

The problem I describe here isn't hypothetical. We've already seen it with 
XHTML (where cargo-cult behaviour resulted in XHTML DOCTYPEs and 
namespaces being scattered to the high winds long before browsers were 
able to adapt to XHTML's new rules, resulting in a situation where we 
simply cannot treat XHTML in text/html as XML), we've seen it with the IE8 
mode switch (where people started using it even before IE8 Beta 1 came 
out, without any idea of how much it would break), and we've seen it in 
many cases where there have not yet been any serious ill effects, e.g. 
with people changing the //EN part of their DOCTYPE lines.

On Thu, 3 Apr 2008, Sam Ruby wrote:
> Ian Hickson wrote on 04/02/2008 11:07:37 PM:
> > On Wed, 2 Apr 2008, Sam Ruby wrote:
> > >
> > > To explain the motivation, it helps if I start at the beginning.
> >
> > I understand the problem. It's the solution I don't understand.
> Each time I have attempted to describe a solution in the past, it has 
> turned out that we were trying to address different problems.  Your 
> continued preoccupation with the number of elements in the MathML 
> namespace indicates to me that we still aren't yet on the same page.  
> So please forgive me if I want to take this a bit slowly this time.

I understand the problem you are trying to solve, and I understand that 
solving it would solve a superset of the problems I'm trying to solve.

> Here is a fairly detailed proposal:
> http://intertwingly.net/blog/2007/08/02/HTML5-and-Distributed-Extensibility

Unless you hardcoded HTML tag names to the HTML namespace, this would 
break pages that do things like:

   <unknownElement xmlns="bogus namespace">

...in the middle of the page.
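
A minimal sketch of that breakage, assuming a toy model in which an 
xmlns attribute sets the default namespace for everything that follows 
until the element is closed. NamespaceModel, resolve, and the 
KNOWN_HTML_TAGS list are hypothetical names invented for illustration; 
this is not the HTML5 parsing algorithm or the algorithm in the linked 
proposal.

```python
from html.parser import HTMLParser

HTML_NS = "http://www.w3.org/1999/xhtml"
# Hypothetical and far from complete; a real list would cover all of HTML.
KNOWN_HTML_TAGS = {"p", "div", "span", "em"}


class NamespaceModel(HTMLParser):
    """Toy model: assigns a namespace to each start tag it sees."""

    def __init__(self, hardcode_html):
        super().__init__()
        self.hardcode_html = hardcode_html
        self.stack = [("#root", HTML_NS)]  # (tag, default namespace in scope)
        self.resolved = []                 # (tag, namespace) in document order

    def handle_starttag(self, tag, attrs):
        inherited = self.stack[-1][1]
        declared = dict(attrs).get("xmlns")
        if self.hardcode_html and tag in KNOWN_HTML_TAGS:
            # Known HTML tag names always land in the HTML namespace.
            ns = HTML_NS
        else:
            # Otherwise use the declared namespace, or inherit the parent's.
            ns = declared or inherited
        self.resolved.append((tag, ns))
        self.stack.append((tag, ns))

    def handle_endtag(self, tag):
        # Pop back to the nearest matching open element, if there is one.
        for i in range(len(self.stack) - 1, 0, -1):
            if self.stack[i][0] == tag:
                del self.stack[i:]
                break


def resolve(markup, hardcode_html):
    model = NamespaceModel(hardcode_html)
    model.feed(markup)
    model.close()
    return model.resolved


page = '<p>before</p><unknownElement xmlns="bogus namespace"><p>after</p>'
print(resolve(page, hardcode_html=False))
print(resolve(page, hardcode_html=True))
```

Without the hardcoded list, the second <p> inherits the bogus namespace 
because the unknown element is never closed; with it, the <p> stays in 
the HTML namespace.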

Also, if there's one thing we've learnt from XML Namespaces, it's that 
most authors do not understand the concept of indirection introduced by 
assigning prefixes to namespaces. I really don't want to bring that mess 
to text/html if I can help it.

> Or, if we want, we could explore one that is based on Anne's XML5 
> project which supports both syntaxes.

I'm all for XML5, but that's not compatible with text/html either. It 
would make text/html moot, though, which in the long run is a much better 
idea IMHO. I'd much rather we came up with a clean language designed to be 
generic from the ground up, as XML is, without making the mistakes of XML.

> > > Any trigger has the potential for generating false positives. In 
> > > the case of MathML and SVG, it might be useful to see if xmlns 
> > > attributes with the specific values specified for those standards 
> > > generate any false positives.  In particular, it would clearly be 
> > > problematic if such a tag were not closed.  Let's proceed under the 
> > > assumption that such a trigger can be found.
> >
> > This is the assumption that I have the most trouble with in your 
> > e-mail.
> If we have a new state, having no way to enter that state would 
> completely miss the point.

I understand. Like I said, I don't know if it is possible to solve the 
problem you are trying to solve, which is why I have been focussed on the 
smaller problem of just making MathML2 and SVG 1.1 work. (Henri and Simon 
have suggested possible ways of making any version of MathML and SVG work, 
by hardcoding HTML tag names instead of MathML and SVG tag names, which I 
am now investigating.)

> > Even if we find such a trigger, it doesn't solve the problem, because 
> > if someone (author A) using a new browser writes a page that uses this 
> > feature, and then someone (author B) using an old browser copies and 
> > pastes from A's page into his page, accidentally including the 
> > trigger, the second page will look fine to most users, but to the 
> > users of the new browser, it will be broken.
> As Henri and Julian and perhaps others have pointed out, this 
> requirement makes all solutions impossible.

It is a flaw with all the solutions proposed so far for addressing the 
generic problem, yes.

> I maintain that matching on an attribute named "xmlns" with a value of 
> "http://www.w3.org/2000/svg" on an element otherwise unknown to HTML5 is 
> an approach that would (a) yield a statistically insignificant number of 
> false positives, (b) enable us to trigger the transition to a new state, 
> and (c) would serve as a model for how new vocabularies are to be 
> introduced.

I haven't taken a sample of text/html pages that contain such triggers 
yet, but I _have_ done such a sample for MathML, and it showed that for 
that vocabulary we would need some sort of hardcoded list to handle the 
existing content. (Either a whitelist of MathML elements or a blacklist of 
HTML elements, but either way we've lost the genericity of the problem you 
are trying to solve.)
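
The two kinds of trigger can be contrasted with a rough sketch. It 
assumes a generic trigger that fires on an unknown element carrying a 
recognised xmlns value, and a hardcoded trigger that fires on tag name 
alone. TriggerScanner, scan, and both tag lists are hypothetical and 
deliberately tiny; nothing here reproduces the sample mentioned above.

```python
from html.parser import HTMLParser

SVG_NS = "http://www.w3.org/2000/svg"
MATHML_NS = "http://www.w3.org/1998/Math/MathML"

# Hypothetical, incomplete lists, for illustration only.
KNOWN_HTML_TAGS = {"html", "body", "p", "div", "span", "a", "b", "i"}
MATHML_WHITELIST = {"math", "mi", "mo", "mn", "mrow", "mfrac"}


class TriggerScanner(HTMLParser):
    """Records which start tags each strategy would treat as foreign."""

    def __init__(self):
        super().__init__()
        self.generic_hits = []
        self.whitelist_hits = []

    def handle_starttag(self, tag, attrs):
        xmlns = dict(attrs).get("xmlns")
        # Generic trigger: unknown element plus a recognised xmlns value.
        if tag not in KNOWN_HTML_TAGS and xmlns in (SVG_NS, MATHML_NS):
            self.generic_hits.append(tag)
        # Hardcoded trigger: the tag name alone decides.
        if tag in MATHML_WHITELIST:
            self.whitelist_hits.append(tag)


def scan(markup):
    scanner = TriggerScanner()
    scanner.feed(markup)
    scanner.close()
    return scanner.generic_hits, scanner.whitelist_hits


with_xmlns = '<p>x <math xmlns="http://www.w3.org/1998/Math/MathML"><mi>a</mi></math></p>'
without_xmlns = "<p>foo <math><mi>x</mi> bar</p>"
print(scan(with_xmlns))
print(scan(without_xmlns))
```

Markup that omits the xmlns attribute is only caught by the hardcoded 
list, and that list is exactly the part that does not generalise to 
new vocabularies.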

Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'
Received on Thursday, 3 April 2008 19:25:21 UTC
