Re: What problem is this task force trying to solve and why?

From: John Cowan <cowan@mercury.ccil.org>
Date: Tue, 4 Jan 2011 14:51:53 -0500
To: Henri Sivonen <hsivonen@iki.fi>
Cc: public-html-xml@w3.org
Message-ID: <20110104195153.GA19937@mercury.ccil.org>
Henri Sivonen scripsit:

> On Dec 20, 2010, at 17:50, David Carlisle wrote:
> It sure has. Hixie ran an analysis over a substantial quantity of                                                           
> Web pages in Google's index and found existing text/html content that                                                       
> contained an <svg> tag or a <math> tag. The justification is making                                                         
> the algorithm not break a substantial quantity of pages like that.                                                          
                                                                                                                              
A number would be nice.  One person's "substantial" is another person's                                                       
"trivial", unfortunately.                                                                                                     
                                                                                                                              
> Web authors do all sorts of crazily bizarre things. It's really not                                                         
> useful to try to apply logic to try to reason what kind of existing                                                         
> content there should be.                                                                                                    
                                                                                                                              
Amen.                                                                                                                         
                                                                                                                              
> > Editing tools also use nsgmls (perhaps just in the background)                                                            
> > It isn't really true to say it is "just the w3c validator".                                                               
>                                                                                                                             
> Which tools? Is the plural really justified or is this about one                                                            
> Emacs mode?                                                                                                                 
                                                                                                                              
You are confusing nsgmls itself with the Emacs mode (which employs                                                            
nsgmls).  Nsgmls is a stand-alone SGML validator that outputs an                                                              
ESIS equivalent to the document being validated.  ESIS is a textual                                                           
representation of SAX-style events, one line per event.  It's the core                                                        
of any reasonably modern SGML system.                                                                                         
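
To make the "one line per event" shape concrete, here is a minimal Python
sketch of a reader for ESIS-like output.  It assumes the commonly documented
nsgmls conventions, in which the first character of each line is a command:
"(" for a start-tag, ")" for an end-tag, "-" for character data, "A" for an
attribute of the start-tag that follows, "?" for a processing instruction,
and "C" to signal a conforming document.  The function name parse_esis and
the sample lines are hypothetical, and escape sequences in data are left
undecoded to keep the sketch short.

```python
from typing import Dict, Iterable, Iterator, Optional, Tuple

# (event kind, name or text, attributes for start events)
Event = Tuple[str, Optional[str], Optional[Dict[str, Optional[str]]]]


def parse_esis(lines: Iterable[str]) -> Iterator[Event]:
    """Yield SAX-style events from ESIS-like lines (illustrative sketch only)."""
    pending_attrs: Dict[str, Optional[str]] = {}
    for raw in lines:
        line = raw.rstrip("\n")
        if not line:
            continue
        code, rest = line[0], line[1:]
        if code == "A":
            # Attribute lines come before the start-tag line they belong to.
            # Assumed shape: "Aname TYPE value", or "Aname IMPLIED" if unset.
            parts = rest.split(" ", 2)
            name = parts[0]
            kind = parts[1] if len(parts) > 1 else ""
            value = parts[2] if len(parts) > 2 else None
            if kind != "IMPLIED":
                pending_attrs[name] = value
        elif code == "(":
            yield ("start", rest, pending_attrs)
            pending_attrs = {}
        elif code == ")":
            yield ("end", rest, None)
        elif code == "-":
            # Data is passed through as-is; escapes are not decoded here.
            yield ("data", rest, None)
        elif code == "?":
            yield ("pi", rest, None)
        elif code == "C":
            yield ("conforming", None, None)
        # Any other command character is ignored in this sketch.


# Hypothetical sample, shaped like what a validator might emit for
# <a href="http://example.org/">link text</a>.
SAMPLE_LINES = [
    "AHREF CDATA http://example.org/",
    "ACLASS IMPLIED",
    "(A",
    "-link text",
    ")A",
    "C",
]

for event in parse_esis(SAMPLE_LINES):
    print(event)
```

A consumer such as an editing tool only has to read these lines in order;
it never needs to parse SGML markup itself.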
                                                                                                                              
> More precisely, my (I'm hesitant to claim this as a general HTML5                                                           
> world view) world view says that using vocabularies that the receiving                                                      
> software doesn't understand is a worse way of communicating than using                                                      
> vocabularies that the receiving software understands. (And if you                                                           
> ship a JavaScript or XSLT program to the recipient, the interface                                                           
> of communication to consider isn't the input to your program but                                                            
> its output. For example, if you serve FooML with <?xml-stylesheet?>                                                         
> that transforms it to HTML, you are effectively communicating with                                                          
> the recipient in HTML--not in FooML.)                                                                                       
                                                                                                                              
This argument strikes me as a defense of putting arbitrary XML on the
wire, since it is not (in your sense of the term) the interface of
communication.

Once that is accepted, it seems plausible to allow mixtures of HTML and
arbitrary XML as well.
                                                                                                                              
> This is a bit of a dirty open secret of HTML5. We pretend in rhetoric
> that #1 is true, but in practice, if you consider elements introduced
> by HTML5 and how they behaved in pre-HTML5 browsers, #2 is true.

Thanks for the explanation.  I will henceforth disregard claims of #1.

> More to the point, DocBook is not XHTML+MathML if we consider that
> to mean "XHTML and MathML and nothing more". If you aren't allowed to
> dump DocBook content as a child of an HTML element, it doesn't really
> make sense to enable dumping it inside annotation-xml.

However, if the day came in which DocBook was an equal-partner vocabulary
(unlikely as that may seem, stranger things have already happened),
we would have to add yet another hack to make it work inside MathML.

It is one thing to say it's not valid HTML to incorporate a foreign
vocabulary inside MathML-in-HTML annotations.  It's another thing to
ensure that such vocabularies are already broken.

> There are security incentives that work against starting to repair
> broken JavaScript where "broken" is what's broken per ES3. However, I
> wouldn't be at all surprised if we ended up in a situation where every
> vendor has an incentive not to enforce the ES5 Strict Mode in order to
> "work" with more Web content than a competing product that halts on
> Strict Mode violations and the ES5 Strict Mode effort collapsed.

Strict mode is a programmer choice, not an implementer choice.  The code
has to contain a "use strict" directive.

-- 
John Cowan   cowan@ccil.org    http://ccil.org/~cowan
Original line from The Warrior's Apprentice by Lois McMaster Bujold:
"Only on Barrayar would pulling a loaded needler start a stampede toward one."
English-to-Russian-to-English mangling thereof: "Only on Barrayar you risk to
lose support instead of finding it when you threat with the charged weapon."
Received on Tuesday, 4 January 2011 19:57:30 GMT
