Re: What problem is this task force trying to solve and why?

From: Kurt Cagle <kurt.cagle@gmail.com>
Date: Tue, 4 Jan 2011 16:00:18 -0500
Message-ID: <AANLkTinaOOajSD6Kx__3O04CV2NmMqLRwCwuQ4rNiG3D@mail.gmail.com>
To: John Cowan <cowan@mercury.ccil.org>
Cc: Henri Sivonen <hsivonen@iki.fi>, public-html-xml@w3.org
One other possibility that comes to mind is simply to create a
<foreignContent> element in HTML5. SVG has a similar element, <foreignObject>
(usually for holding HTML, oddly enough). This would simply tell the
processor not to
display the content in question, not to parse it, not to do anything with
it. From the standpoint of HTML5, it's non-displayed text. It would be the
responsibility of the web developer to parse this content into something
meaningful, and if it breaks, then it breaks.

Yes, it's a data island. If the HTML5 working group feels so strongly about
the purity of the language, a data island is the minimal subset necessary to
ensure some form of extension. Throwing XML content into a
<script type="application/xml" id="foo"> block works better, because it
performs parsing of corresponding documents, but either way, embedding XML
in HTML is not a hard
problem. The only hard problem is getting past this phobia about XML content
ending up in HTML.
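
To make the <script> data-island idea concrete, here is a minimal sketch in
Python rather than browser script, purely so it can be run standalone. It
assumes the island is marked with type="application/xml" and an id; the class
name, the load_island helper, the sample page, and the example.org namespace
are all made up for illustration. The HTML side treats the script body as
opaque text, and the developer hands that text to an XML parser. If the XML is
ill-formed, the XML parser raises an error, and that is the developer's
problem, not HTML's.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET


class DataIslandExtractor(HTMLParser):
    """Collect the raw text of <script type="application/xml"> elements by id."""

    def __init__(self):
        super().__init__()
        self.islands = {}         # id -> raw, unparsed XML text
        self._current_id = None   # id of the island currently being read

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "script" and attrs.get("type") == "application/xml":
            self._current_id = attrs.get("id", "")
            self.islands[self._current_id] = ""

    def handle_data(self, data):
        # The HTML parser treats script content as raw text, so the XML
        # markup arrives here untouched (possibly in several chunks).
        if self._current_id is not None:
            self.islands[self._current_id] += data

    def handle_endtag(self, tag):
        if tag == "script":
            self._current_id = None


# Hypothetical page: the island is invisible to HTML rendering.
PAGE = """<!DOCTYPE html>
<html><body>
<p>Visible content.</p>
<script type="application/xml" id="foo">
<order xmlns="http://example.org/orders"><item sku="42">Widget</item></order>
</script>
</body></html>"""


def load_island(html_text, island_id):
    """Return the island's XML as an ElementTree element (raises if ill-formed)."""
    extractor = DataIslandExtractor()
    extractor.feed(html_text)
    extractor.close()
    return ET.fromstring(extractor.islands[island_id].strip())


root = load_island(PAGE, "foo")
item = root.find("{http://example.org/orders}item")
print(root.tag, item.get("sku"), item.text)
```

The same split of responsibilities would apply to a <foreignContent> element:
HTML carries the bytes, and everything about their meaning is left to the
author's own code.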

Kurt Cagle
XML Architect
*Lockheed / US National Archives ERA Project*



On Tue, Jan 4, 2011 at 2:51 PM, John Cowan <cowan@mercury.ccil.org> wrote:

> Henri Sivonen scripsit:
>
> > On Dec 20, 2010, at 17:50, David Carlisle wrote:
> > It sure has. Hixie ran an analysis over a substantial quantity of
> > Web pages in Google's index and found existing text/html content that
> > contained an <svg> tag or a <math> tag. The justification is making
> > the algorithm not break a substantial quantity of pages like that.
>
> A number would be nice.  One person's "substantial" is another person's
> "trivial", unfortunately.
>
> > Web authors do all sorts of crazily bizarre things. It's really not
> > useful to try to apply logic to try to reason what kind of existing
> > content there should be.
>
> Amen.
>
> > > Editing tools also use nsgmls (perhaps just in the background)
> > > It isn't really true to say it is "just the w3c validator".
> >
> > Which tools? Is the plural really justified or is this about one
> > Emacs mode?
>
> You are confusing nsgmls itself with the Emacs mode (which employs
> nsgmls).  Nsgmls is a stand-alone SGML validator that outputs an
> ESIS equivalent to the document being validated.  ESIS is a textual
> representation of SAX-style events, one line per event.  It's the core
> of any reasonably modern SGML system.
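
For readers who have not seen ESIS, a rough sketch of the "one line per
event" idea, using Python's xml.sax on XML input instead of nsgmls on SGML.
The line prefixes ("A" for an attribute, "(" for a start tag, "-" for data,
")" for an end tag) are loosely modeled on nsgmls output; this is an
approximation for illustration, not the real format, and the handler and
function names are my own.

```python
import xml.sax


class EsisLikeHandler(xml.sax.ContentHandler):
    """Record SAX events as ESIS-like text lines, one line per event."""

    def __init__(self):
        super().__init__()
        self.lines = []

    def startElement(self, name, attrs):
        # Attribute lines come before the start-tag line of their element.
        for attr_name in attrs.getNames():
            self.lines.append(f"A{attr_name} CDATA {attrs.getValue(attr_name)}")
        self.lines.append(f"({name}")

    def characters(self, content):
        # Escape backslashes and newlines so each event stays on one line.
        escaped = content.replace("\\", "\\\\").replace("\n", "\\n")
        self.lines.append(f"-{escaped}")

    def endElement(self, name):
        self.lines.append(f"){name}")


def esis_like(xml_text):
    """Parse xml_text and return the list of ESIS-like event lines."""
    handler = EsisLikeHandler()
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return handler.lines


for line in esis_like('<doc id="a"><p>Hello</p></doc>'):
    print(line)
```

Because each event is a single line of text, downstream tools can consume
the stream with ordinary line-oriented processing.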
>
> > More precisely, my (I'm hesitant to claim this as a general HTML5
> > world view) world view says that using vocabularies that the receiving
> > software doesn't understand is a worse way of communicating than using
> > vocabularies that the receiving software understands. (And if you
> > ship a JavaScript or XSLT program to the recipient, the interface
> > of communication to consider isn't the input to your program but
> > its output. For example, if you serve FooML with <?xml-stylesheet?>
> > that transforms it to HTML, you are effectively communicating with
> > the recipient in HTML--not in FooML.)
>
> This argument strikes me as a defense of putting arbitrary XML on the
> wire, since it is not (in your sense of the term) the interface of
> communication.
>
> Once that is accepted, it seems plausible to allow mixtures of HTML and
> arbitrary XML as well.
>
> > This is a bit of a dirty open secret of HTML5. We pretend in rhetoric
> > that #1 is true, but in practice, if you consider elements introduced
> > by HTML5 and how they behaved in pre-HTML5 browsers, #2 is true.
>
> Thanks for the explanation.  I will henceforth disregard claims of #1.
>
> > More to the point, DocBook is not XHTML+MathML if we consider that
> > to mean "XHTML and MathML and nothing more". If you aren't allowed to
> > dump DocBook content as a child of an HTML element, it doesn't really
> > make sense to enable dumping it inside annotation-xml.
>
> However, if the day came in which DocBook was an equal-partner vocabulary
> (unlikely as that may seem, stranger things have already happened),
> we would have to add yet another hack to make it work inside MathML.
>
> It is one thing to say it's not valid HTML to incorporate a foreign
> vocabulary inside MathML-in-HTML annotations.  It's another thing to
> ensure that such vocabularies are already broken.
>
> > There are security incentives that work against starting to repair
> > broken JavaScript where "broken" is what's broken per ES3. However, I
> > wouldn't be at all surprised if we ended up in a situation where every
> > vendor has an incentive not to enforce the ES5 Strict Mode in order to
> > "work" with more Web content than a competing product that halts on
> > Strict Mode violations and the ES5 Strict Mode effort collapsed.
>
> Strict mode is a programmer choice, not an implementer choice.  The code
> has to contain a "use strict" directive.
>
> --
> John Cowan   cowan@ccil.org    http://ccil.org/~cowan
> Original line from The Warrior's Apprentice by Lois McMaster Bujold:
> "Only on Barrayar would pulling a loaded needler start a stampede toward
> one."
> English-to-Russian-to-English mangling thereof: "Only on Barrayar you risk
> to lose support instead of finding it when you threat with the charged
> weapon."
>
>
Received on Tuesday, 4 January 2011 21:02:11 GMT