- From: Kurt Cagle <kurt.cagle@gmail.com>
- Date: Tue, 4 Jan 2011 16:10:53 -0500
- To: John Cowan <cowan@mercury.ccil.org>
- Cc: Henri Sivonen <hsivonen@iki.fi>, public-html-xml@w3.org
- Message-ID: <AANLkTimHph9QQiuAvj4BefrS4ynZPpUwtasvdsHKS47a@mail.gmail.com>
One additional point about <foreignContent> - this opens up the option of
embedding JSON content, YAML or other similar content into the HTML without
the browser rendering that content. It provides a way of embedding metadata
such as RDF. It could even be used for embedding binhex or similar content.
It's semantically neutral - the browser doesn't need to supply any processing
to it. It's an escape hatch for HTML, something that provides a way to extend
the language if necessary but to do so in a consistent manner.

Kurt Cagle
XML Architect
*Lockheed / US National Archives ERA Project*

On Tue, Jan 4, 2011 at 4:00 PM, Kurt Cagle <kurt.cagle@gmail.com> wrote:

> One other possibility that comes to mind is simply to create a
> <foreignContent> element in HTML5. SVG has a similar element (usually for
> holding HTML, oddly enough). This would simply tell the processor to not
> display the content in question, not to parse it, not to do anything with
> it. From the standpoint of HTML5, it's non-displayed text. It would be the
> responsibility of the web developer to parse this content into something
> meaningful, and if it breaks, then it breaks.
>
> Yes, it's a data island. If the HTML5 working group feels so strongly about
> the purity of the language, a data island is the minimal subset necessary to
> ensure some form of extension. Throwing XML content into a <script block as
> "application/xml" id="foo"> works better, because it performs parsing of
> corresponding documents, but either way, embedding XML in HTML is not a hard
> problem. The only hard problem is getting past this phobia about XML content
> ending up in HTML.
>
> Kurt Cagle
> XML Architect
> *Lockheed / US National Archives ERA Project*
>
> On Tue, Jan 4, 2011 at 2:51 PM, John Cowan <cowan@mercury.ccil.org> wrote:
>
>> Henri Sivonen scripsit:
>>
>> > On Dec 20, 2010, at 17:50, David Carlisle wrote:
>> > It sure has.
>> > Hixie ran an analysis over a substantial quantity of
>> > Web pages in Google's index and found existing text/html content that
>> > contained an <svg> tag or a <math> tag. The justification is making
>> > the algorithm not break a substantial quantity of pages like that.
>>
>> A number would be nice. One person's "substantial" is another person's
>> "trivial", unfortunately.
>>
>> > Web authors do all sorts of crazily bizarre things. It's really not
>> > useful to try to apply logic to try to reason what kind of existing
>> > content there should be.
>>
>> Amen.
>>
>> > > Editing tools also use nsgmls (perhaps just in the background)
>> > > It isn't really true to say it is "just the w3c validator".
>> >
>> > Which tools? Is the plural really justified or is this about one
>> > Emacs mode?
>>
>> You are confusing nsgmls itself with the Emacs mode (which employs
>> nsgmls). Nsgmls is a stand-alone SGML validator that outputs an
>> ESIS equivalent to the document being validated. ESIS is a textual
>> representation of SAX-style events, one line per event. It's the core
>> of any reasonably modern SGML system.
>>
>> > More precisely, my (I'm hesitant to claim this as a general HTML5
>> > world view) world view says that using vocabularies that the receiving
>> > software doesn't understand is a worse way of communicating than using
>> > vocabularies that the receiving software understands. (And if you
>> > ship a JavaScript or XSLT program to the recipient, the interface
>> > of communication to consider isn't the input to your program but
>> > its output. For example, if you serve FooML with <?xml-stylesheet?>
>> > that transforms it to HTML, you are effectively communicating with
>> > the recipient in HTML--not in FooML.)
>>
>> This argument strikes me as a defense of putting arbitrary XML on the
>> wire, since it is not (in your sense of the term) the interface of
>> communication.
>>
>> Once that is accepted, it seems plausible to allow mixtures of HTML and
>> arbitrary XML as well.
>>
>> > This is a bit of a dirty open secret of HTML5. We pretend in rhetoric
>> > that #1 is true, but in practice, if you consider elements introduced
>> > by HTML5 and how they behaved in pre-HTML5 browsers, #2 is true.
>>
>> Thanks for the explanation. I will henceforth disregard claims of #1.
>>
>> > More to the point, DocBook is not XHTML+MathML if we consider that
>> > to mean "XHTML and MathML and nothing more". If you aren't allowed to
>> > dump DocBook content as a child of an HTML element, it doesn't really
>> > make sense to enable dumping it inside annotation-xml.
>>
>> However, if the day came in which DocBook was an equal-partner vocabulary
>> (unlikely as that may seem, stranger things have already happened),
>> we would have to add yet another hack to make it work inside MathML.
>>
>> It is one thing to say it's not valid HTML to incorporate a foreign
>> vocabulary inside MathML-in-HTML annotations. It's another thing to
>> ensure that such vocabularies are already broken.
>>
>> > There are security incentives that work against starting to repair
>> > broken JavaScript where "broken" is what's broken per ES3. However, I
>> > wouldn't be at all surprised if we ended up in a situation where every
>> > vendor has an incentive not to enforce the ES5 Strict Mode in order to
>> > "work" with more Web content than a competing product that halts on
>> > Strict Mode violations and the ES5 Strict Mode effort collapsed.
>>
>> Strict mode is a programmer choice, not an implementer choice. The code
>> has to contain a "use strict" directive.
>>
>> --
>> John Cowan cowan@ccil.org http://ccil.org/~cowan
>> Original line from The Warrior's Apprentice by Lois McMaster Bujold:
>> "Only on Barrayar would pulling a loaded needler start a stampede toward
>> one."
>> English-to-Russian-to-English mangling thereof: "Only on Barrayar you risk
>> to lose support instead of finding it when you threat with the charged
>> weapon."
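
For readers who want to see the data-island idea from the top of this message in concrete terms, here is a minimal sketch. It assumes the island is carried in a <script> block whose type is not a script language, so the browser neither runs nor displays it, and that the page author pulls the text back out and parses it. The names DataIslandParser, extract_island and PAGE, the id "foo" and the JSON payload are all made up for this illustration; nothing here comes from a proposed <foreignContent> specification, and the extraction is done with Python's standard html.parser rather than in a browser.

```python
import json
from html.parser import HTMLParser


class DataIslandParser(HTMLParser):
    """Collect the raw text of <script> blocks matching a type and id."""

    def __init__(self, wanted_type, wanted_id):
        super().__init__()
        self.wanted_type = wanted_type
        self.wanted_id = wanted_id
        self._inside = False  # True while inside the matching <script>
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        if (tag == "script"
                and attr.get("type") == self.wanted_type
                and attr.get("id") == self.wanted_id):
            self._inside = True

    def handle_endtag(self, tag):
        if tag == "script":
            self._inside = False

    def handle_data(self, data):
        # html.parser hands over <script> content as raw, unparsed text.
        if self._inside:
            self.chunks.append(data)


def extract_island(html, wanted_type, wanted_id):
    """Return the island's text, or an empty string if none is found."""
    parser = DataIslandParser(wanted_type, wanted_id)
    parser.feed(html)
    parser.close()
    return "".join(parser.chunks)


# Hypothetical page: the island is opaque to HTML and carries JSON.
PAGE = """<!DOCTYPE html>
<html><body>
<p>Visible text.</p>
<script type="application/json" id="foo">
{"title": "Example", "tags": ["a", "b"]}
</script>
</body></html>"""

# The page author, not the HTML processor, gives the content meaning.
metadata = json.loads(extract_island(PAGE, "application/json", "foo"))
print(metadata["title"])
```

If the island's content is malformed, json.loads raises an error and the failure stays with the author's own code, which matches the "if it breaks, then it breaks" position above.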
Received on Tuesday, 4 January 2011 21:16:02 UTC