- From: David Carlisle <davidc@nag.co.uk>
- Date: Thu, 06 Sep 2012 13:18:53 +0100
- To: stephengreenubl@gmail.com
- Cc: public-microxml@w3.org
On 06/09/2012 12:58, Stephen D Green wrote:
> Then 1) MicroXML could allow parsers to preserve in its data model
> whether the markup included an empty element as <abc/> or as
> <abc></abc> or 2) maybe HTML would start to treat them as
> equivalent. Either way, here is a potential road map to sensible
> convergence, isn't it, with MicroXML setting out to make the first
> step from the XML side and to highlight potential reciprocal changes
> that might be made from the HTML side.

I don't think that will happen. The HTML5 designers explicitly rejected
(multiple times) the notion that /> syntax should mean empty element.
It is now baked into the html parser spec in so many places that the /
is ignored, and so <foo/> parses as <foo> (and thus as a start tag, or
as a tag for a void element for elements defined as void in html), that
it would be very hard to change.

Personally I wish that they had made the XML-style reading the default
behaviour, even if they had special-cased some existing elements. (The
script element is the most plausible argument for an exception: given
<script/>zzz</script>, there are some possible attack points if new
parsers see zzz as not being inside script while old parsers, which do
not understand xml-style /> syntax, see zzz as script content.)

The polyglot spec lists most of the things you need to do to make a
document produce an equivalent DOM whether parsed as html or xml, and
most of those restrictions would apply equally to microxml.

http://dev.w3.org/html5/html-xhtml-author-guide/html-xhtml-authoring-guide.html

To see how weird html parsing is, consider this microxml document:

    <math>
     <mfrac>
      <p>a</p>
      <p>b</p>
     </mfrac>
    </math>

If parsed with a microxml or xml parser, it produces a math element
with an mfrac element child which has two p element children.
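The xml half of that claim is easy to check. The following is only a
minimal sketch, assuming Python's standard xml.etree.ElementTree as a
stand-in for "an xml parser"; the outline() helper is a hypothetical
name introduced here for illustration, not something from the message.
The html half is not reproduced, since it needs a parser that
implements the HTML5 tree-construction rules, which the Python standard
library does not provide.

```python
import xml.etree.ElementTree as ET

# The example document from the message, on one line.
DOC = "<math> <mfrac> <p>a</p> <p>b</p> </mfrac> </math>"


def outline(elem, depth=0):
    """Return one line per element, indented two spaces per nesting level."""
    lines = ["  " * depth + elem.tag]
    for child in elem:
        lines.extend(outline(child, depth + 1))
    return lines


root = ET.fromstring(DOC)
tree_lines = outline(root)

# Show the element nesting as an XML parser sees it.
print("\n".join(tree_lines))
```

The two p elements are printed nested under mfrac, which is the
structure an xml (or microxml) parser is expected to report.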
If parsed with an html parser, it produces the DOM which you would get
from this xml document (ignoring namespaces):

    <html><head></head><body><math>
     <mfrac>
     </mfrac></math><p>a</p>
      <p>b</p>
    </body></html>

Note that the mfrac element now has no children, and the two p elements
are siblings of math, not grandchildren.

David
Received on Thursday, 6 September 2012 12:19:17 UTC