- From: David Carlisle <davidc@nag.co.uk>
- Date: Wed, 05 Jan 2011 11:36:55 +0000
- To: Henri Sivonen <hsivonen@iki.fi>
- Cc: public-html-xml@w3.org
On 05/01/2011 10:31, Henri Sivonen wrote:
> It doesn't seem plausible to change HTML so much that anything that
> an XML serializer could legitimately produce would parse right.

agreed. I have no problem with saying that general xml needs to be
parsed by an xml parser. But it's a big leap to go from that to saying
that it is OK to make it impossible to generate html using an xml
serialiser, even if the generation process is very selective in its
choice of features.

> If we
> don't get that far, a text/html-safe serializer is needed anyway and
> tweaking the details isn't much of a win.
>
>> It seems to be very common to use xhtml syntax on pages served as
>> text/html
>>
>> http://www.w3.org/
>>
>> for example or
>>
>> http://www.drupal.org.uk/
>
> Drupal has been believing the XHTML2 WG advocacy (RDFa) even after
> HTML5 was brought into the W3C. As has the W3C itself. I think its
> not a useful use of effort to try to bail out authors who go out
> their way to look away from HTML5 into the XHTML2 WG land where specs
> were reviewed for processing as XML but were silently condoned or
> even pushed for deployment in text/html nonetheless.

maybe you can discount w3c and drupal as having inbuilt xml bias, but
as you know they were just the first 2 sites that came into my head,
there are lots of others, and this is an ongoing and ever increasing
problem as the desire to generate new content with xml tools will not
go away, and it is defined not to work so as to support a small
(vanishingly small in the case of html in foreign content) number of
pages that were never valid in the first place.

>> or ...
>>
>> Currently this is just an error waiting to happen (for example try
>> mouse-ing over paragraphs in
>>
>> http://www.w3.org/TR/2009/REC-MathML2-20090303/chapter1.html#intro.notation
>>
>> )
>
> In 2009 you should have already known better than to serve <a/> as
> text/html.

:-( It wasn't me that changed the formatting to make it invalid:-)
It was an experimental restyling of the entire TR area that got rolled
back when it didn't work. I just remembered that URI as it caused a
certain amount of stress at the time:-) Or as I said at the time

http://lists.w3.org/Archives/Public/site-comments/2009Oct/0080.html

It's really really unfortunate to serve xhtml as text/html, I suppose
I wish that html5 had taken the opportunity to make things better in
this area. I accept that the reason it hasn't is due to competing
concerns rather than negligence, but still it would be good to find
some middle way.

> I don't deny that this is a problem, but it's a
> problem whose parser solution would cause other problems, so I'd
> rather continue with solving the problem with counter-propaganda than
> by changing HTML parsing.

Yes, I know:-) It isn't an altogether unreasonable viewpoint, I just
don't share it. I can see from a browser vendor's viewpoint, anything
that keeps existing pages working has a definite advantage over any
change that has a potential for breaking any existing page no matter
how wrong the markup on that page. I think it is that viewpoint that
dominates the html5 design.

However for content producers, there are costs involved in avoiding all
these special case markup rules, surrounding </br> or html in foreign
content (or /> generally). The proposal to avoid these problems of
always putting an html5 serialiser at the end of the chain isn't always
available, James C just gave some use cases so I won't list any more
here.

David
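
As a rough sketch of what "putting an html5 serialiser at the end of the chain" can look like, the following uses Python's standard xml.etree.ElementTree. The sample markup and the helper name to_text_html are made up for illustration and do not come from this thread. The "html" output method is only an approximation of a text/html-safe serialiser: it works from a fixed list of void elements and does nothing special for foreign content such as MathML or SVG.

```python
import xml.etree.ElementTree as ET


def to_text_html(xml_source: str) -> str:
    """Re-serialise well-formed XML markup so it avoids '/>' syntax.

    Hypothetical helper for illustration only. It relies on the
    method="html" mode of ElementTree, which writes void elements
    such as <br> without an end tag and gives every other empty
    element an explicit end tag (<a></a> rather than <a/>).
    """
    root = ET.fromstring(xml_source)
    return ET.tostring(root, encoding="unicode", method="html")


if __name__ == "__main__":
    # An empty named anchor plus a line break, as an XML tool might emit them.
    sample = '<p><a name="x"/>text<br/></p>'

    # Default XML serialisation keeps the self-closing syntax, which a
    # text/html parser would read as an unclosed <a> start tag.
    print(ET.tostring(ET.fromstring(sample), encoding="unicode"))

    # HTML-mode serialisation of the same tree.
    print(to_text_html(sample))
```

This only helps when the last step of the pipeline is under the producer's control, which is exactly the condition said above not always to hold.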
Received on Wednesday, 5 January 2011 11:39:28 UTC