Re: What problem is this task force trying to solve and why? from David Carlisle on 2011-01-05 (public-html-xml@w3.org from January 2011)

From: David Carlisle <davidc@nag.co.uk>
Date: Wed, 05 Jan 2011 11:36:55 +0000
To: Henri Sivonen <hsivonen@iki.fi>
Cc: public-html-xml@w3.org
Message-ID: <4D2457D7.6010203@nag.co.uk>
On 05/01/2011 10:31, Henri Sivonen wrote:


> It doesn't seem plausible to change HTML so much that anything that
> an XML serializer could legitimately produce would parse right.

Agreed. I have no problem with saying that general xml needs to be
parsed by an xml parser. But it's a big leap to go from that to saying
that it is OK to make it impossible to generate html using an xml
serialiser, even when the generation process is very selective in its
choice of features.

> If we
> don't get that far, a text/html-safe serializer is needed anyway and
> tweaking the details isn't much of a win.
>
>> It seems to be very common to use xhtml syntax on pages served as
>> text/html
>>
>> http://www.w3.org/
>>
>> for example or
>>
>> http://www.drupal.org.uk/
>
> Drupal has been believing the XHTML2 WG advocacy (RDFa) even after
> HTML5 was brought into the W3C. As has the W3C itself. I think it's
> not a useful use of effort to try to bail out authors who go out of
> their way to look away from HTML5 into the XHTML2 WG land where specs
> were reviewed for processing as XML but were silently condoned or
> even pushed for deployment in text/html nonetheless.

Maybe you can discount w3c and drupal as having an inbuilt xml bias, but
as you know they were just the first two sites that came into my head.
There are lots of others, and this is an ongoing and ever-increasing
problem: the desire to generate new content with xml tools will not go
away, yet it is defined not to work, so as to support a small
(vanishingly small in the case of html in foreign content) number of
pages that were never valid in the first place.

>> or ...
>>
>> Currently this is just an error waiting to happen (for example try
>> mouse-ing over paragraphs in
>>
>> http://www.w3.org/TR/2009/REC-MathML2-20090303/chapter1.html#intro.notation
>>
>> )
>
> In 2009 you should have already known better than to serve <a/> as
> text/html. :-(

It wasn't me that changed the formatting to make it invalid:-) It was an 
experimental restyling of the entire TR area that got rolled back when 
it didn't work. I just remembered that URI as it caused a certain amount 
of stress at the time:-)

Or as I said at the time

http://lists.w3.org/Archives/Public/site-comments/2009Oct/0080.html

      It's really really unfortunate to serve xhtml as text/html,

I suppose I wish that html5 had taken the opportunity to make things 
better in this area. I accept that the reason it hasn't is due to 
competing concerns rather than negligence, but still it would be good to 
find some middle way.


> I don't deny that this is a problem, but it's a
> problem whose parser solution would cause other problems, so I'd
> rather continue with solving the problem with counter-propaganda than
> by changing HTML parsing.

Yes, I know:-) It isn't an altogether unreasonable viewpoint, I just
don't share it. I can see that, from a browser vendor's viewpoint,
anything that keeps existing pages working has a definite advantage over
any change that could break an existing page, no matter how wrong the
markup on that page. I think it is that viewpoint that dominates the
html5 design. However, for content producers there are costs involved in
avoiding all these special-case markup rules surrounding </br> or html
in foreign content (or /> generally).

The proposed way of avoiding these problems, always putting an html5
serialiser at the end of the chain, isn't always available. James C just
gave some use cases, so I won't list any more here.
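
As a rough illustration of what "an html5 serialiser at the end of the
chain" can look like (this sketch is not from the thread): Python's
standard xml.etree.ElementTree can parse the output of an xml tool and
write it back out with method="html", which gives empty non-void
elements an explicit end tag and writes void elements without one. The
helper name to_text_html and the sample markup are hypothetical, the
input is assumed to be well-formed and namespace-free, and this is not a
complete html5 serialiser.

```python
import xml.etree.ElementTree as ET


def to_text_html(xml_source: str) -> str:
    """Re-serialise well-formed, namespace-free XML markup for text/html.

    Hypothetical helper, for illustration only.
    """
    root = ET.fromstring(xml_source)
    # method="html" avoids the XML empty-element form "<a/>", which an
    # HTML parser would treat as a start tag that is never closed.
    return ET.tostring(root, encoding="unicode", method="html")


if __name__ == "__main__":
    sample = '<p><a id="x"/>text<br/></p>'
    print(to_text_html(sample))
```

With the default xml method the same tree would be written using the
<a id="x" /> form, which is the markup that misbehaves when served as
text/html.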

David



Received on Wednesday, 5 January 2011 11:39:28 UTC