- From: David Carlisle <davidc@nag.co.uk>
- Date: Wed, 05 Jan 2011 11:36:55 +0000
- To: Henri Sivonen <hsivonen@iki.fi>
- Cc: public-html-xml@w3.org
On 05/01/2011 10:31, Henri Sivonen wrote:
> It doesn't seem plausible to change HTML so much that anything that
> an XML serializer could legitimately produce would parse right.
Agreed. I have no problem with saying that general xml needs to be
parsed by an xml parser. But it's a big leap to go from that to saying
that it is OK to make it impossible to generate html using an xml
serialiser, even if the generation process is very selective in its
choice of features.
> If we
> don't get that far, a text/html-safe serializer is needed anyway and
> tweaking the details isn't much of a win.
>
>> It seems to be very common to use xhtml syntax on pages served as
>> text/html
>>
>> http://www.w3.org/
>>
>> for example or
>>
>> http://www.drupal.org.uk/
>
> Drupal has been believing the XHTML2 WG advocacy (RDFa) even after
> HTML5 was brought into the W3C. As has the W3C itself. I think it's
> not a useful use of effort to try to bail out authors who go out of
> their way to look away from HTML5 into the XHTML2 WG land where specs
> were reviewed for processing as XML but were silently condoned or
> even pushed for deployment in text/html nonetheless.
Maybe you can discount W3C and Drupal as having an inbuilt xml bias, but
as you know they were just the first two sites that came into my head.
There are lots of others, and this is an ongoing and ever-increasing
problem: the desire to generate new content with xml tools will not go
away, yet it is defined not to work, so as to support a small
(vanishingly small in the case of html in foreign content) number of
pages that were never valid in the first place.
>> or ...
>>
>> Currently this is just an error waiting to happen (for example try
>> mouse-ing over paragraphs in
>>
>> http://www.w3.org/TR/2009/REC-MathML2-20090303/chapter1.html#intro.notation
>> )
>
> In 2009 you should have already known better than to serve <a/> as
> text/html. :-(
It wasn't me that changed the formatting to make it invalid:-) It was an
experimental restyling of the entire TR area that got rolled back when
it didn't work. I just remembered that URI as it caused a certain amount
of stress at the time:-)
Or, as I said at the time in
http://lists.w3.org/Archives/Public/site-comments/2009Oct/0080.html
it's really, really unfortunate to serve xhtml as text/html.
I suppose I wish that html5 had taken the opportunity to make things
better in this area. I accept that the reason it hasn't is due to
competing concerns rather than negligence, but still it would be good to
find some middle way.
> I don't deny that this is a problem, but it's a
> problem whose parser solution would cause other problems, so I'd
> rather continue with solving the problem with counter-propaganda than
> by changing HTML parsing.
Yes, I know:-) It isn't an altogether unreasonable viewpoint, I just
don't share it. I can see that, from a browser vendor's viewpoint, anything
that keeps existing pages working has a definite advantage over any
change that has the potential to break an existing page, no matter how
wrong the markup on that page. I think it is that viewpoint that
dominates the html5 design. However, for content producers there are
costs involved in avoiding all these special-case markup rules
surrounding </br> or html in foreign content (or /> generally).
The proposed way to avoid these problems, always putting an html5
serialiser at the end of the chain, isn't always available. James C just
gave some use cases so I won't list any more here.
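
For illustration only, here is a minimal sketch of what "an html serialiser
at the end of the chain" can look like when that option is available. It
uses Python's standard xml.etree.ElementTree, whose method="html" output
mode writes empty non-void elements with an explicit end tag and void
elements such as br with no end tag. The function name to_html_safe and the
sample markup are made up for this example, and the input is assumed to be
un-namespaced; real XHTML with an xmlns declaration would need its namespace
handled first.

```python
import xml.etree.ElementTree as ET


def to_html_safe(xml_text: str) -> str:
    """Parse well-formed XML and re-serialise it in text/html-friendly syntax.

    Hypothetical helper: assumes un-namespaced element names.
    """
    root = ET.fromstring(xml_text)
    # method="html" avoids the XML-style "/>" shorthand on output.
    return ET.tostring(root, encoding="unicode", method="html")


# An empty anchor and a line break, both written with XML empty-element syntax.
source = '<p><a id="x"/>text<br/></p>'

# Default XML serialisation keeps the "/>" form that text/html parsers
# do not treat as closing the <a> element.
xml_out = ET.tostring(ET.fromstring(source), encoding="unicode")

# HTML serialisation of the same tree.
html_out = to_html_safe(source)

print(xml_out)
print(html_out)  # <p><a id="x"></a>text<br></p>
```

This only helps when the producer controls the last step of the pipeline,
which is exactly the case that isn't always available.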
David
Received on Wednesday, 5 January 2011 11:39:28 UTC