- From: Benjamin Hawkes-Lewis <bhawkeslewis@googlemail.com>
- Date: Thu, 08 Jan 2009 13:01:25 +0000
- To: Mark Birbeck <mark.birbeck@webbackplane.com>
- CC: Brett Patterson <inspiron.pattersonb@gmail.com>, David Woolley <forums@david-woolley.me.uk>, Molte <molte93@gmail.com>, Shavkat Karimov <shavkat@seomanager.com>, HTML Working Group Discussion Mailing-List <www-html@w3.org>
On 7/1/09 22:13, Mark Birbeck wrote:
> Many
> organisations choose to generate documents that are technically XHTML,
> but deliver them to browsers using an HTML MIME type, so as to 'force'
> the browser to use its HTML parser for rendering.

Yep. They send output in such a way that its processing has no detailed conformance requirements, save for those that HTML5 will hopefully provide.

> This gives them the best of both worlds; they can use one or more of
> the enormous number of XML tools around to generate their documents,

Since a serialization to HTML could be appended to any toolchain producing XHTML, I cannot agree that serving text/html gives them this option.

> but they can still have these documents rendered in existing browsers,
> without having to worry about whether the browser supports XHTML or
> not.

Instead, they need to worry about whether the browser processes their particular XHTML acceptably as tag soup. Authors don't have any conformance criteria for the subset of XHTML that's processable. And this also means they do need to forgo any benefits of having their markup processed by browsers as XML, such as mixing in other languages. And since HTML 4.01 can be parsed to roughly the same DOM as their XML - typically by the same tools - they're not making it substantially easier for third parties to process their content in an automated fashion.

> The important thing here is that this technique also means that in
> principle, even if a 'new' language is created, it could still be
> processed by existing browsers, provided that the new language paid
> attention to HTML processing rules.

Yes, but I didn't mean HTML5 wasn't a new language; I'm saying XHTML 2 is moving beyond the constraints of the text/html serialization in devising a new language.

> So XHTML 2 could be delivered with an HTML MIME type, just as HTML5
> could be delivered with an XHTML MIME type -- in both cases the
> languages are distinct from the delivery mechanism.

Yes.
You could deliver any byte stream as text/html.

>> HTML5 is premised on the constraints of supporting the existing web with the
>> same specification; XHTML 2 is premised on ignoring those constraints.
>
> I think this is a little misleading.
>
> First, HTML5 adds new features that are not backwards-compatible with
> HTML 4, but it just so happens that the close relationship between
> some of the browser implementers and the spec writers mean that
> features are being added quite quickly. In effect, the 'existing web'
> is changing, even as we discuss it.
>
> Second, XHTML 2 is not based on ignoring those constraints, although
> it would probably be true to say that it was at its inception.

While HTML adds new features with backwards-compatibility problems, it's a requirement that the new features are at least not incompatible with supporting the current web corpus. There's also a general attempt to ensure that with a bit of server-side processing you could provide an acceptable user experience to most existing user agents. There are some exceptions that would require publisher CSS or JS for an acceptable user experience (the "hidden" attribute springs to mind, though authors are already widely using display: none; to the same effect, as does "datagrid", though authors are already creating equivalent features using JS). But, as you note, the existing web is changing to implement these features (e.g. canvas and video) such that these graceful degradation problems will be substantially reduced when HTML5 becomes a recommendation.

AFAIK the feedback from browser vendors like Opera seems to be that implementing XHTML 2 even in text/html is not compatible with supporting the current web corpus. I would of course welcome a correction on this point from popular browser vendors.
:)

Under "Backwards compatibility", the draft clearly states that XHTML 2's element set depends on XML parsing:

"Because earlier versions of HTML were special-purpose languages, it was necessary to ensure a level of backwards compatibility with new versions so that new documents would still be usable in older browsers. However, thanks to XML and style sheets, such strict element-wise backwards compatibility is no longer necessary, since an XML-based browser, of which at the time of writing means more than 95% of browsers in use, can process new markup languages without having to be updated."

http://www.w3.org/TR/2005/WD-xhtml2-20050527/introduction.html#backCompat

If XHTML 2 is not taking advantage of XML to break free of the past, perhaps this needs rephrasing?

> For a
> long time now XHTML 2 has had a modular architecture, which means that
> language designers can create languages that use one or more of the
> XHTML 2 modules, and implementers can provide support for whichever
> modules they deem appropriate. This makes XHTML 2 useful not just in
> browsers and constrained devices, but also for creating Docbook-style
> languages, news formats, and so on.

If you mean some XHTML 2 modules could be reconciled with text/html processing, that's probably true. The following seem like possible examples:

* XHTML Document Module
* XHTML Structural Module
* XHTML Text Module
* XHTML Hypertext Module
* XHTML I18N Attribute Module
* XHTML Bi-directional Text Attribute Module
* XHTML Role Attribute Module
* Ruby Module
* XHTML Style Attribute Module
* XHTML Tables Module

These basically reflect features in existing text/html implementations.
Another group of modules introduces new features with (arguably) acceptable fallbacks in existing text/html browsers that could perhaps be implemented without breaking support for the text/html corpus:

* XHTML List Module
* XHTML Edit Attributes Module
* XHTML Image Map Attributes Module
* XHTML Metainformation Attributes Module
* XHTML Media Attribute Module
* XHTML Style Sheet Module

But there's a whole set of important modules that don't have acceptable text/html fallbacks and/or probably couldn't be implemented without breaking support for the text/html corpus:

* XHTML Embedding Attributes Module: Undisplayed images are not an acceptable fallback, and IIRC browser vendors say it's too hard to implement "src" on every element.
* XHTML Handler Module: Displaying raw script on the page is not an acceptable fallback.
* XHTML Image Module: Missing alternative texts and alternative text displayed beside a visible image are not acceptable fallbacks, and treating text after an 'img' tag as alternative text is not going to be possible to implement alongside supporting the existing web corpus.
* XHTML Hypertext Attributes Module: Links that don't work are not an acceptable fallback, and IIRC browser vendors say it's too hard to implement "href" on every element.
* XHTML Metainformation Module: Text following a "meta" tag is displayed in text/html; that's not an acceptable fallback for human-unfriendly content, and there is likely to be existing content in the corpus that depends on this behavior.
* XForms Module: Existing assistive technology cannot associate labels with fields, and select controls don't work; there are likely further practical problems that represent unacceptable failures.
* XHTML Object Module: You would need to use different markup to get this working in the most popular browser.
* XML Events Module: Event handling wouldn't work since it depends on the Handler Module.
(Doubtless other people's versions of these lists would be different; I certainly don't know enough about the implementation problems to provide any sort of strong opinion.)

I was talking about XHTML 2 in the round, not individual modules. I submit that if XHTML 2 had been designed with an eye towards text/html compatibility and implementability, (a) these modules would have been designed quite differently and (b) their specs would mention differences in text/html processing so that authors could avoid certain markup patterns.

These modules include some of the biggest changes from HTML4/XHTML1:

http://www.w3.org/TR/xhtml2/introduction.html#s_intro_differences

They are certainly important enough to say you cannot simply produce a straightforward "strictly conforming XHTML 2 document", serve it as text/html, and expect text/html browsers to provide an acceptable user experience for it now, or indeed ever. I think unacceptably broken forms, scripts, machine data, links, and images are sufficient disincentive to trying to serve XHTML 2 as text/html.

On the other hand, the advantages provided by using the remaining XHTML 2 modules in this fashion, instead of just using the XML serialization of HTML5 and then serving the result as text/html, are hard to understand. For one thing, the latter would give you much the same featureset but working better in existing browsers, and for another, the latter would include forms, scripts, machine data, links, and images that work in text/html.
If the basic idea of XHTML 2 has really stopped being about throwing text/html to the winds and taking advantage of XML to rationalize document markup, and started being about merely paring down HTML into a document format and using external XML facilities wherever possible, I'd imagine subsetting the following HTML5 features into separate XML Schema modules would accomplish 90% of the same use cases with 10% of the work for spec-writers, implementors, and authors:

head, title, base, script, body, section, nav, article, aside, h1, h2, h3, h4, h5, h6, header, footer, p, pre, dialog, blockquote, ol, ul, li, dl, dt, dd, cite, q, em, strong, code, ruby, rt, rp, ins, del, figure, object, table, caption, colgroup, col, tbody, thead, tfoot, tr, td, th, div, span, object

Specifying, implementing, and learning another language just to use "blockcode" instead of "<pre><code>" really doesn't seem worth it. ;)

A "strictly conforming XHTML 2" conformance checker could simply verify that the document was using only those features from HTML5, plus any other non-HTML XML modules you want to include in XHTML2 (XForms, ARIA, RDFa, XLink, XML Events, SVG, MathML, SMIL, whatever).

--
Benjamin Hawkes-Lewis
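The "simply verify" idea for a conformance checker could look roughly like the following. This is only a minimal sketch under stated assumptions, not a real validator: the names ALLOWED and disallowed_elements are hypothetical, the root "html" element is added to the list as an assumption, and elements in any non-XHTML namespace are simply skipped on the assumption that they belong to one of the other XML modules.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"

# Element names taken from the list in the message above.
# "html" is added as the document root (an assumption; it is not in the list).
ALLOWED = frozenset(
    """head title base script body section nav article aside
    h1 h2 h3 h4 h5 h6 header footer p pre dialog blockquote
    ol ul li dl dt dd cite q em strong code ruby rt rp ins del
    figure object table caption colgroup col tbody thead tfoot
    tr td th div span""".split()
) | {"html"}


def disallowed_elements(xml_text):
    """Return the sorted local names of XHTML-namespace elements not in ALLOWED.

    Elements in other namespaces (SVG, MathML, ...) are ignored here;
    checking those would be the job of their own modules.
    """
    root = ET.fromstring(xml_text)  # raises ParseError if not well-formed XML
    prefix = "{" + XHTML_NS + "}"
    bad = set()
    for element in root.iter():
        if element.tag.startswith(prefix):
            local_name = element.tag[len(prefix):]
            if local_name not in ALLOWED:
                bad.add(local_name)
    return sorted(bad)


if __name__ == "__main__":
    sample = (
        '<html xmlns="http://www.w3.org/1999/xhtml">'
        "<head><title>t</title></head>"
        "<body><p>x <em>y</em></p><blockcode>z</blockcode>"
        '<svg xmlns="http://www.w3.org/2000/svg"/></body></html>'
    )
    print(disallowed_elements(sample))
```

A check like this only looks at element names; attributes, content models, and nesting rules would still need a schema.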
Received on Thursday, 8 January 2009 13:02:03 UTC