- From: Robert Burns <rob@robburns.com>
- Date: Sun, 2 Sep 2007 10:40:40 -0500
- To: Kornel Lesinski <kornel@geekhood.net>
- Cc: "public-html@w3.org" <public-html@w3.org>
Hi Kornel,

On Sep 2, 2007, at 9:37 AM, Kornel Lesinski wrote:

> On Sat, 01 Sep 2007 21:24:31 +0100, Robert Burns <rob@robburns.com>
> wrote:
>
>> I'm not sure what you're saying here. If you change your XSLT to a
>> different output mode won't it output a pure HTML serialization
>> (with no xml-isms)?
>
> It won't output XHTML as HTML. It's completely counter-intuitive,
> but that's what the spec requires:
> "The html output method should not output an element differently
> from the xml output method unless the expanded-name of the element
> has a null namespace URI;"
> http://www.w3.org/TR/xslt#section-HTML-Output-Method

OK, but doesn't that just mean the XSLT has to be authored to remove
the namespace URI from those XHTML elements before they can be output
as legacy HTML? Some of the HTML5 discussion to bring namespaces to
HTML would require a change in the XSLT recommendation. We should be
sure to liaise with that WG on that issue.

>>> I find it troublesome. The fundamental problem is that you have
>>> to observe all restrictions of XML, but you can't use XML tools
>>> anymore, because they don't care about additional limitations
>>> imposed by HTML.
>>
>> I think observing the XML restrictions is a good thing.
>> I also think the treatment of void elements explicitly with
>> something like <br/> makes it easier for authors to understand
>> what they're doing (which is the only additional restriction for
>> HTML I can think of).
>
> The same syntax can also be source of confusion in case of <script
> src=""/>.

I don't think the syntax is at all the source of the confusion. The
difference (visible in all Appendix C code) between
<script src='...' > </script> and <meta name='...' content='...' />
is the very difference authors need to understand. In other words,
authors need to understand the difference between an element that
happens to be empty and one that is defined to be canonically empty.
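As a minimal sketch of that distinction (Python, purely illustrative:
the name serialize_empty is hypothetical, and the set below assumes
the elements that the XHTML 1.0 Strict DTD declares EMPTY), a
serializer following Appendix C might write the two kinds of empty
element differently:

```python
from html import escape

# Assumption: the elements declared EMPTY in the XHTML 1.0 Strict DTD.
VOID_ELEMENTS = {
    "area", "base", "br", "col", "hr",
    "img", "input", "link", "meta", "param",
}


def serialize_empty(name, attrs):
    """Serialize an element that has no content, Appendix C style.

    name  -- element name, e.g. "meta" (hypothetical helper argument)
    attrs -- dict of attribute names to values
    """
    attr_text = "".join(
        f' {key}="{escape(value, quote=True)}"' for key, value in attrs.items()
    )
    if name in VOID_ELEMENTS:
        # Defined to be empty: minimized tag with a space before "/>".
        return f"<{name}{attr_text} />"
    # Merely happens to be empty: explicit start and end tags.
    return f"<{name}{attr_text}></{name}>"
```

Here serialize_empty("script", {"src": "a.js"}) keeps the explicit
end tag, while serialize_empty("meta", {"name": "x", "content": "y"})
uses the minimized form.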
Without the "/>" syntax, the student of HTML can more easily miss
that fact. That syntax is a very powerful pedagogical tool. I think
it's a source of understanding, not confusion.

>> Many of those problems relate to the immaturity of XML / XHTML
>> implementations and not anything about the DOM APIs themselves.
>
> I disagree. If one does intend to parse document as XML, sniffing
> will always be required when text/html is used. Incompatibilities
> between HTML and XML DOM are part of the spec: case sensitivity vs
> case folding, forbidden document.write or implied <tbody> won't
> change as implementations mature.

Well, since HTML5 proposes to add document.write to the XML
serialization of HTML, then yes, that's a part of implementation
immaturity. However, leaving that aside, we're talking about XHTML1
documents authored to the Appendix C guidelines. So all of the issues
you raise here do not apply. The remaining issues are entirely about
XHTML processing immaturity (except for the issue of CDATA sections
that Philip raised and Appendix C failed to deal with adequately).

>> The CSS issues are minor to non-existent for anyone following
>> appendix C.
>
> Indeed, it's just yet another thing authors have to be aware of,
> and it fails silently if they don't.

It doesn't fail silently if they author to XHTML 1.0 Appendix C
guidelines. Authors testing their documents in an XML processor would
see problems immediately, even if they subsequently served those
documents as text/html to finalize and apply scripts and stylesheets.

>>> I think that if a document will not work properly as XHTML, and
>>> was never intended to do, it shouldn't be called XHTML.
>>
>> I'm not clear what you're saying here. Any document that is valid
>> and well-formed XHTML 1 and also adheres to the XHTML 1.0
>> appendix C guidelines will work properly as XHTML.
>
> Yes, if such document adheres to appendix C (and possibly few other
> things) it would.
> The problem is that appendix C is not normative,
> it doesn't formalize any new language. XHTML, whether it's
> compatible or not, is allowed to be sent as text/html.

Yes, I would say the problem is that there is no Appendix C DTD.
Whether Appendix C is normative or not, there could still be an
Appendix C DTD for validation. However, despite several long threads
we still haven't identified any incompatibilities or ramifications to
sending valid XHTML 1.0 as text/html. Even with the CSS and
ECMAScript errors in the CDATA end tag and improperly added end tags,
do any browsers actually choke on these errors? If they don't, then I
don't see the problem. Even if they do, I still see no problem with
authors adhering to the set of norms that have spontaneously
developed around XHTML, Appendix C, and external scripts and
stylesheets. To me this is authors saying loud and clear that they
want this technology to move forward.

> This leads to ridiculous situation where you can have valid,
> well-formed, 100% spec-compliant XHTML that's not compatible with
> XML mode. And this is common on the web today (unless authors fall
> short of creating valid and/or well-formed XHTML in the first
> place, of course :)

I have no idea what you're talking about here. Can you give an
example where this leads to 100% spec-compliant XHTML that's not
compatible with XML processors?

> Therefore my suggestion is not to allow XHTML to be sent as
> text/html. Migration path should come from HTML5 side, which allows
> appendix C-compatible syntax now. "HTML with slashes" better
> describes what those XHTML-wannabe documents are, and there would
> be no confusion which media type applies to which language.

The migration path has already happened for many authors. They're now
waiting for XML processing implementations to mature and become
widely used by their site visitors. HTML5 can certainly help with
this.
The way it could help authors is by providing a conformance checker
that does not flag the XML-isms in their HTML. An even more important
benefit HTML5 could bring is to clean up the mess in DOM APIs that
are not sufficiently serialization-aware and fix that for authors.
That way the DOM issues will become as insignificant as the CSS
issues. HTML5 could also solve the CSS issues by making the tbody and
colgroup elements a required part of the table element's content
model (with optional tag omission in the text/html serialization).

Take care,
Rob
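
The tbody point above can be sketched as follows. This is a rough
illustration only, in Python with the standard xml.etree module: the
helper add_implied_tbody is hypothetical and only approximates what a
text/html parser does (it gathers all direct tr children into a
single tbody). It shows why a child selector such as "table > tr"
can match under XML processing yet not under text/html.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"
TR = f"{{{XHTML_NS}}}tr"
TBODY = f"{{{XHTML_NS}}}tbody"


def add_implied_tbody(table):
    """Wrap the direct tr children of a table element in one tbody.

    Simplified assumption: a real text/html parser opens the implied
    tbody where the first tr appears; this sketch does the same but
    does not model every table-section edge case.
    """
    rows = [child for child in table if child.tag == TR]
    if not rows:
        return table
    position = list(table).index(rows[0])
    for row in rows:
        table.remove(row)
    tbody = ET.Element(TBODY)
    tbody.extend(rows)
    table.insert(position, tbody)
    return table


markup = (
    f'<table xmlns="{XHTML_NS}">'
    "<tr><td>a</td></tr><tr><td>b</td></tr>"
    "</table>"
)

# As parsed by an XML processor: rows are direct children of table.
xml_table = ET.fromstring(markup)
rows_as_xml = len(xml_table.findall(TR))

# After mimicking the implied tbody: no direct tr children remain.
html_like_table = add_implied_tbody(ET.fromstring(markup))
rows_as_html = len(html_like_table.findall(TR))
rows_in_tbody = len(html_like_table.findall(f"{TBODY}/{TR}"))
```

If tbody were a required part of the content model, both trees would
have the same shape and the same selectors would apply to each.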
Received on Sunday, 2 September 2007 15:42:03 UTC