Re: Why HTML should be taught as HTML without pretending it is XML

On Jul 23, 2007, at 2:16 PM, Jon Barnett wrote:

> On 7/23/07, Robert Burns wrote:
>> Hi Jon,
>> > - Never intend to switch to serving the document as
>> > application/xhtml+xml, or have the document parsed as XML, without
>> > mowing through a long list of caveats not covered by Appendix C
>> No. These authors jumped on the bandwagon because they're excited
>> about the possibilities of XHTML and XML. They want to switch some
>> day. However, it is not their main motivation to simply avoid running
>> HTMLTidy on their content. That's what makes it an even stronger
>> cowpath: it's an indication of authors' excitement about XML and
>> XHTML and not an indication that authors are stupid (as others seem
>> to imply).
>> > - ... and do all of these things solely because they prefer the
>> > syntax? (This is the important point, because I believe that's the
>> > crux of your premise)
>> No, not solely. It's the syntax, it's the excitement about the
>> promise of XML and XHTML, it's the assuredness that once
>> implementations are ready it will be a small step to changeover, etc.
>> > By observing the cowpath, that's not what I see.
>> You just see some poor misguided saps? You haven't shown anything
>> wrong with using this appendix C syntax. Yet you think anyone using
>> it is doing something wrong. What are they doing wrong?
> I'm at a loss.  You, others, and I have pointed out countless concrete
> examples of the incompatibilities of converting XML-like HTML to real
> XML-parsed-as-XML.  It's more than a "small step to changeover".
> We've pointed out countless examples that would surprise authors
> making such a changeover.  Examples include, but are not limited to
> DOM methods (non-NS-aware methods, like createElement), the DOM tree
> (the <tbody> element that was but is no more), CSS selectors affected
> by the DOM tree, etc.

I think we should have a chapter or appendix that documents all of  
these issues: especially going from HTML5 to XHTML5 (or whatever we  
call it).

However, my intervention in this thread has simply been to try to
bring some of the wild claims about this back to solid ground. I
don't see the "problems" with using XML-like syntax and vending it as
text/html (which is what this thread is about). None of the
"countless" concrete examples had anything to do with vending
XML-like syntax HTML as text/html. Personally, I would always
recommend that anyone making DOM calls switch to namespace-aware (NS)
DOM calls when switching to XML vending (if their content even
involves DOM calls).
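
To make the NS DOM point concrete, here is a minimal sketch. It assumes
Python's standard xml.dom.minidom as a stand-in for a browser DOM (a
browser's own behavior differs in the details), and the variable names
are only illustrative. It shows that an element created with the
non-NS-aware createElement carries no namespace, so a namespace-aware
lookup does not find it:

```python
from xml.dom.minidom import getDOMImplementation

XHTML_NS = "http://www.w3.org/1999/xhtml"

# Build an empty XHTML document whose root element is in the XHTML namespace.
doc = getDOMImplementation().createDocument(XHTML_NS, "html", None)
root = doc.documentElement

# Non-NS-aware call: the new element has no namespace at all.
plain = doc.createElement("p")

# NS-aware call: the new element is explicitly in the XHTML namespace.
namespaced = doc.createElementNS(XHTML_NS, "p")

root.appendChild(plain)
root.appendChild(namespaced)

# A namespace-aware lookup only sees the element created with createElementNS.
found = doc.getElementsByTagNameNS(XHTML_NS, "p")

print(plain.namespaceURI)       # None
print(namespaced.namespaceURI)  # http://www.w3.org/1999/xhtml
print(len(found))               # 1
```

This is the kind of script change I mean: small, mechanical, and only
relevant to content that makes DOM calls in the first place.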

> What exactly are you rooting for?  Are you wanting a section of the
> spec to recommend an authoring style?  A section like Appendix C
> to explain the differences and to show how to write XML to be served
> as HTML?  The spec itself to actually REQUIRE a certain syntax?

I'm not rooting for anything in particular. I'm pretty happy with how  
the draft is shaping up on this. I think it would be useful to have  
an appendix-c-like syntax for examples, since such a syntax can be  
used without adding caveats about serialization (in chapters that  
aren't about serialization or syntax per se). However, those issues  
of differing content models should be brought to the forefront in the  
sections and subsections that define elements and their content  
models. Aside from that, I'm not looking for any major changes from
the direction the draft is heading (regarding syntax anyway). I just
want to make sure that the misunderstanding about appendix C isn't
perpetuated: namely, the claim that writing XHTML1 documents
(according to appendix C suggestions) and vending them as text/html
is dangerous. That's simply not the case. The danger is in thinking
that one can
simply take a complex deployment and move it to XML vending without  
any changes. In terms of XHTML/HTML, even then, there are basically  
no changes necessary. The semantic HTML content can stay the same.  
The CSS and the scripting may need to change. I don't want to get
into all of those issues again. We've listed many of them. I think we
should try to comprehensively list them in our own appendix. However,  
I would say that for most authors in most situations those issues add
up to rather small changes (avoiding DOM0; avoiding most named
character entities; double-checking CSS selectors; and changing to
namespaced DOM calls). Are there other issues? Probably. Should we
refer to them as countless? I don't think so. That's just more of the  
FUD I'm trying to avoid.
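
As an illustration of the named character entity item, here is a small
sketch. It assumes Python's standard xml.etree.ElementTree as a generic
XML parser reading a fragment with no DTD, and html.unescape as a rough
stand-in for how text/html resolves entity names; the helper name
parses_as_xml is hypothetical. A named entity such as &nbsp; is
rejected by the XML parser, while the numeric reference and the
predefined XML entities are accepted:

```python
import xml.etree.ElementTree as ET
from html import unescape


def parses_as_xml(markup):
    """Return True if the markup is accepted by a plain XML parser."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


named = "<p>a&nbsp;b</p>"          # HTML named entity, undefined in bare XML
numeric = "<p>a&#160;b</p>"        # numeric character reference
predefined = "<p>a &amp; b</p>"    # one of the five predefined XML entities

print(parses_as_xml(named))        # False
print(parses_as_xml(numeric))      # True
print(parses_as_xml(predefined))   # True

# The HTML entity table resolves the name to the same character
# that the numeric reference denotes.
same_character = unescape("a&nbsp;b") == ET.fromstring(numeric).text
print(same_character)              # True
```

So the fix is a simple substitution of numeric references for entity
names, not a rewrite of the content.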

Back to HTML5 though. We have two serializations. More than that, we
have a recommendation we're drafting that is trying to be  
serialization agnostic as much as possible. That means we need to try  
to simplify the exposition as much as possible (like finding a syntax  
for examples that is correct for either serialization). It also means  
we need to be careful to make clear any significant serialization  
differences throughout the text: like making content model  
differences clear. We should definitely not require an author to read  
a completely different chapter to discover that <p><ol></ol></p>  
won't work with text/html. That's a recipe for disaster. Appendix C  
doesn't create confusion about such a text/html syntax error, but the  
current draft certainly would.
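
The <p><ol></ol></p> case can be sketched as follows. The XML half uses
Python's standard xml.etree.ElementTree. The text/html half is only a
toy model that I am assuming for illustration: the class
ToyHtmlTreeBuilder and its CLOSES_P set are hypothetical, and it
implements just two parsing rules (a block-level start tag closes an
open <p>, and a stray </p> acts like an empty <p></p>), not a real HTML
parser:

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

MARKUP = "<p><ol></ol></p>"

# XML parsing: the <ol> stays nested inside the <p>.
xml_root = ET.fromstring(MARKUP)
xml_children = [child.tag for child in xml_root]


class ToyHtmlTreeBuilder(HTMLParser):
    """Toy model of two text/html rules; NOT a conforming HTML parser."""

    # Hypothetical, incomplete set of start tags that close an open <p>.
    CLOSES_P = {"ol", "ul", "div", "table", "p"}

    def __init__(self):
        super().__init__()
        self.top_level = []  # element names that end up as top-level siblings
        self.open = []       # stack of currently open elements

    def handle_starttag(self, tag, attrs):
        # Rule 1: a block-level start tag implicitly closes an open <p>.
        if tag in self.CLOSES_P and self.open and self.open[-1] == "p":
            self.open.pop()
        if not self.open:
            self.top_level.append(tag)
        self.open.append(tag)

    def handle_endtag(self, tag):
        if tag in self.open:
            # Close everything up to and including the matching element.
            while self.open.pop() != tag:
                pass
        elif tag == "p" and not self.open:
            # Rule 2: a stray </p> is treated like an empty <p></p>.
            self.top_level.append("p")


builder = ToyHtmlTreeBuilder()
builder.feed(MARKUP)
builder.close()
html_top_level = builder.top_level

print(xml_root.tag, xml_children)  # p ['ol']
print(html_top_level)              # ['p', 'ol', 'p']
```

The same markup yields one paragraph containing a list under XML
parsing, and three sibling elements under this model of text/html
parsing, which is why the content model difference belongs next to the
element definition rather than in a separate chapter.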

Take care,

Received on Monday, 23 July 2007 19:41:32 UTC