- From: Peter Foti (PeterF) <PeterF@SystolicNetworks.com>
- Date: Mon, 6 Jan 2003 16:57:21 -0500
- To: "'Ian Hickson'" <ian@hixie.ch>, "'Nick Boalch'" <nick@fof.durge.org>
- Cc: "'www-html@w3.org'" <www-html@w3.org>
Ian,

I followed that link and read your document. Don't take this as a personal attack, but some of the points that you make are not quite accurate in my view. Your argument does not seem to take into consideration the case where an XHTML document is meant to be treated as HTML. From the XHTML 1.0 recommendation:

<snip>
It is intended to be used as a language for content that is both XML-conforming and, if some simple guidelines are followed, operates in HTML 4 conforming user agents.

Developers who migrate their content to XHTML 1.0 will realize the following benefits:

XHTML documents are XML conforming. As such, they are readily viewed, edited, and validated with standard XML tools.

XHTML documents can be written to operate as well or better than they did before in existing HTML 4-conforming user agents as well as in new, XHTML 1.0 conforming user agents.

XHTML documents can utilize applications (e.g. scripts and applets) that rely upon either the HTML Document Object Model or the XML Document Object Model [DOM].

As the XHTML family evolves, documents conforming to XHTML 1.0 will be more likely to interoperate within and among various XHTML environments.
</snip>

Having said that, I will now dive into your arguments:

<Ian>
* Current UAs are HTML user agents (at best) and certainly not XHTML user agents (certainly not when sent as text/html), so if you send them XHTML you are sending them content in a language which is not native to them, and relying on their error handling.
</Ian>

As the XHTML recommendation stated, XHTML documents are intended to operate in HTML 4 conforming agents. Sending an XHTML document as text/html seems perfectly fine (when the document in question is meant to be viewed as HTML by HTML 4 conforming agents).

<Ian>
* <script> and <style> elements in XHTML may not have their contents commented out, a trick frequently used in HTML documents to hide the contents of such elements from legacy UAs.
[1]

[1] Because in XHTML, <script> and <style> elements are #PCDATA blocks, not #CDATA blocks, and therefore <!-- and --> really _are_ comments tags, and are not ignored by the HTML parser.
</Ian>

This is interesting, and it leads me to wonder if this is a typo in the recommendation. The HTML recommendation states that a script element contains %Script data, which is defined as CDATA. The XHTML recommendation also defines %Script as CDATA, but the script element contains (#PCDATA) instead. I don't know if this is a mistake in the recommendation or not. However, PCDATA can contain CDATA. And again, since the document is meant to be viewed as HTML by HTML 4 conforming agents, comments will be treated as such when the document is served as text/html.

<Ian>
* XHTML documents that use the "/>" notation, as in "<link />", are not valid HTML documents.
</Ian>

I don't really have a good argument for this case, other than HTML agents are generally very forgiving regarding valid documents. As stated in the HTML 4 documentation at:
http://www.w3.org/TR/1999/REC-html401-19991224/appendix/notes.html#h-B.1

If a user agent encounters an attribute it does not recognize, it should ignore the entire attribute specification (i.e., the attribute and its value).

I admit, I don't have a really strong argument on this one, but I do recognize that most (all?) HTML 4 agents will not have any problems with this notation.

<Ian>
* Document sent as text/html are handled as tag soup [2] by most UAs. Since most authors only check their documents using one or two UAs, rather than using a validator, this means that authors are not checking for validity, and thus most XHTML documents on the web now are invalid. Therefore the main advantage of using XHTML, that it has to be valid, is lost if the document is then sent as text/html.
</Ian>

You are presuming that all authors will fail to validate their XHTML document.
This is an authoring issue and you can't use this as a reason why using text/html for XHTML is bad. Authors will have to catch up someday and start writing valid documents (if they want to find work). Better to get them on track now than to keep waiting for some overnight miracle. :) Seriously, though, if *I* am willing to check the validity of my documents, and I want to send them as text/html, then I will be able to take advantage of using XHTML.

<Ian>
* If you ever switch your XHTML documents from text/html to text/xml, then you will in all likelyhood end up with a considerable number of XML errors, meaning your content won't be readable by users. (Most XHTML documents do not validate.)
</Ian>

This is the same argument as the previous, just in different clothing. I *do* write valid XHTML documents, and since I am writing them to act as HTML, I *don't* want to switch them from text/html to text/xml.

<Ian>
* A CSS stylesheet written for an HTML document has subtly different semantics in an XHTML context (e.g. the <body> element is not magical in XHTML).
</Ian>

I agree... and that's why I want to serve those documents as text/html instead of text/xml. As I just wrote, I don't want to switch those documents from text/html to text/xml.

<Ian>
* A script written for an HTML document has subtly different semantics in an XHTML context (e.g. element names are uppercase in HTML, lowercase in XHTML).
</Ian>

I assume you are referring to the DOM for each of these? Again, this is not that big of an issue, especially since I have no intention of an HTML to XML conversion anytime soon.

<Ian>
* If a user saves an XHTML-as-text/html document to disk and later reopens it locally, triggering the content type sniffing code since filesystems typically do not include file type information, the document could be reopened as XML, potentially resulting in validation errors, parsing differences, or styling differences.
</Ian>

It depends on what application the user has associated with the file extension, does it not? If the user saves the file with a .htm extension, then his/her HTML User Agent will most likely be the one to open the file.

<Ian>
* The only real advantage to using XHTML rather than HTML is that it is then possible to use XML tools with it. However, if tools are being used, then the same tools might as well produce HTML for you. Alternatively, the tools could take SGML as input instead of XML.
</Ian>

No, they should not produce HTML (I presume you mean HTML 4 with missing end tags, etc.). If they did, then the XML tool would have to guess where elements ended if they re-opened the generated HTML file. Much better to produce XHTML documents that can be viewed as HTML 4. Also, not sure what tools you use, but the ones I work with don't take SGML. SGML is too loose... the point is that they can validate as XML. Also, this is not the only real advantage.

<Ian>
* HTML 4.01 contains everything that XHTML contains, so there is little reason to use XHTML in the real world. It appears the main reason is simply "jumping on the bandwagon" of using the latest and (perceived) greatest thing.
</Ian>

True. However, documents that conform to XHTML may perform better than a document that conforms only to HTML 4 because all of the closing tags are defined. The browser doesn't have to do any guess work to try to figure out where they go. And you'll probably say that HTML documents can be written with all of their closing tags as well, but the documents will validate without them, making it more likely that the developer could miss some and not realize it. And you'll probably say that validators can be configured to require all closing tags, but why go through that trouble when you could just write the document as XHTML?
You'll be more likely to write cleaner code, you won't have to configure a validator to your own special needs, and you will probably have a better understanding of both XML and HTML instead of just HTML.

Much of your argument seems to revolve around converting HTML documents to XML documents, which is a lot of work (mostly presentational). But my argument is that I want to display XML compatible documents as HTML *BECAUSE* it is so much less work (and because there are so many HTML agents vs. XML agents). At the same time, I get the benefits of using XML tools if I want... I could convert my XHTML document to some other document using XSLT if I wanted... can't do that with HTML. I guess I just don't see why anyone would NOT want to write their documents as XHTML.

Phew... I haven't even gotten to read the rest of that document ("Why UAs can't handle XHTML sent as text/html as XML" and on...). Though I will stress that since I'm not wanting the UA to handle XHTML sent as text/html as XML, I probably don't need to read that section. :)

Anyway, I'm outta time for today.

Regards,
Peter Foti

> -----Original Message-----
> From: www-html-request@w3.org [mailto:www-html-request@w3.org]On Behalf Of Ian Hickson
> Sent: Monday, January 06, 2003 1:41 PM
> To: Nick Boalch
> Cc: www-html@w3.org
> Subject: Re: HTML or XHTML - why do you use it?
>
> On Mon, 6 Jan 2003, Nick Boalch wrote:
> >>>
> >>> [1] <URL: http://www.hixie.ch/advocacy/xhtml>, for example.
> >>
> >> Think I've read this before. It only talks about why one shouldn't send
> >> XHTML as text/html, right?
> >
> > More or less. It's conclusion is that XHTML delivered as text/html is
> > broken and XHTML delivered as text/xml is risky, so authors intending
> > their work for public consumption should stick to HTML 4.01.
>
> Wow, I've never seen someone summarise that document so succintly.
>
> Do you mind if I use that as the abstract?
>
> --
> Ian Hickson
> )\._.,--....,'``. fL
> "meow" /, _.. \
> _\ ;`._ ,.
> http://index.hixie.ch/
> `._.-(,_..'--(,_..'`-.;.'
Received on Monday, 6 January 2003 16:47:28 UTC