- From: Boris Zbarsky <bzbarsky@MIT.EDU>
- Date: Fri, 14 May 2010 10:28:25 -0400
- To: Leif Halvard Silli <xn--mlform-iua@xn--mlform-iua.no>
- CC: public-html@w3.org
On 5/14/10 10:00 AM, Leif Halvard Silli wrote:
> That is not typical for XHTML vs HTML syntax - XHTML syntax typically
> uses .html as extension.

Or more precisely, most things that are "XHTML syntax" are nothing of the sort; they just have a doctype that pretends the document is XHTML and some attempts at being XHTML, but aren't even well-formed XML, much less valid HTML. The documents that browsers actually treat as XHTML most definitely do not have a .html extension.

> There are some exceptions - most notably in Web browsers

Right. Who are typically the final consumers of the files in question.

>> Having a file with a .html extension would tend to mean you want it
>> treated as an HTML file on most of the currently-popular desktop
>> operating systems.
>
> For parsing, then yes. For editing, then less so.

If you're trying to maintain a polyglot document, agreed. But the fact of the matter is that if you're doing that you need to tell your editor so. The simplest way to do that for HTML5/XHTML5 documents, most likely, is to use a .xhtml extension and an HTML5 doctype, right?

>> Hold on. We were just talking about wysiwyg HTML/XHTML editors, no?
>> Those are very much NOT text editors.
>
> Subject of e-mail: "ISSUE-4 - versioning/DOCTYPEs". KompoZer is an
> example of an editor that relies on the doctype when it decides the
> syntax to follow. Other editors, including both text editors and
> WYSIWYG editors, also seem to rely on the doctype for choosing syntax.

Yes, but is that a hard requirement? That is, going forward they need to be modified anyway to handle whatever the HTML5/XHTML5 doctype(s) are. Given that, does my proposal above to use .xhtml extension and HTML5 doctype for polyglot documents not work?

>> Yep. Then again, the text editor I use on a regular basis does make
>> a quite clear distinction between HTML and XML modes.
>
> I will try to find out what editor you use. ;-)

Emacs. It's all about modes. ;)

> But, based on the file suffix *only*?

That's the simplest thing, yes, and the one set up by default, though of course you can set up your own conditions for picking the mode using a Turing-complete programming language that has full access to the file data.

> I admit that it doesn't make
> sense to use HTML4 alike syntax in a .xhtml file. But the question is
> also about .html.

And again, unless the editor _parses_ your polyglot .html file as XML it will almost certainly fail to create a useful polyglot document when it saves. I have a hard time believing that most editors parse .html files as XML even if they sniff the XHTML doctype (again, because most such files are not well-formed XML).

> Yes. But I think that, to a degree, some DOCTYPEs already cause
> polyglot mode. E.g. KompoZer turns <img></img> into <img />.

That's just a matter of the fact that Gecko's editor (and presumably KompoZer too, if in a different form) has a hardcoded list of empty HTML tags and tries to make use of it. This doesn't even have to be a mode switch. It could just be done all the time.

> If we say that HTML4 vs XHTML1 is like HTML5 vs XHTML5, then it is
> simple to discern between HTML4 and XHTML1, but impossible to discern
> HTML5 versus XHTML5 (versus quirks-mode HTML).

You can easily tell what the document will be _consumed_ as for HTML5 vs XHTML5, no?

>> Likewise for a
>> non-polyglot-aware X(HT)ML editor used on an XHTML document.
>
> Given the error correction in text/html, this has a much higher chance
> to work, IMHO.

No. Your typical non-polyglot-aware XML editor will turn <div></div> into <div/> and then you lose in the HTML mode.

> Also, even if it is mostly harmless (except for <br></br> - though 2
> instead of 1 line break is also often pretty harmless), XHTML editors
> tend to prefer <element /> over <element></element> - at least when
> creating XHTML1 documents.

Right. See above about <div>; it's not "mostly harmless" but a fundamental issue.

> That far - I don't know.
> ;-) But at least we are on the same page when
> it comes to 'polyglot mode' - such a mode is needed. And some editors
> might choose to offer only that mode, I think. The question is what to
> use to discern between those modes.

When would an editor that has a polyglot mode not want to use it?

-Boris
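
To make the <div></div> versus <div/> point concrete, here is a minimal Python sketch. It is an illustration only, not Gecko's or KompoZer's code. The names VOID_ELEMENTS and serialize_polyglot are hypothetical, the void-element list is an assumption based on the current HTML syntax rules, and the sketch assumes a document without XML namespaces. It contrasts a generic XML serializer, which collapses every empty element, with one that consults a hardcoded list of empty HTML tags all the time.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape, quoteattr

# Assumed list of HTML void elements (the "hardcoded list of empty HTML tags").
VOID_ELEMENTS = {
    "area", "base", "br", "col", "embed", "hr", "img",
    "input", "link", "meta", "param", "source", "track", "wbr",
}


def serialize_polyglot(elem):
    """Serialize an element so the output reads the same as HTML and as XML.

    Void elements become <tag />; every other element keeps an explicit
    end tag, even when it is empty. Namespaces are not handled.
    """
    attrs = "".join(f" {name}={quoteattr(value)}"
                    for name, value in elem.attrib.items())
    if elem.tag in VOID_ELEMENTS:
        # Void elements cannot have content, so any children are ignored here.
        out = f"<{elem.tag}{attrs} />"
    else:
        inner = escape(elem.text or "")
        inner += "".join(serialize_polyglot(child) for child in elem)
        out = f"<{elem.tag}{attrs}>{inner}</{elem.tag}>"
    return out + escape(elem.tail or "")


doc = ET.fromstring('<body><div></div><img src="a.png"></img><br></br></body>')

# A generic XML serializer collapses every empty element, including <div>.
naive = ET.tostring(doc, encoding="unicode")

# The list-driven serializer only collapses void elements.
polyglot = serialize_polyglot(doc)

print(naive)
print(polyglot)
```

In the first output the empty div is collapsed. A text/html parser ignores the trailing slash on a non-void element and treats it as an unclosed start tag, so the content that follows ends up nested inside that div. The second output keeps the explicit </div> and is read the same way by both parsers.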
Received on Friday, 14 May 2010 14:29:01 UTC