- From: Henri Sivonen <hsivonen@iki.fi>
- Date: Sun, 25 Mar 2007 12:16:51 +0300
- To: public-html@w3.org
On Mar 23, 2007, at 22:55, Daniel Schattenkirchner wrote:

> from an authors point of view I was wondering how HTML5 will handle
> doctypes

In text/html, you use <!DOCTYPE html>, which activates the standards mode in relevant browsers.

In application/xhtml+xml, no doctype is needed, but the spec cannot forbid the author from using a doctype, because forbidding it would inappropriately tamper with the realm of another layer (XML) in the language layer cake.

> (I hope we all know why they are important).

Indeed. They are important for activating the standards mode in browsers in the case of text/html.
http://hsivonen.iki.fi/doctype/

> Even if Web Applications 1.0 becomes HTML5 I don't think it can
> keep "<!DOCTYPE html>" because it probably needs versioning in it.
> The public "-//W3C//DTD HTML 5.0//EN" comes to my mind.

The doctype <!DOCTYPE html> in the WHATWG spec is not an uninformed accident. It is deliberate. For argumentation against using the public ID as a version information switch in XML, please see http://hsivonen.iki.fi/doctype/#xml

> However, I was actually wondering wether there'll be one doctype
> for the SGML and XML dialects of HTML5, or one for each dialect,
> which could result from different naming (XHTML5?).

There is no SGML dialect of HTML5. There's an HTML dialect and an XML dialect. (Even the Charter for this group says: "The Group will define conformance and parsing requirements for 'classic HTML', taking into account legacy implementations; the Group will not assume that an SGML parser is used for 'classic HTML'.")

On Mar 24, 2007, at 21:18, Jirka Kosek wrote:

> I hope that HTML5 (or whatever else name it will have) will made
> !DOCTYPE optional (at least for XML serialization).

It's optional for the XML serialization. It cannot be made optional in the text/html serialization without making triggering the quirks mode conforming, which is not what is wanted.
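
As a rough illustration of the doctype rules described above, here is a minimal Python sketch. The function name `expected_mode` and the three result labels are made up for this example, and real browser doctype sniffing is considerably more involved (see http://hsivonen.iki.fi/doctype/). The sketch also assumes that documents served as application/xhtml+xml are never rendered in the quirks mode.

```python
import re

# Matches a doctype declaration at the very start of the document
# (leading whitespace allowed). Simplification: real parsers also
# tolerate comments and a byte order mark before the doctype.
_DOCTYPE_RE = re.compile(r"\s*<!DOCTYPE\s+([^>]*)>", re.IGNORECASE)


def expected_mode(content_type: str, document: str) -> str:
    """Return 'standards', 'quirks' or 'browser-specific' (hypothetical labels)."""
    if content_type == "application/xhtml+xml":
        # Assumption: the XML serialization needs no doctype to get
        # standards-mode rendering.
        return "standards"
    match = _DOCTYPE_RE.match(document)
    if match is None:
        # text/html without a doctype triggers the quirks mode.
        return "quirks"
    if match.group(1).strip().lower() == "html":
        # <!DOCTYPE html> activates the standards mode.
        return "standards"
    # Any other doctype: the outcome depends on each browser's sniffing table.
    return "browser-specific"
```

For example, `expected_mode("text/html", "<html>")` yields `'quirks'`, while the same markup served as application/xhtml+xml yields `'standards'` under the assumption stated above.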

> HTML already offers different way of specifying version used --
> profile attribute on head element.

It's for versioning stuff like metadata profiles, and it has failed miserably in the marketplace. The attribute is obsolete as of today's WHATWG draft.

> What will be more suitable is version attribute allowed on root element
> (html) and also on other elements which can act as roots of HTML
> fragments (e.g. div). So for specifying that you are using HTML 5.0 you
> could write:
>
> <html version="5.0">
> ...
> </html>

I am opposed to requiring authors to include an incantation like that.

First, we need to consider use cases for versioning. I'll go over the usual straw men:

1) Versioning is needed so that browsers can switch to a mode needed for a particular version.

No. If the quirks mode has taught us anything about this issue with HTML, the conclusion should not be that more versioned modes are the solution. The conclusion should be that future versions of HTML and CSS must not make changes that are incompatible with real legacy content. Modes are bad for browser development and quality assurance. We shouldn't want to have more.

HTML5, including the parsing algorithm, has been carefully designed so that HTML5 can be implemented in the standards mode without breaking existing standards mode content. (However, if a browser vendor doesn't want to change HTML 4 parsing despite the by-design compatibility of the HTML5 parsing algorithm, the vendor could use the HTML5 doctype as a parser selection switch. But keeping the old parser around is not something that the spec should encourage.)

2) Versioning is needed for mobile profiles.

No. HTML5 doesn't and shouldn't have a mobile profile. The concept of a mobile profile implies a walled garden world-view. If a browser only supports a mobile profile, the browser isn't suitable for the Web because the Web will use full (X)HTML5.
On the other hand, Opera Mini is proof by implementation that the need to profile HTML under the pretext of mobile limitations is bogus.
See also http://www.w3.org/2004/04/webapps-cdf-ws/papers/opera.html

3) Versioning is needed to prepare for HTML6.

No. If HTML6 is designed well, no new processing mode is needed and HTML5 documents will work in browsers that implement HTML6. If, however, whoever designs HTML6 decides to do so badly, HTML6 can add a version incantation. HTML5 doesn't need to.

4) Versioning is needed for online conformance checking.

No. First, we need to consider what online conformance checkers are for. Do they exist so that third parties can go "Haha! He used the target attribute and specified the Strict doctype. What a bozo. Clearly, he should have known better and specified the Transitional doctype."? I don't think so.

Online conformance checkers are tools for helping with markup authoring. Therefore, it is critical to consider their use in the time frame in which authoring according to a particular version takes place. When HTML6 is ready to be deployed, it won't be critical for authors to be able to specify in the document if they meant HTML5 or HTML6. They should write HTML6 and conformance checker *defaults* should be updated accordingly. If HTML6 is a superset of HTML5, writing HTML5 and checking with an HTML6 conformance checker won't be a problem. If HTML6 deprecates or obsoletes parts of HTML5, then we won't want to make it too easy for people to keep using the bad stuff without mentioning it to them, will we?

If someone wants to keep checking against the definitions of HTML5 in the era of HTML6, I think it is reasonable to put the burden of choosing a different version from a pop-up menu in the conformance checker UI on the person who wants to do legacy checking. Compare with CSS.

5) A CMS uses an implementation-specific subset (e.g. no scripting and no forms permitted).
You want to configure a general-purpose authoring tool to limit auto-completion to this subset.

This use case actually has merit. However, it doesn't have merit as a reason for requiring all authors to include a version='5' incantation.

Discussing this issue pretty much reduces to the discussion about the bogosity of xsi:schemaLocation and about the merits of a PI for declaring the location of a RELAX NG schema in a document instance. Note that I am not saying that authoring tool auto-completion has to be RELAX NG-based. I am just saying that the relevant argumentation is the same as with the arguments about how a RELAX NG-aware editor decides which RELAX NG schema to use with a particular document.

I think XHTML5 should neither require nor forbid PIs for configuring authoring tools. This is between the author and his/her editor, and leaving the artifact in a file that gets served on the Web is mostly harmless.

I am less sympathetic to an attribute on the root element for the same purpose, but I'd be willing to concede to an optional attribute with user-defined contents for use as a hook in private authoring workflows. E.g. profile='acme-cms-scriptless-and-formless'. However, I am slightly uncomfortable about this, because it is like giving the little finger to xsi:schemaLocation.

The contents being a user-defined hook for private workflows is an important point. Normatively prescribing how you can subset XHTML doesn't work. Consider XHTML Mobile Profile. Modularization of XHTML was prepared to cater exactly to things like XHTML Mobile Profile and then the MP spec went and did not follow the prescribed module boundaries anyway.

With the schema project for (X)HTML5, fantasai and I have built in some options in the schema for dealing with HTML5 vs. XHTML5 differences and for catering to subsetting in ways that we foresee as reasonable. However, this is entirely non-normative and not endorsed by Hixie.
If someone is not happy with the options that fantasai and I were able to foresee, the schema is editable and forkable. It would be pointless to pretend that it weren't.

Since subsetters are going to do their own thing anyway, naming the subsets should be user-defined and it would be pointless to try to come up with a closed list of de jure subset names.

> Its quite common misconception that for each namespace there is a single
> schema defined somewhere.

Indeed. Online conformance checkers should probably default to the broadest feature set they support. For example, allowing embedded SVG and MathML by default. (The reason why mine doesn't, yet, is that I haven't had time to review the SVG and MathML stuff properly, yet.)

> Several different approaches for recognizing document types in a single
> namespace are in a common use. One of the easiest is usage of dedicated
> attribute for holding version information. This is case for example
> of XSLT.
>
> Example 4. Version information inside XSLT 2.0 stylesheet
> <xsl:stylesheet
>   xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
>   version="2.0">
> ...
> </xsl:stylesheet>

I think XSLT is an example of bad design with versioning. (Disclaimer: I am not an XSLT expert. I try to avoid XSLT when I can.)

If you feed an old transformation sheet to SAXON 8, it will just warn you that differences between old versions of XSLT and XSLT 2.0 are your problem and figuring out if the warning applies to your particular transformation sheet is your problem as well. If you are unsure, you should use SAXON 6.

So the version attribute doesn't give you old behavior. Downgrading the implementation version does. OTOH, the versions are incompatible enough for the new version of the engine to issue a warning.

If you consider XSLT a programming language that you run in your own environment, this might be acceptable. However, what works in such an environment doesn't work for Web stuff.
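
For contrast with an in-document version attribute, the checker-side selection argued for under point 4 could be as small as the following Python sketch. The version labels and the function name `select_version` are hypothetical and not taken from any existing conformance checker. The only point is that the default tracks the newest supported version, and that legacy checking is an explicit choice made in the checker UI.

```python
from typing import Optional

# Hypothetical version labels, newest first; purely illustrative.
SUPPORTED_VERSIONS = ["html6", "html5"]


def select_version(ui_choice: Optional[str] = None) -> str:
    """Pick the version to check against without reading the document."""
    if ui_choice is None:
        # No explicit choice: use the newest version the checker supports.
        return SUPPORTED_VERSIONS[0]
    if ui_choice not in SUPPORTED_VERSIONS:
        raise ValueError(f"unsupported version: {ui_choice!r}")
    # The person who wants legacy checking picks it from the UI.
    return ui_choice
```

Nothing in the document itself takes part in the decision, which is what distinguishes this from the version attribute approach.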

> Strictly speaking document type declaration is not version indication it
> is just reference to DTD which can be used for validation and definition
> of entities used.

Indeed.

> This for example means that you can not embeded XHTML page into
> SOAP message and identify version of XHTML used.

Considering what I said above, versioning XHTML inside SOAP messages should not be necessary. Interchange with loosely affiliated or unaffiliated parties is similar to the browser use case. And chances are you'll hit SOAP versioning incompatibilities first when you try to upgrade a SOAP interface. :-)

Personally, I am not particularly keen to design for SOAP, XSD or XSL-FO.

> Moreover request for download of private
> copy of DTD could be misused as attack against Web agent -- this DTD could
> be very long or it could use a big amount of entity declarations to
> congest XML parser.

I hope that whatever this WG does, it doesn't pretend that DTDs work on the Web.
http://hsivonen.iki.fi/no-dtd/

> Example 6. More robust way of labeling document as XHTML Print

FWIW, I think XHTML Print has remarkably little relevance to Web content or even authoring in editors.

-- 
Henri Sivonen
hsivonen@iki.fi
http://hsivonen.iki.fi/
Received on Sunday, 25 March 2007 09:17:14 UTC