- From: Maciej Stachowiak <mjs@apple.com>
- Date: Sun, 03 Jan 2010 19:45:13 -0800
- To: Larry Masinter <masinter@adobe.com>
- Cc: "public-html@w3.org" <public-html@w3.org>
- Message-id: <7E10ADFE-1EC4-410F-967C-32D34DA9A9D5@apple.com>
On Jan 2, 2010, at 4:21 PM, Larry Masinter wrote:

> The proposal was updated significantly, based on comments. I’ve
> tried to address the “compound” issue as well.

Here are some purely personal comments on parts of your change
proposal:

> In particular, the working group intends to support “polyglot”
> documents which are both valid XML and XHTML and also valid as HTML
> text/html; since XML workflows often require a !DOCTYPE with a
> PublicIdentifier and a SystemIdentifier, this increases the
> footprint of “polyglot” documents.

Do we have a demonstrated case of an XML workflow that requires both a
PublicIdentifier and a SystemIdentifier?

> Many of the arguments made in previous discussions about versions
> and doctypes were not careful to distinguish between “version of
> specification” and “version of implementation”. It should be noted
> that many *want* a version indicator to note “version of
> implementation”, i.e., as an indicator of “best viewed by FireFox
> 4.0 or later” or some such. However, this change proposal is very
> clearly providing for a version of a “specification”, and, in
> particular, of the HTML specification, with the possibility of “mix”
> specifications added.

Part of the reason this confusion arose is that at least one prominent
implementor said they wanted to use a "version of specification"
indicator to trigger versioning of implementation behavior (in
addition to implementation-specific triggers).

> Many of the arguments in previous discussions were arguing against
> version-specific browser behavior. But this change proposal
> specifically does NOT allow for (any additional) version-specific
> behavior, and in fact explicitly disallows it.
>
> It allows but does not require some validators to perform additional
> validation, in that there may be additional validation based on the
> PublicIdentifier or SystemIdentifier. As behavior does not depend
> on the DOCTYPE, validating the DOCTYPE is not required.

Indeed, it looks like the only MUST-level requirements are the
following:

> For these reasons, the DOCTYPE header is REQUIRED for HTML content
> served as text/html (and optional for content served as an XML media
> type), but supplying an explicit version indicator is NOT
> RECOMMENDED except in limited circumstances.
>
> The syntax of the DOCTYPE element is:
>
> <!DOCTYPE html>
> <!DOCTYPE html PUBLIC “PublicIdentifier” “SystemIdentifier”>
> <!DOCTYPE html SYSTEM “about:legacy-compat”>

[...]

> HTML documents not served as an XML media type MUST include a
> DOCTYPE header, since many browsers, in the absence of a DOCTYPE
> header, will trigger a “quirks” mode of rendering.

[...]

> However, HTML documents MUST NOT use “-//W3C//NONSGML HTML 5.0//EN”
> until the edition of this specification referenced is actually
> approved and published as a W3C Recommendation.

A consequence of this is that under your Change Proposal, documents
that trigger quirks mode would be conforming. Is that an intended
consequence? I think it is a desirable and intended feature of the
current spec that quirks mode documents are nonconforming.

Also, minor nitpick: DOCTYPE is not an element.

The following two bullet points seem contradictory:

> Except for explicitly defined behavior (used to trigger “quirks
> mode”, see section [#parse-behavior], [#quirks-mode] and [hsvonin]),
> implementations which consume HTML MUST NOT use the DOCTYPE element
> to trigger different processing behavior.

> Documents served as an XML media type MAY include a DOCTYPE header,
> either to allow compatible content (so-called “polyglot” documents
> which are both valid HTML and also valid XHTML) or to support
> version-specific XML processing. While the DOCTYPE header is not
> required, including may help in XHTML/HTML crossover.

Implementations MUST NOT use the DOCTYPE to trigger different
processing, but documents MAY use it to support version-specific
processing.
Why would documents have a need to support version-specific processing
if version-specific processing is not allowed?

> 9.1.1.2 PublicIdentifier for compound specifications
>
> Note that a PublicIdentifier only identifies a single specification,
> not a complete implementation, a suite of specifications, or a
> combination of vocabularies from multiple specifications. In order
> to construct a PublicIdentifier for such a combination requires
> publication of an actual specification which describes that
> combination.
>
> Groups wishing to support the combination of HTML and other
> specifications may supply short specifications showing how
> additional vocabularies may be used with HTML; for example, a short
> document “how to use RDFa with HTML” might be published. (This
> document would reference RDFa and HTML but not include either
> specification). In such case, the “+” format might be used:
>
> “-//W3C RDFAWG//NONSGML HTML+RDFa 20100401//EN” might reference the
> HTML+RDFA document published by the RDFA working group.
>
> The W3C Hypertext coordination group is encouraged to coordinate
> assignment of public identifiers.

This does not, in my opinion, address the compound document use case
adequately. One of the original examples for this was syndication. RSS
or Atom feeds often pull content from multiple sources which are not
under the control of the syndicator. Thus, if versioning is to serve
any purpose in such a scenario, it must be possible to label each
separate HTML fragment with its own version.

Now, one could argue that versioning is of such limited usefulness
that it's not important to serve syndication use cases; after all,
it's only intended for controlled environments. But this goes
completely against the future-proofing argument. A DOCTYPE-based
version is not a sound way to future-proof HTML in syndication feeds
against incompatible HTML changes, and a great deal of the HTML on the
Web is republished in one or more feeds.
If HTML did change incompatibly in the future, then we would certainly
need a version indicator that can be applied separately to individual
fragments of a document, such as a version attribute. This would make
the DOCTYPE-based versioning redundant and merely a potential source
of conflicting version indicators in the future.

Note also that if multiple languages may all be combined and each is
versioned, then trying to represent this in a single DOCTYPE will
result in a combinatorial explosion. Already we have the potential to
combine HTML, MathML, SVG and RDFa. If you imagine we add 4 more
languages (perhaps GRDDL, X3D, XForms, XSL-FO), and that each of the
resulting 8 languages has at least two versions, then we need
standards specifying 2^8 = 256 different DOCTYPE strings. With 10
languages having 3 versions each, we'd need 3^10 = 59049 different
DOCTYPE strings, each with its own specification. Clearly, this
approach is not scalable, compared to identifying each language
version independently.

For these reasons, I think your approach to versioning in compound
documents is not viable.

Regards,
Maciej
Received on Monday, 4 January 2010 03:45:47 UTC