- From: Aryeh Gregor <Simetrical+w3c@gmail.com>
- Date: Tue, 9 Feb 2010 14:35:14 -0500
- To: Paul Cotton <Paul.Cotton@microsoft.com>
- Cc: "public-html@w3.org" <public-html@w3.org>, "noah_mendelsohn@us.ibm.com" <noah_mendelsohn@us.ibm.com>
On Mon, Feb 8, 2010 at 3:56 PM, Paul Cotton <Paul.Cotton@microsoft.com> wrote:
> This comment is regarding the term "conforming document". As you know,
> the HTML 5 draft explicitly discusses [1] the conformance of Web browsers,
> noninteractive agents, conformance checkers, etc. I have found no similar
> explicit definition of "conforming documents" or some similar term.

I believe "conforming" is used in its normal English sense. A document is conforming if it obeys all "must" requirements in the HTML5 spec that logically apply to documents (as opposed to requirements that only make sense for user agents).

> * Define one or more terms such as "conforming documents". For each such
> term, provide a definition sufficiently rigorous that one can determine
> for any given string of characters (octet stream?) whether it is or is not
> conforming.

The spec defines conformance criteria for conformance checkers:

[[
Conformance checkers must verify that a document conforms to the applicable conformance criteria described in this specification. Automated conformance checkers are exempt from detecting errors that require interpretation of the author's intent (for example, while a document is non-conforming if the content of a blockquote element is not a quote, conformance checkers running without the input of human judgement do not have to check that blockquote elements only contain quoted material).

Conformance checkers must check that the input document conforms when parsed without a browsing context (meaning that no scripts are run, and that the parser's scripting flag is disabled), and should also check that the input document conforms when parsed with a browsing context in which scripts execute, and that the scripts never cause non-conforming states to occur other than transiently during script execution itself. (This is only a "SHOULD" and not a "MUST" requirement because it has been proven to be impossible. [COMPUTABLE])

The term "HTML validator" can be used to refer to a conformance checker that itself conforms to the applicable requirements of this specification.
]]
http://www.whatwg.org/specs/web-apps/current-work/multipage/infrastructure.html#conformance-requirements

The spec provides specific algorithms to check conformance where possible. I think the spec makes it very clear what a conforming document is: it must obey all the conformance requirements given in the spec (that apply to documents). "Conforming" is used in its regular English sense.

> Assuming I've got that right, it might be worth asking whether there
> should be separate terminology for conformance of documents that use only
> the features explicitly documented in HTML 5 (e.g. <p>, <table>, etc.) vs.
> documents that also use extensions from some applicable specification
> (<NoahsNewTag>).

This is what the current spec has to say:

[[
When vendor-neutral extensions to this specification are needed, either this specification can be updated accordingly, or an extension specification can be written that overrides the requirements in this specification. When someone applying this specification to their activities decides that they will recognise the requirements of such an extension specification, it becomes an applicable specification for the purposes of conformance requirements in this specification.
]]
http://www.whatwg.org/specs/web-apps/current-work/multipage/infrastructure.html#extensibility

The idea (as far as I can tell) is that HTML5 defines a specific set of conformance requirements, and any document satisfying those is a conforming HTML5 document. If another spec extends HTML5, like the HTML+RDFa spec, then documents that conform to it are conforming HTML5+RDFa (or whatever). Validators/authors/implementers/etc. can consider them conforming or not as they choose, depending on whether they want to accept that extension specification as applicable.

> II. Same as above, but apply the term "conforming document" to any syntax
> that >could have been< defined in an applicable specification. (I suspect
> that there is some syntax, such as improperly nested tags, that you would
> prohibit even applicable specifications from specifying -- you should make
> clear what syntax and processing can and cannot be defined in such
> extension specs I think).

If an external specification is accepted as applicable, it can override any requirements it sees fit. There's really no way for one spec to say other specs can't supersede it.

> IV. Encourage usage like: "conforming" for documents that use >only<
> features explicitly documented in HTML 5 and "conforming to HTML 5 as
> augmented by the XXXX and YYYY specifications" for documents that conform
> to identified extension specs.

I like this proposal best, personally. For instance, a document using RDFa would be conforming HTML5+RDFa, but (as long as RDFa is not part of the main spec) not conforming HTML5. The spec doesn't define this terminology right now -- I'm not sure whether it should.

> For what it's worth, I think I like II. or II+IV best: that is, when no
> additional specifications are explicitly called out, all the syntax that
> >could have< been defined by such an extension should be considered
> conforming. That way you don't consider a document broken just because
> you can't name the spec that gave meaning to the new constructs.

IMO, it's very important that validators raise errors if they hit unrecognized constructs. If your page passes validation, it should mean (ideally) that the page can be processed by a purely standards-based user agent. If there are unknown extensions present, they're likely either errors or non-standard, unless the validator is out of date. Of course, authors can ignore "unrecognized element foo" errors if they know that the element actually is part of a standard that the validator doesn't recognize (but hopefully the validator would be updated in that case). A validator that didn't catch the typo in <html lagn="en"> because lagn could possibly be an extension attribute would be kind of silly, I think. :)
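
To make the lagn example concrete, here is a rough sketch of a checker that raises an error on anything it doesn't recognize, unless the caller has opted in to an extension as an applicable specification. Everything in it is an assumption made up for illustration: the vocabulary tables are tiny and are not the real HTML5 or RDFa element/attribute sets, the names CORE_VOCAB, EXTENSIONS, SketchChecker and check are hypothetical, and it uses Python's stdlib html.parser rather than the parsing algorithm the spec defines.

```python
from html.parser import HTMLParser

# Hypothetical, deliberately tiny vocabulary. NOT the real HTML5 set.
CORE_VOCAB = {
    "html": {"lang"},
    "head": set(),
    "title": set(),
    "body": set(),
    "p": {"lang"},
}

# Hypothetical stand-in for an "applicable specification" that adds attributes.
EXTENSIONS = {
    "rdfa": {"html": {"prefix"}, "p": {"property", "about"}},
}


class SketchChecker(HTMLParser):
    """Records an error for every element or attribute it does not recognize."""

    def __init__(self, applicable=()):
        super().__init__()
        # Start from the core vocabulary, then merge in each accepted extension.
        self.vocab = {tag: set(attrs) for tag, attrs in CORE_VOCAB.items()}
        for name in applicable:
            for tag, attrs in EXTENSIONS[name].items():
                self.vocab.setdefault(tag, set()).update(attrs)
        self.errors = []

    def handle_starttag(self, tag, attrs):
        # html.parser lowercases tag and attribute names before calling this.
        if tag not in self.vocab:
            self.errors.append(f"unrecognized element {tag}")
            return
        for attr, _value in attrs:
            if attr not in self.vocab[tag]:
                self.errors.append(f"unrecognized attribute {attr} on {tag}")


def check(document, applicable=()):
    """Return the list of errors for document under the accepted extensions."""
    checker = SketchChecker(applicable)
    checker.feed(document)
    checker.close()
    return checker.errors
```

With these tables, check('<html lagn="en"></html>') reports the typo, check('<p property="dc:title">x</p>') reports an unrecognized attribute, and the same call with applicable=("rdfa",) reports nothing. The point of passing the extension list explicitly is that the author, not the checker, decides which extension specs count as applicable.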
Received on Tuesday, 9 February 2010 19:35:48 UTC