1.1 "the HTML specifications" -- this raises the question of scope: just which documents is this one intended to supersede, and intended by whom: its editor? the HTML5 WG? the W3C?

1.2 Scope again -- "tools that are intended to conform to this specification" is content-free!

1.3 The applications paragraph -- this is what you can _build_ on what is specified here?

1.4 "without requiring browsers to implement rendering engines that were incompatible with existing HTML Web pages." -- implies XForms _did_ require this -- true?

"The proposal was rejected on the grounds that the proposal conflicted with the previously chosen direction for the Web's evolution." -- Anyone have a reference for this?

1.5.1 "Serializability of script execution" -- what a _very_ odd thing to start with!

1.6.2 "Thus, authors and implementors who do not need such a modularization scheme can consider this specification a replacement for XHTML 1.x, but those who do need such a mechanism are encouraged to continue using the XHTML 1.1 line of specifications."

1.7 Two things, or three: 1) An abstract language; 2) In-memory representations of resources that use that abstract language; 3) Concrete syntax?

1.9 Elements (abstract?) are denoted by tags (concrete).

So this is only a non-normative "quick introduction", but the very strong emphasis on the DOM as the fundamental core of things is odd. It leaves out non-DOM-based applications, in particular any use with generic XML-based tools, and foregrounds inline script modifying an element, which is at best questionable.

"The value can also be omitted altogether if it is empty.": ??? The example given directly contradicts HTML 4.01: the example says [...] is equivalent to [...]; HTML 4.01 says it's equivalent to [...]. See also below on 2.4.2.

2.1.2 [minor] The use of typewriter font for DOM object class names is not explained, and runs counter to the W3C spec. guidelines for accessibility, as it presents a semantic distinction in a non-accessible way.

2.1.6 'resource' is used where 'representation' would be more consistent with AWWW/TAG usage. I think this has been raised elsewhere already.

2.2 The appearance of a script element in an XML document 'within a transformation expressed in XSLT' is called out for special not-as-specified-by-this-spec. treatment. But surely that applies to _all_ HTML elements found in stylesheets. Maybe the vertical bar is meant to suggest that this is an _example_ of "the semantics of [HTML] elements [being] overridden by other specifications."

2.2 I don't understand the difference between 'static' and 'dynamic' non-interactive user agents. The example doesn't help -- what properties are being assumed for "overhead displays"?

2.2 I can't figure out what this implies -- a 'for instance' would help a lot: "For the parts of this specification that are defined in terms of an events model or in terms of the DOM, [non-scripting] user agents must still act as if events and the DOM were supported."

2.2 I think this is too strong: "Authoring tools and markup generators must generate conforming documents". It's OK in my view to output well-formed-but-not-valid XML from an XML editor, for instance as an intermediate stage during authoring.

2.2 As I read the fifth-from-last para. and, back at the beginning, the fourth para. and the Note thereafter, the decoupling of document conformance from implementation conformance means that for every 'must' wrt document structure, there may be a corresponding 'parse error' or there may be what amounts to a preemptive recovery strategy. I'm curious to know whether, and if so how often, such disconnects arise. Boolean attributes appear to be a case of this.

2.2 I agree with the questions raised in existing threads about the implicit "XHTML MUST NOT be served as text/html" prohibition here.

2.2 "Entity references to unknown entities must be treated as if they contained just an empty text node for the purposes of the algorithms defined in this specification."
Surely this should be "for the purposes of implementation conformance", to avoid possible confusion wrt document conformance, where unknown entities MUST NOT occur.

2.2.1 XML support should be mandated as no less than 4th edition, and allowed for higher. Likewise "support some version" of the DOM should be more precise.

"Some parts of the language described by this specification only support JavaScript as the underlying scripting language." Hunh? For instance? Why?

2.4.2 This repeats the change to allow e.g. disabled="" -- I guess there's some implementation precedent -- this is a classic case of dumbing-down :-( A quick check suggests that _any_ value (including 'false') is treated as present in recent FF, IE, Opera, so I _really_ don't understand the motivation for this.

validator.nu rejects disabled="foo" but accepts disabled="" as HTML5, and rejects both as HTML4.

Maybe this is a good way in to a complex issue -- 2.4.2 uses 'must' language, and the traditional reference to RFC2119 is present. We also find the following in 2.2 Conformance, near the end:

"Some conformance requirements are phrased as requirements on elements, attributes, methods or objects. Such requirements fall into two categories: those describing content model restrictions, and those describing implementation behavior. Those in the former category are requirements on documents and authoring tools. Those in the second category are requirements on user agents."

So we do get that in this case conforming documents must include boolean attributes in only one of three forms, e.g. "disabled", "disabled=''" or "disabled='disabled'", and conformance checkers have to detect and signal failures to observe this constraint.

In a related point, the first of these is not XML-allowed, but this is not called out -- indeed the status of all of 2.4 vis-a-vis XHTML is unclear to me.
According to the discussion in the section para of 2.1, this section should say "does not apply to XHTML".

Another thing I don't see, after considerable searching, is anything in the parsing algorithm which would handle boolean attributes specifically at all. I looked in what I take to be the relevant part, namely 9.4.2 Tokenization, particularly 9.4.2.5--9.4.2.15. The DOM _interface_ to attribute reflected properties for them is well specified (2.8.1), but that is separate, I think.

2.4.3 Another change from HTML 4 and XHTML: "If an enumerated attribute is specified, the attribute's value must be an ASCII case-insensitive match for one of the given keywords that are not said to be non-conforming, with no leading or trailing whitespace."

In HTML 4/XHTML, enumerated attrs are whitespace-stripped before being checked. Why has HTML5 gotten stricter here? I note that in the next section leading/trailing whitespace _is_ ignored around numbers. . .
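
To make the document-conformance side of 2.4.2 and 2.4.3 concrete, here is a minimal Python sketch of the two checks as I read the text quoted above. It is not taken from the spec or from any validator: the function names, the strip_first flag and the use of None for a minimized attribute are all my own assumptions, and the 'dir' keywords are only an illustration.

```python
import string

# Fold only A-Z, which is how I read "ASCII case-insensitive";
# str.lower() would also fold non-ASCII characters.
_ASCII_LOWER = str.maketrans(string.ascii_uppercase, string.ascii_lowercase)


def boolean_attr_conforms(name, value):
    """Check the three forms discussed above for a boolean attribute.

    value is None when the attribute is written minimized (no '=value').
    Assumption: an exact match on the attribute name; case variants are
    not considered in this sketch.
    """
    return value is None or value == "" or value == name


def enumerated_attr_conforms(value, keywords, strip_first=False):
    """Check an enumerated attribute value against lowercase keywords.

    strip_first=True mimics the HTML 4-style behaviour described above,
    where surrounding whitespace is removed before the comparison.
    """
    if strip_first:
        value = value.strip(" \t\n\f\r")
    return value.translate(_ASCII_LOWER) in keywords
```

Under this reading, boolean_attr_conforms("disabled", "false") is False, even though the browsers mentioned above treat the attribute as present, and enumerated_attr_conforms(" ltr ", {"ltr", "rtl"}) is False unless strip_first=True is passed.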