- From: <bugzilla@jessica.w3.org>
- Date: Fri, 31 Jan 2014 05:20:02 +0000
- To: public-html-admin@w3.org
https://www.w3.org/Bugs/Public/show_bug.cgi?id=24451

Bug ID: 24451
Summary: editorial comments on LCWD
Product: HTML WG
Version: unspecified
Hardware: PC
OS: Linux
Status: NEW
Severity: normal
Priority: P2
Component: HTML/XHTML Compatibility Authoring Guide (ed: Eliot Graff)
Assignee: eliotgra@microsoft.com
Reporter: liam@w3.org
QA Contact: public-html-bugzilla@w3.org
CC: eliotgra@microsoft.com, mike@w3.org, public-html-admin@w3.org, public-html-wg-issue-tracking@w3.org

There are a lot of comments here, but I think they are mostly or all editorial, except for a Process comment at the end, so I have sent them all in one comment. If you prefer I can separate them into multiple Bugzilla entries.

This is a useful and good document - I'm pleased to see it move forward, but I found quite a few minor typos and some slightly confusing passages, as noted...

Status of this Document

Please don't refer to "legacy XML" - I think you mean just "XHTML 1.x". XML in general is not deprecated by W3C.

"this recommendation" - it's not yet a W3C Recommendation, although I do hope it becomes one!

2.1 Principles

s/requiremetn/requirement/

3.1 Processing instructions and 3.2 Forbidding the XML Declaration

3. says, "character encoding MAY be left undeclared in XML" but 3.1 forbids the XML declaration, which is where an encoding would be declared in XML. The document goes on to clarify, but I suggest changing "As such, character encoding MAY be left undeclared in XML with the result that UTF-8 is still supported" to "Documents served with an XML content type therefore do not need to use any of the HTML encoding declaration methods, although if the document might be interpreted as text/html it SHOULD do so." However, the green NOTE further down restates this, so just removing the "As such" sentence would also be fine.
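As an illustration of the encoding point above, here is a minimal Python sketch using the standard library's expat-based parser. The sample document and the helper name `paragraph_text` are made up for this example; it only shows that an XML parser with no XML declaration and no BOM to go on treats the bytes as UTF-8, and rejects bytes that are not valid UTF-8.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"

# Hypothetical sample: no XML declaration, no BOM, no <meta charset>.
SAMPLE = f'<html xmlns="{XHTML_NS}"><body><p>café</p></body></html>'


def paragraph_text(raw: bytes) -> str:
    """Parse raw bytes as XML and return the text of the first XHTML p element."""
    root = ET.fromstring(raw)
    return root.find(f".//{{{XHTML_NS}}}p").text


# With nothing declared, the XML parser decodes the bytes as UTF-8.
utf8_text = paragraph_text(SAMPLE.encode("utf-8"))

# The same text saved as Latin-1 is not valid UTF-8, so parsing fails.
try:
    paragraph_text(SAMPLE.encode("latin-1"))
    latin1_ok = True
except ET.ParseError:
    latin1_ok = False

print(utf8_text, latin1_ok)
```

An HTML parser reading the same bytes as text/html has no such default, which is why the HTML encoding declaration still matters for a document that might be served either way.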
Further down you note that the I18N WG recommends [that one] always include an encoding declaration, which is helpful but may leave the reader confused as to whether this applies to HTML or to XHTML.

3.3 The DOCTYPE

The note that the string may be in mixed case or uppercase letters and still be well-formed XML is perhaps confusing since it starts talking about valid XML and then, later in the same sentence, moves to well-formed XML. Suggest,

Note For valid XML the document element named in the document type declaration must exactly match the top-level element of the document, including in case. This rule is relaxed for well-formed, rather than valid, XML documents. Since XHTML requires a lower-case <code>html</code> element, Polyglot documents <rfc>should</rfc> use lower-case <code>html</code> for the element named in the DOCTYPE declaration.

but not sure if it's worth the extra length.

It would probably be worth saying something about customized XHTML DTDs here, with element and entity declarations inside the document type definition subset within the document, or that point to an alternate DTD.

3.4 Namespaces

In XML it is the URI, not the prefix, that is the namespace, so the first paragraph (3.4.1, [HTML5] introduces..) is, formally, meaningless. What is meant, I think, is that the HTML 5 specification requires that HTML processors implicitly associate the prefixes html, svg and math with their respective URIs, which are as follows [...].

Regarding the paragraph,

[[ Note that there are other prefixed attributes that can be used beyond xlink:href (such as xml:base). Polyglot markup does not declare these prefixes via xmlns. The prefixes are implicitly declared in XML and are automatically applied to the appropriate attributes in HTML ]]

Is this a note or is it normative? It says it's a note but does not use Note markup. Also, "such as xml:base" seems far too wishy-washy for a specification. Is foaf:email such a prefixed attribute? What about xml:id?
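The prefix-versus-URI distinction can be shown with a short Python sketch using the standard library's namespace-aware XML parser. The three small documents are invented for illustration; the prefixes `h` and `foo` are arbitrary.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"
XML_NS = "http://www.w3.org/XML/1998/namespace"

# Two hypothetical documents binding different prefixes to the same URI.
doc_a = ET.fromstring(f'<h:p xmlns:h="{XHTML_NS}"/>')
doc_b = ET.fromstring(f'<foo:p xmlns:foo="{XHTML_NS}"/>')

# The parser reports the same expanded name for both: the URI is the namespace.
same_name = doc_a.tag == doc_b.tag == f"{{{XHTML_NS}}}p"

# The xml prefix is bound implicitly, so xml:lang needs no declaration.
lang_doc = ET.fromstring('<p xml:lang="en"/>')
xml_lang = lang_doc.get(f"{{{XML_NS}}}lang")

# Any other prefix (foaf here) must be declared, or the parse fails.
try:
    ET.fromstring('<p foaf:email="someone@example.org"/>')
    undeclared_ok = True
except ET.ParseError:
    undeclared_ok = False

print(same_name, xml_lang, undeclared_ok)
```

On the XML side, then, only the xml prefix is implicitly declared; a prefix such as foaf is an error unless it is bound with xmlns.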
I _think_ what is meant is,

The "xml" namespace prefix used e.g. in xml:base, xml:lang, xml:space and xml:id does not need to be declared in XML documents. See CSS namespaces [CSS3NAMESPACE] for how to use CSS selectors with these attributes.

The following paragraph seems to be attempting to say this. I don't think the "Note" means to say anything about attributes associated with namespace URIs other than the URI normally associated with the "xml" prefix.

I do like the "can be sued as CSS selectors" and have contacted my attorneys already :-)

3.5.1 Required elements and tags

The first paragraph seems superfluous, but maybe it's needed for HTML people?

In the next paragraph there's an extra comma in "optional tags, may create".

s/in their code/in their markup/

Remove the extra comma in " with regard to tags, is"

3.5.1.1

"Every polyglot markup document therefore contains an html, head, title, and body element, represented in the code with their tags." -- that's true in HTML too, as the previous section just explained, although they are not represented in the "code" [please let's call it markup, not code]. Maybe you have an extra comma there just before "represented"?

s/following source code/following markup/

3.5.1.2 Required tags examples

I think this section is talking about required _elements_, not required _tags_. Of course, in XML, the presence of an element is never inferred, so tags are always required at the start and end of element boundaries.

3.5.2 Excluded elements and tags

This should just be Excluded elements. All three XML tags (start, end, null) are used in XHTML and polyglot HTML.

Delete spurious comma in "Elements with features designed for HTML alone, are non-polyglot from the outset."

(the rationale for excluding noscript is a nonsense of course: there's also no mechanism for producing img or a or table in XML directly.
But we'll let that pass)

In this section (3.5.2) you say that noscript is not allowed, then have a non-normative note that says there are other elements that are also not allowed but which you do not list. Since this is non-normative, how should the reader know which elements have features designed for HTML? I'd say "a" and "img" are the obvious candidates, but surely you don't mean these?

3.5.3.1 Element names.

"Polyglot markup uses the correct case for element names." I think this sentence translates to, "conforming documents conform to this specification", and can be deleted. I'd suggest making the bullet list that follows it be three simple paragraphs instead.

3.5.3.2 Attribute names

Again, since no conforming document could use an "incorrect" case, I'd delete the first sentence, and maybe promote the bullet list items to paragraphs.

3.6 Element Contents

The term strictly speaking in both SGML and XML is "Element content", although I think everyone will understand "Element contents" not to be a reference to an element called Contents :-)

3.6.1

"Example: Polyglot markup uses the minimized tag syntax for void elements" It uses the empty element tag syntax. You (mis)use the "minimized form" term again in the Example and, confusingly, use the undefined term "self-closing" in the note. Please either use the same term in all places or define all the terms.

3.6.2 Raw text elements

XML does not have "comment tags" or "cdata tags". SGML does have CDATA elements, but that's not what you mean here. A better way to put it is that in HTML the content of the script and style elements is treated as if it were CDATA, so that & and < are not special except when they occur as the end tag to close the element.
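The difference in how script content is read can be sketched in Python with the standard library. Note the assumptions: `html.parser` is not a conforming HTML5 parser, but it does treat script and style content as raw text, which is enough for the comparison; the markup string and the class name `ScriptText` are invented for this example.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET

# Hypothetical script element containing unescaped < and &.
MARKUP = "<script>if (a < b && c) { run(); }</script>"


class ScriptText(HTMLParser):
    """Collect every chunk of character data the HTML parser reports."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


# HTML side: the content is raw text, so < and & pass through untouched.
html_parser = ScriptText()
html_parser.feed(MARKUP)
html_parser.close()
html_script = "".join(html_parser.chunks)

# XML side: the same bytes are not well-formed, because < and & are markup.
try:
    ET.fromstring(MARKUP)
    xml_ok = True
except ET.ParseError:
    xml_ok = False

print(repr(html_script), xml_ok)
```

The same script body is fine for the HTML reader and a fatal error for the XML reader, which is the gap the safe CDATA usage rules are trying to bridge.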
The "As a result" paragraph doesn't seem to add anything except suggesting that the editor of this document prefers HTML in some way :-)

In the last column of the table, </script and </style should have the same description as for HTML - they terminate the corresponding element.

3.6.2.2.1 Safe CDATA usage rules

s/These rules assumes that CDATA is of limited use for CSS./These rules assumes that CDATA is of limited use for CSS and therefore focus on JavaScript used with the script element./

HTML's restrictions on <script>/<style> -- probably you should say what they are, and I suggest using an "and" instead of a virgule/slash here, as it looks like part of markup syntax.

"Before the CDATA section there can only be one node - preferably only one line of code" -- by code here do you mean JavaScript code? There aren't any nodes at all in an XML document, nor in an HTML document until it's parsed, and then you get nodes in the DOM representation (XML systems mostly don't use DOM at all). So I don't understand this phrase.

EXAMPLE 12 has a </script> but no <script>, is that intended?

"Disadvantage: Less safe for templating since the comment could become treated as part of the template." I think this needs an explanation. Are you referring to XSLT templates here?

You probably need an example in which the string ]]> occurs as part of the text, to demonstrate how to handle it. You may want to mention the problem of CDATA injection in which a malicious user creates data that looks like ]]> nasty stuff here <![CDATA[

3.6.3 Escapable raw text elements

Delete spurious comma after "permitted"; you could also delete the comma after 'safe text content"

s/permittd/permitted/

3.6.5 Normal Elements

Add a missing comma after iframe element to end the parenthetical clause in "Normal elements have no special restrictions other than those that normally apply to polyglot markup.
But note that some elements, such as the iframe element must be empty"

When you say these elements must be empty, 1. which elements exactly? 2. do you mean EMPTY, using the empty element tag syntax <iframe/> ? 3. If not, what do you mean?

3.7.1 newlines

You probably need to explain that the problem is that HTML/SGML-based systems will delete the initial newline on parsing, but XML parsers will not.

3.8 Attributes

"the literal character '\t'" -- that's actually four characters. Do you mean a literal tab character or do you mean that in HTML one can use \t to represent a tab? (I have no idea which you mean)

It might be worth noting that JavaScript and CSS in attribute values are affected by attribute value normalization, because a comment will end up commenting out not to the end of the source line but to the end of the entire attribute value. (Whether CSS has comments to end of line is up for debate, but browsers behave as if it does, which is all most authors care about.)

In 3.8.1 Disallowed attributes you say that xml:space and xml:base are not allowed in HTML but are allowed on SVG and MathML elements - do you mean, even when those SVG or MathML elements occur within HTML documents? (If so, you should probably say so; as it stands it could be taken to mean that they are allowed by those specs but not when SVG or MathML are used inside HTML.)

3.8.3.1 The id attribute

Note that for valid XHTML the value of every id attribute must be unique within the document and must be a legal XML name, starting with a letter.

[[ Polyglot markup always uses character references for the less than sign (<) and ampersand (&) when they are used as characters, except when those characters appear inside a CDATA section. ]]

s/ inside a CDATA section/ inside a CDATA section or a comment/

3.10 Comments

"Polyglot markup does not begin a comment with either ">" or "->". " That's good because neither HTML nor XML do this - they use <! and <!-- respectively.
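To make the attribute value normalization point under 3.8 concrete, here is a small Python sketch using the standard library's XML parser. The onclick handler and title values are hypothetical; the sketch only shows what an XML processor hands to the application.

```python
import xml.etree.ElementTree as ET

# Hypothetical handler: a line comment, a literal newline, then a second statement.
SOURCE = '<p onclick="first(); // note\nsecond();"/>'
handler = ET.fromstring(SOURCE).get("onclick")

# A literal tab in an attribute value is normalized to a space as well...
literal_tab = ET.fromstring('<p title="a\tb"/>').get("title")

# ...whereas a tab written as a character reference survives.
referenced_tab = ET.fromstring('<p title="a&#9;b"/>').get("title")

print(repr(handler), repr(literal_tab), repr(referenced_tab))
```

After normalization the newline in the handler has become a space, so the // comment now runs to the end of the attribute value and second() is never called.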
3.11.1

s/XHTM/XHTML/

3.11.2 CSS

I think the example at the start should be [attr]{property:value;}

Remove spurious comma in "required by polyglot markup, are namespaced"

[[ As result, a selector such as [xmlns]{rule:foo} will only work in HTML – it will not work in XHTML, where it is a namespace attribute. ]]

The selector is not a namespace attribute. I think you mean, where the attribute has an associated namespace.

[[ And the same goes for prefixed attributes – even if one escapes the colon ([xml\:lang]{rule:foo}), such selectors will only work in HTML, except that for the namespace declaration for the xlink: prefix, then it works like in XML even in the HTML syntax and must thus be selected in a namespaced way in both syntaxes. ]]

This sentence is confusing for me and hard to read. Part of the problem is that the editor seems unaware of the distinction between a prefix, a namespace and a namespace URI, but most of the problem is that it's a run-on sentence. "it works like in XML" -- what works "like in XML"? Suggest rewriting as multiple sentences. I can't comment on correctness because I don't understand it, sorry.

I think this section overall is good and correct, but needs a slight polishing. Hey, it's a draft :-)

3.12 Templating restrictions

This section appears to be empty.

* What is the relationship between Polyglot and XML 1.1? Is NEL allowed in whitespace in HTML? What about C0 and C1 controls?

* Please remember to send a formal request to the XML Core Working Group to review this document; they/we may decline, or may accept these (personal) comments and endorse them, or do something else, but they must obviously be consulted just as the XML Working Group would consult the HTML Working Group in similar circumstances.

Thank you, and thank you for working on this important and helpful document.

--
You are receiving this mail because:
You are on the CC list for the bug.
Received on Friday, 31 January 2014 05:20:05 UTC