- From: <bugzilla@jessica.w3.org>
- Date: Fri, 28 Jan 2011 11:50:52 +0000
- To: public-html@w3.org
http://www.w3.org/Bugs/Public/show_bug.cgi?id=11904

           Summary: <plaintext> and <xmp> in Polyglot Markup
           Product: HTML WG
           Version: unspecified
          Platform: PC
               URL: http://dev.w3.org/html5/html-xhtml-author-guide/html-xhtml-authoring-guide.html#elements-that-cannot-contain-special-characters
        OS/Version: All
            Status: NEW
          Severity: major
          Priority: P2
         Component: HTML/XHTML Compatibility Authoring Guide (ed: Eliot Graff)
        AssignedTo: eliotgra@microsoft.com
        ReportedBy: xn--mlform-iua@xn--mlform-iua.no
         QAContact: public-html-bugzilla@w3.org
                CC: mike@w3.org, public-html-wg-issue-tracking@w3.org,
                    public-html@w3.org, eliotgra@microsoft.com

The draft text on plaintext and xmp should be deleted:

]] Due to the conflict between parsing rules between HTML and XML, polyglot markup uses the following elements only if they do not contain angled brackets ("<" or ">") or ampersands ("&").[[

ISSUES:

(1) plaintext/xmp are forbidden in HTML5 - so how do they belong in this draft? (Needs separate bug too.)

According to Henri Sivonen, the Polyglot spec should only describe a subset of XML 1.0 and HTML5. But which subset? Is it the valid subset? The valid and well-formed subset? Perhaps the DOM-equal subset? Or the valid and well-formed DOM-equal subset?

Example: When you say that polyglot markup *requires* <colgroup/>, we are outside both validity and well-formedness - we are in "equality" land. The same goes for <xmp> and <plaintext>: the emphasis, as long as you discuss them at all, is on equality, and not on validity or well-formedness.

This question requires a separate bug, but I want to mention it here anyhow. In my view, Polyglot Markup should describe the HTML5-valid (and perhaps also XML 1.0-valid), XML 1.0-well-formed, DOM-equal subset of HTML5. For that reason, plaintext and xmp do not belong in Polyglot Markup, as they are not permitted in HTML5.

(2) For <plaintext>, can conflicting parsing rules ever be avoided? No!

PLAINTEXT EXAMPLE: <plaintext></plaintext>

An HTML parser will display the characters "</plaintext>" to the user. Thus it seems to me that if parsing rules are the justification, then <plaintext> must not be used in polyglot documents, as it cannot be used there without running into differences caused by the conflicting parsing rules. (Exception: <iframe><plaintext/></iframe>. But then we should also say that, for example, "<p/><p></p>" should be permitted, as it is the same issue: "<p/>" works fine as long as it is empty and a new block element follows immediately after. Plus, we are then outside the syntax that HTML5 permits.)

(3) For <xmp>, can conflicting parsing rules ever be avoided? Only as long as the author avoids any child elements and NCRs. Thus, practically speaking, no!

XMP EXAMPLE: <xmp><p>å</p></xmp>

An HTML parser will render the content of xmp literally, as code. This is impossible to replicate in XML unless one uses <![CDATA[ ]]>. However, if one places a <![CDATA[ ]]> inside, then the HTML parser will render those characters literally as well.

As for what the specification draft says: normally one would not say that the XMP example "contains" "<", ">" or "&". Instead, it contains a <p> element and an NCR. And it is child elements and NCRs, if anything, that need to be forbidden inside an xmp element that occurs in a polyglot document.

(4) No need to escape the *characters* <>&. (Needs separate bug too.)

From XML's point of view, there is nothing special about "<", ">" and "&" inside xmp and plaintext: in all XML documents, "<" and "&" must in general always be escaped. Thus they cannot occur unescaped inside xmp/plaintext or anywhere else. And, as long as they are escaped, ">" does not constitute a problem, as far as I can see. Thus, nothing special needs to be said about "<", ">" or "&" inside xmp/plaintext. Instead, it needs to be said that xmp cannot contain elements or NCRs - see (3) above.
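
The XML half of points (2) and (3) can be checked with a short sketch. It uses Python's standard xml.etree.ElementTree purely as a stand-in for "an XML parser"; the variable names are illustrative, and the "å" in the XMP example is assumed here to have been written as the NCR &#229;. The HTML half (literal rendering) is not reproduced, since it would need an HTML5-conformant parser.

```python
import xml.etree.ElementTree as ET

# XMP example from point (3). Assumption: the "å" was written as the NCR &#229;.
xmp_source = "<xmp><p>&#229;</p></xmp>"
xmp = ET.fromstring(xmp_source)

# An XML parser sees a child element and resolves the NCR to a character,
# whereas the report says an HTML parser shows the markup literally.
xmp_children = [child.tag for child in xmp]
xmp_child_text = xmp[0].text

# PLAINTEXT example from point (2): in XML the element is simply empty,
# whereas the report says an HTML parser displays "</plaintext>" as text.
plaintext = ET.fromstring("<plaintext></plaintext>")
plaintext_is_empty = plaintext.text is None and len(plaintext) == 0

print(xmp_children, repr(xmp_child_text), plaintext_is_empty)
```

In both cases the XML tree differs from what the report describes for HTML, which is the DOM inequality the points above argue from.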

CONCLUSION: Delete the entire section. Or, alternatively, say that <plaintext> MUST NOT be used but that <xmp> can be used provided that it has no children and no NCRs.
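
The fallback rule in the conclusion could be sketched as a simple textual check. This is only an illustration: polyglot_violations is a hypothetical helper, not part of any specification or library, and it scans the source with regular expressions instead of a real parser. It looks only for what the conclusion names (plaintext, child markup in xmp, NCRs in xmp).

```python
import re

# Hypothetical helper illustrating the proposed rule; regex-based, not a parser.
def polyglot_violations(source: str) -> list[str]:
    """Return human-readable problems found in `source`."""
    problems = []
    # Rule 1: <plaintext> must not be used at all.
    if re.search(r"<plaintext\b", source, re.IGNORECASE):
        problems.append("plaintext element used")
    # Rule 2: <xmp> may contain neither child elements nor NCRs.
    xmp_pattern = re.compile(r"<xmp\b[^>]*>(.*?)</xmp\s*>", re.IGNORECASE | re.DOTALL)
    for match in xmp_pattern.finditer(source):
        content = match.group(1)
        if "<" in content:
            problems.append("xmp contains markup")
        if re.search(r"&#(?:[0-9]+|[xX][0-9a-fA-F]+);", content):
            problems.append("xmp contains an NCR")
    return problems
```

For instance, polyglot_violations("<xmp>plain text</xmp>") returns an empty list, while the XMP example from point (3), written with an NCR, is flagged on both counts.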
Received on Friday, 28 January 2011 11:50:54 UTC