- From: <bugzilla@jessica.w3.org>
- Date: Fri, 28 Jan 2011 13:15:04 +0000
- To: public-html@w3.org
http://www.w3.org/Bugs/Public/show_bug.cgi?id=11909

Summary: The principles of Polyglot Markup - validity? well-formed? DOM-equality?
Product: HTML WG
Version: unspecified
Platform: PC
URL: http://dev.w3.org/html5/html-xhtml-author-guide/html-xhtml-authoring-guide.html
OS/Version: All
Status: NEW
Severity: major
Priority: P2
Component: HTML/XHTML Compatibility Authoring Guide (ed: Eliot Graff)
AssignedTo: eliotgra@microsoft.com
ReportedBy: xn--mlform-iua@xn--mlform-iua.no
QAContact: public-html-bugzilla@w3.org
CC: mike@w3.org, public-html-wg-issue-tracking@w3.org, public-html@w3.org, eliotgra@microsoft.com

PROPOSAL:

I suggest adding a *normative* scope description of Polyglot Markup, along the following lines:

]]
Polyglot Markup describes an HTML5-valid (validity), HTML5-compatible (well-formedness), XML-well-formed (well-formedness), DOM-equal (DOM equality) subset of HTML5. It does not, however, occupy itself with XML-validity. It is XML-compatible where well-formedness requires it, but it is always both HTML-valid and HTML-compatible.
[[

This could go into the intro or into a new paragraph. It would be ideal to establish a vocabulary that could be used throughout the spec. Then one could say "To use <colgroup> is a DOM-equality issue", or "<p/> cannot be used because it isn't HTML5-valid", or "An @id cannot begin with a number for XML-validity reasons". (XML 1.0 has a similar section where it defines what terms such as well-formed and valid mean.)

CURRENT STATUS

Currently, the principles of Polyglot Markup can only be gleaned from the Abstract ("identical document trees" etc.), from the Introduction ("valuable to be able to serve HTML5 documents that are also well formed XML documents") and from the title of the spec ("HTML-compatible XHTML documents").

DISCUSSION

Regarding XML-validity: for example, <div id="999"></div> is valid HTML5, but it is invalid (though well-formed) XML. If we (as I suggest) do *not* want polyglot markup to have to be XML-valid, then the draft should say so.
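The <div id="999"> case can be illustrated with a minimal sketch. It assumes that @id is declared with XML attribute type ID, so that its value must match the XML 1.0 Name production, and it approximates that production with ASCII characters only. The function names are hypothetical and not taken from any spec.

```python
import re

# ASCII-only approximation of the XML 1.0 Name production (assumption:
# the real production also admits many non-ASCII characters).
XML_NAME_ASCII = re.compile(r"[A-Za-z_:][A-Za-z0-9_:.\-]*\Z")


def is_html5_id(value: str) -> bool:
    """HTML5 rule for @id: at least one character and no space characters."""
    return len(value) > 0 and not any(ch in " \t\n\f\r" for ch in value)


def is_xml_name_ascii(value: str) -> bool:
    """True if the value matches the ASCII approximation of an XML Name."""
    return XML_NAME_ASCII.match(value) is not None


if __name__ == "__main__":
    for candidate in ("999", "a999"):
        print(candidate, is_html5_id(candidate), is_xml_name_ascii(candidate))
```

Under these assumptions "999" passes the HTML5 check but fails the XML Name check, which is the gap that the scope description would have to address one way or the other.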
Maybe polyglots should strive to be XML-valid as well? However, since the weight is on being HTML-compatible rather than XML-compatible, this is an argument in favour of ignoring XML-validity and instead putting the weight on HTML-compliance. But then we should be conscious about it and state it in the draft.

According to Henri Sivonen, the Polyglot spec should only describe a subset of XML 1.0 and HTML5. We should only read the specs and pick what is compatible with both. But which subset?

* Validity subset: the HTML-valid subset? The XML-valid subset? The HTML + XML-valid subset?
* Well-formed subset?
* Well-formed and valid?
* DOM-equal subset?
* All of the above?

The two main problems in this list are DOM equality (this is not described in any spec that we can look at) and XML-validity (should we care?). To a degree, HTML-validity/-conformance is a problem as well: it seems that HTML-conformance/-validity should not count as being as important as HTML-compatibility.

PROBLEM EXAMPLES:

<colgroup>: The draft says that polyglot markup *requires* <colgroup/>, or else the XML DOM will be different from the HTML DOM. OK. But then we are outside both validity and well-formedness and inside "equality" land, which is not described in any other standard that we could formulate a subset of. It is Polyglot Markup's task to describe the DOM-equal subset.

<xmp> and <plaintext>: Discussing these elements inside Polyglot Markup shows an emphasis on equality rather than on validity (they are HTML5-invalid) or well-formedness (they have no XML well-formedness problems). The only problem is that they work differently in HTML and XHTML.

Attributes (line feeds, tabs and CR inside attribute values): This is neither a validity issue nor a well-formedness issue. It is purely a DOM-equality issue, and only sometimes an important one.

@id: XML has some global validity rules for @id. For instance, an @id may not begin with a number. Should it matter to Polyglot Markup?
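The attribute whitespace example above can be sketched as follows. This is only an illustration, not a conformance test: Python's html.parser stands in for an HTML parser (an assumption, since it is not a conforming HTML5 parser), and xml.dom.minidom stands in for an XML parser. The helper names are hypothetical.

```python
from html.parser import HTMLParser
from xml.dom import minidom

# One attribute value containing a literal line feed and a literal tab.
SOURCE = '<p title="a\nb\tc"></p>'


class AttrCollector(HTMLParser):
    """Collects the attributes of every start tag it sees."""

    def __init__(self):
        super().__init__()
        self.attrs = {}

    def handle_starttag(self, tag, attrs):
        self.attrs.update(attrs)


def html_title(source: str) -> str:
    """Value of @title as seen by the HTML-side stand-in parser."""
    parser = AttrCollector()
    parser.feed(source)
    parser.close()
    return parser.attrs["title"]


def xml_title(source: str) -> str:
    """Value of @title after XML attribute-value normalization."""
    return minidom.parseString(source).documentElement.getAttribute("title")


if __name__ == "__main__":
    print(repr(html_title(SOURCE)))
    print(repr(xml_title(SOURCE)))
```

The same bytes are well-formed XML and raise no validity question in this sketch, yet the two parsers report different attribute values, because XML attribute-value normalization replaces literal tabs and line breaks with spaces.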
Received on Friday, 28 January 2011 13:15:09 UTC