[Bug 17710] New: Polyglot Markup: Make XML validity a principle.

https://www.w3.org/Bugs/Public/show_bug.cgi?id=17710

           Summary: Polyglot Markup: Make XML validity a principle.
           Product: HTML WG
           Version: unspecified
          Platform: PC
               URL: http://dev.w3.org/html5/html-xhtml-author-guide/#doctype
        OS/Version: All
            Status: NEW
          Severity: normal
          Priority: P2
         Component: HTML/XHTML Compatibility Authoring Guide (ed: Eliot
                    Graff)
        AssignedTo: eliotgra@microsoft.com
        ReportedBy: xn--mlform-iua@xn--mlform-iua.no
         QAContact: public-html-bugzilla@w3.org
                CC: mike@w3.org, public-html-wg-issue-tracking@w3.org,
                    public-html@w3.org, eliotgra@microsoft.com


Section 4. The DOCTYPE requires that, quote:

          ]] * The string html is in lowercase letters.  [[

   And for a number of reasons, this is a good rule to have. However:

PROBLEMS: 
 * To require lowercase 'html' is an XML validity constraint:
   http://www.w3.org/TR/xml/#vc-roottype
 * Polyglot Markup, however, currently operates only on an XML
   well-formedness principle - it has no XML validity principle:
   http://dev.w3.org/html5/html-xhtml-author-guide/#introduction

EXAMPLES: 
* Writing <!DOCTYPE HTML> or <!DOCTYPE hTmL> (as opposed to
   <!DOCTYPE html>) is NOT a well-formedness violation - it is thus
   NOT a fatal error in XML.
* The HTML5 validator already accepts uppercase 'HTML' for documents 
   served as application/xhtml+xml: http://goo.gl/hbZvC
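
A minimal sketch of the distinction, assuming Python's standard
non-validating parser (xml.dom.minidom, built on expat) stands in for
"an XML processor". The helper name check and the sample documents are
made up for illustration. It reports well-formedness separately from the
single validity constraint discussed here (Root Element Type), and it does
not perform full DTD validation:

```python
import xml.dom.minidom
from xml.parsers.expat import ExpatError


def check(source):
    """Return (well_formed, root_type_matches) for an XML document string.

    root_type_matches is None when the document is ill-formed or has
    no DOCTYPE at all. check() is a hypothetical helper, not a real API.
    """
    try:
        doc = xml.dom.minidom.parseString(source)
    except ExpatError:
        # A well-formedness violation is a fatal error in XML.
        return (False, None)
    if doc.doctype is None:
        return (True, None)
    # VC "Root Element Type": the name in the DOCTYPE must match the
    # root element's name. XML names are case-sensitive.
    return (True, doc.doctype.name == doc.documentElement.tagName)


XHTML_NS = 'xmlns="http://www.w3.org/1999/xhtml"'
BODY = "<head><title>t</title></head><body></body></html>"

lower = check("<!DOCTYPE html><html %s>%s" % (XHTML_NS, BODY))
upper = check("<!DOCTYPE HTML><html %s>%s" % (XHTML_NS, BODY))
broken = check("<!DOCTYPE html><html %s><p></html>" % XHTML_NS)

print(lower)   # (True, True)
print(upper)   # (True, False)
print(broken)  # (False, None)
```

The uppercase variant parses without a fatal error and fails only the
root-type comparison, which is the validity-level check.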

PROPOSAL:
   Add a new principle in the Introduction section stating that, in addition to
the other constraints (XML well-formedness and HTML-compatibility), Polyglot
Markup complies with all the XML validity constraints of the DOCTYPE and the
DTD of the document. This should be a MUST principle, as such a thing would
favour the use of the HTML5 doctype, due to its much simpler XML validity
requirements. Though I can also live very nicely with a SHOULD principle.

  Benefits of this proposal: 

    (1) It favors the use of the HTML5 DOCTYPE, since the HTML5 doctype does
not add any validity constraints - except the constraint that the DOCTYPE
itself must contain the 'html' string in lowercase. (Polyglot Markup does not
rule out other doctypes than the HTML5 doctype.)
   (2) It makes Polyglot Markup a more universal specification, that applies
even to - for example - XHTML 1.0 documents (which are considered 'obsolete but
conforming' by HTML5.)
   (3) It makes the spec more logical. After all, we cannot ignore the fact
that merely having a DOCTYPE, even one as simple as the HTML5 doctype,
DOES introduce the concept of XML validity into HTML5.
   (4) Someone using an XML toolchain to create polyglot HTML can more
easily understand how the concept of XML validity plays into Polyglot Markup.
E.g. for a document with the HTML5 doctype, an XML validity check would
potentially produce only a single error (wrong casing of the 'html' string in
the DOCTYPE). On the other hand, many of the XML validity concepts that relate
to XHTML 1.0 and XHTML 1.1 are relevant for HTML5 too (for instance, the
requirement that @id attributes must be unique).
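
As an illustration of the @id example, here is a rough sketch that treats
every attribute literally named "id" as if it were declared with type ID
(as the XHTML 1.0 DTDs do). That is an assumption: without a DTD, an XML
processor has no such type information. The function name duplicate_ids
and the sample markup are hypothetical:

```python
import xml.etree.ElementTree as ET
from collections import Counter


def duplicate_ids(source):
    """Return the sorted list of id values that occur more than once.

    Assumes every un-namespaced 'id' attribute is of type ID.
    """
    root = ET.fromstring(source)
    counts = Counter(
        el.get("id") for el in root.iter() if el.get("id") is not None
    )
    return sorted(value for value, n in counts.items() if n > 1)


SAMPLE = (
    '<html xmlns="http://www.w3.org/1999/xhtml"><body>'
    '<p id="a"/><p id="a"/><p id="b"/>'
    "</body></html>"
)

print(duplicate_ids(SAMPLE))  # ['a']
```

The sample is well-formed, so a non-validating parser accepts it. The
duplicate only surfaces once uniqueness of ID values is checked.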

ALTERNATIVE PROPOSALS:

   If we do not introduce the XML validity constraint, we need to take one 
   of the following two actions instead:

   ALT 1: Turn the requirement to use lowercase 'html' into an
          informational note about how to cater for validating XML
          processors:
          "To cater for validating XML processors, the string
           html should be in lowercase." 

   ALT 2: Delete the entire requirement that 'html' has to be lowercase.
           Leave it all to XML and HTML5.


Received on Tuesday, 10 July 2012 12:06:27 UTC