Re: Backward-compatibility of text/html media type (ACTION-334, ACTION-364) from Henry S. Thompson on 2010-02-02 (www-tag@w3.org from February 2010)

From: Henry S. Thompson <ht@inf.ed.ac.uk>
Date: Tue, 02 Feb 2010 12:32:45 +0000
To: Dan Connolly <connolly@w3.org>
Cc: www-tag@w3.org
Message-ID: <f5b4olz7r82.fsf@calexico.inf.ed.ac.uk>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Three points:

 1) As Julian says, DOCTYPE is not the only issue;

 2) Ian Hickson's response appears to me to confuse two separate
    issues.  We're not contesting that the HTML 5 spec can define
    conformance as it currently does: previous HTML specs have
    eliminated features and ruled old documents non-conforming to the
    new spec.  What's at issue is whether or not such documents can be
    labelled 'text/html'.  Equating the class "can be served as
    text/html" with the class "conforms to this spec" is what we are
    objecting to -- that's _not_ something previous HTML specs have
    done.

 3) The new text in section 7.2.5.4, The "initial" insertion mode, is
    indeed a long way from the more restrictive approach taken in previous
    drafts.  Whether attempting to enumerate all possible historically
    used system and public ids is a good idea is unclear to me, but as
    it stands I think the section is buggy, in at least four respects:

   a) The optionality bits in the first bulleted list of 4 cases are not
      what I expect: for the XHTML cases the public identifier should
      be optional -- e.g.
   <!DOCTYPE html SYSTEM "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
      is perfectly well-formed XML/XHTML and is widely used;

   b) Similarly for the HTML 4 and preceding cases -- the PUBLIC
      identifier is optional, so e.g.
   <!DOCTYPE HTML SYSTEM "http://www.w3.org/TR/REC-html40/strict.dtd">
      can begin a conformant HTML 4.0 document.

   c) In the second (quirks-related) bulleted list, why are the
      transitional and frameset XHTML alternatives not allowed, e.g.
      "-//W3C//DTD XHTML 1.0 Transitional//EN" ?

   d) How does control ever get to that list in any case?  The first
      paragraph says if none of the first four bullets apply, we have
      a 'parse error'...  I realise this is a very basic
      spec-reading issue, but I can't actually tell from the
      definition of 'parse error' [1] what happens next.  Where is the
      'below' referred to by:

      "The error handling for parse errors is well-defined: user
       agents must either act as described below when encountering
       such problems. . ."

       ?

And, finally, is a document with a 'parse error' a conformant
_document_?  All of this section is about agent behaviour, whereas the
original point was about whether or not e.g. 
   <!DOCTYPE html SYSTEM "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
   <html><body><p>Hello world.</p></body></html>
is an 'HTML document' as defined by the spec.
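
To make the syntactic point in (a) and (b) concrete, here is a rough
Python sketch that pulls the name, public identifier and system
identifier out of a DOCTYPE declaration.  It is only an illustration:
the pattern and the function name parse_doctype are my own assumptions,
it is deliberately loose, and it is not the algorithm in the HTML 5
draft nor a conforming SGML or XML parser.  It just shows that the
SYSTEM-only forms quoted above carry a system identifier and no public
identifier at all.

```python
import re

# Illustrative only: a loose pattern for the DOCTYPE forms discussed above.
# NOT the HTML 5 parsing algorithm and not a full SGML/XML parser.
DOCTYPE_RE = re.compile(
    r'''<!DOCTYPE\s+(?P<name>[^\s>]+)
        (?:\s+(?:
            PUBLIC\s+(?P<q1>["'])(?P<public>.*?)(?P=q1)
            (?:\s+(?P<q2>["'])(?P<system1>.*?)(?P=q2))?
          |
            SYSTEM\s+(?P<q3>["'])(?P<system2>.*?)(?P=q3)
        ))?
        \s*>''',
    re.IGNORECASE | re.VERBOSE,
)


def parse_doctype(text):
    """Return (name, public_id, system_id), or None if no DOCTYPE leads the text."""
    m = DOCTYPE_RE.match(text.lstrip())
    if m is None:
        return None
    # The system identifier can follow either PUBLIC "..." or SYSTEM.
    system_id = m.group("system1")
    if system_id is None:
        system_id = m.group("system2")
    return m.group("name"), m.group("public"), system_id


if __name__ == "__main__":
    doc = (
        '<!DOCTYPE html SYSTEM "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">\n'
        "<html><body><p>Hello world.</p></body></html>"
    )
    print(parse_doctype(doc))
```

Run on the two-line document above, the public identifier comes back
as None, so that is the case any enumeration of acceptable DOCTYPEs
would have to cover.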

ht

[1] http://dev.w3.org/html5/spec/Overview.html#parse-error
- -- 
       Henry S. Thompson, School of Informatics, University of Edinburgh
                         Half-time member of W3C Team
      10 Crichton Street, Edinburgh EH8 9AB, SCOTLAND -- (44) 131 650-4440
                Fax: (44) 131 651-1426, e-mail: ht@inf.ed.ac.uk
                       URL: http://www.ltg.ed.ac.uk/~ht/
[mail really from me _always_ has this .sig -- mail without it is forged spam]
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.6 (GNU/Linux)

iD8DBQFLaBttkjnJixAXWBoRAu1fAJkBbXMXeE/4GyLXCmBVnrcBJf7hOQCdEtYn
9Tzos3QpmegQnc9CqoD/fAs=
=aYPW
-----END PGP SIGNATURE-----

Received on Tuesday, 2 February 2010 12:33:25 UTC