Re: Request for Volunteers: Polyglot spec

From: Sam Ruby <rubys@intertwingly.net>
Date: Wed, 21 Apr 2010 19:14:16 -0400
Message-ID: <4BCF86C8.9090205@intertwingly.net>
To: Eliot Graff <eliotgra@microsoft.com>
CC: Adrian Bateman <adrianba@microsoft.com>, "public-html@w3.org" <public-html@w3.org>, "tag@w3.org" <tag@w3.org>, Tony Ross <tross@microsoft.com>, Paul Cotton <Paul.Cotton@microsoft.com>, "mjs@apple.com" <mjs@apple.com>, "plh@w3.org" <plh@w3.org>

On 04/21/2010 06:15 PM, Eliot Graff wrote:
> Today, I uploaded an EARLY draft version of a polyglot spec,
> "HTML/XHTML Compatibility Authoring Guidelines." [1]

A few QUICK comments:

> If a polyglot document uses an encoding other than UTF8 or UTF16

UTF-16 is not valid for HTML5.  I would recommend being more
prescriptive: simply recommend (or even require) UTF-8, as it is the only
encoding guaranteed to be supported by all HTML and XML parsers.

> You must specify attribute values as lowercase.

This needs to be made more specific.  A few lines after this, you
provide a counter-example: <img src="karen.jpg" alt="Karen" />

> You should use only the following named entity references

This should either become a MUST, or this document needs to cover which
DOCTYPEs are acceptable.  I would recommend going with MUST.

> The named character reference &apos; (the apostrophe, U+0027) was
> introduced in XML 1.0 but does not appear in HTML.

&apos; is in HTML5.
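
As a rough illustration (not part of the draft under review), Python's
standard library happens to ship both the HTML 4.01 and the HTML5 entity
tables, so the difference can be checked directly. The sample string
"It&apos;s" is made up for this sketch.

```python
import html
from html.entities import html5, name2codepoint

# The HTML5 table is keyed by the name including its trailing semicolon.
in_html5 = "apos;" in html5

# The older table covers the HTML 4.01 entity set, keyed by bare name.
in_html4 = "apos" in name2codepoint

# html.unescape() follows the HTML5 rules for named character references.
decoded = html.unescape("It&apos;s")

print(in_html5, in_html4, decoded)
```

The older table lacks the name, which is presumably where the draft's
statement comes from.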

> You should include a space before the trailing / and > of empty
> elements, e.g. <br />, <hr />

I haven't found this to be necessary.

> Also, you should use the minimized tag syntax for empty elements,
> e.g. <br />. The alternative syntax <br></br> allowed by XML gives
> uncertain results in many existing user agents.

I would recommend that this be a MUST.  The specific example you cite
will produce different DOMs with HTML5 and XML 1.0 parsers.
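
A minimal sketch of the XML half of that difference, using Python's
xml.etree.ElementTree (the helper name count_br and the sample markup are
invented for this illustration). An XML parser sees one br element in
either spelling. An HTML5 parser instead treats the stray </br> end tag as
a second <br> start tag, so the expanded form yields two br elements there;
that half is not reproduced here because the standard library has no HTML5
tree builder.

```python
import xml.etree.ElementTree as ET


def count_br(xml_text: str) -> int:
    """Count br elements below the root of a well-formed XML fragment."""
    root = ET.fromstring(xml_text)
    return len(root.findall(".//br"))


minimized = "<p>a<br />b</p>"
expanded = "<p>a<br></br>b</p>"

# In XML the two spellings are equivalent: one empty br element each.
print(count_br(minimized), count_br(expanded))
```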

> Given an empty instance of an element whose content model is not
> EMPTY (for example, an empty title or paragraph) do not use the
> minimized form (e.g. use <p> </p> and not <p />).

I would suggest using RFC 2119 language (MUST NOT), and changing the
example to <script src="..."> as this is a case that is particularly
problematic.
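
To make the rule concrete, here is a hedged sketch of a lint check that
flags minimized syntax on elements that are not void in HTML. The function
name, the regular expression, and the sample markup are all hypothetical;
the set of void element names is my reading of the HTML specification and
should be checked against the current draft. A regex is only an
approximation (it will misjudge attribute values containing ">" or
unquoted values ending in "/"), so treat this as an illustration rather
than a validator.

```python
import re

# Elements that HTML treats as void (assumed list; verify against the spec).
VOID_ELEMENTS = {
    "area", "base", "br", "col", "embed", "hr", "img",
    "input", "link", "meta", "param", "source", "track", "wbr",
}

# Matches a tag written in minimized form, e.g. <p /> or <script src="x" />.
SELF_CLOSED = re.compile(r"<([A-Za-z][A-Za-z0-9]*)\b[^<>]*/>")


def self_closed_non_void(markup: str) -> list[str]:
    """Return names of non-void elements written with minimized syntax."""
    return [
        match.group(1)
        for match in SELF_CLOSED.finditer(markup)
        if match.group(1).lower() not in VOID_ELEMENTS
    ]


sample = '<p /><br /><script src="a.js" /><hr/>'
print(self_closed_non_void(sample))
```

The script case matters most because an HTML parser ignores the trailing
slash, leaves the element open, and treats the markup that follows as
script text.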

- Sam Ruby
Received on Wednesday, 21 April 2010 23:15:17 UTC
