Re: Notes on the draft polyglot document

From: Henri Sivonen <hsivonen@iki.fi>
Date: Wed, 9 Jun 2010 05:56:03 -0700 (PDT)
To: James Graham <jgraham@opera.com>
Cc: HTML WG <public-html@w3.org>, TAG List <www-tag@w3.org>, Tim Berners-Lee <timbl@w3.org>
Message-ID: <968717767.395527.1276088163342.JavaMail.root@cm-mail03.mozilla.org>

"James Graham" <jgraham@opera.com> wrote:

> On 06/09/2010 01:35 PM, James Graham wrote:
> > I don't think it makes sense to make this document normative on that
> > basis. The allowed content of a polyglot document is purely inferred
> > from other, already normative, texts. Giving the same status to the
> > underlying rules and the inferred rules seems like a recipe for trouble
> > since one is effectively defining the same thing in multiple places.
> >
> Hmm, so this is more complex than I first thought, since there are also
> judgment calls about whether features are considered compatible enough
> to be "polyglot". Nevertheless I would prefer that there is a clear
> division between the actual rules that define what the term "polyglot"

I think the polyglot document should be a set of inferences and, therefore, should be informative. If the strict inferences are inconvenient, so be it. When the document is informative, readers are free to settle for not-truly-polyglot at the level that suits their use cases. If a WG starts deciding which inferences it doesn't like and ignores them, the result is arbitrary and doesn't really document polyglotness.

However, there is one relaxation that I'd make on the polyglot definition level, since otherwise there can be no polyglot documents at all: I'd allow xmlns="http://www.w3.org/1999/xhtml" on the start tag of the root element even though it parses to a namespace declaration in XML and to an attribute in no namespace in HTML.
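
To illustrate the discrepancy, here is a minimal sketch using only the Python standard library. It is an approximation: `html.parser` is a tokenizer, not a conforming HTML tree builder, and the `RootAttrs` class and `DOC` string are made up for this illustration. It only shows that the XML side consumes xmlns as a namespace declaration while the HTML side reports it as an ordinary attribute.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

# Hypothetical sample document, not taken from any draft.
DOC = ('<html xmlns="http://www.w3.org/1999/xhtml">'
       '<head><title>t</title></head><body></body></html>')

# XML view: xmlns is a namespace declaration. It ends up in the element's
# qualified name and does not appear among the attributes.
root = ET.fromstring(DOC)
xml_tag = root.tag
xml_attrs = dict(root.attrib)


class RootAttrs(HTMLParser):
    """Records the attributes of the first start tag seen."""

    def __init__(self):
        super().__init__()
        self.root_attrs = None

    def handle_starttag(self, tag, attrs):
        if self.root_attrs is None:
            self.root_attrs = dict(attrs)


# HTML view: xmlns is reported as a plain attribute on <html>.
parser = RootAttrs()
parser.feed(DOC)
parser.close()
html_attrs = parser.root_attrs

print(xml_tag)     # {http://www.w3.org/1999/xhtml}html
print(xml_attrs)   # {}
print(html_attrs)  # {'xmlns': 'http://www.w3.org/1999/xhtml'}
```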

Other than that, I'd define a document to be a polyglot (X)HTML document if processing it as text/html and as application/xhtml+xml yields exactly the same DOM (ignoring the CDATA section exposure domain modeling error in the DOM itself, since it would be wrong to tie the inferences to the design bugs of the explanatory device).
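
As a rough sketch of what such a check could look like, the following Python builds a simplified tree from each kind of processing and compares them. Everything here is an assumption made for illustration: `is_polyglot_sketch`, `HtmlTree` and `xml_tree` are hypothetical names, the standard-library `html.parser` does not implement the real text/html tree construction rules (no implied elements, no whitespace handling, no special end tag cases), and doctypes, comments and CDATA sections are ignored. A real check would need a conforming HTML parser and an actual DOM comparison.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

XHTML_NS = "http://www.w3.org/1999/xhtml"
# Void elements: the only ones where a trailing "/>" closes the element in HTML.
VOID = {"area", "base", "br", "col", "embed", "hr", "img", "input",
        "link", "meta", "param", "source", "track", "wbr"}


def split_name(qname):
    """Split ElementTree's '{ns}local' form into (ns, local)."""
    if qname.startswith("{"):
        ns, local = qname[1:].split("}", 1)
        return ns, local
    return None, qname


def xml_tree(elem):
    """Normalize an XML element to (ns, name, attrs, children)."""
    ns, name = split_name(elem.tag)
    children = [elem.text] if elem.text else []
    for child in elem:
        children.append(xml_tree(child))
        if child.tail:
            children.append(child.tail)
    return (ns, name, sorted(elem.attrib.items()), children)


class HtmlTree(HTMLParser):
    """Very rough stand-in for HTML tree construction (an approximation)."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.root, self.stack = None, []

    def handle_starttag(self, tag, attrs):
        if not self.stack:
            # The one relaxation: ignore xmlns on the root element.
            attrs = [(k, v) for k, v in attrs if k != "xmlns"]
        node = (XHTML_NS, tag, sorted(attrs), [])  # HTML elements get the XHTML namespace
        if self.stack:
            self.stack[-1][3].append(node)
        elif self.root is None:
            self.root = node
        self.stack.append(node)

    def handle_startendtag(self, tag, attrs):
        self.handle_starttag(tag, attrs)
        if tag in VOID:  # on non-void elements the slash is ignored
            self.handle_endtag(tag)

    def handle_endtag(self, tag):
        # Close the nearest open element with this name, plus anything inside it.
        for i in range(len(self.stack) - 1, -1, -1):
            if self.stack[i][1] == tag:
                del self.stack[i:]
                break

    def handle_data(self, data):
        if self.stack:
            kids = self.stack[-1][3]
            if kids and isinstance(kids[-1], str):
                kids[-1] += data  # merge adjacent text runs
            else:
                kids.append(data)


def is_polyglot_sketch(doc):
    """True if both simplified trees match; False if not well-formed XML."""
    try:
        as_xml = xml_tree(ET.fromstring(doc))
    except ET.ParseError:
        return False
    builder = HtmlTree()
    builder.feed(doc)
    builder.close()
    return builder.root == as_xml


HEAD = '<html xmlns="http://www.w3.org/1999/xhtml"><head><title>t</title></head>'
GOOD = HEAD + '<body><p id="a">hi<br/>there</p></body></html>'
UPPER = HEAD + '<body><DIV>x</DIV></body></html>'               # names differ by case
SELF_CLOSED_DIV = HEAD + '<body><div/><p>x</p></body></html>'   # nesting differs
NO_XMLNS = '<html><head><title>t</title></head><body></body></html>'  # namespace differs
```

Under these assumptions, `is_polyglot_sketch(GOOD)` returns `True` and the other three samples return `False`. `NO_XMLNS` fails because the XML side puts the elements in no namespace, which is why the xmlns relaxation on the root element is needed for any document to qualify.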

Henri Sivonen

Received on Wednesday, 9 June 2010 12:56:40 UTC
