Re: Polyglot Markup Formal Objection Rationale

From: Smylers <Smylers@stripey.com>
Date: Tue, 6 Nov 2012 12:50:03 +0000
To: public-html@w3.org
Message-ID: <20121106125003.GJ2391@stripey.com>

Jirka Kosek writes:

> Polyglot spec in fact defines what would be called HTML5 profile in
> ISO, or subset between mortal people.

The current Polyglot spec draft contradicts itself. As such, it isn't
clear what it's supposed to be doing.

> > > No, Polyglot has to explicitly define that only allowed encoding
> > > is UTF-8 because we want polyglot to use only UTF-8
> > 
> > Who's the "we" that wants that?
> 
> Working Group, or even wider Web community -- I don't expect that
> anyone sensible would promote UTF-16 as a recommended encoding.

Indeed I wouldn't promote UTF-16 as an encoding.

But there's a difference between what is promoted and what is permitted.

The HTML spec itself says "Authors are encouraged to use UTF-8" and
"using non-UTF-8 encodings can have unexpected results", but allows
UTF-16 -- to that extent "encouraged HTML" is a subset of conforming
HTML.

Your argument against UTF-16 surely applies equally to 'normal' HTML as
to polyglot HTML? The discouragement of UTF-16 doesn't seem to be
polyglot-specific. As such, I don't understand why it is a requirement
of the polyglot spec.

There are all sorts of HTML 'best practices' one could encourage (indeed
a 'best practices' profile could be defined). But these are orthogonal
to whether one chooses to write normal or polyglot HTML. 

The Polyglot spec can be either of these:

  A. simply the common parts of text/html and XHTML

  B. the common parts of text/html and XHTML, plus some additional
     mandatory best practices

Has the working group decreed that it wants B? Apologies if I missed a
decision on this somewhere. From the document's own introduction I'd
been presuming it was A.

If the term "polyglot HTML" refers to B, that 'uses up' the term, and we
perhaps need some other term to refer to A.

> > If there are to be additional restrictions then yes, indeed they have to
> > be normative. I'd also suggest they need to be clearly distinguished
> > from the requirements that are implied by being the intersection of XML
> > and text/html, so that it is clear to anybody reading the spec that
> > these are additional things they need to do.
> 
> If anyone wants to do this detective exercise to figure out whether
> some restriction comes from HTML5 spec, XML spec, legacy browsers
> behaviour, ... then he is certainly free to do it

Why would this be detective work? Surely the contributors to the
Polyglot spec are aware of when they've decided to impose an additional
restriction on documents?

> and it would be interesting to have this in the spec. But I doubt that
> average reader of polyglot spec would be interested in this -- he/she
> needs to know rules to follow and that's it.

When writing a polyglot HTML document, it's useful for somebody already
familiar with HTML and XML to see what else is required.

When checking a polyglot HTML document for conformance, it's necessary
to see which requirements must be checked beyond confirming that it is
valid text/html and valid XHTML. For instance, see this mail from Mike
Smith earlier today, where his "sorta" covers that there may be
additional requirements, but at the moment it's impossible for the
"average reader" of the spec to find them:
http://lists.whatwg.org/pipermail/help-whatwg.org/2012-November/001101.html
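
As a rough illustration of why the distinction matters to a checker,
here is a minimal Python sketch that reports the two kinds of failure
separately. It is not taken from any spec or existing tool: the names
`Report` and `check_polyglot` are hypothetical, XML well-formedness
stands in for the full "valid text/html and valid XHTML" check, and the
UTF-8 rule stands in for whatever additional restrictions the draft
ends up imposing.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field


@dataclass
class Report:
    # Failures that follow from being in the text/html-XHTML overlap.
    intersection: list[str] = field(default_factory=list)
    # Failures of extra rules imposed on top of that overlap.
    additional: list[str] = field(default_factory=list)


def check_polyglot(raw: bytes) -> Report:
    """Hypothetical checker: keeps the two kinds of requirement apart."""
    report = Report()

    # Stand-in for the intersection check: the bytes must at least
    # parse as well-formed XML. A real checker would also need to
    # validate the document as text/html and as XHTML.
    try:
        ET.fromstring(raw)
    except ET.ParseError as exc:
        report.intersection.append(f"not well-formed XML: {exc}")

    # Stand-in for an additional restriction: the bytes must decode
    # as UTF-8. This is a crude test, assumed here for illustration.
    try:
        raw.decode("utf-8")
    except UnicodeDecodeError:
        report.additional.append("document is not encoded as UTF-8")

    return report
```

With a report shaped like this, a UTF-16 document would show up under
`additional` rather than `intersection`, which is the information
somebody checking conformance would want the spec to make findable.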

Cheers

Smylers
-- 
New series of TV puzzle show 'Only Connect' (some questions by me)
Mondays at 20:30 on BBC4, or iPlayer: http://www.bbc.co.uk/onlyconnect
Received on Tuesday, 6 November 2012 12:50:30 GMT