Re: Polyglot Markup Formal Objection Rationale

From: Jirka Kosek <jirka@kosek.cz>
Date: Tue, 06 Nov 2012 08:49:05 +0100
Message-ID: <5098C0F1.5080800@kosek.cz>
To: public-html@w3.org

On 5.11.2012 15:04, Smylers wrote:

> That is:
> * The definition of the term "polyglot markup" being normative (it
>   currently isn't) and itself refer to normative definitions in the HTML
>   spec.

Yep, the definition of polyglot markup should definitely be normative.

> * The consequences of that definition, the description of what it means,
>   not being normative (they currently claim to be).
> Would you be satisfied with that, or do you want the description parts
> to be normative as well?

Honestly, I don't care much about this. I think that technically it's
clear what polyglot should look like, and the current Polyglot Markup spec
describes it quite clearly. We could nitpick about the fact that the style of
the spec is now closer to a best-practices document than to a strict spec.

>> is probably more work than to add one sentence which can say that in
>> the case of *potential* conflict HTML5 wins.
> I don't think picking the (probable) least work for spec writers is
> a good way of deciding what a spec should say.

The WG has limited resources and there are more urgent issues.

>> For example as both in HTML5 and in XML you have some variety in
>> choosing encoding, Polyglot must *normatively* define that only
>> allowed encoding is UTF-8.
> It can do that by reference; it doesn't need to do so explicitly.
> Clearly by the definition of polyglot HTML (being the overlap of text/html
> and XHTML) a conforming polyglot document needs to use an encoding
> which:
> * Is allowed in conforming text/html.
> * Is allowed in conforming XHTML.
> * Can be declared in a way which is conforming in both representations,
>   and has the same meaning in both.
> If the only encoding that turns out to meet those requirements is UTF-8
> then it necessarily follows that polyglot HTML documents must use UTF-8.
> Saying "Polyglot HTML documents use UTF-8" is therefore a description of
> a fact, and not itself a requirement; it places no further restrictions
> on those already made by the simple definition of what polyglot HTML is.
> If, on the other hand, it turns out there is some other encoding which
> also meets the above criteria then that would be an example of a
> contradiction between polyglot HTML being a simple profile of the
> overlap between text/html and XHTML and it having its own normative
> requirements. 

Well, actually your logic would allow either UTF-8 or UTF-16
(unless you have <meta charset/>). But in the usual standards sense, a
profile is a clearly defined subset, and such a subset can define additional
requirements, like allowing only UTF-8, in order to make interop easier.
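
To make the difference concrete, here is a minimal sketch of the two
readings. The encoding sets and the function names are hypothetical and
chosen only for illustration; they are not lists taken from either spec.

```python
from __future__ import annotations

# Illustrative assumption: a few encodings a text/html document may declare.
HTML_DECLARABLE = {"UTF-8", "UTF-16", "windows-1252", "ISO-8859-2"}

# Illustrative assumption: without an XML declaration, an XML parser can
# only rely on UTF-8 or (BOM-marked) UTF-16.
XML_WITHOUT_DECLARATION = {"UTF-8", "UTF-16"}


def overlap_encodings(has_meta_charset: bool) -> set[str]:
    """Encodings left over by the pure 'overlap' definition."""
    allowed = HTML_DECLARABLE & XML_WITHOUT_DECLARATION
    if has_meta_charset:
        # Assumption: in the XML serialization <meta charset> may only name UTF-8.
        allowed = allowed & {"UTF-8"}
    return allowed


def profile_encodings(has_meta_charset: bool) -> set[str]:
    """A profile adds its own requirement (UTF-8 only) on top of the overlap."""
    return overlap_encodings(has_meta_charset) & {"UTF-8"}
```

Under these assumptions overlap_encodings(False) still contains UTF-16,
while profile_encodings(False) does not, and that extra restriction is
exactly what a profile is allowed to add.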

> So I'd say the precise opposite: the polyglot spec needs _not_ to
> explicitly define what the allowed encoding is, because it's either
> redundant (requiring something which is already required anyway) or
> wrong (contradicting the HTML spec, and therefore to be ignored).

No, Polyglot has to explicitly define that the only allowed encoding is
UTF-8, because we want polyglot to use only UTF-8 and not UTF-16, which has
its own problems.


  Jirka Kosek      e-mail: jirka@kosek.cz      http://xmlguru.cz
       Professional XML consulting and training services
  DocBook customization, custom XSLT/XSL-FO document processing
 OASIS DocBook TC member, W3C Invited Expert, ISO JTC1/SC34 member

Received on Tuesday, 6 November 2012 07:49:40 UTC
