Re: Polyglot markup and authors from Henri Sivonen on 2013-02-19 (public-html@w3.org from February 2013)

From: Henri Sivonen <hsivonen@iki.fi>
Date: Tue, 19 Feb 2013 12:32:15 +0200
To: Leif Halvard Silli <xn--mlform-iua@xn--mlform-iua.no>
Cc: Alex Russell <slightlyoff@google.com>, Mukul Gandhi <gandhi.mukul@gmail.com>, Jirka Kosek <jirka@kosek.cz>, public-html WG <public-html@w3.org>, Paul Cotton <Paul.Cotton@microsoft.com>, Maciej Stachowiak <mjs@apple.com>, "www-tag@w3.org List" <www-tag@w3.org>, Sam Ruby <rubys@intertwingly.net>, "Michael[tm] Smith" <mike@w3.org>
Message-ID: <CAJQvAueQaP5RrgEv+4tVcfwdnMfjmpwzuCKew=xt7EET7Pb1_g@mail.gmail.com>

On Mon, Feb 18, 2013 at 2:44 AM, Leif Halvard Silli
<xn--mlform-iua@xn--mlform-iua.no> wrote:
> If a non-well-formed HTML document had to be converted to XHTML
> before being processed, then why not choose to convert to polyglot
> xhtml?

Because HTML to XHTML conversion can be automated without significant
data loss (if you consider mapping form feed to another space
character insignificant and consider the munging of some identifiers
that have no defined meaning in HTML insignificant) and because you
don't need to convert to polyglot--only XHTML--to process the doc as
XHTML. HTML to polyglot conversion cannot be similarly automated,
because e.g. ampersands in inline scripts are both common and
significant.
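
A minimal sketch of the ampersand problem, assuming Python's standard-library
parsers as stand-ins for an HTML parser and an XML parser (html.parser is not
a full HTML5 parser). The sample document and the helper names
script_text_as_html and is_well_formed_xml are hypothetical, chosen only for
illustration. It shows that escaping the ampersands yields well-formed XHTML,
but that the escaped bytes no longer mean the same script when read as HTML,
so the result is not polyglot.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET

# Hypothetical sample: an inline script containing bare ampersands.
HTML_DOC = (
    "<html><head><script>if (a && b) { run(); }</script></head>"
    "<body></body></html>"
)

# The mechanical HTML-to-XHTML fix: escape the ampersands.
XHTML_DOC = HTML_DOC.replace("&&", "&amp;&amp;")


class ScriptCollector(HTMLParser):
    """Collects the raw text found inside <script> elements."""

    def __init__(self):
        super().__init__()
        self.in_script = False
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            self.in_script = True

    def handle_endtag(self, tag):
        if tag == "script":
            self.in_script = False

    def handle_data(self, data):
        if self.in_script:
            self.chunks.append(data)


def script_text_as_html(doc):
    """Return the script source as an HTML parser sees it."""
    parser = ScriptCollector()
    parser.feed(doc)
    parser.close()
    return "".join(parser.chunks)


def is_well_formed_xml(doc):
    """Return True if an XML parser accepts the document."""
    try:
        ET.fromstring(doc)
        return True
    except ET.ParseError:
        return False


def script_text_as_xml(doc):
    """Return the script source as an XML parser sees it."""
    return ET.fromstring(doc).find("head/script").text


if __name__ == "__main__":
    # The original HTML is not well-formed XML because of the bare "&&".
    print(is_well_formed_xml(HTML_DOC))
    # The escaped version parses as XML, and XML unescapes the entities.
    print(is_well_formed_xml(XHTML_DOC), script_text_as_xml(XHTML_DOC))
    # Read as HTML, the same bytes keep the literal "&amp;" in the script.
    print(script_text_as_html(XHTML_DOC))
```

Escaping is enough when the only goal is to process the document as XHTML.
For polyglot output the script itself would have to be rewritten so that
both parsers see the same source, which is the part that resists automation.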

-- 
Henri Sivonen
hsivonen@iki.fi
http://hsivonen.iki.fi/

Received on Tuesday, 19 February 2013 10:32:46 UTC