Re: Polyglot markup and authors

From: Henri Sivonen <hsivonen@iki.fi>
Date: Tue, 19 Feb 2013 12:32:15 +0200
Message-ID: <CAJQvAueQaP5RrgEv+4tVcfwdnMfjmpwzuCKew=xt7EET7Pb1_g@mail.gmail.com>
To: Leif Halvard Silli <xn--mlform-iua@xn--mlform-iua.no>
Cc: Alex Russell <slightlyoff@google.com>, Mukul Gandhi <gandhi.mukul@gmail.com>, Jirka Kosek <jirka@kosek.cz>, public-html WG <public-html@w3.org>, Paul Cotton <Paul.Cotton@microsoft.com>, Maciej Stachowiak <mjs@apple.com>, "www-tag@w3.org List" <www-tag@w3.org>, Sam Ruby <rubys@intertwingly.net>, "Michael[tm] Smith" <mike@w3.org>
On Mon, Feb 18, 2013 at 2:44 AM, Leif Halvard Silli
<xn--mlform-iua@xn--mlform-iua.no> wrote:
> If a non-well-formed HTML document had to be converted to XHTML
> before being processed, then why not choose to convert to polyglot
> xhtml?

Because HTML to XHTML conversion can be automated without significant
data loss (if you consider mapping form feed to another space
character insignificant and consider the munging of some identifiers
that have no defined meaning in HTML insignificant) and because you
don't need to convert to polyglot--only XHTML--to process the doc as
XHTML. HTML to polyglot conversion cannot be similarly automated,
because e.g. ampersands in inline scripts are both common and
significant.
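
To illustrate the ampersand point, here is a minimal sketch in Python. It
assumes the standard library's html.parser is an acceptable stand-in for an
HTML parser (it is not a full HTML5 parser) and that xml.etree.ElementTree
stands in for an XML parser. The class and function names are hypothetical,
chosen for this example only.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET


class ScriptText(HTMLParser):
    """Collect the text inside <script> elements as an HTML parser sees it."""

    def __init__(self):
        super().__init__()
        self.in_script = False
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            self.in_script = True

    def handle_endtag(self, tag):
        if tag == "script":
            self.in_script = False

    def handle_data(self, data):
        # Script content is raw text in HTML: character references
        # such as &amp; are passed through undecoded.
        if self.in_script:
            self.chunks.append(data)


def script_text_as_html(markup):
    """Return the script source an HTML parser would hand to the engine."""
    parser = ScriptText()
    parser.feed(markup)
    parser.close()
    return "".join(parser.chunks)


def script_text_as_xml(markup):
    """Return the script source an XML parser would see, or None if the
    markup is not well-formed XML."""
    try:
        return ET.fromstring(markup).text
    except ET.ParseError:
        return None


RAW = "<script>if (a && b) f();</script>"
ESCAPED = "<script>if (a &amp;&amp; b) f();</script>"

for label, markup in (("raw", RAW), ("escaped", ESCAPED)):
    print(label, "as HTML:", script_text_as_html(markup))
    print(label, "as XML: ", script_text_as_xml(markup))
```

Under these assumptions, the raw form yields a valid script as HTML but is
rejected as XML, while the escaped form yields a valid script as XML but a
different, broken script as HTML. Neither serialization gives the same script
under both parsers, so a converter targeting polyglot would have to change
the script source itself or move it to an external file, rather than just
re-serialize the document.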

-- 
Henri Sivonen
hsivonen@iki.fi
http://hsivonen.iki.fi/
Received on Tuesday, 19 February 2013 10:32:46 UTC