the link between "Authoritative Metadata" and "Polyglot" from Larry Masinter on 2013-04-06 (www-tag@w3.org from April 2013)

From: Larry Masinter <masinter@adobe.com>
Date: Fri, 5 Apr 2013 23:51:53 -0700
To: Larry Masinter <masinter@adobe.com>, Anne van Kesteren <annevk@annevk.nl>
CC: "www-tag.w3.org" <www-tag@w3.org>
Message-ID: <C68CB012D9182D408CED7B884F441D4D1E885D96CD@nambxv01a.corp.adobe.com>

The two issues are linked in an interesting way:


IF you accept that content-type is an anti-pattern, and that file types SHOULD be determined by sniffing the content,

THEN the existence and encouragement of Polyglot content is undesirable.


That is because if you start with Polyglot (something that can be parsed as either HTML or XML/XHTML) and tweak it a little, you can wind up with some content which really *should* be sniffed as text/html, and other content which really *should* be sniffed as application/xhtml+xml, and you won't be able to easily tell the difference.

That is, if you're trying to handle application/xhtml+xml with an XML parser and text/html with an HTML parser, you might be forced to try to parse the whole file before knowing how to sniff it.
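
A rough sketch of that situation in Python, under stated assumptions: the two sample documents, the 512-byte sniffing window, and the helper names parses_as_xml and sniff_prefix are all made up for illustration, and the well-formedness check from the standard library's xml.etree.ElementTree stands in for "can be handled by an XML parser". It is not how any real browser sniffs content.

```python
import xml.etree.ElementTree as ET


def parses_as_xml(doc: bytes) -> bool:
    """Return True if the whole document is well-formed XML (hypothetical helper)."""
    try:
        ET.fromstring(doc)
        return True
    except ET.ParseError:
        return False


def sniff_prefix(doc: bytes, limit: int = 512) -> bytes:
    """Return the bytes a prefix-only sniffer would get to look at (assumed window size)."""
    return doc[:limit]


# Shared opening: valid for both an HTML parser and an XML parser.
HEAD = (
    b'<!DOCTYPE html>'
    b'<html xmlns="http://www.w3.org/1999/xhtml">'
    b'<head><title>t</title></head><body>'
)
# Enough identical content to push the difference past the sniffing window.
FILLER = b'<p>polyglot paragraph</p>' * 100

# Polyglot: the void element is written as <br/>, so XML accepts it.
polyglot = HEAD + FILLER + b'<br/></body></html>'
# "Tweaked a little": <br> is fine as text/html but not well-formed XML.
tweaked = HEAD + FILLER + b'<br></body></html>'

if __name__ == "__main__":
    print("same prefix:", sniff_prefix(polyglot) == sniff_prefix(tweaked))
    print("polyglot is XML:", parses_as_xml(polyglot))
    print("tweaked is XML:", parses_as_xml(tweaked))
```

The two documents are byte-for-byte identical within the sniffing window, yet only one of them survives an XML parser, and the difference only shows up near the end of the file.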

So: if you don't accept authoritative metadata for content-type, then perhaps polyglot is harmful.

Larry
--
http://larry.masinter.net

Received on Saturday, 6 April 2013 06:52:54 UTC