- From: <bugzilla@jessica.w3.org>
- Date: Fri, 14 Jan 2011 09:11:29 +0000
- To: public-html-bugzilla@w3.org
http://www.w3.org/Bugs/Public/show_bug.cgi?id=11755

Summary: The introduction should be clearer about use cases best addressed by polyglot markup
Product: HTML WG
Version: unspecified
Platform: All
OS/Version: All
Status: NEW
Severity: normal
Priority: P2
Component: HTML/XHTML Compatibility Authoring Guide (ed: Eliot Graff)
AssignedTo: eliotgra@microsoft.com
ReportedBy: hsivonen@iki.fi
QAContact: public-html-bugzilla@w3.org
CC: mike@w3.org, public-html-wg-issue-tracking@w3.org, public-html@w3.org, eliotgra@microsoft.com

The draft says:

"It is often valuable to be able to serve HTML5 documents that are also well formed XML documents. An author may, for example, use XML tools to generate a document, and they and others may process the document using XML tools. These documents are served as text/html."

The quoted part has four problems:

1) It claims "often valuable" in the passive voice without substantiating the claim beyond what is said in the next sentence, and the next sentence isn't on very strong ground, as seen below.

2) If an author uses XML tools to generate the document, using a generic XML serializer is not OK, because a generic serializer might do whatever is OK in application/xhtml+xml but not necessarily in text/html. As a trivial example, a generic XML serializer might well serialize a script element pointing to an external script as <script src="foo.js"/>, which would be very wrong in text/html. Thus, the author needs a text/html-aware serializer anyway to be able to successfully use the output as text/html: either a polyglot serializer or a text/html-only serializer.

Once a text/html-aware serializer is needed instead of a generic XML serializer, it isn't necessary to make the serializer polyglot if the goal is simply to produce text/html content using otherwise XML tools. Monoglot serializers for either text/html or for XML can serialize the text content of the style and script elements with relative ease.
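To make the script element example concrete, here is a minimal sketch using Python's standard xml.etree.ElementTree, which can serialize the same tree either as generic XML or with an HTML-aware output method. The choice of Python, the helper name serialize, and the inline script text are assumptions made for illustration; nothing here is taken from the draft.

```python
import xml.etree.ElementTree as ET


def serialize(elem, method):
    """Serialize one element with the given ElementTree output method."""
    return ET.tostring(elem, encoding="unicode", method=method)


# A script element pointing to an external script, as in the example above.
external = ET.Element("script", {"src": "foo.js"})

# An inline script whose text contains characters that XML must escape.
inline = ET.Element("script")
inline.text = "if (a < b && c) { run(); }"

# Generic XML serialization: fine as application/xhtml+xml, wrong as text/html.
xml_external = serialize(external, "xml")
xml_inline = serialize(inline, "xml")

# HTML-aware serialization: explicit end tag, raw script text.
html_external = serialize(external, "html")
html_inline = serialize(inline, "html")

print(xml_external)   # <script src="foo.js" />
print(html_external)  # <script src="foo.js"></script>
print(xml_inline)     # <script>if (a &lt; b &amp;&amp; c) { run(); }</script>
print(html_inline)    # <script>if (a < b && c) { run(); }</script>
```

Each output is correct for its own media type, but the inline script serializes differently in the two cases, so neither result works unchanged in both.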
However, a strictly polyglot serializer can't support inline scripts and styles in the general case. (The serializer would either have to relax DOM sameness by generating /* <![CDATA[ */ at the start of the text content and /* ]]> */ at the end of the text content, or ban the characters <, > and & in the script or style sheet, which would be a drastic restriction.) Using a monoglot serializer avoids this problem, so polyglot isn't a good solution for creating text/html content from an XML tool (such as an XSLT processor).

3) Polyglot isn't a very effective way of allowing others to process the document using XML tools, either. For someone else to be able to consume text/html content using an XML parser, every document (s)he wants to consume has to be polyglot. If the content to be consumed is Web content in general, there's no way to force all of it to be polyglot. From the point of view of the content consumer, it is easier to consume text/html content with an HTML parser that exposes the same APIs to the rest of the app that an XML parser would expose than to make agreements with document authors to get them to write polyglot markup. Once the consumer includes an HTML parser in the app, there's no longer value in any of the consumed docs being polyglot. Thus, from the point of view of a would-be polyglot author, making a document polyglot won't be of value if someone else whose document needs to be consumed by the same consumer makes a monoglot document. The would-be polyglot author might as well be the first one to make a monoglot document that forces the consumer to deal with it. Thus, getting authors to use polyglot markup isn't as good a solution to consuming text/html content with XML tools as putting an HTML parser at the start of the pipeline is.

4) The last quoted sentence says the documents are served as text/html but doesn't say why. A polyglot document is by definition a document that also works as application/xhtml+xml.
The main reason not to serve such documents as application/xhtml+xml only is catering to the userbase of IE versions earlier than IE9. It would be a shame to end up in a situation where authors keep addressing a transient problem even when the problem is gone (when IE6 through IE8 users no longer form a substantial audience). Once the author no longer wishes to address the IE6 through IE8 audience, the author could use a monoglot XML-only serializer for point #2 above.

Please either substantiate "often valuable" better or remove the claim.

Please replace the stated use cases with use cases for which using polyglot markup is indeed the best known solution or, alternatively, please at least mention the alternative solutions I outlined in points #2 and #3 above.

Please mention that the reason for serving content as text/html when it would work as application/xhtml+xml is a transient reason.

-- 
Configure bugmail: http://www.w3.org/Bugs/Public/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the QA contact for the bug.
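As a rough illustration of point 3 (putting an HTML parser at the start of the pipeline), the sketch below feeds events from Python's standard html.parser into an ElementTree TreeBuilder, so the rest of the app sees the same element API an XML parser would produce. The names TreeBuildingHTMLParser, parse_html and VOID are hypothetical, and html.parser is only a tokenizer: it does not implement the HTML5 tree construction rules, so this assumes well-nested input and is not a conforming HTML parser.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

# HTML void elements never have an end tag, so they are closed immediately.
VOID = {"area", "base", "br", "col", "embed", "hr", "img", "input",
        "link", "meta", "source", "track", "wbr"}


class TreeBuildingHTMLParser(HTMLParser):
    """Hypothetical adapter: html.parser events in, ElementTree elements out."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.builder = ET.TreeBuilder()

    def handle_starttag(self, tag, attrs):
        # Attributes without a value arrive as None; store them as "".
        self.builder.start(tag, {k: ("" if v is None else v) for k, v in attrs})
        if tag in VOID:
            self.builder.end(tag)

    def handle_endtag(self, tag):
        # Void elements were already closed in handle_starttag.
        if tag not in VOID:
            self.builder.end(tag)

    def handle_data(self, data):
        self.builder.data(data)


def parse_html(text):
    """Parse well-nested text/html into an ElementTree element."""
    parser = TreeBuildingHTMLParser()
    parser.feed(text)
    parser.close()
    return parser.builder.close()


# Valid text/html, but not polyglot: unquoted attribute and an unclosed <br>.
doc = "<p class=note>one<br>two &amp; three</p>"

try:
    ET.fromstring(doc)
    xml_parser_accepts = True
except ET.ParseError:
    xml_parser_accepts = False

root = parse_html(doc)
print(xml_parser_accepts)    # False
print(root.tag, root.attrib)
print(root.find("br").tail)
```

The consuming code after parse_html works on ordinary ElementTree elements, so it does not need to care whether the author wrote polyglot markup.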
Received on Friday, 14 January 2011 09:11:31 UTC