
Re: Request for Volunteers: Polyglot spec

From: Philip Taylor <pjt47@cam.ac.uk>
Date: Thu, 01 Apr 2010 19:41:28 +0100
Message-ID: <4BB4E8D8.7080008@cam.ac.uk>
To: Jonas Sicking <jonas@sicking.cc>
CC: Sam Ruby <rubys@intertwingly.net>, HTML WG <public-html@w3.org>, Technical Architecture Group <tag@w3.org>

Jonas Sicking wrote:
> On Fri, Mar 26, 2010 at 1:52 PM, Sam Ruby <rubys@intertwingly.net> wrote:
>> I took an action item from the TAG yesterday to convey the following
>> request:
>>    The W3C TAG requests there should be in TR space a document
>>    which specifies how one can create a set of bits which can
>>    be served EITHER as text/html OR as application/xhtml+xml,
>>    which will work identically in a browser in both cases.
>>    (As Sam does on his web site.)
>> This request requires a lot of explanation.  To start, it is recognized up
>> front that this will be a subset of the set of possible documents that can
>> be expressed as HTML5.  This is entirely OK.  For example, if it were to be
>> the case that such a subset were to entirely disallow scripts of any kind,
>> that would be acceptable as there exists a substantial class of documents
>> which do not require scripting of any kind.
> Out of curiosity, what does "work identically" encompass? Do they have
> to have the same DOM? Or just render the same when the default UA
> stylesheet is applied? Or just be semantically equivalent?
> [...]
> If DOMs aren't important, only rendering is, I assume that this
> document won't qualify:
> <html xmlns="http://www.w3.org/1999/xhtml">
>   <head>
>     <style> tbody { background: green } </style>
>     <title>example document</title>
>   </head>
>   <body>
>     Integer values for true/false.
>     <table>
>       <tr><td>true</td><td>1</td></tr>
>       <tr><td>false</td><td>0</td></tr>
>     </table>
>   </body>
> </html>

This one would also render differently:

<html xmlns="http://www.w3.org/1999/xhtml">
   <head><title>example document</title></head>
   <body>
     <pre>
Arbitrary example text</pre>
   </body>
</html>

and this one will also cause data corruption depending on the content-type:

<html xmlns="http://www.w3.org/1999/xhtml">
   <head><title>example document</title></head>
       Edit your comment:
       <textarea name="comment">
Your previous text</textarea>

(because the text/html parser strips a leading newline character in 
pre/textarea/listing elements), which seem like more serious issues than 
the <tbody>, since (unless I'm missing something) it's impossible to 
safely use these elements in polyglot documents, unless you do


which is a horrid hack and won't work for textarea anyway. So I think a 
true polyglot subset would have to exclude the textarea element, which 
limits its usefulness further. (Maybe the remaining subset is still 
large enough to be worth specifying in detail.)
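
To make the textarea case concrete, here is a minimal Python sketch of the difference. It parses the markup with the standard-library XML parser, which keeps the newline after the start tag, and compares that with a hand-written model of the text/html rule, which drops it. The names DOC, xml_textarea_value and html_textarea_value are made up for this illustration, and the second function only models the rule described above; it is not a real HTML parser.

```python
import xml.etree.ElementTree as ET

# Namespace prefix in ElementTree's "{uri}tag" notation.
XHTML_NS = "{http://www.w3.org/1999/xhtml}"

# The textarea example from above, closed off so that it is well-formed XML.
DOC = (
    '<html xmlns="http://www.w3.org/1999/xhtml">'
    "<head><title>example document</title></head>"
    '<body><textarea name="comment">\n'
    "Your previous text</textarea></body></html>"
)


def xml_textarea_value(markup: str) -> str:
    """Textarea content as an XML parser reports it (leading newline kept)."""
    root = ET.fromstring(markup)
    textarea = root.find(f".//{XHTML_NS}textarea")
    if textarea is None:
        raise ValueError("no textarea element found")
    return textarea.text or ""


def html_textarea_value(markup: str) -> str:
    """Hypothetical model of the text/html rule, not a real HTML parser:
    a single newline immediately after the start tag is dropped."""
    value = xml_textarea_value(markup)
    return value[1:] if value.startswith("\n") else value


if __name__ == "__main__":
    print(repr(xml_textarea_value(DOC)))
    print(repr(html_textarea_value(DOC)))
```

The two values differ by exactly one leading newline, so a form round-tripped through both content types would submit different data.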

Philip Taylor
Received on Thursday, 1 April 2010 18:42:02 UTC
