Re: Request for Volunteers: Polyglot spec from Sam Ruby on 2010-04-01 (public-html@w3.org from April 2010)

From: Sam Ruby <rubys@intertwingly.net>
Date: Thu, 01 Apr 2010 19:33:12 -0400
To: Philip Taylor <pjt47@cam.ac.uk>
CC: Jonas Sicking <jonas@sicking.cc>, HTML WG <public-html@w3.org>, Technical Architecture Group <www-tag@w3.org>
Message-ID: <4BB52D38.30206@intertwingly.net>

On 04/01/2010 02:41 PM, Philip Taylor wrote:
> Jonas Sicking wrote:
>> On Fri, Mar 26, 2010 at 1:52 PM, Sam Ruby <rubys@intertwingly.net> wrote:
>>> I took an action item from the TAG yesterday to convey the following
>>> request:
>>>
>>> The W3C TAG requests there should be in TR space a document
>>> which specifies how one can create a set of bits which can
>>> be served EITHER as text/html OR as application/xhtml+xml,
>>> which will work identically in a browser in both cases.
>>> (As Sam does on his web site.)
>>>
>>> This request requires a lot of explanation. To start, it is recognized
>>> up front that this will be a subset of the set of possible documents
>>> that can be expressed as HTML5. This is entirely OK. For example, if it
>>> were to be the case that such a subset were to entirely disallow scripts
>>> of any kind, that would be acceptable as there exists a substantial class
>>> of documents which do not require scripting of any kind.
>>
>> Out of curiosity, what does "work identically" encompass? Do they have
>> to have the same DOM? Or just render the same when the default UA
>> stylesheet is applied? Or just be semantically equivalent?
>> [...]
>> If DOMs aren't important, only rendering is, I assume that this
>> document won't qualify:
>>
>> <html xmlns="http://www.w3.org/1999/xhtml">
>> <head>
>> <style> tbody { background: green } </style>
>> <title>example document</title>
>> </head>
>> <body>
>> Integer values for true/false.
>> <table>
>> <tr><td>true</td><td>1</td></tr>
>> <tr><td>false</td><td>0</td></tr>
>> </table>
>> </body>
>> </html>
>
> This one would also render differently:
>
> <html xmlns="http://www.w3.org/1999/xhtml">
> <head><title>example document</title></head>
> <body>
> <pre>
> Arbitrary example text</pre>
> </body>
> </html>
>
> and this one will also cause data corruption depending on the content-type:
>
> <html xmlns="http://www.w3.org/1999/xhtml">
> <head><title>example document</title></head>
> <body>
> <form>
> Edit your comment:
> <textarea name="comment">
> Your previous text</textarea>
> </form>
> </body>
> </html>
>
> (because the text/html parser strips a leading newline character in
> pre/textarea/listing elements), which seem like more serious issues than
> the <tbody>, since (unless I'm missing something) it's impossible to
> safely use these elements in polyglot documents, unless you do
>
> <pre><!---->
> text
> </pre>
>
> which is a horrid hack and won't work for textarea anyway. So I think a
> true polyglot subset would have to exclude the textarea element, which
> limits its usefulness further. (Maybe the remaining subset is still
> large enough to be worth specifying in detail.)
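
The leading-newline difference described in the quoted text can be
sketched in Python. This is only a rough model under stated assumptions:
the XML side uses the standard library's xml.etree.ElementTree, while the
text/html side is a hand-written approximation of the stripping rule
(the helper names text_as_xml and text_as_html are hypothetical), not a
real HTML parser.

```python
import xml.etree.ElementTree as ET

# Elements for which the text/html parser drops one leading newline,
# as listed in the quoted message.
NEWLINE_STRIPPING = {"pre", "textarea", "listing"}


def text_as_xml(markup: str) -> str:
    """Initial text of the element as an XML parser reports it."""
    return ET.fromstring(markup).text or ""


def text_as_html(markup: str) -> str:
    """Approximate the text/html result for the same markup.

    Assumption: the only difference modelled here is the removal of a
    single newline immediately after the start tag.
    """
    element = ET.fromstring(markup)
    text = element.text or ""
    if element.tag in NEWLINE_STRIPPING and text.startswith("\n"):
        text = text[1:]
    return text


sample = '<textarea name="comment">\nYour previous text</textarea>'
print(repr(text_as_xml(sample)))
print(repr(text_as_html(sample)))
```

Under this model the two results differ by exactly one leading newline,
which is the character that would be added to or lost from a comment
depending on the content type used to serve the page.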

I have a textarea on my individual blog entry pages.  I serve those 
pages as text/html to IE (including the current Platform Preview), and 
as application/xhtml+xml to pretty much everyone else.

To date, I have not had a single complaint about the issue you describe,
and I have had comments left by plenty of people using a diverse variety
of browsers.

It is my hope that the document produced can avoid superlatives like 
"serious", "impossible" and "horrid" when discussing this limitation.

- Sam Ruby

Received on Thursday, 1 April 2010 23:33:51 UTC