Re: Void elements in HTML from Philip Taylor on 2009-01-01 (public-html@w3.org from January 2009)

From: Philip Taylor <pjt47@cam.ac.uk>
Date: Thu, 01 Jan 2009 16:28:11 +0000
To: Anne van Kesteren <annevk@opera.com>
CC: Julian Reschke <julian.reschke@gmx.de>, Ian Hickson <ian@hixie.ch>, public-html@w3.org
Message-ID: <495CEF1B.8040707@cam.ac.uk>

Anne van Kesteren wrote:
> On Wed, 31 Dec 2008 17:10:37 +0100, Julian Reschke 
> <julian.reschke@gmx.de> wrote:
>> So what this shows is that of all the instances where we see this 
>> pattern, a significant amount *intentionally* uses the empty tag 
>> notation, and these will either be unaffected (because they aren't 
>> served the way we found them in practice), or will actually be fixed.
> 
> No, 4 out of 10 will be broken. Only 2 out of 10 will be fixed and those 
> are demos... Also, as Philip already demonstrated there is more content 
> out that relies on <textarea/> being an opening tag. And it makes sense, 
> given that all browsers treat it that way it is highly unlikely that 
> authors would intentionally make pages that don't work in browsers.

It's probably more accurate to say it's highly unlikely that authors 
would intentionally publish pages that they know don't work in browsers.

It appears that the following process is quite common:

1) An author intentionally writes something like <p><textarea /></p>... 
in their text/html page, expecting the textarea to be empty.

2) The author tests their page in their favourite browser.

3) The author sees that something is very clearly wrong (since the 
textarea has the content "</p>...</body></html>", and the rest of their 
page has vanished).

4) The author gets confused, and maybe files a dupe of 
https://bugzilla.mozilla.org/show_bug.cgi?id=162653

5) The author eventually finds out / gets told what they're doing wrong, 
or they just experiment with alternative syntax until they find 
something that works, and so they rewrite their page to use 
<textarea></textarea> (or maybe <textarea/></textarea>), and it works 
fine when they test it again.

6) The author publishes their page.

7) A user reads the page at some point in the future, using any sensible 
web browser, and the page works fine.

(The dupes of that Mozilla bug demonstrate that steps 1-4 happen quite 
often. The apparent rarity of published pages (visitable directly by 
users, not just demos) which are broken in current browsers because of 
<textarea/> demonstrates that either step 5 happens, or that authors 
give up and stop writing HTML entirely.)
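
To make step 3 concrete, here is a rough Python sketch of what current 
browsers do with <textarea /> in text/html. It is a toy model, not a 
real HTML parser: the function name and the sample page are made up for 
illustration, and it only assumes the behaviour described above (the 
trailing slash is ignored, and the content runs to the next </textarea> 
or to the end of the document).

```python
import re

# Toy model, not a real HTML parser: locate the first textarea start tag
# and return the text that ends up inside it when the trailing slash on
# the start tag is ignored.
START = re.compile(r"<textarea\b[^>]*>", re.IGNORECASE)
END = re.compile(r"</textarea\s*>", re.IGNORECASE)

def textarea_content_slash_ignored(html: str) -> str:
    start = START.search(html)
    if start is None:
        return ""
    end = END.search(html, start.end())
    # With no end tag, the content runs to the end of the document.
    if end is None:
        return html[start.end():]
    return html[start.end():end.start()]

# Hypothetical page from step 1: the rest of the page lands in the textarea.
page = "<body><p><textarea /></p>rest of page</body></html>"
print(textarea_content_slash_ignored(page))

# The rewrites from step 5 both give an empty textarea.
print(repr(textarea_content_slash_ignored("<p><textarea></textarea></p>")))
print(repr(textarea_content_slash_ignored("<p><textarea/></textarea></p>")))
```

The first call shows why the author notices the problem immediately: 
everything after the start tag, markup included, becomes the textarea's 
text.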


If <textarea/> were changed to be parsed as an empty element, it would 
result in the following process:

1) An author accidentally writes <textarea/>content</textarea>, because 
they're confused or because it's a typo or whatever.

2) The author tests their page in their favourite browser.

3) The author sees that their page works perfectly well.

4) The author publishes their page.

5) Years pass, and the HTML spec is changed to say <textarea/> must be 
parsed as an empty element.

6) A user reads the page, using a sensible web browser which implements 
the latest HTML spec, and the page is (perhaps quite subtly) wrong, 
because the textarea's content is no longer inside the textarea.

(The existence of published pages with <textarea/>...</textarea> 
demonstrates that steps 1-4 happen quite often.)
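
The difference in step 6 can be sketched the same way. Again this is 
only a toy model in Python with a made-up function name and a 
hypothetical `slash_means_empty` flag standing in for the spec change; 
it compares what ends up inside the textarea under today's parsing and 
under the proposed parsing.

```python
import re

# Toy model, not a real HTML parser. Group 1 captures a trailing slash
# on the start tag, if there is one.
START = re.compile(r"<textarea\b[^>]*?(/?)>", re.IGNORECASE)
END = re.compile(r"</textarea\s*>", re.IGNORECASE)

def textarea_content(html: str, slash_means_empty: bool) -> str:
    start = START.search(html)
    if start is None:
        return ""
    # Proposed rule: <textarea/> is a complete, empty element.
    if slash_means_empty and start.group(1) == "/":
        return ""
    # Current rule: the slash is ignored and content runs to </textarea>
    # (or to the end of the document if there is no end tag).
    end = END.search(html, start.end())
    if end is None:
        return html[start.end():]
    return html[start.end():end.start()]

old_page = "<textarea/>content</textarea>"
print(repr(textarea_content(old_page, slash_means_empty=False)))  # today
print(repr(textarea_content(old_page, slash_means_empty=True)))   # after the change
```

Under the current rule the textarea holds "content"; under the proposed 
rule it is empty and the text is left outside it, which is the breakage 
a user would see on an old page.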


The first process results in short-term pain to authors, but then they 
fix their error. The second process results in long-term pain to users, 
because some web sites stop working in their browser and there's nothing 
they can do about it (other than downgrading their browser), and also 
pain to authors, since they'll have to discover, debug and fix the 
problem on their old pages.

The Priority of Constituencies principle says user pain should be given 
more weight than author pain, so the parsing of <textarea/> should not 
be changed unless the first situation is very much more common than the 
second situation.

-- 
Philip Taylor
pjt47@cam.ac.uk
Received on Thursday, 1 January 2009 16:29:15 UTC