Re: Doctypes with "[" after public identifier

Philip Taylor, Fri, 19 Feb 2010 10:44:44 +0000:
> Leif Halvard Silli wrote:
>> Philip Taylor, Thu, 18 Feb 2010 16:47:13 +0000:
>>> It's handled in HTML5 
>>> (http://whatwg.org/html#between-doctype-public-and-system-identifiers-state) 
>>> exactly like any other bogus character (i.e. forcing quirks mode), 
>>> but Firefox appears to have a special case for "[" in this location 
>>> (preventing quirks).
>> 
>> When you say "Firefox appears to have", then you mean Firefox' HTML5 
>> implementation, I suppose?
> 
> I meant the legacy / current default Firefox behaviour, i.e. not HTML5.

Except for the HTML5-inspired Safari 4, can you point to a single legacy 
web browser that triggers quirks mode because of the presence of "[]" 
when the system identifier is missing from a doctype which triggers 
standards mode without the system identifier? 

Also: You focus on the "[". But you should file separate bugs, one for 
"[" and one for "[]", because legacy Opera triggers quirks if the "]" 
is missing but not if both "[" and "]" are there. The effect of a single 
"[" is a different issue from the effect of "[]". I think we should not 
give much heed to the fact that a single "[" triggers quirks mode in 
Opera, however, unless other browsers do the same.
 
>>>   http://www.freemanforman.co.uk/

>>>     <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" 
>>> [url=http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
>>> 
>>>   http://symptomresearch.nih.gov/

>>>     <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" []>
>> 
>> Mister Taylor: That is a transitional doctype. The reason it 
>> triggers quirks has to do with that, and is _not_ related to the 
>> "[]".
> 
> Oops, hadn't noticed that - but in HTML5 the "[]" triggers the 
> force-quirks flag so it is related to that string, regardless of the 
> public identifier.

OK. And I thought that the HTML5 algorithm would trigger quirks already 
when it saw the quirks triggering public identifier. I thought the same 
about Legacy Firefox. But I was wrong about both ...
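
As a rough illustration of the HTML5 behaviour Philip describes, here is 
a minimal Python sketch of what happens to the characters between the 
public identifier and the end of the doctype. It is a simplification 
under stated assumptions, not the spec's tokenizer: the function name 
and its single-string input are made up for this example, and it only 
answers whether this step would set the force-quirks flag.

```python
WHITESPACE = " \t\n\f\r"


def forces_quirks_after_public_id(rest: str) -> bool:
    """Hypothetical helper: `rest` is the doctype text that follows the
    closing quote of the public identifier, up to and including ">".

    Returns True if this step would set the force-quirks flag.
    """
    for ch in rest:
        if ch in WHITESPACE:
            continue  # whitespace is skipped
        if ch == ">":
            return False  # doctype ends here; nothing bogus was seen
        if ch in "\"'":
            return False  # a system identifier starts; handled elsewhere
        return True  # any other character, "[" included, is bogus
    return True  # input ended inside the doctype, which also sets the flag


# The two doctype tails quoted in this thread:
print(forces_quirks_after_public_id(" []>"))
print(forces_quirks_after_public_id(
    ' [url=http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">'))
```

Note that the sketch never looks at the public identifier itself, which 
is the point: under HTML5 the "[" forces quirks regardless of it.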

> In Firefox it seems to trigger some kind of 
> force-*non*-quirks flag, to override the quirkiness of the public 
> identifier.

Aha. I had not noticed that. So HTML5-enabled Firefox and Legacy 
Firefox have quite opposing algorithms for transitional HTML4.0 (but 
not quite the same for HTML4.01): plus became minus and minus became 
plus. In summary, for HTML4.01 doctypes:

(A) For the quirks-mode triggering transitional doctype above, 
HTML5-enabled Firefox is in "double quirks mode" because of the [], 
whereas in legacy Firefox the presence of [] untriggers quirks mode.

(B) For the non-quirks triggering strict doctype of my test page [*], 
HTML5-enabled Firefox triggers quirks mode whenever there is a [], 
whereas legacy Firefox is in "double non-quirks mode", since both the 
doctype and the [] cause non-quirks mode.

[*] http://målform.no/html4-or-html5/


However, HTML 4.01 doctypes complicate things again: if you change the 
transitional 4.0 doctype to a 4.01 doctype, then the Legacy Firefox 
behaviour differs again, both with regard to the effect of the system 
identifier, and with regard to the effect of "[]": 

* For 4.01, Firefox remains in standards mode regardless of the 
presence of "" and/or [].
* For 4.0, Firefox remains in quirks mode regardless, _except_ 
when you BOTH *remove* the system identifier AND *add* the "[]", which 
causes standards mode. 
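
The two observations above can be written down as a small lookup. This 
is only a sketch that encodes what I saw in my tests of the transitional 
doctypes, not anything taken from the Firefox source; the function name 
and parameters are invented for the illustration.

```python
def legacy_firefox_mode(version: str, has_system_id: bool,
                        has_brackets: bool) -> str:
    """Hypothetical summary of the Legacy Firefox behaviour reported in
    this message for HTML 4.0 / 4.01 transitional doctypes."""
    if version == "4.01":
        # Reported: standards mode regardless of system identifier and [].
        return "standards"
    if version == "4.0":
        # Reported: quirks mode, except when the system identifier is
        # removed AND "[]" is added.
        if not has_system_id and has_brackets:
            return "standards"
        return "quirks"
    raise ValueError("only the 4.0 and 4.01 observations are covered")


# The symptomresearch.nih.gov doctype: 4.0 Transitional, no system
# identifier, "[]" present.
print(legacy_firefox_mode("4.0", has_system_id=False, has_brackets=True))
```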

That the removal of the system identifier and the addition of [] 
triggers standards mode for the 4.0 transitional doctype is different 
from how Internet Explorer behaves. I think we should follow IE here.

Another thing to note is that for the 4.0 transitional doctype, the 
presence of [] does not make Legacy Firefox, HTML5 Firefox, Opera 
10.5beta, Opera 10.10 or any other browser trigger standards mode, 
even if there is a system identifier. I find this logical, because I 
think that user agents should simply ignore the presence of SGML 
comments or [] when they decide between standards and quirks mode. 

>>> Filed as http://www.w3.org/Bugs/Public/show_bug.cgi?id=9071

>> 
>> Wonder why we needed two bug reports for this ...
> 
> The other bug was about supporting an obscure SGMLism, 

From an "obscure SGMLism" point of view, then I can live with the 
current HTML5 algorithm: if the [] appears after the system identifier, 
then it doesn't trigger quirks. That saves my obscure SGMLism day.

> or about an 
> undefined set of "All DOCTYPE variants that trigger standards mode 
> pre-HTML5".

So what is your motivation for not agreeing with that goal?

> This one is just about a specific potential legacy 
> compatibility issue with broken content that happens to have a "[" 
> after the public identifier.

I wonder if your real motivation is that you want to simplify the issue 
of standards versus quirks. A kind of "we have, and must have, both 
standards and quirks, but what triggers/untriggers what, today is very 
unsystematic - so let's create a system".

This systematic system means that we are making quirksmode into a 
feature in itself.

If being able to set quirks mode is a feature in itself, then this is 
good. But if quirks mode is only about being compatible with existing 
content and the goal is that quirks mode shall one day stop being a 
problem, then it does not seem good. From that POV, we should only 
document existing behaviour and, where behaviours conflict, we should 
pick the dominant behaviour.

I think that my basic assumptions apply: the presence of "[]" when 
there is no system identifier does not trigger quirks mode, unless the 
doctype triggers quirks mode also without the []. The only exceptions 
we have found to this are the behaviour of Firefox w.r.t. 
HTML4.0Transitional and the behaviour of Opera when the "]" is missing.
-- 
leif halvard silli

Received on Friday, 19 February 2010 17:03:22 UTC