Case folding (lack thereof) in the doctype name state

From: Henri Sivonen <hsivonen@iki.fi>
Date: Wed, 15 Oct 2008 15:56:56 +0300
Message-Id: <67D98CB0-9F6B-4F73-97FF-00DD63C1C806@iki.fi>
To: HTML WG <public-html@w3.org>

Is there a reason why the doctype name isn't case-folded in the  
tokenizer like element and attribute names?

Of the APIs I've tried to map HTML5 to, so far both APIs that  
distinguish between interned and non-interned strings (Java SAX and  
Gecko internal APIs) treat the doctype name as an interned string.  
Doing a case-insensitive compare of interned strings in the tree  
builders goes against the point of having interned strings for names.  
It's doable of course, but the exceptional treatment of this  
particular name is weird (and for a portable parser, requires yet  
another one-off comparison method in the portability layer).
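
To illustrate the trade-off described above, here is a minimal Java sketch. It assumes a tokenizer that ASCII-lowercases the doctype name once and interns it, so that the tree builder can compare by reference. The class and method names (DoctypeNameFolding, foldAndIntern, isHtmlDoctype, isHtmlDoctypeUnfolded) are hypothetical and are not taken from any real parser or from the spec.

```java
final class DoctypeNameFolding {
    // Hypothetical tokenizer-side helper: fold ASCII upper-case letters to
    // lower case, then intern the result so later checks can use ==.
    static String foldAndIntern(CharSequence rawName) {
        StringBuilder sb = new StringBuilder(rawName.length());
        for (int i = 0; i < rawName.length(); i++) {
            char c = rawName.charAt(i);
            if (c >= 'A' && c <= 'Z') {
                c += 0x20; // ASCII-only fold; other characters pass through
            }
            sb.append(c);
        }
        return sb.toString().intern();
    }

    // Tree-builder-side check when the name was folded and interned:
    // a reference comparison against the interned literal is enough.
    static boolean isHtmlDoctype(String internedName) {
        return internedName == "html";
    }

    // Tree-builder-side check when the tokenizer did not fold: this is the
    // one-off case-insensitive comparison the post objects to.
    static boolean isHtmlDoctypeUnfolded(String name) {
        return "html".equalsIgnoreCase(name);
    }

    public static void main(String[] args) {
        String folded = foldAndIntern("HtMl");
        System.out.println(isHtmlDoctype(folded));
        System.out.println(isHtmlDoctypeUnfolded("HtMl"));
    }
}
```

With folding done in the tokenizer, the doctype name can be handled like any other interned element or attribute name. Without it, the tree builder needs the separate case-insensitive path shown in isHtmlDoctypeUnfolded.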

Live DOM Viewer situation:
  * Firefox 3.0.3 does not fold bogus names in the tokenizer but folds  
correct names to upper-case "HTML".
  * WebKit 35752 doesn't fold.
  * Opera 9.60 doesn't show a doctype node at all.
  * IE8b2 inserts a bogus comment.

Since IE and Opera get away with not inserting a proper doctype node  
and I found no fixed bugs on this topic on b.m.o or bugs.webkit.org,  
I suspect folding vs. later case-insensitive compare isn't an  
interop-sensitive thing considering existing content.

Henri Sivonen
Received on Wednesday, 15 October 2008 12:57:38 UTC
