- From: Ian Hickson <ian@hixie.ch>
- Date: Thu, 4 Jun 2009 23:49:04 +0000 (UTC)
On Fri, 24 Apr 2009, Øistein E. Andersen wrote:
>
> When a named character reference is followed by a semicolon, it clearly
> has to be expanded, but how to handle non-semicolon-terminated character
> references is less obvious.
>
> Let &IE4 (resp. &HTML4, &HTML5) be a non-semicolon-terminated named
> character reference from the IE4 (resp. HTML4, HTML5) set, and let .
> (full stop) represent any character other than semicolon, and ^
> (circumflex) any character which is (roughly) not an ASCII letter or
> digit (i.e., [^a-zA-Z0-9]). Not completely unreasonable sets of
> character references to expand (outside of attribute values) include:
>
> 1) &IE4^
> 2) &IE4.
> 3) &HTML4^
> 4) &IE4. &HTML4^
> 5) &HTML4.
> 6) &IE4. &HTML5^
> 7) &HTML4. &HTML5^
> 8) &HTML5.
>
> (The set of character references to be expanded in attribute values
> could be obtained by replacing . by ^ above.)
>
> Currently, Opera follows 1), IE 2), and Safari and Firefox 3).
>
> My main concern is that &HTML4^ is actually legitimate in HTML4 and
> works in both Safari and Firefox today, and that HTML5 should not change
> the rendering of valid HTML4 pages unless there is a good reason to do
> so.

Could you give an example of what you mean? I'm having trouble following
your description above. As far as I can tell HTML5 more or less matches
what legacy pages need, but if there are specific entities that should be
parsed in a different way than HTML5 says they should, I'm happy to fix
this.

-- 
Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'
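
For readers following the notation in the quoted message, here is a rough Python sketch of how policies 1) to 3) might differ on the same input. It is not taken from any browser or from the HTML5 draft. The `IE4` and `HTML4` tables are tiny hypothetical stand-ins for the real entity sets, the names `expand`, `follows`, `OPERA`, `IE` and `SAFARI_FIREFOX` are made up for illustration, and end of input is assumed to count as a valid terminator.

```python
# Hypothetical, deliberately tiny stand-ins for the real entity tables.
IE4 = {"amp": "&", "lt": "<", "gt": ">", "copy": "\u00a9"}
HTML4 = {**IE4, "alpha": "\u03b1", "hellip": "\u2026"}


def follows(kind, nxt):
    """Check the character after an unterminated reference.

    '.' accepts any character other than ';'.
    '^' accepts only characters outside [a-zA-Z0-9].
    End of input (nxt == "") is assumed to be acceptable for both.
    """
    if nxt == ";":
        return False
    if kind == ".":
        return True
    return nxt == "" or not (nxt.isascii() and nxt.isalnum())


def expand(text, policy, known=HTML4):
    """Expand named character references in text.

    policy is a list of (entity_table, kind) pairs such as [(IE4, ".")].
    Semicolon-terminated references from `known` are always expanded.
    """
    out, i = [], 0
    while i < len(text):
        if text[i] != "&":
            out.append(text[i])
            i += 1
            continue
        match = None
        # Semicolon-terminated references are expanded under every policy.
        for name in sorted(known, key=len, reverse=True):
            if text.startswith(name + ";", i + 1):
                match = (known[name], len(name) + 2)
                break
        # Otherwise try each (table, kind) rule, longest name first.
        if match is None:
            for table, kind in policy:
                for name in sorted(table, key=len, reverse=True):
                    end = i + 1 + len(name)
                    if text.startswith(name, i + 1) and follows(kind, text[end:end + 1]):
                        match = (table[name], len(name) + 1)
                        break
                if match:
                    break
        if match:
            out.append(match[0])
            i += match[1]
        else:
            out.append("&")
            i += 1
    return "".join(out)


OPERA = [(IE4, "^")]             # policy 1) &IE4^
IE = [(IE4, ".")]                # policy 2) &IE4.
SAFARI_FIREFOX = [(HTML4, "^")]  # policy 3) &HTML4^

sample = "AT&amp T, &copyright, &alpha+&alphabet"
for label, policy in [("1", OPERA), ("2", IE), ("3", SAFARI_FIREFOX)]:
    print(label, expand(sample, policy))
```

Under these assumed tables, only policy 2) turns `&copyright` into `©right`, and only policy 3) expands `&alpha+`, which is the kind of HTML4-legitimate input the quoted message is concerned about. The combined policies 4), 6) and 7) can be expressed by passing two rules, for example `[(IE4, "."), (HTML4, "^")]`.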
Received on Thursday, 4 June 2009 16:49:04 UTC