[whatwg] Parsing entities

On Mon, 14 Aug 2006, Simon Pieters wrote:
> 
> I guess that for compat with IE and the Web[1] we have to treat 
> "R&eacutesum&eacute" as if it were "Résumé". So how do we 
> handle "&noti;"? When the parser has come as far as "&not" it can't 
> return U+00AC yet because it could well be "&notin;". But when it has 
> reached "&noti;" then it can't be "&notin;", thus it returns U+00AC, but 
> then you also have to reparse the "i;", right? Unless I'm mistaken the 
> spec doesn't say anything about that.

Section 8.2.3.1 "Tokenising entities", under "Anything else", covers this: 
"Consume the maximum number of characters possible, with the consumed 
characters case-sensitively matching one of the identifiers in the first 
column of the entities table".
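
As a rough illustration of that rule, here is a minimal Python sketch of 
longest-match consumption. The three-entry ENTITIES table and the names 
consume_entity and decode are hypothetical, made up for this example, and 
the handling of the optional trailing semicolon is an assumption rather 
than a quotation from the spec. Because the consumer stops after the 
longest matching identifier, the "i;" in "&noti;" is never consumed and 
simply continues through the tokeniser as ordinary text, so no reparsing 
step is needed.

```python
# Hypothetical three-entry subset of the entities table, for illustration only.
ENTITIES = {
    "not": "\u00ac",     # NOT SIGN
    "notin": "\u2209",   # NOT AN ELEMENT OF
    "eacute": "\u00e9",  # LATIN SMALL LETTER E WITH ACUTE
}
MAX_NAME_LEN = max(len(name) for name in ENTITIES)


def consume_entity(text, pos):
    """Try to consume an entity whose '&' is at text[pos].

    Returns (character, next_pos), or (None, pos) when no identifier matches.
    """
    start = pos + 1
    # Try the longest candidate first so that "notin" wins over "not".
    for length in range(MAX_NAME_LEN, 0, -1):
        name = text[start:start + length]
        # The match is case-sensitive: plain dictionary lookup, no folding.
        if len(name) == length and name in ENTITIES:
            end = start + length
            # Assumption: a trailing semicolon is swallowed when present.
            if text[end:end + 1] == ";":
                end += 1
            return ENTITIES[name], end
    return None, pos


def decode(text):
    """Replace every recognised entity in text; leave everything else alone."""
    out = []
    pos = 0
    while pos < len(text):
        if text[pos] == "&":
            char, next_pos = consume_entity(text, pos)
            if char is not None:
                out.append(char)
                pos = next_pos
                continue
        out.append(text[pos])
        pos += 1
    return "".join(out)
```

With this sketch, decode("&noti;") gives U+00AC followed by the literal 
text "i;", while decode("&notin;") gives U+2209 alone.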

HTH,
-- 
Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'

Received on Monday, 14 August 2006 13:24:54 UTC