[whatwg] Entity parsing

From: Ian Hickson <ian@hixie.ch>
Date: Fri, 23 May 2008 02:50:23 +0000 (UTC)
Message-ID: <Pine.LNX.4.62.0805230245590.12911@hixie.dreamhostps.com>
On Thu, 28 Jun 2007, Øistein E. Andersen wrote:
> 
> 1) Is it useful to handle unterminated entities followed by an 
> alphanumerical character like IE does? The number of documents for which 
> this actually helps might be small compared to the number of documents 
> that contain other, incorrigible errors. The process also introduces 
> errors, albeit not in conforming documents. Is the gain worth the added 
> complexity?
> 
> If so, then should this apply to all entities? (Probably not.) Would it 
> be useful to add to/remove from the set supported by IE7? (This may seem 
> insane, but we should try to avoid premature decisions.)
> 
> 2) HTML 4.01 allows the semicolon to be omitted in certain cases. Does 
> this cause problems? Firefox and Safari both support this, and it would 
> seem meaningless to change the way conforming documents are parsed 
> unless it can be shown that, e.g., "&ndash " actually is supposed to 
> mean "&amp;ndash " more often than "&ndash; ". (Conformance is a 
> separate issue.)
> 
> 3) Will new entities ever be needed? If yes, can new entities adopt 
> existing conformance criteria and parsing rules?
> 
> 4) Similar considerations for entities in attribute values.

New entities have since been added, and the rules for parsing entities 
(sorry, "named character references") have been changed a bit. However, I 
am reluctant to change the current parsing behaviour any further, since 
it works well. How strongly do you feel about this?
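
For readers following the thread, the two behaviours asked about in points 1 
and 2 can be illustrated with a rough sketch. This is not the algorithm in 
the spec or in any browser; the tiny entity table, the SEMICOLON_OPTIONAL 
set, the decode() function and its allow_alnum_follow flag are all 
hypothetical names made up for the example, and "ndash" is assumed to 
require a semicolon purely to show the contrast.

```python
import re

# Hypothetical, deliberately tiny tables; real tables are far larger.
ENTITIES = {"amp": "&", "copy": "\u00a9", "ndash": "\u2013"}
# Assumption for this sketch: only these names may omit the semicolon.
SEMICOLON_OPTIONAL = {"amp", "copy"}

# An ampersand, a run of alphanumerics, and an optional semicolon.
_REF = re.compile(r"&([A-Za-z][A-Za-z0-9]*)(;?)")


def decode(text, allow_alnum_follow=False):
    """Decode named references under two illustrative policies.

    allow_alnum_follow=False: an unterminated name is decoded only when
    the whole alphanumeric run is a semicolon-optional name (point 2).
    allow_alnum_follow=True: additionally, the longest semicolon-optional
    prefix of the run is decoded even when more alphanumerics follow
    (the IE-like behaviour asked about in point 1).
    """
    def replace(match):
        run, semi = match.group(1), match.group(2)
        if semi and run in ENTITIES:
            return ENTITIES[run]
        if not semi and run in SEMICOLON_OPTIONAL:
            return ENTITIES[run]
        if allow_alnum_follow:
            # Try the longest proper prefix first.
            for end in range(len(run) - 1, 0, -1):
                name = run[:end]
                if name in SEMICOLON_OPTIONAL:
                    return ENTITIES[name] + run[end:] + semi
        # Not recognised: leave the text untouched.
        return match.group(0)

    return _REF.sub(replace, text)
```

Under these assumptions, decode("&copy 2008") decodes the reference in both 
modes, decode("&copyright") is left alone unless allow_alnum_follow is set, 
and decode("a &ndash b") is never decoded because "ndash" is not in the 
semicolon-optional set. Moving a name into that set is exactly the kind of 
policy choice the questions above are about.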

-- 
Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'
Received on Thursday, 22 May 2008 19:50:23 UTC