- From: Øistein E. Andersen <html5@xn--istein-9xa.com>
- Date: Thu, 28 Jun 2007 04:53:09 +0200
On 28 Jun 2007, at 12:43AM, Ian Hickson wrote:

> Sadly none of the arguments in any direction right now are particularly
> persuasive.

Indeed.

> I'm not really convinced that the data that the above proposed survey
> might collect would actually help, since it doesn't tell us the what was
> intended by the author.

To a certain extent, this depends on the results. Some conclusions can
be drawn without knowing the author's intent at all: if, for instance,
"&foo[^;]" is exceedingly rare, then what the author meant does not
really matter, since the construct does not need to be supported anyway.
I also tend to think that entities that are part of existing words are
highly likely to be intended for expansion. Of course, 100% accuracy
cannot be achieved, but it is not needed for the results to be useful.

> Am I correct in assuming that you would like the spec changed? What would
> you like the spec changed to, exactly?

I would really like an informed decision. I currently get the impression
that rules are changed to follow IE by default rather than to handle
existing content, which may lead to unnecessarily complicated rules that
do not actually handle existing documents optimally.

More specifically, some of the points that probably should be addressed
are the following:

1) Is it useful to handle unterminated entities followed by an
   alphanumeric character the way IE does? The number of documents for
   which this actually helps might be small compared to the number of
   documents that contain other, incorrigible errors. The process also
   introduces errors, albeit not in conforming documents. Is the gain
   worth the added complexity? If so, should this apply to all
   entities? (Probably not.) Would it be useful to add to or remove from
   the set supported by IE7? (This may seem insane, but we should try to
   avoid premature decisions.)

2) HTML 4.01 allows the semicolon to be omitted in certain cases. Does
   this cause problems? Firefox and Safari both support this, and it
   would seem meaningless to change the way conforming documents are
   parsed unless it can be shown that, e.g., "&ndash " actually is
   supposed to mean "&amp;ndash " more often than "– ". (Conformance is
   a separate issue.)

3) Will new entities ever be needed? If yes, can new entities adopt
   existing conformance criteria and parsing rules?

4) Similar considerations apply to entities in attribute values.

-- 
Øistein E. Andersen
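
For illustration only, the kind of survey discussed above could be sketched roughly as follows. This is a minimal sketch, not a proposed methodology: it assumes the HTML 4.01 entity names shipped with Python (`html.entities.name2codepoint`), and the function name `classify_entities` and the four category labels are hypothetical. It counts how each "&name" is followed, and says nothing about what the author intended.

```python
import re
from collections import Counter
from html.entities import name2codepoint  # HTML 4.01 entity names, e.g. "ndash"

# "&", then a run of letters/digits, then an optional terminating semicolon.
ENTITY_RE = re.compile(r"&([A-Za-z][A-Za-z0-9]*)(;?)")

def classify_entities(text):
    """Count entity-like constructs by how they are terminated (hypothetical helper)."""
    counts = Counter()
    for match in ENTITY_RE.finditer(text):
        run, semicolon = match.group(1), match.group(2)
        if run in name2codepoint:
            # The whole alphanumeric run is a known entity name.
            kind = "terminated" if semicolon else "unterminated_before_other"
        elif any(run[:i] in name2codepoint for i in range(len(run) - 1, 0, -1)):
            # A known name is immediately followed by more letters or digits,
            # e.g. "&copyright" starts with "copy".
            kind = "unterminated_before_alnum"
        else:
            kind = "unknown"
        counts[kind] += 1
    return counts

if __name__ == "__main__":
    sample = "AT&amp;T &ndash done &copyright &foo; x&ndash;y"
    for kind, count in sorted(classify_entities(sample).items()):
        print(kind, count)
```

Run over a large corpus, the relative sizes of the "unterminated" buckets would give a first indication of whether points 1) and 2) matter in practice.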
Received on Wednesday, 27 June 2007 19:53:09 UTC