- From: Ian Hickson <ian@hixie.ch>
- Date: Wed, 27 Jun 2007 23:43:39 +0000 (UTC)
On Thu, 28 Jun 2007, Øistein E. Andersen wrote:
> >
> > I don't really know how to do more research -- it's quite hard to
> > programmatically tell when an entity should be expanded and when it
> > shouldn't.
>
> True, but this is not completely insurmountable -- or, rather: useful
> information can be extracted without necessarily making these decisions
> explicitly.
>
> I do not know what you have done already, but something like the following
> for each entity &ref; would be useful for the discussion:
> - total number of "&ref";
> - number of "&ref;";
> - number of "&ref" followed by /[a-zA-Z0-9]/;
> - the N most frequent matches of /[a-zA-Z0-9]*&ref[a-zA-Z0-9&]+/.
>
> Without any real data, arguing, e.g., that conforming HTML 4.01
> documents that are currently handled correctly by Firefox and Safari
> must be handled differently in the future for the sake of backwards
> compatibility is not really persuasive.

Sadly, none of the arguments in any direction right now are particularly
persuasive. I'm not really convinced that the data the proposed survey
might collect would actually help, since it doesn't tell us what was
intended by the author. You'd be surprised at how often people use
ampersands in text in ways that have nothing to do with entities but
which could get interpreted as entities.

> The implication seems to be that Résumé can be found on the Web
> and therefore should be supported. But Google also tells us something else:
>
> (1) "résum??": 572
> (2) +r??sum??: 114,000,000
> (3) résumé -"résumés": 16,300
> (4) +"r????sum????": 1,000
>
> Actually, (1) does not only cover résumé, but also code like
> r&eacutesum??, so the number of occurrences that can be saved by
> parser quirks is lower than 572.

The number of occurrences of "résum??" is at least two (the two hits I
looked at both worked in IE and did not in Firefox).

Am I correct in assuming that you would like the spec changed?
What would you like the spec changed to, exactly?

--
Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'
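
For reference, the four per-entity counts proposed in the quoted message
could be gathered with something like the following. This is only a
sketch under assumptions: the function name survey_entity, the result
keys and the sample string are made up for illustration and do not come
from the thread. As noted above, such counts say nothing about what the
author of a page intended.

```python
import re
from collections import Counter


def survey_entity(text, ref, top_n=5):
    """Collect the four proposed counts for one entity name (hypothetical helper)."""
    name = re.escape(ref)
    # Total number of "&ref", whatever follows it.
    total = len(re.findall("&" + name, text))
    # Number of properly terminated "&ref;".
    terminated = len(re.findall("&" + name + ";", text))
    # Number of "&ref" directly followed by a letter or digit.
    followed = len(re.findall("&" + name + "(?=[a-zA-Z0-9])", text))
    # The N most frequent matches of /[a-zA-Z0-9]*&ref[a-zA-Z0-9&]+/.
    contexts = Counter(
        re.findall("[a-zA-Z0-9]*&" + name + "[a-zA-Z0-9&]+", text)
    )
    return {
        "total": total,
        "terminated": terminated,
        "followed_by_alnum": followed,
        "top_contexts": contexts.most_common(top_n),
    }


# Made-up sample text; a real survey would run over a crawl of pages.
sample = "r&eacute;sum&eacute; r&eacutesum&eacute AT&T caf&eacute;"
print(survey_entity(sample, "eacute"))
```

The lookahead in the third pattern keeps the following character out of
the match, so only the "&ref" occurrences themselves are counted.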
Received on Wednesday, 27 June 2007 16:43:39 UTC