- From: Øistein E. Andersen <html5@xn--istein-9xa.com>
- Date: Tue, 26 Jun 2007 02:50:39 +0200
On 25 Jun 2007, at 8:28AM, Ian Hickson wrote:

> On Sun, 24 Jun 2007, Øistein E. Andersen wrote:
>
>> HTML5 currently follows IE7 much more closely than Safari,
>> Firefox and Opera do, which seems to suggest that some of the quirks
>> could be dispensed with.
>
> It's possible, though people kept pointing out problems, which is how we
> ended up where we are now.

I have probably missed parts of this discussion, but most of the arguments I have seen seem to rely on the assumption that whatever IE does is more compatible with the Web as it is, which is probably a good approximation, but replicating each single detail is not necessarily the best thing to do.

> Calling SGML "sensible" is a slippery slope! :-)

Sure, I did not mean to imply that all aspects of SGML are sensible :-) (Bad connotations aside, SGML's rules for optional semicolons happen to be less contrived than IE's.)

>> [It might be a good idea to accept a missing semicolon at the end of words.]
>
> Well, we'd have to prove this somehow with real research.

Yes, research is really missing here. Whatever we do, some pages will break, and it is not a priori impossible that a compromise of IE and SGML rules may be less quirky and more compatible with existing content at the same time.

I am unable to do a proper corpus study on this, but the following examples suggest that following IE blindly may not be optimal. All markup is extracted from real Web pages, and the author's intent was quite obvious from the context. The numbers in parentheses indicate the number of pages found using Google.

I] Should be expanded

1) only SGML expands
   &mdash
   IE (incorrect): &mdash
   SGML (correct): —

2) only IE expands
   fianc&eacutee (390), caf&eacutes (1,460), na&iumlve (716)
   IE (correct): fiancée, cafés, naïve
   SGML (incorrect): fianc&eacutee, caf&eacutes, na&iumlve

3) neither expands
   &oeliguvre (719), c&oeligur (3,720)
   both (incorrect): &oeliguvre, c&oeligur
   intended: œuvre, cœur

II] Should not be expanded

1) IE expands
   moral&ethics, roses&thorns
   IE (incorrect): moralðics, rosesþs
   SGML (correct): moral&ethics, roses&thorns

2) SGML expands
   Alpha&Omega, once&forall
   IE (correct): Alpha&Omega, once&forall
   SGML (incorrect): AlphaΩ, once∀

3) both expand
   rose&thorn
   both (incorrect): roseþ
   intended: rose&thorn

The examples I have found in category II] are all quite rare, but it is not unlikely that more common ones exist.

Opera and Google both seem to err on the side of caution by only expanding entities when both IE and SGML do, i.e., in case II.3) above.

It is also interesting to notice that reasonably common words belonging to class I.2), which are handled by IE, are apparently no more frequent than words from I.3), which no (popular) current browser handles correctly.

I am looking forward to seeing more extensive research on this.

-- 
Øistein E. Andersen
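
For anyone who wants to experiment with the six cases, here is a minimal Python sketch of the three behaviours compared in this message. It is only an approximation under stated assumptions, not a description of any browser's actual parser. The function names, the eight-entry entity table and the `LEGACY` set (the names the IE-like rule expands without a semicolon) are all hypothetical and chosen to cover just the examples discussed here.

```python
import re

# Illustrative subset of the HTML 4 entity table (hypothetical selection).
ENTITIES = {
    "mdash": "\u2014", "oelig": "\u0153", "eacute": "\u00e9", "iuml": "\u00ef",
    "eth": "\u00f0", "thorn": "\u00fe", "Omega": "\u03a9", "forall": "\u2200",
}
# Assumption: names the IE-like rule expands even without a semicolon.
LEGACY = {"eacute", "iuml", "eth", "thorn"}

TOKEN = re.compile(r"&([A-Za-z][A-Za-z0-9]*)(;?)")


def expand_sgml(text):
    """SGML-like: the name is the whole run of name characters; ';' is optional."""
    def repl(m):
        return ENTITIES.get(m.group(1), m.group(0))
    return TOKEN.sub(repl, text)


def expand_ie(text):
    """IE-like: any known name with ';', else the longest LEGACY prefix."""
    def repl(m):
        run, semi = m.group(1), m.group(2)
        if semi and run in ENTITIES:
            return ENTITIES[run]
        for name in sorted(LEGACY, key=len, reverse=True):
            if run.startswith(name):
                # Expand the prefix and keep the remaining characters as text.
                return ENTITIES[name] + run[len(name):] + semi
        return m.group(0)
    return TOKEN.sub(repl, text)


def expand_cautious(text):
    """Expand a reference only where both rules above give the same result."""
    def repl(m):
        token = m.group(0)
        a, b = expand_sgml(token), expand_ie(token)
        return a if a == b else token
    return TOKEN.sub(repl, text)


if __name__ == "__main__":
    for sample in ["&mdash", "fianc&eacutee", "c&oeligur",
                   "moral&ethics", "Alpha&Omega", "rose&thorn"]:
        print(sample, "|", expand_ie(sample), "|", expand_sgml(sample),
              "|", expand_cautious(sample))
```

Running the script prints, for each sample, the IE-like, SGML-like and cautious results side by side. Under these assumptions the cautious rule changes only `rose&thorn`, which matches the observation about case II.3).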
Received on Monday, 25 June 2007 17:50:39 UTC