- From: Robert Burns <rob@robburns.com>
- Date: Mon, 20 Aug 2007 23:49:30 -0500
- To: Anne van Kesteren <annevk@opera.com>
- Cc: "HTMLWG WG" <public-html@w3.org>
Hi Anne,

On Aug 20, 2007, at 5:01 AM, Anne van Kesteren wrote:

> On Thu, 16 Aug 2007 05:52:12 +0200, Robert Burns <rob@robburns.com>
> wrote:
>> The other points remain viable. In particular by specifying that
>> XML processed HTML5 documents should not throw up error pages when
>> encountering an unknown character reference (like
>> &madeupreference;), the current trends among implementations is to
>> treat that as a fatal error and therefore needless breaks many web
>> pages. If we could address that it would be a big deal.
>
> It seems out of scope for the HTML WG to define how to parse XML.
> (The point where you know you deal with HTML is typically after the
> parser level when elements are inserted into the tree at which
> point you can not deal with well-formedness problems the parser
> might throw up, etc.)

Since any XML application such as XHTML may define NCNames for elements, attributes and, in this case, entity references, this is necessarily an issue defined above the XML parser. HTML already defines many entity references for common characters in the extended Latin and Greek alphabets, as well as many mathematical and other symbols.

Other XML applications also define NCName wildcards for things such as attributes. For example, XForms allows any attribute name on its elements. What I am suggesting is simply the same thing for entity references. Basically, we would define an anyEntityReference and map every otherwise undefined entity reference to the Unicode replacement character (U+FFFD). This would prevent many needless fatal errors that serve no purpose for authors and users of HTML.

Take care,
Rob
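
For illustration only, the anyEntityReference idea might be sketched in Python roughly as follows. This is not taken from any specification or implementation: the function name `resolve_entity_refs` and the regular expression are hypothetical, the known names come from Python's HTML 4 entity table, and the text-level pass ignores CDATA sections and comments, which a real processor would have to respect.

```python
import re
import xml.etree.ElementTree as ET
from html.entities import name2codepoint  # HTML 4 entity names -> code points

# The five entities every XML parser already understands.
XML_PREDEFINED = {"amp", "lt", "gt", "quot", "apos"}

# Simplified pattern for a named entity reference (ASCII names only).
ENTITY_REF = re.compile(r"&([A-Za-z_][A-Za-z0-9._-]*);")


def resolve_entity_refs(source: str) -> str:
    """Rewrite named entity references so a plain XML parser accepts them.

    Known HTML names become numeric character references; any other
    name is mapped to U+FFFD, the Unicode replacement character.
    """
    def substitute(match: re.Match) -> str:
        name = match.group(1)
        if name in XML_PREDEFINED:
            return match.group(0)          # leave &amp; and friends alone
        codepoint = name2codepoint.get(name)
        if codepoint is not None:
            return f"&#{codepoint};"       # known HTML entity
        return "\uFFFD"                    # the "anyEntityReference" fallback

    return ENTITY_REF.sub(substitute, source)


if __name__ == "__main__":
    doc = "<p>caf&eacute; &madeupreference; &amp;</p>"
    print(ET.fromstring(resolve_entity_refs(doc)).text)
```

Without the rewriting step, `ET.fromstring` rejects the same input with a parse error because `&madeupreference;` is undefined, which is the fatal-error behaviour described above.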
Received on Tuesday, 21 August 2007 04:49:52 UTC