Re: Other syntax: part of my review of 8 The HTML syntax from Robert Burns on 2007-08-21 (public-html@w3.org from August 2007)

From: Robert Burns <rob@robburns.com>
Date: Mon, 20 Aug 2007 23:49:30 -0500
To: Anne van Kesteren <annevk@opera.com>
Cc: "HTMLWG WG" <public-html@w3.org>
Message-Id: <D76F9A66-B966-437A-96B1-CBBAC87AB6D6@robburns.com>

Hi Anne,

On Aug 20, 2007, at 5:01 AM, Anne van Kesteren wrote:

>
> On Thu, 16 Aug 2007 05:52:12 +0200, Robert Burns <rob@robburns.com>  
> wrote:
>> The other points remain viable. In particular by specifying that  
>> XML processed HTML5 documents should not throw up error pages when  
>> encountering an unknown character reference (like  
>> &madeupreference;), the current trend among implementations is to
>> treat that as a fatal error, which needlessly breaks many web
>> pages. If we could address that it would be a big deal.
>
> It seems out of scope for the HTML WG to define how to parse XML.  
> (The point where you know you deal with HTML is typically after the  
> parser level when elements are inserted into the tree at which  
> point you can not deal with well-formedness problems the parser  
> might throw up, etc.)

Since any XML application such as XHTML may define NCNames for
elements, attributes and, in this case, entity references, this is
necessarily an issue defined above the XML parser. HTML already
defines many entity references for common characters in the extended
Latin and Greek alphabets, as well as many mathematical and other
symbols. Other XML applications also define NCName wildcards for
things such as attributes. For example, XForms allows any attribute
name on its elements. What I am suggesting is simply the same thing,
but for entity references. Basically, we would define an
anyEntityReference and map all of those to the Unicode replacement
character (U+FFFD). This would prevent many needless fatal errors
that serve no purpose for authors and users of HTML.
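
As a rough illustration of the behaviour I have in mind (a sketch
only, not proposed spec text), the mapping can be approximated as a
pass over the markup before it reaches a strict XML parser. The
function name resolve_entity_references and the regular expression
are made up for this example, and Python's HTML 4 entity table stands
in for whatever set of names HTML5 ends up defining.

```python
import re
from html.entities import name2codepoint  # HTML 4 named entities (assumed stand-in)

# Entities every XML parser already understands; leave them untouched.
XML_PREDEFINED = {"amp", "lt", "gt", "quot", "apos"}
REPLACEMENT = "\uFFFD"  # Unicode replacement character

# Simplified name pattern: matches &name; but not numeric &#...; references.
ENTITY_RE = re.compile(r"&([A-Za-z_][A-Za-z0-9._-]*);")


def resolve_entity_references(text):
    """Expand known HTML entity references; map unknown ones to U+FFFD."""

    def substitute(match):
        name = match.group(1)
        if name in XML_PREDEFINED:
            return match.group(0)  # the XML parser handles these itself
        if name in name2codepoint:
            return chr(name2codepoint[name])  # known HTML entity
        return REPLACEMENT  # hypothetical "anyEntityReference" fallback

    return ENTITY_RE.sub(substitute, text)
```

With this in place, markup containing &madeupreference; no longer
carries an undefined entity by the time an XML parser sees it, so the
page renders with a replacement character instead of an error page.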

Take care,
Rob

Received on Tuesday, 21 August 2007 04:49:52 UTC