
[whatwg] Valid Unicode

From: Henri Sivonen <hsivonen@iki.fi>
Date: Sun, 3 Dec 2006 11:40:45 +0200
Message-ID: <7A71F248-5575-4134-B1CA-C50DD5820F83@iki.fi>
On Dec 3, 2006, at 03:47, Sam Ruby wrote:

>> What I am advocating is making sure that *conforming* HTML5 documents
>> can be serialized as XHTML5 without dataloss.
>
> Then you will also need to disallow newlines in attribute values.

I believe that is not the case. A newline written as a character
reference (e.g. &#xA;) survives attribute-value normalization. See the
last line of the table at the end of section 3.3.3 in the XML 1.0 spec.
http://www.w3.org/TR/REC-xml/#AVNormalize
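
As a rough illustration of that normalization rule, here is a minimal
Python sketch. It assumes the Expat-based parser behind Python's
standard xml.etree.ElementTree; the element name "p", the attribute
name "title", and the helper functions are made up for the example.

```python
import xml.etree.ElementTree as ET


def attr_value(xml_text: str, name: str = "title") -> str:
    """Parse a one-element document and return the named attribute."""
    return ET.fromstring(xml_text).get(name)


def round_trip(value: str) -> str:
    """Serialize an attribute value with ElementTree and parse it back."""
    elem = ET.Element("p", {"title": value})
    return attr_value(ET.tostring(elem, encoding="unicode"))


# A literal newline inside the attribute value is normalized to a space.
literal = attr_value('<p title="line1\nline2"/>')

# A newline written as a character reference is kept as a newline.
referenced = attr_value('<p title="line1&#xA;line2"/>')

# The serializer is expected to emit a character reference for the
# newline, so the value should come back unchanged.
restored = round_trip("line1\nline2")

print(repr(literal))
print(repr(referenced))
print(repr(restored))
```

So a serializer only has to write newlines in attribute values as
character references; it does not have to forbid them.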

(Note that if some of this doesn't currently work in Gecko, Gecko has
a bug. Expat does the XML-compliant thing but then nsExpatDriver runs
whitespace normalization again, which is bogus.
https://bugzilla.mozilla.org/show_bug.cgi?id=343870
It doesn't make sense to fix it until bug 18333 has landed.)

> In any case, I understand the desire; my read is that the WG's desire
> for backwards compatibility is higher.  Limiting the character set to
> the allowable XML 1.1 character set should not be a problem for
> backwards compatibility purposes.

XML 1.1 doesn't really solve anything in this area. XML 1.1 is part  
of the problem. It creates incompatibility in corner cases without  
compelling benefits. The real XML that is known to work with any "XML  
tool chain" is XML 1.0.

I should point out that HTML5 proclaims non-conforming some things  
that no doubt exist on the Web and are far more common than form  
feeds. You can't even achieve any useful effect by including a form  
feed in HTML.
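
For reference, a minimal Python sketch of why the form feed (U+000C)
is the sticking point: it falls outside the Char production of XML
1.0, so it cannot appear literally in an XML 1.0 document. The helper
names are hypothetical, and the parse check assumes the Expat-based
parser in Python's standard library.

```python
import xml.etree.ElementTree as ET

FORM_FEED = "\x0c"


def is_xml10_char(ch: str) -> bool:
    """Return True if ch matches the Char production of XML 1.0."""
    cp = ord(ch)
    return (
        cp in (0x9, 0xA, 0xD)
        or 0x20 <= cp <= 0xD7FF
        or 0xE000 <= cp <= 0xFFFD
        or 0x10000 <= cp <= 0x10FFFF
    )


def parses(xml_text: str) -> bool:
    """Return True if the text parses as well-formed XML."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


print(is_xml10_char(FORM_FEED))          # form feed is not an XML 1.0 Char
print(is_xml10_char("\n"))               # newline is
print(parses("<p>" + FORM_FEED + "</p>"))  # a literal form feed is rejected
```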

-- 
Henri Sivonen
hsivonen at iki.fi
http://hsivonen.iki.fi/
Received on Sunday, 3 December 2006 01:40:45 UTC
