Re: the document character set for text/html serialization from Julian Reschke on 2007-09-09 (public-html@w3.org from September 2007)

From: Julian Reschke <julian.reschke@gmx.de>
Date: Sun, 09 Sep 2007 16:11:34 +0200
To: Robert Burns <rob@robburns.com>
CC: HTML Working Group <public-html@w3.org>
Message-ID: <46E3FF16.20402@gmx.de>

Robert Burns wrote:
> ...
> I see now that XML 1.1 permits all of these control characters as part
> of the document character set; however, all of these ASCII control
> characters must be included only as character references in XML 1.1.
> That leaves only the issues of surrogates and whitespace handling for
> these characters (if any: e.g., U+000B, U+000C, and U+0085). Though I think
> our WG's practice of finding use cases for a feature before including it
> is apt here too. Is being compatible with XML 1.1 enough of a use case?
> How would authors use these characters?
> ...
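
For reference, the XML 1.1 rule described in the quoted text can be sketched roughly as follows. This is only an illustration, assuming the Char and RestrictedChar productions of XML 1.1; the function names are hypothetical and not taken from any library.

```python
# Rough sketch of the XML 1.1 character classes (Char and RestrictedChar).
# All function names here are made up for illustration.

def is_xml11_char(cp: int) -> bool:
    """True if the code point belongs to the XML 1.1 document character set.

    NUL, the surrogate range, U+FFFE and U+FFFF are excluded.
    """
    return (
        0x1 <= cp <= 0xD7FF
        or 0xE000 <= cp <= 0xFFFD
        or 0x10000 <= cp <= 0x10FFFF
    )


def is_xml11_restricted(cp: int) -> bool:
    """True if the code point may appear only as a character reference."""
    return (
        0x1 <= cp <= 0x8
        or 0xB <= cp <= 0xC
        or 0xE <= cp <= 0x1F
        or 0x7F <= cp <= 0x84
        or 0x86 <= cp <= 0x9F
    )


def classify(cp: int) -> str:
    """Return 'forbidden', 'reference-only', or 'literal' for a code point."""
    if not is_xml11_char(cp):
        return "forbidden"
    if is_xml11_restricted(cp):
        return "reference-only"
    return "literal"


if __name__ == "__main__":
    # The three characters mentioned in the quoted text, plus a surrogate.
    for cp in (0x0B, 0x0C, 0x85, 0xD800):
        print(f"U+{cp:04X}: {classify(cp)}")
```

Under these productions U+000B and U+000C are reference-only, while U+0085 may appear literally, and surrogate code points are not part of the character set at all.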

My personal impression is that XML 1.1 is a failure; thus I wouldn't
recommend that HTML5 rely on XML 1.1 features for the XML serialization.

We really should answer the question we asked before: why would it be 
conforming to include those characters in the first place?

Best regards, Julian

Received on Sunday, 9 September 2007 14:11:51 UTC