Re: internet media types and encoding from Tim Bray on 2003-04-11 (www-tag@w3.org from April 2003)

From: Tim Bray <tbray@textuality.com>
Date: Fri, 11 Apr 2003 10:55:14 -0700
To: Chris Lilley <chris@w3.org>
Cc: Paul Grosso <pgrosso@arbortext.com>, www-tag@w3.org
Message-ID: <3E970182.8010805@textuality.com>

Chris Lilley wrote:
> Unlike Rick I am not making this argument on the basis of the ease of
> detecting encoding labelling or conversion errors; rather, on the
> basis of those non-printing characters having no basis being in a
> marked up document. I mean, start of string? end of guarded area?

I profoundly agree with Chris here, but I had thought this issue to have 
been long since decided.  My vision of XML is that element content is 
text, and text is a string of characters, and characters have the 
semantics that Unicode says they have.  Most of the C0 and C1 control 
characters have no useful or agreed-upon semantics, and they have no 
place in XML under any circumstances.  Their inclusion substantially 
decreases interoperability.  Do enough of the TAG agree that we should 
take this up officially?  -Tim
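
For readers who want to see which characters are under discussion, here is
a minimal sketch in Python that flags C0 (U+0000 to U+001F) and C1 (U+0080
to U+009F) control characters in a string.  The function name and the
decision to skip tab, line feed and carriage return (which XML 1.0 permits
as whitespace) are assumptions made for illustration, not anything
proposed in this thread.

```python
# Hypothetical helper for illustration only; not part of any XML library.

# Tab, line feed, carriage return: C0 characters assumed acceptable here
# because XML 1.0 allows them as whitespace.
ALLOWED_WHITESPACE = {0x09, 0x0A, 0x0D}


def find_control_chars(text):
    """Return (index, code point) pairs for C0/C1 controls in text,
    excluding tab, LF and CR."""
    hits = []
    for index, ch in enumerate(text):
        cp = ord(ch)
        in_c0 = cp <= 0x1F           # C0 block: U+0000..U+001F
        in_c1 = 0x80 <= cp <= 0x9F   # C1 block: U+0080..U+009F
        if (in_c0 or in_c1) and cp not in ALLOWED_WHITESPACE:
            hits.append((index, cp))
    return hits


if __name__ == "__main__":
    # U+0098 is START OF STRING, one of the characters Chris mentions.
    sample = "a\x02b\tc\x98"
    for index, cp in find_control_chars(sample):
        print(f"offset {index}: U+{cp:04X}")
```

A checker along these lines only reports where such characters occur; what
a processor should do with them is exactly the question raised above.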

Received on Friday, 11 April 2003 13:55:17 UTC