
A proposal for a slight change

From: Tim Bray <tbray@textuality.com>
Date: Mon, 03 Jul 2000 17:42:32 -0700
To: www-xml-infoset-comments@w3.org
It seems to me that the [charset] property, which looks sensibly described,
should be required, not optional.

It should be required because it is in principle impossible at the deepest
level to parse an XML document, and hence build an information set, without
knowing what encoding it's in. We are running into a lot of static over
in perl-land because a side-effect of using XML::Parser is that everything
gets turned into UTF-8, which is wrong in a few apps; it's OK to ask
the programmer to morph it back, but the input encoding ought to be
reliably available as a side-effect of parsing.
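
To make the point concrete, here is a minimal Python sketch of how a
processor might work out the input encoding before parsing and hand it back
to the caller. It is an illustration only: the name sniff_xml_encoding is
hypothetical, it is not part of XML::Parser or of any infoset API, and it
handles only UTF-8/UTF-16 byte order marks and ASCII-compatible XML
declarations (no UTF-32 or EBCDIC detection).

```python
import codecs
import re

# Byte order marks checked first; UTF-32 is deliberately left out of this sketch.
_BOMS = [
    (codecs.BOM_UTF8, "UTF-8"),
    (codecs.BOM_UTF16_BE, "UTF-16BE"),
    (codecs.BOM_UTF16_LE, "UTF-16LE"),
]

# Matches the encoding pseudo-attribute of an ASCII-compatible XML declaration.
_DECL = re.compile(
    rb"""^<\?xml\s[^>]*?encoding\s*=\s*["']([A-Za-z][A-Za-z0-9._-]*)["']"""
)


def sniff_xml_encoding(data: bytes) -> str:
    """Return the encoding name a parser could report for the document entity.

    Hypothetical helper: looks for a BOM, then an encoding declaration,
    and otherwise falls back to the XML default of UTF-8.
    """
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name
    match = _DECL.match(data)
    if match:
        return match.group(1).decode("ascii")
    return "UTF-8"


if __name__ == "__main__":
    doc = b'<?xml version="1.0" encoding="ISO-8859-1"?><a/>'
    print(sniff_xml_encoding(doc))
```

A parser that normalizes everything to UTF-8 internally could still expose
a value like this, so that an application needing the original encoding can
convert back without guessing.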

It also bothers me at some level that this is made optional as a side-effect
of being attached to the entity... while it logically belongs
there, this is something that (for the document entity at least) you
can't create an infoset without knowing, and which people who care not
in the slightest about entities are going to want to find out about.

Yes, this is awfully late in the game. -Tim
Received on Monday, 3 July 2000 20:42:50 UTC
