Re: Requesting a revision of RFC3023

From: Martin Duerst <duerst@w3.org>
Date: Fri, 19 Sep 2003 12:50:00 -0400
Message-Id: <4.2.0.58.J.20030919124030.04fbfd20@localhost>
To: John Cowan <jcowan@reutershealth.com>, Francois Yergeau <FYergeau@alis.com>
Cc: ietf-xml-mime@imc.org, WWW-Tag <www-tag@w3.org>

In the long term, I think the fundamental problem is with the large
number of encodings, not with the encoding identifications. The
example of US-ASCII very clearly shows the huge advantages of
having a single encoding.

This will of course take some time. As one example, N3 just says
that it's in UTF-8. For other new formats, that may make sense,
too.

Regards,     Martin.

At 12:10 03/09/19 -0400, John Cowan wrote:

>Francois Yergeau scripsit:
>
> > In this respect, yes.  All programming languages should provide for charset
> > identification of their source files.  Alas, none do, AFAIK.
>
>I almost, but not quite, entirely disagree with this position.
>
>Rather than having thousands of ad hoc mechanisms for encoding declarations
>in each of the thousands of text formats now extant, file systems should have
>a convenient mechanism for recording the encoding of each file, and character
>processing libraries should have convenient reading and writing operations that
>do the necessary conversions.  Otherwise, generic text-processing tools become
>impossible, because each tool has to have a vast library that understands the
>mechanics of the encoding declaration specific to the format it is trying to
>read.  That way madness lies.
Received on Friday, 19 September 2003 12:50:06 GMT