Normalizing transcoders

From: Martin Duerst <duerst@it.aoyama.ac.jp>
Date: Fri, 30 Nov 2007 16:00:40 +0900
Message-Id: <6.0.0.20.2.20071130155152.096abbb0@localhost>
To: www-international@w3.org

[This is mostly a topic for the WG, related to normalization and
the Normalization part of the Character Model, but I'm sending it here
because I am hoping for wider input.]

The Character Model: Normalization introduces the concept of a
Normalizing Transcoder
(http://www.w3.org/TR/charmod-norm/#sec-NormalizingTranscoder).

Until yesterday, I was under the impression that such transcoders
existed mostly in theory. But yesterday, I discovered that the
GNU iconv implementation on my Cygwin system implements
a normalizing transcoder for windows-1258 -> UTF-8.
Windows-1258 is probably the most widely used legacy encoding for
Vietnamese, and Vietnamese is in practice the language most in need
of a clear normalization policy.
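
To illustrate what "normalizing" means here, the following is a minimal
Python sketch, not the GNU iconv code. It assumes Python's standard
"cp1258" codec for windows-1258 and assumes NFC as the target form;
the function name transcode_normalizing is made up for this example.
Windows-1258 stores many Vietnamese letters as a base letter followed
by a combining tone mark, so a plain table-driven conversion yields
decomposed text, while a normalizing transcoder emits the composed form.

```python
import unicodedata


def transcode_normalizing(data: bytes, source_encoding: str = "cp1258") -> bytes:
    """Decode legacy bytes, normalize to NFC, and re-encode as UTF-8.

    Hypothetical helper for illustration only; NFC is an assumption.
    """
    text = data.decode(source_encoding)
    return unicodedata.normalize("NFC", text).encode("utf-8")


# "Viet" with circumflex and dot below on the e, as windows-1258 stores it:
# base letter U+00EA followed by combining dot below U+0323.
legacy = "Vi\u00ea\u0323t".encode("cp1258")

# A plain (non-normalizing) conversion keeps the combining mark separate.
plain = legacy.decode("cp1258")

# The normalizing conversion composes the pair into the single code point U+1EC7.
normalized = transcode_normalizing(legacy).decode("utf-8")
```

The two results look the same when rendered, but they differ in length
and compare unequal as code point sequences, which is why it matters
whether a given transcoder normalizes or not.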

I would like to take this as an opportunity to collect information
on other normalizing transcoders. If you know of some, please reply
to this mailing list.

Regards,     Martin.


#-#-#  Martin J. Du"rst, Assoc. Professor, Aoyama Gakuin University
#-#-#  http://www.sw.it.aoyama.ac.jp       mailto:duerst@it.aoyama.ac.jp     
Received on Friday, 30 November 2007 07:01:40 GMT
