Re: ISO 8859-1 C1 set in RFC 2157

At 01:20 08/03/26, Frank Ellermann wrote:
>
>Uma Umamaheswaran wrote:
> 
>> Of which Level 1 was the structure to be used primarily for
>> the pure 8-bit 8859 series with no code extensions etc.

I think this fits very well with the fact that ISO 2022
is essentially a toolbox, which contains some very simple
tools and some very advanced tools (and some restrictions
on what combinations you can use).

>Yes.  Apparently ECMA 94 doesn't clearly say this, maybe this
>was fixed later in ISO 8859.  It would remove all weird ideas
>about using any G2 / G3 / SS2 / SS3 / ... "within" ISO 8859,
>and of course in practice nobody does this.  

In a different mail, Frank mentioned my role as (secondary)
reviewer for charset registrations at IANA
(http://www.iana.org/assignments/character-sets).
If a discussion came up there, my position would be that
in iso-8859-1 as registered there, the C0 and C1 areas
are assigned, but mostly unused. The boundaries of 'unused'
are a bit fuzzy: for example, the number of documents with
form feeds in them is extremely small overall, but IETF
Internet Drafts and RFCs use them. The fuzziness is probably
to a large extent a feature; it doesn't really hurt much,
but can come in handy when needed.
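
As a rough illustration of 'assigned, but mostly unused', here is a
small sketch using Python's built-in "iso-8859-1" codec. It assumes
that this codec is a fair stand-in for the charset as registered at
IANA; other converters may behave differently. The variable names are
made up for the example.

```python
import unicodedata

# Decode all 256 byte values with Python's built-in ISO 8859-1 codec.
decoded = bytes(range(256)).decode("iso-8859-1")

# Check that every byte maps to the code point with the same number,
# so that no position is left unassigned by this codec.
identity = all(ord(ch) == i for i, ch in enumerate(decoded))

# Collect the byte values that decode to control characters
# (Unicode general category Cc).
controls = [i for i, ch in enumerate(decoded)
            if unicodedata.category(ch) == "Cc"]

c0 = [i for i in controls if i < 0x20]           # C0 area, 0x00-0x1F
c1 = [i for i in controls if 0x80 <= i <= 0x9F]  # C1 area, 0x80-0x9F

print(identity, len(c0), len(c1), 0x0C in c0)
```

The form feed (0x0C) sits in the C0 area like any other control; whether
it counts as 'used' depends on the documents, not on the codec.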

[...]

>That would support John's argument that windows-1252 is
>an extension of ISO 8859-1, in practice it is, no matter
>what the ISO theory about graphical characters said.

In terms of graphic characters, it is. But the above, and
the implementations in most character encoding converters,
disagree, because there is a clear difference between
'unassigned' and 'not used in practice'.
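
The difference can be sketched with Python's built-in codecs, taking
"iso-8859-1" and "cp1252" as stand-ins for the two charsets. This shows
only how one particular converter behaves, and the helper name
decodable is made up for the example.

```python
C1_BYTES = range(0x80, 0xA0)  # the C1 area, 0x80-0x9F


def decodable(byte_value, encoding):
    """Return True if the single byte decodes without error."""
    try:
        bytes([byte_value]).decode(encoding)
        return True
    except UnicodeDecodeError:
        return False


# Bytes in the C1 area that each codec refuses to decode.
latin1_rejected = [b for b in C1_BYTES if not decodable(b, "iso-8859-1")]
cp1252_rejected = [b for b in C1_BYTES if not decodable(b, "cp1252")]

# The graphic area 0xA0-0xFF, decoded with both codecs for comparison.
upper = bytes(range(0xA0, 0x100))
same_graphics = upper.decode("iso-8859-1") == upper.decode("cp1252")

print([hex(b) for b in latin1_rejected])
print([hex(b) for b in cp1252_rejected])
print(same_graphics)
```

With this codec pair, the two agree on 0xA0-0xFF, but a byte such as
0x80 decodes to a C1 control in one and to a graphic character in the
other, and a few bytes are rejected by "cp1252" only.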

Regards,   Martin.


#-#-#  Martin J. Du"rst, Assoc. Professor, Aoyama Gakuin University
#-#-#  http://www.sw.it.aoyama.ac.jp       mailto:duerst@it.aoyama.ac.jp     

Received on Wednesday, 26 March 2008 08:42:49 UTC