Re: Charset "iso-10646-1"

From: Terje Bless (link@pobox.com)
Date: Fri, Aug 31 2001

    Date: Fri, 31 Aug 2001 09:14:08 +0200
    From: Terje Bless <link@pobox.com>
    To: Martin Duerst <duerst@w3.org>
    cc: John Middleton <jmiddlet@sedl.org>, www-validator@w3.org, www-html-editor@w3.org, www-html@w3.org
    Message-ID: <20010831101246-r01010800-919928f4-0910-010c@localhost>
    Subject: Re: Charset "iso-10646-1"
    
    [ Note: CCed all over the place. Watch where you send any replies! ]
    [ The right place is probably either www-html or www-validator,    ]
    [ depending on who and which issue you're replying to. :-)         ]
    
    On 29.08.01 at 16:19, Martin Duerst <duerst@w3.org> wrote:
    
    >I'm a bit confused. The place you cite does reference [ISO10646], but it
    >does not contain any syntax examples. The actual syntax is given in
    >Section 5, http://www.w3.org/TR/html4/charset.html, but this does
    >not contain the label iso-10646-1 at all.
    >
    >Also, the IANA registry at http://www.iana.org/assignments/character-sets
    >does not contain iso-10646-1.
    >
    >I wonder where you came up with iso-10646-1.
    
    It's a common misconception. Character Encoding issues are _hard_ and most
    people don't understand them. Since the ISO-8859-* series has been well
    worked into the collective subconscious, if a spec uses a similar-looking
    string (such as "ISO-10646") anywhere in relation to charset issues, a lot
    of people will immediately assume it is a charset name in the same vein as
    the ISO-8859-* encodings. This has cropped up periodically and should
    probably be mentioned to the HTML WG; a small explanatory note,
    strategically placed, could avoid a lot of confusion.
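
To illustrate the distinction, here is a minimal sketch of checking a charset label against registered names. The set below is a small hand-picked subset of labels from the IANA registry, not the full list, and the helper name is_known_charset is made up for this example. The point is only that "iso-10646-1" looks like its ISO-8859-* neighbours but is not itself a registered label.

```python
# Illustrative subset of charset labels registered with IANA.
# ASSUMPTION: this is NOT the full registry, only a few well-known entries.
REGISTERED_LABELS = {
    "us-ascii",
    "iso-8859-1",
    "iso-8859-15",
    "utf-8",
    "utf-16",
    "iso-10646-ucs-2",
    "iso-10646-ucs-4",
}


def is_known_charset(label: str) -> bool:
    """Return True if the label is in the illustrative subset above.

    Charset labels are compared case-insensitively.
    """
    return label.strip().lower() in REGISTERED_LABELS


if __name__ == "__main__":
    for label in ("ISO-8859-1", "UTF-8", "iso-10646-1"):
        status = "known" if is_known_charset(label) else "not a registered label"
        print(f"{label}: {status}")
```

A real checker would load the complete registry rather than a hard-coded subset; the sketch only shows why a plausible-looking string still fails the lookup.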