Re: Internationalized CLASS attributes from Keld Jørn Simonsen on 1996-10-24 (www-international@w3.org from October to December 1996)

From: Keld Jørn Simonsen <keld@dkuug.dk>
Date: Thu, 24 Oct 1996 18:13:55 +0200
To: Martin J Duerst <mduerst@ifi.unizh.ch>
Cc: rosenne@NetVision.net.il, www-international@w3.org
Message-Id: <199610241614.SAA24134@dkuug.dk>

Martin J Duerst writes:

> Keld Simonsen wrote:
> 
> >Martin J Duerst writes:
> >
> >> Keld Simonsen wrote:
> >> 
> >> >Martin J Duerst writes:
> >
> >Again, the user does not care
> >how the information is encoded, as long as what (s)he sees
> >is understandable and what is expected. One or two characters
> >do not matter to the user. So again it is up to the system designer
> >to code the information in an unambiguous and well-defined way.
> >In the case of accented Latin characters, 10646 then specifies
> >normatively only one way of encoding.
> 
> Like Jonathan Rosenne, I don't really agree on this point.
> Assume I have something like A-with-dot-below, which does
> not exist as precomposed in ISO 10646. For this thing, what
> does ISO 10646 (normatively or otherwise) specify?

Well, 1EA0 should do it for A-with-dot-below.

But anyway, I agree. I was thinking of our A-GRAVE example. 
There are a number of Latin characters that are not defined
in 10646, and you can encode the information with the use of combining
characters.
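
To make the two encodings concrete, here is a minimal sketch. It assumes
a present-day Python 3 and its standard unicodedata module, neither of
which is part of this discussion, and the helper name describe() is only
illustrative. It composes A plus COMBINING DOT BELOW into the single code
point 1EA0, and shows a letter (q with dot below, chosen here on the
assumption that it has no precomposed form) that has to stay a
combining sequence.

```python
import unicodedata


def describe(text):
    """Return (code point, character name) pairs for each character in text."""
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch)) for ch in text]


# A-with-dot-below written as base letter + COMBINING DOT BELOW (U+0323).
combining = "A\u0323"

# Canonical composition (NFC) folds the pair into the precomposed character.
precomposed = unicodedata.normalize("NFC", combining)

# q-with-dot-below: assumed to have no precomposed code point, so NFC
# leaves it as base letter + combining character.
no_precomposed = unicodedata.normalize("NFC", "q\u0323")

if __name__ == "__main__":
    print(describe(combining))
    print(describe(precomposed))
    print(describe(no_precomposed))
```

The same module also goes the other way: normalizing with "NFD" splits
a precomposed letter such as A-GRAVE into base letter plus accent.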

> The keyword here is "at some stage". And one also has to realize
> that combining semantics in particular for Indic scripts can be
> handled quite differently from Latin, because it is much less a
> general combination, and much more a complicated arrangement
> of special cases.
> 
> Assume, for a little while, that not even the precombinations
> in Latin-1 were available in ISO 10646. This would obviously
> mean that, because of large and wealthy markets such as
> Germany and France, everybody would immediately start to
> work on combining characters. And these implementations would
> be completed rather soon, and would be very straightforward.

I have heard that it should be very straightforward, but I have
not yet seen implementations. I also know that encodings with
properties similar to Unicode's, including ISO 6937, have not
been very widely implemented, although ISO 6937 is capable of
handling almost all languages based on the Latin script and has
been around for a long time.

The problem with rare languages is that you also need printers,
displays etc. to render the rare-language characters, and
this requires that the products be enhanced with fonts for
these characters. At least for Danish I know that you cannot
simply make do with a simple or intelligent combination of glyphs
for the base letter and the accents, and I would imagine that
other languages based on the Latin script have
similar problems. So in the interest of the rare languages
we should work on integrating these characters into 10646.

Keld

Received on Thursday, 24 October 1996 12:15:01 UTC