Re: Non-Unicode characters, SDATA, etc.

From: Harvey Bingham <hbingham@ACM.org>
Date: Wed, 23 Oct 1996 21:40:18 -0400
To: bosak@atlantic-83.Eng.Sun.COM (Jon Bosak)
Cc: w3c-sgml-wg@w3.org
At 21:54 1996/10/22 -0700, Jon Bosak wrote:
>[Tim Bray:]
>| Anders surprised some of us by pointing out that there are a large
>| number of ISO entities that are not in ISO 10646 at all.  So I'd like
>| to request input from the WG on this.
>| On the other hand, Anders' posting makes it clear that [particularly
>| in the area of mathematics] there are routinely a substantial number
>| of non-10646 characters available [in theory at least] to technical
>| publishers; who have been a mainstay of SGML support over the years.
>With some embarrassment (because, like Tim, I have never run into this
>problem myself, and therefore argued the 80-20 angle when this was
>before the ERB),
>I haven't checked with Mr. Poppelier, but I'm sure that the gentleman
>won't mind being quoted in a forum where this information might do
>some good.

I quote Ken Iverson [the inventor of the mathematical programming language
APL, some 35 years ago] about the inadequate extent of the glyph set 
available for mathematicians, and its cause:

"The reason mathematicians have used every character in every language,
living and dead, is that they chose to elide the multiply sign between 
such characters."

That shorthand mistake of omitting the multiply sign [or, in other areas of
math, some other infix operator symbol] prevented the mathematician from using
multi-character names: with the operator implied, adjacent letters must be
read as separate names.
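
A small Python sketch of that point, offered only as an illustration: the two
function names are hypothetical, and the "grammar" is deliberately reduced to
letters and a single `*` operator.

```python
import re

def tokens_juxtaposed(expr):
    """Implied multiplication: each letter is its own one-character name."""
    letters = [ch for ch in expr if not ch.isspace()]
    return " * ".join(letters)

def tokens_explicit(expr):
    """Explicit operator: a run of letters can be one multi-character name."""
    return re.findall(r"[A-Za-z]+|\*", expr)

print(tokens_juxtaposed("ab"))       # a * b
print(tokens_explicit("rate*time"))  # ['rate', '*', 'time']
```

Under the first reading a name like "rate" can only mean four factors, which
is why a fresh single glyph is needed for every new quantity.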

Since a mathematician feels free to define a private self-consistent world,
there need be no constraint to use a limited choice of printable glyphs.
In the language for a self-defined world, any "squiggle" can have arbitrary
semantic meaning. Anyone who chooses to learn the language of that world 
needs to learn the mathematician's chosen meaning. In general, any squiggle 
can substitute for any other, so long as the choice has its semantics attached.

From this perspective, the measure of the set of squiggles is unbounded.
For that reason I continue to argue that an unrecognizable character entity
name should be passed on as-is to the application, which may well be
written by that mathematician in a private world. Little harm will be
done, and perhaps some education, if in fact the mathematician may learn
to use multi-character names! But that may constrain the academic freedom
to define a private world and be the gatekeeper for the few who seek to
enter into such a world.
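
A minimal sketch of the pass-through behavior argued for above, in Python. The
entity table, the regular expression, and the function name are assumptions
made for illustration; they are not taken from any XML or SGML specification.

```python
import re

# Hypothetical table of entity names this processor happens to recognize.
KNOWN_ENTITIES = {"amp": "&", "lt": "<", "alpha": "\u03b1"}

# Simplified pattern for an entity reference such as "&name;".
ENTITY_REF = re.compile(r"&([A-Za-z][A-Za-z0-9.-]*);")

def expand_entities(text, known=KNOWN_ENTITIES):
    """Replace recognized references; leave unrecognized ones untouched."""
    def substitute(match):
        name = match.group(1)
        # Unknown name: hand the whole reference on as-is to the application.
        return known.get(name, match.group(0))
    return ENTITY_REF.sub(substitute, text)

print(expand_entities("x &lt; &squiggle;"))  # x < &squiggle;
```

The unrecognized "&squiggle;" survives intact, so an application written for
the mathematician's private world can still give it a glyph and a meaning.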

Technology today makes it possible to create glyphs to represent those
squiggles. The mathematician who chooses to use squiggles should supply
their names and the desired glyphs for them, in an acceptable form,
such as TrueType. Then the publishers will have less need for an ever-growing
character repertoire with corresponding glyphs in an ever-growing set
of font families.

I do not believe it is up to XML to try to close the set, nor to prescribe
the actions of an application in which arbitrary squiggles may occur and
be meaningful to the mathematician's private world.

Received on Wednesday, 23 October 1996 21:42:52 UTC
