Re: ERB decisions on A.17, B.9, and other questions from lee@sq.com on 1996-10-19 (w3c-sgml-wg@w3.org from October 1996)

From: <lee@sq.com>
Date: Sat, 19 Oct 96 18:44:37 EDT
To: U35395@UICVM.UIC.EDU, w3c-sgml-wg@w3.org
Message-Id: <9610192244.AA13495@sqrex.sq.com>

> In the light of these agreements, the ERB reconfirmed its earlier
> decision that XML 1.0 will not have SDATA entities.  It is thought that
> most uses of SDATA entities are adequately served by character
> references to Unicode characters (see example below).  Techniques for
> dealing with non-Unicode characters, specification of glyphs rather than
> characters, and related topics (such as possible mechanisms for document
> private agreements governing the ISO 10646 Private Use Areas) will be
> addressed in future revisions.

Sorry to quote all that...

If I want to put my transcription of bits of Bailey's 18th C. dictionary
on the web (actually it's already there in fairly grotty SGML), I will
want to refer to some glyphs not in unicode.

Would I do
    <!Entity tall-s "s">
and hope that a style sheet could map this to (say) the font
Adobe Caslon Alternates, using an embedded OpenType font?
HTML will let me do this next year, using MSIE or Netscape, according to
the respective vendors.

There are many other characters not (or not yet) in 10646 or Unicode,
some of which can be handled by floating diacriticals -- e.g. see
http://www.portal.ca/~tiro/ for an analysis of character sets required
for current, modern languages.  There are others that are incorrectly
identified in Unicode (it was made by humans...).

In the case where no glyph exists on the targent system, I obviously
want something mnemonic to appear -- I can live with
	[tall-s]
or
	[r-cedilla-acute]
but not with
	&#x80010fe7;
or the equivalent
	&#2147553255;

(ISO 10646 user space character numbers are all that long, right?)

As long as a mnemonic string is used, I don't care whether it is
called SDATA or PRISCILLA.... :-)

Lee

Received on Saturday, 19 October 1996 18:45:27 UTC