
Re: ERB decisions on A.17, B.9, and other questions

From: Michael Sperberg-McQueen <U35395@UICVM.UIC.EDU>
Date: Sat, 19 Oct 96 14:56:12 CDT
Message-Id: <199610192033.QAA20280@www10.w3.org>
To: W3C SGML Working Group <w3c-sgml-wg@w3.org>
cc: David Durand <DGD@cs.bu.edu>

On Sat, 19 Oct 1996 15:15:27 -0400 David G. Durand said:
>>Instead of a declaration like
>>  <!ENTITY auml SDATA "[auml    ]">
>How do I declare a character like "TENGWAR VOWEL LETTER A" or "ME
>GLYPH GARDINER 201" (Middle Egyptian). It's just &#23434 or
>My gut reaction to that is #$*#&$#?!

I'm not sure what you're arguing here.  Would

    <!ENTITY a.teng SDATA "[a.teng  ]">
    <!ENTITY a.teng SDATA "[U+5B8A]">

be (a) equally likely to be understood by all XML processors and (b)
materially less likely to elicit the reaction &expletive;?

If the ERB decides question C.5 the way I hope we will, then we can
at least say

    <!ENTITY auml   "&u00E4;"><!-- auml = a umlaut (dec 228) -->
    <!ENTITY a.teng "&u5B8A;"><!-- a.teng = Tengwar vowel A
                                   (decimal 23434) -->

which I, for one, prefer, since I don't have to be logged on to a
system with a shell-level hex-to-decimal calculator, and it's
easier to proofread the declarations against Unicode documentation.
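
For anyone who does want to check the decimal figures quoted in this
thread against the hexadecimal Unicode notation, the conversion can be
sketched in a few lines of Python.  This is only an illustration: the
function names char_ref and entity_decl are assumptions made up for
this example, not part of any XML tool or of the proposals under
discussion, and the output uses the decimal "&#NNN;" form already
shown above.

```python
def char_ref(code_point: str) -> str:
    """Turn a hex code point such as 'U+00E4' or '00E4' into a
    decimal numeric character reference such as '&#228;'."""
    digits = code_point.strip().upper()
    if digits.startswith("U+"):
        digits = digits[2:]
    # int(..., 16) parses the hexadecimal digits; the f-string
    # renders the resulting number in decimal.
    return f"&#{int(digits, 16)};"


def entity_decl(name: str, code_point: str) -> str:
    """Build an entity declaration (hypothetical helper) whose
    replacement text is the decimal character reference."""
    return f'<!ENTITY {name} "{char_ref(code_point)}">'


if __name__ == "__main__":
    for name, cp in [("auml", "U+00E4"), ("a.teng", "U+5B8A")]:
        print(entity_decl(name, cp))
```

The same arithmetic confirms the figures used in this message: U+00E4
is decimal 228, U+5B8A is decimal 23434, and U+E000 is decimal 57344.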

(But to be honest I suspect that a.teng belongs at 57344 (U+E000) or
thereabouts in the Private Use Area.)

>>any XML processor can work properly with a declaration of the form
>>  <!ENTITY auml "&#228;"> <!-- auml = a umlaut, U+00E4 -->
>These declarations must be included automatically, if we hope to have
>anyone _use_ XML. (Which does not seem to be a major concern, as far as I
>can tell).

Come on, now.  The question of including some entities automatically
is on the table, and has been addressed by several postings, most in
favor, as far as I can remember.  The proposals I've heard so far are
(a) just lt, gt, and amp (possibly augmented by declarations for ' and
"), (b) the above plus ISOLat1, and (c) everything in HTML (which
version?).  If you are hoping for automatic inclusion of Tengwar and
other similar scripts, I predict disappointment.  Or even Middle
Egyptian, for that matter.  If you think we should decide to include
them automatically without worrying about the form of their declaration,
then I respectfully but vigorously disagree.

On the whole, I don't see how retaining or excising SDATA has any
effect on the ease or difficulty of declaring things like TENGWAR
VOWEL A and ME GLYPH GARDINER 201 in useful ways.  And so I don't
see how these examples bear upon the decision to do without SDATA
in version 1.0 of XML.

>   I'd like that the end of this process will be something that can be
>endorsed instead of inveighed against. I'm starting to worry.

The further XML takes us toward the Promised Land, the better.  But I
hope it will be endorsed for bringing us closer to it, rather than
inveighed against simply because it doesn't airdrop us down in the
middle of it.  Those who first approached the Promised Land, let us
remember, did so at the pace of the slowest animal or child in the
caravan.

Some of us would like to go faster, or farther.  Some would like to go
slower, and not so far.  Whatever speed we go, whatever distance, will
be right for some and uncomfortable for others.  Whatever decisions we
make, some of us will be unhappy.  But we won't get to the Promised Land
by breaking up the caravan.  The right speed is the one that keeps us
moving together, and more or less in line.  (Hint to those lagging
behind:  some people want to go faster.  Hurry up, OK?  Hint to those
way out in front:  some people's feet hurt.  Cool yo' jets!)

-C. M. Sperberg-McQueen
Received on Saturday, 19 October 1996 16:33:07 UTC
