Re: (Repeat) Decision: C.4 (Predefined entities)

> So, I unashamedly agree with Jon's request. An XML application
> should and must be capable of faithfully processing a well-defined
> and agreed set of named character entities, even in the absence
> of declarations for those named character entities. And while
> I would prefer that this set be as comprehensive as possible,
> I submit that the minimum set that is supportable with "very 
> little effort" is the set which includes all of the ASCII set
> (dec 32-126), all of the 8859-1 set (dec 158?-255), and those
> characters from the Adobe Symbol set that have corresponding names
> among the ISO list of named character entities.


Murray, I think that was very helpful background.

I agree with you that a pre-defined set of entities is useful in a language.
Obviously in an SGML environment you must be able to redefine those names.

These entities do not in fact add anything at all to XML, since they
are already available in the document character set (Unicode).
We have already been told that numbered references are to be used in
preference to SDATA entities -- what you are now saying is that that
decision is OK as long as it doesn't affect the entities you want to use
yourself :-)

There are now three ways to get an OMEGA symbol in XML:
(1) The ISO 10646 code point: &#0x000003A9; -- this can also be
   written without the leading zeros as &#0x3A9;.  Actually you have
   to use decimal, but I'm using the 0x prefix to pretend I can use
   the more normal hexadecimal used in the Unicode & ISO10646 tables.
   OMEGA is at group 0, plane 0, row 3, position A9 (169 decimal).
   This is the Greek letter, though, not the symbol as used in mathematics
   and engineering, so it is not the one in the Symbol font.

(2) The ISO 10646 code point &#0x2126; for "OHM SIGN".  This is not the
   same as a Greek Omega; the Adobe Symbol font OMEGA is usually used
   for this.  I don't believe the Symbol Font Greek is represented
   directly in ISO 10646, though.
   
(3) With the ISO entity &Omega; or &OHgr; (I don't know which one the ERB
   picked; neither is correct for the Symbol font, which contains Math Greek).
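
To make the difference between (1) and (2) concrete, here is a small
sketch in Python. It assumes the values from the Unicode tables (03A9
for the Greek letter, 2126 for the ohm sign); the helper name describe
is made up for illustration. It prints the character name alongside
the decimal form that a numeric reference would have to use.

```python
import unicodedata

# The two code points discussed in items (1) and (2) above.
CANDIDATES = [0x03A9, 0x2126]


def describe(code_point):
    """Return (Unicode name, decimal reference, hex value) for a code point."""
    char = chr(code_point)
    return (
        unicodedata.name(char),      # name from the Unicode character database
        "&#%d;" % code_point,        # decimal numeric character reference
        "0x%04X" % code_point,       # hex value as printed in the code tables
    )


for cp in CANDIDATES:
    print(describe(cp))
```

The two characters have different names and different numbers, so
nothing in the reference itself says which glyph from the Symbol font
is wanted.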

In addition, you can define your own entity in terms of these:
<!Entity Math.Omega "&#0x3A9;">
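
As a rough check of how such a declaration behaves, the sketch below
feeds a tiny document to Python's xml.etree.ElementTree. It assumes
XML 1.0 syntax as later published (uppercase ENTITY keyword, decimal
reference &#937;) rather than the pretend 0x notation used above; the
element name doc and the helper name expand are hypothetical.

```python
import xml.etree.ElementTree as ET

# A document that declares its own entity in the internal DTD subset.
DECLARED = """<!DOCTYPE doc [
  <!ENTITY Math.Omega "&#937;">
]>
<doc>&Math.Omega;</doc>"""

# The same kind of reference with no declaration at all.
UNDECLARED = "<doc>&Omega;</doc>"


def expand(xml_text):
    """Parse xml_text and return the text content of the root element."""
    return ET.fromstring(xml_text).text


# The declared entity expands to the single character U+03A9.
print(expand(DECLARED) == "\u03a9")

# Without a declaration the parser rejects the reference.
try:
    expand(UNDECLARED)
except ET.ParseError as err:
    print("rejected:", err)
```

The second case is the one the predefined set is meant to cover: with
no declaration in the document, the parser has nothing to expand the
name to.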

We have no way of saying whether you want the Symbol font Omega or the
Greek Omega, though, unless you do
<FONT FACE="Adobe-Symbol">&omega;</FONT>
in the Grand Tradition of HTML :-), or use a style sheet.

There is no real justification for using incorrect names.
It would be best to use the names that Adobe gives to the Symbol font
symbols, and precede them with an indicative prefix such as Sym.

If they are predefined and cannot be changed, they must *obviously* begin
with -XML- as per the earlier requirement about invading the namespace.

So whilst I now appreciate more of what you were trying to achieve,
I still cannot condone doing it in such a flawed and ad hoc way.
Isn't XML supposed to be setting an example of how to use SGML in a
clear, clean, understandable and elegant way?  Or is it just a quick
and dirty hack?

Lee

Received on Wednesday, 13 November 1996 12:45:10 UTC