Re: ERB decisions on A.17, B.9, and other questions from John_Lavagnino@Brown.edu on 1996-10-20 (w3c-sgml-wg@w3.org from October 1996)

From: <John_Lavagnino@Brown.edu>
Date: Sun, 20 Oct 1996 15:18:56 -0400 (EDT)
To: w3c-sgml-wg@w3.org
Message-Id: <199610201918.PAA05245@swansong.stg.brown.edu>

Michael writes:

|   Both David Durand and Lee Quin seem to be interpreting SDATA
|   entities as things which provide a system-independent specification
|   of characters or glyphs, in particular full names of the character or
|   glyph, in the style familiar from ISO character-set standards.
|   
|   Since I had understood SDATA to be intended to hold system-*de*pendent
|   specifications (such as the elaborate escape sequence needed to
|   produce a given glyph on some particular device or system -- say, an
|   IBM ProPrinter or an HP LaserJet III, or ...), this notion confuses
|   me.
|   
|   Can you point to any passages in 8879 that prescribe, or even allow,
|   the usage you are foreseeing?

Surely there is no question that it is allowed.  Just because you
"normally" redefine SDATA for "different applications, systems, or
output devices" (in the words of definition 4.304) doesn't mean you
have to.  (Are you suggesting that an SGML system should prohibit
Unicode values as SDATA text, because they're not
system-dependent enough?)

And common practice (as has been previously noted on this list) is to
treat the SDATA replacements printed in Appendix D as symbol names to
be translated into escape sequences or whatever at a later stage.
There are fewer reasons to do that today than there used to be:
catalog files, for instance, make it much easier to change from one
set of entity definitions to another in a standard way than was once
the case.  But I still feel, and apparently others do too, that it is
highly desirable to keep system-dependent material out of even the
DTD, and to convert a symbolic name into its system-dependent form at
a later stage.
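
A minimal sketch of that late-stage conversion, assuming the entity
set carries only symbolic names in the "[name]" style of the ISO
entity sets.  The table contents and the names DEVICE_TABLES and
render_sdata are invented for illustration and are not taken from
8879 or from any particular system; the output strings are
placeholders, not real device escape sequences.

```python
# Hypothetical per-device tables: symbolic SDATA name -> output string.
# Only the symbolic names appear in the entity declarations; everything
# system-dependent lives here, outside the document and the DTD.
DEVICE_TABLES = {
    "unicode": {"[eacute]": "\u00e9", "[aacute]": "\u00e1"},
    "ascii": {"[eacute]": "e'", "[aacute]": "a'"},
}


def render_sdata(name, device):
    """Translate one symbolic SDATA name for the given output device."""
    table = DEVICE_TABLES[device]
    if name not in table:
        # Fail loudly rather than guess at a rendering.
        raise KeyError(f"no {device!r} rendering for SDATA name {name!r}")
    return table[name]
```

Moving to a different output device then means supplying another
table; the entity declarations and the documents stay as they are.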

|   Can you explain how the use of the SDATA keyword helps build a
|   framework superior to what can be built without it?  So far, the
|   argument appears to be that providing the name of a character,
|   without any information about its position in 10646 if any, or any
|   information about an appropriate glyph in the AFII glyph registry if
|   any, is superior to providing its position in 10646, with name etc.
|   in a comment.
|   
|   In what way does the SDATA keyword affect this tradeoff?

As Lee explained, you aren't necessarily talking about a character
that exists in any kind of glyph registry anywhere.  It's a fact of
life that people invent new characters all the time.  In that case, is
it preferable to provide an essentially random number, or a name
that can at least say what is desired?  And if you are in a situation
where you've got a list of standard numbers and standard names, as in
10646, why do you need both, and why is the number preferable?

The SDATA keyword, in very common practice, means "This is a name for
the character, a name that needs conversion for whatever output device
you've got at hand."  The prescribed effect (at least in the ESIS
world) is to mark that name as distinct from ordinary document
content.  You can certainly live without that distinction if you don't
mind unreliable hacks like saying "anything in square brackets is
really a character name".
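
A rough sketch of the difference, assuming an ESIS-style data string
in which SDATA replacement text is bracketed by the two-character
marker \| (the convention in sgmls-style output; other escapes are
ignored here).  The function names split_esis_data and bracket_hack
are hypothetical.

```python
import re

MARK = "\\|"  # the two characters backslash and vertical bar


def split_esis_data(data):
    """Split a data string into ('text', s) and ('sdata', s) runs.

    Pieces between a pair of markers are SDATA; the rest is ordinary
    document content.  Empty pieces are dropped.
    """
    pieces = data.split(MARK)
    return [
        ("sdata" if i % 2 else "text", piece)
        for i, piece in enumerate(pieces)
        if piece
    ]


def bracket_hack(text):
    """The unreliable alternative: treat anything in [] as a name."""
    return re.findall(r"\[[^\]]*\]", text)
```

Given the marked string caf\|[eacute]\| [sic], split_esis_data
reports only [eacute] as SDATA and leaves [sic] as text.  Applied to
the same content without the markers, bracket_hack returns both
[eacute] and [sic], which is exactly the ambiguity the SDATA
distinction avoids.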

I haven't seen a convincing response to Lee's comment:

| Would you be happy with
|     <!Element 391 - - (%4067;)* -- paragraph -->
| ?

John Lavagnino
Received on Sunday, 20 October 1996 15:18:49 UTC