Request for ERB decisions

Date: Wed, 13 Nov 1996 10:52:11 -0500
I am aware of two issues that seem to have been lost in the shuffle, but
that generated enough discussion that I would like to know what the ERB
_officially_ thinks about them.

First is the issue of SDATA. I stopped pushing this when it seemed the
arguments had all been laid out, but I never saw a plausible justification
for the use of numbers instead of names for things that will _probably_ be
undefined in many processing systems, and that _already have names_. And I
never saw an official decision, or even an indication that it had been
discussed, after the long hard work of a two-week list discussion.

   I remember some book on data processing rules of thumb that I read in
bookstores in the 80's (Stan Kelly-Bootle's "System programmer's problem
solver", I think). The first rule was one that I have always tried to apply
because of its obvious good sense: "Never use a number to represent
something that is not a number." Now obviously a character code cannot hew
to this as a hard line, but for undefined characters, unlabelled numbers
are obviously less informative, less useful, and less amenable to extension
than are strings. I also advocate that we reserve some syntactic subset of
the SDATA space for future use by an official "glyph-resolution service".
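
   To make the point about undefined characters concrete, here is a minimal
sketch in Python. Everything in it is hypothetical: the KNOWN table, the
render function, and the bracketed placeholder are illustrations, not
anything proposed by the ERB. It shows only what a system can do with a
reference it has no definition for.

```python
# Hypothetical table of the characters this processing system can render.
KNOWN = {"eacute": "\u00e9"}


def render(reference: str) -> str:
    """Resolve a character reference, or pass it through as a placeholder."""
    if reference in KNOWN:
        return KNOWN[reference]
    # Undefined here: the reference text itself is all a reader will see.
    return "[" + reference + "]"


print(render("eacute"))   # defined, so the character itself is returned
print(render("frac78"))   # undefined name: the placeholder still tells a human something
print(render("12345"))    # undefined number (arbitrary value): the placeholder is opaque
```

A reader who meets "[frac78]" can guess what was meant; a reader who meets
"[12345]" has to go and find the right code table first.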

I think the shibboleth of SDATA use in this way being non-conforming was
also squared away. At least, SDATA is application-visible in the ESIS, which
is certainly the minimum of information one can expect from SGML software.

Second was the proposal to use HTML headers instead of Processing
Instructions to mark character sets, XML version (and potentially a host
of other attributes we might need in future versions). I think a vote is in
order, since the proposal was made, and generated significant discussion,
after a decision by the ERB that introduced a highly specific mechanism
that had not been fully examined previously.

   Even Charles thinks that it would be conforming SGML to have storage
objects with headers, and the headers could be made optional when the
character set can be otherwise determined by the transmission channel (e.g.
HTTP, or a floppy mailed with a note on it as to data format).
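
   As a rough sketch of how optional headers might work, the Python below
looks for a header block at the start of a storage object and falls back to
whatever the channel reported. The "Charset:" header name, the blank line
ending the header block, and the rule that a header wins over the channel
are all my assumptions for illustration; none of them was settled in the
discussion.

```python
from typing import Optional


def pick_charset(storage_object: bytes,
                 channel_charset: Optional[str] = None) -> str:
    """Return the charset from a leading header, else from the channel."""
    # Assumed layout: header lines, then a blank line, then the document.
    head, separator, _ = storage_object.partition(b"\n\n")
    if separator:
        for line in head.split(b"\n"):
            name, colon, value = line.partition(b":")
            if colon and name.strip().lower() == b"charset":
                return value.strip().decode("ascii")
    # No usable header: rely on what the transmission channel told us.
    if channel_charset:
        return channel_charset
    raise ValueError("character set cannot be determined")


print(pick_charset(b"Charset: ISO-8859-1\n\n<doc/>"))
print(pick_charset(b"<doc/>", channel_charset="UTF-8"))
```

With neither a header nor channel information the function raises an error
rather than guessing, which is the case the headers are meant to eliminate.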

   And while we're at it, if Charles is relenting a bit on whitespace,
maybe we should just make it all significant and be done with it. (I do not
feel that the ERB "owes us" a vote on this issue, but the simplicity would
be nice, if we could get it.)

