Re: ERB decisions on A.17, B.9, and other questions



Michael wrote, regarding suggestions about the handling of SDATA
entities:

|   As to the other proposition, that it is widely understood, I can
|   only say that if it is widely understood, then surely someone can be
|   found to describe the behavior on this list explicitly enough that
|   the rest of us can also understand it.  I for one do not understand
|   the behavior, and would really like a description, preferably with
|   reference to some documentation.

If you encounter an SDATA entity, you:

--- take the entity text

--- look it up in a table of SDATA-to-local-rendition conversions

--- output the string that the table supplies, if there is one

--- if not, complain (or not; this part is indeed undefined, but it is
    possible to mandate some particular behavior)
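
The procedure above can be sketched roughly as follows.  This is only an
illustration, not anything a standard prescribes: the table entries, the
names `SDATA_TABLE`, `render_sdata` and `UnknownSdataError`, and the
`strict` flag are all assumptions made for the example.  In particular,
the flag merely stands in for whichever "complain (or not)" behavior one
might choose to mandate.

```python
# Hypothetical table mapping SDATA entity text to a local rendition.
# The entries are illustrative only; a real system supplies its own.
SDATA_TABLE = {
    "[eacute]": "\u00e9",
    "[mdash ]": "--",
}


class UnknownSdataError(LookupError):
    """Raised when the table has no rendition for an entity text."""


def render_sdata(entity_text, table=SDATA_TABLE, strict=True):
    # Take the entity text and look it up in the conversion table.
    rendition = table.get(entity_text)
    # Output the string that the table supplies, if there is one.
    if rendition is not None:
        return rendition
    # If not, complain (or not).  This part is undefined, so the
    # `strict` flag is an assumption standing in for a mandated choice.
    if strict:
        raise UnknownSdataError(entity_text)
    # Non-strict fallback: pass the entity text through unchanged.
    return entity_text
```

The procedure for private values differs only in that the key is a
number rather than a string.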

I feel constrained to point out that the approach of using private
values from 10646 to denote characters not in this standard set
obliges the processor to implement an almost identical procedure.

If you encounter such a value, you:

--- take the number

--- look it up in a table of number-to-local-rendition conversions

--- output the string that the table supplies, if there is one

--- if not, complain (or not)

The idea behind the existence of the private values in the standard is
that in your local system you know what they mean and can take
appropriate action.  Perhaps you've even got a system that uses 10646
as its native character set.  But if we're designing a language that
is to be used on the net, for which a basic assumption has to be that
you don't know much in advance about the capabilities of the target
system, then you have to assume that a quite different character set
may be used for rendition there; and certainly they can't know about
your private use of the private values.

As has been said here before, the difference between these two
approaches is in what happens if you haven't managed to communicate
your eccentric private practices to the document's recipient.  In the
first case, you might have something of the form "LOOKS LIKE A Q BUT
WITH THREE TAILS".  In the second, you have a number that is by
definition undefined.
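
To make the contrast concrete, here is a small sketch of what each
approach has left to show when the recipient's table has no entry.  The
tables are deliberately empty, the function names are hypothetical, and
U+E000 is just an arbitrary value chosen from the private use area.

```python
# Both tables are empty: the recipient was never told about the
# sender's private practices.  All names here are hypothetical.
sdata_table = {}
private_use_table = {}


def fallback_for_sdata(entity_text):
    # With no table entry, the entity text itself is all that is left,
    # and it may happen to be a human-readable description.
    return sdata_table.get(entity_text, entity_text)


def fallback_for_private_value(code_point):
    # With no table entry, only the bare number is left.
    return private_use_table.get(code_point, "U+%04X" % code_point)


print(fallback_for_sdata("LOOKS LIKE A Q BUT WITH THREE TAILS"))
print(fallback_for_private_value(0xE000))
```

The first line printed at least tells a human reader something; the
second is only a number.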

John Lavagnino

