- From: <John_Lavagnino@Brown.edu>
- Date: Tue, 22 Oct 1996 22:04:05 -0400
- To: w3c-sgml-wg@w3.org
Michael wrote, regarding suggestions about the handling of SDATA entities:

| As to the other proposition, that it is widely understood, I can
| only say that if it is widely understood, then surely someone can be
| found to describe the behavior on this list explicitly enough that
| the rest of us can also understand it. I for one do not understand
| the behavior, and would really like a description, preferably with
| reference to some documentation.

If you encounter an SDATA entity, you:

--- take the entity text
--- look it up in a table of SDATA-to-local-rendition conversions
--- output the string that the table supplies, if there is one
--- if not, complain (or not; this part is indeed undefined, but it is possible to mandate some particular behavior)

I feel constrained to point out that the approach of using private values from 10646 to denote characters not in this standard set obliges the processor to implement an almost identical procedure. If you encounter such a value, you:

--- take the number
--- look it up in a table of number-to-local-rendition conversions
--- output the string that the table supplies, if there is one
--- if not, complain (or not)

The idea behind the existence of the private values in the standard is that in your local system you know what they mean and can take appropriate action. Perhaps you've even got a system that uses 10646 as its native character set. But if we're designing a language that is to be used on the net, for which a basic assumption has to be that you don't know much in advance about the capabilities of the target system, then you have to assume that a quite different character set may be used for rendition there; and certainly they can't know about your private use of the private values.

As has been said here before, the difference between these two approaches is in what happens if you haven't managed to communicate your eccentric private practices to the document's recipient.
In the first case, you might have something of the form "LOOKS LIKE A Q BUT WITH THREE TAILS". In the second, you have a number that is by definition undefined.

John Lavagnino
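
To make the comparison concrete, the two lookup procedures described above might be sketched roughly as follows. This is only an illustration under stated assumptions: the function names, the table contents, and the bracketed fallback strings are hypothetical, and the "complain (or not)" step is filled in here with one arbitrary choice, since that behavior is left undefined.

```python
# Hypothetical local-rendition tables; the entries are illustrative only.
SDATA_TABLE = {"[eacute]": "\u00e9"}
PRIVATE_USE_TABLE = {0xE000: "\u00e9"}


def render_sdata(entity_text, table):
    """Take the entity text, look it up, and output the table's string."""
    rendition = table.get(entity_text)
    if rendition is not None:
        return rendition
    # Assumed fallback: the entity text itself is still readable by a human.
    return f"[unknown SDATA: {entity_text}]"


def render_private_use(code_point, table):
    """Take the number, look it up, and output the table's string."""
    rendition = table.get(code_point)
    if rendition is not None:
        return rendition
    # Assumed fallback: all that can be shown is the number itself.
    return f"[unknown private-use value: U+{code_point:04X}]"


if __name__ == "__main__":
    # A recipient whose tables were never told about the sender's conventions.
    print(render_sdata("LOOKS LIKE A Q BUT WITH THREE TAILS", {}))
    print(render_private_use(0xE000, {}))
```

With empty tables on the receiving side, the first call can at least display the descriptive entity text, while the second has only a bare number to report.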
Received on Tuesday, 22 October 1996 22:03:18 UTC