Re: A17: keep or drop entities?

From: <DAVEP@acm.org>
Date: Thu, 10 Oct 1996 22:35:31 -0500 (CDT)
To: Charles@sgmlsource.com
Cc: W3C-SGML-WG@w3.org

On Thu, 10 Oct 1996, Charles F. Goldfarb wrote:

>On Thu, 10 Oct 1996 10:12:13 -0400, "Eve L. Maler" <elm@arbortext.com> wrote:

>>At 04:49 AM 10/10/96 GMT, Charles F. Goldfarb wrote:

>>>                                                The keyword SDATA in the ISO
>>>character entity set is unnecessary because the replacement text is a symbolic
>>>string. (My original intention was that a system would use an equivalent entity
>>>set in which the replacement text was real system data.)

>>The [xxxxxx] replacement text "templates" have been widely implemented 
>>to produce the desired glyphs.  But this doesn't mean they're not system
>>data, does it?  It's still essentially a "processing instruction that
>>returns data" (clause 8).  Regular internal text entities aren't 
>>supposed to have this property.

>Eve has made a very sensible observation, so let me explain my reasoning.

>There are two principal purposes for labeling SDATA and PI:
>1. To make it easy to locate and revise or remove system-specific information.
>This, of course, enhances document portability and reuse by containing system
>2. To prevent generated text from being parsed in context with the SGML
>document. This enhances portability and reuse by assuring that all applications
>will "see" the same data.

>The symbolic replacement text in the ISO 8879 character entity sets don't
>present a problem on either of those counts. They are not system-dependent and
>they parse identically in all environments. That is because the generation of
>system-specific data takes place in the *result* document; it is never seen by
>the parser. In pernicious SDATA, the entity text is system-specific and
>therefore needs to be labeled.

There may have been just two principal purposes originally, but these days
there is certainly a third: distinguishing symbolic replacement text from
ordinary characters.  When one inserts an ordinary text entity, the
replacement text is specifically intended to become characters in the
document.  When "&amp;" is inserted in a document, one does not normally
intend it to be the same as though "[amp   ]" were typed directly in the
document.  Many processors rely on the SDATA designation to trigger that
special handling.  Leave it out and we're in trouble.
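
To make the distinction concrete, here is a rough sketch in Python of how a
processor might branch on the SDATA label.  The names `Entity`, `GLYPHS`, and
`expand` are hypothetical and chosen only for illustration; no particular
SGML system is claimed to work this way.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entity:
    """A declared entity: its name, replacement text, and SDATA flag."""
    name: str
    text: str
    is_sdata: bool


# Hypothetical system-side table mapping symbolic strings to real glyphs.
GLYPHS = {"[amp   ]": "&"}


def expand(entity: Entity) -> str:
    """Return what ends up in the result document for this entity."""
    if entity.is_sdata:
        # SDATA: the symbolic string is a request for a glyph, not text.
        return GLYPHS.get(entity.text, entity.text)
    # Ordinary text entity: the replacement text is literal characters.
    return entity.text


sdata_amp = Entity("amp", "[amp   ]", is_sdata=True)
plain_amp = Entity("amp", "[amp   ]", is_sdata=False)

print(repr(expand(sdata_amp)))
print(repr(expand(plain_amp)))
```

With the label, the processor can substitute the glyph; without it, the
bracketed string is indistinguishable from text the author typed.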

Dave Peterson

Received on Thursday, 10 October 1996 23:35:50 UTC