- From: Christopher R. Maden <crm@ebt.com>
- Date: Wed, 25 Sep 1996 20:57:23 GMT
- To: w3c-sgml-wg@w3.org
Several people have proposed defining the RE/RS problem out of existence by defining the RE and RS function characters as codes that won't occur in entities, via the SGML application for XML.

James Clark takes the stand that the definitions of RE and RS specify which codes the parser should communicate to the application when records are encountered, so redefining them won't change whether or not they occur. He also posits that lines in an input file from DOS or UNIX should be interpreted as records.

That certainly seems a reasonable interpretation, but I can't find anything to that effect in 8879. Clause 7.6.1, "Record Boundaries", defines the rules for ignoring or preserving REs, but doesn't say anything about when the parser generates an RS or RE signal. Charles Goldfarb's commentary thereto (pp. 321-322 of the SGML Handbook) discusses translating lines into records, but that's not normative.

The best normative thing I can find is 4.140, "function character identification parameter: A parameter of an SGML declaration that identifies the characters assigned to the RE, RS, and SPACE functions, and allows additional functions to be defined." This suggests that, since characters are assigned to functions, the characters in the document should assume the roles of these functions; ergo, if non-occurring characters are the ones assigned to those roles, the function characters never occur.

Is that not the intended meaning? If not, what is?

I think that, if the RE/RS problem can be redefined out of existence, it can be handled very easily at the application level. Some have suggested this already; I outlined a proposal in conversation with Gavin Nicol, and he seemed to think it worthwhile. I'll send that in another message if others agree that it is possible for an application of ISO 8879:1986 (not :2001) to define every entity to have a single record.
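To make the redefinition under discussion concrete, here is a rough sketch of the function character identification parameter. The first fragment shows the assignments as they appear in the reference concrete syntax. The second is purely hypothetical: the codes 28 and 29 are arbitrary placeholders for "characters that won't occur in entities", not values anyone on the list has proposed, and whether a parser would then stop signalling record boundaries is exactly the open question above.

```sgml
-- Reference concrete syntax: RE is carriage return, RS is line feed --
FUNCTION RE          13
         RS          10
         SPACE       32
         TAB SEPCHAR  9

-- Hypothetical variant: RE and RS moved to codes assumed never to occur.
   28 and 29 are placeholder values chosen only for illustration. --
FUNCTION RE          28
         RS          29
         SPACE       32
         TAB SEPCHAR  9
```

Under the reading of 4.140 given above, the second form would leave characters 13 and 10 with no record function at all; under James Clark's reading, the parser would still treat each input line as a record and merely report different codes for its boundaries.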
-Chris
--
<!NOTATION SGML.Geek PUBLIC "-//GCA//NOTATION SGML Geek//EN">
<!ENTITY crism PUBLIC "-//EBT//NONSGML Christopher R. Maden//EN" SYSTEM
"<URL>http://www.ebt.com <TEL>+1.401.421.9550 <FAX>+1.401.521.2030
<USMAIL>One Richmond Square, Providence, RI 02906 USA" NDATA SGML.Geek>
Received on Wednesday, 25 September 1996 17:06:53 UTC