- From: Steven J. DeRose <sjd@ebt.com>
- Date: Thu, 12 Sep 1996 18:05:52 -0400
- To: Gavin Nicol <gtn@ebt-inc.ebt.com>, tbray@textuality.com
- Cc: w3c-sgml-wg@w3.org
At 05:41 PM 09/12/96 GMT, Gavin Nicol wrote: >Well, from my reading of the SGML Handbook, it seems that RE and RS >are not *required* at all. If they occur, they are put there by the >entity manager. In fact, RE and RS are not really even characters per >se, they are kind of psuedo-characters (they have a code, and a name, >but they aren't real characters). Right. For example, consider a file from VM (IBM mainframe OS): VM does not represent record boundaries by characters: instead, it either makes all lines the same length (RECFM F to us ancients), or puts a length prefix in front of each record (RECFM V). So RS and RE clearly cannot be characters in the source document; if they show up at all, it is because the entity manager inserted them. Same is true on other systems, even though RS looks a lot like linefeed, and RS looks a lot like carriage return, they're not the same thing. So all you need do is tell your entity manager that all your files are recordless, and that CR and LF are merely whitespace characters (spacechar) on par with TAB. To avoid numeric conflict, you'd want to move either them or RS and RE somewhere else. Having done this, you're safe. It would be most helpful for the SGML revision to add the ability to turn OFF any delimiter of function character (presumably by setting it to #OFF, or the null string, or some such). Then we wouldn't have to go find two unoccupied character codes to assign RS and RE to. Such a change would also have the benefit that any other delimiters XML doesn't use (DSO? PIO? whatever) could be turned off. Why is that important? Let's say we ended up not having PIs (just an example!!!): Then our grammar would have no mention of the string "<?", so I suppose it would be treated as literal data if it showed up in an XML file. Oooooooops, so much for SGML being able to parse XML data. But if we could turn off the PIO delimiter, we'd be home free. Same argument for various other delimiters we might moot. Steve
Received on Thursday, 12 September 1996 18:08:29 UTC