- From: Gavin Nicol <gtn@ebt.com>
- Date: Thu, 12 Sep 1996 17:41:59 GMT
- To: tbray@textuality.com
- CC: w3c-sgml-wg@w3.org
>>David Durand and I independently came up with a good way of dealing
>>with such things
>
>I've heard about this but never seen it. Could you or David or
>someone please post it to the group? In our informal discussions
>before the advent of the WG, figuring out what to do about RS/RE, without
>busting our 8879 compliance, was one of the most worrying things.

Well, from my reading of the SGML Handbook, it seems that RE and RS are
not *required* at all. If they occur, they are put there by the entity
manager. In fact, RE and RS are not really even characters per se; they
are kind of pseudo-characters (they have a code and a name, but they
aren't real characters).

Anyway, if we assigned them character codes that are guaranteed never
to occur in input, then the parser would never even see them (another
syntax trick). This will, of course, mean that \n and \r will be seen
in content, but they could be mapped such that they get converted to a
space on input.

I don't claim to be intimately familiar with all the effects that this
will have in terms of markup recognition etc., but it seems to me that
this would simplify parsers a great deal, and also get around problems
with MIME text type requirements (canonical form) etc.
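
To make the idea concrete, here is a rough sketch of the input mapping, not a worked-out design. The function name normalize_input and the choice of U+FFFE/U+FFFF as the "never occurs in input" codes for RS and RE are assumptions made purely for illustration; any codes the entity manager can guarantee are absent from input would do.

```python
# Hypothetical code points assigned to RS and RE. They are assumed never
# to appear in real input, so the parser never sees a record boundary.
RS_CODE = 0xFFFE
RE_CODE = 0xFFFF

# Line-break characters that do occur in input are mapped to a space.
_LINE_BREAK_MAP = {ord("\r"): " ", ord("\n"): " "}


def normalize_input(text: str) -> str:
    """Return text with every \\r and \\n replaced by a single space.

    Raises ValueError if the text contains a code reserved for RS or RE,
    since the scheme depends on those codes never occurring in input.
    """
    for ch in text:
        if ord(ch) in (RS_CODE, RE_CODE):
            raise ValueError("input contains a code reserved for RS/RE")
    return text.translate(_LINE_BREAK_MAP)
```

Note that this maps each character separately, so a \r\n pair comes out as two spaces; whether such a pair should collapse to one space is left open here.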
Received on Thursday, 12 September 1996 13:43:21 UTC