- From: Christopher R. Maden <crm@ebt.com>
- Date: Thu, 12 Dec 1996 21:08:28 GMT
- To: w3c-sgml-wg@w3.org
[Paul Prescod] > It isn't really the legacy issue I'm worried about. It is requiring > every new DSSSL script to include code to detect and destroy > whitespace that the author thought (reasonably) was only for > pretty-printing. It is even more than that...it is the idea of > having pretty-printing whitespace be something that is handled by > the application *at all*. That's out of whack with most people's > understanding of a parser's job. C++ parsers don't return > "whitespace nodes" between tokens that the C++ "back end" must > detect and delete. This is already a necessity. The 8879 rules don't eliminate any internal whitespace, and they don't eliminate all leading and trailing whitespace. If I have a DSSSL stylesheet on top of an 8879 parser, I am going to have routines to strip and compress whitespace. RE delenda est does not introduce a new pain here. (I wouldn't strip whitespace if I expected the element to be preformatted, but in that case, I'd just as soon that the parser give me *all* of my whitespace, not *most* of it.) > Well, that's the XML status quo. I hoped that we could do better. I > think it will be rather annoying to have to know when I code my > style sheets whether the application uses an XML parser or an SGML > parser. I might just instruct authors to avoid whitespace between > elements altogether, since it will not be reliably interpreted > (which, I guess, is what some people want). If every parser passed all the whitespace, then all parsers would give the same parse tree. As long as we have DTD-less parsing, this is the *only* option that will give the same parse tree. Without a DTD, everything must be assumed to be mixed content, and with a DTD, any heuristic that distinguishes between element and mixed content will result in a different parse tree. -Chris -- <!NOTATION SGML.Geek PUBLIC "-//GCA//NOTATION SGML Geek//EN"> <!ENTITY crism PUBLIC "-//EBT//NONSGML Christopher R. Maden//EN" SYSTEM "<URL>http://www.ebt.com <TEL>+1.401.421.9550 <FAX>+1.401.521.2030 <USMAIL>One Richmond Square, Providence, RI 02906 USA" NDATA SGML.Geek>
Received on Thursday, 12 December 1996 16:19:29 UTC