- From: Robert Streich <streich@slb.com>
- Date: Mon, 23 Sep 96 20:08:46 CDT
- To: w3c-sgml-wg@w3.org
At 11:07 AM 9/23/96 +0000, James Clark wrote: >At 01:50 23/09/96 CDT, Robert Streich wrote: >>I think they are pretty straightforward: >> >> In data content: >> 1. If an element begins or ends with a newline [not entirely >> accurate, but this is what people see], the newline is ignored. >> 2. Newlines inside markup are ignored. >> 3. All other newlines are passed on. > >According to your statement of the rules all of the three newlines in the >following would be passed on: > ><doc><?> ><?> >data ><?></doc> > >In fact none of them are. Exactly right on both counts. I'm willing to accept this inconsistency with SGML. An XML/SGML editor could write it out such that it would parse the same in SGML, i.e., <doc><?><?>data<?></doc> or <doc ><? ><? >data<? ></doc> Someone who was hand coding the doc would have to know how to conceal the newlines if they were in an element where newlines were considered "significant." But then, I'm inclined to believe that they would expect them in a case like this. If the line endings weren't considered significant, I'd expect a browser to discard them just as it would any other leading or trailing whitespace. With this approach, the only place you run into a problem is when you have significant line breaks. I'd rather have this then get rid of them altogether. Leave it up to the application to deal with the whitespace. If we don't have shortrefs then it's no longer important for the parser to care about what whitespace character it is. Tabs, newlines, carriage-returns, spaces, all get treated the same in data content--passed through. bob Robert Streich streich@slb.com Schlumberger voice: 1 512 331 3318 Austin Research fax: 1 512 331 3760
Received on Monday, 23 September 1996 21:09:09 UTC