- From: David G. Durand <dgd@cs.bu.edu>
- Date: Thu, 26 Sep 1996 14:09:01 -0400
- To: w3c-sgml-wg@w3.org
Having finally finished reading all the back mail, I'm ready to try to jump back into the discussion.

At 8:43 PM 9/25/96, Paul Prescod wrote:
>At 06:14 PM 9/25/96 EDT, lee@sq.com wrote:
>>That's correct. For what it's worth, SoftQuad Panorama can display SGML
>>tables with newlines between the tags, even if PCDATA is allowed there.
>>It isn't particularly hard to implement, as far as I can see.
>
>But should every SGML application have to implement it over and over again?
>That means that between and within ANY ELEMENT you would have to explicitly
>look out for "meaningless" newlines. Instead of implementing the handling in
>the parser (which code we expect to be used over and over again) you must
>implement it in the application.
>
>Then you have to define in your DTD-documentation that newlines in that
>context are going to be interpreted as "meaningless" which means that we are
>shifting the documentation and education burden to application designers.

The example under consideration was the table:

<TABLE><TR><TD>1</TD><TD>2</TD><TD>3</TD><TD>4</TD></TR><TR><TD>1</TD><TD>2</TD><TD>3</TD><TD>4</TD></TR></TABLE>

and the desire to format it, thus:

<TABLE>
<TR><TD>1</TD><TD>2</TD><TD>3</TD><TD>4</TD></TR>
<TR><TD>1</TD><TD>2</TD><TD>3</TD><TD>4</TD></TR>
</TABLE>

Since in the DTD-less case we don't know that the table is element content, we are unable to remove the whitespace. Now, we could fix this by having DTD-less processing differ from DTD-ful processing, but I agree with most of you (I expect) that this is a bad idea.

But, why not format the table like this:

<TABLE><TR
><TD>1</TD><TD>2</TD><TD>3</TD><TD>4</TD></TR
><TR><TD>1</TD><TD>2</TD><TD>3</TD><TD>4</TD></TR
></TABLE>

This looks a little weird, but you already have to do this in netscape (at least for TD elements) because in tables leading whitespace is not ignored.
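The difference between the three layouts can be sketched with a deliberately naive, DTD-less tokenizer. This is only an illustration under stated assumptions, not any real SGML or XML parser: the regular expression, the helper names (`text_chunks`, `whitespace_only`) and the constants are all hypothetical. It treats anything between "<" and ">" as a tag and everything else as character data, so line breaks placed inside the tags never reach the application as data.

```python
import re

# Hypothetical, deliberately naive tokenizer: a tag is anything between
# "<" and ">" (newlines included); everything else is character data.
TOKEN = re.compile(r"<[^>]*>|[^<]+")


def text_chunks(doc):
    """Return the character-data chunks a DTD-less reader would report."""
    return [tok for tok in TOKEN.findall(doc) if not tok.startswith("<")]


def whitespace_only(doc):
    """Return the chunks that consist of nothing but whitespace."""
    return [chunk for chunk in text_chunks(doc) if chunk.strip() == ""]


CELLS = "<TD>1</TD><TD>2</TD><TD>3</TD><TD>4</TD>"
ROW = "<TR>" + CELLS + "</TR>"

# The three layouts discussed above.
COMPACT = "<TABLE>" + ROW + ROW + "</TABLE>"
PRETTY = "<TABLE>\n" + ROW + "\n" + ROW + "\n</TABLE>"
INSIDE = "<TABLE><TR\n>" + CELLS + "</TR\n><TR>" + CELLS + "</TR\n></TABLE>"

if __name__ == "__main__":
    # Newlines between tags show up as data the application must filter.
    print(whitespace_only(PRETTY))
    # Newlines inside tags produce no extra character data at all.
    print(whitespace_only(INSIDE))
    print(text_chunks(INSIDE) == text_chunks(COMPACT))
```

Without a DTD, nothing tells this tokenizer that the newlines in the second layout are insignificant, so they arrive as three whitespace-only chunks; the third layout yields exactly the same character data as the unformatted table.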
We could then add a further "application" convention: whitespace is ignored according to common convention, except when a stylesheet requests verbatim processing for an element. Come down to it, I'd rather add a "verbatim" declaration to the DTD.

Incompatibility with 8879 is actually not an issue here, anyway, as the entity manager is _never_ required to report RS/RE to a parser. If current entity managers insist on recognizing CR and LF as record boundaries, then we should live with the incompatibility, and encourage the development of simpler entity managers that are not so "obliging".

The RS/RE stuff in SGML was supposed to make life easier for taggers. Experience has shown that, arguments about "true content" or not, the rules do not work, as even SGML experts can disagree about their meaning. We are in danger of making compatibility with SGML's mis-features a millstone around our necks.

>A DTD-less parser doesn't know or care that it is dealing with shortref. It
>would treat '"' as "PCDATA Start" and "PCDATA End".

This is correct. However, XML would be required to look stupid by quoting things that are already clearly delimited (by tags), and would be permanently and completely incompatible with all existing SGML and pseudo-SGML (i.e. HTML) documents.

Using quotes to preserve the SGML newlines is like wearing glasses to fix your outdated contact-lens prescription. Ignoring whitespace around markup is an idea that has outlived its usefulness. Let's bury it.

> Paul Prescod

RE delenda est.

   -- David

--------------------------------------------+--------------------------
David Durand                  dgd@cs.bu.edu | david@dynamicDiagrams.com
Boston University Computer Science          | Dynamic Diagrams
http://www.cs.bu.edu/students/grads/dgd/    | http://dynamicDiagrams.com/
Received on Thursday, 26 September 1996 14:05:27 UTC