- From: David G. Durand <dgd@cs.bu.edu>
- Date: Sat, 28 Sep 1996 21:37:35 -0400
- To: w3c-sgml-wg@w3.org
At 10:10 PM 9/19/96, Charles F. Goldfarb wrote: >On Thu, 19 Sep 1996 13:30:51 +0000, Tim Bray <tbray@textuality.com> wrote: > >>1. Why ignore whitespace? >> >>Given really simplistic RS/RE handling, it would be the case that the two >>following "P" elements would parse differently. >> >><p>Listen to my heart beat.</p> >><p> >>Listen to my heart beat. >></p> >> >>The position has been advanced by both Charles Goldfarb and James Clark >>that this would be A Bad Thing. Obviously, it would complicate the >>problem of achieving compatibility with 8879. Aside from that, >>WHY IS THIS A PROBLEM? > >A principal objective of SGML is that all applications should receive the same >"true information" about the document. When an SGML document is created with an >editor that preserves line breaks (which SGML calls "record" breaks to avoid >confusion with formatted output lines), the possibility exists that some record >breaks are not part of the "true information". For example, in > ><p>Listen to my heart beat. ><?DIRECTOR: audio on> >And beat and beat and beat.</p> > >the true information is: > >"Listen to my heart beat. >And beat and beat and beat." > >because the record end after the PI is not part of the data .... Further examples and discussion deleted.... >I'm not sure this rant is even relevant. Anyway, perhaps now you see why it is, >in fact, a fatal problem, as it goes to the core of the purpose of SGML: If a >markup language can't preserve and interchange the true information, it is >useless. > >Best regards, > >Charles I just want to note that this is a textbook instance of begging the question: The definition of "true content" presupposes that RE after markup should not be significant (the core of the confusion around the SGML RE rules). With that defintion in hand we can quickly come to the conclusion that SGML's RE/RS rules are necessary to return the true content. Why should initial whitespace be hidden fromthe application in XML. We all agree that we can put the CR _within_ a tag, so where's the problem? And if we are willing to forgo that one convenience, we can even be compatible with 8879, since it is the entity manager that decides whether RE exists and goes to the parser. I know that changing YASP to process this correctly would take about 10 minutes. I'd be surprised if it took longer for SP. This is an incompatibility we could all live with. -- David RE delenda est. --------------------------------------------+-------------------------- David Durand dgd@cs.bu.edu | david@dynamicDiagrams.com Boston University Computer Science | Dynamic Diagrams http://www.cs.bu.edu/students/grads/dgd/ | http://dynamicDiagrams.com/
Received on Saturday, 28 September 1996 21:33:48 UTC