Re: RS/RE: basic questions

At 9:26 AM 9/23/96, Michael Sperberg-McQueen wrote:
>To come back to our core test:
>Are the RE rules *essential* to good representation of textual data?
>I haven't heard anyone argue this position, only other positions:
>  - the RE rules are part of 8879, and our goal of alignment with
>    8879 requires that we follow them somehow
>  - the RE rules are a convenience for data entry
>  - the RE rules are simpler to explain than the alternatives
>The only answer I know to the first argument is that in this particular
>case strict conformance may exact too high a price in simplicity and
>comprehensibility.  I'd like to minimize the difference between SGML and
>XML in this area, but it may be that this is one place where the
>conformance goal has to give.  (I suppose this boils down to my saying
>that for me, the RE rules don't even pass the weaker test James was
>arguing against:  I *don't* like them that much, and they *don't*
>seem reasonably easy to implement.)

   Hear! Hear!

>About the other two positions there is nothing to say except that they
>do not seem to me to be true; others seem to find them more persuasive.

As a counterexample to their necessity, I'll note that even "Joe HTML" has
learned that whitespace needs to be used carefully (to get images and
tables to format correctly). It's a pain to be forced not to use a
line-break, but the rule "line-breaks always count as whitespace" is
certainly very easy to remember. Like Michael, I don't value compatibility
with 8879 so much that I would preserve the current rules.

   There are still some hard questions about how, and whether, to try to
normalize the marking of input lines (i.e., should we preserve the input
character stream intact, convert to an XML-standard line-end convention
before we send data to the application, require a standard line-end
convention on XML input, or simply ignore line endings entirely and require
markup for significant line ends?). But I think the current rules for RS/RE
just don't make the grade. We should not let our decisions about other SGML
features (like inclusions, comments, and marked sections) depend on the
handling of this one issue -- we should resolve markup features on their
value as markup, not their ability to preserve line-ending compatibility
with 8879.
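
   To make the "convert to a standard convention before the application
sees the data" option concrete, here is a minimal sketch in Python. It is
illustrative only: the function name normalize_line_ends and the choice of
a single LF as the standard line end are assumptions of mine, not anything
that has been agreed.

```python
def normalize_line_ends(text: str) -> str:
    """Map every CR LF pair and every bare CR to a single LF.

    Assumption: LF is the standard line end handed to the application.
    """
    # Replace CR LF first so a pair does not turn into two line ends.
    return text.replace("\r\n", "\n").replace("\r", "\n")


if __name__ == "__main__":
    mixed = "<p>one\r\ntwo\rthree\nfour</p>"
    print(repr(normalize_line_ends(mixed)))
```

   The point of the sketch is only that such a rule fits in one line and
needs no knowledge of the surrounding markup, unlike the current RS/RE
rules.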

   I think treating the entire SGML input as a single record is still the
best way to go, and it can be implemented by any entity manager. If we are
not compatible with current parsers on this feature, that can be regarded
as _strong_ input to the 8879 revision.

   RS/RE Delenda Est!

>-C. M. Sperberg-McQueen

   Never mind that if Michael can't understand the rules easily, then most
of the rest of us who think we do understand them are probably wrong...

   -- David

David Durand                  dgd@cs.bu.edu | david@dynamicDiagrams.com
Boston University Computer Science          | Dynamic Diagrams
http://www.cs.bu.edu/students/grads/dgd/    | http://dynamicDiagrams.com/