Re: RS/RE, again (sorry)

Paul Prescod <papresco@calum.csclub.uwaterloo.ca>
> It isn't really the legacy issue I'm worried about. It is requiring every
> new DSSSL script to include code to detect and destroy whitespace that the
> author thought (reasonably) was only for pretty-printing. It is even more
> than that...it is the idea of having pretty-printing whitespace be something
> that is handled by the application *at all*. That's out of whack with most
> people's understanding of a parser's job. C++ parsers don't return
> "whitespace nodes" between tokens that the C++ "back end" must detect and
> delete.

Paul, I don't think there's any worthwhile analogy between parsing SGML
and parsing C, C--, or any other "programming" language. The purposes
and expectations on the output are entirely different. If you're
accustomed to publishing, you're accustomed to worrying about white-space.

> >> Every application that is currently based on an SGML parser will
> >> get a different parse tree from an XML parser.
> >
> >Yes.  And that's fine.
> Well, that's the XML status quo. I hoped that we could do better. I think it
> will be rather annoying to have to know when I code my style sheets whether
> the application uses an XML parser or an SGML parser. I might just instruct
> authors to avoid whitespace between elements altogether, since it will not
> be reliably interpreted (which, I guess, is what some people want).

I would hope that anytime you write a DSSSL stylesheet, you wouldn't make
any assumptions about what an SGML parser would do with white-space. If you
do, you still have more faith in software than I do. Call me paranoid, but
I would always build in rules for collapsing white-space where I didn't
want. I'd rather be safe than sorry.