Re: RS/RE: basic questions

There is a proposal near the end of this article that is very like Paul P's,
but I think retains more backward compatibility with existing SGML tools.


Paul Prescod proposed:
> #1. RE's are not significant, except within verbatim elements <" "> (which
> can only contain data content, no markup).
> 
> #2. RE's within verbatim elements ARE significant (i.e. the parser passes
> them to the application).
> 
> #3. RE's between words (i.e. not immediately following or preceding markup)
> are significant.

You also need the newline to be significant between <emph>these</emph>
<number>two</number> lines.

I don't quite understand this verbatim markup -- how do you do it in SGML
products that don't allow you to change the RCS beyond increasing NAMELEN?

There's no point in practice in being compatible with 8879 -- instead, XML
has to be compatible with actual tools.  Probably the most widespread
tools that read SGML are HoTMetaL, Panorama, Adept, and A/E, with NSGMLS
and SGMLS and Omnimark being the most widespread on the `next layer down'
(conceptually, I don't mean to deprecate them!).  I don't list DynaText
because the viewer is the widespread part, and it reads a compiled form.
As far as I know, only the tools in the second group I mentioned support
shortref or datatag or making " a name start character (is that what's
going on there with <">...</">?).

InContext, Microstar's editor (Far and Wide?  I don't mean Near & Far),
SoftQuad Explorer, LivePage, and many many other tools that are perhaps
less widespread than the ``biggies'' (e.g. because newer) also don't, as
far as I know, support such things.  Correct me privately if I am wrong;
I will gladly post a summary.  In fact, perhaps this is the sort of
information that SGML OPEN could maintain?

At any rate, if most of the most widely deployed SGML tools won't support
XML because it needs features that they don't implement, the SGML 
compatibility of XML buys little or nothing, I think.

I know that RS/RE is an issue for James in SP (and hence in nsgmls);
SoftQuad can cope with making all whitespace significant, and the heck
with RS/RE, as our Fearless Leader used to say.

Again, if there are any other software writers represented here whose
current shipping software could not cope with making all whitespace
significant, and could not supply a script or patch or upgrade or
sidegrade or whatever within, say, six to twelve months, send me mail
and I will post a summary early next week.

On the other hand, let me know if you _could_ cope with this:

* any sequence of whitespace characters is equivalent to a single blank
* there is no distinction between spaces and newlines
* an application that processes XML may reduce whitespace to a canonical
  format, or to an equivalent format, or may leave the whitespace
  untouched.

In this world, multiple spaces and newlines would have to be quoted in
some way.
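
To make the three rules concrete, here is a rough sketch in Python of
what "reduce whitespace to a canonical format" might mean.  The names
canonical_whitespace and equivalent are made up for illustration, and
I am assuming that "whitespace" means space, tab, CR and LF only; none
of this is part of any proposal.

```python
import re

# Assumption: "whitespace" means space, tab, carriage return, line feed.
_WS_RUN = re.compile(r"[ \t\r\n]+")

def canonical_whitespace(text: str) -> str:
    """Collapse every run of whitespace characters to a single blank."""
    return _WS_RUN.sub(" ", text)

def equivalent(a: str, b: str) -> bool:
    """Two strings are equivalent if they share a canonical form."""
    return canonical_whitespace(a) == canonical_whitespace(b)
```

Under these rules an application could store either the original text
or its canonical form, since the two compare as equivalent.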

A verbatim element might be acceptable, although it would need to be
syntactically distinguished in a way that was still acceptable to most
existing SGML software (e.g. no changing name start characters).

For example, suppose all verbatim elements began their names with
    xml.nf.

e.g.:
<xml.nf.listing>In this listing,
  multiple     spaces
are  not compressed to single 
blanks as they are elsewhere, and
newlines
and
spaces
are distinct.</xml.nf.listing>
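
A processor could recognise such elements by name alone.  The sketch
below (Python, for illustration only) collapses whitespace everywhere
except inside elements whose names start with xml.nf.  It is not an
SGML parser: the tag pattern is a crude assumption that ignores
comments, marked sections and ">" inside attribute values, and the
function name normalize is hypothetical.  Names are folded to lower
case on the assumption that general names are case-insensitive.

```python
import re

VERBATIM_PREFIX = "xml.nf."  # naming convention suggested above

# Crude tag matcher (an assumption, not real SGML tag recognition).
_TAG = re.compile(r"<(/?)([A-Za-z][A-Za-z0-9.\-]*)[^>]*>")
_WS_RUN = re.compile(r"[ \t\r\n]+")

def normalize(doc: str) -> str:
    """Collapse whitespace runs except inside xml.nf.* elements."""
    out = []
    depth = 0  # number of currently open verbatim elements
    pos = 0
    for m in _TAG.finditer(doc):
        text = doc[pos:m.start()]
        # Data inside a verbatim element is passed through untouched.
        out.append(text if depth else _WS_RUN.sub(" ", text))
        out.append(m.group(0))
        if m.group(2).lower().startswith(VERBATIM_PREFIX):
            if m.group(1):               # end tag
                depth = max(0, depth - 1)
            else:                        # start tag
                depth += 1
        pos = m.end()
    tail = doc[pos:]
    out.append(tail if depth else _WS_RUN.sub(" ", tail))
    return "".join(out)
```

The point is only that the decision needs nothing beyond the element
name, so no change to name start characters is required.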

If we can all live with this, and there is no implementation problem,
we can stop talking about it and move on...

Lee

Received on Wednesday, 2 October 1996 22:44:46 UTC