Re: RS/RE considered confusing?
>The Durand/Nicol proposal on RS/RE is, if I've understood it
>correctly, that when an XML document is parsed by an SGML parser, then
>it will use an SGML declaration that includes something like:
> RE 65531 -- not a legal ISO 10646 character --
> RS 65532 -- not a legal ISO 10646 character --
> SPACE 32
> TAB SEPCHAR 9
> CR SEPCHAR 13
> LF SEPCHAR 10
>1. Most SGML parsers won't handle such an SGML declaration. The
>practical benefit for XML of SGML compatibility is to enable SGML
>tools to be used on XML documents. If XML uses features of SGML that
>aren't implemented in most SGML parsers, this practical benefit is
Does SP allow this?
>2. It's not clear to me that this in general will work with a
>conforming SGML system. The entity manager is supposed to transform
>whatever mechanism the OS uses for representing lines into RS/RE.
I have never heard that "lines" are equivalent to "records". Many OSs
do not have "records" at all.
>4. It is a fact of life that OSs delimit lines using different
>methods. One of the ideas underlying the RS/RE concept in SGML is
>that these different methods should be canonicalized into a single form,
>so that applications are isolated from these differences.
Again, who says that lines and records are equivalent?
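For concreteness, here is a rough Python sketch of the canonicalization described in point 4. It assumes, as the quoted text does, that each line is one record, which is exactly the assumption I am questioning. The function name to_records is made up for illustration; the RS and RE codes (10 and 13) are the ones from the reference concrete syntax, not the 65531/65532 values in the proposal.

```python
import re

RECORD_START = "\n"  # RS, code 10 in the reference concrete syntax
RECORD_END = "\r"    # RE, code 13 in the reference concrete syntax


def to_records(text: str) -> str:
    """Wrap every line in RS ... RE, whatever line delimiter the OS used.

    Hypothetical helper: treats CRLF, lone CR, and lone LF as line ends.
    """
    # Match CRLF first so the pair is not counted as two line ends.
    lines = re.split(r"\r\n|\r|\n", text)
    # A trailing delimiter yields an empty final piece; it is not a record.
    if lines and lines[-1] == "":
        lines.pop()
    return "".join(RECORD_START + line + RECORD_END for line in lines)


unix_text = "line one\nline two\n"
dos_text = "line one\r\nline two\r\n"
mac_text = "line one\rline two\r"

# All three inputs produce the same record stream.
print(to_records(unix_text) == to_records(dos_text) == to_records(mac_text))
```

An application reading the output sees only RS and RE and never learns which delimiter the file used. Whether that mapping is legitimate on a system with no notion of records is the open question.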
>5. In mixed content there are some newlines that must get ignored by
>some part of the system (whether the parser or the application). For
>example, if I have
><p>
>This is a paragraph
>I don't want the newline after the <p> to result in a space at the
>beginning of the paragraph.
What do you do if you *do* want a linefeed to occur there? It's just
as easy to tell people "if you don't want a space, you must start up
hard against the tag":
<P>This is a paragraph.
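To show what the application sees in each case when the parser ignores nothing, here is a small Python sketch. It uses the standard library's xml.etree.ElementTree as a stand-in for an XML parser that passes every character through; the closing </P> tags are added only because that parser requires well-formed input.

```python
import xml.etree.ElementTree as ET

# Newline after the start tag: the parser hands it to the application.
with_newline = ET.fromstring("<P>\nThis is a paragraph.</P>")

# Text hard against the tag: nothing to ignore.
hard_against = ET.fromstring("<P>This is a paragraph.</P>")

print(repr(with_newline.text))   # '\nThis is a paragraph.'
print(repr(hard_against.text))   # 'This is a paragraph.'
```

The author controls the result directly: a newline in the source is a newline in the data, and leaving it out leaves it out.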
>6. Consider a paragraph like this:
>The SGML rules specify that the newline following the comment is
Does SGML actually say *newline* or record end?
>7. Moving the responsibility of ignoring newlines to a part of the
>system other than the XML parser is going to mean that XML documents
>and SGML documents will need different processing. For example, when
>I write a style sheet I would have to know whether I've got an SGML
>document or an XML document so that I can specify that newlines are
>ignored in the XML document in the way I want.
I do not think this is true if you have a *conformant* SGML system
that can accept and correctly use a declaration like the one proposed
for XML.
This is probably the fundamental problem: many SGML systems are not
conformant in this respect.