Re: RS/RE: basic questions

(The discussion so far has focussed on RS/RE handling, but there
is also a problem with separator characters _other_ than
record-ends in mixed vs. element content.)

Paul Prescod <papresco@calum.csclub.uwaterloo.ca> wrote:

> Joe seems to be proposing that if we
> * restrict PIs and comments to element content
> * restrict mixed-content-models to "|"

  [ actually, "restrict mixed content to OR groups
    with a REP occurrence indicator and which only contain
    primitive content tokens", or something equivalent;
    IOW, "no pernicious mixed content" ]

> * disallow inclusion exceptions

There's one more rule (which is the important one):

  * disallow separator characters in element content

This is because things like:


have different meanings depending on whether A has mixed content or
element content.  The record-end after the first </b> end-tag is
significant in the former case, and is ignored in the latter.

If A has element content, the above would have to be written like:





> then we can reduce the RS/RE handling rules to "Robert's Rules" ( =) ) of
> In data content:
>  1. If an element begins or ends with a newline [not entirely
>     accurate, but this is what people see], the newline is ignored.
>  2. Newlines inside markup are ignored.
>  3. All other newlines are passed on.

Yes, as far as I can tell.
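
For illustration, here is a rough Python sketch of the three rules
quoted above, in their simplified "what people see" form. The names
`TAG`, `is_start_tag`, `is_end_tag` and `data_chunks` are made up
for this example, and the regular expression is only a crude
stand-in for real markup recognition (it assumes no ">" inside
attribute values and treats a newline as a single "\n" character).
It is not a conforming RS/RE implementation.

```python
import re

# Crude tag recognizer: anything from "<" up to the next ">".
# Assumption for this sketch: no ">" appears inside attribute values.
TAG = re.compile(r"<[^>]*>")


def is_start_tag(tag):
    # "<name ...>" counts as a start-tag; comments and PIs do not.
    return tag is not None and tag[1:2].isalpha()


def is_end_tag(tag):
    return tag is not None and tag.startswith("</")


def data_chunks(text):
    """Return the data chunks of `text` after applying the three rules."""
    chunks = []
    pos = 0
    prev_tag = None
    # A sentinel of None stands for "no tag follows the last chunk".
    tags = [(m.start(), m.end(), m.group()) for m in TAG.finditer(text)]
    for start, end, tag in tags + [(len(text), len(text), None)]:
        data = text[pos:start]
        # Rule 1a: a newline right after a start-tag is ignored.
        if is_start_tag(prev_tag) and data.startswith("\n"):
            data = data[1:]
        # Rule 1b: a newline right before an end-tag is ignored.
        if is_end_tag(tag) and data.endswith("\n"):
            data = data[:-1]
        # Rule 2 needs no code: newlines inside markup are part of the
        # tag token matched by TAG and never reach `data`.
        # Rule 3: every other newline is passed on unchanged.
        if data:
            chunks.append(data)
        prev_tag = tag
        pos = end
    return chunks
```

With this sketch, `data_chunks("<p>\none\ntwo\n</p>")` gives
`["one\ntwo"]`: the newlines next to the tags are dropped and the
one between "one" and "two" is kept.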

Charles' proposal is similar:

  * restrict PIs and comment declarations to element content
  * disallow mixed content
  * disallow inclusion exceptions (or perhaps, disallow
    included subelements in pseudoelement content).
  * require data content to be delimited

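The chief differences are in the second and fourth rules: mixed
content is disallowed outright rather than restricted, and data
content must be delimited, where my version instead disallows
separator characters in element content.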

With these restrictions the RS/RE/separator character rules
are even simpler:

  1. Delimited separator characters are data.
  2. Undelimited separator characters are ignored.
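
A rough Python sketch of these two rules follows. The proposal as
summarized here does not say what the data delimiter is, so the
sketch simply assumes double quotes as a hypothetical delimiter;
`TOKEN` and `delimited_data` are likewise names invented for the
example, and the tag pattern is only a crude stand-in for real
markup recognition.

```python
import re

# Hypothetical syntax for this sketch only: data is delimited by
# double quotes; separator characters are space, tab, CR and LF.
TOKEN = re.compile(r'"([^"]*)"|<[^>]*>|[ \t\r\n]+')


def delimited_data(text):
    """Return the delimited data chunks found in `text`."""
    chunks = []
    pos = 0
    while pos < len(text):
        m = TOKEN.match(text, pos)
        if m is None:
            # Anything that is not a tag, delimited data or a
            # separator would be undelimited data: not allowed.
            raise ValueError(f"undelimited data at offset {pos}")
        if m.group(1) is not None:
            # Rule 1: delimited separator characters are data, so
            # the chunk is kept exactly as written.
            chunks.append(m.group(1))
        # Rule 2: undelimited separator characters (and the tags
        # themselves) contribute nothing to the data.
        pos = m.end()
    return chunks
```

For example, `delimited_data('<a>\n  "one\ntwo"\n</a>')` gives
`["one\ntwo"]`: the newline inside the delimiters survives, the
ones outside do not.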

--Joe English