Re: Element content the real issue?...

Joe English <jenglish@crl.com>

> [...] How about the following
> as a heuristic to distinguish element content from mixed content:
>     3. If the only data appearing between two tags is a sequence of
>        lexical SEPCHARs (including RS and RE), then it is deemed
>        insignificant.

<P><emph>That</emph> <strong>doesn't</strong> work.</P>
                    ^-- you lose this space.
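
To make the lost space concrete, here is a rough Python sketch that applies
rule 3 literally. The helper names are made up for illustration, and the
whitespace class (space, tab, CR, LF) is only my approximation of "lexical
SEPCHARs including RS and RE", not a real SGML parser's definition.

```python
import re

def strip_inter_tag_whitespace(markup: str) -> str:
    # Hypothetical helper: rule 3 taken literally. Any run of whitespace
    # that is the only data between two tags is thrown away.
    return re.sub(r">[ \t\r\n]+<", "><", markup)

def text_content(markup: str) -> str:
    # Crude tag remover, good enough for this one-line example only.
    return re.sub(r"<[^>]*>", "", markup)

source = "<P><emph>That</emph> <strong>doesn't</strong> work.</P>"

print(text_content(source))                              # That doesn't work.
print(text_content(strip_inter_tag_whitespace(source)))  # Thatdoesn't work.
```

The space between </emph> and <strong> is the only data between those two
tags, so the heuristic drops it and the two words run together.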

If you want to inspect the entire element to see if it contains
anything except spaces and sub-elements, you're in for a lot of
lookahead (consider <HTML> in a well-formed RFC 1866 document!).
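
A rough sketch of why the whole-element check needs lookahead, again in
Python. The tokenizer and the function name are hypothetical, and it assumes
simple, well-nested markup with explicit end tags and no empty elements. It
counts how many tokens of an element's content must be read before you know
whether the element holds anything besides white space and sub-elements.

```python
import re

# Splits markup into tags and runs of character data.
TOKEN = re.compile(r"<[^>]*>|[^<]+")

def tokens_until_decided(content: str) -> tuple:
    """Scan the content of one element (outer tags already removed).

    Returns (tokens_read, is_mixed). The scan can stop early only when
    non-whitespace data turns up directly inside the element; otherwise
    it has to run to the end before any white space can be judged.
    """
    depth = 0
    tokens = TOKEN.findall(content)
    for count, tok in enumerate(tokens, start=1):
        if tok.startswith("</"):
            depth -= 1          # leaving a sub-element
        elif tok.startswith("<"):
            depth += 1          # entering a sub-element
        elif depth == 0 and tok.strip():
            return count, True  # real data directly in this element
    return len(tokens), False

para = "<emph>That</emph> <strong>doesn't</strong> work."
doc = "<HEAD><TITLE>t</TITLE></HEAD>\n<BODY><P>x</P></BODY>"

print(tokens_until_decided(para))  # (8, True)
print(tokens_until_decided(doc))   # (11, False)
```

In the paragraph, the space is the fourth token, but nothing can be decided
about it until the eighth. For the document-level element, every token has
to be read before the newline between the two sub-elements can be classified.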

And in any case, just because my paragraph only contains individual
emphasised words does not mean that the spaces (or record ends)
are insignificant.


should be the same, right?

I don't think any white-space should be discarded by the parser.