Re: Element content the real issue?...

From: <lee@sq.com>
Date: Mon, 30 Sep 96 23:12:34 EDT
Message-Id: <9610010312.AA14806@sqrex.sq.com>
To: jenglish@crl.com, w3c-sgml-wg@w3.org
Joe English <jenglish@crl.com> wrote:

> [...] How about the following
> as a heuristic to distinguish element content from mixed content:
> 
>     3. If the only data appearing between two tags is a sequence of
>        lexical SEPCHARs (including RS and RE), then it is deemed
>        insignificant.

<P><emph>That</emph> <strong>doesn't</strong> work.</P>
                    ^-- you lose this space.
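
To make the loss concrete, here is a rough Python sketch of rule 3 as
I read it. It is not a real SGML parser: the helper names (tokenize,
apply_heuristic, text_of) and the SEPCHARS set are assumptions made up
for this illustration, and tags are recognised with a naive regular
expression.

```python
import re

# Assumed stand-in for the lexical SEPCHARs: space, tab, RS and RE.
SEPCHARS = " \t\r\n"

def tokenize(markup):
    """Split markup into tags and the data between them (hypothetical helper)."""
    return [tok for tok in re.split(r"(<[^>]*>)", markup) if tok]

def apply_heuristic(tokens):
    """Drop data that sits between two tags and consists only of SEPCHARs."""
    kept = []
    for i, tok in enumerate(tokens):
        is_data = not tok.startswith("<")
        between_tags = (
            0 < i < len(tokens) - 1
            and tokens[i - 1].startswith("<")
            and tokens[i + 1].startswith("<")
        )
        if is_data and between_tags and tok.strip(SEPCHARS) == "":
            continue  # rule 3 deems this data insignificant
        kept.append(tok)
    return kept

def text_of(tokens):
    """Concatenate the data tokens, ignoring the tags."""
    return "".join(tok for tok in tokens if not tok.startswith("<"))

sample = "<P><emph>That</emph> <strong>doesn't</strong> work.</P>"
print(text_of(tokenize(sample)))                   # That doesn't work.
print(text_of(apply_heuristic(tokenize(sample))))  # Thatdoesn't work.
```

The space before "work." survives only because it shares a data token
with non-blank characters; the one between the two inline elements is
dropped.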

If you want to inspect the entire element to see if it contains
anything except spaces and sub-elements, you're in for a lot of
lookahead (consider <HTML> in a well-formed RFC 1866 document!).

And in any case, just because my paragraph only contains individual
emphasised words does not mean that the spaces (or record ends)
are insignificant.

<P><emph>That</emph>
<strong>doesn't</strong>
work.</P>

should be the same, right?
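
A small Python sketch of what "the same" could mean, assuming the
parser passes every white-space character through and the application
decides what to do with it. The normalize step (collapsing runs of
white space to one space) is my assumption about a typical formatter,
not something the parser is required to do, and the function names are
made up for the example.

```python
import re

def text_with_whitespace(markup):
    """Remove tags but keep every character of data, including record ends."""
    return re.sub(r"<[^>]*>", "", markup)

def normalize(text):
    """Assumed application-level choice: collapse white-space runs to one space."""
    return " ".join(text.split())

one_line = "<P><emph>That</emph> <strong>doesn't</strong> work.</P>"
multi_line = "<P><emph>That</emph>\n<strong>doesn't</strong>\nwork.</P>"

print(normalize(text_with_whitespace(one_line)))    # That doesn't work.
print(normalize(text_with_whitespace(multi_line)))  # That doesn't work.
```

Both forms come out identical only because the separators reached the
application in the first place.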

I don't think any white-space should be discarded by the parser.

Lee
Received on Monday, 30 September 1996 23:13:38 EDT
