- From: <lee@sq.com>
- Date: Mon, 30 Sep 96 23:12:34 EDT
- To: jenglish@crl.com, w3c-sgml-wg@w3.org
Joe English <jenglish@crl.com> wrote:
> [...] How about the following
> as a heuristic to distinguish element content from mixed content:
>
> 3. If the only data appearing between two tags is a sequence of
> lexical SEPCHARs (including RS and RE), then it is deemed
> insignificant.
<P><emph>That</emph> <strong>doesn't</strong> work.</P>
                    ^-- you lose this space.
If you want to inspect the entire element to see if it contains
anything except spaces and sub-elements, you're in for a lot of
lookahead (consider <HTML> in a well-formed RFC 1866 document!).
And in any case, just because my paragraph only contains individual
emphasised words does not mean that the spaces (or record ends)
are insignificant.
<P><emph>That</emph>
<strong>doesn't</strong>
work.</P>
should be the same, right?
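
To make the lost space concrete, here is a rough Python sketch. It is
not an SGML parser: the regex tokenizer, the text_content() helper and
its drop_whitespace_only flag are made up for illustration, and Python's
str.strip() only approximates "lexical SEPCHARs (including RS and RE)".

```python
import re

# Crude tokenizer: a token is either a tag or a run of character data.
# It ignores comments, entities, and '>' inside attribute values.
TOKEN = re.compile(r"<[^>]+>|[^<]+")


def text_content(markup, drop_whitespace_only=False):
    """Concatenate the character data of `markup`.

    With drop_whitespace_only=True, apply the proposed heuristic 3:
    data between two tags that is only whitespace is discarded.
    """
    tokens = TOKEN.findall(markup)
    out = []
    for i, tok in enumerate(tokens):
        if tok.startswith("<"):
            continue  # tags contribute no character data
        between_tags = (
            0 < i < len(tokens) - 1
            and tokens[i - 1].startswith("<")
            and tokens[i + 1].startswith("<")
        )
        if drop_whitespace_only and between_tags and tok.strip() == "":
            continue  # heuristic 3: deemed insignificant
        out.append(tok)
    return "".join(out)


one_line = "<P><emph>That</emph> <strong>doesn't</strong> work.</P>"
multi_line = "<P><emph>That</emph>\n<strong>doesn't</strong>\nwork.</P>"

print(repr(text_content(one_line)))
print(repr(text_content(one_line, drop_whitespace_only=True)))
print(repr(text_content(multi_line)))
print(repr(text_content(multi_line, drop_whitespace_only=True)))
```

With the heuristic switched on, "That" and "doesn't" run together in
both forms, whether the separator was a space or a record end.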
I don't think any white-space should be discarded by the parser.
Lee
Received on Monday, 30 September 1996 23:13:38 UTC