Re: Mixed content

On Sat, 14 Sep 1996 10:57:46 -0700, Tim Bray <tbray@textuality.com> wrote:

>At 09:47 AM 9/14/96 GMT, Charles F. Goldfarb wrote:
>>James Clark .. has proposed to solve this (and other RE related
>>problems) by abolishing mixed content for XML. I think that is an excellent
>If we finesse the RS/RE using the Durand/Nicol trick (which seems like
>a pretty trouble-free win), then (and this is a serious question) are
>there remaining big problems with mixed content?  Because if not, I'd hate
>to lose it... it is certainly natural to use in declaring, e.g., paragraphs
>with references or images or <a href="http://www.w3.org/">hyperlinks</a>.
Your "mixed content" paragraph is easily represented by using explicit
sub-paragraph elements (or, as HyTime calls them, "pseudo-elements"), viz:

<!element p - - (pe | a)+>
<!element  (pe | a) - - (#pcdata)>

<p><pe>Now is the </pe>
<a href="http://www.w3.org/">time</a>
<pe> for all ...</pe></p>

I understand how RS and RE characters can be excluded from XML. What I don't
understand is whether, under those circumstances the following two cases will
cause identical data to be passed to the application.  If not, what data
character codes are passed to the application in each case?

Case 1:

Text of p.

Case 2:

<p>Text of p.</p>

With the "no mixed content" rule, RS and RE could be confined to element content
(that is, between elements) where they would always be ignored; they would never
occur in data. 
Charles F. Goldfarb * Information Management Consulting * +1(408)867-5553
           13075 Paramount Drive * Saratoga CA 95070 * USA
  International Standards Editor * ISO 8879 SGML * ISO/IEC 10744 HyTime
 Prentice-Hall Series Editor * CFG Series on Open Information Management

Follow-Ups: References: