Re: RS/RE, again (sorry) from Gavin Nicol on 1996-12-12 (w3c-sgml-wg@w3.org from December 1996)

From: Gavin Nicol <gtn@ebt.com>
Date: Wed, 11 Dec 1996 23:21:55 -0500
To: elm@arbortext.com
CC: lee@sq.com, w3c-sgml-wg@w3.org, tbray@textuality.com
Message-Id: <199612120421.XAA02778@nathaniel.ebt>

>There's one complication to this.  If there's no DTD (quite possible),
>AND no way in the instance (such as with an attribute value) to indicate 
>which is element content and which is mixed content, AND no stylesheet 
>semantic that allows this to be controlled, how can the application layer
>decide?  I'm guessing that DSSSL expects a document to have gone through 
>a classic SGML parser first, which means that element content will have 
>internal whitespace stripped.  XML documents might not undergo any such
>treatment.  Somehow, the element-vs.-mixed distinction needs to be made 
>in cases where it's critical for rendering or other processing.  Where is
>the right place to make the distinction, if there's no DTD?  Is is the
>stylesheet?  If so, are there any technical barriers to doing this?

I'd like to hear from people the cases that they can think of
where the distinction is truly necessary.

In cases where it really *is* necessary, I see no problem at all in
having/requiring a "validation" layer that *requires* a DTD so that
element and mixed content can be distinguished. This layer would
validate the post-parse representation using whatever rules need be
applied (one might be to use SGML's "normal" rules).

I cannot see why a *parser* for an essentially trivial syntax need be
bothered by this.

Conceptual model:

   input-->parser-->events/grove------------------+-->application
                      \                          /
                       +-->filter-->events/grove+

where "filter" can perform some rules-driven (ie. DTD, DSSSL
spec. etc.) transformation of the parser output.

Received on Wednesday, 11 December 1996 23:23:15 UTC