Re: RS/RE, again (sorry) from Christopher R. Maden on 1996-12-16 (w3c-sgml-wg@w3.org from December 1996)

From: Christopher R. Maden <crm@ebt.com>
Date: Mon, 16 Dec 1996 22:15:20 GMT
To: w3c-sgml-wg@w3.org
Message-Id: <199612162215.WAA19727@phaser.EBT.COM>

In attempting to draft an argument to Paul Grosso, I've run into a
quandary.

1) In the absence of a DTD, we must assume mixed content for
   everything.[*]

2) This could create whitespace nodes in element content.

3) A dichotomy between "DTD-ful" and DTD-less parsing will make any
   sibling-based relationship unreliable at best, since the same
   element can sit at a different sibling position depending on
   whether a DTD was read; this will affect some TEI- or HyQ-based
   hyperlinks, as well as sibling-based stylistic decisions.

4) The only way to avoid the dichotomy is to preserve these whitespace
   nodes even when a DTD is present.

5) Since SEPCHAR is thrown away in element content, every element
   would have to be made mixed content, and any element declaration
   without #PCDATA would be illegal.

This is clearly unacceptable.  Once the addressing issues are
considered, I don't think that either RE delenda est or Charles
Goldfarb's shortref hack cuts it - Paul Prescod's suggestions of
explicit mixed content delimiters or elimination of mixed content
whitespace seem to be the only workable suggestions.  They're icky,
but I don't see another way.

-Chris

[*] There are proposals for heuristics to determine the difference,
    but I can think of a failure condition for any of the ones I've
    seen so far.
-- 
<!NOTATION SGML.Geek PUBLIC "-//GCA//NOTATION SGML Geek//EN">
<!ENTITY crism PUBLIC "-//EBT//NONSGML Christopher R. Maden//EN" SYSTEM
"<URL>http://www.ebt.com <TEL>+1.401.421.9550 <FAX>+1.401.521.2030
<USMAIL>One Richmond Square, Providence, RI 02906 USA" NDATA SGML.Geek>

Received on Monday, 16 December 1996 17:26:45 UTC