- From: Paul Prescod <papresco@calum.csclub.uwaterloo.ca>
- Date: Tue, 17 Dec 1996 16:45:28 -0500
- To: w3c-sgml-wg@w3.org
I think that the following proposal is a compromise, but a reasonable one. It is a blend of existing proposals. It puts the ultimate whitespace responsibility in the author's court without forcing them to use huge, verbose attributes, or extra markup in the "common cases" of mixed content and element-content with-no-whitespace. --- I propose that we make significant whitespace the default, as per the Grosso, Delenda Est, etc. proposals, and provide a *simple*, *concise* mechanism for indicating that insignificant whitespace is being inserted (a variation on Paoli). This is different from my previous proposal where I required the curly braces to follow the content model of the DTD. Example: <ULIST>{ <UITEM><P>Foo</P></UITEM> <UITEM><P>Bar</P></UITEM> <UITEM><P>Baz</P></UITEM> }</ULIST> A non-validating XML parser would have to strip out the whitespace before handing the data to an application. Note that the UITEMs can have either mixed or element content, so the tag delimiters are not tied to the content models of the DTD. If you try to introduce whitespace in an element-content node, a validator should tell you that you must change the instance to either remove the whitespace or hide it with curly braces. Tools to validate adherence to my new proposal could not be written on top of SGML parsers unless the SGML parser returns "SSEP" nodes (most current ones probably do not). But they could be written. Parsers also have to be changed to check the stupid-net-trick for empty tags. --- The following special case can be removed if people do not think that changing from element-content to mixed-content should require special support. <SPECIAL-CASE> As David (I think) pointed out earlier, sometimes you want to change from element content to mixed content. I rebutted that this would change the meaning of your document and should not be done lightly (in either existing SGML, or XML as I proposed it). In my new proposal, both concerns are addressed: Although authors would never do this (nor need to know about it), it would be legal to put the curly braces around mixed content elements *as long as that content had no non-whitespace data characters*. Both kinds of XML parsers would remove these whitespace nodes. This is extra work for SGML-based parsers but not for new XML-based parsers. This is a concession to make it easier for people who change DTDs, not for authors. The point of this is that shifting from element content to mixed content isn't a problem until you want to *modify* the element's content (to add non-whitespace data). If you forget to change the "whitespace mode indicator" in the instance, you will get a well-formedness error: "Element using curly braces may not have data content." This will happen *very rarely* and is certainly no harder to figure out than the average SGML error message. It is obviously legal to use the traditional angle brackets around element-content elements if they contain no whitespace, and illegal if they do not. (as described above) </SPECIAL-CASE> Paul Prescod
Received on Tuesday, 17 December 1996 16:42:44 UTC