- From: Murray Altheim <murray@spyglass.com>
- Date: Tue, 17 Dec 1996 17:24:15 -0400
- To: Tim Bray <tbray@textuality.com>
- Cc: w3c-sgml-wg@w3.org
Tim Bray <tbray@textuality.com> writes:
[...]
>3. All non-markup bytes are significant, whitespace or not (Durand)
>
>Pro: Everyone can understand the rules, it's easy to implement
      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>Con: You lose certain Hytime addressing facilities, and the application
>     gets no help from the XML processor in ignoring WS that to the user
>     is "obviously" irrelevant.
>
>4. Use a mechanism *in the instance* to signal a DTD-less application
>   what's going on.
>
>  4.1 The PI-based DTD summary (Sperberg-McQueen)
>  4.2 Explicit quoting of significant character data (Goldfarb)
>  4.3 -XML-SPACE
[...]
>Face facts, folk. There is just not a solution that is going to solve
>this problem and be free of some cost. And please remember the cost
>of explanation and education is very real. For myself, my preference
>would be (in descending order) #3, #4.3.3 (a3/b2), 4.3.2, 4.3.1.

I've on several occasions written up responses to this thread, only to
be pre-answered and re-confused by further discussions/questions.

When you talk of the cost of explanation and education, you must also
mention the cost of messy solutions (from an end-user perspective). I
would hope we would avoid messy solutions AT ALL COSTS. The public view
of XML will have a lot to do with how many messy solutions we create in
the specification. Every instance of things like

  <?XML-SPACE-PROGRESSIVE-SHIFT VARIANCE::8879-97//LPN SPIN-RIGHT="XML"/>

as a learning requirement on document authors is going to hurt the
perception of XML as a solid specification, free of hacks, and
relatively simple to learn and use. I don't even like the opening PI
much, to be frank.

If option #3 only causes problems with 'certain HyTime addressing
facilities', then it seems that someone should concentrate on coming up
with a solution for HyTime, not XML.
I wouldn't assume too many users coming up from HTML or learning XML as
an 'onramp to SGML' are going to be impacted much by difficulties with
HyTime addressing facilities.

If Jon is willing to admit to ignorance, I'll surely admit to plenty o'
ignorance. I don't even know what a HyTime addressing facility is, and
I would hope I wouldn't need to know one from a Semantic Specific
Result Instance, a Conceptual Output Instance, a Generic Language
Translation Process Specification, a Formatting Output Specification
Instance, or any other string longer than thirty characters. You won't
find XML on any shelves if so. If someone needs this level of
complexity, then, as we mention in the XML Q&A sheet, don't use XML,
use full-bore SGML.

Based on my abhorrence of custom strings and excessive complexity,
isn't there some way (if the cons of #3 are accurately stated) that we
simply use a few simple rules:

>3. All non-markup bytes are significant, whitespace or not (Durand)

This is true only for the 'pGrove' (à la Kimber below)? In processing
the pGrove, we make some assumptions about authors' intentions. Further
processing is based on these assumptions:

  a. whitespace within an element is significant to that element*
  b. whitespace between elements is not significant
  c. whitespace after a start tag is eliminated (i.e., not significant)
  d. whitespace preceding the end tag is normalized to a single space

*Associated with item (a) is the fact that, sans DTD, no document
author can expect all whitespace to be significant in content; e.g.,
one couldn't expect an XML UA to know about <PRE> in HTXML without
either a DTD or a stylesheet to provide that information, preferably a
stylesheet.

Then, maybe we could require SDAFORM CDATA #FIXED "Lit" on all element
content with significant whitespace and be done with it.
:-)

I must reinforce Jon's assertion that when discussing child nodes of a
parse tree, most of us ignorant folks aren't going to be thinking of a
linefeed as the third element of an ancestor.

I'm with you, Tim, on #3 as a first choice. The other solutions seem to
clutter up the requirements on what is intended to be a simple spec.

Murray

```````````````````````````````````````````````````````````````````````````````
Murray Altheim, Program Manager
Spyglass, Inc., Cambridge, Massachusetts
email: <mailto:murray@spyglass.com>
http:  <http://www.cm.spyglass.com/murray/murray.html>
       "Give a monkey the tools and he'll eventually build a typewriter."
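
For illustration only, rules (a) through (d) in the message above could be sketched roughly as follows in Python. This is one possible reading of those rules, not anything the working group specified. The function name `apply_whitespace_rules`, the helper names, and the regex-based tag splitting are all assumptions made for the example. In particular, "whitespace between elements" is taken to mean any whitespace-only run sitting between two tags, and the splitter ignores comments, CDATA sections, and attribute values containing ">".

```python
import re

# Hypothetical tokenizer: split the input into alternating text and tags.
# A real XML processor would need a proper parser; this is only a sketch.
TAG = re.compile(r"(<[^>]+>)")


def is_start_tag(tag: str) -> bool:
    """True for <name ...>, false for end tags, PIs, declarations, <x/>."""
    return not tag.startswith(("</", "<?", "<!")) and not tag.endswith("/>")


def is_end_tag(tag: str) -> bool:
    return tag.startswith("</")


def apply_whitespace_rules(xml_text: str) -> str:
    # With one capturing group, re.split yields [text, tag, text, tag, ...]:
    # even indexes are character data, odd indexes are tags.
    parts = TAG.split(xml_text)
    out = []
    for i, part in enumerate(parts):
        if i % 2 == 1:          # markup: pass through untouched
            out.append(part)
            continue
        if part.strip() == "":  # rule (b): whitespace-only runs are dropped
            continue
        prev_tag = parts[i - 1] if i > 0 else None
        next_tag = parts[i + 1] if i + 1 < len(parts) else None
        text = part
        if prev_tag is not None and is_start_tag(prev_tag):
            text = text.lstrip()            # rule (c): drop WS after start tag
        if next_tag is not None and is_end_tag(next_tag):
            trimmed = text.rstrip()
            if trimmed != text:
                text = trimmed + " "        # rule (d): one space before end tag
        out.append(text)                    # rule (a): interior WS kept as-is
    return "".join(out)


sample = "<doc>\n  <p>  Hello   world  </p>\n  <p>two</p>\n</doc>"
result = apply_whitespace_rules(sample)
print(result)
```

Under this reading, the run of spaces inside "Hello   world" survives (rule a), the indentation between elements disappears (rule b), and only the edges next to the start and end tags are touched (rules c and d). Anything like <PRE> would still need a DTD or stylesheet to say its whitespace matters, as the footnote to item (a) points out.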
Received on Tuesday, 17 December 1996 17:21:23 UTC