- From: Kent Pitman <kmp@harlequin.com>
- Date: Sat, 25 Apr 98 13:13:52 EDT
- To: tbray@textuality.com
- Cc: xml-editor@w3.org, kmp@harlequin.com
> Date: Sat, 25 Apr 1998 08:12:27 -0700
> From: Tim Bray <tbray@textuality.com>
>
> At 11:43 AM 4/24/98 EDT, Kent M Pitman wrote:
> >The introductory text in section 4, Physical Structures, is very
> >confusing. It uses a meaning for "parsed" which is alien to any
> >meaning of "parsed" that I am familiar with.
>
> Well, I must say that I'm impressed at the intensity you've been putting into reading the XML spec. I am sorry that you find it so disappointing.

Well, I emphasize that one of the disappointments was that it came so close to repairing a problem that had bugged me so much about SGML. You kind of set up the expectation by showing that it was within grasp of fixing... so take at least a little of my comment as a compliment in that it's obvious you have some people who were working hard on simplification. I just wanted to weigh in strongly on the idea that certain simplifications matter in very material ways.

> I'll try to find the time in the near future to address the points you raise, but before that, a couple of meta-points are in order:

No problem. I'm happy to see responses but I'm not blocked in any way by any failure on your part to respond; mostly I'm just happy to know they're filed where they can be discussed by your group and action taken if/when it is ever appropriate to take action again. I just wanted to get my thoughts into the pipeline while I was thinking about them. I had done a careful compare of the old draft I had and the new document and that seemed the appropriate time...

> First, XML 1.0 is effectively frozen now and will not be changing. Yes, there are shortcomings, but at some point we had to draw the line and ship, so on Feb. 10th, we did.
> Second, others, who find the spec less unsatisfying than you do, have charged in and implemented a wide variety of parsers and tools in a variety of languages; so far, they seem to offer very high interoperability (it helps having James Clark in the field) - thus for most developers, details of syntax can generally be ignored and outsourced to the XML processor authors.
>
> Having said that, all your input has gone in my "errata" file and will be considered carefully when, if ever, we do another revision of the XML spec.

Yeah, I'm Project Editor for J13 (formerly X3J13), the committee that produced ANSI Common Lisp. I'm familiar with the problems of the standards cycle and have just such a file myself for exactly the same reason. My comments will keep.

> Finally, as to your specific point regarding "parsed" and "unparsed" - the committee kicked around lots of options. In earlier drafts we had used "text" and "binary" but that was unsatisfactory since "binary" might in fact be text. In fact, the only distinguishing characteristic of "binary" entities (what SGML calls "data" entities) is that they are not read and parsed by the XML processor. So the correct label should be "NotToBeReadByTheXMLProcessor", for which "unparsed" seemed to us an acceptable contraction. Then for symmetry, the other kind is called "parsed". I agree with you that there are other usages of the word "parsed", but I do feel that our usage is legitimate and unsurprising.

I guess the essence of my point was really that nowhere in the discussion of "parsed" does it SAY that "parsed" means "by XML". Honestly, I had to read this section way too many times before I figured out what it meant, and some extra verbiage would have helped because the concepts I finally figured out it was offering me were not as complex as I had feared. Even just a single sentence that says ``Some documents are intended to be parsed by XML; we'll call those "parsed".'' would help a lot.
I don't care what formal terms you create--if you see the Common Lisp HyperSpec(TM), at http://www.harlequin.com/books/HyperSpec/FrontMatter/ you'll notice I have a glossary of about 70 printed pages of English terms that I hijacked for my own use in describing ANSI Common Lisp. But I do offer definitions so that people don't get confused between the common and formal meanings. I think it's the absence of the phrase "not to be read by the XML processor" for "unparsed", etc. that got me.

> BTW, can we infer from your close attention that Harlequin is going to do something interesting with XML?

I can't speak for the company about what the company will do. I can observe that we do a heavy amount of business in digital printing and publishing--we make a high-end PostScript RIP that supports many major publishers in publishing document content. At this point we're "tracking XML seriously". Whether we make any products out of it will, I imagine, depend on customer demand. My sense is that the industry is warming to XML, but also that it's too early to tell for sure. And as I'm `just' a technology developer, I don't have a say in what specific products we release. However, I think it's safe to say that if our customers ask for it, we'll support it.
Received on Saturday, 25 April 1998 13:14:42 UTC