- From: Matthew Fuchs <matt@wdi.disney.com>
- Date: Fri, 9 May 97 11:29:46 PDT
- To: w3c-sgml-wg@w3.org
Jon Bosak says:

> The processor is required to behave in the manner we've described only
> as long as it's calling itself an XML processor.  This is an
> advertising issue.  In the scenario Bill describes, the processor sees
> the broken message, and this being what he calls a "mission critical"
> application, it has been programmed to respond to such a message by
> saying to itself, "This is broken, so it must not be an XML message,
> and therefore I'm free to stop being an XML processor and to do what I
> need to do, which is to recover the message the best I can and get it
> to the receiver in the best shape that I can manage."  So the parser
> has to take off its figurative "XML approved" hat for a minute to save
> your life.  Big deal.

I think it may be worthwhile to extend the "XML approved" hat, otherwise you're saying to the XML vendors "As soon as something goes wrong you're free to behave in any slovenly manner you want." This would make it hard to write an app that would work on both MS and NS browsers unless the Net never fails.

There is an important distinction between ill-formed documents (which we want to discourage), on the one hand, and garbled or fragmentary documents (which we want to help) on the other. Bill Smith's concerns are certainly with the latter. I also wonder if "push" technology and the results of the Document Object Model group won't make the notion of a full document more and more tenuous.

Error recovery is always enabled by embedding redundant information. For example, I can create a WF doc which can recover from lost tags by sending tags with a format as follows:

    <tagname-tagid-depth-startpos>...</tagname-tagid-depth-endpos>

where tagid is incremented with each new tag, depth is depth in the tree, startpos and endpos are the current positions in the document.
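To make the idea concrete, here is a rough Python sketch of how a receiver might use the redundant tag names to work out which tags were lost. Everything in it is an assumption for illustration: the names TAG_RE, Tag, scan_tags and report_losses are hypothetical, tagids are assumed to start at 1, element names are assumed to contain no hyphens, and the positions in the sample are made up rather than measured. Because the start and end names differ in their last field, a stock XML parser would not pair them, so the sketch scans with a regular expression instead.

```python
import re
from typing import NamedTuple

# Hypothetical pattern for <name-tagid-depth-pos> and </name-tagid-depth-pos>.
# Assumes element names themselves contain no hyphens.
TAG_RE = re.compile(r"<(/?)([A-Za-z_][\w.]*)-(\d+)-(\d+)-(\d+)>")


class Tag(NamedTuple):
    closing: bool
    name: str
    tagid: int
    depth: int
    pos: int


def scan_tags(text):
    """Return every annotated tag that survived transmission, in order."""
    return [
        Tag(m.group(1) == "/", m.group(2),
            int(m.group(3)), int(m.group(4)), int(m.group(5)))
        for m in TAG_RE.finditer(text)
    ]


def report_losses(tags):
    """Classify tagids by which half of the start/end pair is missing.

    Assumes tagids were assigned 1, 2, 3, ... in order of start tags,
    so a gap in the numbering means both halves of a pair were lost.
    """
    opened = {t.tagid for t in tags if not t.closing}
    closed = {t.tagid for t in tags if t.closing}
    seen = opened | closed
    highest = max(seen, default=0)
    return {
        "missing_start": sorted(closed - opened),
        "missing_end": sorted(opened - closed),
        "missing_both": sorted(set(range(1, highest + 1)) - seen),
    }


# The end tag for tagid 3 has been dropped in transit.
damaged = "<doc-1-0-0><p-2-1-11>hello</p-2-1-27><p-3-1-37>world</doc-1-0-60>"
print(report_losses(scan_tags(damaged)))
# {'missing_start': [], 'missing_end': [3], 'missing_both': []}
```

The depth and position fields are parsed but not used here; a fuller recovery step could use them to decide where a missing tag should be reinserted.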
Of course, this document would have the unusual characteristic of being well formed but invalid (unless the DTD is far larger than the document), so the processor would need to understand the embedded information and strip it out. The processor (and possibly the app) would also know what is missing. (I may not have chosen the best format for embedding this information, but the point is that it can be done.)

On the Draconian side, it is pretty obvious that it takes less extra info to recover a doc that started out WF. I would agree that only WF docs should be _transmitted_, which I get the impression is what the vendors really want. On the Tolerant side, this shows there are models which will allow applications to do error recovery without opening the door to tag soup.

Finally, I think this supports Henry Thompson's "Radical Simplification" suggestion. We can build in decent error correction if we handle it separately from the language itself.

Matthew Fuchs
matt@wdi.disney.com
http://cs.nyu.edu/phd_students/fuchs
Received on Friday, 9 May 1997 14:28:03 UTC