- From: Larry Masinter <LMM@acm.org>
- Date: Wed, 11 Dec 2002 15:23:30 -0800
- To: <noah_mendelsohn@us.ibm.com>, "'Scott Lawrence'" <lawrence@world.std.com>
- Cc: <ietf-xml-use@imc.org>, <www-tag@w3.org>
The discussion makes it sound like the considerations are as much "code footprint", "reliability", "simplicity" as they are "performance". It's good to be clear about the requirements, partly to decide if they are appropriately satisfied by the proposed solution.

In addition, the motivations for "Binary XML" seem to come from the same kinds of high-performance or embedded applications.

Scott Lawrence wrote:

> Supporting entity substitutions other than the required minimum
> would have had a fairly large effect on code size and complexity.
> The largest and most troublesome effect was on the buffer management
> - the minimum required entities are all larger than the text that
> they turn into internally, so they just collapse the data within the
> existing buffer(s), but that's not true in the general case.

Noah Mendelsohn replied:

> Thanks Scott. Turns out, this was among the optimizations we at IBM
> had noticed, and was among the ones I had in mind when preparing
> input to the XMLP workgroup response. So, that's at least two
> independent organizations doing implementations with similar
> insights and intuitions regarding the tradeoffs involved in
> supporting entities.
>
> BTW: several have asked whether there would have been a cost to
> allowing entities in the case where the instance did not in fact use
> entities. Well, as you say, there's often a cost in code footprint,
> unless you have a way of acquiring the code dynamically. Unless
> you're very careful, there's also potentially a cost in terms of
> levels of indirection to the various potentially discontiguous
> buffers, unless you're willing to build two versions of your code
> and switch to the "no buffer management" version when you discover
> that there are indeed no entity definitions. That also involves
> more testing cost for the alternate paths, etc.
>
> Regarding those who have asked for specific performance numbers, I
> can't say that we in IBM have built controlled implementations, one
> with and one without just the internal subset optimizations. As I
> said in my earlier note, one tends to make combinations of
> optimizations together. Our experience is that in combination it is
> possible to use such techniques to get very significant improvements
> over what would be typical of full function parsers.
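
To make the buffer-management point concrete, here is a minimal sketch of the "collapse in place" behaviour Scott describes. It is not taken from either implementation mentioned in this thread; the names `PREDEFINED` and `collapse_predefined` are hypothetical, and it assumes the only entities to handle are the five predefined by XML 1.0 (numeric character references are left out). Because every one of those references is longer than its replacement, the write position can never overtake the read position, so no second buffer is needed.

```python
# Hypothetical illustration only: names and structure are assumptions,
# not code from any parser discussed in this thread.
PREDEFINED = {
    b"&lt;": b"<",
    b"&gt;": b">",
    b"&amp;": b"&",
    b"&apos;": b"'",
    b"&quot;": b'"',
}


def collapse_predefined(buf: bytearray) -> int:
    """Decode the five predefined entities in place.

    Returns the new logical length; bytes past that index are stale.
    """
    read = write = 0
    n = len(buf)
    while read < n:
        if buf[read] == ord("&"):
            for name, char in PREDEFINED.items():
                if buf.startswith(name, read):
                    # Replacement is one byte, reference is 4-6 bytes,
                    # so write stays at or behind read.
                    buf[write] = char[0]
                    write += 1
                    read += len(name)
                    break
            else:
                # Not a predefined entity: copy the '&' through unchanged.
                buf[write] = buf[read]
                write += 1
                read += 1
        else:
            buf[write] = buf[read]
            write += 1
            read += 1
    return write


buf = bytearray(b"a &lt; b &amp;&amp; c &gt; d")
length = collapse_predefined(buf)
print(bytes(buf[:length]))
```

A general entity declared in the internal subset can expand to text longer than its reference, which breaks the invariant above and forces the parser to allocate or chain additional buffers. That is the "general case" the quoted text refers to.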
Received on Wednesday, 11 December 2002 18:24:02 UTC