- From: <noah_mendelsohn@us.ibm.com>
- Date: Wed, 11 Dec 2002 14:19:35 -0500
- To: Scott Lawrence <lawrence@world.std.com>
- Cc: ietf-xml-use@imc.org, "Larry Masinter" <LMM@acm.org>, www-tag@w3.org
Scott Lawrence writes:

>> Supporting entity substitutions other than the required minimum would
>> have had a fairly large effect on code size and complexity. The
>> largest and most troublesome effect was on the buffer management - the
>> minimum required entities are all larger than the text that they turn
>> into internally, so they just collapse the data within the existing
>> buffer(s), but that's not true in the general case.

Thanks Scott. It turns out that this was among the optimizations we at IBM had noticed, and it was among the ones I had in mind when preparing input to the XMLP workgroup response. So, that's at least two independent organizations doing implementations with similar insights and intuitions regarding the tradeoffs involved in supporting entities.

BTW: several people have asked whether allowing entities would have carried a cost in the case where the instance did not in fact use entities. Well, as you say, there's often a cost in code footprint, unless you have a way of acquiring the code dynamically. There is also potentially a cost in levels of indirection to the various, possibly discontiguous, buffers. You can avoid that only by being very careful, or by building two versions of your code and switching to the "no buffer management" version when you discover that there are indeed no entity definitions. That in turn involves more testing cost for the alternate paths, etc.

Regarding those who have asked for specific performance numbers, I can't say that we at IBM have built controlled implementations, one with and one without just the internal subset optimizations. As I said in my earlier note, one tends to make combinations of optimizations together. Our experience is that, in combination, such techniques can yield very significant improvements over what would be typical of full-function parsers.
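
To make the buffer point concrete, here is a minimal sketch of the in-place collapse Scott describes. It is not code from either implementation: the function name and table are hypothetical, and it assumes a mutable byte buffer containing only the five predefined references (numeric character references and error reporting are left out). Because each reference is longer than the character it stands for, the write position can never overtake the read position, so no second buffer is needed.

```python
# Hypothetical illustration only; names are not taken from any real parser.
PREDEFINED = {
    b"&lt;": b"<",
    b"&gt;": b">",
    b"&amp;": b"&",
    b"&apos;": b"'",
    b"&quot;": b'"',
}


def collapse_predefined_entities(buf: bytearray) -> int:
    """Replace predefined entity references in place; return the new length."""
    read = 0   # next byte to examine
    write = 0  # next byte to fill; always <= read
    n = len(buf)
    while read < n:
        if buf[read] == 0x26:  # '&'
            for ref, ch in PREDEFINED.items():
                if buf.startswith(ref, read):
                    # The replacement is one byte and the reference is
                    # at least four, so the data only ever shrinks.
                    buf[write] = ch[0]
                    write += 1
                    read += len(ref)
                    break
            else:
                # Assumption: anything unrecognized is copied through
                # unchanged; a real parser would raise an error here.
                buf[write] = buf[read]
                write += 1
                read += 1
        else:
            buf[write] = buf[read]
            write += 1
            read += 1
    del buf[write:]  # drop the now-unused tail
    return write
```

With a user-defined entity whose replacement text is longer than its reference, the write position would run ahead of the read position, and this single-buffer approach would no longer work; that is the general case that forces the extra buffer management.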
------------------------------------------------------------------
Noah Mendelsohn                              Voice: 1-617-693-4036
IBM Corporation                                Fax: 1-617-693-8676
One Rogers Street
Cambridge, MA 02142
------------------------------------------------------------------
Received on Wednesday, 11 December 2002 14:21:20 UTC