Re: XML-*

>> It seems to me that supporting the possibility of 
>> managing entities should have little impact on 
>> performance for documents that don't have
>> any.

Not true, in my experience.  Depending on one's implementation strategies, 
the very fact that things can move, expand, and in the case of external 
entities be in different encodings all can potentially complicate an 
implementation.  It's not unusual for such factors to lead to an 
assumption that data will be copied, something which may be hard to avoid 
in the no-entity special case.  Buffering strategies can also lead to 
multiple levels of indirection in, say, validation logic if 
structures derived from the various entities wind up unnecessarily 
discontiguous.   An implementation would have to be quite careful to avoid 
such levels of indirection in the special case of no entities, if it 
otherwise relied on such indirections.  All these things make it harder to 
do optimizations like:  gee, this message is only 16K, I can fit it in a 
contiguous buffer and wail away (because you always have to worry that you 
might get some of those billion laughs when you start looking into the 
message).
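
To make the "billion laughs" worry concrete, here is a minimal Python 
sketch of why a small message cannot be assumed to stay small once 
entities are allowed.  It computes the fully expanded length of a nested 
entity without actually building the expansion.  The names 
(expanded_length, billion_laughs, the lolN entities) are hypothetical and 
for illustration only, and the sketch assumes every referenced entity is 
defined in the dictionary and that definitions are not recursive (which 
XML forbids anyway).  It is not how any particular parser does it.

```python
import re
from functools import lru_cache

# Matches a general entity reference such as "&lol3;" (simplified name syntax).
ENTITY_REF = re.compile(r"&([A-Za-z_][\w.-]*);")


def expanded_length(definitions, name):
    """Return the character count of entity `name` after full expansion.

    Only lengths are computed; the expanded text is never materialized.
    Assumes all references resolve within `definitions` and that there
    are no cycles.
    """

    @lru_cache(maxsize=None)
    def length(entity):
        text = definitions[entity]
        # Literal characters are whatever remains once references are removed.
        total = len(ENTITY_REF.sub("", text))
        # Each reference contributes the expanded length of its target.
        for ref in ENTITY_REF.findall(text):
            total += length(ref)
        return total

    return length(name)


def billion_laughs(levels=9, fanout=10):
    """Build the classic nested definitions: lolN refers to lol(N-1) `fanout` times."""
    definitions = {"lol0": "lol"}
    for i in range(1, levels + 1):
        definitions[f"lol{i}"] = f"&lol{i - 1};" * fanout
    return definitions


defs = billion_laughs()
source_chars = sum(len(text) for text in defs.values())
expanded_chars = expanded_length(defs, "lol9")
print(source_chars, expanded_chars)
```

The definitions themselves fit comfortably in a 16K buffer, while a 
single reference to the outermost entity expands to billions of 
characters, which is why the "fit it in one contiguous buffer" shortcut 
cannot be taken on faith once entities are in play.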

I'm not claiming that there are necessarily factors of, say, 2x in all 
this.  I'm saying that the harder you work on performance, the more these 
things start to matter.  When you're in the regimes that interest me you 
are already paying some real costs for handling of namespaces, Unicode 
conversions, etc.  Adding entities to the mix is a further complication, 
and for systems like SOAP I don't think that on balance it pays.

------------------------------------------------------------------
Noah Mendelsohn                              Voice: 1-617-693-4036
IBM Corporation                                Fax: 1-617-693-8676
One Rogers Street
Cambridge, MA 02142
------------------------------------------------------------------

Received on Friday, 6 December 2002 13:06:31 UTC