W3C home > Mailing lists > Public > www-dom@w3.org > July to September 2001

Re: Memory Overhead

From: Joseph Kesselman <keshlam@us.ibm.com>
Date: Mon, 10 Sep 2001 11:39:13 -0400
To: www-dom@w3.org
Message-ID: <OFA3319866.F7103AE6-ON85256AC3.0054ED7E@pok.ibm.com>

>saying that a DOM heirarchy might take up to 4 times as much memory than
>the file it was loaded from, depending on the implementation.

The last phrase is absolutely essential. That "four times" is someone
talking about a particular implementation, and is _not_ a reasonable rule
of thumb. I'd say 1:1 was closer to average for DOMs coded by someone who
knows what they're doing.

The DOM API, per se, only specifies how the data is accessed; it says
_nothing_ about how the data is stored, and implementations are free to
trade off performance, code size, and storage size to suit their users'
needs. A DOM may take more space than the XML file.... or may take less
space... depending on how it was written and the characteristics of the
individual file.

To take one obvious example, a DOM can save a considerable amount of space
by maintaining only a single copy of each unique element or attribute name
string. Since these are repeated many times in a typical mid-to-large-size
document, that can add up to significant savings. There are other, less
obvious, steps that can be taken to reduce storage, though some of them
impose performance costs. Depending on what you're trying to do, that may
or may not be a net win for your application's overall responsiveness.

So if you don't like one DOM implementation's costs, then by all means try
another!

Also see http://www.w3.org/DOM/faq.html#SAXandDOM for a discussion of some
of the tradeoffs. There _are_ some problems which are better approached via
SAX, or via a custom model... or via a custom DOM implementation.

______________________________________
Joe Kesselman  / IBM Research
Received on Monday, 10 September 2001 11:39:48 GMT

This archive was generated by hypermail 2.2.0+W3C-0.50 : Friday, 22 June 2012 06:13:49 GMT