- From: Razvan Costea-Barlutiu <cbrazvan@baltan.bsd.uchicago.edu>
- Date: Mon, 10 Sep 2001 09:48:43 -0600
- To: www-dom@w3.org
I read somewhere, in reference to the DOM Level 1 specification, that a DOM hierarchy might take up to 4 times as much memory as the file it was loaded from, depending on the implementation. I completely agree with that.

I wrote an implementation of DOM Level 2 (without the events part, which actually is the real breakthrough of DOM 2) and I began stress-testing it on a 1.7 MB file containing some 230,000 entries. (I just copied and pasted the contents of the file, almost choking XMLSpy at some point, after which I moved to mighty... Notepad.) Anyway, these entries "eat" about 50 MB of memory, which I find pretty scary for a 1.7 MB file.

I searched my code for memory leaks or over-allocations, but found none. I tried to compute the size I expected the DOM to reach, and was very surprised to see a HUGE number of TEXT nodes (about 55,000) containing nothing but TAB characters and "\r" characters.

I know that the DOM MUST reflect the structure of the document, but these nodes are a pain in the lower layer and eat up huge amounts of memory. While it might sound stupid to IMPOSE the "pretty-printing" formatting on the developers of the DOM, at least something could be done to reduce the number of these formatting text nodes in the DOM. The normalize() method of Node concatenates adjacent text nodes, but that is not the issue here.

I'm looking forward to an answer, observations or whatever you have to say about this.

Thank you!

__________________________________________________________________
Razvan Costea-Barlutiu
Department of Radiology, The University of Chicago
5841 South Maryland Avenue
Chicago, Illinois 60637
Phone: (773)834-5106
E-Mail: cbrazvan@baltan.bsd.uchicago.edu
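
To illustrate the whitespace-only text nodes discussed in this message, here is a minimal sketch in Python using the standard library's xml.dom.minidom. It is not the DOM Level 2 implementation described in the message. The sample document and the helper name strip_whitespace_text are hypothetical, and the sketch assumes whitespace between elements is never significant (it ignores xml:space="preserve" and mixed content).

```python
from xml.dom import minidom
from xml.dom.minidom import Node


def strip_whitespace_text(node):
    """Remove text nodes holding only whitespace; return how many were removed."""
    removed = 0
    # Iterate over a copy, because removeChild mutates node.childNodes.
    for child in list(node.childNodes):
        if child.nodeType == Node.TEXT_NODE and not child.data.strip():
            node.removeChild(child)
            removed += 1
        elif child.nodeType == Node.ELEMENT_NODE:
            removed += strip_whitespace_text(child)
    return removed


# Hypothetical pretty-printed input: tabs and line breaks between elements.
SAMPLE = "<entries>\r\n\t<entry>a</entry>\r\n\t<entry>b</entry>\r\n</entries>"

doc = minidom.parseString(SAMPLE)
removed = strip_whitespace_text(doc.documentElement)
print("whitespace-only text nodes removed:", removed)
print(doc.documentElement.toxml())
```

Even in this tiny sample, the formatting alone accounts for more text nodes than the actual content does, which is the same effect seen at scale in the 1.7 MB file. Whether dropping such nodes is safe depends on the document type, so a filter like this would normally be an opt-in step after (or during) loading.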
Received on Monday, 10 September 2001 10:47:23 UTC