- From: Jonathan Robie <jonathan@texcel.no>
- Date: Wed, 02 Dec 1998 22:17:43 -0500
- To: steve@rsv.ricoh.com (Stephen R. Savitzky)
- Cc: Mike Champion <mcc@arbortext.com>, f.cameron@ulst.ac.uk, www-dom@w3.org
At 02:29 PM 12/2/98 -0800, Stephen R. Savitzky wrote:

>Examples of things that don't round-trip include choice of quotes for
>attributes, named vs. numeric character entities, omitted start-tags and
>end-tags in HTML documents, presence of line breaks before and inside of
>tags, and whether an explicit end tag or "/>" was used on an empty tag in
>XML.

Each of these is an example where there is more than one way to encode the same document structure. HTML is slated to become XML conformant in the future, so omitted start-tags and end-tags will become obsolete, but some of these others might really matter in an authoring application that persists the output. Currently, of course, we can't persist documents, so it doesn't really cause any problems.

These things are primarily important for round tripping - the main purpose of the DOM is to represent document structure, not the keystrokes originally used to encode them. But as soon as we can persist documents, round tripping gets pretty important, IMHO.

>And of course, the DTD and any other declarations embedded in the
>document don't get into the tree, either.

Right. We are all very aware of the need to represent the DTD.

>DOM level 1 loses information -- it is not possible to reconstruct the
>original document from the "equivalent" DOM tree. This is one of the
>most serious problems with it, by the way. Another is the inability to
>represent generic SGML documents. They're related.

The DOM isn't *supposed* to represent generic SGML documents, it's not in our charter. I don't think we should spend our time figuring out how to do record ends and other obscure things few people are using.

>I ran into one of these recently, when I was contemplating using @ for
>@ in documents containing my e-mail address to foil spammers (at least until
>they start using DOM-based address-suckers).

Now waitaminit, they can't use our DOM to do anything so despicable as writing address-suckers...
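To make the round-tripping point above concrete, here is a minimal sketch using Python's standard `xml.dom.minidom` as a stand-in DOM implementation (my choice for illustration; it is not something discussed in this thread). The two input strings and the address `user@example.org` are hypothetical. They differ in attribute quotes, in a numeric character reference versus the literal character, and in "/>" versus an explicit end tag, yet they should yield the same tree, and the serialized form matches neither input.

```python
from xml.dom.minidom import parseString

# Two hypothetical encodings of the same document structure.
# source_a: single quotes, numeric character reference, "/>" empty tag.
# source_b: double quotes, literal "@", explicit end tag.
source_a = "<doc><mail addr='user&#64;example.org'/></doc>"
source_b = '<doc><mail addr="user@example.org"></mail></doc>'

dom_a = parseString(source_a)
dom_b = parseString(source_b)


def mail_address(dom):
    """Return the addr attribute of the first <mail> element."""
    return dom.getElementsByTagName("mail")[0].getAttribute("addr")


# The tree keeps the structure and the decoded attribute value...
same_value = mail_address(dom_a) == mail_address(dom_b)

# ...but not the lexical choices, so both trees serialize identically.
serialized_a = dom_a.documentElement.toxml()
serialized_b = dom_b.documentElement.toxml()
same_output = serialized_a == serialized_b

# Neither original string can be recovered from the serialized tree.
round_trips = serialized_a in (source_a, source_b)

print(same_value, same_output, round_trips)
```

Which quoting and empty-tag form the serializer picks is an implementation detail; the point is only that the choice is made by the serializer, not remembered from the source.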
Jonathan jonathan@texcel.no Texcel Research http://www.texcel.no
Received on Wednesday, 2 December 1998 22:28:33 UTC