W3C home > Mailing lists > Public > www-dom@w3.org > April to June 1999

RE: RFC: White Space Handling In XML Parsing

From: Booth, David <booth@bluestone.com>
Date: Thu, 20 May 1999 16:55:40 -0400
Message-ID: <512EBEF97F02D311B89900A0C9D177601FD54A@thor.operations.bluestone.com>
To: www-dom@w3.org
I think Larry makes a good case for the value of
whitespace preservation in some applications.  

Perhaps the key question
is how many people will be writing applications such
as XML editors that should retain whitespace formatting,
versus how many people will be writing other applications
that simply need to consume XML, and don't care much
about the whitespace formatting the input had.

My personal guess is that there will be a lot more people 
writing XML applications that consume XML and do not care 
much about the whitespace formatting than there will
be people writing XML editor-like things.  But I don't
have any data backing up this guess.  Anybody
got any data quantifying the expected uses?

David Booth
Bluestone Software, Inc.      +1 609 727 4600   ext. 1740


> -----Original Message-----
> From: Larry Watanabe [mailto:LWatanab@JetForm.com]
> Sent: Thursday, May 20, 1999 9:16 AM
> To: Booth, David; www-dom@w3.org
> Cc: 'Joel A. Nava'; Bickel, Bob; Nigro, Mark
> Subject: RE: RFC: White Space Handling In XML Parsing
> 
> I think preservation of whitespace is important. Consider 
> implementing any application where 
> 	(a) user may format XML
> 	(b) program may read in the XML file
> 	(c) changes may be made to the DOM
> 	(d) the DOM is written out
> If the whitespace is not preserved, then the formatting is 
> lost. This is a complaint against many html editors.
> 
> I think that rather than being in the minority, the majority of XML
> applications 
> cannot make the assumption that a) their input was formatted 
> by someone, and
> b) someone (possibly the same person) may want to view the output.
> 
>  Larry Watanabe
>  Jetform Corp
>  lwatanab@jetform.com
> 
> 
> 
> > -----Original Message-----
> > From:	Booth, David [SMTP:booth@bluestone.com]
> > Sent:	Wednesday, May 19, 1999 6:18 PM
> > To:	www-dom@w3.org
> > Cc:	'Joel A. Nava'; Bickel, Bob; Nigro, Mark
> > Subject:	RE: RFC: White Space Handling In XML Parsing
> > 
> > Some general thoughts: 
> > 
> > 1. It would be really, really nice if the DEFAULT behavior were 
> > deterministic, i.e., guaranteed to produce the same DOM tree
> > under different parser implementations.  If one is concerned
> > about parser efficiency (i.e., not unduly constraining parser
> > implementations), then an option could allow the parser to
> > produce its preferred DOM tree for maximal speed.
> > In other words, it would be nice if the specifications supported
> > an application development model of: "FIRST make it correct, 
> > THEN make it fast".  This would be best supported by making
> > the default behavior be the most deterministic, and the non-default
> > behavior be the fastest (assuming one must choose between the two
> > objectives).
> > 
> > 2. It seems to me that the most appropriate 
> > default should be to NOT preserve whitespace, since:
> > 
> > 	a. This is likely to be the behavior that is
> > 	desired by most applications; and
> > 
> > 	b. The people writing the applications that wish to
> > 	PRESERVE whitespace will be a somewhat more select group
> > 	of programmers; thus it is more reasonable to require
> > 	them to know a little more than others, rather than
> > 	the other way around.
> > 
> > Personally, I like to think of XML as a very direct
> > representation of an abstract syntax tree.  Under this
> > kind of thinking, syntactic details such as whitespace 
> > are irrelevant.  This viewpoint may reflect a bias of my
> > background, but I suspect I am not alone in this view.
> > This is another factor in favor of making the default
> > behavior be to NOT preserve whitespace.  However, I am
> > certainly open to other viewpoints.
> > 
> > David Booth
> > Bluestone Software, Inc.      +1 609 727 4600   ext. 1740
Received on Thursday, 20 May 1999 17:02:21 GMT

This archive was generated by hypermail 2.2.0+W3C-0.50 : Friday, 22 June 2012 06:13:46 GMT