
Another potential new TAG issue: Use of <, >, and & in XML based data formats

From: Elliotte Harold <elharo@metalab.unc.edu>
Date: Thu, 05 May 2005 11:44:02 -0400
Message-ID: <427A3F42.4020602@metalab.unc.edu>
To: W3C TAG <www-tag@w3.org>

I noticed another problem in the CSDI spec, unrelated to the separation 
of content from presentation, that nonetheless might be worthy of TAG 
discussion under the subject of XML-Based Data Formats.

CSDI, and several other W3C specs, existing and proposed, make use of 
the symbols <, >, and & for typical programmatic purposes; i.e. 
comparing items for greater than and less than, and for boolean and. 
XPath, for example, uses < and >, though it uses "and" instead of &.

It seems to me, based on experience, that this is confusing developers. 
Everyone figures out how to escape these characters sooner or later, of 
course; but nonetheless it persistently wastes people's time as they 
try to figure out the error and debug the problem. The worst issue is 
the use of & in URL query strings. That one's so bad that some 
frameworks now support the semicolon as an alternative to the 
ampersand, since the semicolon does not need to be escaped.
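
To illustrate the query string problem, here is a minimal Python sketch using only the standard library. The URL, the element, and the helper name is_well_formed are made up for the example; nothing here comes from CSDI or any particular framework.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Hypothetical URLs, used only for illustration.
url = "http://www.example.com/search?q=xml&lang=en"
semicolon_url = "http://www.example.com/search?q=xml;lang=en"


def is_well_formed(text):
    """Return True if text parses as an XML document, False otherwise."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# Pasting the URL straight into an attribute leaves a bare ampersand,
# which the parser reads as the start of an entity reference.
raw = '<a href="%s">search</a>' % url

# escape() turns & into &amp; (and < and > into &lt; and &gt;).
escaped = '<a href="%s">search</a>' % escape(url)

raw_ok = is_well_formed(raw)
escaped_ok = is_well_formed(escaped)

# After parsing, the attribute value is the original URL again.
href = ET.fromstring(escaped).get("href")

# The semicolon form contains nothing that needs escaping.
semicolon_unchanged = escape(semicolon_url) == semicolon_url
```

Whether a given server actually accepts the semicolon as a parameter separator depends on the framework, so that part is an assumption to check before relying on it.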

Even once programmers have learned their lesson, they still waste time 
forgetting to escape these characters. Finally, the code they do produce 
runs, but it's less clear than it should be. select="value &gt; 0" just 
isn't as easy to read as select="value > 0".
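
The readability cost can be seen in a small Python sketch. The element name "test" and the expression are hypothetical, loosely modeled on an XSLT-style select attribute; this is only meant to show what the author has to type versus what the processor sees.

```python
import xml.etree.ElementTree as ET

# What the author has to write in the document.
escaped = '<test select="value &gt; 0 and value &lt; 10"/>'

# What the author would probably prefer to write.
unescaped = '<test select="value > 0 and value < 10"/>'

# The parser hands the application the readable form of the expression.
select = ET.fromstring(escaped).get("select")

# The readable form is rejected as a document: a literal < is not
# allowed inside an attribute value.
try:
    ET.fromstring(unescaped)
    unescaped_parses = True
except ET.ParseError:
    unescaped_parses = False
```

So the expression the processor evaluates is the clear one; only the human reading or editing the source has to deal with the escaped version.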

I ask that the TAG consider taking up the question of the use of the <, 
>, and & characters in XML-based data formats. It is my hope that it 
will issue a recommendation that all future specs for content expected 
to be used in an XML document not use the characters <, >, and & for any 
special purpose. Possibly it should consider the same question for " and 
' as well, though these two characters aren't as big a problem in practice.

There are alternatives to <, >, and &. Fortran programmers have been 
happily using .GT., .LT., and .AND. for over half a century. New 
specifications for XML based data formats should not overload the <, >, 
and & symbols if they can possibly avoid doing so.
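
As a rough sketch of the difference, the Python below compares a symbolic expression with a Fortran-style one. Both expressions are invented for illustration and do not belong to any existing W3C syntax.

```python
from xml.sax.saxutils import escape

# Hypothetical expression using the overloaded symbols.
symbolic = "value > 0 & value < 10"

# The same hypothetical expression with Fortran-style operators.
worded = "value .GT. 0 .AND. value .LT. 10"

# The symbolic form has to be rewritten before it can go into XML.
symbolic_in_xml = escape(symbolic)

# The worded form can be embedded exactly as written.
worded_in_xml = escape(worded)
```

Here symbolic_in_xml comes out as "value &gt; 0 &amp; value &lt; 10", while worded_in_xml is identical to the string the author typed.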

Elliotte Rusty Harold  elharo@metalab.unc.edu
XML in a Nutshell 3rd Edition Just Published!
Received on Thursday, 5 May 2005 15:44:10 UTC
