Re: Two new drafts: Multipart/Interleaved and Application /BatchBeep from Jacob Palme on 2001-04-30 (ietf-discuss@w3.org from April 2001)

From: Jacob Palme <jpalme@dsv.su.se>
Date: Mon, 30 Apr 2001 19:30:30 +0200
To: discuss@apps.ietf.org
Message-Id: <p05010401b7134c35e818@[130.237.150.141]>
At 15.17 -0400 01-04-24, Keith Moore wrote:
>  > Did you also look at my comparison between RFC822 and XML
>>
>>  RFC822 example:
>>       From: Father Christmas <fchristmas@northpole.arctic>
>>
>>  XML encoding of the same information:
>>
>>       <from>
>>         <user-friendly-name>Father Christmas</user-friendly-name>
>>         <e-mail-address>
>>           <localpart>fchristmas</localpart>
>>           <domainpart>
>>             <domainelement>northpole</domainelement>
>>             <domainelement>arctic</domainelement>
>>         </domainpart>
>>       </from>
>>
>>  The XML encoding uses five times as many characters.
>
>you could have as easily said:
>
><from>Father Christmas &#60;fchristmas@northpole.arctic&#62;</from>
>
>and that would have been a more accurate comparison.  The XML version
>uses a few more characters, but it's not a huge difference overall.

But then you are combining XML with other encoding. &#60 and
@ are framing constructs from RFC822 which in your example
are combined with XML. So your example is a mixture of two
different encoding methods, XML and RFC822, which is not
very neat. You miss the main advantage of XML (and ASN.1/BER),
that one single syntactical method is used for all encoding,
and that the same framing rules can be used everywhere.
In RFC822, there are different framing rules for different
places in the encoded information, and different rules for
allowed characters and special encoding of non-allowed
characters in different parts of the message header.
In XML, you get a single, unified method of framing,
but at the expense of a much more verbose encoding.
That is the point I wanted to make with my example.

In the case of RFC822, it has solved the problems with use
of framing characters in encoded data by some very archaic
rules, such that spaces are (in practice, even if not according
to the standards text) not allowed in e-mail addresses and
that special characters have to be encoded in special ways.
We are so accustomed to RFC822 that we do not think of the
restrictions it imposes. But non-Internet experts do not
like the RFC822 rules for what is allowed and not allowed
in the elementary parts of e-mail addresses, even if
nowadays almost everyone in high-technology countries
are so accustomed to Internet rules that they do not
complain about them anymore.

XML also has special rules for encoding of special characters,
but they are the same rules everywhere, because a unified
method is used for all the framing.

The XML encoding has another advantage: If the keywords
are chosen well, you can read the XML data and understand
it even if you do not know the encoding rules. In the
RFC822 example, you have to know what is user-friendly
name and what is e-mail address, what is local part
and what is domain name, how a domain name can be
hierarchical by the use of ".". This is of course of
most value when the data transmitted has a syntax
with which the reader is not already familiar. In the
case of RFC822, we are so accustomed to its special
usage of "<", "@" and ".", that this is no problem to
an Internet-savvy person.
-- 
Jacob Palme <jpalme@dsv.su.se> (Stockholm University and KTH)
for more info see URL: http://www.dsv.su.se/jpalme/
Received on Monday, 30 April 2001 13:50:22 UTC