- From: Jacob Palme <jpalme@dsv.su.se>
- Date: Mon, 30 Apr 2001 19:30:30 +0200
- To: discuss@apps.ietf.org
At 15.17 -0400 01-04-24, Keith Moore wrote: > > Did you also look at my comparison between RFC822 and XML >> >> RFC822 example: >> From: Father Christmas <fchristmas@northpole.arctic> >> >> XML encoding of the same information: >> >> <from> >> <user-friendly-name>Father Christmas</user-friendly-name> >> <e-mail-address> >> <localpart>fchristmas</localpart> >> <domainpart> >> <domainelement>northpole</domainelement> >> <domainelement>arctic</domainelement> >> </domainpart> >> </from> >> >> The XML encoding uses five times as many characters. > >you could have as easily said: > ><from>Father Christmas <fchristmas@northpole.arctic></from> > >and that would have been a more accurate comparison. The XML version >uses a few more characters, but it's not a huge difference overall. But then you are combining XML with other encoding. < and @ are framing constructs from RFC822 which in your example are combined with XML. So your example is a mixture of two different encoding methods, XML and RFC822, which is not very neat. You miss the main advantage of XML (and ASN.1/BER), that one single syntactical method is used for all encoding, and that the same framing rules can be used everywhere. In RFC822, there are different framing rules for different places in the encoded information, and different rules for allowed characters and special encoding of non-allowed characters in different parts of the message header. In XML, you get a single, unified method of framing, but at the expense of a much more verbose encoding. That is the point I wanted to make with my example. In the case of RFC822, it has solved the problems with use of framing characters in encoded data by some very archaic rules, such that spaces are (in practice, even if not according to the standards text) not allowed in e-mail addresses and that special characters have to be encoded in special ways. We are so accustomed to RFC822 that we do not think of the restrictions it imposes. But non-Internet experts do not like the RFC822 rules for what is allowed and not allowed in the elementary parts of e-mail addresses, even if nowadays almost everyone in high-technology countries are so accustomed to Internet rules that they do not complain about them anymore. XML also has special rules for encoding of special characters, but they are the same rules everywhere, because a unified method is used for all the framing. The XML encoding has another advantage: If the keywords are chosen well, you can read the XML data and understand it even if you do not know the encoding rules. In the RFC822 example, you have to know what is user-friendly name and what is e-mail address, what is local part and what is domain name, how a domain name can be hierarchical by the use of ".". This is of course of most value when the data transmitted has a syntax with which the reader is not already familiar. In the case of RFC822, we are so accustomed to its special usage of "<", "@" and ".", that this is no problem to an Internet-savvy person. -- Jacob Palme <jpalme@dsv.su.se> (Stockholm University and KTH) for more info see URL: http://www.dsv.su.se/jpalme/
Received on Monday, 30 April 2001 13:50:22 UTC