Re: Glossary "non-text content" Small Nit

Hi Chris,


On Oct 13, 2005, at 5:44 AM, Chris Ridpath wrote:

> Images and binary content are usually sent as ASCII characters.
> Weird eh?
>
> Here's what Wikipedia has to say about the MIME standard:
> http://en.wikipedia.org/wiki/MIME
>
> RFC 822:
> http://www.ietf.org/rfc/rfc822.txt
>
> You can try it yourself. Send an image using email then look at the  
> message source. The binary image file has been converted to ASCII  
> characters.

What you've said is true for email, but not for content sent over  
HTTP. HTTP allows 8-bit content natively, and uses the MIME type of a  
given format to determine how a given resource is sent by the server,  
and what class of application (browser, plugin, media player, word  
processor...) is meant to process it on the client.

What you are describing is "MIME encoding", which is done to overcome
a constraint in SMTP. Email messages sent over SMTP must be 7-bit, so
8-bit content such as binary images (and also text above character
127, such as accented letters) has to be re-encoded as 7-bit ASCII,
with base64 or quoted-printable for instance, to be transferred
properly.
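
To make the 7-bit constraint concrete, here is a minimal Python sketch
of what a mail client does before handing binary data to SMTP. The
payload and the helper name is_seven_bit are made up for illustration;
base64 and quopri are standard library modules.

```python
import base64
import quopri

def is_seven_bit(data: bytes) -> bool:
    """Return True if every byte fits in 7 bits (value below 128)."""
    return all(b < 128 for b in data)

# Hypothetical 8-bit payload: a GIF signature followed by high-bit bytes.
payload = b"GIF89a" + bytes([0x80, 0xFF, 0xC3, 0xA9])

# base64 is the usual choice for binary attachments.
as_base64 = base64.b64encode(payload)

# quoted-printable is the usual choice for mostly-ASCII text.
as_qp = quopri.encodestring("café".encode("iso-8859-1"))

print(is_seven_bit(payload))      # the raw payload is not 7-bit safe
print(is_seven_bit(as_base64))    # the encoded form is
print(is_seven_bit(as_qp))
print(base64.b64decode(as_base64) == payload)  # decoding restores the bytes
```

The encoded forms are what you see in the message source of an email.
They only become the original file again after the receiving client
decodes them.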

However, if you were to connect directly to an HTTP server and
request known binary content, what you would get is not MIME-encoded
data, but the original binary data. The 'curl' program will let you
test this. If you run:

$ curl http://www.w3.org/WAI/images/wai-temp >wai.gif

and then view the wai.gif you've downloaded in an image viewer, or
run it through a conversion tool like gif2pbm, you will see that it
has maintained its integrity. If it were MIME-encoded data you had
downloaded, processing the image would fail because you would have
needed to run a decoder on it to reconstitute the original file.
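
If you would rather check the bytes than trust an image viewer, a
rough Python sketch along these lines would do. It assumes the file is
a GIF, and the sample bytes below are a made-up stand-in rather than
the actual W3C image; looks_like_raw_gif is a hypothetical helper.

```python
import base64

# Every GIF file starts with one of these two 6-byte signatures.
GIF_SIGNATURES = (b"GIF87a", b"GIF89a")

def looks_like_raw_gif(data: bytes) -> bool:
    """Return True if the data begins with a GIF signature."""
    return data.startswith(GIF_SIGNATURES)

# Stand-in for the start of a real GIF (hypothetical sample bytes).
raw = b"GIF89a" + bytes([0x01, 0x00, 0x01, 0x00, 0x80, 0x00, 0x00])

# What the same bytes would look like if they had been base64-encoded.
mime_encoded = base64.b64encode(raw)

print(looks_like_raw_gif(raw))                             # raw data passes
print(looks_like_raw_gif(mime_encoded))                    # encoded data does not
print(looks_like_raw_gif(base64.b64decode(mime_encoded)))  # decoded data passes
```

To apply it to the download, read wai.gif in binary mode and pass the
first few bytes to looks_like_raw_gif. Raw binary from the server
should pass without any decoding step.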

Put another way, if HTTP by default MIME-encoded everything that
wasn't 7 bits, then any HTML saved in just about any character
encoding other than those used for English (us-ascii, iso-8859-1,
etc.) would have to be encoded, or users of that same character set
wouldn't be able to process it because the high bit would be lost.
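
As a small illustration of what "the high bit would be lost" means,
this Python sketch simulates a 7-bit channel by masking each byte with
0x7F. The function name strip_high_bit and the sample word are my own
choices for the example.

```python
def strip_high_bit(data: bytes) -> bytes:
    """Simulate a 7-bit channel by clearing the top bit of every byte."""
    return bytes(b & 0x7F for b in data)

# In UTF-8, "é" is the two bytes 0xC3 0xA9, both above 127.
original = "café".encode("utf-8")
damaged = strip_high_bit(original)

print(original)  # b'caf\xc3\xa9'
print(damaged)   # b'cafC)'

# Plain ASCII text survives the same channel untouched.
print(strip_high_bit(b"cafe") == b"cafe")
```

The ASCII letters come through intact, but the accented character
turns into unrelated ASCII bytes, which is why such text cannot cross
a 7-bit transport without being encoded first.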

-
m

Received on Thursday, 13 October 2005 15:49:01 UTC