
Re: What does a document mean?

From: Elliotte Rusty Harold <elharo@metalab.unc.edu>
Date: Fri, 29 Mar 2002 10:29:53 -0500
Message-Id: <p04330106b8ca3690212b@[192.168.254.4]>
To: www-tag@w3.org
At 11:28 AM -0500 3/28/02, Norman Walsh wrote:
>Paul and I agreed to make this document public, and I did so on
>Monday, but forgot to announce it. Ooops. Here it is, moments before I
>vanish for two weeks vacation. Sorry about the timing.
>
>   http://www.w3.org/2001/tag/doc/docmeaning.html
>

The definition it gives of a document is:

A document on the Web is a stream of bits identified with a specific 
MIME type. The MIME type indicates to the processor how it may 
interpret the stream of bits to decompose it into a sequence of 
characters, for example, or a specific bitmap image.

I don't know that I believe that all documents on the Web have MIME 
types. What about a file in a custom binary format for which no MIME 
media type has been defined retrieved through a protocol such as ftp 
which does not provide MIME typing information? Is this not a 
document? Is this not part of the Web?
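To make the case concrete, here is a minimal Python sketch. It assumes the only clue available is the file name in the URL, which is the situation ftp leaves a client in. The host name, the paths, and the ".customfmt9" extension are hypothetical, chosen only for illustration.

```python
import mimetypes


def guess_media_type(url):
    """Return the MIME type inferred from the URL's file extension, or None."""
    # guess_type() only consults an extension-to-type table; it never
    # inspects the bits themselves and never contacts the server.
    media_type, _encoding = mimetypes.guess_type(url)
    return media_type


# A well-known extension maps to a registered media type.
known = guess_media_type("ftp://ftp.example.org/pub/report.html")

# A custom binary format (hypothetical extension) has no entry in the table.
unknown = guess_media_type("ftp://ftp.example.org/pub/data.customfmt9")

print(known)
print(unknown)
```

For the second URL the lookup yields no media type at all, yet the file can still be retrieved and used.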

And what about documents that aren't on the Web? Wouldn't it be 
prudent to have a general definition of "document" first?

An unrelated point: is it possible that a document is an infinite or 
at least indefinite stream of bits? If not, we should state that a 
document is a finite stream of bits.

Also the word "stream" seems a little too suggestive of particular 
APIs. I suggest that the word "sequence" is more precisely defined, 
more likely to be understood, and less likely to cause confusion.

Finally, are we really sure that a document on the Web is always 
bits, and always will be? What about non-binary computers? There have 
been such things in the past, and there seem likely to be such things 
in the future. This includes analog computers, genetic computers, 
quantum computers, and decimal-based finite state machines. I for one 
don't think it's prudent to lock in binary notation as a fundamental 
concept.

And of course there are many things off the Web which we would 
recognize as documents and which are decidedly not binary. Again, if 
we first defined "document" then maybe we could better define 
"document on the Web".
-- 

+-----------------------+------------------------+-------------------+
| Elliotte Rusty Harold | elharo@metalab.unc.edu | Writer/Programmer |
+-----------------------+------------------------+-------------------+
|          The XML Bible, 2nd Edition (Hungry Minds, 2001)           |
|             http://www.cafeconleche.org/books/bible2/              |
|   http://www.amazon.com/exec/obidos/ISBN=0764547607/cafeaulaitA/   |
+----------------------------------+---------------------------------+
|  Read Cafe au Lait for Java News:  http://www.cafeaulait.org/      |
|  Read Cafe con Leche for XML News: http://www.cafeconleche.org/    |
+----------------------------------+---------------------------------+
Received on Friday, 29 March 2002 10:38:24 GMT
