Re: file types from David Perrell on 1996-10-25 (www-html@w3.org from October 1996)

From: David Perrell <davidp@earthlink.net>
Date: Thu, 24 Oct 1996 23:01:28 -0700
To: "Scott E. Preece" <preece@predator.urbana.mcd.mot.com>
Cc: <www-html@w3.org>
Message-Id: <199610250603.XAA29364@norway.it.earthlink.net>

Scott E. Preece wrote:
> I guess I don't understand your argument - TIFF has an embedded file
> type code (the same kind of thing I was talking about).

Either 49 49 2A 00 or 4D 4D 00 2A (hex), depending on byte order of
multi-byte values. The 2Ah (42) is the never-changing version number.
(Are we still at Rev 5.0, circa 1988?)
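
A minimal Python sketch of that header check, assuming only the first
four bytes of the file are available: two byte-order bytes ("II" or
"MM") followed by the 16-bit value 42 read in that byte order. The
function name tiff_byte_order is hypothetical, chosen for illustration.

```python
import struct

def tiff_byte_order(header: bytes):
    """Return 'little', 'big', or None if this is not a TIFF header.

    Hypothetical helper: inspects only the first four bytes.
    """
    if len(header) < 4:
        return None
    mark = header[:2]
    if mark == b"II":        # 49 49h: least significant byte first
        fmt, order = "<H", "little"
    elif mark == b"MM":      # 4D 4Dh: most significant byte first
        fmt, order = ">H", "big"
    else:
        return None
    # The next two bytes hold the constant 42 (2Ah) in the declared order.
    (magic,) = struct.unpack(fmt, header[2:4])
    return order if magic == 42 else None
```

Everything past these four bytes (tags, compression, and so on) still
has to be parsed before a reader knows whether it can handle the image.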

> And no, you don't need new codes for new versions if the format is
> self-describing and the new versions stay within the same
> self-description standard. You only need a new type (or to use a
> type+version coding) if the types are incompatible between versions.
> TIFF isn't (at least at one level), though in terms of vectoring a
> double-clicked icon to the appropriate application, TIFF is of limited
> help, since the application may need to do substantial parsing before
> it can decide whether it can handle the file or not, which is why I
> suggested the utility of having version numbers.

That's supposed to be the beauty of TIFF. A well-designed reader could
make some sense out of just about anything. But how often do you see
"well-designed"?

> I'm not sure you're better off in TIFF for not having a simple
> way to judge compatibility from the outside (without a lot of
> processing).

TIFF was designed for intersystem portability. Don't let those lazy
programmers give up too easily.

> Remember, there are two uses for the typing - one is to let an
> application decide whether it can handle the file, the other is to
> determine, from the type, what application to hand it to.  Parsing
> works OK for the first use, though somewhat more slowly and with
> ambiguity issues if the application can support different versions in
> different ways and a particular file has some attributes of one case
> and some of another; it works less well for the latter (though you
> could still handle it by offering the file to all the potentially
> accepting tools and letting them decide individually whether to say
> they can handle it, then let the user decide which of the candidates
> to use).

All fine and dandy, except that to avoid the overhead of opening and
reading a file to find out the type, the type must be part of the file
system, not embedded in the file data. I know of no possible mechanism
for this in the FAT system besides the extension and a single attribute
byte. I believe the same is true of NTFS, though here you've got long
filenames and support for Unicode characters. Can you imagine having to
open tens of thousands of files to construct a readable
folder/directory listing?
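
To make the cost difference concrete, here is a rough Python sketch
contrasting the two approaches. The extension table and all function
names are hypothetical, made up for illustration; this is not how any
particular file manager works.

```python
import os

# Hypothetical extension table; the entries are illustrative only.
EXTENSION_TYPES = {
    ".tif": "TIFF image",
    ".tiff": "TIFF image",
    ".htm": "HTML document",
    ".html": "HTML document",
}

def type_from_name(filename):
    """Guess the type from the directory entry alone; no file is opened."""
    _, ext = os.path.splitext(filename)
    return EXTENSION_TYPES.get(ext.lower(), "unknown")

def type_from_content(path):
    """Guess the type by opening the file and reading its first four bytes."""
    with open(path, "rb") as f:
        head = f.read(4)
    if head in (b"II*\x00", b"MM\x00*"):   # TIFF headers, either byte order
        return "TIFF image"
    return "unknown"

def list_directory(directory):
    """Build a listing from names only: one directory read, zero file opens."""
    return {name: type_from_name(name) for name in sorted(os.listdir(directory))}
```

list_directory() touches only the directory itself, while a listing
built on type_from_content() would need one open and one read per file.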

David Perrell

Received on Friday, 25 October 1996 03:23:55 UTC