sharing MIME types and other enumerations [was: notes on SLinkS RDF schema]

Eric Hellman wrote:
> Thank you, Dan, for some very helpful and timely comments.

Glad to be of service; this is kinda fun...

> What, exactly, is the point of replacing the string text/html with the
> resource http://www.isi.edu/in-notes/iana/assignments/media-types/text/html
> , anyway?

Hmm... good question. I suppose MIME types are just like country codes,
ISBNs, etc. My intuition says that the way to manage these global
enumerations is to ground them in URI space. But I can't put my finger
on exactly why this is... here are some related thoughts:

	"The Web works best when anything of value and identity is a first
	class object.  If something does not have a URI, you can't refer to it,
	and the power of the Web is the less for that."
	-- TimBL, Dec 1996
	http://www.w3.org/DesignIssues/Axioms

And waaay back:

	I will be using the format

	<a href=isbn:0-13-484080-1> Carl Malamud's "Stacks" </a>

	-- Edward Vielmetti
	Nov 91
	mid:m0khBzh-00081pC@crane.aa.ox.com
	(cited from http://www.w3.org/Addressing/schemes#isbn)

Ah! Now I remember an example of why this matters: consider IRC
channels. The popular way to integrate them into the web is to
put something like:

	Server: irc23.undernet.org
	channel: #playtime

in a file called playtime.chn and serve it up at
	http://www.example.net/whocares/playtime.chn
with a MIME type of
	application/x-irc-coordinates

and install a handler under that name that fires up your IRC client.
(er... I just remembered the details are at
http://www.mirc.co.uk/mirclink.html
and I didn't recall the MIME type nor extension correctly; but no
matter...)

Well... this works, but consider an IRC URI scheme (which
is documented, by the way: http://www.w3.org/Addressing/schemes#irc)
that makes IRC channels into first class objects, so that rather than

	<a href="http://www.example.net/whocares/playtime.chn">playtime channel</a>

you just write:

	<a href="irc://irc23.undernet.org/%??playtime">playtime channel</a>

that allows your browser to add that channel to your "I've been there
before" list, and it allows you to make RDF statements about it, etc.
(not to mention saving an HTTP transaction and the bother of managing
the .chn file)

Yes, you could make RDF statements about
	http://www.example.net/whocares/playtime.chn
but somebody else could put the exact same coordinates in
	http://www.example.com/somewhere-else/playtime.chn

and your RDF statements wouldn't apply, even though they should.

So the point is: sharing. If you enumerate the MIME types in
one place, and I enumerate them in another, and you say something
about the text/html mime type in your enumeration (e.g.
	text/html -- approved-for-use-in --> SomePublicationClass )
then that something doesn't apply to my notion of text/html; we
lose an opportunity to share.

The solution, it seems to me, is: once something has a home
in URI space, *always* use that identifier to refer to it.

So while it's not exactly ideal that IANA picked the address
	http://www.isi.edu/in-notes/iana/assignments/media-types/text/html
for text/html, it's a historical fact, and one that we should exploit.

(actually, come to think of it, I think the original address was:
	ftp://ftp.isi.edu/in-notes/iana/assignments/media-types/text/html
but I get "too many users" when I try to verify that)
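
To make the sharing point concrete, here is a minimal sketch in Python.
It is not RDF proper: statements are plain (subject, predicate, object)
tuples, and the urn:x-yours/urn:x-mine identifiers and the helper names
are made up for illustration. Only the IANA address and the
approved-for-use-in example come from the discussion above.

```python
# Sketch only: the urn:x-... identifiers and helper names are hypothetical.
IANA_BASE = "http://www.isi.edu/in-notes/iana/assignments/media-types/"


def iana_uri(mime_type):
    """Name a MIME type by its IANA address, e.g. text/html."""
    return IANA_BASE + mime_type


def objects(statements, subject, predicate):
    """Everything said about `subject` via `predicate`."""
    return {o for (s, p, o) in statements if s == subject and p == predicate}


# Two private enumerations: same MIME type, two unrelated identifiers.
private = {
    ("urn:x-yours:text/html", "approved-for-use-in", "SomePublicationClass"),
}
lost = objects(private, "urn:x-mine:text/html", "approved-for-use-in")

# One shared identifier: your statement is about my text/html too.
shared = {
    (iana_uri("text/html"), "approved-for-use-in", "SomePublicationClass"),
}
found = objects(shared, iana_uri("text/html"), "approved-for-use-in")

print(lost)   # empty: the statement never reaches my identifier
print(found)  # the statement is visible to anyone using the IANA address
```

Nothing clever happens in the second half; the statement is found simply
because both parties spelled the subject the same way.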

But maybe it's enough for you and me to add some seeAlso sort of stuff
to our documentation of MIME types, so that there would be
*some* path from your notion of text/html to my notion of text/html,
so that RDF engines can infer that they're interchangeable.
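
A rough sketch of that inference, again in Python: treat each "these
name the same thing" link as an undirected edge and collect everything
reachable from a starting identifier. Treating the links as symmetric
and transitive is an assumption of this sketch (a plain seeAlso link
does not promise that), and the urn:x-... identifiers are hypothetical.

```python
from collections import defaultdict, deque


def interchangeable(links, start):
    """Return every identifier reachable from `start` via equivalence links."""
    graph = defaultdict(set)
    for a, b in links:
        graph[a].add(b)
        graph[b].add(a)  # assumption: links work in both directions
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in graph[node]:
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return seen


IANA = "http://www.isi.edu/in-notes/iana/assignments/media-types/text/html"

# Each of us links our own (hypothetical) identifier to the IANA address.
links = [
    ("urn:x-yours:text/html", IANA),
    ("urn:x-mine:text/html", IANA),
]

group = interchangeable(links, "urn:x-yours:text/html")
print("urn:x-mine:text/html" in group)
```

Neither of us had to know about the other's identifier; the path goes
through the one address we both pointed at.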

Another possibility is that the XML Schema spec for datatypes will
specify a datatype for MIME types, and you'll be able to exploit that
somehow.

By the way...

> I note that these are not RDF resources, they're just names of resources
> containing user-friendly text like "See RFC 9104".

It's an RDF resource regardless of what's inside it; I can make RDF
statements about GIF files, HTML documents, plain text files, etc., no?

> In an earlier version of the schema, we allowed MIME-type to be anything.
> Then, we started using the schema itself to generate our editor
> application. The way it works now, our editor application reads the schema
> when it starts up. It extracts the comments for its help box and uses the
> enumerated values of MIME-Type to build a choice object. So you see why
> it's the way it is.

Er... it's perfectly reasonable to annotate the IANA resources
for this purpose. But your schema seems to actually *define*
the list of MIME types.

[...]

> I also note that while it is easy to add a MIME-Type to my schema, it is
> impossible to subtract a MIME-Type. There is no mechanism to restrict the
> scope of MIME-Type's. Thus it is better to enumerate a minimum set than it
> is to denote the set of all possible values.

Again, it's misleading to say that you're defining the set of MIME
types; rather, I suggest that you're endorsing some of them for use in
your application. And clearly there are mechanisms for that; endorsement
is the original W3C metadata application: PICS, which is the precursor
to RDF!
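
As a hedged illustration of endorsing rather than defining: an editor
could build its choice list from endorsement statements about the IANA
addresses. This Python sketch assumes a made-up "endorsed-for" predicate
and made-up urn:x-example:... application names; it is not how PICS or
the SLinkS editor actually work.

```python
# Sketch only: predicate and application identifiers are hypothetical.
IANA_BASE = "http://www.isi.edu/in-notes/iana/assignments/media-types/"

endorsements = [
    (IANA_BASE + "text/html", "endorsed-for", "urn:x-example:editor"),
    (IANA_BASE + "image/gif", "endorsed-for", "urn:x-example:editor"),
    (IANA_BASE + "text/plain", "endorsed-for", "urn:x-example:other-app"),
]


def choices(statements, application):
    """MIME types endorsed for `application`, sorted for a choice widget."""
    picked = {
        s[len(IANA_BASE):]
        for (s, p, o) in statements
        if p == "endorsed-for" and o == application and s.startswith(IANA_BASE)
    }
    return sorted(picked)


menu = choices(endorsements, "urn:x-example:editor")
print(menu)

# Withdrawing an endorsement is just dropping a statement; the set of
# MIME types itself is untouched.
fewer = [e for e in endorsements if e[0] != IANA_BASE + "image/gif"]
print(choices(fewer, "urn:x-example:editor"))
```

The list of MIME types stays with IANA; each application only says which
ones it vouches for.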


Another thought I had a while ago about global enumerations...

========
excerpt from
http://computer.org/internet/v2n2/connolly.htm

Can you explain what you mean by integrating
        global identifiers with URLs?

            Let's say you've got a product--a video
            camera--and it's going to have two settings, one
            for tall people and one for short people. These
            are essentially two global identifiers. In version
            two of the product, I might introduce
            medium-sized people. So now I've got a global
            enumeration. It's like adding a new tag to HTML.

            It's important that these identifiers be tied to
            URLs. So, for example, you might publish the
            spec for that enumeration on your Web site as
            mycompany.com/shortpeople, and so on. This
            allows people to say, okay, if this is one of
            these things I've never heard of, I can look it up
            on the Web and find out what it is. 

            I read too many specifications where somebody
            has introduced a global enumeration without
            thinking about the evolutionary aspects of how
            they should be deployed. Every global
            subroutine is one of these, for example,
            especially in the case of Java, where you've got
            dynamic linking. 
========

-- 
Dan Connolly, W3C
http://www.w3.org/People/Connolly/

Received on Wednesday, 10 November 1999 03:16:17 UTC