Re: uri, urn and info from Dan Connolly on 2003-10-10 (uri@w3.org from October 2003)

From: Dan Connolly <connolly@w3.org>
Date: Fri, 10 Oct 2003 16:47:41 -0500
To: Eric Hellman <eric@openly.com>
Cc: uri@w3.org
Message-Id: <1065822461.13823.2054.camel@dirk.dm93.org>
On Tue, 2003-10-07 at 16:45, Eric Hellman wrote:
> urn would be great. but perhaps a concrete example would illustrate 
> where info may be coming from:
> 
> What single,  stable, and widely used name should I use to refer to 
> the text/plain mime type?
> 
> It would seem to someone from the outside, perhaps even someone from 
> Tim Bray's planet, that it might be a good idea to use something from 
> the "urn:" URI scheme.
> 
> I asked google what URI to use for a mime type, and, to my great 
> surprise, google's response pointed to an e-mail I had sent to the 
> rdf-interest mailing list in 1999, and which is still worth reading. 
> http://lists.w3.org/Archives/Public/www-rdf-interest/1999Nov/0065.html
> 
> At that time, Dan Connolly had suggested the use of 
> http://www.isi.edu/in-notes/iana/assignments/media-types/text/html to 
> identify text/html
> If we dereference this url, I obtain a resource which I quote here IN 
> ITS ENTIRETY:
> "
> 
> 
> See RFC 2854.
> "
> which of course, is hugely useful to semantic web applications.

Indeed it is!

(1) It provides evidence that IANA, as a representative
of the Internet Community, still cares
enough about a consensus around this MIME type
to answer HTTP queries about it.

(2) The IETF endorses (as Best Current Practice) the
use of this URI for this purpose:

[[
2.5.  Location of Registered Media Type List

   Media type registrations will be posted in the anonymous FTP
   directory "ftp://ftp.isi.edu/in-notes/iana/assignments/media-types/"
]] -- http://www.ietf.org/rfc/rfc2048.txt *


(3) It provides operational support to avoid
collisions. If the left hand of IANA got disconnected
from the right hand and tried to use "text/html" as a
name for some other MIME type, a hot dog markup language,
say, they'd run into the existing use of "text/html"
if they tried to publish the hot dog markup language
definition.
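
A minimal sketch of the collision point in (3), assuming a hypothetical
in-memory Registry class standing in for the published registry (the
class, its publish method, and the second definition string are
illustrative only, not anything IANA actually runs):

```python
class Registry:
    """Hypothetical stand-in for a registry that publishes one definition per name."""

    def __init__(self):
        self._definitions = {}

    def publish(self, name, definition):
        # Publishing under a name that already carries a different
        # definition is where the collision becomes visible.
        existing = self._definitions.get(name)
        if existing is not None and existing != definition:
            raise ValueError(f"{name!r} is already registered: {existing}")
        self._definitions[name] = definition

    def lookup(self, name):
        return self._definitions.get(name)


registry = Registry()
registry.publish("text/html", "See RFC 2854.")

try:
    registry.publish("text/html", "hot dog markup language")
    collided = False
except ValueError:
    collided = True
```

The second publish fails because the name is already in use, which is
the operational check that a bare, unpublished name does not give you.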

(4) it explains what text/html means by reference
to RFC 2854.

Now it would be great if the link to RFC 2854 were
machine-readable. An HTML ref would contribute
google-karma. An rdfs:isDefinedBy relationship
would be cool, but I'm not quite sure how yet.
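
One possible shape for such a statement, as a rough sketch only: a
single N-Triples line built in Python. The subject is the media type
URI discussed above; the object URI for RFC 2854 is an assumption
modelled on the rfc2048.txt address quoted earlier, and the ntriple
helper is hypothetical.

```python
# The rdfs:isDefinedBy property from the RDF Schema vocabulary.
RDFS_IS_DEFINED_BY = "http://www.w3.org/2000/01/rdf-schema#isDefinedBy"

# URI suggested in this thread for the text/html media type.
MEDIA_TYPE_URI = "http://www.isi.edu/in-notes/iana/assignments/media-types/text/html"

# Assumed address for RFC 2854, following the rfc2048.txt pattern.
RFC_2854_URI = "http://www.ietf.org/rfc/rfc2854.txt"


def ntriple(subject, predicate, obj):
    """Format one triple of three URI references as an N-Triples line."""
    return f"<{subject}> <{predicate}> <{obj}> ."


statement = ntriple(MEDIA_TYPE_URI, RDFS_IS_DEFINED_BY, RFC_2854_URI)
print(statement)
```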

But even the plain text reference published
in an HTTP 200 response makes the http URI
better than a URN in these four ways.

You asked for a single, stable, widely-used name.

The most widely used name for this media
type is "text/html". If you were happy with that,
you wouldn't be asking in uri@w3.org, so I presume
you're after a name in URI space. I'm somewhat
curious what you want that for, especially since
you don't seem to require network lookup.
But as I point out above, the IETF recommends
the ftp://... name.

It's evidently not very widely used; google
reports 6 links to it. I'm curious why you want
it to be widely-used. I suppose one reason is that
wide use contributes to stability, at least
in the case of HTTP URIs, where people are likely
to complain if the server(s) stop doing what they've
come to expect.

Relying on ISI is questionable, when it comes to
stability.

The area directors are drafting a new media type registration document
  http://www.ietf.org/internet-drafts/draft-freed-mime-p4-03.txt
and they seem to be correcting that problem, as well
as the problem that FTP doesn't allow for redirection,
to smooth any future moves, nor format negotiation,
to smooth evolution of data formats.

[[
   Media type registrations are listed by the IANA at:

     http://www.iana.org/assignments/media-types/index.html

]]

The .html could go, for my purposes. I suppose I should
tell them that; the document is in last call.
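
To illustrate why an extension-free address is attractive, here is a
simplified sketch of format negotiation: one URI, with the
representation picked from the client's Accept header. The file names
index.html and index.rdf and both function names are hypothetical, and
the matching is deliberately cruder than real HTTP negotiation (it
handles q-values and "*/*" but not "type/*" ranges).

```python
def parse_accept(header):
    """Return (media_type, q) pairs from an Accept header, highest q first."""
    prefs = []
    for item in header.split(","):
        parts = [p.strip() for p in item.split(";")]
        media_type = parts[0]
        if not media_type:
            continue
        q = 1.0  # default quality when no q parameter is given
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0
        prefs.append((media_type, q))
    # sorted() is stable, so equal q-values keep the client's order.
    return sorted(prefs, key=lambda pair: pair[1], reverse=True)


def negotiate(accept_header, available):
    """Pick a representation for one URI from a media-type -> file mapping."""
    for media_type, q in parse_accept(accept_header):
        if q <= 0:
            continue
        if media_type in available:
            return available[media_type]
        if media_type == "*/*" and available:
            return next(iter(available.values()))
    return None


# Hypothetical representations behind a single extension-free URI.
AVAILABLE = {"text/html": "index.html", "application/rdf+xml": "index.rdf"}

print(negotiate("application/rdf+xml, text/html;q=0.5", AVAILABLE))
```

With the extension left out of the URI, an RDF representation can be
added later without changing the name that people link to.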

This change in address as a consequence of change in media
type registration rules is cause to wonder about future
stability. That's why Mark Baker and I wrote an Internet
Draft suggesting stability as an IANA policy...

A Registry of Assignments using Ubiquitous Technologies and Careful
                                Policies
                   draft-connolly-w3c-accessible-registries-00
http://www.ietf.org/internet-drafts/draft-connolly-w3c-accessible-registries-00.txt


> A year later, James Tauber (who I doubt is the ignorant dolt that I 
> am)  admitted  to not knowing of this URL when the question comes up 
> again, and suggests
> http://www.iana.org/mime-types/text/plain
> which has nothing on the dereference.

But he made a pretty good guess into the future! ;-)

> Graham Klyne, who is also not an ignorant dolt, suggested that 
> "urn:iana:content-type:text/plain" was on the way.
> 
> Dan Connolly then pointed to 
> ftp://ftp.isi.edu/in-notes/iana/assignments/media-types/
> as the source of authority for these assignments;

Did I? I should have pointed to rfc2048.

>  but if I dereference 
> and follow that, I get ftp URI's like
> ftp://ftp.isi.edu/in-notes/iana/assignments/media-types/text/html
>   which appears to result in the same resource that I quoted in its entirety
> 
> Looking at actual practice on the web, I see that Dan's advice is 
> often ignored.

Quite. :)

>  I see all sorts of stuff like 
> "urn:mimetype:text/plain". Google finds a total of 297 documents which 
> use Dan's uri, versus 1900 documents using the ftp.isi.edu version.
> 
> looking further, I see that there was an "eastlake" draft that IETF 
> seems to have deep-sixed. I learned, tantalizingly, that Graham 
> Klyne, Ted Hardie and Michael Mealling did some work to perhaps 
> create uris for iana registered stuff, their draft is also expired by 
> ietf, so I cannot tell  what they found.
> 
> The bottom line is that, at least for this one example, the URI 
> infrastructure has failed to provide a single, stable uri for 
> text/plain in a way that people know to use it.

The URI infrastructure did provide a URI. You expect the URI
infrastructure to get people to use it? I don't see why.

There's a lot of stuff around HTTP URIs that can contribute
to use... google karma and the like. None of that
exists for URNs nor info: URIs.



>  Nowhere is there a 
> place that an authoritative source (other than Dan Connolly, who I 
> have never met) says to use a particular uri for text/plain.

rfc2048 enjoys some level of endorsement; it's an IETF BCP.


> let's try again with "the iso 8879 character set". Is there a good 
> uri for that? not that I can easily find.

This is sorta close...
  http://www.iana.org/assignments/character-sets

if/when they make it available in HTML or RDF or
some such, perhaps you'll be able to use...
  http://www.iana.org/assignments/character-sets#iso8879


> maybe that's not a good one. how about a uri based on the iso country 
> code for Mexico? sorry, I can't find one.

http://www.iana.org/root-whois/mx.htm

(again: better to lose the .htm)

If you care that it's published by ISO rather than IANA,
I'd have to search around some more. ISO likes to charge
for access to their information, so I'm not sure they're
willing to play the game yet.

> ok, how about languages, is there a universally understood URI for 
> American English?

http://www.iana.org/assignments/language-tags

>  Someone on the list can tell me maybe, but what 
> about all those people who aren't receiving the URI list???

It seems to me that the cost of finding these things is
manageable. And if they're really wanted/needed, the cost of
finding them in http space is likely to go down over
time.

>   Developing a common language, which is what we're trying to do for a 
> specific, web based application, is a social, non-technical process 
> of consensus.

Amen.

>  URN, and HTTP for that matter, has failed to make that 
> consensus happen, even for these "easy" cases.

I think you're jumping to conclusions too early.

>  So the result is that 
> all sorts of groups make up their own vocabulary and none of the 
> groups can talk to one another.

That's natural at the start of this consensus process, no?

>  Although I think the info draft can 
> be improved in many ways, I'd have to say that organizations like 
> NISO with experience at developing that kind of consensus in the 
> bricks and mortar world need to be actively involved.

NISO is welcome to play along. But I think the IETF isn't doing
so badly, really.


> I really don't care whether it's urn or info ( or http, for that 
> matter). I can make any of them work, if only we could just get on 
> with it.
> 
> so here's a taxonomy for the ways that I've seen put forward for name- uri's
> 
> http
> easiest and most functional, but the minter has to spend a lot of 
> money and time getting people to adopt the resulting URIs. 

Surely that applies to all these choices, no?

> unfortunate car/document argument that always crops up.

True.

> tag
> even easier to mint, no function other than uniqueness. The minter 
> has an even bigger hurdle to get the URI  space adopted, due to 
> people's unfamiliarity with tag
> 
> urn
> rigorous requirements but the real hurdle with urn is to get IETF 
> consensus. IETF lapses most URN proposals and doesn't promote or use 
> the ones it does.

Hmm... The URN NID registration process seems pretty efficient,
from what I can tell. I take the lack of lots of URN NID
registrations as evidence that they're not really needed,
not that they're cost-prohibitive.

> 
> info
> minter has to obtain NISO sign-off. hardly any requirements.

And consequently, hardly any benefits. No google,
no operational support for conflict management.

>  no 
> function except an unspecified namespace registry.

I suggest NISO run a registry in http space.

The URI scheme registry is very valuable real-estate.
Putting something there and keeping it there costs
the whole IETF/Internet community. I don't see why the
whole community should bear this burden when
NISO can do its thing within the existing framework.

> 
> Eric
-- 
Dan Connolly, W3C http://www.w3.org/People/Connolly/
Received on Friday, 10 October 2003 17:47:54 UTC