RE: URI registries and schemes

From: Mike Schinkel <mikeschinkel@gmail.com>
Date: Thu, 13 Dec 2007 17:37:19 -0500
To: "'Sean Reilly'" <sreilly@cnri.reston.va.us>, "'Clive D.W. Feather'" <clive@demon.net>
Cc: "'Erik Wilde'" <dret@berkeley.edu>, <uri@w3.org>
Message-ID: <035a01c83dd8$bc2c0190$0702a8c0@Guides.local>

Clive, Sean:

Clive D.W. Feather wrote:
> There's an accepted 
> way to represent concepts: URNs. To my mind, there should be:
> 
>     urn:location:wg84:+5217,+00003    (or whatever encoding gets used)

Sean Reilly wrote:
> +N (where N is however many votes I can afford to buy)

Seems like you both dropped into the middle of the conversation and missed
some of the earlier discussion.  A URN is exactly what I proposed on Dec
11th for the wgs84 use-case; however, Erik Wilde dismissed it because of his
belief that a URN requires a resolution mechanism. See my comments that
start with "The fog is slowly clearing for me" at:

    http://lists.w3.org/Archives/Public/uri/2007Dec/0042.html

Clive D.W. Feather wrote:
>     urn:location:osgb:TL4652

What does "osgb" refer to in your example?

> But, more practically, there's a conceptual difference 
> between "the location 52d17'N 0d03'E" and "a web page from 
> <X> about the location 52d17'N 0d03'E". 
> while yours uses the mapping:

This is the age-old debate about the meaning of a URL compared to its
resource. I take the position that the URL can identify something and return
a web page about it.  And if that bothers you then this can be your
identifier:

    http://location.org/{grid}
    
And this can be your web page:    
 
    http://location.org/{grid}.html
    
Where the former can return the latter's representation via content
negotiation.
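
To make the two-URL arrangement concrete, here is a minimal sketch of how a
server might answer a GET on the identifier URL. It is only an illustration:
the function name, the simplified Accept parsing (q-values are ignored) and
the choice to answer 200 with a Content-Location header are my assumptions,
not anything location.org actually implements.

```python
from typing import Dict, Tuple


def negotiate(grid: str, accept: str) -> Tuple[int, Dict[str, str]]:
    """Return (status, headers) for a GET on the hypothetical /{grid} URL."""
    # Treat a missing Accept header as "anything is fine".
    accept = accept or "*/*"
    # Collect the requested media types, ignoring q-values for simplicity.
    wanted = [part.split(";")[0].strip().lower() for part in accept.split(",")]
    if "text/html" in wanted or "*/*" in wanted:
        # Serve the web page about the location and say where it lives.
        return 200, {"Content-Location": "/" + grid + ".html", "Vary": "Accept"}
    # No representation matches what the client asked for.
    return 406, {"Vary": "Accept"}
```

The identifier URL stays stable while the ".html" URL names one particular
representation of it.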

>     urn:location:osbg:{grid}  ->  
> http://maps.google.co.uk/q={grid}&cs=osgb
> 
> Meanwhile, my sat-nav accepts the URN as a "home" location, 
> or a destination, or whatever. These aren't information 
> requests (URIs) but location data. Different things.

A URL is both an identifier, in its own right, and also can be used to
request representations. No problem with that. But if you disagree can you
please give me a concrete scenario where it causes a problem to have them be
both?

> instead, but that then provides semantic confusion. You still 
> need to have a registry of names, but your approach seems to 
> me to add a layer of bureaucracy (the "foundation") and an 
> overloading of two concepts on to one syntax, which is almost 
> certainly a bad thing. 

Certainly it provides some confusion given the current state of the web, but
only for those who cannot read (I'm not being facetious; web clients can't
'read.') However, that could be resolved in the future by introducing a
single simple concept: HTTP headers, or <meta> or <link> elements in web
pages, that let a response identify itself as providing a representation
specific to a service, with a link to the service's definition.  Recognizing
software would of course need to proliferate, but at least it would solve
the problem once and for all instead of requiring the problem to be solved
over and over again.

The problem I have with purely scheme and URN approaches is they have no way
to bootstrap newly introduced concepts. Layering over HTTP would allow
bootstrapping of new concepts.  What this debate boils down to is a
dichotomy of values; what I and some others value is different than what you
two value. But assuming 'your side' is not fundamentalist about the issue
(i.e. your way and only your way), we can achieve both sets of goals by
mating a URN approach with an HTTP/URL approach.
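
As a rough sketch of what mating the two could look like, the following maps
a location URN to an HTTP URL and back. Everything specific here is an
assumption for illustration: the "urn:location:" namespace is not registered,
and the http://location.org/{namespace}/{id} path layout is just one possible
choice.

```python
from urllib.parse import quote, unquote

# Hypothetical prefixes; neither is a registered or deployed service.
BASE = "http://location.org/"
PREFIX = "urn:location:"


def urn_to_url(urn: str) -> str:
    """Map urn:location:{namespace}:{id} to an HTTP URL under BASE."""
    if not urn.lower().startswith(PREFIX):
        raise ValueError("not a urn:location: identifier")
    namespace, sep, ident = urn[len(PREFIX):].partition(":")
    if not sep or not namespace or not ident:
        raise ValueError("expected urn:location:{namespace}:{id}")
    # Keep "+" and "," literal so coordinates stay readable in the path.
    return BASE + quote(namespace, safe="") + "/" + quote(ident, safe="+,")


def url_to_urn(url: str) -> str:
    """Map an HTTP URL under BASE back to the matching URN."""
    if not url.startswith(BASE):
        raise ValueError("not a location.org URL")
    namespace, sep, ident = url[len(BASE):].partition("/")
    if not sep or not namespace or not ident:
        raise ValueError("expected http://location.org/{namespace}/{id}")
    return PREFIX + unquote(namespace) + ":" + unquote(ident)
```

With a purely syntactic mapping like this, a device that only understands
the URN and a browser that only understands HTTP refer to the same thing.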

> In particular, how does my browser 
> distinguish between:
> 
> * I want to see {grid} on Streetmap.
> * I want to see what location.org are saying about {grid}.

Before I answer that can we assume that your browser must first be made
aware of this {grid} concept?  If not, how will your browser know what to do
with the URN?  You can't have it both ways.

Sean Reilly wrote:
> +N (where N is however many votes I can afford to buy)

I'll see your +N and raise you N^2.  '-)

> No matter what organization is responsible for location.org, 
> it is likely that they will not continue to exist forever.  

That is a straw-man argument. Neither the W3C nor the IETF is likely to
last forever either, but if w3.org or ietf.org were to go away it would be a
nightmare.  OTOH, I think we can safely expect them to continue to exist.

If a foundation with the proper mission were put in place to manage
location.org, interested parties would ensure that it exists as long as there
were value for it to continue to exist. And if there were no longer value
for it to continue to exist (cue vision of dystopian future), why would it
matter if it disappeared?

And if it ever *does* become a problem we can backpedal to a
syntax-matching URN (or just create the URN+HTTP/URL duo to begin with.)

> Putting that domain into a geoloc URI specification 
> essentially mandates that anyone looking for a location do so 
> through location.org, at least until browsers are designed so 
> that location.org URLs bypass the http pipeline altogether 
> (which is quite a large assumption).  

Putting it solely in a URN or a scheme mandates that anyone looking for a
location must have hardware or software that is knowledgeable about the URN
or scheme. How is that different from that bypass of the http pipeline you
define as being unlikely?

> Remember when 
> verisign/netsol started redirecting all unregistered domain 
> lookups to their own search/advertisement site?  It would be 
> nice to avoid enabling another similar situation.

verisign/netsol was a commercial entity, not a foundation.  I don't advocate
for a commercial entity owning location.org.

> Second, how does one verify the authority of a specification 
> declaring location.org (or any other domain) to forever and 
> always have a certain meaning?  Even if you verify that the 
> spec author is the current owner of location.org there's no 
> way that I know of to prove that they will always have the 
> right to determine how that domain is used (or bypassed) for eternity.

Sigh. I feel like I keep having to repeat myself (from former emails.)  The
foundation is the key.  Or an existing foundation with a compatible mission,
for that matter.

> The urn: or even info: URI options are the most elegant (urn 
> if you want resolution, info if not).  

INFO is newly proposed for this debate and one I had not previously
considered.  As INFO is a relatively new spec I was not as familiar with it
as I am with other concepts.  Let's take a look at what RFC4452
(http://www.ietf.org/rfc/rfc4452.txt) has to say:

   The "info" Registry provides a mechanism for the registration of
   public namespaces that are used for the identification of information
   assets and that are not part of the URI allocation.

Okay, so then we would look to decide whether location is a "public
namespace used for the identification of information assets." That might
fit location; I can't directly argue it does not.  However, let's look at
RFC2141 (http://www.ietf.org/rfc/rfc2141.txt) on URN Syntax:

   Uniform Resource Names (URNs) are intended to serve as persistent,
   location-independent, resource identifiers and are designed to make
   it easy to map other namespaces (which share the properties of URNs)
   into URN-space. Therefore, the URN syntax provides a means to encode
   character data in a form that can be sent in existing protocols,
   transcribed on most keyboards, etc.

It sounds like URN is a lot like INFO, and vice-versa?  Let's see what
RFC4452 has to say about that:

   RFC 2141 [RFC2141] states that "Uniform Resource Names (URNs) are
   intended to serve as persistent, location-independent, resource
   identifiers".  The "info" URI scheme, on the other hand, does not
   assert the persistence of the identifiers created under this scheme
   but rather of the public namespaces grandfathered under this scheme.
   It exists primarily to disclose the identity of information assets
   and to facilitate a lightweight registration mechanism for public
   namespaces of identifiers managed according to the policies and
   business models of the Namespace Authorities.  The "info" URI scheme
   is neutral with respect to identifier persistence.  

So it seems that a main distinction is persistence. Given that, it would
seem that INFO would be a poor choice for wgs84 coordinates, right, and that
URN would be better?

   Further, the "info" URI scheme is not globally dereferenceable in
   contrast to the specific recommendation given in RFC 1737,
   "Functional Requirements for Uniform Resource Names" [RFC1737] that
   "It is strongly recommended that there be a mapping between the names
   generated by each naming authority and URLs".  Individual Namespace
   Authorities registered in the "info" Registry MAY, however, disclose
   references to service mechanisms and are encouraged to do so.
   
Since INFO is by definition not known to be dereferenceable and given that I
have a strong preference for identifiers being dereferenceable I would thus
prefer to not see INFO used unless it by definition mapped to dereferenceable
URLs.  But I admit that is just my strong preference.

   An extra consideration is that the "urn" URI syntax explicitly
   excludes generic URI hierarchy by reserving the slash "/" character.
   An "info" URI, on the other hand, admits of hierarchical processing,
   while remaining neutral with respect to supporting actual hierarchy,
   and thus allows the slash "/" character (as well as more liberally
   allowing the ampersand "&" and tilde "~" characters).  It therefore
   represents a lower barrier to entry for Namespace Authorities in
   keeping with its intention of acting as a bridging mechanism to allow
   public namespaces to become part of the URI allocation.  In sum, an
   "info" URI is more widely supportive of "human transcribability" as
   discussed in RFC 3986 [RFC3986] than is a "urn" URI.
   
That distinction, on the other hand, leads me to prefer INFO over URN; at
least if these location namespaces were to be used for human-generated
namespace-scoped local place names.
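
To illustrate the slash restriction, here is a small sketch that reports
which of the characters RFC 2141 (section 2.3.2) reserves, namely "/", "?"
and "#", appear unencoded in a URN's namespace-specific string. It is not a
full RFC 2141 validator, and the example URNs are hypothetical.

```python
# Characters RFC 2141 section 2.3.2 reserves for future use in URNs.
RESERVED = set("/?#")


def reserved_chars_in_urn(urn: str) -> list:
    """Return the sorted reserved characters found in the URN's NSS."""
    # Split into "urn", the namespace identifier (NID), and the rest (NSS).
    parts = urn.split(":", 2)
    if len(parts) != 3 or parts[0].lower() != "urn" or not parts[1] or not parts[2]:
        raise ValueError("expected urn:{NID}:{NSS}")
    return sorted(set(parts[2]) & RESERVED)
```

A grid reference such as urn:location:osgb:TL4652 passes cleanly, whereas a
hierarchical place name such as urn:location:places:usa/georgia/atlanta would
need its slashes %-encoded as a URN but could be written as-is under INFO.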

   Additionally, the "urn" URI syntax does not support "fragment"
   components as does the "info" URI syntax for indirect identification
   of secondary resources.

The lack of fragment support might prove to be a significant limitation of
URN, however.

Interestingly, regarding INFO dereferenceability, RFC4452 states:

   The "info" Registry will be publicly accessible and will support
   discovery (by both humans and machines) of:

   o  string literals identifying the namespaces for which the Registry
      provides a guarantee of uniqueness and persistence
   o  names and contact information of Namespace Authorities
   o  syntax requirements for identifiers maintained in such namespaces
   o  normalization methodologies for identifiers maintained in such
      namespaces
   o  network references to a description of service mechanisms (if any)
      for identifiers maintained in such namespaces
   o  ancillary documentation

So it appears you can dereference the namespaces but not the identifiers,
although I didn't see a way for a machine to discover the information w/o
aid from a human.

> Piggy-backing http: in 
> this case is the easiest short term approach, but is 
> definitely a kludge.

What you see as a kludge I see as elegant because of the benefits that HTTP
resolution provides.  

Thinking aloud, it would be nice to have a new TLD that would be reserved
for registries, maybe ".reg".  Then we could have location.reg, and its
managing foundation could be responsible for maintaining that namespace and
resolvable registries where location.reg is just one of them. These
HTTP-dereferenceable registries could be implemented in a very lightweight
manner where they delegate the heavy lifting to other organizations whose
related activities do not conflict with the managing foundation's mission, in
much the same way that the current DNS root servers operate. In addition we
could then develop a set of technologies and protocols that would allow many
devices to bootstrap functionality off of these registries. 

Something like what I describe in the previous paragraph is the vision I'd
like to see become reality, anyway.

-- 
-Mike Schinkel
http://www.mikeschinkel.com/blogs/
http://www.welldesignedurls.org
http://atlanta-web.org 
Received on Thursday, 13 December 2007 22:37:40 UTC
