Re: Boeing XRI Use Cases from Paul Prescod on 2008-07-17 (www-tag@w3.org from July 2008)

From: Paul Prescod <paul@prescod.net>
Date: Thu, 17 Jul 2008 11:07:20 -0700
To: "Ray Denenberg, Library of Congress" <rden@loc.gov>
Cc: www-tag@w3.org
Message-ID: <1cb725390807171107v73e736fer62e30112ca76fec7@mail.gmail.com>

On Thu, Jul 17, 2008 at 6:59 AM, Ray Denenberg, Library of Congress <rden@loc.gov> wrote:

>
> From: "Paul Prescod" <paul@prescod.net>
> > Let me make a concrete proposal.
> > Could the W3C (the TAG? Or someone else?) issue a recommendation to the
> > effect that URIs of the following form are special:
> > http://xri.example.org/SOMETHING:/@boeing*jbradley/+home/+phone
> .....
> > Once the W3C had issued such a recommendation, the chances of someone
> > minting these URIs by accident would drop
>
> But the problem isn't the risk of someone minting these URIs *after the
> fact* (accidentally or otherwise).  The problem with this approach,
> registering a reserved string for the first URI path component, is the
> possibility that that string is already used.  It's not simply a matter of
> telling everyone in the world "don't ever use this string as the first path
> component of any URI you ever mint in the future".   Rather, you're telling
> everyone they'll have to change every such existing URI. I'm sure nobody is
> contemplating that, so what it means is finding some unique string that
> nobody in the world has ever used (in that part of a URI).  How do you go
> about that? (And not just one - SOMETHING will only be the first, someone
> will subsequently want SOMETHINGELSE, then ANOTHERTHING, and so on.)
>

There are two ways to approach it. One could amend the URL syntax so that
the new identifiers were not previously valid URLs (that's not my proposal,
but it is implicitly on the table).

OR one could say that it is sufficiently safe: it is very, very unlikely
that there exist names in the wild that match the pattern, AND it is EVEN
MORE unlikely that a wrong interpretation would result in a serious and
hard-to-correct error. Furthermore, in a previous message I
proposed that the W3C could approach people with massive storehouses of URLs
(Google, Wikipedia, Yahoo, Open Directory Project) and just ask them to do
the moral equivalent of a "grep" to see whether they know of any false
positives (especially systematic ones...generated by some obscure CMS or
something). It is my personal opinion that this level of rigour would reduce
the breakage far below that generally associated with new specifications
(e.g. XML 1.1, new versions of HTML which grab extra tag names, C APIs that
may have name clashes, etc.). The web development world is a messy, not
mathematically pure, place.
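
To make the "grep" idea concrete, here is a minimal sketch of the kind of
check a URL archive could run. It assumes the reserved marker is the first
path segment and uses the placeholder "SOMETHING:" from the example URI
above; the function names and the sample URLs are hypothetical, and a real
archive would run something equivalent over its own index.

```python
from collections import Counter
from urllib.parse import urlsplit

# Placeholder taken from the example URI in this thread; the real reserved
# string has not been chosen, so this value is an assumption.
RESERVED_SEGMENT = "SOMETHING:"


def first_path_segment(url: str) -> str:
    """Return the first path segment of a URL, or '' if there is none."""
    segments = urlsplit(url).path.split("/")
    # A path such as "/a/b" splits into ["", "a", "b"].
    return segments[1] if len(segments) > 1 else ""


def find_matches(urls):
    """Return the URLs whose first path segment is the reserved string."""
    return [u for u in urls if first_path_segment(u) == RESERVED_SEGMENT]


def matches_by_host(urls):
    """Count matches per host, to spot systematic sources such as a CMS."""
    return Counter(urlsplit(u).hostname for u in find_matches(urls))


if __name__ == "__main__":
    sample = [
        "http://xri.example.org/SOMETHING:/@boeing*jbradley/+home/+phone",
        "http://example.com/a/SOMETHING:/b",  # marker not in first segment
        "http://cms.example.net/SOMETHING:/page1",
        "http://cms.example.net/SOMETHING:/page2",
    ]
    for host, count in matches_by_host(sample).items():
        print(host, count)
```

A host that shows up many times in the per-host count is the "systematic"
case worth investigating; a handful of isolated hits is the ordinary
messiness I am arguing we can live with.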

I think it is fairly common for a standards body to realize after the fact
that it hasn't left itself a hook for extra standardization and to need to
grab some of the namespace to do that. Even more so when there are multiple
standards bodies involved as in this case. For example, if anyone in the
SGML world had used a processing instruction called XML ("Extended Meta
Linguistics"), they would probably see a bunch of software start
misinterpreting their documents after XML was invented. But I have never
heard of that causing a real-world problem.

 Paul Prescod

Received on Thursday, 17 July 2008 18:07:58 UTC