RE: [XRI] How can http: URIs meet URN requirements? from Williams, Stuart (HP Labs, Bristol) on 2008-08-22 (www-tag@w3.org from August 2008)

From: Williams, Stuart (HP Labs, Bristol) <skw@hp.com>
Date: Fri, 22 Aug 2008 09:42:35 +0000
To: Drummond Reed <drummond.reed@cordance.net>, "www-tag@w3.org" <www-tag@w3.org>
Message-ID: <233101CD2D78D64E8C6691E90030E5C818196A7352@GVW1120EXC.americas.hpqcorp.net>
Hello Drummond,


________________________________
From: www-tag-request@w3.org [mailto:www-tag-request@w3.org] On Behalf Of Drummond Reed
Sent: 21 August 2008 21:13
To: www-tag@w3.org
Subject: [XRI] How can http: URIs meet URN requirements?

Although the ability to support persistent identifiers is just one aspect of XRI architecture (and should NOT be confused as being the only feature or even the most important feature), the recent discussions about URNs, XRIs, and persistence brought to light some of the original requirements that the XRI TC began its work around in 2003. These requirements have since been validated in the market because some of the most visible deployments of XRIs are using them explicitly (e.g., XRI CanonicalIDs for security protection in OpenID Authentication 2.0 [1] [2], and the use of persistent XRIs for persistent identification and cross-domain data sharing in the Higgins identity framework [3]).

In particular, we identified two functional requirements that XRIs had in common with URNs:

1) To be able to express a fully persistent identifier as defined in section 2 of RFC 1737, Functional Requirements for URNs [4]:

     Persistence: It is intended that the lifetime of a URN be
     permanent.  That is, the URN will be globally unique forever, and
     may well be used as a reference to a resource well beyond the
     lifetime of the resource it identifies or of any naming authority
     involved in the assignment of its name.

I know this is going to sound like I'm quibbling... but what is it that is distinctly persistent? A URI/URN... is inherently persistent - it is a literal string. "http://www.w3.org/" as a URI is persistent and will always be "http://www.w3.org/" regardless of what that URI refers to. So, when folks speak of persistent URIs/URNs I repeatedly find myself asking the question: what is it that is persistent? The lines above tell me that "the URN will be globally unique forever..." but that would be true of all URIs, so I'm sure there is more that you take as implicit but have not said here. E.g., "...and may well be used as a reference to a resource well beyond the lifetime of the resource it identifies or of any naming authority involved in the assignment of its name." may be intended to convey the notion that the resource referred to by a given persistent identifier is invariant - but it does not actually say that.

Responses from John suggest that what is invariant (in XRI) is the relationship between the persistent identifier and the authority that administers the corresponding XRD document. It may be that the obligation on that authority leads to other, possibly intentional, invariances, e.g. in the relationship between a persistent identifier and the resource that it is intended to refer to.

I know I'm being anal... but I think that persistence goals need to be stated very carefully and clearly and I repeatedly find that they are not.

2) To be able to recognize a fully persistent XRI purely by inspection, i.e., without requiring resolution of any kind.

URNs (RFC 2141, [5]) meet both these requirements very easily:

1) All identifiers using the URN scheme are required to be persistent by definition.
2) All URNs can be unambiguously recognized purely by the urn: scheme prefix.

While with XRIs it isn't quite that simple because XRIs by definition encompass both persistent and reassignable abstract identifiers (or any combination of persistent and reassignable subsegments), XRIs as defined in XRI Syntax 2.0 [6] still satisfy the same two requirements in essentially the same way:

1) An XRI is fully persistent if it consists entirely of persistent subsegments (subsegments that are delimited with the ! character).
2) All XRIs in URI normal form can be unambiguously recognized by the xri: scheme.
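
As a rough illustration of what "recognizable purely by inspection" means for the two schemes, here is a minimal Python sketch. It is not a conforming parser for RFC 2141 or XRI Syntax 2.0: the function names are hypothetical, the example identifiers are made up, and the XRI check assumes a simplified grammar (a leading global context symbol from `=@+$`, then only `!`-delimited subsegments) that ignores cross-references in parentheses, queries, and fragments.

```python
import re

# One or more "!"-delimited subsegments and nothing else (simplifying assumption).
_PERSISTENT_SEGMENT = re.compile(r"(![^!*]+)+")

def is_urn(identifier: str) -> bool:
    """True if the string carries the urn: scheme prefix (case-insensitive)."""
    return identifier[:4].lower() == "urn:"

def is_fully_persistent_xri(identifier: str) -> bool:
    """True if every subsegment of an xri: URI is '!'-delimited (simplified)."""
    scheme = "xri://"
    if not identifier.lower().startswith(scheme):
        return False
    body = identifier[len(scheme):]
    if "(" in body or ")" in body:
        return False  # cross-references are out of scope for this sketch
    segments = body.split("/")
    authority = segments[0]
    # Assumption: the authority starts with a global context symbol.
    if not authority or authority[0] not in "=@+$":
        return False
    if not _PERSISTENT_SEGMENT.fullmatch(authority[1:]):
        return False
    # Every non-empty path segment must also be fully persistent.
    return all(_PERSISTENT_SEGMENT.fullmatch(s) for s in segments[1:] if s)

print(is_urn("URN:isbn:0451450523"))
print(is_fully_persistent_xri("xri://@!1000!2000/!3"))
print(is_fully_persistent_xri("xri://@!1000*staff"))
```

Neither check performs resolution; both look only at the characters of the identifier, which is the property the two requirements above ask for.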


We have long said that such an XRI is functionally a URN. So here's the question: the TAG asserted (back during the OASIS vote on XRI 2.0 in May) that "We are not satisfied that XRIs provide functionality not readily available from http: URIs." [7] However the XRI TC has discussed this extensively and we do not understand how http: URIs can meet these two requirements. Our logic is not complex:

1) The http: scheme does not itself require all http: URIs to be persistent.
2) The http: scheme does not define any syntax for indicating persistence of a particular http: URI.

Therefore, if an http: identifier is to serve the same function as a URN, and this quality is to be recognizable purely by inspection, it must be done with some additional semantics beyond the scope of the http: scheme.

Are we missing something?

I think that the 'so-called' Booth/Bradley proposal gives you a way to do that, based on a commitment from the administrative authority for a given domain name used as a prefix - xri.net IIRC - not dissimilar to the commitment behind purl.org.

http://purl.oclc.org/docs/inet96.html states: "It is important to note that persistence is a function of organizations, not technology." and I think that remains true for XRI as much as for any other form of identifier.

Note that this should NOT be interpreted as saying that such semantics cannot be added to http: URIs. Indeed, that's what the XRI TC did in defining the HXRI (HTTP XRI) format for XRIs - see section 11.2 of [8]. In discussions on this list John Bradley has coined the term "http: subscheme" to describe this ability to do URI-scheme-to-http:-URI mapping. I believe all XRI TC members are strong supporters of XRI-to-http:-URI mapping because it makes sure all XRI-addressable resources can be fully exposed to and integrated with the http: information space. We'd like to work with the TAG to do it in the most standardized fashion possible.
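
To make the idea of an XRI-to-http: mapping concrete, the following Python sketch shows one way a prefix-based mapping could work. It is an assumption-laden illustration, not the HXRI rules from section 11.2 of [8]: the function names are hypothetical, the prefix `http://xri.net/` is taken from the domain mentioned in this thread, and the mapping is assumed to be plain prefix substitution with no percent-encoding or normalization.

```python
from typing import Optional

# Assumption: a single well-known prefix whose authority commits to the mapping.
PROXY_PREFIX = "http://xri.net/"
XRI_SCHEME = "xri://"

def xri_to_http(xri: str, proxy: str = PROXY_PREFIX) -> str:
    """Map an xri: URI to an http: URI by swapping the scheme for the prefix."""
    if not xri.lower().startswith(XRI_SCHEME):
        raise ValueError("expected an XRI in URI normal form (xri://...)")
    return proxy + xri[len(XRI_SCHEME):]

def http_to_xri(uri: str, proxy: str = PROXY_PREFIX) -> Optional[str]:
    """Recover the XRI from a mapped http: URI, or None if the prefix differs."""
    if not uri.startswith(proxy):
        return None
    return XRI_SCHEME + uri[len(proxy):]

mapped = xri_to_http("xri://@!1000!2000")
print(mapped)
print(http_to_xri(mapped))
print(http_to_xri("http://www.w3.org/"))
```

Under this sketch, recognizing a mapped identifier by inspection depends on knowing the prefix and trusting the commitment of the authority behind it, rather than on anything in the http: scheme itself.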

However we believe the URN requirement alone shows why we also need the functionality of the xri: scheme. It appears this type of requirement was specifically anticipated in section 1.1 of RFC 3986 [9] as it explains the rationale for URIs and different URI schemes:

   This specification does not place any limits on the nature of a
   resource, the reasons why an application might seek to refer to a
   resource, or the kinds of systems that might use URIs for the sake of
   identifying resources.  This specification does not require that a
   URI persists in identifying the same resource over time, though that
   is a common goal of all URI schemes.  Nevertheless, nothing in this
   specification prevents an application from limiting itself to
   particular types of resources, or to a subset of URIs that maintains
   characteristics desired by that application.

We invite the TAG's thoughts on this topic.

=Drummond

[1] http://openid.net/specs/openid-authentication-2_0.html
[2] http://middleware.internet2.edu/idtrust/2008/papers/01-reed-openid-xri-xrds.pdf
[3] http://www.eclipse.org/higgins/
[4] http://www.w3.org/Addressing/rfc1737.txt
[5] http://www.ietf.org/rfc/rfc2141.txt
[6] http://docs.oasis-open.org/xri/xri-syntax/2.0/specs/cs01/xri-syntax-V2.0-cs.html
[7] http://lists.w3.org/Archives/Public/www-tag/2008May/0078
[8] http://docs.oasis-open.org/xri/2.0/specs/xri-resolution-V2.0.html
[9] http://www.ietf.org/rfc/rfc3986.txt

Regards

Stuart
--
Hewlett-Packard Limited registered Office: Cain Road, Bracknell, Berks RG12 1HN
Registered No: 690597 England
Received on Friday, 22 August 2008 09:45:22 UTC