RE: [XRI] TAG recommendation from Williams, Stuart (HP Labs, Bristol) on 2008-07-22 (www-tag@w3.org from July 2008)

From: Williams, Stuart (HP Labs, Bristol) <skw@hp.com>
Date: Tue, 22 Jul 2008 14:54:42 +0000
To: John Bradley <john.bradley@wingaa.com>
CC: "Schleiff, Marty" <marty.schleiff@boeing.com>, "www-tag@w3.org" <www-tag@w3.org>
Message-ID: <233101CD2D78D64E8C6691E90030E5C816BA2F7ABF@GVW1120EXC.americas.hpqcorp.net>
Hello John,

[By all means read all the way down... but I've placed an <xxxx> marker at
the point where I think my response is more interesting.]
 
> -----Original Message-----
> From: John Bradley [mailto:john.bradley@wingaa.com] 
> Sent: 22 July 2008 06:26
> To: Williams, Stuart (HP Labs, Bristol)
> Cc: Schleiff, Marty; www-tag@w3.org
> Subject: Re: [XRI] TAG recommendation
> 
> Hi Stuart,
> 

<snip/>

> The Syntax spec defines abstract identifiers  iNames.
> 
> These identifiers have two major components:
> 1. An authority segment
> 2. A path segment

Ok...

> These are patterned after IRI, and if preceded by an IRI/URI scheme  
> identifier they would be IRI/URI.

Ok... actually, as defined there, it seems to me that all XRI are IRI (i.e.
they fall in the subset of strings that are admissible as IRI). I haven't
looked closely at the delimiter issues - but I think that the design intent
would be that XRI are IRI... so that for instance at line 370 of the PDF
version of the syntax spec:

	xref = "(" ( XRI-reference / IRI ) ")"

could be replaced by:

	xref = "(" IRI ")"

(as long as one didn't allow the xri:// scheme prefix (in that spec) to be
optional).

> The XRI-TC attempted to make XRI syntax IRI compliant.
> 
> I fear there lurks in our discussion a fundamental difference of  
> opinion regarding the relative roles of IRI vs http: scheme URI.
> The XRI-TC and others still hold to TBL's 1996 Axioms 
> http://www.w3.org/DesignIssues/Axioms.html.

I don't think so... least ways, if there is something like that lurking I
have no idea what it is.

> The TAG in 2006 partially in response to XRI, published the much  
> quoted "URNs, Namespaces and Registries" 
> http://www.w3.org/2001/tag/doc/URNsAndRegistries-50.html.

That's a work in progress with a long standing action for an update. 

FWIW, I have lodged comments [2] wrt what I think remains the current draft
ie. <http://www.w3.org/2001/tag/doc/URNsAndRegistries-50-2006-08-17>

[2] http://lists.w3.org/Archives/Public/www-tag/2007Mar/0036

> This newer document sets forth the primacy of the http: scheme above  
> all others.
> 
> This W3C document was published after XRI 2.0 syntax was adopted by  
> the TC.

It is work in progress - a draft - it does not command the consensus of the
TAG - it says so in the "Status of This Document" section at the beginning
of the document.

> Syntax also covers elements that are opaque to the resolver for use in  
> cross references and XDI.
> 
> I think that David Orchard and David Booth have both made some  
> compelling arguments as to why it would be preferable for XRI not to  
> register a URI/IRI scheme.   The XRI-TC has stated that they are  
> willing to consider the change if a suitable alternative for  
> identifying XRI can be agreed upon.  I find the XML name-spacing  
> argument one the most convincing personally.
> 
> Now lets consider resolution.
> 
> XRI is about resolving an abstract XRI to discover an XML description  
> for a desired service.
> 
> It is not a replacement for http:, it is much closer to DNS.
> 
> Like DNS you perform resolution of an identifier to retrieve metadata  
> for an application to use.
> 
> In DNS you have record types that you look for MX, TXT, SRV,  and A  
> amongst others.
> 
> So if I resolve www.w3.org selecting for an A record I get an IP address.
> If I resolve the same name looking for a MX record I may get a different
> IP address.
> If I resolve it looking for _xmpp-server._tcp.w3.org SRV record and  
> get back an IP address and a port.
> 
> I will also note that multiple domain names can map to a single IP or  
> conversely the same DNS name can map to multiple IP addresses.
> 
> I would say that this is the foundation that XRI builds on not http:  
> from a resolution perspective.  True the syntax was taken from URI and  
> that perhaps misdirects people.

It is a facet of http: that any kind of URI may be submitted on the request
line of an HTTP request.

If 
	<a href="xri://<whatever>"/>

where <whatever> stays in bounds wrt generic URI/IRI syntax, were to show
up in some markup delivered to a browser, the related URI form (IRI->URI
transformation) will likely find its way into the request line of an http
request. I don't think that's a question of perception wrt a resolution
mechanism that operates *over* http. The request line may make it to a
proxy that coordinates resolution using the XRI resolution protocol or, I
suppose, it 404's because the proxy has no idea how to resolve the
identifier.

What of:

	<a href="http://xri.net/<whatever>">

One assumes that that is a direct appeal to xri.net to resolve "<whatever>"
as 'xri-hier-part ["?" query]' - or, using the approach that you and David
discussed, to regard ^http://xri.net/ as a trigger to a local resolver.
(xri:// in the earlier example would serve similarly as a trigger for a
local resolver).
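
As a rough illustration of the "trigger" idea, here is a minimal sketch in
Python. The function name and the choice of exactly these two prefixes are
assumptions made for the example, not something defined by the XRI specs;
it only shows a client recognising either form and handing the remainder to
a (hypothetical) local resolver.

```python
# Hypothetical sketch: treat "xri://" and "http://xri.net/" as triggers for a
# local XRI resolver. Names and prefixes are illustrative assumptions only.
from typing import Optional

TRIGGER_PREFIXES = ("xri://", "http://xri.net/")


def extract_xri(href: str) -> Optional[str]:
    """Return the text after a trigger prefix, or None if there is no trigger."""
    lowered = href.lower()  # scheme and host are case-insensitive
    for prefix in TRIGGER_PREFIXES:
        if lowered.startswith(prefix):
            return href[len(prefix):]
    return None
```

With this, both "xri://=jbradley" and "http://xri.net/=jbradley" yield the
same string for local resolution, while any other href is left to ordinary
http handling.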

Basically *if* XRI are going to show up in places where URI/IRI are
expected... then they should be URI/IRI things.

> The XRI-TC has used XML to create an extensible resolution protocol  
> for resolving XRI identifiers to entities in an XRI entity space.   
> Part of the resolution process uses the XRI path and associated XRI  
> query parameters to perform this resolution.

We've got two things going on here... We have a primary identifier that
we're trying to resolve and we have a process by which it becomes resolved
(or not). The latter also happens to use a whole bunch of identifiers to
perform the resolution (that perhaps bear some syntactic relation to the
original one... but embellished with parameters to drive through the
resolution process).

The questions that I have been asking have not been about the process of
resolution, however that is accomplished, but about the intended denotation
of the original identifier - the one that is being resolved.

> This is much like DNS only extensible.
> 
> The XRD document retrieved has a unique address conventionally  
> referred to as an iNumber,  or canonical ID element.   Many INames can  
> resolve to the same XRDS document  identified by the same CID element.

This risks a sidebar into the intended relations between iName, iNumber
and the corresponding XRDS document... but I think that would take us off
track at present - just to leave as a question what the intended denotation
of an iNumber would be and how that relates to an XRDS document (eg. the
iNumber may be the single canonical identifier for that document - hmmm...
haven't yet understood how iNumbers are scoped/allocated eg. global scope
regardless of segment position, or scoped by segment path hierarchy).

> 
> As a side issue iNumbers are by policy persistent and non reasonable,

What is it that is regarded as persistent? The association between the
identifier and the 'thing' it denotes; or that the referenced thing is
'immutable'. I hear the word persistence being applied to both notions. 

> However a CID is not required to be persistent,  it is however  
> strongly recommended in the spec.   iNumbers are similar in form to a  
> IPv6 address so there is a sufficient quantity to avoid reuse.
> 
> Depending on the query parameters all of the XRDS or some specified  
> subset is returned to the requesting application.
> 
> The Service Endpoint requested may contain URI and or other XML  
> elements describing the service. It may describe an SCP service  
> accessed over SS7.  There may be no URI resource associated 
> with a SEP  
> in some cases.
> 
> The only thing native XRI resolution has to do with http:  is a  
> http(s) binding much like soap and other protocols use to tunnel  
> through http:.  XRI is capable of having bindings for a number of  
> transports.
> 
> I want to make this very clear,  with or without a URI/IRI scheme for  
> XRI Syntax.  XRI resolution will not give up native XRI resolution.    
> The one spec is called resolution for a very good reason.
>
> So XRI is a Syntax that is in IRI form and a resolution protocol that  
> is more like DNS than http:
> 
> I simply point to the fact that in http: the authority resolution  
> makes no reference to the path, query string or any other parameter.    
> This is fundamentally different from XRI where they are resolved as a  
> whole,  much like how DNS uses both the host name and record type to  
> resolve a query.
> 
> Now we get into a grey area around a http: compatibility mode that the  
> W3C recommended we adopt  sometime around 2006.   This came from David  
> Booth's work  "Four Uses of a URL: Name, Concept, Web Location, and  
> Document Instance"
> http://www.w3.org/2002/11/dbooth-names/dbooth-names_clean.htm
> http://www.w3.org/2002/11/dbooth-names/dbooth-rfc2396-analysis_clean.htm
> http://www.w3.org/2002/11/dbooth-names/rfc2396-numbered_clean.htm
> 
> David has been quite helpful in recommending ways that XRI resources  
> can be best interpreted by non XRI aware applications through 
> the use of a gateway or proxy resolver.

<xxxx>

> 
> The HXRI form of XRI and the XRI proxy resolver gateway were created  
> based on W3C feedback and the reason why the resolution spec took much  
> longer to complete after syntax was done.

     denotes
 +------------------> <something>
 |
 |   +----XRI resolution--------+
 |   V                          V
 |
 |             +-----------+
 |             V           |
XRI -----> *(HXRI) ----> XRDS document (finally describing seps related to <something>)
               |
               +-------> awww:representation of <something>

This little bit of ascii art is intended to show some relationships between
some things to test my understanding.

An XRI is used to denote something (that thing is the thing I mean when I
ask what does a given XRI refer to).

At least in this diagram HXRIs are the http: scheme URIs that arise in the
process of (proxy?) XRI resolution.

XRI resolution terminates either with an XRDS document being returned to the
(XRI resolution) client or a representation of the denoted resource. The '*'
is intended to indicate that several HXRIs may be generated during a given
resolution episode, each yielding an XRDS document which may either be
returned to the client or used to resolve the next item down a path.

When you speak of HXRIs are you speaking only of this HTTP URI form used
(sort of under the covers) in the process of XRI resolution, or is that form
also used when speaking of the XRI that is being resolved?

> The XRI proxy resolver differentiates between clients that are XRI  
> aware and using the proxy as a convenience function rather than  
> directly doing XRI resolution (a number of openID libraries do this),   
> and a request coming from a non XRI aware client like a web browser.

On what basis (in terms of the protocol interaction) is that distinction
made? Specifically, is it 'visible' in the request URI exchanged, or is it
'hidden' in an Accept: request header (or some other mechanism)?

*If* the distinction is made in the URI (eg, the addition of a parameter),
then we have distinct identifiers (URI) one for <something> (above) and one
for the final XRDS document (and maybe others for any intermediate XRDS
documents).

*If* HTTP content negotiation (ie Accept: request headers and
Content-Location: response headers) is used to make the distinction between
'data' and 'metadata' retrieval for the *same* request URI... that goes
against web architecture and there are several ways to avoid it.

1) Regard the XRI as denoting some resource (<something> above); if XRI
resolution is going to return metadata, use an intervening 303 redirect to
supply a distinct URI for the XRDS document, which is then retrieved in a
second HTTP operation.

2) There is some current revival of the HTTP Link: header. Use the HTTP
Link: response header to provide a reference to the XRDS document for both
HTTP GET and HEAD operations. 

Bottom-line, only use a 200 response along with 'bits' *if* those bits are an
awww:representation of the thing referenced by the request URI.
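
To make the two options concrete, here is a minimal sketch in Python of what
a proxy might return in each case. It is only an illustration: the "/xrds"
path prefix, the function names and the rel="describedby" link relation are
assumptions invented for the example and are not taken from the XRI specs.

```python
# Hypothetical sketch only: the "/xrds" path prefix, the function names and
# the rel="describedby" value are assumptions, not taken from the XRI specs.
from typing import Dict, Tuple
from urllib.parse import urlsplit, urlunsplit

Response = Tuple[int, Dict[str, str], bytes]  # (status, headers, body)


def xrds_uri_for(resource_uri: str) -> str:
    """Build a distinct URI for the XRDS (metadata) document."""
    parts = urlsplit(resource_uri)
    return urlunsplit(
        (parts.scheme, parts.netloc, "/xrds" + parts.path, parts.query, "")
    )


def respond_with_303(resource_uri: str) -> Response:
    """Option 1: send no body; redirect the client to the metadata URI."""
    return 303, {"Location": xrds_uri_for(resource_uri)}, b""


def respond_with_link(resource_uri: str, representation: bytes) -> Response:
    """Option 2: 200 with a representation, plus a Link: header to the metadata."""
    link = '<%s>; rel="describedby"' % xrds_uri_for(resource_uri)
    return 200, {"Link": link}, representation
```

In both cases the XRDS document ends up with its own URI, so a 200 response
to the original request URI is only ever sent with a representation of the
thing that URI denotes.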

> In the first case http://xri.net/=jbradley is an abstract identifier  
> that is resolved to a CID and XML meta data.
> 
> In the second case http://xri.net/=jbradley is treated as a browsers  
> request for a concrete URI resource.
> 
> So yes the result depends on the context in which the resolution occurs.

Ok... see above (just)...

A little more detail on whether there are intervening redirections might
make it clearer whether we have a problem or not.

> If it is better web architecture we could register a URI/IRI scheme so  
> that it is always clear that xri://=jbradley  always returns metadata  
> about =jbradley rather than http://xri.net/=jbradley which is looking  
> for a concrete web resource.

Hmmm.... sounds like a camel :-)

I think Marty (and the example I started with earlier in this thread) made
it clear that the identifier was intended to refer to a thing (I picked the
XRI resolution spec.) rather than metadata about the thing.

Using redirects to provide a second URI for the metadata would make
things clear.

> If we are asked to overload the semantics of http:  then it is unfair  
> to blame us for the ambiguity that causes.

/me not attributing blame! /me trying to improve my understanding of XRI/XRI
resolution and expose bits of the minutiae that at least some of the TAG care
about... plus... it is possible to avoid the ambiguity.

> There are a number of changes we have discussed that would need to be  
> made to HXRI and proxy resolution if we don't use a URI/IRI scheme.

Some suggestions above.

> If David or others can come up with a way to disambiguate the  
> overloading caused by scheme reuse,  I am happy to work with them to  
> incorporate it,  in the revised specs.

As above...

> 
> Best Regards
> John Bradley
> OASIS IDTRUST-SC
> http://xri.net/=jbradley
> 五里霧中
> 

BR,

Stuart
--
Hewlett-Packard Limited registered Office: Cain Road, Bracknell, Berks RG12 1HN
Registered No: 690597 England
Received on Tuesday, 22 July 2008 14:56:58 UTC