RE: Review of draft finding on URNs, Namespaces and Registries from Booth, David (HP Software - Boston) on 2006-08-15 (www-tag@w3.org from August 2006)

From: Booth, David (HP Software - Boston) <dbooth@hp.com>
Date: Tue, 15 Aug 2006 14:54:40 -0400
To: "David Orchard" <dorchard@bea.com>, <www-tag@w3.org>
Message-ID: <EBBD956B8A9002479B0C9CE9FE14A6C2F1877E@tayexc19.americas.cpqcorp.net>
Hi Dave,

Thanks for your comments.  Replies interspersed below . . .

> -----Original Message-----
> From: David Orchard [mailto:dorchard@bea.com] 
> Sent: Monday, August 14, 2006 2:52 PM
> To: Booth, David (HP Software - Boston); www-tag@w3.org
> Subject: RE: Review of draft finding on URNs, Namespaces and 
> Registries
> 
> Some comments.. 
> 
> > -----Original Message-----
> > From: www-tag-request@w3.org [mailto:www-tag-request@w3.org] 
> > On Behalf Of Booth, David (HP Software - Boston)
> > Sent: Wednesday, August 09, 2006 4:43 AM
> > To: www-tag@w3.org
> > Subject: Review of draft finding on URNs, Namespaces and Registries
> > 
> > 
> > Review of http://www.w3.org/2001/tag/doc/URNsAndRegistries-50.xml
> > 
> > General Comments:
> > 1. Great topic!  I'm glad to see the TAG addressing this.  
> > The document is well considered and makes many excellent points.  
> > 
> > 2. As a whole, the document (and Section 3 in particular) 
> > does not go far enough in discouraging the use of myRIs.  It 
> > feels like it is slashing at branches of the issue rather 
> > than cutting it down at its trunk.  As (I think) I 
> > demonstrated in http://dbooth.org/2006/urn2http/ , the 
> > capabilities of http URIs are virtually a direct superset of 
> > myRIs.  I would not go as far as to claim that I have 
> > *proved* this assertion, because there are some (in my 
> > opinion minor) ways in which they are inherently different, 
> > and hence myRIs could still be viewed as advantageous.  (And 
> > BTW I just updated the document to explicitly list the ways I 
> > could think of: 
> > http://dbooth.org/2006/urn2http/#differences .)  However, I 
> > think the simple myRI-to-http conversion recipe that I showed 
> > provides very convincing evidence.  Thus, unless someone can show
> > how my analysis is flawed, I think the TAG's advice should be: "Do
> > not create new myRIs unless you can demonstrate that those inherent
> > differences are so important to your application that they outweigh
> > the enormous installed base of HTTP URIs."
> > 
> 
> I think that the TAG is moving to a style that talks about trade-offs
> rather than straight up "don't do this or that".  But I think the point
> that we can make the trade-offs more stark is taken.
> <snip/>

That would be very helpful, in particular spelling out the circumstances
under which one should use myRIs instead of http URIs.

> 
> > 
> > 6. Sec 3 The value of http: URIs:
> > This section seems to be saying that the http scheme's 
> > "two-part approach to identifying resources" is the reason 
> > http is better than myRIs.  This does  not seem correct to 
> > me, for two reasons:
> > 
> > 	a. In the first paragraph, I don't think it's correct to say 
> > 	that http URIs use "a hierarchical syntax for distinguishing 
> > 	resources which share the same owner".  The syntax is just 
> > 	the syntax specified in RFC3986.  Whether URIs within a 
> > 	domain are treated as being hierarchical would depend on 
> > 	the policies of the domain owner, wouldn't they?  I don't
> > 	think there is inherently anything hierarchical about the
> > 	URI syntax, though it often is convenient to treat it as
> > 	hierarchical.
> 
> 2396 etc. say that the path is hierarchical.  Period.  There are rules
> for generating absolute URIs from relative and base uris, including
> replacement of ".." with parents.  

Oh, I see.  Yes, you're quite right.  I was thinking of the *mapping*
from absolute URIs to resources (which is supposed to be opaque), rather
than the URIs themselves.  I think I now realize what you intended.
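
The hierarchical treatment of the path shows up in relative reference
resolution, where ".." steps up to the parent segment.  A minimal sketch
using urljoin from Python's standard library; the example.org URIs are
made up for illustration:

```python
from urllib.parse import urljoin

# Hypothetical base URI, used only to illustrate resolution.
base = "http://example.org/docs/2006/report.html"

# ".." removes the last directory segment of the base path.
resolved_parent = urljoin(base, "../index.html")

# A bare relative reference replaces only the final path segment.
resolved_sibling = urljoin(base, "summary.html")

print(resolved_parent)
print(resolved_sibling)
```

None of this says anything about what resources the resulting URIs map
to; that mapping stays up to the URI owner.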

> 
> <snip/>
>  
> > 9. Sec 4.1:
> > The clause "but this must be on a subset of the entire set of 
> > namespace names" is confusing.  I think it should have said: 
> > "but namespace owners are not required to do so and not all do".
> > 
> 
> Ok.
> 
> > 10. Sec 4.2:
> > This section is entitled "Identification" but the discussion 
> > is all about trying to dereference a namespace URI.  Also, 
> > the discussion about whether the URI is "clickable" is a 
> > little misleading.  It seems to be implying that one should 
> > never even try to dereference the URI, to avoid wasting 5-10 
> > seconds.  But this is certainly wrong advice.  One *should* 
> > try dereferencing the URI if one is trying to learn more 
> > about it, because it *may* be dereferenceable and there *may* 
> > be authoritative, useful information available from it.
> 
> This was a bit of a tricky section to rewrite because some of the
> admonition against http: uris is that "they look dereferenceable".
> Perhaps this should be a second section, something like "the fallacious
> appearance of dereferenceability of an identifier", but I wasn't too
> sure.
> 
> Also, I think that you have somewhat missed the tone of the discussion.
> It is an honest attempt to describe the trade-offs between myRI: and
> http: uris.  In this case, a myRI will never have the wasted 5-10
> seconds of time, which is a benefit of myRIs compared to http: uris.

So being potentially dereferenceable is both a pro and a con: you might
get useful, authoritative information, but you might also waste 10
seconds trying in vain.  I find it hard to take this "con" seriously,
given that the only need for dereferencing it would be to discover
authoritative information about it if one did not already have such
information.  The alternative is that the URI (based on a
non-dereferenceable URN) is not dereferenceable at all, and thus one has
no deterministic way to find information about the URI.   

> Then I went into the benefits of dereferenceability in the
> dereferenceability section...   That make sense?

No, because the only benefit to http URIs *is* their potential
dereferenceability.  I think it makes sense to separately address the use
of the URI for identification and for dereferencing, but the lost 10
seconds only comes into play when one tries to dereference.

> 
> I do think that you have implicitly pointed out that the 4.5 summary
> section should be stronger.
> 
> 
> > 
> > 11. Sec 4.3:
> > This section needs rewriting.  The conclusions are not quite 
> > correct and the arguments are not solid.  The question of 
> > whether a URI persists as an *identifier* is the question of 
> > whether it continues to identify the same *resource* -- not 
> > whether it is dereferenceable or whether the URI owner still 
> > exists or continues to mint more URIs.  The resource 
> > identified by a URI can indeed change, though it should not.  
> > It is up to the URI owner to say what resource is associated 
> > with a particular URI.  Thus, if the organization that owns a 
> > URI changes its mind in 10 or 20 years and decides to 
> > associate a different resource with the URI, it is free to do so.
> > 
> > The points of this section *should* be: (1) that software 
> > using namespace URIs must not depend on them being 
> > dereferenceable; and (2) whether a URI continues to always 
> > identify the same resource depends on the URI owner, and in 
> > either case (urn versus http) it is Oasis in this example, 
> > thus there is no difference in persistence.
> 
> I disagree.  Much of the argument that I've heard against http URIs is
> the "what happens if the namespace owner goes away".  Now maybe there
> are 2 separate types of persistence worth discussing, and this section
> should be called "namespace owner persistence" with another section on
> "resource persistence".

Yes, I think it would be better to separately discuss two types of
persistence, but I don't think the two types should be considered
"namespace owner persistence" and "resource persistence".  Rather, the
two types are persistence as an identifier and persistence as a
dereferenceable location.  Identifier persistence is the question of
whether the URI continues to identify the same resource; dereference
persistence is the question of whether the URI continues to be
dereferenceable to useful, authoritative information.
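
Identifier use needs no network access at all.  As a small sketch, an
XML parser treats a urn: namespace name and an http: namespace name the
same way, as opaque strings to compare.  Both namespace names below are
hypothetical, chosen only for illustration:

```python
import xml.etree.ElementTree as ET

# Two made-up namespace names: one URN, one http URI.
DOC = (
    '<root xmlns:a="urn:example:ns" xmlns:b="http://example.org/ns">'
    "<a:item/><b:item/>"
    "</root>"
)

root = ET.fromstring(DOC)

# ElementTree expands each prefix to "{namespace-name}local-name".
# Nothing is fetched; the names are only used as identifiers.
tags = [child.tag for child in root]
print(tags)
```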

> 
> > 
> > 12. Sec 4.4:
> > This section also needs rewriting.  The arguments are muddled and not
> > even quite the right arguments.  The point of this section should be
> > that the Oasis URNs are not dereferenceable at all, whereas an http URI
> > *might* be dereferenceable to useful, authoritative information.  Thus,
> > one is no worse off with http URIs, and potentially much better off.
> 
> I think some rewriting is in order and I agree that the tone should be
> "potentially much better off".  However, I think that the
> dereferenceability section ties back to the context section.  Context is
> crucial for knowing how/when to dereference identifiers, and that can't
> be ignored.

I'm not entirely sure what you mean, but I think you mean that the
context in this case indicates that an application using a namespace URI
must not count on being able to dereference it (per the namespace spec).
Therefore, any useful info that is obtained from dereferencing it is
gravy.
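
One way to picture "gravy" in software is a best-effort lookup: bound the
wait with a timeout and treat every failure as "no information" rather
than as an error.  A minimal sketch, assuming Python's urllib; the
function name try_dereference, the 10-second default and the urn:example
name are illustrative choices, not anything from the finding:

```python
import http.client
import urllib.request
from typing import Optional


def try_dereference(uri: str, timeout_seconds: float = 10.0) -> Optional[bytes]:
    """Best-effort lookup: return the response body, or None on any failure."""
    try:
        with urllib.request.urlopen(uri, timeout=timeout_seconds) as response:
            return response.read()
    except (OSError, ValueError, http.client.HTTPException):
        # URLError and socket timeouts are OSError subclasses; ValueError
        # covers strings urllib cannot treat as a URL at all.
        return None


# urllib has no handler for the urn: scheme, so this returns None
# without touching the network.
info = try_dereference("urn:example:namespace")
print(info)
```

The caller carries on with whatever it already knows when the result is
None, so nothing depends on the URI being dereferenceable.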

> 
> > 
> > 13. Sec 4.5:
> > This section mentions:
> > [[
> > A provider of a identifier must specify how the identifier 
> > will be used in each specific sub-context of their XML 
> > language, whether it is intended as an identifier, a 
> > location, or both.
> > ]]
> > This is true of other names in XML, such as element and 
> > attribute names.
> > It is not true of URIs.  I think this sentence can just be deleted.
> 
> You are absolutely right that context will determine name
> identification/dereferenceability, but this section is on namespace
> names.  I could have done a section on "qname" case study, but I think
> there's close to too much in 4-6 as it is..

I don't understand what you mean.  I wasn't suggesting that you address
qnames.  The above quote in [[ ... ]] is saying that a namespace
provider must specify how that namespace will be used in each
sub-context of their XML language, right?  Why do you say this is
needed?  It seems to me that a namespace provider should specify the
semantics of the vocabulary that is in that namespace, but the
interpretation of the namespace URI itself is supposed to be independent
of context.  Isn't it?  Can you explain more what you mean?

> 
> > 
> > 14. Sec 4.5:
> > This section also mentions:
> > [[
> > Any use of an identifier, or any datatype for that matter, in 
> > an XML document has the same issues. 
> > ]]
> > Other identifiers in XML (such as element and attribute 
> > names) *do* have other, additional issues.  However, I think 
> > this is a red herring anyway.  I think this sentence can 
> > simply be deleted.
> > 
> > 15. Sec 5: Case Study: Persistent Document Location:
> > I ran out of steam on this section, but on quick reading, 
> > this section seems to go into far more detail than necessary 
> > in attempting to make the essential point, and this obscures 
> > the essential point.  For example, the first paragraph says:
> > [[
> > [XRI] observes that changing the organizational structure 
> > represented in the URI, for example to 
> > http://newdept.agency.example.org/docs/govdoc.pdf, or the 
> > path structure, for example to 
> > http://newdept.agency.example.org/documents/govdoc.pdf, 
> > breaks access.
> > ]]
> > But one obvious solution (already pointed out in Cool URIs 
> > don't change, http://www.w3.org/Provider/Style/URI ) is to 
> > not put the organizational structure in the URI.  
> 
> I agree this section is very detailed.  It is an attempt to go to the
> same level of detail as XRIs to fully complete the analysis.  I know of
> no shorter way to complete the analysis, and a criticism of earlier
> versions of the document is that it didn't prove the claims it made
> about http: vs xri:.  Any help on shortening the section yet retaining
> the analysis would be great.
> 
> I think you have missed some of the tone of the attempt here.  The TAG
> has already said "cool URIs don't change" and yet XRI and others come up
> with new schemes.  We need to follow through the XRI logic on why they
> don't have "cool URIs" to THEIR level and documentation.  Simply saying
> yet again, create cool URIs, doesn't seem that motivating to them and
> others.

How about showing them how to create equivalent http URIs and then
comparing the pros/cons of the http URIs with the pros/cons of the xri
URIs?  The recipe in http://dbooth.org/2006/urn2http/ shows one way such
http URIs can be created.  I think this would simplify the comparison.
For example, the comparison could then be between

	xri://@example.org*agency*department/docs/govdoc.pdf

and

	http://xri.org?@example.org*agency*department/docs/govdoc.pdf
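
The pair above suggests a purely mechanical mapping.  A rough sketch,
assuming the conversion is nothing more than the prefix swap those two
URIs show (the full recipe at http://dbooth.org/2006/urn2http/ may
specify more); the function names are made up for illustration:

```python
XRI_PREFIX = "xri://"
HTTP_PREFIX = "http://xri.org?"  # prefix taken from the example pair above


def xri_to_http(xri: str) -> str:
    """Swap the xri:// prefix for the http prefix, keeping the rest as-is."""
    if not xri.startswith(XRI_PREFIX):
        raise ValueError(f"not an xri:// identifier: {xri!r}")
    return HTTP_PREFIX + xri[len(XRI_PREFIX):]


def http_to_xri(uri: str) -> str:
    """Inverse mapping, showing that the swap loses no information."""
    if not uri.startswith(HTTP_PREFIX):
        raise ValueError(f"not a converted xri identifier: {uri!r}")
    return XRI_PREFIX + uri[len(HTTP_PREFIX):]


original = "xri://@example.org*agency*department/docs/govdoc.pdf"
converted = xri_to_http(original)
print(converted)
```

Because the mapping round-trips, either form can serve as the
identifier, and the comparison reduces to what each scheme offers
beyond identification.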

> 
> In general, I feel like we haven't quite connected on the section 4-6
> material.  
> 
> Cheers,
> Dave
> 

David Booth

P.S.  Thanks for your work on this!
Received on Tuesday, 15 August 2006 18:55:02 UTC