RE: LRDD Update (Resource Descriptor Discovery) and Proposed Changes from Williams, Stuart (HP Labs, Bristol) on 2009-07-01 (www-tag@w3.org from July 2009)

From: Williams, Stuart (HP Labs, Bristol) <skw@hp.com>
Date: Wed, 1 Jul 2009 11:06:07 +0000
To: Xiaoshu Wang <wangxiao@musc.edu>
CC: Jonathan Rees <jar@creativecommons.org>, "www-tag@w3.org" <www-tag@w3.org>
Message-ID: <233101CD2D78D64E8C6691E90030E5C832D3318826@GVW1120EXC.americas.hpqcorp.net>
Xiaoshu,

> -----Original Message-----
> From: Xiaoshu Wang [mailto:wangxiao@musc.edu] 
> Sent: 30 June 2009 14:48
> To: Williams, Stuart (HP Labs, Bristol)
> Cc: Jonathan Rees; www-tag@w3.org
> Subject: Re: LRDD Update (Resource Descriptor Discovery) and 
> Proposed Changes
> 
> Stuart,
> 
> I will answer in one block instead of paragraph by paragraph. I think 
> it will be more cohesive.
> 
> The Magna Carta example that Jonathan gave yesterday made me to think: 
> what went wrong for this thinking "If I link to wikipedia, I want you to 
> go to the wikipedia article, darnit, not to the Magna Carta."
> 
> I think it is caused by the duality of a URI for being both URN and 
> URL.  Here, URN is used different from other URN, such as LSID etc., I 
> use the URN in an absolute sense that it is independent of any 
> transportation protocol. If people understand this duality and knows 
> when they are using it in which sense, I don't think that there will be 
> any such confusion.  Say, let
> 
> //en.wikipedia.org/wiki/Magna_Carta = Mogna_Carta
> 
> Then, the wikipedia article would be denoted by 
> "http:Mogna_Carta".  I doubt there will any ambiguity.
> 
> Then, this comes back to the definition of Information Resource, with a 
> URN, wouldn't IR be the set of all URL's referent?

By Fielding's writings, a resource is modelled as a function from time to [(sets of equivalent awww:representations) OR (URI)].
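
A rough sketch of that model in Python, purely for illustration. The type names, the example resource, the cut-over time and the example.org URI are all my own assumptions, not anything taken from Fielding's text:

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet, Union


@dataclass(frozen=True)
class Representation:
    """Hypothetical stand-in for an awww:representation."""
    media_type: str
    body: bytes


@dataclass(frozen=True)
class OtherURI:
    """Hypothetical stand-in for the 'OR (URI)' branch of the mapping."""
    uri: str


# At any instant a resource maps either to a set of equivalent
# representations or to a URI.
ResourceValue = Union[FrozenSet[Representation], OtherURI]
Resource = Callable[[float], ResourceValue]


def example_resource(t: float) -> ResourceValue:
    """Illustrative resource whose value changes at an arbitrary time t = 100."""
    if t < 100.0:
        return frozenset({
            Representation("text/plain", b"draft"),
            Representation("text/html", b"<p>draft</p>"),
        })
    return OtherURI("http://example.org/new-home")
```

The resource is the function itself, not any one of the values it returns at a particular time.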

> If we have not taken 
> IR as *representation*, would this (the duality of URI as name and 
> locator) be the cause? 

But... we have *not* taken "IR as *representation*"....

Just think of the web as a great big machine that answers (amongst other things) "get" questions of the form "Please 'get' me a (current) awww:representation *of* the 'thing' named with this 'uri'"

Responses that you might get back are of the following kinds (non-exhaustive):

	- an awww:representation of the 'thing'
	- advice that the 'thing' has another name (temporary or permanent)
	- advice that more information *about* the requested 'thing' *may* be available by asking the 'get question' of a different 'thing'.
	- advice that the requested 'thing' cannot be found/accessed...
	- something more catastrophic.

The point is that in web architecture the 'thing' referred to by the 'uri' in the question, and the 'thing' that the awww:representation (if any) is an awww:representation of are supposed to be the *same* 'thing'.
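
To make the kinds of answer to a 'get' question concrete, here is a minimal sketch in Python. The enum, the function name and the mapping from HTTP status codes are my own rough assumptions for illustration (in particular, reading 303 as the "more information may be available elsewhere" case); it is not an exhaustive or normative classification:

```python
from enum import Enum


class Answer(Enum):
    REPRESENTATION = "an awww:representation of the 'thing'"
    OTHER_NAME = "the 'thing' has another name"
    ASK_ELSEWHERE = "more information may be available from a different 'thing'"
    UNAVAILABLE = "the 'thing' cannot be found/accessed"
    CATASTROPHE = "something more catastrophic"


def classify(status: int) -> Answer:
    """Map an HTTP status code onto one of the kinds of answer (assumed mapping)."""
    if 200 <= status < 300:
        return Answer.REPRESENTATION
    if status in (301, 302, 307):
        return Answer.OTHER_NAME
    if status == 303:
        return Answer.ASK_ELSEWHERE
    if 400 <= status < 500:
        return Answer.UNAVAILABLE
    if 500 <= status < 600:
        return Answer.CATASTROPHE
    raise ValueError(f"status {status} is outside this illustrative mapping")
```

Only the first kind of answer carries an awww:representation, and that representation is of the very 'thing' named in the question.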

> Now, let's go back to Pat's question, how a far away galaxy can be 
> connected to the Web?  Say Orion (let's make it a schemeless URN 
> again).  Then, you chose your information path by selecting your 
> transportation protocol.  When you choose "http", your information 
> resource would be http:Orion, which makes it a URL now.  Wouldn't you 
> know what kind of access that you are getting into?

Well... you are having to invent a new construct here that the web as it is does not support - the schemeless URN. As things stand right now there is no defined relationship between the things referenced by URI whose spelling differs only in the spelling of the scheme component.
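
FWIW, the nearest thing in existing URI syntax is the network-path reference of RFC 3986: a string beginning with "//" is a relative reference that borrows the scheme of whatever base URI it is resolved against, rather than a name in its own right. A small sketch using Python's standard urllib.parse (the example.org base URIs are placeholders of mine):

```python
from urllib.parse import urljoin, urlsplit

# A string of the proposed 'schemeless' form.
ref = "//en.wikipedia.org/wiki/Magna_Carta"

# On its own it has no scheme component, only an authority and a path.
parts = urlsplit(ref)
print(repr(parts.scheme), parts.netloc, parts.path)

# Resolved against different base URIs, it yields different absolute URIs.
resolved = [urljoin(base, ref)
            for base in ("http://example.org/", "https://example.org/")]
for uri in resolved:
    print(uri)
```

The two resulting URI differ only in scheme, and nothing in the specifications says that they identify the same 'thing' or related 'things'.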

Also, whilst I think this is the topic of TAG issue schemeProtocol-49, what a URI refers to and how you access that 'thing' (if you can) should be regarded as orthogonal. The HTTP protocol itself can be used with URI from any scheme - though there are obvious practicalities in setting up the relevant gateways/proxies.

> If we have one URN, I believe all these problems will be gone.  Sure, we 
> can make URI to keep its duality.  But in the latter case, we must be 
> aware which sense we are using when we make a statement.  We should not 
> try to cure one linguistic confusion with another because that only 
> gives rise to new problem while still not settling the old one.

IIUC, you have:

	//en.wikipedia.org/wiki/Magna_Carta (or possibly Mogna_Carta) is a 'URN' 
		which names 
			the Magna Carta (a conceptual work with a small number of original transcriptions on vellum)

	http://en.wikipedia.org/wiki/Magna_Carta (or possibly http:Mogna_Carta) is a 'URL'
		which names
			a wikipedia page *about* the Magna Carta.

In this scheme, what 'URN' do I use for the wikipedia page so that I can write a page *about* it (and so on)?

I don't think that we need such rigidity. We just need to be careful about what it is that is being named, and some named things contain information about other named things.

> I *sincerely* wish that when we define our engineering terms, we can 
> follow Wittgenstein's version of "meaning is use".  A *definition* must 
> be distinguished from a *description*.  The former is aimed at clear 
> usage while the latter comprehension.  If we define a vocabulary X, and 
> no one can tell X from not-X, X must be treated as its subsuming 
> concept.  This is my logical argument for the word "descriptor/metadata" 
> etc. because their definitions make them semantically equivalent to 
> "resource".  Then, it offers no more therapeutic (in Wittgenstein's 
> term) value than what we already have, except making it worse.

I think that you are overreading what people are saying. Most people, AFAICT, are comfortable with the notion that resources can describe other resources, and will speak of the former as being metadata about the latter. One doesn't then have to get into meta-'x', meta-meta-'x' .... One just has 'x' and some 'x's have things to say about other 'x's.

> Honestly, I do not buy the "cost" argument.  I wonder what is the 
> priority of TAG?  Is it to create patchy solution here and there?  Or to 
> think thoroughly and tries to settle down a set of solid foundation and 
> methodology before evaluating anything else?
> 
> Ideally, here is the list of things I wish TAG to consider. 
> 
> 1.  The necessity of *one* URN. 
>     URN is a URI but without a pre-defined transportation protocol. 
>	We use URN to describe anything in the world, just like a word 
>	in natural language. Syntax: my proposal is to make it an HTTP-URI sans "http:".

You understand that you are asking here for a HUGE change in the architecture of URI that would have to percolate through a substantial amount of deployed infrastructure.


> 2.  The URN can be bound to any number of URL.
>     A URL denotes a (information) resource that provides information about the URN.
>     Syntax: scheme + ":" + URN.
>      Here, a scheme must correspond to a transportation protocol but not 
>      naming protocol. Thus, there will be no more argument about if we need a 
>      new URI scheme.  If there is a new transportation protocol, 
>      then there is one.  Otherwise, no.
>     In addition, we might consider a default namespace URI for the scheme.

Again, this is a HUGE change that you are asking for.


> 3. Complete the referential range of URI so that syntactically, we can 
> tell if a URI denotes a URI, a Resource or a Representation. 
>     Syntax: my proposed syntax is:
>     a) If the root URI ended with a "~", it by definition denotes the URI without the trailing "~".
>     b) Use #(mine-type) to denote a representation.

FWIW: I take the view that in general Representations do not have URI. If you have a Representation-like thing that you want to give a URI to... You have to promote it to being a Resource which then has its own Representations (which may happen to be invariant - and which may or may not be identical to the representation that you are trying to name). Much easier on the whole to avoid naming representations IMO.

> 4.  Consider denoting a MIME-type with URI.
> 5.  Optionally, make a URI syntax to denote the set of mime-type associated with a resource.
>      Syntax: A root URI ended with a trailing "?"
> 
> I believe the above design would give us a very solid foundation for the 
> Web.  

Well... It would be a different web from the one that we have. If you wanted to get there you would also need an effective transition plan - this is not a greenfield site - more of a brownfield site in some respects, with movement restricted by earlier decisions.

> All important concepts, URI, Representation, Resource, 
> Transportation, MIME, will be grounded on URI.   Thus, we know one URN 
> can have multiple Information Resource (URL); and one Information 
> Resource can have multiple Representations.  And we communicate through 
> the Web by fetching the most suitable *representation* to understand the 
> world, which is grounded on the Web by the URN.
> 
> Don't you think that is a much better and cleaner way to solve 
> architectural problems?  

Honestly... you have not sold it to me!

I think that with careful use the web as it is today works just fine.

> I want to point out that all my proposed design 
> does not define anything semantically.  It is all driven by syntactic 
> conventions.  We are engineers.  Our job is to design and built "syntax 
> or structure" for our users to deal with their semantic issues.   It is 
> not our job to tell our users he or she get  "X is Y" right or not.  It 
> is their freedom and we have no right to take it away from them.
> 
> I may have sounded that I have been bickering about semantic issues.  
> But that is not my purpose.  My purpose of doing that is only to show 
> how shaky a ground we will stand if we ever try to built our engineering 
> principles from our intuitions.  To be pragmatic, that is my purpose. 
> 
> Xiaoshu

Stuart
--
Received on Wednesday, 1 July 2009 11:07:43 UTC