Re: LRDD Update (Resource Descriptor Discovery) and Proposed Changes from Xiaoshu Wang on 2009-07-01 (www-tag@w3.org from July 2009)

From: Xiaoshu Wang <wangxiao@musc.edu>
Date: Wed, 01 Jul 2009 08:52:00 -0400
To: "Williams, Stuart (HP Labs, Bristol)" <skw@hp.com>
CC: Jonathan Rees <jar@creativecommons.org>, "www-tag@w3.org" <www-tag@w3.org>
Message-ID: <4A4B5BF0.4090108@musc.edu>
Williams, Stuart (HP Labs, Bristol) wrote:
> Xiaoshu,
>
>   
>> -----Original Message-----
>> From: Xiaoshu Wang [mailto:wangxiao@musc.edu] 
>> Sent: 30 June 2009 14:48
>> To: Williams, Stuart (HP Labs, Bristol)
>> Cc: Jonathan Rees; www-tag@w3.org
>> Subject: Re: LRDD Update (Resource Descriptor Discovery) and 
>> Proposed Changes
>>
>> Stuart,
>>
>> I will answer in one block instead of paragraph by paragraph. I think 
>> it will be more cohesive.
>>
>> The Magna Carta example that Jonathan gave yesterday made me think: 
>> what went wrong in this line of thinking: "If I link to wikipedia, I want 
>> you to go to the wikipedia article, darnit, not to the Magna Carta."
>>
>> I think it is caused by the duality of a URI being both a URN and a 
>> URL.  Here, URN is used differently from other URNs, such as LSID: I 
>> use URN in an absolute sense, meaning a name that is independent of any 
>> transportation protocol. If people understand this duality and know 
>> in which sense they are using a URI, I don't think that there will be 
>> any such confusion.  Say, let
>>
>> //en.wikipedia.org/wiki/Magna_Carta = Mogna_Carta
>>
>> Then, the wikipedia article would be denoted by 
>> "http:Mogna_Carta".  I doubt there will be any ambiguity.
>>
>> Then, this comes back to the definition of Information Resource: with a 
>> URN, wouldn't an IR be the set of all the URLs' referents?
>>     
>
> By Fielding's writings a resource is modelled as a function from time to [(sets of equivalent awww:representations) OR (URI)].
>   
With no disrespect to Fielding, do we have to treat someone's past 
writing as a bible that we should never change?  If so, how can we 
ever advance? 
>> If we have not taken 
>> IR as *representation*, would this (the duality of URI as name and 
>> locator) be the cause? 
>>     
>
> But... we have *not* taken "IR as *representation*"....
>
> Just think of the web as a great big machine that answers (amongst other things) "get" questions of the form "Please 'get' me a (current) awww:representation *of* the 'thing' named with this 'uri'"
>
> Responses that you might get back are of one of the following kinds (non-exhaustive):
>
> 	- an awww:representation of the 'thing'
> 	- advice that 'thing' has another name (temporary or permanent)
> 	- advice that more information *about* requested 'thing' *may* be available by asking the 'get question' of a different 'thing'.
> 	- advice that the requested 'thing' cannot be found/accessed...
> 	- something more catastrophic.
>   
That is my view, there is no "information resource".
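To make the list above concrete, here is a minimal sketch in Python that sorts HTTP status codes into those kinds of answer. The function name classify_get_response and the label strings are made up for illustration, and the mapping of status codes onto the list is only one possible reading of it, not something any specification states.

```python
from http import HTTPStatus


def classify_get_response(status: int) -> str:
    """Sort an HTTP status code into one of the kinds of answer listed above.

    The labels and the grouping are illustrative assumptions, not a standard.
    """
    # A representation of the 'thing' was returned.
    if status == HTTPStatus.OK:
        return "representation"
    # The 'thing' has another name, permanent (301) or temporary (302, 307).
    if status in (
        HTTPStatus.MOVED_PERMANENTLY,
        HTTPStatus.FOUND,
        HTTPStatus.TEMPORARY_REDIRECT,
    ):
        return "another-name"
    # More information may be available by asking about a different 'thing'.
    if status == HTTPStatus.SEE_OTHER:
        return "ask-elsewhere"
    # The 'thing' cannot be found or accessed.
    if 400 <= status < 500:
        return "not-found-or-inaccessible"
    # Something more catastrophic on the server side.
    if 500 <= status < 600:
        return "catastrophic"
    return "other"
```

Note that nothing in this dispatch needs a notion of "information resource"; it only looks at what kind of message came back.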
> The point is that in web architecture the 'thing' referred to by the 'uri' in the question, and the 'thing' that the awww:representation (if any) is an awww:representation of are supposed to be the *same* 'thing'.
>   
You have repeatedly assured me that no one has taken the view of 
"Resource = Representation".  What was this supposed *sameness*?  Define 
it objectively and clearly. Otherwise, scratch it.  (Meaning is use, 
please!)

As in all science and philosophy, "reference" and "dereference" are 
NOT symmetric.  Trying to make them symmetric gives rise to all sorts of 
puzzles.  Descartes thought about it and proclaimed "Cogito, ergo sum"; 
Russell tried from the point of view of logic and got a paradox 
named after him.

Don't ever take that view. Communication is simple.  You get a 
message and judge the truth of the claim.  There is nothing magic 
about it.  This is how we humans know the universe.
>> Now, let's go back to Pat's question: how can a faraway galaxy be 
>> connected to the Web?  Say Orion (let's make it a schemeless URN 
>> again).  Then, you choose your information path by selecting your 
>> transportation protocol.  When you choose "http", your information 
>> resource would be http:Orion, which makes it a URL now.  Wouldn't you 
>> know what kind of access you are getting into?
>>     
>
> Well... you are having to invent a new construct here that the web as it is does not support - the schemeless URN. As things stand right now there is no defined relationship between the things referenced by URI whose spelling differs only in the spelling of the scheme component.
>   
Of course, this is why I am proposing that the TAG consider it. It helps 
solve many problems.
> Also, whilst I think this is the topic of TAG issue schemeProtocol-49, what a URI refers to and how you access that 'thing' (if you can) should be regarded as orthogonal. The HTTP protocol itself can be used with URI from any scheme - though there are obvious practicalities in setting up the relevant gateways/proxies.
>   
Wouldn't my proposal help clarify that? A scheme would denote a path to 
acquire an "awww:representation" of a resource, which is denoted by the 
schemeless URI. If we give a default namespace to the scheme part, then 
people can follow their nose to understand what the protocol is, right?  
If you take a static viewpoint, then what a schemed URI denotes is the 
"information resource". Besides, it also solves the problem of the 
equivalence between an http-URI and an https-URI.

I am not saying this is currently supported.  That is why I ask the TAG 
to consider it. We do use URI in two senses, one as URN and the other as 
URL. Do you agree? 
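
As a rough sketch of the idea (Python; the helper names bind, unbind and same_urn are hypothetical, and nothing here is supported by any current URI specification), a URL is just a scheme prefixed to a schemeless URN, and two URLs are about the same thing when their URN parts match:

```python
from typing import Tuple


def bind(scheme: str, urn: str) -> str:
    """Build a URL under the proposed syntax: scheme + ":" + URN."""
    if not scheme or ":" in scheme:
        raise ValueError("scheme must be non-empty and must not contain ':'")
    return scheme + ":" + urn


def unbind(url: str) -> Tuple[str, str]:
    """Split a URL into (scheme, schemeless URN) at the first colon.

    Assumes the input really is a schemed URL; a bare URN that happens to
    contain a colon (for example a port number) would be split wrongly.
    """
    scheme, sep, urn = url.partition(":")
    if not sep:
        raise ValueError("no scheme found in %r" % url)
    return scheme, urn


def same_urn(url_a: str, url_b: str) -> bool:
    """True when two URLs are different access paths to the same URN."""
    return unbind(url_a)[1] == unbind(url_b)[1]
```

With this, bind("http", "//en.wikipedia.org/wiki/Magna_Carta") gives the familiar http URL, and an http-URI and an https-URI that differ only in scheme compare as the same URN by construction.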
 
>> If we have one URN, I believe all these problems will be gone.  Sure, we 
>> can let URI keep its duality.  But in the latter case, we must be 
>> aware which sense we are using when we make a statement.  We should not 
>> try to cure one linguistic confusion with another because that only 
>> gives rise to new problems while still not settling the old one.
>>     
>
> IIUC, you have:
>
> 	//en.wikipedia.org/wiki/Magna_Carta (or possibly Mogna_Carta) is a 'URN' 
> 		which names 
> 			the Magna Carta (a conceptual work with a small number of original transcriptions on vellum)
>
> 	http://en.wikipedia.org/wiki/Magna_Carta (or possibly http:Mogna_Carta) is a 'URL'
> 		which names
> 			a wikipedia page *about* the Magna Carta.
>
> In this scheme, what 'URN' do I use for the wikipedia page so that I can write a page *about* it (and so on)?
>   
You want to propose an http://ftp://http://... .  Sure.  I don't think 
my proposal is theoretically against that. You have to define what the 
inner scheme is grounded on.  In other words, you have to construct a 
meta-Web to do that.  But do you think that it is practically useful?  I 
think Gödel has told us that no system can be self-complete. We 
have to stop our recursion somewhere.  Otherwise, we will go nowhere.
> I don't think that we need such rigidity. We just need to be careful about what it is that is being named, and some named things contain information about other named things.
>   
Sure.  I am all for that.  Then, think about IR/httpRange-14, is it 
rigid or not?
 
>> I *sincerely* wish that when we define our engineering terms, we can 
>> follow Wittgenstein's version of "meaning is use".  A *definition* must 
>> be distinguished from a *description*.  The former is aimed at clear 
>> usage, the latter at comprehension.  If we define a vocabulary X, and 
>> no one can tell X from not-X, X must be treated as its subsuming 
>> concept.  This is my logical argument for the word "descriptor/metadata" 
>> etc. because their definitions make them semantically equivalent to 
>> "resource".  Then, it offers no more therapeutic (in Wittgenstein's 
>> term) value than what we already have, except making it worse.
>>     
>
> I think that you are overreading what people are saying. Most people AFAICT, are comfortable with the notions that resources can describe other resources and will speak of the former as being metadata about the latter. One doesn't then have to get into meta-'x',  meta-meta-'x' .... One just has 'x' and some 'x's have things to say about other 'x's.
>   
Did I?  I am not against people using the word "meta-data".  I am 
against people trying to propose a standard approach based on that 
word.  I don't care what the word is.  What I care about is how I can 
tell one thing from the other.  I am an engineer.  All our work is in 
essence a bunch of "if ... else".  If my "if" always returns one result, 
what is the "else"?  And yet someone tells me that there is an "else".

(As a bit of history, I remember that when I first heard of RDF back in 
2000, it was called a metadata framework.  I guess there is a reason 
that the TAG no longer uses that term.)
 
>> Honestly, I do not buy the "cost" argument.  I wonder what the 
>> priority of the TAG is.  Is it to create patchy solutions here and there?  Or to 
>> think thoroughly and try to settle on a solid foundation and 
>> methodology before evaluating anything else?
>>
>> Ideally, here is the list of things I wish TAG to consider. 
>>
>> 1.  The necessity of *one* URN. 
>>     URN is a URI but without a pre-defined transportation protocol. 
>> 	We use URN to describe anything in the world, just like a word 
>> 	in natural language. Syntax: my proposal is to make it an HTTP-URI sans "http:".
>>     
>
> You understand that you here are asking for a HUGE change in the architecture of URI that would have to percolate through a substantial amount of deployed infrastructure.
>   
Yes.  But it also solves huge problems.  I remember the debate about the 
"XRI" scheme.  As I recall from the OASIS site, the proposal almost 
passed (I think it fell just short of 75%) even with the strong 
opposition from the TAG.  And then, there was the very heated debate on 
HCLS-IG about LSID.  What this means to me is that most people are just 
not comfortable with the duality of URN and URL.  

>> 2.  The URN can be bound to any number of URLs.
>>     A URL denotes an (information) resource that provides information about the URN.
>>     Syntax: scheme + ":" + URN.
>>      Here, a scheme must correspond to a transportation protocol, not a 
>>      naming protocol. Thus, there will be no more argument about whether we need a 
>>      new URI scheme.  If there is a new transportation protocol, 
>>      then there is one.  Otherwise, no.
>>     In addition, we might consider a default namespace URI for the scheme.
>>     
>
> Again this is a HUGE change that you are asking for
>
>
>   
>> 3. Complete the referential range of URI so that syntactically, we can 
>> tell if a URI denotes a URI, a Resource or a Representation. 
>>     Syntax: my proposed syntax is:
>>     a) If the root URI ends with a "~", it by definition denotes the URI without the trailing "~".
>>     b) Use #(mime-type) to denote a representation.
>>     
>
> FWIW: I take the view that in general Representations do not have URI. If you have a Representation like thing that you want to give a URI to... You have to promote it to being a Resource which then has its own Representations (which may happen to be invariant - and which may or may not be identical to the representation that you are trying to name). Much easier on the whole to avoid naming representations IMO.
>   

That is why the (mime-type) is appended after the # to denote a 
Representation. I said in an earlier email to "Martin J. Dürst" that 
this is not strictly needed; it is there for convenience.  But I do 
believe a convention to denote a URI is a necessity, because there is 
indeed a need to describe a URI rather than its referent, as shown in 
XRI's use case.
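
To show that these conventions can be told apart purely by syntax, here is a minimal sketch in Python. The function name denotes and the returned labels are hypothetical, and I am assuming that "root URI" means the URI without any fragment; none of this is an existing standard.

```python
import re
from typing import Tuple

# Proposed form for a representation: <root URI>#(<mime-type>)
_REPRESENTATION = re.compile(r"^(?P<root>[^#]*)#\((?P<mime>[^()]+)\)$")


def denotes(ref: str) -> Tuple[str, str, str]:
    """Return (kind, root URI, detail) for a reference.

    kind is "representation", "uri" or "resource"; detail is the
    MIME type for a representation and empty otherwise.
    """
    match = _REPRESENTATION.match(ref)
    if match:
        # b) "#(mime-type)" denotes a representation of the root URI.
        return ("representation", match.group("root"), match.group("mime"))
    if "#" not in ref and ref.endswith("~"):
        # a) a trailing "~" denotes the URI itself, not its referent.
        return ("uri", ref[:-1], "")
    # Anything else denotes the resource, as it does today.
    return ("resource", ref, "")
```

For example, denotes("http://example.org/doc~") says we are talking about the URI "http://example.org/doc" itself, while denotes("http://example.org/doc#(text/html)") says we are talking about its text/html representation. The test is one "if ... else" on the spelling and needs no semantic judgment.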

>> 4.  Consider denoting a MIME-type with URI.
>> 5.  Optionally, make a URI syntax to denote the set of MIME types associated with a resource.
>>      Syntax: A root URI ending with a trailing "?"
>>
>> I believe the above design would give us a very solid foundation for the 
>> Web.  
>>     
>
> Well... It would be a different web from the one that we have. If you wanted to get there you would also need an effective transition plan - this is not a greenfield site - more brown field in some respects with movement restricted by earlier decisions.
>   

HTTP's Accept header allows extension.  It is a matter of standardizing 
a token. 
>> All important concepts, URI, Representation, Resource, 
>> Transportation, MIME, will be grounded on URI.   Thus, we know one URN 
>> can have multiple Information Resources (URLs); and one Information 
>> Resource can have multiple Representations.  And we communicate through 
>> the Web by fetching the most suitable *representation* to understand the 
>> world, which is grounded on the Web by the URN.
>>
>> Don't you think that is a much better and cleaner way to solve 
>> architectural problems?  
>>     
>
> Honestly... you have not sold it to me!
>
> I think that with careful use the web as it is today works just fine.
>   
Yes.  This is my point.  The web works just fine without the concept of 
IR.  We need to be careful about our vocabulary, i.e., URI.  That is 
all.  Then why burden ourselves with "httpRange-14"?

Cheers

Xiaoshu

>> I want to point out that none of my proposed design 
>> defines anything semantically.  It is all driven by syntactic 
>> conventions.  We are engineers.  Our job is to design and build "syntax 
>> or structure" for our users to deal with their semantic issues.   It is 
>> not our job to tell our users whether they got "X is Y" right or not.  It 
>> is their freedom and we have no right to take it away from them.
>>
>> I may have sounded as if I have been bickering about semantic issues.  
>> But that is not my purpose.  My purpose in doing that is only to show 
>> how shaky a ground we will stand on if we ever try to build our engineering 
>> principles from our intuitions.  To be pragmatic, that is my purpose. 
>>
>> Xiaoshu
>>     
>
> Stuart
> --
Received on Wednesday, 1 July 2009 12:52:55 UTC