Re: LRDD Update (Resource Descriptor Discovery) and Proposed Changes from Xiaoshu Wang on 2009-06-30 (www-tag@w3.org from June 2009)

From: Xiaoshu Wang <wangxiao@musc.edu>
Date: Tue, 30 Jun 2009 09:48:05 -0400
To: "Williams, Stuart (HP Labs, Bristol)" <skw@hp.com>
CC: Jonathan Rees <jar@creativecommons.org>, "www-tag@w3.org" <www-tag@w3.org>
Message-ID: <4A4A1795.3030504@musc.edu>
Stuart,

I will answer in one block instead of paragraph by paragraph.  I think 
it will be more cohesive.

The Magna Carta example that Jonathan gave yesterday made me think: 
what goes wrong in the thinking behind "If I link to wikipedia, I want you to 
go to the wikipedia article, darnit, not to the Magna Carta"?

I think it is caused by the duality of a URI as both URN and 
URL.  Here, I use URN differently from other URNs such as LSID: I 
mean it in an absolute sense, as a name that is independent of any 
transportation protocol.  If people understand this duality and know 
in which sense they are using a URI, I don't think that there will be 
any such confusion.  Say, let

//en.wikipedia.org/wiki/Magna_Carta = Mogna_Carta

Then, the wikipedia article would be denoted by "http:Mogna_Carta".  I 
doubt there would be any ambiguity.

This comes back to the definition of Information Resource.  With a 
URN, wouldn't IR be the set of all the URLs' referents?  If we have not taken 
IR to be *representation*, would this (the duality of URI as name and 
locator) be the cause of the confusion?

Now, let's go back to Pat's question: how can a far-away galaxy be 
connected to the Web?  Take Orion (let's make it a schemeless URN 
again).  You choose your information path by selecting your 
transportation protocol.  When you choose "http", your information 
resource is http:Orion, which is now a URL.  Wouldn't you then 
know what kind of access you are getting into?

If we have one URN, I believe all these problems will be gone.  Sure, we 
can let URI keep its duality.  But in that case, we must be 
aware of which sense we are using when we make a statement.  We should not 
try to cure one linguistic confusion with another, because that only 
gives rise to a new problem while still not settling the old one.

I *sincerely* wish that when we define our engineering terms, we can 
follow Wittgenstein's version of "meaning is use".  A *definition* must 
be distinguished from a *description*.  The former is aimed at clear 
usage while the latter is aimed at comprehension.  If we define a term X, and 
no one can tell X from not-X, X must be treated as its subsuming 
concept.  This is my logical argument against words such as "descriptor" 
and "metadata": their definitions make them semantically equivalent to 
"resource".  They then offer no more therapeutic (in Wittgenstein's 
sense) value than what we already have, and only make things worse.

Honestly, I do not buy the "cost" argument.  I wonder what the 
priority of the TAG is.  Is it to create patchy solutions here and there?  Or 
is it to think things through and settle on a solid foundation and 
methodology before evaluating anything else?

Ideally, here is the list of things I wish the TAG would consider.

1.  The necessity of *one* URN.
    A URN is a URI without a pre-defined transportation protocol.  We
    use a URN to describe anything in the world, just like a word in
    natural language.
    Syntax: my proposal is to make it an HTTP-URI sans "http:".
2.  The URN can be bound to any number of URLs.
    A URL denotes an (information) resource that provides information
    about the URN.
    Syntax: scheme + ":" + URN.
    Here, a scheme must correspond to a transportation protocol, not a
    naming protocol.  Thus, there will be no more argument about whether
    we need a new URI scheme.  If there is a new transportation protocol,
    then there is one.  Otherwise, no.
    In addition, we might consider a default namespace URI for the scheme.
3.  Complete the referential range of URI so that, syntactically, we can
    tell whether a URI denotes a URI, a Resource or a Representation.
    Syntax: my proposed syntax is:
    a) If the root URI ends with a "~", it by definition denotes the
       URI without the trailing "~".
    b) Use #(mime-type) to denote a representation.
4.  Consider denoting a MIME type with a URI.
5.  Optionally, define a URI syntax to denote the set of MIME types
    associated with a resource.
    Syntax: a root URI ending with a trailing "?".
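
To make the conventions in the list above concrete, here is a minimal
sketch of how a client might read them off a URI purely syntactically.
It is only an illustration under my own assumptions: the names
Denotation and classify are made up, "text/html" is just an example
MIME type, and I assume a URN always begins with "//" (an HTTP-URI
sans "http:").

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Denotation:
    # kind is one of: "resource", "uri", "representation", "mime-set"
    kind: str
    scheme: Optional[str]  # None means a schemeless URN (a pure name)
    urn: str               # the name part, always starting with "//"
    mime_type: Optional[str] = None


# Matches "root#(mime-type)rest"; "rest" is an optional ordinary fragment.
_REPR = re.compile(r"^(?P<root>[^#]*)#\((?P<mime>[^)]+)\)(?P<frag>.*)$")


def classify(uri: str) -> Denotation:
    """Tell what a URI denotes using only the proposed syntax (hypothetical)."""
    scheme: Optional[str] = None
    rest = uri
    if not uri.startswith("//"):
        # URL form: scheme + ":" + URN
        scheme, sep, rest = uri.partition(":")
        if not sep or not rest.startswith("//"):
            raise ValueError(f"expected [scheme:]//authority/path, got {uri!r}")
    m = _REPR.match(rest)
    if m:
        # 3b: #(mime-type) denotes a representation of the named thing
        urn = m["root"] + ("#" + m["frag"] if m["frag"] else "")
        return Denotation("representation", scheme, urn, m["mime"])
    if rest.endswith("~"):
        # 3a: a trailing "~" denotes the URI itself
        return Denotation("uri", scheme, rest[:-1])
    if rest.endswith("?"):
        # 5: a trailing "?" denotes the set of MIME types of the resource
        return Denotation("mime-set", scheme, rest[:-1])
    # No marker: the thing itself (scheme None), or the information
    # resource reached over that transportation protocol (scheme set).
    return Denotation("resource", scheme, rest)
```

For example, classify("http://example.net/people/skw~") reports that the
URI itself is denoted, while classify("//example.net/people/skw") is just
the name, independent of any transportation protocol.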

I believe the above design would give us a very solid foundation for the 
Web.  All the important concepts (URI, Representation, Resource, 
Transportation, MIME) will be grounded in URI.  Thus, we know one URN 
can have multiple Information Resources (URLs), and one Information 
Resource can have multiple Representations.  And we communicate through 
the Web by fetching the most suitable *representation* to understand the 
world, which is grounded on the Web by the URN.

Don't you think that is a much better and cleaner way to solve 
architectural problems?  I want to point out that my proposed design 
does not define anything semantically.  It is all driven by syntactic 
conventions.  We are engineers.  Our job is to design and build "syntax 
or structure" for our users to deal with their semantic issues.  It is 
not our job to tell our users whether they have got "X is Y" right or not.  It 
is their freedom and we have no right to take it away from them.

I may have sounded as if I have been bickering about semantic issues.  
But that is not my purpose.  My purpose in doing that is only to show 
how shaky the ground we stand on will be if we ever try to build our engineering 
principles from our intuitions.  To be pragmatic, that is my purpose.

Xiaoshu


Williams, Stuart (HP Labs, Bristol) wrote:
> Xiaoshu,
>
>   
>> -----Original Message-----
>> From: Xiaoshu Wang [mailto:wangxiao@musc.edu]
>> Sent: 29 June 2009 18:25
>> To: Williams, Stuart (HP Labs, Bristol)
>> Cc: Jonathan Rees; www-tag@w3.org
>> Subject: Re: LRDD Update (Resource Descriptor Discovery) and
>> Proposed Changes
>>
>> Williams, Stuart (HP Labs, Bristol) wrote:
>>     
>>> Xiaoshu,
>>>
>>>
>>>       
>>>> -----Original Message-----
>>>> From: www-tag-request@w3.org [mailto:www-tag-request@w3.org]
>>>> On Behalf Of Xiaoshu Wang
>>>> Sent: 28 June 2009 04:55
>>>> To: Jonathan Rees
>>>> Cc: www-tag@w3.org
>>>> Subject: Re: LRDD Update (Resource Descriptor Discovery) and
>>>> Proposed Changes
>>>>
>>>> Jonathan Rees wrote:
>>>>
>>>>         
>>>>> [less cc:]
>>>>>
>>>>> Tracker, I write this email primarily for you. This whole thread is
>>>>> about ISSUE-53 [1]. Current LRDD discussion bears on ISSUE-62 [2].
>>>>>
>>>>> Xiaoshu, you're essentially pressing the TAG again to make a formal
>>>>> statement on recommended use of conneg, as Michael Hausenblas did in
>>>>> February [3]. I'm sorry that this has fallen to the periphery of the
>>>>> TAG business heap. The best I can do now is to point you to the advice
>>>>> [4] that we gave to the Cool URIs for the Semweb editors, which agrees
>>>>> with Eran's reading.
>>>>>
>>>>>
>>>>>           
>>>> Yes.  I am pressing and I hope the TAG can take it seriously.  But I am not
>>>> pressing the TAG to recommend a use of conneg.  Conneg is there and
>>>> people are starting to use it.  I don't think the AJAX community needs to ask the TAG
>>>> before using conneg.  The mechanism is there; thanks to the good design
>>>> of HTTP, we can just use it.  What I am asking is for the TAG to rethink
>>>> httpRange-14, because that will let us know how much nonsense ISSUE-62
>>>> as well as the LRDD proposal is.  (Sorry for my bluntness, but why waste
>>>> time on proposing something that you cannot even define?)
>>>>
>>>>         
>>> Specifically, what is it that you are asking the TAG to rethink?
>>>
>>>       
>> (1) The necessity for the definition of "Information Resource".
>> (1a) If the TAG thinks IR is necessary, please give a concrete definition so
>> that it can be used objectively by everyone. The current definition is not a
>> definition. It is more like a wish than a definition.
>> (1b) If the TAG cannot define IR, then eliminate it.
>>     
>>> The TAG's httpRange-14 resolution and subsequent contributions to
>>> the "Cool URIs for the Semantic Web" amount to saying that:
>>>
>>> - fragment-less http URI can be used to refer-to, name... any kind of thing
>>>
>>> - and for things that do not/cannot possibly have awww:representations (and I am such a thing)
>>>   please deploy a redirection (303) to a different thing that (if the redirection is to be useful)
>>>   has something to say about the thing first asked about.
>>>
>>>       
>> But *please* define the "can", whose capability in what regard?
>>     
>
> I can offer some characteristics and have done so in the past - e.g. things that have mass do not/cannot possibly have awww:representations.
> Things whose state (and state history) can be serialised as a message can have awww:representations.
> Things that are entirely conceptual - e.g. an RDF property or class - fall into a bit of a grey area.
>
> Personally I could have lived with a resolution to httpRange-14 that allowed descriptive representations for things referenced by slash http URI and left classification of the resource (if required) to some higher level. FWIW, information/non-information resource is a bit coarse grained anyway.
>
> However, I was also persuaded by Pat's arguments about far-distant galaxies (several light years away): how on earth could they possibly be involved in any http interaction arising from an attempt to access them using a 'slashed' http URI to refer to them?  Even a physical body somewhat closer?
>
> AIUI the intention of web architecture is that URI are usable to access the things that they are used to name/denote/refer-to (all of which are intended to be aligned). There are things that cannot possibly be accessed by the web - so the pragmatic thing to do (if you intend to respect that intention of web architecture) is *not* to deploy representations at the http: URI used to name those things.
>
>   
>> Is it
>> O.K. for me to 200 an HTML-representation for myself, which is a person?
>>     
>
> IMO no... it is not ok.
>
>   
>> I definitely think that I *can* but I guess you would not think so. Then,
>> what is the standard for this "can or cannot" list or criteria?
>>
>> If the TAG thinks that my kind of "can" is O.K., then I am fine with
>> that because it only says that the concept of "information resource"
>> varies from person to person. Otherwise, let me know how to tell IR from
>> non-IR because, currently, to be safe (i.e., not being accused of
>> violating the web architecture), the only thing that I can do
>> is to 303 forever.
>>
>> Note, using a hash URI doesn't get me out of the predicament of "Can I 200
>> now?", because my hash URI still has to be rooted on some slash URI,
>> whose IR-ness I must ponder under httpRange-14.
>>
>>     
>>>> If we know that Information Resource does not make sense, then Generic
>>>> Resource does not make sense either because what is the definition of
>>>> this genericity?  As I have discussed in the manuscript "Is the Web a
>>>> Web of Document or Things?" (Going to be presented in IR-KR 2009,
>>>> http://ir-kr.okkam.org/workshop-program/irkr2009-proceedings.pdf), all
>>>> these concepts are based on a somewhat "one resource to one representation" assumption.
>>>>
>>>>         
>>> I think I have to cry foul again here. You state repeatedly
>>> that this assumption is made in the AWWW or by the TAG and
>>> then proceed to argue against it. I can assure you that I
>>> know of no-one on the TAG or that contributed to the writing
>>> of AWWW that makes that assumption.
>>>
>>> Maybe you are speaking here of an assumption that you make
>>> in your work, but I don't think that is the case.
>>
>> Sure, I sincerely *wish* that I am wrong. But then it is beyond my
>> comprehension why TAG is so obsessed with IR/httpRange-14.
>>     
>
> :-) I think that the TAG has moved beyond it and that the obsession lies elsewhere.
>
>   
>> I remember
>> that a few months back TAG was even proposing IETF to 404 back some
>> documents. So, suddenly 404 isn't that bad any more, huh?
>>     
>
> Sorry... I can't ground that in anything that I'm aware of.
>
>   
>> Then, why do we
>> ask people to make "cool URI"? It really sounds ridiculous to me.
>>
>> It is interesting that in AWWW it says "We define the term "information
>> resource" because we observe that it is useful in discussions of Web
>> technology and may be useful in constructing specifications for
>> facilities built for use on the Web."
>>     
>
> Well... for me the distinction is between things that are:
>
> 1) material/physical things
> 2) abstract/conceptual things
> 3) information things (serialisable in a message)
>
> The only category that gives some pause for thought is the 2nd one; such things certainly admit description, but can they have awww:representation of themselves? (I could go either way).
>
>   
>> I wonder: after so many years of debate, no one has shown even one
>> application that has benefited from using the concept of IR. Perhaps it
>> is what it is -- only useful in *discussion*. Why confound people with
>> some concept when we don't even know what it is?
>>     
>>>> It is the same on "the uniform access to metadata".
>>>> What is "metadata"?  If you cannot define it, do you
>>>> honestly think that a proposal will be of any use?
>>>>
>>>> The web is built on three things: URI, Resource, Representation
>>>>
>>>>         
>>> AWWW is about those three things (and Interaction).
>>>
>>>       
>> Of course. But interaction is how the Web is implemented and realized.
>> And what a URI denotes should have nothing to do with how a message is
>> retrieved by a particular protocol. Can I use an ftp URI to denote a
>> person? I sure can, right? And then how do I 303 that? httpRange-14
>> breaks the most fundamental principle -- the principle of orthogonal
>> specification.
>>     
>
> httpRange-14 was asked and resolved in the context of HTTP URI. It has nothing to say about FTP URI.
>
>   
>>>> There are no fourth or fifth essential concepts.  Metadata, generic resource,
>>>> information resource, whatever they are, must each be one of the three
>>>> entities.  Thus, you have to follow the design pattern of the first
>>>> three entities and then consider what we can standardize next for a more
>>>> specific problem.
>>>>
>>>> Without solving httpRange-14,
>>>>
>>>>         
>>> What (in your view) is not solved? [You may not like the
>>> solution, but that is a different matter].
>>
>> In my view, httpRange-14 solves no problem at all. What it does is to
>> create a problem (at least for me). And the created problem is a very
>> huge one because the semantics of 200 suddenly becomes murky.
>>     
>>> [And if you want a framing of the question, it is "What should be deployed on the web at
>>> http://example.net/people/skw where the intention is that URI
>>> is used to name and to refer-to me (the person)? Further the
>>> deployment should be such as to avoid the possibility of
>>> confusing a reference to me as reference to a document about me."]
>>
>> You use "http://example.net/people/skw" just as you use your name.
>> People need to realize that what they get is a document retrieved from
>> the URI, not what the URI denotes.
>>     
>
> Ah... Well there we have it... See above... a different intention for the architecture.
>
>   
>> I have proposed to use
>> "http://example.net/people/skw#(mime-type)" to denote the document of a
>> particular MIME type retrieved from that URI.
>>     
>
> I don't think that you have freedom to do that. The attribution of significance to frag ids is delegated to media-type specifications; doing what you suggest would be a big change. Practically you could do something similar in query space, e.g.
>
>         http://example.net/people/skw?mt=(mime-type)
>
> But again it would be a stretch to get universal adoption... It is little different then from a suffix based convention:
>
>         http://example.net/people/skw           names me, the person,
>         http://example.net/people/skw.about     names a document about me (and accesses to the former are redirected to the latter)
>
>   
>> The treatment is the
>> same whether the URI in question contains a fragment or not. Thus, if
>> you use "http://example.net/people#skw" to denote you, then the URI
>> "http://example.net/people#(mime-type)skw" would denote a particular
>> document fragment describing you.
>>
>> Of course, such a #(mime-type) is not absolutely needed, since we can
>> always design properties and use b-nodes, etc. to describe it. The
>> syntax just makes it more convenient. But no matter what, what a URI
>> denotes is always up to the owner of the URI. A client retrieves
>> information and judges whether s/he should accept it or not. httpRange-14
>> basically takes the ownership of a URI's meaning away from its owner.
>>
>> Xiaoshu
>>     
>>>> proposing anything else is like building a
>>>> house from the top and hoping that all those designs can consistently converge
>>>> on a solid foundation.  I don't think that is the way it works and I
>>>> don't think it can work either.
>>>>
>>>> Xiaoshu
>>>>
>>>>         
>>>>> Jonathan
>>>>>
>>>>> [1] http://www.w3.org/2001/tag/group/track/issues/53
>>>>> [2] http://www.w3.org/2001/tag/group/track/issues/62
>>>>> [3] http://lists.w3.org/Archives/Public/www-tag/2009Feb/0074.html
>>>>> [4] http://www.w3.org/2001/tag/2008/02/28-minutes#item01
>>>>>
>>>>>           
>>> Stuart
>>> --
>>>
>>> <snip/>
>>>
>>>       
>
> Stuart
> --
>
>
Received on Tuesday, 30 June 2009 13:48:48 UTC