Re: Uniform access to metadata: XRD use case. from Xiaoshu Wang on 2009-02-27 (www-archive@w3.org from February 2009)

From: Xiaoshu Wang <wangxiao@musc.edu>
Date: Fri, 27 Feb 2009 13:22:55 +0000
To: "Williams, Stuart (HP Labs, Bristol)" <skw@hp.com>
CC: Pat Hayes <phayes@ihmc.us>, "jar@creativecommons.org" <jar@creativecommons.org>, "Patrick.Stickler@nokia.com" <Patrick.Stickler@nokia.com>, "www-archive@w3.org" <www-archive@w3.org>
Message-ID: <49A7E92F.1060402@musc.edu>
Williams, Stuart (HP Labs, Bristol) wrote:
> Hello Pat,
>
> [Trimming to those that I think are the interested parties - by all means respond back on list if so motivated, publicly archived just in case.]
>
>   
>> -----Original Message-----
>> From: www-tag-request@w3.org [mailto:www-tag-request@w3.org] 
>> On Behalf Of Pat Hayes
>> Sent: 26 February 2009 17:53
>> To: Patrick.Stickler@nokia.com
>> Cc: wangxiao@musc.edu; eran@hueniverse.com; 
>> julian.reschke@gmx.de; jar@creativecommons.org; 
>> connolly@w3.org; www-tag@w3.org
>> Subject: Re: Uniform access to metadata: XRD use case.
>>
>>
>> On Feb 24, 2009, at 11:38 PM, <Patrick.Stickler@nokia.com> 
>> <Patrick.Stickler@nokia.com 
>>  > wrote:
>>
>>     
>>>
>>> On 2009-02-25 02:00, "ext Xiaoshu Wang" <wangxiao@musc.edu> wrote:
>>>
>>>       
>>>> The critical flaw of all the proposed approaches is that the definition of
>>>> "metadata/descriptor" is ambiguous and hence useless in practice.   
>>>> Take the "describedBy" relations for example.  Here I quote from Eran's  
>>>> link.
>>>>
>>>>      The relationship A "describedby" B asserts that resource B
>>>>      provides a description of resource A. There are no constraints on
>>>>      the format or representation of either A or B, neither are there
>>>>      any further constraints on either resource.
>>>>
>>>> As a URI owner, I don't know what kind of stuff I should put in A
>>>> or B.  As a URI client, how should I know when to get A and when
>>>> B?  Since I don't know what I might be missing from either A or B, it
>>>> seems to suggest that I must always get both A and B. Thus, I cannot
>>>> help but wonder why they are not put together at A in the first
>>>> place.
>>>>
>>>> The same goes for MGET: how does a user know when to GET and when to
>>>> MGET?
>>>>         
>>> If one wants a representation of the resource, use GET. 
>>>       
>> To avoid (even more) confusion, here you mean "representation" in the
>> narrow TAG/awww sense, right? The sense used in the REST architecture
>> description. It's important to get this clear, since when
>> 'representation' is used in its more common, wider, sense, a
>> description _is_ a representation. In fact, descriptions are in a very
>> real sense the paradigmatic kind of representation.
>>     
>
> Yes... I think that confusion has been induced by other participants in this thread.
>
>   
>>> If one wants a description of the resource, use MGET.
>>>
>>> There is some potential conceptual overlap between representations and
>>> descriptions for certain kinds of resources, but the distinction should be
>>> reasonably intuitive.
>>>       
>> Actually no, it's not _intuitive_ at all. Intuitively, in fact, it
>> makes absolutely no sense whatsoever. Why should one special kind of
>> representation, one that indeed has never been given a precise
>> definition or any kind of semantics, and appears to have no precursors
>> or exemplars anywhere in the entire technical literature previous to
>> Roy's doctoral thesis, be elevated to such an exalted status that an
>> entire world-wide transfer protocol be devoted to handling it, while
>> ignoring all other forms of representation? And _how_ is this kind
>> of representation fundamentally different from a description?
>> Of course I'm speaking intuitively here, and I think that both of these
>> questions have reasonable answers: but AFAIK nobody has actually
>> offered any; and they aren't particularly intuitive.
>>     
>
> FWIW: Here's my take:
>
> Descriptions are resources (awww:resources, pol:things) too, and at least those that are web accessible have awww:representations (i.e. either ephemeral messages (tokens) identified by bit-sequence, time and space (comms-link), or a type (all messages conveying a given byte-sequence); webarch has not been clear about which, and I don't think that matters for the purposes of this discussion). In large part I wish we could take awww:representation out of our 'ontology' because they are not things that naturally have URIs assigned to them. But clearly, despite really being part of the machinery that creates the 'illusion' of a web accessible resource, we seem doomed to have to speak of them.
>
> The resource at: http://www.ihmc.us/users/phayes/PatHayesAbout.html seems to me to be a description of a person, carefully crafted, at least as a narrative [**], to ground the identity of an individual by stating a number of invariants - it also designates another (different) URI that may be used for referring to that person [*].
>
> It seems to me that the sequence of bytes obtained by performing an HTTP GET using http://www.ihmc.us/users/phayes/PatHayesAbout.html in the request line has a different relationship with the (descriptive) resource designated by that URI than it does with the person being described.
>
> IMO (and FWIW) the (descriptive) resource that happens to be referred to by http://www.ihmc.us/users/phayes/PatHayesAbout.html may be used to pol:represent the person described, while the sequences of bytes exchanged awww:represent that descriptive resource.
>
> Where I struggle to get beyond the intuitive is in the articulation of what seems to me to be something of a tightly coupled relation between a resource and its awww:representations (if any), a sense that they are 'of' it, as against a more loosely coupled sense of 'aboutness' between a descriptive resource and the thing that it describes. You have touched on these kinds of notions in discussing galaxies far-far away and questioning how on earth (not your word) they get to participate in web interactions (how on earth as resources they can be expected to emit or accept (and even respond to) awww:representations). And of course they can't - the web interactions are with something else, and the exchanged awww:representations awww:represent that something else even if it pol:represents a galaxy far-far away.
>
> I believe that Xiaoshu wants to say that "awww:represents rdfs:subPropertyOf pol:represents ." such that:
>
> 	?a awww:represents ?b entails ?a pol:represents ?b 
>
> and such ambition, I think, causes him to drop the distinction in his messages (because in his world AFAICT there is none).
>   
Yes and no.  Within a specific context, I won't object to the distinction.
But at the very top level, I cannot make the distinction.  That is the
problem.
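
The subproperty reading described in the quoted text above can be
sketched in a few lines of Python.  This is only an illustration of the
RDFS subproperty rule (rdfs7) over plain tuples, not an RDF
implementation; the prefixed property names are the ones used in this
thread, and ex:bytes and ex:doc are made-up placeholders.

```python
# Minimal sketch of the RDFS subproperty rule (rdfs7) over plain tuples.
# "ex:bytes" and "ex:doc" are hypothetical placeholders, not real resources.
SUBPROP = "rdfs:subPropertyOf"


def apply_subproperty_rule(triples):
    """Return the triples plus everything rdfs7 entails, to a fixed point."""
    closed = set(triples)
    changed = True
    while changed:
        changed = False
        # Every (sub, super) pair currently asserted via rdfs:subPropertyOf.
        sub_of = [(s, o) for (s, p, o) in closed if p == SUBPROP]
        for sub, sup in sub_of:
            for s, p, o in list(closed):
                # If (s sub o) holds and sub is a subproperty of sup,
                # then (s sup o) is entailed.
                if p == sub and (s, sup, o) not in closed:
                    closed.add((s, sup, o))
                    changed = True
    return closed


graph = {
    ("awww:represents", SUBPROP, "pol:represents"),
    ("ex:bytes", "awww:represents", "ex:doc"),
}
entailed = apply_subproperty_rule(graph)
```

Note that the entailment runs one way only: nothing in the rule lets a
pol:represents statement be read back as an awww:represents statement.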
> Taking a different example. The byte stream that I get back from an HTTP GET request on 
> http://dfdf.inesc-id.pt/misc/man/www2009_10_30.pdf doesn't seem so much to 'describe' a particular document/manuscript as to convey its current state. I might describe the manuscript by making statements about its content, its authors, its history and so forth, all of which would serve to (maybe) identify what document/manuscript I'm talking about. I could make such a description available as a web resource. It would have its own awww:representations; however, I think that they would hardly serve as awww:representations of the manuscript itself - though the description may serve to pol:represent the manuscript.
>   
The choice of word is not important; what is important is the semantics
of the word.  I have made suggestions on the TAG mailing list, as well as
to Patrick (Stickler), about using Quine's idea of ontological commitment
to avoid hypostasizing or reifying terminology for the purpose of a
specific theory or approach.  Take IR as an example.  If an IR were
defined to be a resource that IS (not merely "can be conveyed in") a
bytestream, then everyone would be very clear about it.  A provider would
know how to provide one, and its clients would know what they can and
should do with it.  Defining things in a subjective sense, or even
providing a simple subsumption relationship, is not useful for clear
communication because, if A subsumes B, then unless both communicating
parties agree 100% on the boundary between B and A-B, miscommunication
is inevitable.  As Pat said, there could be some reasonable answers,
but please state the definition.  Whether it is right or wrong (in terms
of whether it adheres to someone's intuition) doesn't really matter as
long as it is objective.  The purpose is to make the term pragmatic.

The same goes for the distinction between Representation and Description.
> [*] ok... there's been a bit of history around this particular deployment and the use of redirections and so forth: http://www.ihmc.us/users/phayes/PatHayes used to redirect to http://www.ihmc.us/users/phayes/PatHayes.html which IMO was just as good as http://www.ihmc.us/users/phayes/PatHayesAbout.html though it's a 302 rather than a 303 redirection - I won't quibble - the spirit is there :-) .
>
> [**] many other referring names are used in the narrative that don't receive similar attention
>   
The behavior is a secondary issue.  Meaning, again by Quine, has
behavioral significance.  If the meaning of a concept is ambiguous, the
behavior it incurs will be unpredictable.  We first need to know what
kind of meaning a behavior is designated to respond to; then we can
decide whether the behavior is appropriate.  This goes for httpRange-14,
Link, MGET etc.  That is why I have repeatedly asked the TAG to reopen
httpRange-14: they all boil down to the definition of IR.

I want to be clear that my insistence on a minimalist approach -- to
remove the definition of IR and to rethink Conneg -- is not (at least
not solely) because I think it is superior to other approaches.  It is
much more because I have failed to find a working definition for IR and
Metadata/Description.  There are always gray areas in pretty much any
semantic domain.  When I tread upon these areas, I want a definite
guideline, regardless of whether it conforms to my personal intuition.
But there is none.  I welcomed IR/httpRange-14 at first, but then
revolted.  I initially (quite a few years back) proposed to the LSID
group that they consider defining their metadata as RDF.  (This goes to
you, Patrick Stickler, as you accused me of not doing enough homework.)
But I later revolted again, and now think the whole idea of LSID is
ill-conceived.

What made me reverse my course?  As I now understand it, the reason is
that when we are speaking at the higher level, there are only a few
words/concepts that we can use.  For the Web, there are only three
things that are clearly defined: URI, awww:Representation, and Resource.
According to the SIR model that I have used to define all information
systems,

URI is the *Symbol*,
awww:Representation is the *Information*,
Resource is the *Referent*.
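
As a rough sketch of this three-way mapping in Python (the class and
field names are illustrative assumptions, not taken from any
specification, and the URI and bytes are placeholders):

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Symbol:
    """The URI: a name that designates something."""
    uri: str


@dataclass(frozen=True)
class Information:
    """An awww:Representation: bytes plus the format that structures them."""
    media_type: str
    body: bytes


@dataclass(frozen=True)
class Referent:
    """The Resource: whatever the symbol stands for."""
    label: str


@dataclass(frozen=True)
class Binding:
    """One symbol, its referent, and the information obtained through it."""
    symbol: Symbol
    referent: Referent
    information: tuple  # zero or more Information values


# Hypothetical example values, for illustration only.
example = Binding(
    symbol=Symbol("http://example.org/thing"),
    referent=Referent("some resource"),
    information=(Information("text/html", b"<p>hello</p>"),),
)
```

The sketch has no separate type for IR, metadata or description, which
mirrors the argument made here.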

A Resource is thus what exists in the Web.  To exist, according to
Quine's definition, is to be the value of a bound variable.  A Resource
is a thing that has meaning or significance.

Representation is Information, which, as defined by Dretske, is
objective.  Information must also have structure, which, w.r.t. the
Web, is the format of an awww:Representation.

Thus, when we talk about the Web, these are all the terms that we have
at our disposal.  Any other term, such as IR, Meta-, Description etc.,
needs a concrete definition.  Either it is used only in a small
sub-system of the Web (making it unsuitable as a generic design pattern
for the Web), or someone needs to come up with a model that makes the
concept an essential component of the Web.

I hope this clearly outlines my position (so I won't be accused of
refusing to understand others' positions).  Hopefully, it can also give
us a reasonable guideline for the subsequent debate, if any.

Xiaoshu
Received on Friday, 27 February 2009 13:23:50 UTC