Re: Uniform access to metadata: XRD use case. from Xiaoshu Wang on 2009-02-25 (www-tag@w3.org from February 2009)

From: Xiaoshu Wang <wangxiao@musc.edu>
Date: Wed, 25 Feb 2009 11:44:31 +0000
To: Phil Archer <phil@philarcher.org>
CC: Eran Hammer-Lahav <eran@hueniverse.com>, Julian Reschke <julian.reschke@gmx.de>, "Patrick.Stickler@nokia.com" <Patrick.Stickler@nokia.com>, "jar@creativecommons.org" <jar@creativecommons.org>, "connolly@w3.org" <connolly@w3.org>, "www-tag@w3.org" <www-tag@w3.org>
Message-ID: <49A52F1F.3080701@musc.edu>
Phil Archer wrote:
> Xiaoshu Wang wrote:
>   
>> The critical flaw of all the proposed approaches is that the definition of 
>> "metadata/descriptor" is ambiguous and hence useless in practice.  Take 
>> the "describedBy" relation, for example.  Here I quote from Eran's link.
>>
>>      The relationship A "describedby" B asserts that resource B
>>      provides a description of resource A. There are no constraints on
>>      the format or representation of either A or B, neither are there
>>      any further constraints on either resource.
>>
>> As a URI owner, I don't know what kind of stuff I should put in A 
>> or B.
>>     
>
> Yes you do. You know that B has something to say about A. You don't, 
> however, know what format either is in or anything else. Those details 
> are handled by other mechanisms, notably the content type. In this link:
>
> Link: <foo.bar>; rel="describedby"; type="application/thing"
>
> You would probably only fetch foo.bar if you had a UA that could process 
> application/thing. (This is a hint - it may be superseded by the more 
> authoritative headers that come back if you dereference foo.bar.)
>
>> As a URI client, how should I know when I should get A and when
>> B?
>>     
>
> Because either:
>
>   - you're interested in A for any of the reasons you may be interested 
> in any resource (you're following a link, it's in a search result or 
> whatever). Optionally, you can find out more about A by following the 
> link to B.
>
>   - you're collecting URIs of resources that have particular features. 
> Therefore, you'll look for Bs and then use them to find As.
>   
Honestly, do you think that answers any of the questions I raised?  If B 
describes A, and I am interested in A, then of course I am interested in 
B.  What particular feature would let me decide to fetch either A or B, 
but not both?
>> Since I don't know what I might be missing from either A or B, it
>> seems to suggest that I must always get both A and B.
>>     
>
> No. As an analogy: if an HTML page links to a stylesheet you can choose 
> whether to fetch the stylesheet or not in order to render the page.
>   
No.  This is not a reasonable analogy.  When I receive an HTML page (a 
representation, btw), there exists a context that defines the semantics 
of a stylesheet, and that context in turn helps a UA decide what to do.  
At the HTTP level there is no such context, because I know nothing except 
that the URI denotes a Resource.  If you take the HTML page as an analogy, 
does it mean that I can move the HTML's stylesheet link into the HTTP 
layer as an HTTP Link?  Is this a good design?
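To make concrete what an agent actually has in hand at the HTTP layer, here is a minimal sketch that parses a Link header of the kind Phil gives and applies his "only fetch if you can process the type" rule.  The function names and the set of understood types are hypothetical, and the parser is deliberately simplified (one link per header, no semicolons inside quoted values).  All the agent gets is a target, a relation name and a type hint.

```python
import re

def parse_link(value):
    """Parse a single-link Link header value into (target, params)."""
    match = re.match(r'\s*<([^>]*)>(.*)', value)
    if not match:
        raise ValueError("not a Link header value: %r" % value)
    target, rest = match.groups()
    params = {}
    for part in rest.split(";"):
        name, sep, val = part.partition("=")
        if sep:  # skip empty fragments that carry no name=value pair
            params[name.strip().lower()] = val.strip().strip('"')
    return target, params

def worth_fetching(value, understood_types):
    """True if the link is a describedby link whose type hint we can process."""
    _target, params = parse_link(value)
    return (params.get("rel") == "describedby"
            and params.get("type") in understood_types)

header = '<foo.bar>; rel="describedby"; type="application/thing"'
print(parse_link(header))
print(worth_fetching(header, {"application/thing"}))
```

Nothing in the parsed result says what the description contains or how it differs from A's own representation, which is the point in dispute.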
>> Thus, I cannot
>> help but wonder why they are not put together at A in the first place.
>>     
>
> Because they are often managed by different people, subject to different 
> production and editorial control etc. Take a content production 
> workflow. Often there is a relatively large number of people 
> (journalists, graphic artists etc.) who create the content which is then 
> subject to review by an editor(ial team). There are many situations 
> where the latter creates the metadata concerning resources produced by 
> the former.
>   
Again, you assumed a working context.  This is no different from the 
WebDAV case.  It is invalid as a general mechanism.  Conneg can solve 
the problem too.  You define a format/service, preferably with a URI, 
say "b", for the content of B and deploy it under A.  If a user wants to 
get the B-content, that implicitly suggests they already know "b".  
They then request the "b"-content from A. 
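A rough sketch of what I mean, from the server's side.  The media type "application/b" is a made-up stand-in for the format "b", and the representation bodies are placeholders.  The matching is simplified: it honours q-values and "*/*" but ignores partial wildcards such as "text/*".

```python
# Hypothetical representations deployed under the single resource A.
# "application/b" stands in for the format "b"; it is not a registered type.
REPRESENTATIONS = {
    "text/html": "<html>...A's homepage...</html>",
    "application/b": "B-content describing A",
}

def parse_accept(header):
    """Return (media_type, q) pairs from an Accept header, highest q first."""
    prefs = []
    for item in header.split(","):
        parts = [p.strip() for p in item.split(";")]
        if not parts[0]:
            continue
        q = 1.0
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0
        prefs.append((parts[0].lower(), q))
    # sorted() is stable, so ties keep the order the client wrote them in
    return sorted(prefs, key=lambda p: p[1], reverse=True)

def negotiate(accept_header, available=REPRESENTATIONS):
    """Pick the media type of A that best matches the Accept header, or None."""
    for media_type, q in parse_accept(accept_header):
        if q <= 0:
            continue
        if media_type == "*/*":
            return next(iter(available))  # server default: first listed
        if media_type in available:
            return media_type
    return None

print(negotiate("application/b"))                   # client already knows "b"
print(negotiate("text/html, application/b;q=0.5"))  # ordinary browser
```

A client that knows "b" asks A for it directly; a client that does not simply gets A's default representation, and no second URI is needed.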
> As a little example:
>
> A is the homepage of a bank. It was last updated 2 hours ago.
> B tells you that A is the homepage of a bank. B was last updated 2 
> months ago.
>
> Current financial crisis notwithstanding, both are accurate, both have 
> been updated in a time frame that suggests they are actively managed.
>   
I don't get it.  Shouldn't A's representation tell me that A is the 
homepage of a bank?  Why do I need B to tell me the same thing?  And why 
would I be interested in when B was updated? 
>> The same goes for MGET: how does a user know when to GET and when to MGET? 
>> PROPFIND is different because when people use it, they already 
>> know that the resource is defined by WebDAV.   Hence, these kinds of 
>> ideas only work when the client already has some knowledge about A.  
>>     
>
> I think you're getting into a bit of a tunnel here. How do you know 
> about anything on the Web? How do you discover anything? All the 
> mechanisms under discussion have their As and Bs (resources and 
> descriptions thereof). The current effort is all about trying to find 
> some uniformity of approach.
>   
Yes, my assumption is that you don't know anything about a Resource in 
the first place.  Thus, given a resource's URI, if I am a specialized 
agent, say an RDF agent, I would request something that I can understand, 
such as RDF/XML, n3, etc.  On the other hand, if I am a general agent, 
such as a human, I would (1) conduct implicit Conneg by requesting 
something that I prefer, such as HTML, or other things like image, 
audio, etc., or (2) conduct transparent Conneg to ask what kinds of 
services/content-types the resource offers so that I can choose.  If MIME 
types are URIzed, then a general agent such as a human can follow each of 
the MIME-URIs to understand which is the most appropriate for its need 
and make its choice accordingly.

This is not, as you said, "how can you discover anything?".  It is 
exactly the opposite: it allows you to discover everything. 
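The two agent behaviours above can be sketched from the client's side as follows.  The preference lists, the q-values and the "text/n3" media type are my assumptions for illustration.  How the list of offered types reaches the agent in the transparent case is left open here; the sketch only shows the choice once that list is known.

```python
def accept_header(preferences):
    """Build an Accept header value from (media_type, q) pairs."""
    return ", ".join(
        media_type if q == 1.0 else f"{media_type};q={q:g}"
        for media_type, q in preferences
    )

# (1) Implicit Conneg: each agent states up front what it prefers.
RDF_AGENT = [("application/rdf+xml", 1.0), ("text/n3", 0.8)]  # specialized agent
BROWSER = [("text/html", 1.0), ("image/*", 0.5)]              # general agent

def choose(offered, understood):
    """(2) Transparent-style choice: first offered type the agent understands."""
    for media_type in offered:
        if media_type in understood:
            return media_type
    return None

print(accept_header(RDF_AGENT))
print(accept_header(BROWSER))
print(choose(["text/html", "application/rdf+xml"],
             {"application/rdf+xml", "text/n3"}))
```

In neither case does the agent need prior knowledge of the resource; it only needs to know what it can itself understand.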
>> But, to propose it as a general framework for the Web, it won't work.  
>> At the most fundamental level, we only know three things about the Web 
>> -- URI, Representation, Resource.  The concept of metadata is 
>> ill-conceived at this level because, as data about data, to say metadata 
>> implies that we already know something about the resource we try to 
>> access, a piece of knowledge that we don't have.
>>     
>
> But even a UA doesn't live in a vacuum. It responds to input, usually 
> human, sometimes automated. Either way, it is performing a task and will 
> have a variety of parameters. Metadata should make its task easier.
>
>   
>> There are a lot of implicit assumptions under the so-called "uniform 
>> access to metadata/descriptor" approach.  It either requires the 
>> definition of IR or a one-on-one relationship between Resource and 
>> Representation.
>>     
>
> That depends what the metadata says. If it says "this page is generated 
> dynamically to suit a wide variety of devices" that says quite the 
> opposite to your conjecture - namely that there are many different 
> representations available at the described URI.
>   
If you can describe your scenario without invoking the word "metadata" 
or anything similar, then you will have presented a valid case.  This is 
the very question that I asked in the first place.  Tell me, given a 
resource or data A, what is its meta-Resource or its metadata B?  Again, 
as I have suggested for the definition of IR, let's use Quine's 
"ontological commitment" as a criterion to guard ourselves against 
hypostasizing or reifying things for a particular theory.

Define Data and Metadata in an ontology so that data and metadata are 
disjoint, because only then can everyone (both providers and 
consumers) follow the definition in practice.

Xiaoshu
> Others, more qualified than me, have answered your remaining issues.
>
> Phil.
>
>> As the former implies that non-IR cannot have a
>> representation, it makes the "descriptor/metadata" necessary.  The knock 
>> on this assumption is that the definition of IR is impossible to work with.
>>
>> The 1-on-1 relationship gives rise to the so-called "legacy resource".  
>> But the word "legacy resource" is wrongly named too.  In the Web, there 
>> might be such a thing as a "legacy representation" but there should NOT be 
>> any such thing as a "legacy resource", because the latter implies that the 
>> Resource is closed and no more semantics will be added.
>> But the so-called "metadata/descriptor" problems can be solved by using 
>> HTTP Content Negotiation, making any other proposal redundant. The 
>> actual issue, as I have discussed in [1], is the incomplete syntax 
>> of the URI specs, which currently have no syntactic notation for 
>> the other two foundation objects in the Web, i.e., URI and 
>> Representation.  Once we supplement the URI spec with such syntactic sugar, 
>> such as the one I proposed in [2], we can have a uniform approach 
>> to (1) describing URIs along with standard resources and (2) 
>> systematically discovering the possible representation types, i.e., 
>> Content-Type/MIME types, associated with a Resource (either URI or 
>> standard Resource). As a particular content-type is equivalent to a 
>> particular *service*, the approach in effect establishes a 
>> uniform approach to service discovery.
>> What is required is to define Content-Type in URI.  Once we have that,
>> not only Data/Resource are linked but also DataType/Service.  Best of 
>> all, it works within the conceptualizations defined in AWWW, and does 
>> not require any other ambiguous conceptualization, such as IR, 
>> metadata, description, etc.
>>
>> 1. http://dfdf.inesc-id.pt/misc/man/http.html
>> 2. http://dfdf.inesc-id.pt/tr/uri-issues
>>
>> Xiaoshu
>>
>> Eran Hammer-Lahav wrote:
>>     
>>> Both of which are included in my analysis [1] for the discovery proposal.
>>>
>>> EHL
>>>
>>> [1] http://tools.ietf.org/html/draft-hammer-discovery-02#appendix-B.2
>>>
>>>  
>>>       
>>>> -----Original Message-----
>>>> From: Julian Reschke [mailto:julian.reschke@gmx.de]
>>>> Sent: Tuesday, February 24, 2009 1:45 AM
>>>> To: Patrick.Stickler@nokia.com
>>>> Cc: Eran Hammer-Lahav; jar@creativecommons.org; connolly@w3.org;
>>>> www-tag@w3.org
>>>> Subject: Re: Uniform access to metadata: XRD use case.
>>>>
>>>> Patrick.Stickler@nokia.com wrote:
>>>>    
>>>>         
>>>>> ...
>>>>> Agents which want to deal with authoritative metadata use
>>>>>       
>>>>>           
>>>> MGET/MPUT/etc.
>>>>    
>>>>         
>>>>> ...
>>>>>       
>>>>>           
>>>> Same with PROPFIND and PROPPATCH, btw.
>>>>
>>>> BR, Julian
>>>>     
>>>>         
>>>   
>>>       
>>     
>
>
Received on Wednesday, 25 February 2009 11:45:29 UTC