Re: Uniform access to metadata: XRD use case. from Patrick.Stickler@nokia.com on 2009-02-25 (www-tag@w3.org from February 2009)

From: <Patrick.Stickler@nokia.com>
Date: Wed, 25 Feb 2009 15:51:48 +0100
To: <wangxiao@musc.edu>
CC: <eran@hueniverse.com>, <julian.reschke@gmx.de>, <jar@creativecommons.org>, <connolly@w3.org>, <www-tag@w3.org>
Message-ID: <C5CB27A4.DF74%patrick.stickler@nokia.com>
On 2009-02-25 11:40, "ext Xiaoshu Wang" <wangxiao@musc.edu> wrote:

>
>
>
> Patrick.Stickler@nokia.com wrote:
>>
>> On 2009-02-25 02:00, "ext Xiaoshu Wang" <wangxiao@musc.edu> wrote:
>>
>>
>>> The critical flaw of all the proposed approach is that the definition of
>>> "metadata/descriptor" is ambiguous and hence useless in practice.  Take
>>> the "describedBy" relations for example.  Here I quote from Eran's link.
>>>
>>>       The relationship A "describedby" B asserts that resource B
>>>       provides a description of resource A. There are no constraints on
>>>       the format or representation of either A or B, neither are there
>>>       any further constraints on either resource.
>>>
>>> As a URI owner, I don't know what kind of stuff that I should put in A
>>> or B.  As a URI client, how should I know when should I get A and when
>>> B?  Since I don't know what I might be missing from either A or B, it
>>> seems to suggest that I must always get both A and B. Thus, I cannot
>>> help but wondering why they are not put together at A at the first place.
>>>
>>> The same goes for MGET, how a user knows when to GET and when to MGET?
>>>
>>
>> If one wants a representation of the resource, use GET.
>> If one wants a description of the resource, use MGET.
>>
> This doesn't answer the question at all.  For me, a representation must
> be describing something.

Your definition of representation seems overly narrow.

If a given URI denotes a tree, and a 200 response to an HTTP GET request for
that URI returns an image of the tree (i.e. a representation of the tree),
does that image "describe" the tree? One may be able to observe
characteristics of the tree by viewing the image, but whether or not the
image is a "description" of the tree is, I think, a matter of debate, and in
any case, outside the scope of the protocols in question.

> Hence, I cannot say if something is a
> Representation but not Description.

It's the specification of the protocol that says what is returned (or should
be).

A successful response to a GET request can be presumed to be a
representation.

A successful response to an MGET request can be presumed to be a
description.
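
As a minimal sketch of that distinction (Python, standard library only): the
URI below is hypothetical, and the requests are only built, never sent, since
a server would need to support URIQA for MGET to succeed. The only thing that
differs between the two requests is the method.

```python
from urllib.request import Request

# Hypothetical URI, used purely for illustration.
TREE_URI = "http://example.org/trees/oak42"


def build_request(uri: str, want_description: bool) -> Request:
    """Build (but do not send) a request for a representation or a description.

    GET asks for a representation of the resource denoted by the URI;
    MGET (the URIQA method) asks for a description of it.
    """
    method = "MGET" if want_description else "GET"
    return Request(uri, method=method)


representation_request = build_request(TREE_URI, want_description=False)
description_request = build_request(TREE_URI, want_description=True)

# Same URI in both cases; only the method changes.
print(representation_request.get_method(), representation_request.full_url)
print(description_request.get_method(), description_request.full_url)
```

The client needs nothing beyond the URI and the choice of method.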

>> There is some potential conceptual overlap between representations and
>> descriptions for certain kinds of resources, but the distinction should be
>> reasonably intuitive.
>>
>  I don't think any protocol based on intuition is practical.

Neither HTTP nor URIQA is based on intuition. Some concepts are, however,
for most folks, fairly intuitive. But the specs say how software should
behave, and what it should expect, when using those protocols.

>  The
> concept of IR seems intuitive, but it doesn't work (at least not for me).
>>
>>> PROFOUND is different because when people use it, they have already
>>> known that the resources is defined by WebDAV.   Hence, these kind of
>>> ideas only works when the client already have some knowledge about A.
>>> But, to propose it as a general framework for the Web, it won't work.
>>> At the most fundamental level, we only know three things about the Web
>>> -- URI, Representation, Resource.  The concept of metadata is
>>> ill-conceived at this level because as data about data, to say metadata
>>> implies that we already know something about the resource we tries to
>>> access, a piece of knowledge that we don't have.
>>>
>>
>> For URIQA, all that is needed is the URI. After all, you have to be able to
>> name something to communicate effectively about it.
>>
>> URIQA does not presume that any representation exists. It neither posits nor
>> requires an "Information Resource".
>>
>> It is perfectly complementary to the web.
>>
>> GET/PUT/etc. deal with representations.
>> MGET/MPUT/etc. deal with descriptions.
>>
>> If you have a URI, you can use it to either get representations or
>> descriptions, and if you don't know anything about what resource the URI
>> denotes, you might first want to get the description.
>>
>>
>>> There are a lot of implicit assumptions under the so-called "uniform
>>> access to metadata/descriptor" approach.  It either requires the
>>> definition of IR or a one-on-one relationship between Resource and
>>> Representation.  As the former implies that non-IR cannot have a
>>> representation, it makes the "descriptor/metadata" necessary.  The knock
>>> on this assumption is that the definition of IR is impossible to work with.
>>>
>>
>> URIQA makes none of those assumptions.
>>
> Really? Try to define the distinction between your terms "description"
> and "representation", see what you must come out.

A representation is what you (should) get from a 200 response to an HTTP GET
request. It can be expected to reflect, in some manner, the state of the
resource denoted by the request URI. Whether the representation returned is
useful or meaningful to the recipient (either human or machine), or whether
it "describes" the resource in any discernible way, is outside the scope of
the HTTP spec and lies entirely in the domain of information publication and
consumption -- i.e. the social relationship between the publisher of the
representation and consumers of the representation.

A description is what you (should) get from a 200 response to a URIQA MGET
request. It can be expected to correspond to a graph of RDF statements,
serialized in some manner (RDF/XML by default) where the particular
statements of interest are those in which the request URI occurs as the
subject (though there can be other statements in the graph in which the
subject does not correspond to the request URI). It is intended to be
interpreted by the recipient (usually a machine) in terms of the RDF model
theory.
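
To illustrate "statements in which the request URI occurs as the subject",
here is a rough sketch in Python. The graph is modelled as plain
(subject, predicate, object) tuples rather than with a real RDF library, and
the example.org URIs and the values are invented for illustration.

```python
from typing import List, Tuple

# A triple is (subject, predicate, object); a plain-tuple stand-in for RDF.
Triple = Tuple[str, str, str]


def statements_about(graph: List[Triple], request_uri: str) -> List[Triple]:
    """Return the statements whose subject is the request URI."""
    return [triple for triple in graph if triple[0] == request_uri]


# Hypothetical description graph, as it might look once parsed.
REQUEST_URI = "http://example.org/trees/oak42"
graph: List[Triple] = [
    (REQUEST_URI, "http://purl.org/dc/elements/1.1/title", "Old oak"),
    (REQUEST_URI, "http://example.org/terms/heightInMetres", "23"),
    # A statement about some other resource may also be present in the graph.
    ("http://example.org/people/alice", "http://xmlns.com/foaf/0.1/name", "Alice"),
]

of_interest = statements_about(graph, REQUEST_URI)
for subject, predicate, obj in of_interest:
    print(subject, predicate, obj)
```

The third statement stays in the graph but is not among the statements of
primary interest to the requesting agent.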

Pretty distinct to me.


HTTP GET may return a serialization of an RDF graph.
URIQA MGET always returns a serialization of an RDF graph.

Note that a description, returned by URIQA MGET, is a specific subtype of
representation, returned by HTTP GET, and it is certainly possible for
representations of that description to be accessible via HTTP GET. So yes, a
representation can certainly describe a resource. But not all
representations accessible via HTTP GET will be as explicitly descriptive as
an RDF graph. (sorry if that is confusing, reading it several times may be
necessary ;-)



>>> The 1-on-1 relationship gives rise to the so-called "legacy resource".
>>> But the word "legacy resource" is wrongly named too.  In the Web, there
>>> might be something as "legacy representation" but there should NOT be
>>> such thing as "legacy resource" because the latter implies that the
>>> Resource is closed and no more semantics will be added.
>>>
>>> But the so-called "metadata/descriptor" problems can be solved by using
>>> HTTP Content Negotiation, making any other proposal a redundant one.
>>>
>>
>> Actually, it can't. As noted on http://sw.nokia.com/uriqa/URIQA.html:
>>
> The link returns a 404, so I don't know if it suppose to return
> something meaningful or it is a metaphor.

Perhaps you are including the colon at the end, which is not part of the URI
(sorry). I.e. try

http://sw.nokia.com/uriqa/URIQA.html

>> --
>> Why not use a MIME type and content negotiation to request a description?
>>
>> Content negotiation is designed to allow agents to select from among a set
>> of alternate encodings. The distinction between a resource description and
>> (other kind of) resource representations is not based on any distinction in
>> encoding.
> Nope.  That is perhaps the intention that conneg is designed.  But I
> don't think that is the way it should be understood.  Content-type might
> be signal a special encoding, but language, for instance, is also part
> of Conneg.

That is true, and the wording is perhaps imperfect, but the point made is
valid.

Content negotiation is intended to provide access to alternative
representations where the presumption is that those representations convey,
as much as is possible given the limitations of their form of expression,
the same essential body of information.

You may wish to use content negotiation for something else, but its
original intended use, and actual use, is pretty well established.
Exploiting it to do something else is certainly possible, but not
necessarily optimal as a generalized solution.

>> In fact, a given description (which is itself a resource) may have
>> several available encodings (RDF/XML, XTM, N3, etc.). Thus, if you use
>> content negotiation to indicate that you want a description, you can't use
>> it to indicate the preferred encoding of the description (if/when other
>> encodings than RDF/XML are available).
>> --
>>
> What is the implication of your statement. That RDF (or its sort) is
> description but others are not?

No. I didn't mean that at all.

> An HTML or XML doc definitely describes
> somethings.

As noted above, representations may correspond to descriptions, but may not
be as explicitly or formally descriptive as a serialization of an RDF graph.

> If you URIDL  them to an RDF, it doesn't change the nature
> of its content.

One can represent a specific RDF graph in a number of different ways, and
content negotiation can be effectively used as intended to request
particular variant representations of that graph.

If content negotiation is (mis)used to request an explicit description of a
resource, then it is not available to request variant representations of
that description (at least not without potentially doubling (or more) the
number of MIME types).
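
A rough sketch of that separation of concerns, in Python: the method selects
"description", which leaves the Accept header free to select the encoding.
The URI is hypothetical, the requests are built but not sent, and whether a
given server offers any encoding other than RDF/XML is an assumption.

```python
from urllib.request import Request

# Hypothetical URI, used purely for illustration.
RESOURCE_URI = "http://example.org/trees/oak42"


def build_description_request(uri: str, encoding: str) -> Request:
    """Build (but do not send) an MGET request for a description.

    MGET says "I want a description"; Accept says which variant
    encoding of that description is preferred.
    """
    return Request(uri, headers={"Accept": encoding}, method="MGET")


rdf_xml_request = build_description_request(RESOURCE_URI, "application/rdf+xml")
n3_request = build_description_request(RESOURCE_URI, "text/n3")

for request in (rdf_xml_request, n3_request):
    print(request.get_method(), request.full_url, request.get_header("Accept"))
```

If the Accept header had to carry the "description" request as well, each
encoding would need a second MIME type to mean "a description, in this
encoding".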

>> Content negotiation can be used as intended in conjunction with URIQA to
>> request particular variant encodings of a description.
>>
> Again, the definition of "description"?

See above.

>>> The
>>> actual issue, as I have discussed in [1], is about the incomplete syntax
>>> of the URI specs, which  currently does not have a syntactic notation
>>> the other two foundation objects in the Web, i.e., URI and
>>> Representation.  Once we supplement URI spec with those syntactic sugar,
>>> such as the one I proposed in [2], then, we can have a uniform approach
>>> to (1) describe URI along with standard resources and (2) to
>>> systematically discover the possible representation types, i.e.,
>>> Content-Type/MIME types, associated with a Resource (either URI or
>>> standard Resource). As a particular content-type is equivalent of a
>>> particular *service*, hence, the approach in effect establishes a
>>> uniformed approach to service discovery.
>>>
>>> What is required is to define Content-Type in URI.  Once we have these,
>>> not only Data/Resource are linked but DataType/Service.  The best of
>>> all, it works within the conceptualizations defined in AWWW, and does
>>> not require any other ambiguous conceptualization, such as, IR,
>>> metadata, and description, etc.
>>>
>>
>> I consider that one of the strengths of the semantic web layer is that it is
>> agnostic about the syntactic structure of URIs. I also think that
>> syntactically binding the URI of a resource and the URI(s) of its
>> representation(s) or description(s) is unnecessary, and would be overly
>> cumbersome in practice.
>>
> Of course.  But anyone who words with the Web should know that the Web
> is consisted of these three kinds of things.

Anyone who is familiar with the standards which serve as the foundation for
the web, and semantic web, knows what things are defined as relevant to
software applications and the scope of those definitions.

(granted, no spec or standard is perfect, but things are defined a lot more
clearly and precisely than the definitions you seem to be assuming for these
particular terms)


> Hence, giving these three
> concept some syntactic sugar doesn't violate the URI's opacity
> principle.

I'm sorry, but that statement is self-contradicting. If the URI is opaque
for a given application, then syntax is irrelevant, hence there cannot be
any syntactic sugar which is meaningful to that application.

Syntax which may be relevant to the web layer is irrelevant to the semantic
web layer.

The interface between the web and semantic web layers is a shared set of
URIs with consistent denotation, and a means for semantic web agents to
interact with representations of descriptions accessible via those URIs
using web protocols.

The web layer is concerned with representations of resources.
The semantic web layer is concerned with descriptions of resources.

A description of a resource is a kind of representation of that resource,
but with a formal significance to the semantic web layer, and therefore it
is optimal if semantic web agents can easily access those particular
representations which correspond to descriptions, or from which descriptions
can be extracted, where such descriptions can be interpreted as RDF graphs
according to the RDF model theory.

The less bandwidth or processing needed to obtain such descriptions the
better.

URIQA is designed to provide optimal access to explicit
descriptions meaningful to semantic web agents with the lowest bandwidth and
processing overhead possible and the least amount of specialized knowledge
(nothing more than the URI and which method to use).

> When I say syntactic sugar, I mean that it is not absolutely
> necessary.  But the benefit of defining it is for convenience in practice.
>

The sheer number of software applications which would need to be modified to
consistently support such a special URI notation is staggering. URI opacity
is one of the most important principles of the semantic web, for the very
reason that it allows most software and content in the web layer to remain
unchanged and agnostic, while enabling us to make explicit statements about
any resources denoted by any form of URI.

Regards,

Patrick


> Xiaoshu
>> Patrick
>>
>>
>>> 1. http://dfdf.inesc-id.pt/misc/man/http.html
>>> 2. http://dfdf.inesc-id.pt/tr/uri-issues
>>>
>>> Xiaoshu
>>>
>>> Eran Hammer-Lahav wrote:
>>>
>>>> Both of which are included in my analysis [1] for the discovery proposal.
>>>>
>>>> EHL
>>>>
>>>> [1] http://tools.ietf.org/html/draft-hammer-discovery-02#appendix-B.2
>>>>
>>>>
>>>>
>>>>> -----Original Message-----
>>>>> From: Julian Reschke [mailto:julian.reschke@gmx.de]
>>>>> Sent: Tuesday, February 24, 2009 1:45 AM
>>>>> To: Patrick.Stickler@nokia.com
>>>>> Cc: Eran Hammer-Lahav; jar@creativecommons.org; connolly@w3.org; www-
>>>>> tag@w3.org
>>>>> Subject: Re: Uniform access to metadata: XRD use case.
>>>>>
>>>>> Patrick.Stickler@nokia.com wrote:
>>>>>
>>>>>
>>>>>> ...
>>>>>> Agents which want to deal with authoritative metadata use
>>>>>>
>>>>>>
>>>>> MGET/MPUT/etc.
>>>>>
>>>>>
>>>>>> ...
>>>>>>
>>>>>>
>>>>> Same with PROPFIND and PROPPATCH, btw.
>>>>>
>>>>> BR, Julian
>>>>>
>>>>>
>>>>
>>
>>
>>
Received on Wednesday, 25 February 2009 14:50:07 UTC