Re: Last minute input to discussion re 'on the boundaries of content negotiation in the context of the Web of Data' from Jonathan Rees on 2009-02-20 (www-tag@w3.org from February 2009)

From: Jonathan Rees <jar@creativecommons.org>
Date: Fri, 20 Feb 2009 08:36:12 -0500
To: Michael Hausenblas <michael.hausenblas@deri.org>
Cc: "www-tag@w3.org" <www-tag@w3.org>
Message-ID: <760bcb2a0902200536l3241827rfde7f7a8d559e3e9@mail.gmail.com>
OK, we have been meandering a bit here. You started with a request for
a piece of writing providing advice on the use of entity tags and
content negotiation. TAG advice on this subject has been consistent
over the years (as Ian Hickson says, you can predict what the TAG will
say), but the absence of a written document is a problem that I have
acknowledged.

You state your real problem, which is a need for the community to
converge on a single protocol to be used to communicate "definitions"
of URIs. This, too, ought to be addressed in a piece of writing. (It
is not completely clear that this is TAG business.) A variety of
solutions have been proposed, including one for # URIs from Tim, some
guidance from POWDER, and a protocol articulated by David Booth. There
is a de facto standard (see below). I have my own ideas and expect to
do more work after the IETF Description Resource Discovery Protocol is
further along. "Uniform access to metadata" is an active TAG topic and
I trust you've seen the memos on the subject.

You have connected your problem to the CN question by asking whether
you are advised to use CN as part of a particular solution, which you
propose. The answer is: fine, but only if the representations are
"similar enough" that one can be substituted for the other (roughly
speaking). An example is FOAF (although personally I find its HTML and
RDF representations to be not "similar enough" and I get angry when I
have to take time to hunt around to find the RDF version; but the idea
is plausible).

I had thought the problem of getting definitions for # URIs was
solved. Is it that you're worried that the use of # URIs in RDF, which
has been common practice since RDF was first introduced, is not
consistent with media type specifications? Or that you have a powerful
need for non-RDF representations that do not serve the same purpose as
RDF representations?

(a few comments)

On Thu, Feb 19, 2009 at 1:17 PM, Michael Hausenblas
<michael.hausenblas@deri.org> wrote:
>
> Jonathan,
>
>> Do I understand you as making a definition thing = non-information resource?
>
> Basically, yes.

May I recommend you use "thing" consistently with the way it's used in
OWL. Documents are things too.

>> Are you talking about all URIs or just http: and https: URIs?
>
> As a good Web citizen, of course HTTP URIs in the first place (unsure about
> https scheme - can you provide me with pointers re discussions around this,
> please?)

Not offhand. You might particularly want to look at data:, urn:lsid:,
mailto:, and info:. Or not.

>> By "the resource" do you mean the resource named by (a) the fragmentless URI,
>> or the resource named by (b) the fragmentful URI?
>
> My idea was (put in simple words): let's use fragmentful URIs from now on to
> identify non-information resources, ahm, things. As there are already used
> conventions/specs (for example HTML fragments, RDF fragments, etc.) one
> needs rules to explicitly state exceptions to that default.

This advice has been given, but a lot of people don't follow it. If
you're looking for a shot at universal adoption, I doubt this will do
it.

Also I see nothing wrong with using a # URI for an IR.
<http://example.com/reading-list#moby-dick> rdf:type foaf:Document.

>> Same question as above re "the resource". Also, if specs or media
>> types say inconsistent things about the referent of the
>> fragmentful URI, what then?
>
> Honestly, I don't understand the question. Can you rephrase, please?

If U#F has representations X and Y with media types A and B, and media
type A implies U#F "identifies" one kind of thing and media type B
implies U#F "identifies" a different kind of thing, which one wins, if
either? Or do we decide to ignore the media type specs? Or is it just
an error?
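
To make the question concrete, here is a minimal sketch that only
detects the situation; it does not answer which reading wins. The
function names and the "kind of thing" labels are hypothetical, chosen
for illustration, and are not taken from any media type specification.

```python
def implied_referents(representations):
    """Group media types by the kind of thing each implies U#F identifies.

    `representations` maps a media type to an illustrative label for the
    kind of thing that media type's rules would imply for the fragment.
    """
    kinds = {}
    for media_type, kind in representations.items():
        kinds.setdefault(kind, []).append(media_type)
    return kinds


def has_conflict(representations):
    """True when the media types imply more than one kind of referent."""
    return len(implied_referents(representations)) > 1


# Hypothetical example: two representations served for the same U.
example = {
    "text/html": "an element within the HTML document",
    "application/rdf+xml": "whatever the RDF says, e.g. a book",
}
print(has_conflict(example))
```

Whether a detected conflict should be resolved in favour of one media
type, ignored, or treated as an error is exactly the open question.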

>
>>> Axiom 4) An authoritative party can explicitly state fragment identifier
>>> semantics consistency by using POWDER's describedby property as of [5] along
>>> with HTTP Link: header as of [6] or by embedding RDF as of [7].
>>
>> You do not mean to be exclusive here right?
>
> Not exclusive, but I get more and more the feeling that people need concrete
> advice (or is it just me?) on certain issues.

You have to be clear, as the DRD protocol is: You can get information
about something from any source. The purpose of DRD is not to limit
anyone, or to say what constitutes authoritative information, but
rather to arbitrarily select a single discovery method so that as many
people as possible are doing the same thing and communication takes place.
Authority is completely orthogonal.

>
>>> Please note the following: The intention is to keep the current definition
>>> of IR as of the AWWW1
>>
>> You are aware that there is some debate around the desirability of doing so?
>
> Yes.
>
>> Could one phrase this as a defaulting mechanism: In the absence of evidence
>> to the contrary, one should assume that a fragmentless URI "identifies" an
>> information resource?  You are aware that this runs afoul of the
>> established practice of communicating non-IR-ness in "description resources"
>> obtained in other ways - do you mean to ask people to change what they do?
>
> So far I have not explicitly talked about fragmentless URIs, my bad. Yes, my
> assumption would be that fragmentless URI per default would then identify
> IRs unless one explicitly says this is not the case (again, similar as of
> axiom 3 and 4). When you are talking about 'established practice': how big
> would the 'collateral damage' be, in your opinion?

Quite large, IMO, but even if it weren't there's no turning back. A
lot of people would either be pissed off, or would ignore you/us.
Basically the TAG has already said that http: URIs don't necessarily
"identify" IRs, and so there has been major investment (time, $,
reputation) based on the assumption that it's OK. There has also been
investment from parties who never thought there was a problem in the
first place and didn't need the TAG to give advice.

>
>>> Further, axiom  allows (currently) two ways to 'announce' exceptions,
>>> on the HTTP layer or on the representation layer - this is open to
>>> discussion and should/will be extended.
>>
>> So communications from the URI owner along certain paths are operationally
>> to be treated in one way, while communications along other paths are
>> to be treated in another?
>
> Not sure if I understand you right, here. All I am saying is that *I*
> identify/propose two mechanisms to deploy POWDER's describedby property.
> There might be more to add to this list; these can be seen as good practices
> - is this against any doctrine, here?

If you specify a certain protocol for getting some information, that is
like DRD. If you say that the failure to get information by following that
protocol has some implication, such as non-IR-ness, that is a closed
world practice and therefore not a good idea. Any time you have a notion
of exceptions, that implies defaulting, and defaulting implies a closed
world.
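
A small sketch of the distinction, assuming a hypothetical dictionary
`statements` that records only what has actually been asserted about a
URI (True for IR, False for non-IR). The function names are mine, for
illustration only.

```python
def closed_world_reading(statements, uri, default):
    """Defaulting: a missing statement is turned into a conclusion."""
    return statements.get(uri, default)


def open_world_reading(statements, uri):
    """No defaulting: a missing statement stays unknown (None)."""
    return statements.get(uri)


# Hypothetical data: only one URI has an explicit statement.
statements = {"http://example.com/doc": True}
unknown_uri = "http://example.com/other"

print(closed_world_reading(statements, unknown_uri, default=True))
print(open_world_reading(statements, unknown_uri))
```

Under the closed world reading, failing to find a statement yields the
default as if it were a fact; under the open world reading it yields
nothing, and a statement found later by any other route cannot
contradict it.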

>
>> Can you say more about what practical (non-philosophical) problem
>> you're up against here?
>
> Problem: what is the definition of a non-information resource? AWWW1 doesn't
> talk about it, and I was not able to spot a definition in other docs - have
> you? Looking at [1], I guess it is a rather hard task, currently ;)

(You don't define resources, you define URIs...)

If this is your problem I'm not sure what is failing, in practice,
with the de facto standard of (a) for fragmentful URIs, strip the
fragment, do a GET (specifying your personal conneg preferences) and
look in the representation you get back, (b) for others, do a GET, 200
implies IR (assuming a cooperating server), 303 leads you to
definition as before.
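
The convention can be sketched roughly as follows. This is only an
illustration under stated assumptions: `http_get` is a hypothetical
callable that performs a single GET without following redirects and
returns (status, location, body), and the default Accept value is just
one possible conneg preference.

```python
from urllib.parse import urldefrag


def find_definition(uri, http_get, accept="application/rdf+xml"):
    """Return (looks_like_ir, source_url, body) for `uri`.

    looks_like_ir is True, or None when the convention says nothing.
    """
    base, fragment = urldefrag(uri)
    if fragment:
        # (a) Fragmentful URI: strip the fragment, GET the rest, and
        # look in the representation that comes back.
        status, _location, body = http_get(base, accept)
        return (None, base, body if status == 200 else None)

    # (b) Fragmentless URI: GET it directly.
    status, location, body = http_get(uri, accept)
    if status == 200:
        # 200 implies IR, assuming a cooperating server.
        return (True, uri, body)
    if status == 303 and location:
        # 303 leads to a definition; fetch it as in case (a).
        status2, _location2, described = http_get(location, accept)
        return (None, location, described if status2 == 200 else None)
    return (None, None, None)


# Stub server for illustration; all URLs and bodies are made up.
PAGES = {
    "http://example.com/reading-list": (200, None, "RDF about #moby-dick"),
    "http://example.com/id/alice": (303, "http://example.com/doc/alice", ""),
    "http://example.com/doc/alice": (200, None, "RDF about alice"),
}


def fake_get(url, accept):
    return PAGES.get(url, (404, None, ""))


print(find_definition("http://example.com/reading-list#moby-dick", fake_get))
print(find_definition("http://example.com/id/alice", fake_get))
```

Note that the sketch never concludes non-IR-ness from a failed lookup;
it only reports what was found.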

For me the bigger problem is definitions of URIs for IRs, not for
non-IRs. And since the IR idea is so vague, any solution that doesn't
mention "IR" is much stronger than a solution that does. Thus the word
"uniform" in the "uniform access" conversation.

Best
Jonathan

>
> Cheers,
>      Michael
>
> [1] http://esw.w3.org/topic/AwwswVocabulary
>
> --
> Dr. Michael Hausenblas
> DERI - Digital Enterprise Research Institute
> National University of Ireland, Lower Dangan,
> Galway, Ireland, Europe
> Tel. +353 91 495730
> http://sw-app.org/about.html
> http://webofdata.wordpress.com/
>
Received on Friday, 20 February 2009 13:36:54 UTC