Re: Change Proposal for HttpRange-14 from Niklas Lindström on 2012-03-25 (public-lod@w3.org from March 2012)

From: Niklas Lindström <lindstream@gmail.com>
Date: Sun, 25 Mar 2012 19:07:41 +0200
To: Michael Brunnbauer <brunni@netestate.de>
Cc: Jeni Tennison <jeni@jenitennison.com>, James Leigh <james@3roundstones.com>, public-lod community <public-lod@w3.org>
Message-ID: <CADjV5je_p1dkECOwQ2mGJ_A7wzzsjDQjGeD=soXjm4A-1rAPKQ@mail.gmail.com>
Hello,

On Sun, Mar 25, 2012 at 3:19 PM, Michael Brunnbauer <brunni@netestate.de> wrote:
>
> On Sun, Mar 25, 2012 at 12:31:18PM +0100, Jeni Tennison wrote:
>
>> > You are solving the problem by pretending that the IRs are not there then
>> > the publisher does not make the distinction between IR and NIR.
>>
>> No, I am just proposing stopping pretending that the NIR is not there, which is what is mandated by the current httpRange-14 design.
>
> If - like Hugh suggested - httpRange-14 is really stopping people inside the
> community from delivering solutions and those people are willing to sacrifice
> the IRs (although I find both of these hard to believe) - then you have good
> reasons to go ahead.
>
> But this makes me think about what those same people will be unable to deliver
> because they cannot make the default IR assumption any more (as I said, the
> rest of the world will probably go on making it).
>
> Perhaps the default IR assumption can be saved by saying that a 200 URI <X> is
> an IR as long as we don't find some triple at X that suggests otherwise. Why
> not a NIR class? If the concept of IRs/NIRs is sufficiently unambiguous to talk
> about it in natural language (I think it is), we can talk about it in RDF.

I find myself a bit concerned about what the proposed change implies
about the (currently direct) correspondence between an information
resource and data sent with a 200 OK. I wonder if there is merit in
the following reasoning:

Assume that I perform an HTTP GET on <x> and receive data, interpret
that and find the assertion:

    <x> cc:license <cc-by> .

Now I know:

1. The publisher of <x> sends data representing *or* describing <x>
upon request.

2. The publisher of <x> licenses it under <cc-by>.

3. The domain of cc:license is "a potentially copyrightable work".
Thus, the publisher of <x> says that <x> is such a work.

4. *If* I can assume that:

  a) such a work is represented by data, and
  b) something that can be represented as data is an information resource, and
  c) when *an information resource* is retrieved using an HTTP GET
responding with 200 OK, I have received *some of* its representation
data.

*Then* I can conclude that I'm allowed to use the data I've just received
according to <cc-by>.
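
A minimal sketch of the reasoning in steps 1-4, in Python. Everything
here is hypothetical naming for illustration (the Assumptions class,
the function, plain string triples); it only shows that the conclusion
depends on all three assumptions holding, not how a real client would
parse RDF.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Tuple

Triple = Tuple[str, str, str]
CC_LICENSE = "cc:license"  # property name as written in step 3


@dataclass(frozen=True)
class Assumptions:
    """The three assumptions of step 4 (hypothetical names)."""
    work_is_represented_by_data: bool               # 4a
    representable_means_information_resource: bool  # 4b
    ok_response_carries_representation: bool        # 4c


def license_for_received_data(
    requested: str,
    status: int,
    triples: Iterable[Triple],
    assumptions: Assumptions,
) -> Optional[str]:
    """Return the license that may be applied to the received data, else None."""
    if status != 200:
        return None
    # Step 2: look for a license assertion about the requested resource itself.
    licenses = [o for (s, p, o) in triples if s == requested and p == CC_LICENSE]
    if not licenses:
        return None
    # Step 4: the conclusion only follows if 4a, 4b and 4c all hold.
    if not (
        assumptions.work_is_represented_by_data
        and assumptions.representable_means_information_resource
        and assumptions.ok_response_carries_representation
    ):
        return None
    return licenses[0]
```

If 4c is dropped (the third flag set to False), the function returns
None even though the license triple is present, which is exactly the
concern raised here.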

The question is, can I assume this given the change proposal? If I
understand it correctly, the change loosens the meaning of 200 OK to
#1 above. But does #4.c hold? If not, can it be amended in some way to
state that on a 200 OK, the data received *either* has to represent
the requested information resource itself, *or* be a description of a
non-information resource? That is, it would then not be
allowed/correct/truthful to send a representation of another
information resource (such as a separately licensed independent record
of it) if the dereferenced IRI denotes an information resource.
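
The amended rule can be sketched as a small truth check. This is only
an illustration under my own assumed categories (the two enums and the
function name are hypothetical, not terms from the change proposal):

```python
from enum import Enum


class Kind(Enum):
    """What the dereferenced IRI denotes."""
    INFORMATION_RESOURCE = "IR"
    NON_INFORMATION_RESOURCE = "NIR"


class Payload(Enum):
    """What the data sent with the 200 OK actually is."""
    REPRESENTATION_OF_REQUESTED = "representation of the requested resource"
    DESCRIPTION_OF_REQUESTED = "description of the requested resource"
    REPRESENTATION_OF_OTHER_IR = "representation of another information resource"


def is_truthful_200(kind: Kind, payload: Payload) -> bool:
    """Check a 200 OK response against the amended rule suggested above."""
    if kind is Kind.INFORMATION_RESOURCE:
        # An IR must be answered with (some of) its own representation data.
        return payload is Payload.REPRESENTATION_OF_REQUESTED
    # A NIR cannot be represented as data, so only a description is truthful.
    return payload is Payload.DESCRIPTION_OF_REQUESTED
```

Under this reading, answering a request for an information resource
with a separately licensed record of it is rejected.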

To clarify, what do I mean by "another information resource"? Isn't a
representation of a resource also a resource, ultimately different
from the thing itself? I attempt to take an *intentional* stance here,
to avoid getting tangled in notions of extensional meaning (and
different levels of reality fidelity). Thus I am using
"representation" loosely, meaning "data corresponding to the form of
the thing itself". It might be a set of pixels made by transcribing
emitted photons, or it may even be a description in English, made by
an interpreting human mind. Here, "another" becomes contextually
dependent. A representation simply intentionally corresponds to the
IR. Of course, I am leaving it up to the publisher and consumer to
jointly agree whether there are triples in place which predicate the
resource as being representable information. I think this, while
perhaps appearing carelessly ambiguous, can actually be OK, because
the legality of e.g. licensing hinges entirely upon a shared
understanding of what e.g. cc:license applies to. The attempt here is
to be just precise enough to be able to maintain social contracts
around publishing data.

The next step in disambiguation (increasing the resolution by
expressing a relevant difference between the resource and the
representation) is using Content-Location (also expressible by e.g.
dc:hasFormat). The one after that is using wdrs:describedby and
ideally a 303 (explicitly indicating that there is another
description, correspondingly disjoint).
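
A rough sketch of how a client might tell these levels apart from the
HTTP response alone. The function name and the three labels are
hypothetical; it assumes headers arrive as a plain dict and ignores
any wdrs:describedby or dc:hasFormat triples in the body.

```python
from typing import Mapping, Optional, Tuple
from urllib.parse import urljoin


def disambiguation_level(
    request_iri: str, status: int, headers: Mapping[str, str]
) -> Tuple[str, Optional[str]]:
    """Classify a response by how far it separates resource and representation."""
    h = {name.lower(): value for name, value in headers.items()}
    if status == 303 and "location" in h:
        # Explicit: a separate description lives at another IRI.
        return ("separate-description", urljoin(request_iri, h["location"]))
    if status == 200 and "content-location" in h:
        target = urljoin(request_iri, h["content-location"])
        if target != request_iri:
            # The representation is given its own identity.
            return ("distinct-representation", target)
    if status == 200:
        # No distinction expressed between resource and representation.
        return ("undifferentiated", request_iri)
    return ("unknown", None)
```

For example, a 200 OK for http://example.org/x carrying
"Content-Location: x.ttl" would be classified as a distinct
representation at http://example.org/x.ttl.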

Best regards,
Niklas
Received on Sunday, 25 March 2012 17:08:41 UTC