Re: Explaining the benefits of http-range14 (was Re: [HTTP-range-14] Hyperthing: Semantic Web URI Validator (303, 301, 302, 307 and hash URIs) )

On Fri, Oct 21, 2011 at 2:42 AM, Leigh Dodds <leigh.dodds@talis.com> wrote:
> Hi,
>
> On 19 October 2011 23:10, Jonathan Rees <jar@creativecommons.org> wrote:
>> On Wed, Oct 19, 2011 at 5:29 PM, Leigh Dodds <leigh.dodds@talis.com> wrote:
>>> Hi Jonathan
>>>
>>> I think what I'm interested in is what problems might surface and
>>> approaches for mitigating them.
>>
>> I'm sorry, the writeup was designed to do exactly that. In the example
>> in the "conflict" section, a miscommunication (unsurfaced
>> disagreement) leads to copyright infringement. Isn't that a problem?
>
> Yes it is, and these are the issues I think that are worth teasing out.
>
> I'm afraid though that I'll have to admit to not understanding your
> specific example. There's no doubt some subtlety that I'm missing (and
> a rotten head cold isn't helping). Can you humour me and expand a
> little? The bit I'm struggling with is:
>
> [[[
> <http://example/x> xhv:license
>       <http://creativecommons.org/licenses/by/3.0/>.
>
> According to D2, this says that document X is licensed. According to
> S2, this says that document Y is licensed
> ]]]
>
> Taking the RDF data at face value, I don't see how the D2 and S2
> interpretations differ. Both say that <http://example/x> has a
> specific license. How could an S2 assuming client, assume that the
> data is actually about another resource?

By observing D2. D2 is the page at that URI; it is not what is
described by the page. For example, one describes the image, while
the other doesn't. You get different answers. I'm not sure how to be
clearer.

> I looked at your specific examples, e.g. Flickr and Jamendo:
>
> The RDFa extracted from the Flickr photo page does seem to be
> ambiguous. I'm guessing the intent is to describe the license of the
> photo and not the web page. But in that case, isn't the issue that
> Flickr aren't being precise enough in the data they're returning?

If you adopt the httpRange-14 rule, what this does is make the Flickr
and Jamendo pages "wrong", and if *they* agree, they will change their
metadata. The eventual advantage is that there will be no need to be
clear since a different URI (or blank node) will clearly be used to
name the photo, and will be understood in that way.

I feel you're doing a bait-and-switch here. The topic is, what does
the httpRange-14 rule do for you, NOT whether a different rule (such
as "just read the RDF") is better than it for some purposes, or what
sort of agreement might we want to attempt. If you want to do a
comparison of different rules, please change the subject line.

To summarize:

- A "rule" is something that helps eliminate judgment and uncertainty,
and, ideally, facilitates automated processing.
- These URIs (hashless retrieval-enabled ones) are currently being
used in two different and incompatible ways. In the issue-57 document
I call these ways "direct" (it's the document found there) and
"indirect" (just read the RDF).
- If there is no rule, then you can't use one of these URIs without
further explanation as to which way is meant ("being clear"). Maybe
that's OK.
- Any particular rule will assign 0 or more URIs as direct and 0 or
more as indirect. Any time any URI is assigned *either* way some
benefit will ensue to someone, because uses of the URI in that way
will not require further explanation.
- The httpRange-14 rule assigns one of the two ways (the direct one)
to all affected URIs. The advantage is that people who want to use
URIs in this way will be able to do so and be understood. That is, it
gives you a way to refer to anything on the web - even if you don't
know how to read its content, don't trust the content, etc.  It is a
"legacy" solution since it grandfathers everything that was on the web
before we started using URIs in these new and different ways.
- Other rules will have advantages in other situations. What the
httpRange-14 rule does for you can be understood independently of the
virtues of other rules, such as the one Ian Davis put forth last fall,
or the more radical rule that says that all such URIs are indirect.
What httpRange-14 does for you is a different matter from whether
something else is better. If you want to shift to comparison shopping,
please change the subject line.
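
As a rough sketch of the summary above, a rule can be modeled as a
function from a URI to one of the two modes. This is Python with names
I made up for illustration (Mode, httprange14_rule, all_indirect_rule,
subject_of); none of them come from the issue-57 document or from any
library, and the sketch assumes every URI passed in is a hashless
retrieval-enabled one.

```python
from enum import Enum
from typing import Callable


class Mode(Enum):
    DIRECT = "direct"      # the URI names the document found there
    INDIRECT = "indirect"  # the URI names what the RDF found there describes


# A "rule" (hypothetical type): decides the mode without further explanation.
Rule = Callable[[str], Mode]


def httprange14_rule(uri: str) -> Mode:
    """My reading of httpRange-14: every affected URI is direct."""
    return Mode.DIRECT


def all_indirect_rule(uri: str) -> Mode:
    """The more radical alternative: every such URI is indirect."""
    return Mode.INDIRECT


def subject_of(uri: str, rule: Rule) -> str:
    """Say, in words, what a statement about <uri> is about under a rule."""
    if rule(uri) is Mode.DIRECT:
        return f"the document found at <{uri}>"
    return f"whatever the RDF found at <{uri}> describes"


if __name__ == "__main__":
    uri = "http://example/x"
    for rule in (httprange14_rule, all_indirect_rule):
        print(rule.__name__, "->", subject_of(uri, rule))
```

The point of the sketch is only that the same license statement about
<http://example/x> gets a different subject depending on which rule is
in force, and that with no rule at all the caller would have to supply
the mode some other way.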

> The RDFa extracted from the Jamendo page including type information
> (from the Open Graph Protocol) that says that the resource is an
> album, and has a specific Creative Commons license. I think that's
> what's intended isn't it?
>
> Why does a client have to assume a specific stance (D2/S2). Why not
> simply takes the data returned at face value? It's then up to the
> publisher to be sure that they're making clear assertions.

Taking the information at face value *is* a stance - that's exactly
the S2 (indirect) approach. Saying that all hashless retrieval-enabled
URIs are indirect (S2) would be a perfectly principled and coherent
approach; it's just not the one the TAG advised in 2005.

You have to take a stand (if you use these URIs without somehow
specifying the mode) because in almost all cases D2 and S2 give
different answers. If you apply the D2 rule to Jamendo you end up with
a document that says some things that are not true. That's a good
thing, because you want to be able to talk about documents that have
mistakes in them or that you don't believe - for example you would
like to say "Jamendo please fix the mistake in ..." in RDF. Sure,
there may be other solutions, but that wasn't your question - you
wanted to know what the rule does for you.

Jonathan

>> There is no heuristic that will tell you which of the two works is
>> licensed in the stated way, since both interpretations are perfectly
>> meaningful and useful.
>>
>> For mitigation in this case you only have a few options
>> 1. precoordinate (via a "disambiguating" rule of some kind, any kind)
>> 2. avoid using the URI inside <...> altogether - come up with distinct
>> wads of RDF for the 2 documents
>> 3. say locally what you think <...> means, effectively treating these
>> URIs as blank nodes
>
> Cheers,
>
> L.
>
> --
> Leigh Dodds
> Product Lead, Kasabi
> Mobile: 07850 928381
> http://kasabi.com
> http://talis.com
>
> Talis Systems Ltd
> 43 Temple Row
> Birmingham
> B2 5LS
>

Received on Friday, 21 October 2011 12:15:45 UTC