Re: Proposal to amend the "httpRange-14 resolution" from Mo McRoberts on 2012-03-01 (www-tag@w3.org from March 2012)

From: Mo McRoberts <Mo.McRoberts@bbc.co.uk>
Date: Thu, 1 Mar 2012 23:26:10 +0000
To: Jonathan A Rees <rees@mumble.net>
Cc: www-tag@w3.org
Message-Id: <C8887B66-01CB-4D6B-B095-8C0C84D6AFD0@bbc.co.uk>

Hi Jonathan, assembled TAG-watchers,

On 1 Mar 2012, at 17:23, Jonathan A Rees wrote:

> I wrote my document from the receiver's point of view side - "what
> does this URI mean", and by implication the sender's point of view
> "what can I write so that I'll be understood". You seem to be arguing
> for a presentation from the URI owner side "if I want senders and
> receivers to have a particular understanding among one another, what
> do I need to do."

In effect, yes. I think there's a real danger in going around in circles trying to find an answer to “how can we infer things which publishers have opted not to make easy to determine” before you arrive at an answer of “well, maybe it's just not such a big deal for the publishers to make it clearer”.

> I don't think this is necessarily the best way to
> couch the problem, as the URI owner doesn't always have a stake in the
> sender/receiver interaction, but it is a legitimate question and your
> approach might be better, so I will place it in my list of editorial
> issues.

Point conceded regarding URI owner's stake…

> One of the sticky points is what the receiver should do if the
> URI owner didn't have any particular intention at all for how the
> sender was supposed to use the URI (other than as the target-URI of an
> HTTP request, or in @href). The httpRange-14(a) clause starts to cover
> this case, in a way that obviously not everyone agrees with (I
> certainly find it flawed since it doesn't help in the Flickr/Jamendo
> situation).


I suppose I am having trouble in seeing the real-world problem which doesn’t fall into the category of:

- “Doctor, doctor, it hurts when I do *this*!”
- “Well, don’t do that, then!”

i.e., what scenario occurs where you actually need to know in an automated fashion whether the URI for a resource refers to an NIR or not _and_ it is not described as such in some manner that you can interpret _and_ doesn't involve minting URIs in other people's namespaces?

> Thanks for the input - I found it quite valuable - and I hope you stay
> engaged. If I sound critical it's only because I've been staring at
> this for too long, not because I mean any harm. It is really important
> I think for the TAG to hear a diversity of views.

Far from sounding critical, I thought it went down surprisingly well given it was a thinly-veiled rant ;)

>> More pressingly, however, it appears to be written to answer the question “how can I tell, given a representation, whether the URI I had refers to a thing, or the document describing that thing?” This is a question which crosses problem domains, and which I’m not convinced is the actual problem at all. Maybe I've misunderstood the document; it is rather unwieldy.
> 
> Hmm. I thought the problem was clearly stated: people are saying 303s
> have performance and deployment problems, they don't like hash as an
> alternative, consensus threatens to unravel, what should we do? Live
> with 303 as you suggest, and advocate hash URIs to avoid the 303
> problems? That's certainly fine by me, but then I'm going to be
> pleased by any consensus. Not everyone agrees, so how can we bring
> them around and get consensus around that answer? Is there any other
> answer that is more likely to get consensus?

Okay, fair enough, although given that, I think there really needs to be a lot more to the “don’t like hash URIs as an alternative” case than there is. It really does seem to boil down to folk saying “I don't really like them” rather than there being any serious issue with them as a solution.

(Obviously, my stance is near-status-quo, though I have a preference for hash URIs over 303 for the various reasons covered; indeed, I'd be personally happy to drop the overloaded interpretation of 303 altogether.)

> Certainly there are many other problems relating to linked data, but
> it was not my aim to take them on.

Sure, though lines are almost always a little blurry…

> How do we name things that are on the web, which do *not* have
> machine-readable descriptions and for which nobody is following either
> Recommendation A or Recommendation B? For example, what Turtle term
> would you use to refer to the document found at
> http://www.w3.org/TR/2004/REC-rdf-mt-20040210/ ? Your Recommendation A
> says to use that URI, but your Recommendation B says to deploy and
> then use 303 (as is done at dx.doi.org), which would have to involve a
> second URI.

Right, so there's a lack of clarity on my part there. Rec B is only actually worthwhile where you _have_ to have different URIs for a described thing and the description of it. Given this case is referencing a specific information resource, which HTTP and URIs do anyway, there's no particular reason to do anything special, so you're hitting Rec A by default.

> This is the kind of question that led me to a receiver-centric
> presentation. To me the real technical question is, how does the
> receiver interpret what the sender says?

Okay, so given the above, does this question still apply in a meaningful way? (Looping back to the 'what scenario occurs...' question, above.)


>> The actual question is: “as a publisher, how can I name resources such that consumers of my data can differentiate between my descriptions of documents, representations, and NIRs while also allowing the descriptions of my NIRs to be retrieved by dereferencing the URIs I assign to them?”
> 
> Once a description (documentation) is available, those distinctions
> can be made in RDF. The hard parts are things like agreeing on where
> to find that documentation, how to tell whether it's there in the
> first place, what to do if it's not found, and so on.

I think we disagree slightly here... if the RDF is published, and it's published in a way which means that

a) you can retrieve it by dereferencing the URI
b) it doesn't use the same URI to refer to IRs and NIRs

...then distinguishing is trivial: you just look for the statements referring to the URI you started the process with and process them.
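
As a rough sketch of that lookup (illustrative only): the triples below are plain Python tuples standing in for parsed RDF, the example.org URIs are made up, and the helper names are hypothetical rather than taken from any RDF library. The FOAF terms are real, but a publisher could just as well use other vocabularies.

```python
# Illustrative sketch: triples are plain (subject, predicate, object) tuples,
# not the output of a real RDF parser.
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
FOAF_DOCUMENT = "http://xmlns.com/foaf/0.1/Document"
FOAF_PERSON = "http://xmlns.com/foaf/0.1/Person"
FOAF_PRIMARY_TOPIC = "http://xmlns.com/foaf/0.1/primaryTopic"


def statements_about(triples, uri):
    """Return the (predicate, object) pairs whose subject is exactly `uri`."""
    return [(p, o) for (s, p, o) in triples if s == uri]


def types_of(triples, uri):
    """Return the set of rdf:type values asserted for `uri`."""
    return {o for (p, o) in statements_about(triples, uri) if p == RDF_TYPE}


# Hypothetical description, as it might come back when dereferencing the URI.
# The document (IR) and the person (NIR) deliberately have different URIs.
DOC = "http://example.org/people/alice"
ALICE = DOC + "#me"

triples = [
    (DOC, RDF_TYPE, FOAF_DOCUMENT),
    (DOC, FOAF_PRIMARY_TOPIC, ALICE),
    (ALICE, RDF_TYPE, FOAF_PERSON),
]

# Start from the URI you were given and keep only the statements about it.
for predicate, obj in statements_about(triples, ALICE):
    print(predicate, obj)
```

Because the publisher never reuses one URI for both the document and the thing it describes, the filter on the starting URI is all the consumer needs here.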

> You are inspiring me to prepare a set of test cases, since the same
> ones keep coming up: dx.doi.org, Flickr, Jamendo, Manchester syntax,
> data:, and so on.

That would be good, I think.

>> It’s this question which most of those looking to httpRange-14 (and subsequent discussions) have sought an answer to, and which the functioning of linked data is predicated upon, so this is what my proposal was written to answer, albeit in rather rough form.
>> 
>> Given that, I'd suggest a new title of “Guidance for linked data publishers: Choosing URIs”.
> 
> But we already have such guidance, in many places, such as the "Cool
> URIs for the Semantic Web" note. If these documents aren't working
> that's too bad but it's not the problem I'm trying to solve. Are you
> saying the TAG ought to take on more leadership in this area? Good
> exposition is a hot potato but I'm not sure the TAG is the best place
> to do the work, and it is not being urged on us.

I guess I'm saying that the approach httpRange-14 took was, by sticking closely to “be liberal in what you accept”, not all that helpful for those thinking about publishing linked data and wanting straight answers, and similarly I'm not at all convinced that developers of agents have had all that much difficulty, because of URIs, in figuring out whether responses contain the stuff they need or not. Sometimes, publishers do so in an unhelpful/less-than-ideal way, and this causes problems, but the problems aren't because of lack of consensus on how to interpret stuff.

>> Further, the premise for this exercise seems to be the notion that both 303-based redirects and hash URIs have horrible fatal flaws which make them unworkable, when it's not all that clear that this is the case (particularly in the case of hash URIs, the criticisms at http://www.w3.org/2001/tag/awwsw/issue57/20120202/#hash seem pretty weak from a linked data consumer’s perspective, all told).
> 
> No, the premise is that there is weak consensus and it is desirable to
> get strong consensus.

I can definitely live with that, and am very glad to hear it (good luck, though…!)

> I agree completely that the merits of those
> arguments are unclear - I was only trying to record what other people
> were saying. (I could have done better with the attributions.) But the
> opponents of hash+303 are respectable and have some good points. How
> do you think is the best way to bring the community together on these
> issues? The TAG could dig in its heels and try to advance a document
> on Rec track that doesn't give an efficient and easily deployed way to
> do discovery for hashless URIs, do you think this would help get
> consensus, even if combined with additional how-to guidance? I don't
> know.

I'm not sure, if I'm honest.

httpRange-14 from a consumer's point of view is a pretty clear and straightforward set of rules, although the benefit of hindsight provides an opportunity to tighten it up.

On the other hand, I think any revision to httpRange-14 needs to point *clearly* at guidance for publishers.

So, I guess, my stance boils down to:–

1. Why not "hash URIs"? Hash URIs are fine, use them where feasible.

2. Why is "303" unacceptable? It is acceptable, but suffers from disadvantages which mean it isn't preferred.

3. When is a given 200 response payload to be a nominal URI documentation carrier for the URI? It isn't.

4. When does a given 200 response payload mean that the identified resource is nominally an information resource? Always.
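
To make the four points concrete, here is a rough sketch of how a consumer might apply them. The function name and the strings it returns are made up for illustration; they are not taken from httpRange-14 or from any library, and the sketch assumes the consumer already knows the status code returned for the URI.

```python
from urllib.parse import urldefrag


def interpret(uri, status):
    """Hypothetical helper: what a consumer may conclude from a URI and the
    HTTP status returned when dereferencing it, under the stance above."""
    _, fragment = urldefrag(uri)
    if fragment:
        # Hash URI (point 1): the fragment is never sent to the server, so the
        # status code says nothing about it; the retrieved description decides.
        return "consult description"
    if status == 200:
        # Points 3 and 4: a 200 always means an information resource, and the
        # payload is the resource's representation, not documentation of the URI.
        return "information resource"
    if status == 303:
        # Point 2: acceptable but not preferred; the description lives at the
        # redirect target.
        return "consult description at redirect target"
    return "unknown"


print(interpret("http://example.org/doc", 200))
print(interpret("http://example.org/doc#thing", 200))
print(interpret("http://example.org/id/thing", 303))
```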

Cheers,

M.

-- 
Mo McRoberts - Technical Lead - The Space,
0141 422 6036 (Internal: 01-26036) - PGP key CEBCF03E,
Project Office: Room 7083, BBC Television Centre, London W12 7RJ
Received on Thursday, 1 March 2012 23:26:56 UTC