Re: Change Proposal for HttpRange-14 from Jeni Tennison on 2012-03-23 (public-lod@w3.org from March 2012)

From: Jeni Tennison <jeni@jenitennison.com>
Date: Fri, 23 Mar 2012 21:42:49 +0000
To: James Leigh <james@3roundstones.com>
Cc: public-lod community <public-lod@w3.org>
Message-Id: <D8CF4CC8-8D2F-47D3-AD56-DE5D01724C06@jenitennison.com>
James,

On 23 Mar 2012, at 20:24, James Leigh wrote:
> On Fri, 2012-03-23 at 19:49 +0000, Jeni Tennison wrote:
>> On 23 Mar 2012, at 19:23, James Leigh wrote:
>>> I am not saying everyone should care to distinguish them (real data will
>>> always be dirty), but using the same identifier for both the person and
>>> the document should not be the recommended approach.
>> 
>> Absolutely. Where in the Change Proposal do you think it says otherwise? I'd be glad to clarify it.
> 
> At the bottom:
>        where a URI is intended to identify a NIR but provides a 200
>        response, there remains no method of addressing the
>        documentation that is returned by that 200 response (to assert
>        its license, provenance etc); a set of best practices for linked
>        data publishers would need to spell out what publishers should
>        do and how consumers should interpret the information provided
>        within the response and that found at the end of any
>        ‘describedby’ links
> 
> The proposal says there is no way to identify the document.
> 
> A Web crawler, for example, may need to know what document has the
> xhv:stylesheet attached to it. With the current HttpRange-14, the URL
> that returns a 200 is the identifier of the document. This proposal, as
> I understand it, breaks that because the document may have no identifier
> at all.

No. If a publisher wants to expose information about the document separately from the thing the document is about (i.e. the Person) then, just as now, they should have two separate URIs for those things.

Any document that contains:

  <http://example.org/me> a foaf:Person ;
    xhv:stylesheet <http://example.org/style.css> ;
    foaf:name "James" .

is just as broken now as under the proposal. Under the current state of affairs, because the document (which I'm assuming was actually at <http://example.org/me>) was returned with a 200 response to the request, the application knows

  * <http://example.org/me> is an information resource
  * <http://example.org/me> is a Person
  * <http://example.org/me>'s stylesheet is <http://example.org/style.css>
  * <http://example.org/me>'s name is "James"

All that changes under the proposal with this document is that the application can't make the first inference unless it has reached the document through a 303 redirection or already knows that it's the object of a 'describedby' statement.
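
To make the difference concrete, here is a rough sketch in Python of the "is this an information resource?" decision as I've described it above. It is only an illustration, not part of the Change Proposal: the function name, its parameters and the `proposal` flag are all made up for this example, and a real client would have more cases to handle.

```python
def is_information_resource(uri, status, reached_via_303=False,
                            describedby_objects=frozenset(), proposal=False):
    """Return True if the application may infer that `uri` is an information resource.

    All names here are hypothetical; this only mirrors the reasoning in the text.
    """
    if status != 200:
        # Only 200 responses are considered in this sketch.
        return False
    if not proposal:
        # Current httpRange-14 reading: a 200 response implies an information resource.
        return True
    # Proposal reading: a 200 alone is not enough; the application needs to have
    # arrived via a 303 redirection or to know the URI is the object of a
    # 'describedby' statement.
    return reached_via_303 or uri in describedby_objects


me = "http://example.org/me"
print(is_information_resource(me, 200))                        # current rules
print(is_information_resource(me, 200, proposal=True))         # proposal, bare 200
print(is_information_resource(me, 200, reached_via_303=True,
                              proposal=True))                  # proposal, via 303
```

The other three statements (Person, stylesheet, name) come from the document content itself, so they are the same under both readings.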

(I wonder, btw, how many of the webcrawlers are actually adding statements to the effect that the things they are downloading are information resources...)

The big thing that *is* different under this proposal is that if you have an HTML+RDFa 1.1 document like:

<!DOCTYPE html>
<html>
<head>
<base href="http://example.org/me"/>
<link rel="stylesheet" resource="style.css"/>
<title>Me</title>
</head>
<body typeof="foaf:Person">
<h1 property="foaf:name">James</h1>
</body>
</html>

returned with a 200 response from http://example.org/me then the application knows:

  * <http://example.org/me> is a Person
  * <http://example.org/me>'s name is "James"

and does not have a stray and inaccurate

  * <http://example.org/me> is an information resource

hanging around, contrary to the publisher's intent.

Anyway, I wonder how we might change the paragraph that you quoted to remove the implication that publishers can get away with one URI when they want to identify two things. Would this work better:

       where a URI is intended to identify a NIR but provides a 200
       response, there remains no method of addressing the
       documentation that is returned by that 200 response (to assert
       its license, provenance etc); publishers still need to support
       a separate URI if they want to make statements about the
       documentation distinct from the NIR. An updated set of best 
       practices for linked data publishers would need to spell out what 
       publishers should do and how consumers should combine the 
       information provided within the response with that found at the 
       end of any ‘describedby’ links.
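
As an illustration of the "separate URI" point in that paragraph, a publisher could run a small check like the following over their data. This is only a sketch under my own assumptions: the property lists are hand-picked for the example, and <http://example.org/me.html> is a hypothetical document URI that does not appear anywhere in the proposal.

```python
# Hypothetical, hand-picked lists; a real check would need its own vocabulary.
DOCUMENT_PROPERTIES = {"xhv:stylesheet", "dct:license"}
PERSON_TYPES = {"foaf:Person"}


def conflated_subjects(triples):
    """Return subjects typed as a person that also carry document properties."""
    people = {s for s, p, o in triples if p == "rdf:type" and o in PERSON_TYPES}
    documents = {s for s, p, o in triples if p in DOCUMENT_PROPERTIES}
    return people & documents


# One URI used for both the person and the document (the broken example).
one_uri = [
    ("http://example.org/me", "rdf:type", "foaf:Person"),
    ("http://example.org/me", "xhv:stylesheet", "http://example.org/style.css"),
    ("http://example.org/me", "foaf:name", "James"),
]

# Two URIs: one for the person, one (assumed) for the document about them.
two_uris = [
    ("http://example.org/me", "rdf:type", "foaf:Person"),
    ("http://example.org/me", "foaf:name", "James"),
    ("http://example.org/me.html", "xhv:stylesheet", "http://example.org/style.css"),
]

print(conflated_subjects(one_uri))   # flags http://example.org/me
print(conflated_subjects(two_uris))  # nothing flagged
```

The check says nothing about which HTTP status codes are involved; it only looks at whether statements about the documentation and statements about the NIR share a subject.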

Cheers,

Jeni
-- 
Jeni Tennison
http://www.jenitennison.com
Received on Friday, 23 March 2012 21:43:15 UTC