Re: Linked Data Platform ISSUE-20: What is the base URI of a POSTed document? from Henry Story on 2012-10-11 (public-ldp-wg@w3.org from October 2012)

From: Henry Story <henry.story@bblfish.net>
Date: Thu, 11 Oct 2012 13:56:25 +0200
To: Andy Seaborne <andy.seaborne@epimorphics.com>
Cc: public-ldp-wg <public-ldp-wg@w3.org>, "public-philoweb@w3.org" <public-philoweb@w3.org>
Message-Id: <D0DD2EC2-16B1-4A8A-8F02-85B8AE2C8CD9@bblfish.net>
I am cc'ing this to the philoweb group because it touches on indexicals,
a topic from the philosophy of language. The thread started here:

  http://lists.w3.org/Archives/Public/public-ldp-wg/2012Oct/0104.html


On 11 Oct 2012, at 11:06, Andy Seaborne <andy.seaborne@epimorphics.com> wrote:

> On 10/10/12 18:05, Henry Story wrote:
>> 
>> On 10 Oct 2012, at 17:03, Andy Seaborne <andy.seaborne@epimorphics.com> wrote:
>> 
>>> This does have a real consequence to implementation:
>>> 
>>> A design that
>>> 
>>> 1/ receive POST -- some general receipt handling
>>> 2/ content-type: parse body as RDF
>>> 3/ Decide it's a container
>>> 4/ dispatch request to container
>>> 5/ Create new BPR
>>> 
>>> trying to create an abstraction of "incoming RDF", does not work because the parsing happens before the operation is known to be a container with specific action of creating the new BPR.
>> 
>> There are a few answers to that:
>> 
>>  A. you simply don't parse the RDF and just serialise it to disk into
>>   the file name created around 3 in your design. Doing that everything will
>>   work just right, because the relative URLs will automatically turn into
>>   the right URLs when fetched in the next round.
>>    (I imagine that this is exactly what MUST happen in WebDAV or Atom)
> 
> Aside: I think this is pushing it a bit too far - RDF is a data model, Turtle a transfer syntax. 

I know that Turtle is a syntax, but the point still holds. A minimal server
could be designed that takes RDF input and stores it serialised as is on
disk, names the resource after the file on disk minus the (.ttl, .rdf, ...)
extension, and transforms it only when it is requested in a different format.
This is already a bit more than what Atom or WebDAV offer. What is interesting
is that one can move fluidly between the two approaches. If what we build
works nicely with those protocols, then we have a win. (Certainly the business
of referring to bnodes from an external resource is absolutely horrendous.)
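
To make the minimal server described above concrete, here is a rough Python
sketch. It is only an illustration under assumptions of mine: the EXTENSIONS
table, the function names and the "card" slug are hypothetical, and a real
server would also need content negotiation and error handling.

```python
import tempfile
from pathlib import Path

# Hypothetical mapping from media type to file extension (an assumption).
EXTENSIONS = {"text/turtle": ".ttl", "application/rdf+xml": ".rdf"}


def store_as_is(root: Path, slug: str, content_type: str, body: bytes) -> str:
    """Write the body to disk unparsed and return the resource name,
    which is the file name minus its extension."""
    path = root / (slug + EXTENSIONS[content_type])
    path.write_bytes(body)  # relative URLs in the body are left untouched
    return path.stem


def read_back(root: Path, name: str) -> bytes:
    """Find the stored file for a resource name, whatever its extension."""
    matches = sorted(root.glob(name + ".*"))
    return matches[0].read_bytes()


root = Path(tempfile.mkdtemp())
turtle = b"<> a <#Profile> .\n"
name = store_as_is(root, "card", "text/turtle", turtle)
stored = read_back(root, name)
```

Because the bytes are never parsed, the relative references in the body are
only resolved later, by whoever fetches the resource at its final URL.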

An LDP server would of course want to offer more, such as:
 - parsing on receipt, so that broken input can be reported immediately
   instead of being accepted
 - preparing indexes to make it easier to query a graph
 - accomplishing further actions, such as data coherence management...

> The Turtle bytes aren't the data - the RDF triples (absolute URIs) are.


Relative URLs are the web's equivalent of what philosophers of language
call indexicals: me, you, here, there, now, ... These words require a context
to get their full meaning. If you see a sentence such as

  "I love you"

but don't know who said it to whom, you cannot really make much of it. You
cannot make any serious deductions or merge it with any other data, because
the context is required to get at the meaning. (On the web we can still do a
bit of reasoning without the context, because we can skolemize the context
clearly.)

If we follow Dan Connolly's intuitions about HTTP's relation to speech acts,
which I illustrate in "Philosophy of the Social Web" [1], then we can see how
the different HTTP verbs change the meaning of what is going on.

1. In a GET, relative URLs are resolved against the URL from which we fetched
the document or, in the case of a 301 Moved Permanently, against the URL given
in the Location header.
   See Tim Berners-Lee's remark on that here
    http://lists.w3.org/Archives/Public/public-webid/2012Apr/0006.html
   (But I think he mixed up 302 with 301 there.)

2. In our POST use case the context for interpreting the content is not the
container. We are asking the container, in the POST, to create a context for
the not-yet contextualised graph we are sending it.
POST is a resource _creation_ mechanism, and is non-idempotent
  http://www.w3.org/Protocols/rfc2616/rfc2616-sec9.html#sec9.5

[[
The actual function performed by the POST method is determined by the server and is usually dependent on the Request-URI. The posted entity is subordinate to that URI in the same way that a file is subordinate to a directory containing it, a news article is subordinate to a newsgroup to which it is posted, or a record is subordinate to a database.
]]

So this is very similar to placing a new file with relative urls in a directory.
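
To show what is at stake in the choice of context, here is a small Python
sketch using the standard urljoin function. The example.org URLs and the three
relative references are made up for illustration; only the resolution
behaviour is the point.

```python
from urllib.parse import urljoin

# Hypothetical URLs, chosen only for illustration.
container = "http://example.org/people/"
new_resource = "http://example.org/people/henry"

# Relative references as they might appear in a POSTed Turtle body.
relative_refs = ["", "#me", "knows"]

# The same references, resolved against the two candidate base URIs.
against_container = [urljoin(container, ref) for ref in relative_refs]
against_new_resource = [urljoin(new_resource, ref) for ref in relative_refs]

for ref, a, b in zip(relative_refs, against_container, against_new_resource):
    print(f"{ref!r:8} container: {a:36} new resource: {b}")
```

The empty reference and "#me" name different things depending on the base,
which is exactly the indexical behaviour described above.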

> 
>> 
>>  B. You parse the incoming stream into a graph that accepts relative URLs,
>>    and then in 3/ either
>>      a- place it into a store that accepts relative URLs
>>      b- resolve the URLs against the full store url
>> 
>>   C. You delay the parsing until around 3 or 4 when you know the full
>>    URL.
> 
> No dispute it can be implemented but if the particular implementation choice is forced I think we are in (a minor) "willful violation" of RFC 3986. Implementation choices should be invisible.

What are these violations of the URI spec?
RFC3986 permits relative URLs ( in section 42 no less ;-)

  http://tools.ietf.org/html/rfc3986#section-4.2

> 
>> The fact that A works, is very good reason to believe that my proposal -
>> which Steve Battle named A) is the correct design.
>> 
>> B seems to make a good case for having at least parsers that can parse
>> documents with relative URLs without needing to resolve them.
> 
> That would be a change - the output would not be strict RDF.  The data would have to be modified later to "correct" the URIs.

Relative URLs depend on their context.

> 
>> C. Should be quite possible to do, since downloading documents should
>> be done asynchronously, and takes time, whereas finding out from the
>> path that a resource needs to be created can be done extremely quickly.
> 
> In the SPARQL GSP, POST to a graph means "add triples" - this is inline with RFC 2616 where it says POST can be "Extending a database through an append operation".   The base URI is the target graph, there being only one URI to consider.

In this case we are POSTing to a container, which is closer to an RDF store.
As RFC 2616 says, this is very close to creating a new file in a directory,
which is what we are doing when we POST to a container: we are creating a new
named graph.
> 
> We could phrase it as "the base URI is the target of the request" and then make the target the newly created resource.

Yes, something like that.
The base is the newly created resource, as determined by the server.
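
As a rough sketch of that ordering (name the resource first, resolve
afterwards), here is some Python. Everything in it is an assumption for
illustration: the numbering policy in mint_url, the handle_post name, and the
representation of a not-yet-resolved graph as tuples of strings. It is not
meant as the LDP algorithm.

```python
import itertools
from urllib.parse import urljoin

_counter = itertools.count(1)


def mint_url(container: str) -> str:
    """Server-side naming policy (hypothetical): number the members."""
    return urljoin(container, f"member{next(_counter)}")


def handle_post(container: str, relative_triples):
    """Mint the new resource URL first, then use it as the base URI."""
    base = mint_url(container)
    # Absolute URLs pass through urljoin unchanged; relative ones are
    # resolved against the newly created resource, not the container.
    absolute = [tuple(urljoin(base, term) for term in triple)
                for triple in relative_triples]
    return base, absolute


posted = [("", "http://xmlns.com/foaf/0.1/primaryTopic", "#me")]
location, triples = handle_post("http://example.org/people/", posted)
```

Here location is what the server would return in the Location header, and
the triples are only made absolute once that URL is known.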

> 
> Base URIs matter a lot more in RDF syntax - we're just pushing the boundaries of specs not designed with the current (new) usage in mind.

Relative URLs are essential to the functioning of the whole web. Documents
often contain them to point to images, JavaScript, etc.


> 
> 	Andy



[1] http://bblfish.net/tmp/2010/10/26/

Social Web Architect
http://bblfish.net/
Received on Thursday, 11 October 2012 11:57:03 UTC