Re: Comments on the LDP Spec: Creating new Resources from henry.story@bblfish.net on 2014-10-10 (public-ldp-comments@w3.org from October 2014)

From: <henry.story@bblfish.net>
Date: Fri, 10 Oct 2014 09:58:39 +0200
To: Miguel Aragon <miguel.aragon@base22.com>
Cc: public-ldp-comments@w3.org
Message-Id: <D6468A81-E105-4D72-8F68-806B9A8DA12F@bblfish.net>
On 9 Oct 2014, at 18:25, Miguel Aragon <miguel.aragon@base22.com> wrote:

> Hello to everyone
> Based on the design and implementation process that my team and I have experienced, I have several comments about the LDP Spec that I'd like to share with you. But first let's make sure that we are speaking the same language:
> 
> Concepts
> Note: Keep in mind that these are the concepts that are working for us. By no means am I criticising the "Academic point of view"
> Relative URI: A relative URI that was not resolved to an absolute URI because the document didn't specify a base URI (@base).
> Null URI: an empty, relative URI.
> 
> Creation of LDP RDF Sources (LDPRS)
> There are several key points in section 5.1 Introduction that need to be considered:
> An LDPRS can be created by issuing a POST to an LDPC. 
> The client can specify a Slug header to provide a hint of the URI desired for the new resource.
> The examples show that a null URI can be used for the resource to be created. The resulting URI will be forged by the server.
> The LDP test suite goes beyond this and uses relative URIs in the resources that are POSTed to the server. (ex. <something> a ldp:RDFSource. ).
The mistake you are making is that you seem to think that the URIs inside the POSTed document have a bearing on which
resource the server creates.



> At first we followed this approach, but when we started using JSON-LD as our main RDF format, we started encountering several problems with it:
> If non empty, relative URIs (ex. <something>) are accepted, it doesn't make much sense to support the Slug header. What would happen if both of them were used? 
> 
> Example: 
> Slug: something
> <somethingElse> a ldp:RDFSource.

You'd end up with a document at some URI (which may be <something> if you are lucky)
containing { <somethingElse> a ldp:RDFSource }

There is no guarantee that the server will also create <somethingElse>.
What is certain is that the client cannot even know what <somethingElse> refers to before POSTing, because LDP does not
require the URL of the created resource to be the LDPC URL plus a single segment without a slash. That is, containers are
not intuitive containers. See
http://www.w3.org/2012/ldp/track/issues/50
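
To make this concrete, here is a minimal sketch in Python of how a server might pick the URI for a POSTed document.
The function name mint_uri, the fallback naming scheme and the example.org URLs are assumptions made for illustration,
not something the LDP spec defines. The point is only that the Slug is a hint and the body is never consulted:

```python
import itertools

def mint_uri(container, slug, taken, body=""):
    """Choose a URI for a newly POSTed resource (hypothetical policy).

    The Slug header is only a hint, and the body is deliberately ignored:
    URIs written inside the document do not influence the choice.
    """
    if slug:
        candidate = container + slug
        if candidate not in taken:
            return candidate
    # Slug missing or already in use: fall back to a server-chosen name.
    for n in itertools.count(1):
        candidate = f"{container}r{n}"
        if candidate not in taken:
            return candidate

container = "http://example.org/ldpc/"
body = "<somethingElse> a ldp:RDFSource ."
first = mint_uri(container, "something", set(), body)
second = mint_uri(container, "something", {first}, body)
print(first)   # http://example.org/ldpc/something
print(second)  # http://example.org/ldpc/r1
```

Another server could just as well answer with a URL several path segments below the container; the client only finds
out from the response.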


> 
> By allowing the client to send both null URIs and non empty, relative URIs, a weird behaviour would be expected:
> If a null URI was used, forge a slug for the new resource and take the LDPC URI as a base for the URI of the resource to be created.
> If a non empty, relative URI was specified, treat that as a hint for the desired slug and use the LDPC URI as a base for the URI of the resource to be created.
The spec says nothing about the role of relative URLs in the document. No URL in the document constitutes a request
by the client that the server should create a resource at that URL.
1. It would often be impossible, because the URLs may be on remote hosts over which the server POSTed to has no control.
2. It would not be useful, because the server would have no idea what to put at those remote URLs. Even if one
   could imagine a strategy to do so, it could be confusing if two requests simultaneously referred to the same URL: which one
   would guide the creation of the new resource?

> The logic needed for this behaviour will impose an unnecessary overhead for each request. 

Indeed, we agree. And that is why LDP as described by the spec does not work that way.

> As far as we know, specifying relative URIs and not defining a base URI results in an invalid RDF document.

No, the relative URLs in the created document are resolved against the URL of the created document, as per the Turtle, JSON-LD,
RDF/XML and URI specs. Relative URLs always resolve correctly. What may be a problem is whether the resources at those URLs
exist, in the sense of returning a 2xx response.
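
A quick way to check this is with Python's standard urllib.parse.urljoin, which performs relative reference resolution
against a base URL. The two document URLs below are made up for illustration. The same relative reference ends up naming
a different resource depending on where the server put the document:

```python
from urllib.parse import urljoin

# Two hypothetical URLs the server might have assigned to the POSTed document.
doc_a = "http://example.org/ldpc/something"
doc_b = "http://example.org/ldpc/2014/10/r1"

for doc in (doc_a, doc_b):
    print(urljoin(doc, ""))               # the empty reference: the document itself
    print(urljoin(doc, "somethingElse"))  # a sibling of the document
```
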
> If the server supported the creation of multiple resources on a single request, null URIs will overlap with each other.

LDP does not define any behaviour concerning the creation of multiple resources in a single request.
This is not that problematic, because with HTTP/1.1 you don't need to open a new TCP connection for each
request. With SPDY you can also send the requests one after the other without waiting for each
response to come back. So that comes to much the same as sending multiple requests in one go.

So assuming you have SPDY, you can create a number of resources in one go with LDP.
What is clear is that with POST none of the documents can refer
to each other before they are created, because LDP gives you no guarantee as to what
the URL of a created resource will look like. That means the client needs to wait for the response to
come back from the server before sending the next document that wishes to refer to the first created
document.
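
The ordering constraint can be sketched with an in-memory stand-in for a container. FakeContainer, its numbering scheme
and the example.org URL are hypothetical; a real client would read the Location header of each HTTP response instead:

```python
import itertools

class FakeContainer:
    """In-memory stand-in for an LDPC (hypothetical, for illustration only)."""

    def __init__(self, uri):
        self.uri = uri
        self.docs = {}
        self._ids = itertools.count(1)

    def post(self, body):
        # The server alone decides the new URI; the client learns it
        # from the Location header of the response.
        location = f"{self.uri}{next(self._ids):04d}"
        self.docs[location] = body
        return location

ldpc = FakeContainer("http://example.org/ldpc/")

# First document: created without knowing its final URL.
alice = ldpc.post("<> a <http://xmlns.com/foaf/0.1/Person> .")

# Second document: can only be written once the first Location is known.
bob = ldpc.post(f"<> <http://xmlns.com/foaf/0.1/knows> <{alice}> .")

print(alice)           # http://example.org/ldpc/0001
print(ldpc.docs[bob])
```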

PUT seems to be better for situations like that. (Sadly LDP servers are not required to
support PUT.)

With PUT you get all you want:
 • the client knows in advance the name of the resource it is creating.
 -> so it knows exactly what the relative URLs in the document are referring to

Still, the problem there would be that if the server rejected one of the PUTs, your client
would then need to patch the created documents to modify the URLs they were pointing
to.
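
As a rough sketch of the PUT case, assuming the server accepts every PUT: the container URL, the names alice and bob, and
the dict standing in for the server are all made up for illustration. Because the client chooses the names, the documents
can point at each other before either exists:

```python
from urllib.parse import urljoin

base = "http://example.org/ldpc/"   # hypothetical container URL

# The client picks both names up front, so the documents can refer to
# each other with relative URLs before either one exists on the server.
docs = {
    "alice": "<> <http://xmlns.com/foaf/0.1/knows> <bob> .",
    "bob":   "<> <http://xmlns.com/foaf/0.1/knows> <alice> .",
}

store = {}  # stands in for the server: URL -> stored body
for name, body in docs.items():
    store[urljoin(base, name)] = body   # what a successful PUT would do

# The relative link <bob> inside alice's document resolves to a URL
# that the client also created.
link = urljoin(urljoin(base, "alice"), "bob")
print(link in store)  # True
```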

> Common parsers (like Jena) don't treat null URIs and relative URIs consistently.
That may be a problem with those frameworks. We use Jena in banana-rdf and got it to work correctly. 
  https://github.com/w3c/banana-rdf

> Some of the possible approaches for addressing these problems are:

There

> The obvious solution would be to use fully qualified URIs on every request. But the client doesn't always know what the resulting URI will be.
> Another approach would be to use a placeholder, a fully qualified URI that the server knows is acting just as a placeholder (Ex. <http://example.org/placeholder>). But that would mean the client is constantly specifying new triples for the same resource (from an academic point of view). And the problem of multiple resources on a single request wouldn't be solved by this approach.
> After some thought, we came up with the concept of "Generic Request URI".
> 
> Generic Request URI
> A URI that has as a base, a known and never changing URI, and that ends with a slug that is different for every Generic Request URI created (in our case a timestamp).
> Example
> A template of the form: http://example.org/generic-requests/<timestamp> would create URIs like:
> <http://example.org/generic-requests/1412868212000>
> <http://example.org/generic-requests/1412868258000>
> <http://example.org/generic-requests/1412868262000>
> Using a Generic Request URI when creating resources covers the following problems:
> It standardises the URIs the server will receive.
> If the client wants to specify a hint, it would do so by passing a Slug header.
> Each request describes a unique resource and thus it is academically correct.
> Multiple resources can be created by declaring each one with a different Generic Request URI.
> 
> 
> So an LDP server would accept requests with the following forms:
> A resource with a fully qualified URI. In this case the client attempts to create a resource with a known URI so a Slug header isn't allowed and if the URI is already in use the server would respond with 409 Conflict.
> A resource with a Generic Request URI and no slug specified. The server would use the URI of the parent resource as a base and forge a slug for the new resource however the server is configured to do so.
> A resource with a Generic Request URI and a Slug header. The server would use the Slug header as a hint for the URI of the new resource to be created.
> I have more comments and concepts to share, but I will write another email for them.
> 
> -- 
> Miguel Aragón	
> Mobile: +52 (811) 798 9357
> Skype: miguel.araco
> Email: miguel.aragon@base22.com

Social Web Architect
http://bblfish.net/
Received on Friday, 10 October 2014 07:59:07 UTC