- From: Paul Gearon <gearon@ieee.org>
- Date: Fri, 15 May 2009 10:53:48 -0500
- To: "Seaborne, Andy" <andy.seaborne@hp.com>
- Cc: Alexandre Passant <alexandre.passant@deri.org>, SPARQL Working Group <public-rdf-dawg@w3.org>
On Fri, May 15, 2009 at 5:01 AM, Seaborne, Andy <andy.seaborne@hp.com> wrote:

<snip/>

> Servers should be allowed to (and encouraged?) to reject "file:". Technically, much like "FROM <file:...>" but with worse consequences.

I agree, but this is still a useful feature if security is configured appropriately. We've regularly been in a position of loading multi-GB files, and this is handled much better from the local file system. Also, if you have permission to write to graphs on the server then that will often correspond to permission to write to the host's file system.

>> So that's file: URLs. What about http: ?
>>
>> If we are using the http protocol in the URL to be loaded then
>> everything is being transferred by http protocol anyway, so why
>> confuse the issue by including a "command" in the transfer? It also
>> makes it awkward for the client that wants to send a file to the
>> server, but doesn't have an HTTP server on hand to respond to an HTTP
>> request for the file to be loaded. (This particular scenario also
>> requires two connections, when one would suffice)
>
> The number of web hops the data takes is important. With LOAD <url> the data flows from the URL to the server, and does not flow via the client.

Sorry, I wasn't clear. We never want the data to move more than once. The server should receive an http: URL and retrieve that on its own.

My first point was really a matter of style. Since I was already thinking of protocol-based load operations, I was just suggesting that maybe a text-based command would be overkill. Looking back at it, it's a strange objection, and I withdraw the suggestion. I was only playing devil's advocate anyway. :-)

My second point was the case where a user has a file that they want to upload. If we only support "load <http://....>" then this means that the user must have access to write the required file to a web server somewhere.
If that's a separate server, then they have to move the file there first before it can be loaded, meaning 2 transfers of the data. Otherwise, the web server must be on the client's host, which creates the bizarre case of the SPARQL server connecting back to the client that just issued the request in order to get the file.

Of course, this has the obvious solution of allowing the client to send the data up with a POST, as was discussed later.

> A use case I have in mind is the ability to collect data from a number of places with an update script of
>
> LOAD <url1>
> LOAD <url2>
> LOAD <url3>
> ...

Certainly, and I support this. The difficulty I'm pointing out is the case where the data to be loaded is being held by the client (this forms the majority of our use cases).

Even if these LOAD commands were not available, they could be simulated with (for the first url):

  INSERT { ?s ?p ?o } WHERE { GRAPH <url1> {?s ?p ?o} }

Incidentally, I'd also like to see the LOAD command updated to have an optional [INTO <uri>] at the end of the command. So the following would be equivalent:

  LOAD <url1> INTO <uri2>

  INSERT INTO <uri2> { ?s ?p ?o } WHERE { GRAPH <url1> {?s ?p ?o} }

>> I'd like to see a standard for POSTing a file to a graph on a server,
>> as this can be done easily with code or even a web form. Personally, I
>> also like having a command that does a load (we have one in Mulgara)
>> but the issues that I described make it seem difficult to standardize
>> in a way that will be suitable for any type of configuration.
>>
>> Please feel free to correct me on any of the above points. :-)
>>
>> Regards,
>> Paul Gearon
>
> Good point about the use case for a simple POST-data and POST-from-form which look compelling so we're being to tease out the requirements for update. Thanks.

I implemented this before having a use case, just because it seemed to make sense.
But since we've had it we're finding that it's become one of the most popular ways to load data (I use it exclusively now).

Regards,
Paul Gearon
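
The equivalence proposed in this message between LOAD <url> [INTO <uri>] and an INSERT over GRAPH <url> can be sketched as a small rewrite function. This is only an illustration: the function name load_to_insert and the regular expression are hypothetical, and the output follows the draft "INSERT INTO" form written in this message, not any finalized SPARQL Update grammar.

```python
import re

# Hypothetical pattern for the draft form discussed in this thread:
#   LOAD <url> [INTO <uri>]
_LOAD = re.compile(
    r"^\s*LOAD\s+<([^<>\s]+)>(?:\s+INTO\s+<([^<>\s]+)>)?\s*$",
    re.IGNORECASE,
)


def load_to_insert(command: str) -> str:
    """Rewrite a LOAD command into the INSERT form given in the message."""
    match = _LOAD.match(command)
    if match is None:
        raise ValueError(f"not a LOAD command: {command!r}")
    source, target = match.group(1), match.group(2)
    # The INTO clause is optional; without it the INSERT names no target graph.
    into = f" INTO <{target}>" if target else ""
    return (
        f"INSERT{into} {{ ?s ?p ?o }} "
        f"WHERE {{ GRAPH <{source}> {{ ?s ?p ?o }} }}"
    )


if __name__ == "__main__":
    print(load_to_insert("LOAD <url1>"))
    print(load_to_insert("LOAD <url1> INTO <uri2>"))
```

Note that the rewrite only covers the server-side case: the server still has to be able to dereference <url1> itself, so it does not help when the data is held by the client.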
Received on Friday, 15 May 2009 15:54:23 UTC