HTTP Update : graph naming from Andy Seaborne on 2009-11-10 (public-rdf-dawg@w3.org from October to December 2009)

From: Andy Seaborne <andy.seaborne@talis.com>
Date: Tue, 10 Nov 2009 12:45:17 +0000
To: RDF Data Access Working Group <public-rdf-dawg@w3.org>
Message-ID: <4AF9605D.1060206@talis.com>

I had a quick look at implementing the update protocol and a couple of 
points came up regarding ?graph= vs http://host/service/http:// vs 
http://host/graphs/mygraph56

I wanted a service endpoint that could handle both server-controlled 
URIs, e.g. http://host/graphs/mygraph56, and also graph names that are 
not server-local, via http://host/graphs/?graph=.. or the non-query 
concatenation form.

(Aside: some agreed neutral terminology needed!
    server-relative URIs
    query string identification
    concatenation identification
)
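To make the three terms concrete, here is a minimal sketch of how a client might build each form. The SERVICE value, the helper names and the choice to percent-encode the whole graph name in the concatenation form are assumptions for illustration, not anything the protocol draft fixes.

```python
from urllib.parse import quote, urlencode

SERVICE = "http://host/graphs/"  # hypothetical service endpoint


def server_relative(local_name):
    # Server-relative URI: the server owns the name under its endpoint.
    return SERVICE + quote(local_name, safe="")


def query_string_form(graph_uri):
    # Query string identification: the graph name travels in ?graph=
    return SERVICE + "?" + urlencode({"graph": graph_uri})


def concatenation_form(graph_uri):
    # Concatenation identification: the encoded graph name is appended
    # directly to the endpoint (assumed here: one level of encoding).
    return SERVICE + quote(graph_uri, safe="")


print(server_relative("mygraph56"))
print(query_string_form("http://example.org/g1"))
print(concatenation_form("http://example.org/g1"))
```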

1/ Service endpoint naming.

It's not always clear what the service endpoint URI actually is, so 
splitting request URIs at the service endpoint can be tricky.

This can be because the service endpoint is actually inside the firewall 
and the request is passed from the front end.  Aligning these names 
would be an additional deployment constraint that might be hard to 
justify. (Steve has mentioned this before.) Also, in a framework like 
servlets, the processing code might not actually know its own service 
endpoint name very well, or it may be registered under several different 
URLs.
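A rough sketch of what the splitting looks like when the code has to be told every name it might be reached under. The endpoint list and the function name are hypothetical; the point is only that the list has to be configured somewhere, which is the deployment constraint mentioned above.

```python
# Hypothetical: every URL the same service might be registered under,
# including the name seen behind the front end.
KNOWN_ENDPOINTS = [
    "http://host/graphs/",
    "http://internal:8080/app/graphs/",
]


def graph_part(request_uri, endpoints=KNOWN_ENDPOINTS):
    """Return the part of request_uri after the longest matching endpoint,
    or None if the request did not arrive under any known name."""
    matches = [e for e in endpoints if request_uri.startswith(e)]
    if not matches:
        return None
    longest = max(matches, key=len)
    return request_uri[len(longest):]


print(graph_part("http://internal:8080/app/graphs/mygraph56"))
print(graph_part("http://elsewhere/graphs/mygraph56"))
```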

This gets into service endpoint vs dataset/graph store naming.

2/ Extreme URIs.

For the non-query string form, corner case URIs can be difficult or 
ambiguous.  URI scheme names may appear in URIs as legitimate character 
strings so looking for a ":" is not always safe enough.

Some cases (unusual URIs, but this is a spec), with one level of encoding 
removed for clarity:

# Two encoded http: in URI
http://service/graphs/http://other/abc/http://yetanother/xyz

is a legal URI but which "http://" is the split?

This is quite nasty, as the URI could be passed around, so it's not just 
the service that decodes it to get the meaning out.
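The ambiguity can be shown mechanically. This sketch assumes a naive splitter that treats every embedded "http://" as a possible start of the graph name; the function name is made up for the example.

```python
import re


def candidate_splits(request_uri):
    """Return every (service, graph) pair obtained by splitting
    request_uri at an embedded "http://" (the leading one is skipped)."""
    splits = []
    for m in re.finditer(r"http://", request_uri):
        if m.start() == 0:
            continue  # this is the scheme of the request URI itself
        splits.append((request_uri[:m.start()], request_uri[m.start():]))
    return splits


uri = "http://service/graphs/http://other/abc/http://yetanother/xyz"
for service, graph in candidate_splits(uri):
    print(service, "|", graph)
```

Both splits are syntactically plausible, so without knowing the service endpoint the receiver cannot choose between them.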



# Embedded data has a URI in it.
http://service/graphs/http://other/g1%3Fquery=\
       CONSTRUCT...{GRAPH<http://yetother/}

%3F is ? encoded.

The "?" in ?query is encoded a second time to neutralise it and means 
something very different from if encoded once :-)
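A small check of the once-versus-twice point, using Python's standard urllib.parse (my choice of tool for the illustration, nothing the protocol depends on):

```python
from urllib.parse import quote, unquote

once = quote("?", safe="")     # "?" percent-encoded once
twice = quote(once, safe="")   # the "%" of that encoding is encoded again

print(once, twice)

# Removing one level of encoding gives different results:
print(unquote(once))    # a live "?" that would start a query string
print(unquote(twice))   # still an encoded "?", i.e. just data
```

A receiver that decodes one level too many, or one too few, ends up with a different URI.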


I'm still puzzling over the use of ?query=CONSTRUCT and query patterns 
with URIs in them.


3/ Server-scoped names with #frag are tricky.
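One likely reason, which is my assumption since it is not spelled out here: an HTTP client does not send the fragment to the server, so a graph named with an unencoded #frag arrives without it. A sketch, with a made-up helper that computes what would go on the request line:

```python
from urllib.parse import urlsplit


def request_target(uri):
    """Return the path (and query, if any) that an HTTP client would
    send for uri; the fragment is never part of the request."""
    parts = urlsplit(uri)
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query
    return target


print(request_target("http://host/graphs/mygraph56#frag"))
print(request_target("http://host/graphs/mygraph56%23frag"))
```

The second form, with "#" encoded as %23, keeps the characters in the path, but then it is a different URI from the graph name.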


These are extreme examples but ones we need to consider.  I'm not keen 
on just saying "don't do that" and placing a limitation in a spec unless 
there are identified reasons not to.  Some principle is needed.  The 
server-controlled cases should be just REST operations, with no 
SPARQL-derived restrictions so we have to make sure even extreme URIs do 
the right thing (at least, they are not ambiguous).


What experience have people had here?

 Andy

Received on Tuesday, 10 November 2009 12:45:42 UTC