Re: URI-Triples: RDF serialization for use in query strings from Benja Fallenstein on 2005-03-14 (semantic-web@w3.org from March 2005)

From: Benja Fallenstein <benja.fallenstein@gmail.com>
Date: Mon, 14 Mar 2005 18:22:03 +0100
To: Danny Ayers <danny.ayers@gmail.com>
Cc: semantic-web@w3.org
Message-ID: <ff7ba12a050314092210d08f8c@mail.gmail.com>

Hi Danny,

(disclaimer: the post below is a bit rambling about half-cooked ideas)

On Mon, 14 Mar 2005 16:35:54 +0100, Danny Ayers <danny.ayers@gmail.com> wrote:
> Your approach to syntax make sense, and this is definitely worthwhile
> work. I'm not sure it's worth putting much weight on legibility of the
> URI strings, as the encoding make them pretty unreadable whatever you
> do.

True, but IMHO there is still a big difference between "readable if
you concentrate" and "almost unreadable even if you concentrate." I
can read the syntaxes I've been toying with pretty quickly myself, if
I concentrate. OTOH, I've had some feedback that others still find
them almost completely unreadable, so not sure how much it's worth.

I think the difference is big because anything that makes it easier to
play with the interface is good -- being able to put a query string in
your browser and experiment with variations = good. Although--

> But being a short step away from readability (using XSLT
> formatting where XML data is available, or more simply code*).

Hmm, one thing I hadn't considered, though, is that modern browsers
will do %-encoding for you in the URI bar. Perhaps I've been barking
up the wrong tree -- putting Turtle in my browser bar still seems like
the simplest thing, and definitely more readable!

> What I
> don't get from your examples is the intended behaviour of the service
> (for GET and POST). In other words, if you're doing a GET and passing
> a bunch of statements, what are you asking the server for? 

I was thinking about GET and PUT; for POST, we don't need to encode
stuff in the URI, obviously.

I've actually slightly changed my mind after furiously thinking about
this during the weekend; I think encoding a SPARQL request in the URI
may be more useful, if we can find implementation techniques that make
it easy to implement for web services (more notes on what I mean here
below).

Before that, I was thinking of this as a replacement for the
application/x-www-form-urlencoded query string. Look at a query string
like

    ?q=Danny+Ayers&ie=UTF-8&oe=UTF-8

It seemed to me that what is going on in most query strings is that we
give properties of some resource -- namely, the resource identified by
the whole URI that the query string is part of. I.e., what the above
says is,

    @prefix s: <http://google.com/search?q=Danny+Ayers&ie=UTF-8&oe=UTF-8>

    s:    google:q     "Danny Ayers".
    s:    google:ie    "UTF-8".
    s:    google:oe    "UTF-8".

So what we have here is a non-extensible description of the resource
we want a representation of. My idea with URI-Triples was to make this
description extensible.
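
A rough sketch of that reading in Python, for illustration only: each
key/value pair of the query string becomes one triple whose subject is
the whole URI. The function name is made up, and "google:" is just the
illustrative prefix used above, not a real vocabulary.

```python
from urllib.parse import urlsplit, parse_qsl

def query_string_to_triples(uri, prefix="google:"):
    """Read each key=value pair as a property of the resource named by `uri`."""
    # parse_qsl decodes "+" and %-escapes and keeps the pairs in order.
    pairs = parse_qsl(urlsplit(uri).query)
    return [(uri, prefix + key, value) for key, value in pairs]

uri = "http://google.com/search?q=Danny+Ayers&ie=UTF-8&oe=UTF-8"
for s, p, o in query_string_to_triples(uri):
    print(f'<{s}> {p} "{o}" .')
```

The set of properties is fixed by whatever keys the service happens to
understand, which is the non-extensible part.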

As I said, though, I think that a SPARQL gateway is probably the
better problem. And probably just %-encoding it is the right thing,
considering that browsers do accept it... I just put this in
Konqueror's location bar:

http://example.net/?CONSTRUCT * WHERE (?x foaf:mbox <mailto:foo@example.org>)

and Konqueror went here:

http://example.net/?CONSTRUCT%20*%20WHERE%20(?x%20foaf:mbox%20<mailto:foo@example.org>)

(Uh-oh, looks like it's buggy -- it didn't escape <> -- but whatever.)
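
For comparison, a client that does not want to rely on the browser
could escape the query itself. A minimal sketch using Python's standard
urllib.parse; passing the query bare after the "?" mirrors the example
above and is not a defined protocol.

```python
from urllib.parse import quote, unquote

query = "CONSTRUCT * WHERE (?x foaf:mbox <mailto:foo@example.org>)"

# safe="" escapes every reserved character, including "<", ">", "?" and ":".
encoded = quote(query, safe="")
uri = "http://example.net/?" + encoded

print(uri)
assert unquote(encoded) == query  # decoding restores the original query
```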

I was originally looking for something that can do "I want some
reasonable information about this resource, e.g. a CBD if you think
that's reasonable" -- but then I found that SPARQL has DESCRIBE, so
nothing special needed there :-)

I said above that I see an implementation problem. What I'm worried
about is services that generate data on-the-fly, with a mapping
between data and triples that isn't entirely trivial. Here's an
extreme example to make the point. Consider a service which, when
GETting this URI:

    http://example.org/hashers/sha1?str=foo+bar+baz

returns this graph:

    _:a    hash:input    "foo bar baz".
    _:a    hash:fn       hash:SHA1.
    _:a    hash:output   "c7567e8b39e2428e38bf9c9226ac68de4c67dc39".

And now consider this SPARQL query:

    SELECT ?a WHERE ( ?x hash:input ?a ) ( ?x hash:fn hash:SHA1 )
        ( ?x hash:output "c7567e8b39e2428e38bf9c9226ac68de4c67dc39" )

Obviously, the service can't "solve" this query. Now not all services
may be such hard cases, but the basic pattern -- with a limited set of
queries -- may not be so unusual. Consider a service on the desktop,
where it's quick to give the size of a specified file, but
you would need to go through the whole hard drive to get the one file
with size 9012384. The implementor would want to implement the
function "given a file, return its characteristics"; can we layer a
SPARQL query service on top of that in a reasonable fashion?
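
One possible shape for such a layer, sketched in Python for the hash
example: the implementor writes only describe(), and a thin wrapper
answers a query when hash:input is bound and refuses otherwise. All
names here are hypothetical, patterns are plain (s, p, o) tuples rather
than parsed SPARQL, and subjects are ignored to keep it short.

```python
import hashlib

def describe(text):
    """What the implementor wants to write: given an input, return its triples."""
    digest = hashlib.sha1(text.encode("utf-8")).hexdigest()
    return [("_:a", "hash:input", text),
            ("_:a", "hash:fn", "hash:SHA1"),
            ("_:a", "hash:output", digest)]

def answer(patterns):
    """Answer (s, p, o) patterns; objects starting with "?" are variables.

    Refuses unless some pattern binds hash:input to a constant, since the
    service cannot run the hash function backwards.
    """
    inputs = [o for (_, p, o) in patterns
              if p == "hash:input" and not o.startswith("?")]
    if not inputs:
        raise ValueError("unsupported query: hash:input must be bound")
    by_property = {p: o for (_, p, o) in describe(inputs[0])}
    bindings = {}
    for _, p, o in patterns:
        value = by_property.get(p)
        if value is None:
            return None          # property the service knows nothing about
        if o.startswith("?"):
            bindings[o] = value  # fill in the variable from generated data
        elif value != o:
            return None          # constant in the query contradicts the data
    return bindings

print(answer([("?x", "hash:input", "foo bar baz"),
              ("?x", "hash:output", "?h")]))
```

The query that binds only hash:output raises ValueError here, which at
least makes the "limited set of queries" explicit to the client.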

> (aside -
> might the common 256 character limit on GETs be a problem?).

Absolutely.

However, I think that's inherent in the problem -- we just have to
ignore the limit. GET with a body in the request would be a bigger change (and
a spec change too, while the URI spec allows for arbitrary-size URIs).
Of course we *could* use POST, but then we throw away the benefits
from GET.

And when we nest URIs in URIs -- as we have to, in RDF-related
requests -- we'll get long URIs, whatever we do.

We could alleviate the problem, though, by providing default
namespaces on the server side -- i.e., the server would implicitly set
some namespace prefixes if the client doesn't set them explicitly.
E.g. above, I assumed that foaf: was set implicitly.
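
A small sketch of what that could look like on the server, assuming
prefixes arrive as a simple name-to-namespace mapping (the function and
the table are hypothetical; only the FOAF namespace URI is real):

```python
# Prefixes the server assumes when the client does not declare them.
DEFAULT_PREFIXES = {"foaf": "http://xmlns.com/foaf/0.1/"}

def expand(qname, client_prefixes=None):
    """Expand "prefix:local" to a full URI; client prefixes override defaults."""
    prefixes = {**DEFAULT_PREFIXES, **(client_prefixes or {})}
    prefix, _, local = qname.partition(":")
    if prefix not in prefixes:
        raise KeyError(f"unknown prefix: {prefix}")
    return prefixes[prefix] + local

print(expand("foaf:mbox"))
print(expand("foaf:mbox", {"foaf": "http://example.org/ns#"}))
```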

> There are a few RDF APIs around (e.g. Joseki's WebAPI), and the SPARQL
> Protocol looks like it'll be very useful. But there doesn't seem to be
> much cohesion between the approaches, still nothing you could call The
> RDF Protocol. Ok, it's probably a situation where standards based on
> implemented ad hoc specs might be preferable to up-front WG
> recommendation. But maybe it's time for the DAWG to pick this up and
> do some combination of the various approaches and a little unnatural
> selection.

Hmm, I had a quick look at Joseki, and it sounds to me like obvious
HTTP + the SPARQL protocol does most of this.

Getting a whole graph is HTTP GET; not much to think about there.

Query is the SPARQL protocol, using SPARQL as the query language.

Update, well -- actually I think update should be handled like get and
query, using HTTP PUT. So, if you want to change the volume of
my:speaker to 20, you do this:

    PUT http://example.org/?query=CONSTRUCT%20*%20WHERE%20(my:speaker%20snd:volume%20?x)
    Content-Type: application/x-turtle

    @prefix...

    my:speaker snd:volume "20".

I.e., you use the URI to specify which part of the graph you want to
change, and PUT the triples representing the new state there...
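
From the client side, building such a request might look roughly like
this in Python. It only constructs the request and does not send it;
build_put is a made-up helper, and the prefix declarations are left out
of the body just as they are elided above.

```python
from urllib.parse import quote
from urllib.request import Request

def build_put(base, sparql, turtle):
    """Build a PUT whose URI selects part of the graph and whose body is its new state."""
    url = base + "?query=" + quote(sparql, safe="")
    return Request(url, data=turtle.encode("utf-8"), method="PUT",
                   headers={"Content-Type": "application/x-turtle"})

req = build_put("http://example.org/",
                "CONSTRUCT * WHERE (my:speaker snd:volume ?x)",
                'my:speaker snd:volume "20".')
print(req.get_method(), req.full_url)
```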

> *
> class URIEncoder:
...

Unfortunately, it doesn't work because {} doesn't guarantee iteration
order (so you might first escape " " to "%20" and then "%" to "%25"),
and because for decoding, escapes may be lower-case. But it's close
:-)
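
A sketch of one way around both problems, in Python. This is not Danny's
class, just an illustration: encoding walks the input once, so nothing
is escaped twice and no table order matters, and decoding parses the hex
digits, so "%2f" and "%2F" come out the same. Malformed escapes are not
handled.

```python
# Characters that never need escaping (the "unreserved" set).
SAFE = set("abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789-._~")

def encode(text):
    """Percent-encode in a single pass over the UTF-8 bytes."""
    out = []
    for byte in text.encode("utf-8"):
        ch = chr(byte)
        out.append(ch if ch in SAFE else "%%%02X" % byte)
    return "".join(out)

def decode(text):
    """Decode %XX escapes, accepting upper- and lower-case hex digits."""
    out = bytearray()
    i = 0
    while i < len(text):
        if text[i] == "%":
            out.append(int(text[i + 1:i + 3], 16))  # int() ignores hex case
            i += 3
        else:
            out.extend(text[i].encode("utf-8"))
            i += 1
    return out.decode("utf-8")

print(encode("100% sure"))
print(decode("a%2fb"), decode("a%2Fb"))
```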

- Benja
Received on Monday, 14 March 2005 17:22:35 UTC