Re: Issue 10 -- Hash vs. Slash

* Juan Sequeda <juanfederico@gmail.com> [2011-02-23 13:33-0600]
> Thanks Ivan for this!
> 
> Comment inline
> 
> On Wed, Feb 23, 2011 at 4:53 AM, Ivan Herman <ivan@w3.org> wrote:
> 
> >
> > On Feb 23, 2011, at 07:10 , Juan Sequeda wrote:
> >
> > > I believe the issue is the following:
> > >
> > > 1) hash - http://foo.example/DB/People/ID=7#_
> > >
> > > vs
> > >
> > > 2) slash - http://foo.example/DB/People/ID=7
> > >
> > > For option 1) you would actually have to retrieve the whole graph while
> > > for option 2) you would do a 303 (a la linked data) and just get the
> > > particular triples needed.
> > >
> > > Eric, is this right?
> >
> > I am not sure that is so clear cut.
> >
> > Juan, I know that much of what I write below is known to you, so it is not
> > meant as a direct answer to you. However, I thought writing down the issue
> > in more detail makes sense for others, mainly for those of us who are more
> > on the 'database' side rather than the Linked Data/RDF side. Sorry if it is
> > a bit longish (and I am sure that Richard, who is much more picky on these
> > things than I am, will correct me if needed:-)
> >
> > We are talking about the URI identifying the subject for a row. For the
> > sake of discussion, let us say this URI is <lala>. Note that <lala> does not
> > represent a 'real' thing _in_ the database, it rather refers to a kind of an
> > abstract, conceptual thing.
> >
> > The question for a linked data person is: what do I expect to receive when
> > I do an HTTP GET request on <lala>? Note that we have _not_ yet defined
> > that, but I think a fairly safe answer is to say that somehow the triples
> > that have <lala> as subject, ie, the triples for the row, are returned in
> > some way. And it also depends on the type of information I expect; ie, what
> > are the preferred media types I add to my HTTP GET request. If I expect
> > HTML, then I may get an HTML page with a one row table with headers and the
> > values. If I expect RDF in some encoding format, I may get all the triples
> > that are generated for that row and which have <lala> as subject. This is
> > the magic of content negotiations performed on the server side. In both
> > cases, we have to realize that the returned information is NOT <lala>; it is
> > a _representation_ thereof. Ie, the information that is returned should have
> > a different URI, say, <lala-r>. The question of course is what is then
> > <lala-r> if I know <lala>. (One can go a step further and have a
> > <lala-r-html> and <lala-r-rdf>, a bit like the dbpedia URI-s are used, but
> > we may not want to go that far.)
> >
> > Let us say <lala> is </People/ID=7#_>. Per HTTP spec, what goes to the
> > server in terms of a GET URI is _not_ </People/ID=7#_>; it is
> > </People/ID=7>. This is what the fragid of URI+HTTP tells us to do. (There
> > is no need for any client side trick; if I copy paste that URI into my
> > browser's address bar, this is what should happen for a well behaving
> > browser). In other words, we can safely assume/define <lala-r> =
> > </People/ID=7>: this is what is sent to the server, the content negotiation
> > occurs, and I get back information under the URI </People/ID=7>, which I can
> > consider to be the representation of the (abstract) URI </People/ID=7#_>. I
> > am done. (There is of course a trick here: indeed, there is nothing
> > _meaningful_ after the hash, it is just a trick to automatically
> > differentiate between <lala> and <lala-r>. In more general cases that might
> > be a load because the information being returned may not have the right
> > granularity, and this is what we referred to on the call that 'the whole
> > graph has to be downloaded'. But this is not relevant here.)
> >
> > Let us say <lala> is </People/ID=7>. In this case the client has no other
> > choice than to send </People/ID=7>, ie, <lala>, to the server. So the server
> > has to have its own setup for what <lala-r> is. This URI has to be
> > communicated back to the client which, in a second HTTP round, will ask for
> > <lala-r>. This is the dreaded 303 response mechanism: the server sends back
> > a message saying: "<lala> is not something I can return, but you may want to
> > look at <lala-r> which is a good representation of <lala>", so the client
> > will then issue a second GET on <lala-r>.
> >
> > Ie, there is no JavaScript issue here. How and where SQL comes into the
> > picture is also immaterial in this sense. I believe that, at the end of the
> > day, we can safely assume that the graph being returned is the same in both
> > cases. But setting up the slash case seems to be more demanding on the
> > server side and increases the number of necessary HTTP requests. So,
> > personally, I think that the hash version has a lower load.
> >
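To make the two flows above concrete, here is a rough sketch in Python. It is
only a simulation under stated assumptions: the two "servers" are in-memory
stand-ins rather than real HTTP endpoints, and the ".rdf" URI used as <lala-r>
in the slash case is a made-up name, not something the WG has defined.

```python
from urllib.parse import urldefrag

DOC = "http://foo.example/DB/People/ID=7"
# Hypothetical <lala-r> for the slash case (name invented for this sketch).
DOC_R = "http://foo.example/DB/People/ID=7.rdf"


def hash_server(request_uri):
    """Hash case: the fragment-less URI is served directly."""
    if request_uri == DOC:
        return 200, {}, "triples for row 7"
    return 404, {}, ""


def slash_server(request_uri):
    """Slash case: <lala> answers 303 and points at <lala-r>."""
    if request_uri == DOC:
        return 303, {"Location": DOC_R}, ""
    if request_uri == DOC_R:
        return 200, {}, "triples for row 7"
    return 404, {}, ""


def dereference(uri, server, max_hops=5):
    """Return (body, list of request URIs actually sent to the server)."""
    # The fragment is dropped before the request is made.
    request_uri, _fragment = urldefrag(uri)
    sent = []
    for _ in range(max_hops):
        sent.append(request_uri)
        status, headers, body = server(request_uri)
        if status == 303:
            # Second round trip: follow the redirect to the representation.
            request_uri = headers["Location"]
            continue
        return body, sent
    raise RuntimeError("too many redirects")


hash_body, hash_sent = dereference(DOC + "#_", hash_server)
slash_body, slash_sent = dereference(DOC, slash_server)
```

With these stand-ins the hash URI costs one request and the slash URI two,
and both end up with the same body.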
> > Actually:
> >
> > - It is a mini-minor issue, but I am not even sure that the '_' character
> > is necessary at the end of the URI. AFAIK, everything works equally well if
> > we use </People/ID=7#>. But I may miss something...
> >
> 
> Exactly. I still don't have a clear reason why we need a '_'. Eric is the
> one who proposed it. Eric, can you please clarify.

The media type for the returned resource determines the interpretation
of fragids in that resource. In RDF, it's at least customary to provide
a fragment identifier so that
  <:People rdf:ID="_" :fname="Bob"/>
describes the node uttered in the DM.

I'm loath to go back to both RDF Core and the TAG to revisit
http-range-14 with the clarifying question "Is the denotation of
<lala#> different from the denotation of <lala>?" In at least HTML,
they are the same, and I wouldn't want to bet that no RDF libraries
normalize them, especially when parsing RDF/XML.
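As one small illustration of that worry (a sketch only, using Python's
standard urllib.parse and saying nothing about any particular RDF library),
a generic URL splitter reports the same document URI and the same empty
fragment for <lala> and <lala#>, while <lala#_> stays distinguishable:

```python
from urllib.parse import urldefrag

BASE = "http://foo.example/DB/People/ID=7"


def split_fragment(uri):
    """Return (document URI, fragment) as plain strings."""
    doc, frag = urldefrag(uri)
    return doc, frag


no_hash = split_fragment(BASE)            # <lala>
empty_frag = split_fragment(BASE + "#")   # <lala#>
underscore = split_fragment(BASE + "#_")  # <lala#_>

# After splitting, <lala> and <lala#> can no longer be told apart,
# whereas <lala#_> still carries a non-empty fragment.
same_after_split = (no_hash == empty_frag)
```

Code that round-trips a URI through this kind of split has no way to restore
a bare trailing '#', which is part of why a non-empty fragid looks safer.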


> > - I wonder whether we should describe somewhere (maybe not as a required
> > feature but an advised one) that the server, when getting a GET request on
> > <lala>, is expected to return a graph containing all triples in a row in the
> > requested media type.
> >
> >
> > >
> > > Ashok, do you mean that it doesn't really matter? Are you saying that
> > > when that URI is dereferenced, let it be hash or slash, that it would always
> > > get translated into a SQL query and just get the triples that are needed?
> > >
> > > The issue that I see with this URI is the following... consider the
> > > prefix
> > >
> > > PREFIX ex: <http://foo.example/DB/People/ID=>
> > >
> > > for slash you would have
> > >
> > > ex:7
> > >
> > > for hash it would be
> > >
> > > ex:7#_
> > >
> > > For ex:7, that works, right? But ex:7#_ is not allowed.
> >
> > I find this argument compelling indeed. And maybe this is something that we
> > will have to live with.

I also lament the loss of this attractive syntax.

Note that this syntax only works in SPARQL and Turtle because we
specifically liberalized XML's NameStartChar, which does not allow a
name to start with a digit:
[[
  [4] NameStartChar ::= ":" | [A-Z] | "_" | [a-z] | [#xC0-#xD6] | [#xD8-#xF6] | [#xF8-#x2FF] …
]] — http://www.w3.org/TR/REC-xml/#NT-NameStartChar

It would also only work for the last attribute in compound primary
keys (rare).
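The prefixed-name point can be sketched as follows. This is a simplified,
ASCII-only approximation loosely modeled on SPARQL's PN_LOCAL production, not
the real grammar; the function names are made up for the example.

```python
import re

EX = "http://foo.example/DB/People/ID="

# Simplified, ASCII-only stand-in for the local part of a prefixed name:
# starts with a letter, digit or "_"; may contain "-" and "."; must not
# end with "."; "#" is not permitted anywhere.
LOCAL_PART = re.compile(r"[A-Za-z0-9_](?:[A-Za-z0-9_.-]*[A-Za-z0-9_-])?")


def is_valid_local_part(local):
    """True if `local` fits the simplified local-part pattern above."""
    return LOCAL_PART.fullmatch(local) is not None


def expand(prefix_iri, local):
    """Expand prefix + local part into a full IRI, or reject the local part."""
    if not is_valid_local_part(local):
        raise ValueError(f"not usable as a prefixed name: {local!r}")
    return prefix_iri + local


slash_ok = is_valid_local_part("7")    # ex:7
hash_ok = is_valid_local_part("7#_")   # ex:7#_
```

Under this approximation ex:7 expands to the slash URI, while the hash form
has to be written out in full as <http://foo.example/DB/People/ID=7#_>.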


> I guess this will be the discussion for next week :)
> 
> 
> >
> > Sorry for the long email...
> >
> > Ivan
> >
> >
> > > If this is true, I would rather have the slash uri.
> > >
> > > Thoughts?
> > >
> > > Juan Sequeda
> > > +1-575-SEQ-UEDA
> > > www.juansequeda.com
> > >
> > >
> > > On Tue, Feb 22, 2011 at 6:05 PM, ashok malhotra <ashok.malhotra@oracle.com> wrote:
> > > I'm trying to understand the issue.
> > >
> > > Are we discussing how to identify the RDF node that corresponds to a row
> > > in a table with a primary key?  If so, we should create a URI that, when
> > > dereferenced, performs a SQL query and gets the row.  A bit of JavaScript
> > > on the client or server can turn the row into an RDF node with
> > > properties.   Not sure why we need the 303
> > >
> > > Is this the right question?
> > >
> > > --
> > > All the best, Ashok
> > >
> > >
> >
> >
> > ----
> > Ivan Herman, W3C Semantic Web Activity Lead
> > Home: http://www.w3.org/People/Ivan/
> > mobile: +31-641044153
> > PGP Key: http://www.ivan-herman.net/pgpkey.html
> > FOAF: http://www.ivan-herman.net/foaf.rdf
> >
> >
> >
> >
> >
> >

-- 
-ericP

Received on Monday, 28 February 2011 13:19:54 UTC