Re: Review of SPARQL 1.1 Uniform HTTP Protocol for Managing RDF Graphs review, rev1.56 from Chimezie Ogbuji on 2010-10-07 (public-rdf-dawg@w3.org from October to December 2010)

From: Chimezie Ogbuji <ogbujic@ccf.org>
Date: Thu, 07 Oct 2010 15:37:05 -0400
To: "Steve Harris" <steve.harris@garlik.com>, "public-rdf-dawg@w3.org Group" <public-rdf-dawg@w3.org>
Message-ID: <C8D399A1.13D62%ogbujic@ccf.org>
Hello, Steve.  Thanks for the review.  My response is inline below.

On 10/5/10 4:27 PM, "Steve Harris" <steve.harris@garlik.com> wrote:
> Major
> 
> §4.3
> 
> I'm concerned that "Implementations of this protocol MUST obey the rules
> specified there regarding the resolution of relative URI references" rules out
> reverse proxies as implementations of this protocol. In our experience reverse
> proxies are commonly used in front of SPARQL endpoints to provide load
> balancing, additional security, and/or hardening. From the client's p.o.v. the
> proxy is the SPARQL endpoint.

I'm not sure I understand.  Can you give an example of how the URI
resolution rules in [RFC3986] break proxy behavior and some idea of how
other protocols that build on URIs in the same way avoid this problem? Do
you have some suggested text to address this or are you saying the normative
dependence on [RFC3986] is the issue here?
 
> Also HTTPS is an issue here, it's not generally possible for the server to
> tell from HTTP headers whether the HTTP or HTTPS protocol was used by the
> client, unless it has sight of the outer layers of the networking code. It may
> be possible to guess from the port number, but I don't think that's a good
> basis for picking URIs.

I'm not sure I understand the issue here either.  Are you saying that using
graph URIs of the form https://example.com is problematic because the server
cannot verify if the underlying protocol being used is indeed HTTPS? If so,
I'm not sure what this has to do with this specification in particular
(rather than being a general issue with using URIs).
 
> I find the logic in the last example in 4.3
> (http://example2.com/rdf-graphs/employee/1) a bit convoluted. It seems like
> it's necessary to parse the RDF document before you can determine which graph
> is to be updated. If I sent that request, and got that result I would
> certainly be surprised.

Parsing the document is necessary if the indirectly specified graph URI is
not absolute, because a base URI embedded in the document has the highest
precedence (and must be ruled out).
See the earlier thread on this:

http://lists.w3.org/Archives/Public/public-rdf-dawg/2010JanMar/0542.html

It is the same as in any other specification that needs to resolve relative
references involved in an HTTP message with a payload that can embed the
base URI and where the specification has a normative dependence on RFC 3986
for defining the appropriate behavior.

For example, if you were to fetch and parse the following document from
http://example.com/rdf-graphs/service/:

    <?xml version='1.0' encoding='UTF-8'?>
      <rdf:RDF
        xml:base='http://example2.com/rdf-graphs/employees/'
        xmlns:rdf='...'>
        <rdf:Description rdf:about="1">
            <rdfs:label>An RDF document</rdfs:label>
        </rdf:Description>
    </rdf:RDF>   

The resulting RDF triple should be:

<http://example2.com/rdf-graphs/employees/1> rdfs:label "An RDF document"

This holds despite the fact that the document was fetched from example.com.
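
As a rough illustration of the precedence involved (a sketch only, not text
from the draft), the snippet below resolves the relative reference "1" first
against the embedded base and then against the retrieval URI. The helper name
`resolve` and its parameters are hypothetical; `urljoin` is the Python
standard library's reference resolver.

```python
from urllib.parse import urljoin

# URIs taken from the example above.
RETRIEVAL_URI = "http://example.com/rdf-graphs/service/"
EMBEDDED_BASE = "http://example2.com/rdf-graphs/employees/"


def resolve(reference, retrieval_uri, embedded_base=None):
    """Resolve a relative reference (hypothetical helper).

    A base URI embedded in the content takes precedence over the URI
    used to retrieve the document; the retrieval URI is the fallback.
    """
    base = embedded_base if embedded_base is not None else retrieval_uri
    return urljoin(base, reference)


# With xml:base present, the embedded base is used.
with_base = resolve("1", RETRIEVAL_URI, EMBEDDED_BASE)

# Without it, the retrieval URI serves as the base.
without_base = resolve("1", RETRIEVAL_URI)

print(with_base)
print(without_base)
```

The first result is the subject of the triple shown above; the second is what
a client would get only if the document carried no base of its own.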


> Minor
> 
> Abstract
> 
> "SPARQL update language", should it be SPARQL Update...?
> Why is "statements" italicised?
> http://www.w3.org/TR/ is not linked

Changed.
 
> §1
> 
> "It emphasizes a clear separation between a RDF graph management action from
> the networked body of RDF knowledge identified as the target of the action,
> the lexical form of a Request URI, the URI of a graph in an
> Network-manipulable Graph Store, and the (optional) RDF delivered with the
> message" -- I can't parse this sentence.

Changed to:

[[[
It emphasizes the distinction between an RDF graph management action, the
networked body of RDF knowledge identified as the target of the action, the
lexical form of a Request URI, the URI of a graph in a Network-manipulable
Graph Store, and the (optional) RDF delivered with the message
]]]
 
> §2
 
> I think REST should be in []s, there's an informative reference at the end of
> the doc.
> "Network-manipulable Graph Store - The subset of a Graph Store comprised of
> named RDF graphs that can be directly managed by interactions through this
> protocol" -- does this imply that you can't manage the unnamed graph?

Yes, as it is currently written it doesn't support this.  We now have an
issue for this, and it will need to be addressed in the next round of edits.

> "RDF knowledge" seems to overlap heavily with "RDF Graph", and "RDF graphs"
> used earlier, and "RDF payload" defined later.

Their definitions distinguish them from each other.  RDF knowledge is an
information resource (the others are not).  An RDF graph is a data structure
(defined elsewhere).  RDF payload is the representation carried in an HTTP
message within the protocol (the others are not), etc.
 
> §4 
> in 4.1 it might be worth a quick note about what to do in the presence of
> reverse proxies, or even just noting that they're a problem in this case.

This is related to an earlier question, but do you have an example of a
problematic situation or a summary of the general problem that can be used
as a basis for such a note?
 
> I didn't feel that Figure 1 made the situation clearer. I think if the network
> operations were separated out from the conceptual relationships it might help.
> Is Figure 2 missing some arrows? There doesn't seem to be a connection between
> the encoded URI, the graph store, and the operation.

Figure 1 was updated to reflect the change to the term "RDF knowledge".  I
have added an 'identifies' relationship between the 'parent' URI and the
target of the operation in Figure 2.
 
> §5 
> It's not clear if the 404 requirement trumps the 400/405 or not. e.g. if I
> send a
> TRACE http://server/?graph=doesnotexist
> should it return 404 or 405?

I've changed the section of that paragraph from:

" with 405 (Method Not Allowed) or 400 (Bad Request), respectively."

To 

" with 405 (Method Not Allowed) or 400 (Bad Request), respectively. A
request using an unsupported HTTP verb in conjunction with a malformed or
unsupported request syntax should respond with a 405 (Method Not Allowed)."
 
> The 404 requirement presumably doesn't apply to PUT and POST.

The subsequent sentence was changed from

" If the RDF knowledge identified in the request does not exist in the
server, a 404 (Not Found) response code [...]"

To

"If the RDF knowledge identified in the request does not exist in the server
and the operation requires that it does, a 404 (Not Found) response code
[...]"
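
One possible reading of the two revised sentences, written out as a small
decision function. This is only a sketch under stated assumptions: the set of
supported methods, the set of operations that require existing RDF knowledge,
and the placement of the 404 check after the other two are all assumptions of
the sketch rather than text from the draft.

```python
# Assumption: the methods an implementation chooses to support.
SUPPORTED_METHODS = {"GET", "PUT", "POST", "DELETE"}

# Assumption: operations that need the RDF knowledge to exist already.
REQUIRES_EXISTING = {"GET", "DELETE"}


def choose_error_status(method, syntax_ok, knowledge_exists):
    """Return an error status code, or None if no error applies.

    Hypothetical helper illustrating one ordering of the checks.
    """
    # An unsupported verb yields 405, even when the syntax is also bad.
    if method not in SUPPORTED_METHODS:
        return 405
    # A supported verb with malformed or unsupported syntax yields 400.
    if not syntax_ok:
        return 400
    # 404 applies only when the operation requires existing knowledge.
    if method in REQUIRES_EXISTING and not knowledge_exists:
        return 404
    return None


print(choose_error_status("TRACE", True, False))
print(choose_error_status("PUT", True, False))
```

Under these assumptions the TRACE request in the question gets a 405, and a
PUT to RDF knowledge that does not exist yet is not an error.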

> §5.1
> 
> What does "native" mean in this section?

Changed from 

"It SHOULD be considered a native implementation of the following sequence
of [..]"

to

"Any request that uses the method in this way SHOULD be understood to
have the same effect as the following sequence of [...]"

(changed elsewhere as well)

> 
> I think it should be
>    DROP SILENT GRAPH <graph_uri> ;
>    INSERT DATA { GRAPH <graph_uri> { .. RDF payload .. } }
> As per the latest Update draft.

This has been changed.
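
For illustration, the equivalence could be sketched as a function that builds
the update request for a given graph URI and payload. The function name
`put_as_update` is hypothetical, and the sketch assumes the payload is already
serialized in a syntax that is valid inside INSERT DATA; it does no escaping
or validation.

```python
def put_as_update(graph_uri, payload):
    """Build the SPARQL Update sequence with the same effect as a PUT.

    Hypothetical helper: graph_uri is inserted verbatim between angle
    brackets, and payload is assumed to be valid triple syntax.
    """
    return (
        f"DROP SILENT GRAPH <{graph_uri}> ;\n"
        f"INSERT DATA {{ GRAPH <{graph_uri}> {{ {payload} }} }}"
    )


update = put_as_update(
    "http://example.com/rdf-graphs/employees",
    '<http://example.com/s> <http://example.com/p> "o" .',
)
print(update)
```

DROP SILENT keeps the sequence from failing when the graph does not exist yet,
so the same two operations cover both creating and replacing a graph.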
 
> §5.2
> 
> Don't understand opening sentence "The HTTP DELETE method SHOULD be used to
> delete the RDF knowledge identified by either the request or encoded URI. This
> method MAY be overridden by human intervention (or other means) on the origin
> server." does this imply that HTTP DELETE is the preferred way to delete
> content? I suspect "be used to" should be removed. There's similar wording in
> other parts of §5 too.

Changed to: "A request that uses the HTTP DELETE method SHOULD delete the
RDF knowledge identified by either the request or encoded URI [...]"

Other similarly worded sentences were changed in this way.

Thanks

-- Chime


Received on Thursday, 7 October 2010 19:47:09 UTC