Re: Review of http://www.w3.org/2001/sw/DataAccess/proto-wd/ from Eric Prud'hommeaux on 2005-09-02 (public-rdf-dawg@w3.org from July to September 2005)

From: Eric Prud'hommeaux <eric@w3.org>
Date: Fri, 2 Sep 2005 18:32:11 -0400
To: Kendall Clark <kendall@monkeyfist.com>
Cc: DAWG public list <public-rdf-dawg@w3.org>
Message-ID: <20050902223211.GD17622@w3.org>
On Fri, Sep 02, 2005 at 11:36:52AM -0400, Kendall Clark wrote:
> 
> On Tue, Aug 30, 2005 at 12:47:58AM -0400, Eric Prud'hommeaux wrote:
> > In this, I use the notation ~ to indicate a change to a line, + or -
> > to indicate addition or removal.
> > 
> > Abstract:
> > 
> > [[
> > SPARQL is a query language and data access protocol for RDF.
> > ]]
> > 
> > This implies to me that the protocol is used for something other than,
> > or at least, more than, SPARQL queries. I suggest "SPARQL is a query
> > language and protocol for RDF."
> 
> Hmm, really? "data access" is in the name of our group... I think this is
> editorial, and I'm not convinced by the claim of a misleading implication.
> And it's even harder to sustain that implication given what the first few
> sentences of the introduction say.

Let me try to motivate the controversial editorial comments...

The only data access protocol we offer is the query protocol. "query
language and data access protocol" suggests to me that there's a query
language and a data access protocol beyond the queries, i.e. GetGraph
or something like that. I'm not saying the current words are
specifically inaccurate, but they do let one make a wrong assumption
which is corrected by further reading of the spec.

> > s/an SPARQL query/a SPARQL query/ as SPARQL is always voiced as a word
> > rather than an acronym.
> 
> Right, which is one reason why "SPARQL Query Language" in the QL spec is
> fine, despite Andy's claims to the contrary. I sent him this exact comment
> (about "SPARQL" acting like a noun rather than an acronym)... Oh well, so
> much for inter-spec consistency.

Aha, so I should look inwards before looking at other specs. Pointer?
  http://www.w3.org/Search/Mail/Public/search?type-index=public-rdf-dawg&index-type=t&keywords=SPARQL+noun+Kendall&search=Search
got only your message to me.

> > s/is being developed by/was developed by/ for publication, to instill
> > a sense of confidence.
> 
> Yep, got it.
> 
> > Introduction:
> > 
> > "or binding to another protocol" leaves the reader thinking not about
> > wire protocols, but some peer protocol, perhaps for manipulating
> > graphs. I recommend:
> > [[
> > SPARQL Protocol is described in two ways:  first, as an abstract interface
> > independent of any binding to
> > a transport protocol; second, as HTTP and SOAP bindings of this
> > interface.
> > ]]
> 
> Eh... All these appeals to the Ideal Reader fall flat. SOAP isn't a
> transport protocol, after all!

Technically, neither is HTTP (in OSI-parlance, it's a transfer
protocol; TCP is a transport protocol). It would be nice to get
the reader pointed in the right direction, but I don't get the
impression that you are interested in this level of nitpicking.

> > ========================================================================
> > 
> > 
> > SPARQL Protocol:
> > 
> > [[
> > and operations; and it is also described concretely
> > ]]
> > s/ and// I think a ';' should separate an independent sentence and
> > starting sentences with "and" is weak.
> 
> I tweaked this.
> 
> > [[
> > SparqlQuery is the protocol's only interface. It contains one
> > operation, query, which is used to convey a SPARQL query -- including a
> > SPARQL query string and, optionally, an RDF dataset description -- and a
> > query result between clients (requesters) and services (responders).
> > ]]
> > 
> > The sentence is awkward to parse. How about moving the description of
> > the message contents (query string and dataset) to the discussion of
> 
> I tweaked this sentence too.
> 
> > Why In/Out instead of the more familiar Request/Response?
> 
> Those're the terms used by WSDL2.
> 
> > [[
> > This interface and its operation are described in the following WSDL
> > 2.0 fragment (from sparql-protocol-query.wsdl):
> > ]]
> > 
> > The reader encounters namespaces suddenly so I would say
> > "(from sparql-protocol-query.wsdl, which contains the namespace
> > declarations)".
> 
> Ok.
> 
> > [[
> > The RDF dataset may be specified either in a legal [SPARQL] query
> > using FROM and FROM NAMED keywords; or it may be specified in the
> > protocol described in this document; or it may be specified in both
> > the query proper and in the protocol.
> > ]]
> > 
> > s/legal //
> > s/query proper/query string/
> 
> Done.
> 
> > [[
> > Resolving an Ambiguous RDF Dataset
> > ]]
> > 
> > The dataset isn't ambiguous 'cause this spec says that protocol
> > overrides query string. How about "Overriding the RDF Dataset"?
> > This also shows up in an example section title.
> 
> I'm still happy with ambiguous.
> 
> > [[
> > the dataset specified in the protocol must be the RDF dataset consumed
> > by SparqlQuery's query operation.
> > ]]
> > I guess the rule is, if any dataset parameter is specified by the
> > protocol, the query must be performed over exactly the dataset
> > specified in the protocol, including partial intersections and proper
> > subsets.
> 
> Was this a suggestion for additional, clarifying language?

Hmm, rereading [[
In the case where both the query and the protocol specify an RDF
dataset, but not the identical RDF dataset, the dataset specified in
the protocol must be the RDF dataset consumed by SparqlQuery's query
operation.
]], I think it says it. Maybe an extra sentence will cement the point:

"Thus, if the protocol specifies either a default-graph-uri or
any named-graph-uris, the query operation MUST ignore any FROM
or FROM NAMED directives in the SPARQL query string."

I'm not super-pleased with introducing "directive", but I feel it's
worth it to clarify that one can't, say, override the default graph
without wiping out all the named graphs.
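
To make the override rule concrete, here is a minimal sketch in Python.
It is illustration only, not text from the draft: the Dataset class and
the effective_dataset function are hypothetical names. The only things
taken from the spec are the default-graph-uri and named-graph-uri
parameters and the rule that a dataset given in the protocol wins.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Dataset:
    """Hypothetical holder for an RDF dataset description."""
    default_graphs: List[str] = field(default_factory=list)  # default-graph-uri / FROM
    named_graphs: List[str] = field(default_factory=list)    # named-graph-uri / FROM NAMED

    def is_empty(self) -> bool:
        return not self.default_graphs and not self.named_graphs


def effective_dataset(protocol: Dataset, query: Dataset) -> Dataset:
    """Pick the dataset the query operation should consume.

    If the protocol names any graph at all, its dataset replaces the one
    from the query string wholesale; nothing from FROM / FROM NAMED is
    merged in. Otherwise the query string's dataset is used.
    """
    if not protocol.is_empty():
        return protocol
    return query


# The protocol overrides only the default graph, yet the named graphs
# from the query string are dropped as well.
from_protocol = Dataset(default_graphs=["http://example.org/d"])
from_query = Dataset(default_graphs=["http://example.org/q"],
                     named_graphs=["http://example.org/n"])
print(effective_dataset(from_protocol, from_query))
```

The point of returning the protocol's dataset as a whole, rather than
merging field by field, is exactly the "can't override the default graph
without wiping out the named graphs" behaviour described above.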

> > 3. query Out Message
> > 
> > [[
> > Abstractly, the contents of the Out Message of SparqlQuery's query
> > operation is an instance of an XML Schema complex type, called
> > query-result in Figure 1.2, composed of either one or the other of two
> > further elements:
> > ]]
> > confusing
> > s/ either one or the other of two further elements//
> 
> ACK.
> 
> > 4. query Fault Messages
> > 
> > QueryRequestRefused does not represent a refusal to query a
> > request. How about just "QueryRefused"?
> 
> No, it represents a refusal of a query request.
> 
> > [[
> > When the MalformedQuery fault message is returned, query processing
> > services should include explanatory, debugging, or other additional
> > information intended for human consumption via the fault-details type
> > defined in Figure 1.3.
> > ]]
> > text is nearly consistent with another defn 2 paragraphs below, except
> > that the later one talks about "the fault-details XML Schema type"
> 
> ACK.
> 
> > 
> > QueryRequestRefused
> > [[
> > It is not part of the semantics of the QueryRequestRefused fault
> > message as to whether the server may or may not process a subsequent,
> > identical request or requests.
> > ]]
> > 
> > One request-one fault, if I understand. Also, could be shorter:
> > 
> > "A QueryRequestRefused fault message does not indicate whether the
> > server will process a subsequent, identical request."
> 
> ACK.
> 
> > HTTP Bindings
> > 
> > [[
> > it requires protocol bindings to become a concretely invocable
> > operation.
> > ]]
> > I don't think concretely adds anything.
> > s/a concretely invocable/an invocable/
> 
> ACK.
> 
> > [[
> >             <!-- Default is application/xml -->
> >             Whttp:outputSerialization="" 
> >             whttp:faultSerialization=""/>
> > ]]
> > That Whttp seems suspicious. I guess it's not actually excerpted text yet.
> 
> ACK. Changed due to decisions made in Tues's call anyway.
> 
> > HTTP Examples
> > 
> > [[
> > The following abstract HTTP trace examples illustrate invocation of
> > the query operation under several different scenarios. These example
> > traces are abstracted from legal HTTP traces in three ways: (1) In
> > each example the string "EncodedQuery" represents the properly encoded
> > string equivalent of the SPARQL query given in the first block of each
> > example; (2) only partial response bodies, containing the query
> > results, are displayed; (3) the URI values of default-graph-uri and
> > named-graph-uri are not properly encoded. See @@ for legal HTTP
> > traces.
> > ]]
> > 
> > s/legal/complete/g
> 
> ACK.

or "kosher". I still like "kosher".

> > This doesn't tell me what encoding I should use, or if the query
> > string is url-encoded (unless I look in the <pre/>s for
> >   whttp:inputSerialization="application/x-www-form-urlencoded").
> 
> Yep, that section is (in that draft) unfinished. Coming very soon.
> 
> 
> > By the phrase "which is formatted in order to be readable", do you
> > mean HTML formatting (bolding), or not url-encoded? (HTML is
> > permanently in debug mode.) The *-graph-uri parameters need to be
> > encoded in real life.
> 
> I'm dropping that phrase because it doesn't add anything to the disclaimer
> that prefaces all of the HTTP examples.

Cool. That's my pref.

"are not properly encoded". Well, they're not encoded at all. I went
hunting for the best word (crumbs below), but the best I can think of
to replace "properly encoded" is "urlencoded".

Crumbs:

The WSDL Adjuncts 6.9.1 should say something about this encoding, but
I think they only make references to the MIME type and to "%-encoded"
(by which they probably refer to the IRI encoding).

I think Adjuncts needs some of the text from XForms

http://www.w3.org/TR/2003/REC-xforms-20031014/slice11.html#serialize-urlencode

[[
The encoding of EltName and value are as follows: space characters are
replaced by +, and then non-ASCII and reserved characters (as defined
by [RFC 2396] as amended by subsequent documents in the IETF track)
are escaped by replacing the character with one or more octets of the
UTF-8 representation of the character, with each octet in turn
replaced by %HH, where HH represents the uppercase hexadecimal
notation for the octet value and % is a literal character. Line breaks
are represented as "CR LF" pairs (i.e., %0D%0A).
]]
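
As a rough illustration of what that XForms paragraph asks for, here is
a small Python sketch. It is not from the draft or from Adjuncts: the
function names urlencode_value and build_query_string are made up for
this message, and I'm assuming the standard library's
urllib.parse.quote_plus is an acceptable stand-in for the XForms rules
(it replaces spaces with +, escapes everything outside the unreserved
set as uppercase %HH over the UTF-8 octets, and is a bit stricter than
RFC 2396 about which punctuation it escapes).

```python
from urllib.parse import quote_plus


def urlencode_value(value: str) -> str:
    """Encode one parameter value along the lines of the XForms text."""
    # Represent every line break as a CR LF pair before escaping.
    normalized = value.replace("\r\n", "\n").replace("\r", "\n").replace("\n", "\r\n")
    # Spaces become '+'; other reserved and non-ASCII characters become
    # %HH escapes of their UTF-8 octets.
    return quote_plus(normalized, encoding="utf-8")


def build_query_string(params):
    """Join (name, value) pairs into a urlencoded query string."""
    return "&".join(f"{quote_plus(name)}={urlencode_value(value)}"
                    for name, value in params)


# Example values only; the URI is a placeholder.
print(build_query_string([
    ("query", "SELECT ?s WHERE { ?s ?p ?o }"),
    ("default-graph-uri", "http://example.org/g"),
]))
```

This is the form the "EncodedQuery" placeholder and the *-graph-uri
values would take in a complete trace, assuming the
application/x-www-form-urlencoded serialization.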


> > 
> > Does a SPARQL server need to negotiate to n3? How about turtle? or
> > RDFXML?
> > 
> > Is it A Good Idea to Accept: application/sparql-results+xml;
> > charset=utf-8 on a DESCRIBE query? (see DESCRIBE with simple RDF
> > dataset.)
> 
> I took an ACTION about this and we discussed it.
> 
> Thanks for the review, Eric.

np. Thanks for your work.
-- 
-eric

office: +81.466.49.1170 W3C, Keio Research Institute at SFC,
                        Shonan Fujisawa Campus, Keio University,
                        5322 Endo, Fujisawa, Kanagawa 252-8520
                        JAPAN
        +1.617.258.5741 NE43-344, MIT, Cambridge, MA 02144 USA
cell:   +81.90.6533.3882

(eric@w3.org)
Feel free to forward this message to any list for any purpose other than
email address distribution.
Received on Friday, 2 September 2005 22:32:18 UTC