Re: SPARQL WG open comments from Axel Polleres on 2011-11-15 (public-rdf-dawg@w3.org from October to December 2011)

From: Axel Polleres <axel.polleres@deri.org>
Date: Tue, 15 Nov 2011 09:29:33 +0100
To: Paul Gearon <gearon@ieee.org>
Cc: "Lee Feigenbaum" <lee@thefigtrees.net>, "SPARQL Working Group" <public-rdf-dawg@w3.org>
Message-Id: <C8E4580A-F4B7-43F1-A090-FFAF7CE8B1B8@deri.org>

Hi Paul,

one small thing... As for:

> >> > 3. In searching for the definition of the backslash "\" symbol in
> >> > section 4.2, it looks like it is supposed to be set difference, ...
> >>
> >> Axel?
> >
> > yes, that's what it is supposed to mean... Do you think we should replace it with 'set-difference', or say that '\' denotes set-difference?
> > (both ok for me, it's anyways just an editorial change).
> 
> Just explain that it means set difference. I was OK with it meaning
> that, but it just needs to be defined.


I have fixed this now in the latest editor's draft.
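
For anyone reading along who is unsure about the notation, here is a minimal sketch in Python of what '\' (set difference) means. The triples and variable names are made up for illustration and are not taken from the spec text:

```python
# Hypothetical triples, represented as plain tuples for illustration only.
graph = {
    ("ex:a", "ex:p", "ex:b"),
    ("ex:a", "ex:p", "ex:c"),
}
to_delete = {
    ("ex:a", "ex:p", "ex:c"),
    ("ex:x", "ex:p", "ex:y"),  # not in `graph`, so removing it has no effect
}


def set_difference(left, right):
    # "left \ right": every element of `left` that is not also in `right`.
    return left - right


remaining = set_difference(graph, to_delete)
```

Elements of the right-hand set that do not occur in the left-hand set are simply ignored, which is the behaviour one would expect when removing triples that may not be present.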

Axel

On 14 Nov 2011, at 23:28, Paul Gearon wrote:

> I'm adding the WG to the addressees here, as my original reply was
> accidentally only sent to the chairs. Consequently I'm leaving large
> blocks of text so that context is clear for anyone who wants to catch
> up on the thread. I apologise in advance if this makes it hard to read
> the email.
> 
> After illness and just returning from a conference I am trying to
> catch up on the issues that Lee and Axel have asked me on. Responses
> to the most recent message from Axel are below.
> 
> On Wed, Nov 9, 2011 at 3:23 AM, Axel Polleres <axel.polleres@deri.org> wrote:
> > Hi Paul,
> >
> >> Was deathly ill last week and am trying
> >> to catch up right now before I have to travel in the morning (so I
> >> can't make the call tomorrow either).
> >
> > Hope you're better again!
> >
> >> I will be able to update the wiki when
> >> I get back.
> >
> > Please let us know on the list when you have it on the wiki; if you could make it before the next meeting, that'd be awesome.
> > More answers inline below...
> >
> > On 8 Nov 2011, at 06:47, Paul Gearon wrote:
> >
> >> Hi Lee,
> >>
> >> Sorry I've been out of touch. Was deathly ill last week and am trying
> >> to catch up right now before I have to travel in the morning (so I
> >> can't make the call tomorrow either).
> >>
> >> More comments below:
> >>
> >> > On 11/1/2011 9:08 PM, Lee Feigenbaum wrote:
> >> >> Hi Paul,
> >> >>
> >> >> On today's SPARQL call, we went through all of our open comments. In the
> >> >> next few weeks, we'll be looking to close all of these comments as we
> >> >> wrap up last call. We will also have a 2nd last call period that will
> >> >> run from the beginning of December until January.
> >> >>
> >> >> To that end, we need to identify ASAP any substantive changes to our
> >> >> documents that will need to be included in a 2nd Last Call.
> >> >>
> >> >> Today, we identified the following comments that have not yet been
> >> >> responded to that are your responsibility (or are partially your
> >> >> responsibility, but need input from update to add to what's already
> >> >> there from query). Could you please try to find some time in the next
> >> >> few weeks to address these comments?
> >> >>
> >> >> RC-4
> >> >> DB-4
> >> >> DB-5
> >>
> >>
> >> I've done a little with those comments, though I haven't had it
> >> together enough to post. I am including everything I have here so that
> >> you'll have it for the meeting. I will be able to update the wiki when
> >> I get back.
> >
> >>
> >> For DB-4, the section:
> >>
> >> > 3. http://www.w3.org/TR/sparql11-update/#graphStore
> >> > "Operations may specify graphs to work with, or they may rely on a
> >> > default graph for that operation."
> >> > But don't operations use RDF Datasets, rather than graphs?
> >>
> >> <AndyS> This is not about query despite the subject line :-)
> >>
> >> Operations query RDF Datasets, and make modifications to graphs. I
> >> suggest changing the words "work with" to "be modified".
> 
> I've changed the Update document to reflect this.
> 
> 
> >> > 4. Regarding the typographical conventions that you use for conformance
> >> > keywords:
> >> > [[
> >> > When this document uses the words must, must not, should, should not,
> >> > may and recommended, and the words appear as emphasized text, they must
> >> > be interpreted as described in RFC 2119 [RFC2119].
> >> > ]]
> >> > As you can see in the excerpt quoted above, the typographical emphasis
> >> > of these keywords (i.e., bolding) is lost when the document is viewed as
> >> > plain text or copied and pasted as plain text.  This makes it more
> >> > difficult to quote and discuss portions of the specification precisely.
> >> > To ensure clarity, please make these keywords UPPER CASE (perhaps in
> >> > addition to being bold).
> >>
> >> I wanted to be consistent, so I looked at SPARQL 1.1 Protocol for RDF.
> >> I was under the assumption that there must have been some uniformity.
> >> However, I see that the Graph Store HTTP Protocol doc uses
> >> capitalization, and Service Description uses capitalization and a
> >> style, while none of the other documents refer to RFC 2119 at all.
> >>
> >> If David's concerns make sense, then I can change it, though I
> >> recommend a standard approach be used by the other documents that
> >> reference RFC 2119.
> >
> > I guess we could think about that (uniformly using capitalization & bold, for instance).
> 
> I have left these words bolded, but have also added capitalization. If
> a standard is adopted between documents then I will change to that.
> 
> 
> >> For DB-5
> >>
> >> > 1. Please either add capability for virtual graphs or keep the COPY, ADD
> >> > and MOVE shortcuts, to enable standard SPARQL to be used more
> >> > efficiently as a rules language and in data production pipelines.  COPY,
> >> > ADD and MOVE operations cost almost nothing to implement, and they help
> >> > with efficiency.  By "virtual graph" I mean a graph that consists of the
> >> > merge of a particular set of named graphs -- a very important capability
> >> > for efficient data production pipelines.
> >>
> >> I support full acceptance of these features. I do not support
> >> introducing virtual graphs into SPARQL 1.1.
> >
> > +1 (personal opinion)
> 
> This needs input from the working group before it can be endorsed, but
> I'm happy to make the appropriate change as soon as we can.
> 
> 
> >> > 2. This paragraph in sec 3.1.3 is a bit confusing:
> >> > [[
> >> > That is, the GroupGraphPattern in the WHERE clause will be matched
> >> > against the dataset described by explicit USING or USING NAMED clauses,
> >> > if specified, and against the graph store otherwise. Any graph name
> >> > specified in a WITH clause will - for evaluating the WHERE clause -
> >> > refer to the default graph to be used in the absence of USING or USING
> >> > NAMED clauses. In the presence of one or more graphs referred to in
> >> > USING clauses, the default graph will be the merge of these graphs,
> >> > meaning that the graph in a WITH clause will be ignored while evaluating
> >> > the WHERE clause. If there is no USING clause, but there is one or more
> >> > USING NAMED clauses, then the dataset will include an empty graph for
> >> > the default graph.
> >> > ]]
> >> > In particular, the sentence "Any graph name specified in a WITH clause
> >> > will - for evaluating the WHERE clause - refer to the default graph to
> >> > be used in the absence of USING or USING NAMED clauses." seems odd.  The
> >> > graph specified in the WITH clause will refer to the *default* graph?  I
> >> > would think it would be used *instead* of the default graph.  Isn't that
> >> > the point of WITH?  Perhaps the term "default graph" is being used in an
> >> > unusual way in this paragraph, to mean "the graph that will be used in the
> >> > absence of USING or USING NAMED"?  I think it would be misleading to
> >> > call that a "default graph".  Normally the term "default graph" refers
> >> > to the unnamed slot in a Graph Store, per the first paragraph in section
> >> > 2.  I think it would be best to use the term only in that way.
> >>
> >> He's right, in that it's confusing.
> >>
> >> One problem is that there are 2 types of "default" graph. There's the
> >> default graph for the store, and the default graph in a query. For
> >> instance, a query that says "select * {?s ?p ?o}" gets data from the
> >> "default" graph, but the protocol can set this graph to be anything,
> >> by using the default-graph-uri parameter. If this hasn't been set,
> >> then the query will refer to the default graph of the store (the
> >> default-default graph).
> >>
> >> Paragraph 3.1.3 is referring to the default graph of a query, while
> >> David is referring to the default graph of a store, hence his
> >> confusion.
> >>
> >> Other than that, the text makes sense, if you know what it's supposed
> >> to mean. However, I suspect that if you don't already know what it's
> >> trying to say, then it may be a bit impenetrable. Does it need to be
> >> changed?
> >
> > Could you propose some clarifying modification or addition?
> 
> I have not had a chance to consider this since last week. I will try
> to come up with something by the telecon tomorrow.
> 
> 
> >> > 3. In searching for the definition of the backslash "\" symbol in
> >> > section 4.2, it looks like it is supposed to be set difference, ...
> >>
> >> Axel?
> >
> > yes, that's what it is supposed to mean... Do you think we should replace it with 'set-difference', or say that '\' denotes set-difference?
> > (both ok for me, it's anyways just an editorial change).
> 
> Just explain that it means set difference. I was OK with it meaning
> that, but it just needs to be defined.
> 
> 
> >> > 4. The difference between "USING" and "USING NAMED" is not explained,
> >> > except in passing: "This describes a dataset in a manner similar to FROM
> >> > and FROM NAMED clauses in the SPARQL1.1 Query Language."
> >>
> >> This does not appear to be "in passing" to me.
> >
> >>
> >> The INSERT/DELETE operations (the only ones that use USING and USING
> >> NAMED) operate as a query-and-update. The query is basically exactly
> >> the same as what is described in SPARQL 1.1 Query Language. However,
> >> to avoid confusion in "deleting from" a graph, we opted to avoid the
> >> use of the keyword FROM and replace it with USING.
> >>
> >> Does someone suggest better wording to avoid David's concern?
> >>
> >
> > We could replace "in a manner similar to FROM and FROM NAMED" with
> > "in the same way as FROM and FROM NAMED" and maybe add a direct link to
> >  http://www.w3.org/TR/sparql11-query/#specifyingDataset
> > Ok?
> 
> Done.
> 
> 
> >> > 5. As written, this in sec 3.1:
> >> > http://www.w3.org/TR/sparql11-update/#graphUpdate
> >> > [[
> >> > Graph update operations change existing graphs in the Graph Store but do
> >> > not explicitly delete nor create them. Non-empty inserts into
> >> > non-existing graphs will, however, implicitly create those graphs, i.e.,
> >> > an implementation *should* create graphs that do not exist before
> >> > triples were inserted into them (there may be implementations providing
> >> > an update service over a fixed set of graphs which in such case *must*
> >> > return with failure for update requests that would create an unallowed
> >> > graph), and *may* remove graphs that are left empty after triples are
> >> > removed from them.
> >> > ]]
> >> > seems to say that an implementation that operates over a *variable*
> >> > (non-fixed) set of graphs still has the option of not automatically
> >> > creating graphs that do not exist.
> >> >
> >> > I suggest rewording the above portion as:
> >> > [[
> >> > Graph update operations change existing graphs in the Graph Store but do
> >> > not explicitly delete nor create them. Non-empty inserts into
> >> > non-existing graphs will normally implicitly create those graphs, i.e.,
> >
> > I still like ", however, implicitly" better than "normally implicitly"
> >
> >> > an implementation fulfilling an update request *should* silently and
> >> > automatically create graphs that do not exist before triples are
> >> > inserted into them, and *must* return with failure if it fails to do so
> >> > for any reason.  (For example, the implementation may have insufficient
> >> > resources, or an implementation may only provide an update service over
> >> > a fixed set of graphs
> >
> > where the implicitly created graph is not within this fixed set
> >
> >> .)  An implementation *may* remove graphs that are
> >> > left empty after triples are removed from them.
> >> > ]]
> >>
> >> (similar suggestion for point 6)
> >>
> >> David's rewording does seem a little better, and I'm happy to incorporate it.
> >>
> >
> > see my suggested addition/modification above. Otherwise ok with that change.
> 
> Done.
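
To make the intended behaviour concrete, here is a small Python sketch of an in-memory store that follows the reworded text. The class, its parameters and the exception name are all hypothetical; `allowed_graphs` models an implementation with a fixed set of graphs, and `drop_empty` models the optional removal of emptied graphs:

```python
class UpdateFailure(Exception):
    """Raised when an update request cannot be fulfilled (hypothetical name)."""


class GraphStore:
    def __init__(self, allowed_graphs=None, drop_empty=False):
        self.graphs = {}                      # graph name -> set of triples
        self.allowed_graphs = allowed_graphs  # None means any graph may be created
        self.drop_empty = drop_empty          # the "may remove empty graphs" option

    def insert(self, graph_name, triples):
        triples = set(triples)
        if not triples:
            return  # only non-empty inserts implicitly create a graph
        if graph_name not in self.graphs:
            fixed = self.allowed_graphs is not None
            if fixed and graph_name not in self.allowed_graphs:
                # Fixed set of graphs: creating this one is not allowed.
                raise UpdateFailure("cannot create graph " + graph_name)
            self.graphs[graph_name] = set()   # silent, automatic creation
        self.graphs[graph_name] |= triples

    def delete(self, graph_name, triples):
        graph = self.graphs.get(graph_name)
        if graph is None:
            return
        graph -= set(triples)
        if not graph and self.drop_empty:
            del self.graphs[graph_name]
```

The sketch treats the "should" as always honoured and only fails for the fixed-set case; a real implementation could fail for other reasons, such as insufficient resources.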
> 
> 
> >> Point 7 is similar, but I prefer the original text.
> >>
> >>
> >> > 8. How is the URI of a Graph Store indicated?  The concept of a Graph
> >> > Store is central to the SPARQL 1.1 Update spec, and hence one should be
> >> > able to use a URI to refer to a particular Graph Store, but the spec
> >> > doesn't say how this is done.
> >> <further discussion on this>
> >>
> >> I don't have an answer to this one. Whenever I've used an RDF store
> >> the documentation for that software has always told me the form of the
> >> URI for the store. I've never seen it defined in any way, but then,
> >> I've never really needed it to be. Suggestions?
> >>
> >
> > I'd say the following:
> >
> > The information how a graph store is accessed is defined in the protocol and graph store protocol specs:
> > A graph store is accessible by either an update service (cf. protocol) or via the graph store protocol (cf. graph store protocol),
> > in any case it is hidden behind the service, so it's accessible via the URI of a SPARQL
> > update service or via a URI that responds to the graph store protocol.
> 
> This works for me, though I changed the wording a little:
> 
> "The information how a graph store is accessed is defined in the
> protocol and graph store protocol specs. A graph store is accessible
> by either an update service (cf. protocol) or via the graph store
> protocol (cf. graph store protocol). In either case the graph store is
> hidden behind the service, making it accessible via the URI of a
> SPARQL update service or via a URI that responds to the graph store
> protocol."
> 
> 
> >> -- On RC-4 --
> >>
> >> I agree with Andy's comments. I also wanted to note that Richard's
> >> example is invalid, in that it would not be legal on a query endpoint.
> >>
> >> Richard also says:
> >> "I am surprised that the security issues arising from obfuscation
> >> through string escaping are not stated in the Security Considerations
> >> sections of SPARQL Query and SPARQL Update."
> >>
> >> I do not consider this to be an issue, as it is only users with update
> >> permissions who will be successfully issuing update operations.
> >>
> >> There is the potential for this to be an issue for a system that wants
> >> to create a fine-grained permissions scheme (for instance, allowing
> >> insertion, but not removal). Is this a concern worth documenting?
> 
> This was a question for the larger group. Should I add something for
> systems that are in this category?
> 
> 
> >> Andy comments that a WG decision is needed for the following:
> >>
> >> >> • As part of the changes to the escape processing model for \u escapes,
> >> >> additional characters (e.g. "=", ",") would be allowed, in \u escaped form,
> >> >> in prefixed names.
> >>
> >> > I oppose this change, as there is no use case for it. Prefixed names are a
> >> > convenience for authors to make long IRIs easier to write and read. Escapes
> >> > like \u003D and \u002C are neither easy to write nor easy to read, so they
> >> > defeat the purpose of prefixed names. IRIs that include such characters just
> >> > have to be written as absolute or relative IRIs.
> >>
> >> Richard is quite right here. However \u unescaping is something that
> >> an implementor would likely want to do before parsing. Singling out
> >> the prefixes so that they are not treated this way would require
> >> parsing them out, then unescaping the remainder of the text, and then
> >> continuing the parsing.
> >>
> >> Unless there is a good reason to require that prefixes not allow
> >> escaping, then I would prefer to keep escaping on the entire text.
> >
> > Shall we discuss this in one of the upcoming TelCos?
> 
> Sure, though I'm in favour of the status quo.
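
To illustrate why unescaping the entire text first is the simpler option, here is a rough Python sketch of a pre-parse pass over \uXXXX and \UXXXXXXXX escapes. The function name and the sample query are made up, and this is only one possible way an implementor might do it:

```python
import re

# Matches \u followed by 4 hex digits, or \U followed by 8 hex digits.
_CODEPOINT_ESCAPE = re.compile(r"\\u([0-9A-Fa-f]{4})|\\U([0-9A-Fa-f]{8})")


def unescape_codepoints(text):
    """Replace every codepoint escape in `text` before any parsing happens."""

    def replace(match):
        hex_digits = match.group(1) or match.group(2)
        return chr(int(hex_digits, 16))

    return _CODEPOINT_ESCAPE.sub(replace, text)


# Hypothetical query text with an escaped "=" inside a prefixed name.
raw = r"SELECT * { ?s ex:a\u003Db ?o }"
unescaped = unescape_codepoints(raw)
```

Because the pass runs over the raw text, it cannot tell a prefixed name from any other token; exempting prefixed names would mean tokenising first, which is the extra work described above.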
> 
> Regards,
> Paul Gearon
>
Received on Tuesday, 15 November 2011 08:30:15 UTC