- From: Lee Feigenbaum <lee@thefigtrees.net>
- Date: Tue, 19 Jul 2011 22:57:51 -0400
- To: Carlos Buil Aranda <cbuil@fi.upm.es>
- CC: Gregory Williams <greg@evilfunhouse.com>, SPARQL Working Group <public-rdf-dawg@w3.org>
On 7/19/2011 5:02 PM, Carlos Buil Aranda wrote:
> just summarizing a bit, the main concern is the use of variables in SERVICE. The problem lies in the specification of the semantics and the boundedness condition, which suggests an order for executing the query. The first problem is the way the execution of SERVICE VAR is defined: under the current join semantics it is wrong, because it is not possible to evaluate Join(G, Service(VAR, G, Transform(P), SilentOp)) while VAR still has no value. Following this, there is the boundedness condition, which has to be completed for all the SPARQL 1.1 operators. Am I right?
>
> So, the possible solutions are:
> - For the SERVICE VAR semantics
>   - add a new operation that could allow the evaluation (an operation which wouldn't be bottom-up)
>   - define its whole semantics in a new way
> - For the boundedness restriction
>   - specify all cases: it may take a bit long
>   - remove it? I do not think this is a good idea: how do we then specify that a variable is bound (which is needed for the evaluation semantics of SERVICE VAR)?
>
> something missing? any other option?

As someone who was a bit uncomfortable including it in the first place, I'll add another, dramatic option: remove SERVICE VAR from the specification altogether.
Lee

>
> Carlos
>
> PS I notice that there are still pending issues in the previous email to be addressed; this is just a summary of what I think is the most important topic of the email
>
> 2011/7/19 Gregory Williams <greg@evilfunhouse.com>
>
> On Jul 19, 2011, at 1:06 PM, Carlos Buil Aranda wrote:
>
> > 2011/7/18 Gregory Williams <greg@evilfunhouse.com>
> >> >>
> >> >> "Let G := Join(G, Service(VAR, G, Transform(P), SilentOp))"
> >> >> I don't think this works, as the evaluation semantics should try to evaluate the Service() pattern without access to the results of evaluating the G pattern (which are needed to bind VAR)
> >> > I do not completely understand what you mean.
> >>
> >> What I mean here is that to execute the Service() part of this expression in a bottom-up fashion, it must be able to evaluate without data from G. The bottom-up semantics are defined by Query 1.1 section 18.5 ("Evaluation of Join(P1, P2)") and by Federation 1.1 section 3.1 ("Definition: Evaluation of a Service Pattern"). In this case, the left-hand-side of the join (P1) is the pattern before the SERVICE block (P2). I don't think you can evaluate Service(VAR, G, Transform(P), SilentOp), because you can't invoke a service operation on a variable. You need to substitute an actual URL for VAR, but the URLs that need to be substituted are produced in an entirely separate evaluation (eval(D(G), P1)).
> >>
> > thanks, now I see the problem. One solution could be to add a restriction in which P1 must have been evaluated before P2, P2 being a SERVICE pattern, and only if P1 contains VAR. This would make explicit that VAR must exist before that join happens. The query could be represented as a tree in which each leaf is a pattern and the nodes are the operators.
>
> I don't think such a restriction would work with the current evaluation semantics. To do that, you'll need to not use Join() and define your own join operation that isn't bottom-up.
>
> >> I think this is a big problem, and it relates to another of my comments:
> >>
> >> >> "foreach i in Ω(?var->i)"
> >> >> Where does Ω come from in this definition? I think it's meant to refer to results from a join that is outside the scope of this operation.
> >> > yes, I will make that explicit.
> >>
> >> I don't think making it explicit will help, because as currently defined it simply can't work with the existing Join() evaluation semantics. Have I misunderstood something?
> >>
> > no, you are right. If we define Ω as a set of solutions that were previously generated, that would work, right?
>
> Only if you're not using Join().
>
> >> >> "if IRI is a service URL"
> >> >> "if IRI is a SPARQL service"
> >> >> How do I know if it's a SPARQL service URL or just some other URL?
> >> > You can't; users are responsible for knowing what they query, in the same way users should know what data they want to query in a SELECT
> >>
> >> Users being responsible for knowing that is irrelevant if the spec is using language like "if IRI is a service URL" and "if IRI is a SPARQL service". Where is the spec language defining what happens if the IRI *isn't* a SPARQL service?
> > you are right, I will add a note in the defn of what happens if the IRI isn't a SPARQL service. It will make the query fail unless SILENT is present. Is that ok?
>
> I would think the best action would be to simply drop the wording about "is a service URL". The translation/evaluation should proceed only on the distinction between IRI and VAR, not on what *type* of IRI it is. Let failures during the service invocation handle the cases where the IRI isn't actually a "service URL".
>
> >> >> "UNBOUND is not a possible value for ?Xi in BindingValues"
> >> >> I don't know what "not a possible value" means. "?Xi is not unbound in BindingValues"?
> >> > it is related to the issue you noticed in the service04.arq test. I will fix that.
> >>
> >> I'm not sure this is connected to the service04 issue. My concern was with the use of "possible" in the description. I would think UNBOUND is always a "possible" value, it just might not actually be present in "BindingValues". This might just be me being pedantic, but I'd prefer a different wording that made more explicit that the condition here is that UNBOUND can't appear in the BindingValues clause for the ?Xi variable.
> > If you think that a rewording is necessary, it is ok for me. I'm not a native English speaker so any suggestion/correction is very welcome.
>
> Sure. How about:
>
> "* P = P1 BINDINGS ?X1 ... ?Xn {BindingValues } and ?X is either strongly bound within P1 or ?X = ?Xi and UNBOUND is not a value bound to ?Xi in BindingValues."
>
> ?
>
> >> Also, I think there are a lot more cases than described where it's simply not possible to tell if the variable is bound at the (syntactic) point in the query where the SERVICE is used. The combination of BIND, RAND, IF, EXISTS, select expressions, extension functions, etc. makes it impossible to know if a variable is going to be bound ahead of time, and these cases aren't mentioned. The definition of "strongly bound" seems intentionally conservative, so maybe these are all cases meant to be an error. If that's the case, I think this needs to be pointed out explicitly.
> > I think it is possible to tell syntactically whether a variable may be bound or not, but I have not worked out all the cases you are pointing out. The idea is that a variable is bound if the pattern that contains that variable can be executed beforehand. You are right that there are many cases and each of them has to be studied in detail. The boundedness condition has to be checked in detail. The boundedness condition could go as a note until each case is studied, which could be done after LC, do you agree? if I'm mistaken we can propose a different thing to make sure that a variable is bound at execution time.
>
> I'm worried that the boundedness condition being "checked in detail" might go well beyond our current timeline.
>
> The two definitions (strong boundedness and service safeness) are defined in section 3.1, which is referenced as the conformance criteria, but I still don't know how these definitions relate to conformance.
>
> >> The discussion of "service-safeness" and "boundedness" (which elsewhere is actually 'strong boundedness') in section 2.4 seems rather disconnected from the rest of the text. These two things are defined at the end of section 3.1, but there isn't any text in 3.1 that refers to them. After these definitions are included, only in section 4.1 is "service safeness" mentioned, and then only weakly ("The Service Safeness definition ***suggests the use*** of a specific order in the execution", emphasis mine). MUST a conforming implementation execute patterns in an order suggested by the "service safeness" definition? I think this either needs much stronger definitions and normative text, or we should consider dropping the variable-endpoint form of federation entirely (punting it until next time, I suppose).
> > I do not think that dropping the variable-endpoint is a good idea, I find it very handy, for instance to gather data from a set of endpoints and make a copy in a specific server. I can regroup everything in a SERVICE VAR section, using the safeness definition, and stating explicitly that there are cases that might not have been considered, warning that an order in the execution is needed. After LC add everything needed.
>
> I'm not arguing that it's "very handy," but I think it's underspecified enough that I'm worried sorting out all the issues could impact our schedule.
>
> I'm not sure what you mean by "After LC add everything needed," but I wouldn't want to publish the spec in its current form while relying on the publication of some future Note to sort out problems. I'd be surprised if that were acceptable for a Rec.
>
> >> Section 3.1 "Definition: Evaluation of a Service Pattern" says "Execution failures cause the query to fail." Section 4.1 says "If a solution does not bind the variable, or binds it to something which cannot resolve to a SPARQL service, that solution is eliminated." How does "execution failure" differ from not being able to "resolve [a URL] to a SPARQL service"? If you get back a HTTP 400 or 500 (or, I guess, any other response code without a valid protocol response body), how is an implementation supposed to determine if this is an "execution failure" or a situation where the endpoint URL being used failed to "resolve to a SPARQL service"?
> >
> > I will add the HTTP response codes accordingly.
>
> My point here was that there's no way to distinguish the two cases, but for one you're saying to drop the result, and the other you're saying to abort the query. I think both cases (being indistinguishable) need to result in the same action. Moreover, I think "execution failure" and "resolve to a SPARQL service" need to be defined properly.
>
> thanks,
> .greg
>
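To make the two points in the quoted thread concrete (a join that is not bottom-up, and one uniform action for every failed invocation), here is a minimal Python sketch. It is not taken from either spec draft: `service_var`, `ServiceError`, and `fake_invoke` are hypothetical names, solutions are modelled as plain dicts, and the SILENT behaviour shown (a failure contributes a single empty solution) is an assumption made for illustration.

```python
from typing import Callable, Dict, Iterable, List

Solution = Dict[str, str]  # variable name -> bound value (IRIs as strings)


class ServiceError(Exception):
    """Any failed invocation: unbound variable, bad URL, HTTP 4xx/5xx, bad body."""


def compatible(a: Solution, b: Solution) -> bool:
    # Two solutions are compatible when they agree on every shared variable.
    return all(b[k] == v for k, v in a.items() if k in b)


def join(left: Iterable[Solution], right: Iterable[Solution]) -> List[Solution]:
    # Bottom-up join: both sides were evaluated independently beforehand.
    right = list(right)
    return [{**a, **b} for a in left for b in right if compatible(a, b)]


def service_var(
    left: Iterable[Solution],
    var: str,
    invoke: Callable[[str], List[Solution]],
    silent: bool = False,
) -> List[Solution]:
    # Not bottom-up: the right-hand side cannot be evaluated on its own,
    # because the endpoint is only known per solution of the left side.
    results: List[Solution] = []
    for mu in left:
        try:
            endpoint = mu.get(var)
            if endpoint is None:
                raise ServiceError(f"?{var} is unbound")
            remote = invoke(endpoint)
        except ServiceError:
            # Every failure takes the same path, since the caller cannot
            # tell "not a SPARQL service" from "execution failure".
            if not silent:
                raise
            remote = [{}]  # assumption: SILENT yields one empty solution
        results.extend(join([mu], remote))
    return results


def fake_invoke(url: str) -> List[Solution]:
    # Hypothetical stand-in for a real protocol request.
    if url == "http://example.org/a":
        return [{"name": "Alice"}]
    raise ServiceError(f"cannot query {url}")
```

With `silent=False` the first failing solution aborts the whole evaluation; with `silent=True` each input solution is kept and only the successful endpoints add bindings. Either way there is a single failure path, which is the property asked for at the end of the quoted message.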
Received on Wednesday, 20 July 2011 02:58:33 UTC