- From: Polleres, Axel <axel.polleres@siemens.com>
- Date: Tue, 3 Apr 2012 07:12:07 +0200
- To: Lee Feigenbaum <lee@thefigtrees.net>, Andy Seaborne <andy.seaborne@epimorphics.com>
- CC: SPARQL Working Group <public-rdf-dawg@w3.org>
Dear all,

As input to the discussion, I forward at the end of this mail, with the commenter's permission, the informal discussion with Jorge Perez on JP-4 about the previous proposal involving +, *, {*}, {+}. I think some of his points may also be relevant to Option 6:

a) Jorge's question about the semantics of (:a | :b)* should be answered, i.e., whether it counts the duplicates of (:a | :b) and then discards only the duplicates generated by *, or whether it just discards all duplicates.

b) Jorge seems to have a strong preference for restricting counting/non-counting to the path level, i.e., ALL()/DISTINCT(). That would be current Options 7) and 8); however, his preference was stated against the previous options (which involved {+}, {*}).

My guess is that the design of Option 6, which has neither {*} and {+} nor {n,m}, might resolve the last part of Jorge's response below, since infinite paths aren't really an issue anymore with the new semantics of + and *, right? Also, by dropping {n,m} we may resolve Wim's (WM-1) concern.

As mentioned in my previous mail, I'd personally prefer approaching all three commenters with a digest of options *the group can live with* over picking only one. From my side and current knowledge, I am OK with any of Options 6), 7), and 8).

Best,
Axel

--------------------------------------

Hello Axel,

On Tue, Mar 6, 2012 at 12:58 PM, Axel Polleres <axel@polleres.net> wrote:
> Hi Jorge, Marcelo, (offlist)
>
> As you may have noticed, the group is not inactive about your comment, but discussing it intensively.
> The reason for silence over the official channels is that we want the issue solved before giving an official answer. This is the proposal where we stand at the moment [1]...
>
> Essentially, it suggests changing "*" and "+" to non-counting semantics, as you proposed,
> and having an additional counting version "{*}" and "{+}".
>
> Most importantly, I would like to "test the waters" whether you're ok with this way forward in principle,
> and please understand that the group is under particular time pressure.
> Please let me know as soon as possible!

I must admit that I need to think a bit more about it, but at first glance I think that this solution opens some new issues. For instance, what is the meaning of something like

  (:a | :b)*

Are you going to count the duplicates of (:a | :b) and then discard only the duplicates generated by *? Or are you going to just discard all the duplicates? In any case I think that it would be very confusing for users. My guess is that things would become more complicated if * and {*} are combined with other operators. Also please notice that Wim Martens showed in his paper that * is not the only problem, as expressions like path(n,m) also suffer from extreme complexity problems and they are not covered by the new solution.

So, going to your question, unfortunately I am not OK with this new solution having *, +, {*}, {+}, but the only strong point against this decision would be the work of Wim, which is out of the scope of our message. So I would consider my comment answered if the answer is along the lines of adding *, {*} (I would say, "I do not agree, but I acknowledge that the group considered the issue raised by my comment").

On the positive side, I personally prefer the proposal of DISTINCT/ALL-PATHS over path expressions (which was also on the mailing list), something like

  DISTINCT( (:a | :b)* )
  ALL-PATHS( (:a | :b)* )

but only if DISTINCT and ALL-PATHS are used as top-level modifiers of path expressions. That is, something like

  (:a/DISTINCT(:b*))*

should not be allowed, since otherwise one would run into a problem similar to that of having * and {*}. Also please notice that adding DISTINCT and ALL-PATHS at the top level would affect just a very minor part of the current grammar.

I think that a quick way out of this whole problem is to define both options in the specification, ALL-PATHS and DISTINCT, as top-level modifiers of path expressions, and say that when neither ALL-PATHS nor DISTINCT is specified, one of them is picked by default. As you may guess, I would like the default to be DISTINCT :-). This way of defining things would also allow a subsequent group to define new modifiers like SIMPLE-PATHS or ALL-SIMPLE-PATHS, or some other constructor that allows, for example, selecting paths up to some length (for optimization or some other purposes).

In any case, and whatever the decision turns out to be, my main concern is still the semantics of ALL-PATHS(...) (or {*}). The question is, what are you really counting when there are infinite paths? We have discussed this a bit with Marcelo, but we haven't reached a clear consensus on what to count. Nevertheless, if you do not make ALL-PATHS the default, then there is no problem at all with whichever semantics it has, and users may decide to use this modifier at their own risk.

Hope my message helps (and I am open to continuing the discussion, since it interests me a lot!).

Cheers,
- jorge
Received on Tuesday, 3 April 2012 05:15:30 UTC
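
Editorial illustration (not part of the original message): Jorge's question about (:a | :b)* can be made concrete with a small sketch. It is not the working group's semantics and not any SPARQL engine's algorithm. The graph, the function names `alt_step`, `star_distinct`, and `star_keep_inner`, and in particular the way "discard only the duplicates generated by *" is modelled are all assumptions chosen to show that the two readings can return different multiplicities.

```python
from collections import deque

# Hypothetical graph: x reaches y through both :a and :b, and y reaches z.
GRAPH = [("x", "a", "y"), ("x", "b", "y"), ("y", "a", "z")]


def alt_step(graph, node, preds):
    """One step of (:a | :b) as a bag: one result per matching triple."""
    return [o for (s, p, o) in graph if s == node and p in preds]


def star_distinct(graph, start, preds):
    """Reading 1: discard all duplicates, i.e. plain reachability as a set."""
    seen = {start}
    frontier = deque([start])
    while frontier:
        node = frontier.popleft()
        for nxt in alt_step(graph, node, preds):
            if nxt not in seen:
                seen.add(nxt)
                frontier.append(nxt)
    return seen


def star_keep_inner(graph, start, preds):
    """Reading 2 (one assumed interpretation): keep the duplicates produced
    by a single (:a | :b) step, but never report a node again once an
    earlier expansion has already reached it."""
    results = [start]
    seen = {start}
    frontier = deque([start])
    while frontier:
        node = frontier.popleft()
        # Nodes not reached by any earlier expansion, with inner multiplicity.
        fresh = [n for n in alt_step(graph, node, preds) if n not in seen]
        results.extend(fresh)
        for n in fresh:
            if n not in seen:
                seen.add(n)
                frontier.append(n)
    return results


if __name__ == "__main__":
    print(sorted(star_distinct(GRAPH, "x", {"a", "b"})))  # ['x', 'y', 'z']
    print(star_keep_inner(GRAPH, "x", {"a", "b"}))        # ['x', 'y', 'y', 'z']
```

On this graph the first reading returns y once, while the second returns it twice because both :a and :b connect x to y. That difference is the ambiguity the message asks the group to settle.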