- From: Doerthe Arndt <doerthe.arndt@tu-dresden.de>
- Date: Tue, 24 Oct 2023 22:31:08 +0000
- To: Niklas Lindström <lindstream@gmail.com>
- CC: Thomas Lörtsch <tl@rat.io>, RDF-star WG <public-rdf-star-wg@w3.org>, "Peter F. Patel-Schneider" <pfpschneider@gmail.com>
- Message-ID: <9759E836-EC57-448E-9DEB-715F159199AD@tu-dresden.de>
Dear Niklas,

I will most likely write a longer response, but one point here bothers me, so I will answer it directly.

On 24.10.2023 at 22:07, Niklas Lindström <lindstream@gmail.com> wrote:

Dear Dörthe,

On Tue, Oct 24, 2023 at 7:10 PM Doerthe Arndt <doerthe.arndt@tu-dresden.de> wrote:

Dear Niklas,

I assume that your worry is that for graph terms to work, you'd have to match its signature (or arity)?

It can be, depending on what the graph means. It is interesting to see what you'd expect, especially since I have another expectation (and so far, nothing is fixed, so we are both right ;) ).

I see your perspective better now (*closed* graph terms in Notation 3), so I understand your expectations here.

I don't think that's an issue. If this:

    << dbr:Linköping ex:locatedIn dbr:Sweden >> ex:statedAt "2023-10-23"^^xsd:date .

was replaced with, or equivalent to (ignoring that this N3 cannot work in TriG without lookahead parsing dealing with ambiguity, due to default graph blocks):

    { dbr:Linköping ex:locatedIn dbr:Sweden } ex:statedAt "2023-10-23"^^xsd:date.

and thus this is also possible:

    { dbr:Linköping a ex:City ; ex:locatedIn dbr:Sweden } ex:statedAt "2023-10-23"^^xsd:date.

then I'd assume a query like (again ignoring that this syntax probably won't fly in SPARQL):

    SELECT ?p ?o ?date { { dbr:Linköping ?p ?o } ex:statedAt ?date }

would yield:

    | ex:locatedIn | dbr:Sweden | "2023-10-23"^^xsd:date |

In fact, this:

    SELECT ?p ?o ?date { { dbr:Linköping ?p ?o. ?s1 ?p1 ?o1 } ex:statedAt ?date }

should match too, just binding ?s1, ?p1 and ?o1 to each of the two triples in turn (so an inefficient query, with unused redundant results).

Mmm, so basically, you implicitly include my predicate log:includes in the query? Note that the question here is (and I think that was also one of the questions for the different TriG semantics): is the graph we state as a graph term open or closed?
I would expect (but like all of us, I am biased) that if my graph has no name at all, it is closed. So, if I state

    { dbr:Linköping a ex:City ; ex:locatedIn dbr:Sweden } ex:statedAt "2023-10-23"^^xsd:date.

Actually, I didn't, I thought of the syntax as no different from "regular" SPARQL BGPs. I think Notation 3 is sufficiently different from SPARQL (and TriG) here that, *if* graph terms (as types) were supported (again, I go back and forth on this), I think the syntax should be somewhat different; e.g. using a leading marker, like %{ ... }. This would also reasonably signal that it is closed. And I think you are correct in that expectation.

This is the point that surprised me so much, because in my opinion, N3 is not that different from SPARQL's BGPs. We only have a very strict difference between graph terms and graphs. Of course, if I have something like

    dbr:Linköping a ex:City ; ex:locatedIn dbr:Sweden.

and a rule, let's say

    { dbr:Linköping ?p ?o } => {:a :b :c}.

the rule would still "fire" and yield

    :a :b :c.

even though the graph contains more than this single triple. In this sense we still have BGP matching. In N3 the implication arrow is basically just a predicate (log:implies), and we can use graph terms in other contexts as well. Named graphs in SPARQL are still query graphs, so I would see them similarly to the rules. Note that the difference you see here is mainly in how you'd use graph terms in stated data, and this is not fixed in SPARQL either. So, N3 is really not as different as you think here.

I am talking about the exact graph

    { dbr:Linköping a ex:City ; ex:locatedIn dbr:Sweden }

and not of, for example, a graph containing these two triples (and maybe more). So, in my view the graph above does not yield

    { dbr:Linköping a ex:City . } ex:statedAt "2023-10-23"^^xsd:date.

Agreed, these two closed terms don't match.
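The distinction drawn above, between BGP-style matching against asserted data and the identity of a closed graph term, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: triples are modelled as plain tuples of strings, a graph as a `frozenset`, and the helper names `match_triple` and `bgp_matches` are hypothetical, not part of any N3 or SPARQL implementation. Blank nodes are ignored.

```python
# Hypothetical model: a triple is a tuple of strings, a graph is a frozenset of
# triples, and any term starting with "?" is a variable.

def match_triple(pattern, triple, bindings):
    """Extend bindings so that pattern equals triple; return None if impossible."""
    result = dict(bindings)
    for p, t in zip(pattern, triple):
        if p.startswith("?"):
            # Bind the variable, or check an earlier binding for consistency.
            if result.setdefault(p, t) != t:
                return None
        elif p != t:
            return None
    return result

def bgp_matches(pattern, graph):
    """Open, BGP-style matching: one solution per triple the pattern fits."""
    solutions = []
    for triple in sorted(graph):
        bindings = match_triple(pattern, triple, {})
        if bindings is not None:
            solutions.append(bindings)
    return solutions

asserted = frozenset({
    ("dbr:Linköping", "rdf:type", "ex:City"),
    ("dbr:Linköping", "ex:locatedIn", "dbr:Sweden"),
})

# The rule body { dbr:Linköping ?p ?o } fires although the data has two triples.
rule_fires = len(bgp_matches(("dbr:Linköping", "?p", "?o"), asserted)) > 0

# Closed graph terms are compared as whole sets, so a sub-graph is a different term.
two_triple_term = asserted
one_triple_term = frozenset({("dbr:Linköping", "rdf:type", "ex:City")})
closed_terms_equal = two_triple_term == one_triple_term
```

Under these assumptions `rule_fires` is true while `closed_terms_equal` is false, which mirrors the two behaviours contrasted in the messages above.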
Note that the predicate here is confusing and that I can find predicates where this would make no sense, like:

    {:cat :is :alive, :dead } a :inconsitency.

should not yield

    {:cat :is :alive } a :inconsitency.

Also agreed. It does surface a thought I've had a lot lately: Notation 3 is sufficiently different from "just" RDF, in that it doesn't really play under the same rules, right?

Why would you say that? I think it really does.

Is e.g. entailment expected to play a role when parsing N3 for instance?

Ah, ok, I see. So, N3 has two main aspects and maybe that caused your confusion:

1. N3 extends RDF by graph terms; this is exactly what we have here.
2. N3 extends RDF by providing so-called built-in predicates (in a sense, predicates which have their own meaning, just as rdfs:domain has a special meaning under RDFS semantics). The most important of these built-ins is log:implies, which is indeed an implication. Just as you'd expect an RDFS reasoner to perform RDFS entailment when dealing with RDFS, you would also expect an N3 reasoner to perform N3 entailment when confronted with N3 built-ins.

So, it extends RDF. Point 1 is the point which is important for us here; point 2 of course influences how point 1 is dealt with (its semantics).

I'm still under the assumption that N3 is for implementing rules and inference.

As said, N3 covers two aspects, hence your confusion.

(E.g. RDFS and OWL are often implemented using N3 rules?)

N3 is a rule language supporting existential rules, so you can use it to implement RDFS entailment, OWL-EL and OWL-RL.

I think these differences are important to recognize in order to sort out our expectations and assumptions (including, I think, on opacity).

As said, the use for the rules and other built-ins of course influences how graph terms are interpreted. But this does not clash with SPARQL; it could clash with our use cases, but that is another thing. As it is late at night here in Germany, my response stops here for now.
I might write more when I am awake again.

Kind regards,
Dörthe

So, in a sense we are back to the point where we need to consider the intended use, and I think examples can get far more complex with graphs (which is also why we want them in the first place).

Indeed, use cases are paramount here. I recently added a few where quoting multiple triples is useful: https://github.com/w3c/rdf-ucr/issues/26

That said, I think the bulk of what we've collected makes do with quoting one arc. We should also contrast those with "coarse-grained" provenance already using regular named graphs, though. (Here is one example from our union library catalogue where the data is "embellished" with snippets from other records, wrapped as named graphs: https://niklasl.github.io/ldtr/demo/?url=https%3A//libris.kb.se/0xbdc9nj2qbd6dd/data.trig&edit=true)

We do not need to follow N3, but there, graphs are closed and you can state relations between them and have predicates which help you to put them into relation. The reason is that otherwise you could never talk about a concrete graph, as they do not have names.

Makes total sense (in that context).

I think my point here is just: we all have expectations on how the graphs should behave and most likely they differ.

Indeed. :) I think distinguishing BGPs from graph terms is the crucial thing here; then I believe our expectations will converge. And the features and properties they bring must be tested against our use cases, of course.

This is based on what I think James also answered, that for named graphs, if you have:

    _:g1 { dbr:Linköping a ex:City ; ex:locatedIn dbr:Sweden }
    _:g1 ex:statedAt "2023-10-23"^^xsd:date ; ex:source wikipedia:Linköping .

then this works:

    SELECT ?p ?o ?date { graph ?g { dbr:Linköping ?p ?o. ?s1 ?p1 ?o1 } ?g ex:statedAt ?date . }

This is because SPARQL BGPs in graph blocks match what's there; they're not excluding graphs containing more triples. (I'm sure e.g. Andy would phrase this much more correctly.)
You mean

    SELECT ?p ?o ?date { graph ?g { dbr:Linköping ?p ?o. } ?g ex:statedAt ?date . }

Right?

I did! It makes no big difference though, as both would work; but mine is redundant and less efficient.

I think the example here is easier because we have the graph name (even though it is a blank node) and this somehow determines which graph we mean. Here, you assume that

    _:g1 { dbr:Linköping a ex:City ; ex:locatedIn dbr:Sweden }

"means" (in an informal sense) that there is the graph _:g1 and that this graph contains the triples

    dbr:Linköping a ex:City ; ex:locatedIn dbr:Sweden.

But _:g1 can contain more triples; it is open in that sense, and if we want to talk about it, we use its label (the blank node _:g1).

Exactly. There are many possible points of view. More than we can enumerate. :) What we need is to find the ones that most effectively and efficiently cover the current and future use cases. (Which is often surprisingly difficult; the result not seldom looking quite different in hindsight. And while reaching agreements is hard, doing it alone would be impossible.)

This all said, I'm unconvinced of either triple or graph terms, as they make it possible to talk about the abstract type itself, as opposed to a reified occurrence thereof (which when talked about is a token of the type).

With this comment you just made clear to me what you mean by type vs. token in this context: you would like that in

    <<:a :b :c>> :p :o.
    <<:a :b :c>> :pp :oo.

the two <<:a :b :c>> would refer to different instances? Right? If not, please correct me, because previously I did not fully get that (it is always easier to know one's own point of view and "being right" than to understand someone else's ;) ). That would be important for your use case? I think that this can make things complicated, but before I complain (and construct evil examples), I need to fully understand. Would you want the << >>-notation to only be syntactic sugar for reification?
Your view of my perspective is spot on in principle here; that is exactly the difference I mean. And I do have a preference for reification with sugar on top (albeit I think it has drawbacks (triple explosion being one), and named graphs have features that can amend that). But actually, I might not want that for this particular notation! I think that notation "affords" uniqueness, since it looks so much like IRIs, as we're "trained" to read Turtle et al. like that. What I do want is to not use that, especially not as subjects, as there is little I could say (beyond logical rules) about it. I'd rather use blank graphs, which to our "TriG intuition" are *tokens* of that singleton edge (as in reified occurrences which I can speak about):

    [ :p :o ] { :a :b :c }
    [ :pp :oo ] { :a :b :c }

In this case, I'm still for my proposed "quotation dash" shorthand (revisions could be made to make the annotation more palatable of course, if the idea was to be accepted):

    :a :b -- :c {[ :p :o ] [ :pp :oo ]} .

meaning the above. (Both of which mean "one occurrence with :p :o, one with :pp :oo, both claiming :a :b :c, neither of which are accepted as asserted".) But I'll write more about that in another reply.

Looking forward to this.

Thank you! I'll try to complete it before tomorrow.

All the best,
Niklas

Kind regards,
Dörthe

All the best,
Niklas

On Mon, Oct 23, 2023 at 6:07 PM Doerthe Arndt <doerthe.arndt@tu-dresden.de> wrote:

Dear Thomas, all,

In addition to what Peter said about RDF-star semantics and opacity, I'd like to clarify the community group semantics a little bit more: remember that we talk about the meaning of triple terms and not of the constituents (subject, predicate, object) of these terms.
What was done in the unstar-mapping was a kind of reification in which we represented the triple with a blank node and then connected the IRIs of the constituents to this blank node (using the correct predicates), as well as the lexical representation of these constituents. With this "trick" we allowed the quoted triple interpretation to be aware of the lexical representation of the triple and, if needed, to differentiate between triples having different interpretations, but that was not forced and, as Peter also mentioned, the concrete interpretation was left open.

For the working group semantics several possibilities have been discussed and they all rely on an interpretation function for the triple term (for example IT in Enrico's case). This function maps to a resource (and it can do more, but does not need to). The interpretation function for the triple term can be applied to triples from the domain of discourse (then we can indeed combine it with IS or some alternative IS'), but it would for example also be possible to apply the IT function directly to the graphical representation of the triple (of course we need to be careful with blank nodes here). My point is just: please try to see the triple term as a whole also as a resource to better understand the opacity.

To the rest of the discussion and the added complexity: apart from all the theoretical aspects we discuss here (and where I agree that graphs are more complex than triples), please also note that we would have to decide how to deal with quoted graph terms in practice. In SPARQL queries, it is relatively easy to search for a triple term having dbr:Linköping as subject, like:

    Select ?p ?o ?date { << dbr:Linköping ?p ?o>> ex:statedAt ?date }

But to make a similar query for graphs, we either need to know the exact structure of the graph (that is: how many triples does it contain?) or we need to come up with extra Filter functions for SPARQL.
If we have

    { dbr:Linköping a ex:City; ex:locatedIn dbr:Sweden} ex:statedAt "2023-10-23"^^xsd:date.

a query

    Select ?p ?o ?date { {dbr:Linköping ?p ?o. ?s1 ?p1 ?o1} ex:statedAt ?date }

would fire, but

    Select ?p ?o ?date { {dbr:Linköping ?p ?o. ?s1 ?p1 ?o1. ?s2 ?p2 ?o3} ex:statedAt ?date }

would not. I am sure we can solve this problem together, but this adds complexity since we need to have a discussion on how we would like to solve it.

Side note: in N3 we would have a predicate log:includes for that, and while it makes this case easier, it also adds complexity simply because your graph terms can contain blank nodes and you are back to a problem of simple entailment… (and I will not go further unless you ask :) ). In N3 you would do something like (I try to make it "SPARQL-style" but I am not sure whether or not this makes it clear, so feel free to ask):

    Select ?p ?o ?date { ?graph ex:statedAt ?date. ?graph log:includes {dbr:Linköping ?p ?o. }. }

The log:includes is some kind of function which can give you elements of your graph. I just added this here as one example to illustrate that Peter is right here: things get more complex if we have graph terms. I am sure that we can solve that together and I would like to do that with all of you, but at the same time I am worried that it will take too long…

Kind regards,
Dörthe
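As a rough illustration of the point in the message above, here is a small Python sketch contrasting an arity-sensitive match against a closed graph term with a log:includes-style containment check. It rests on assumptions that are not fixed anywhere in the thread: the closed reading is modelled as a one-to-one correspondence between triple patterns and triples (which reproduces the claim that the three-pattern query does not fire), the function `includes` is a hypothetical stand-in and not the definition of the N3 built-in, and blank nodes are left out entirely.

```python
from itertools import permutations

# Hypothetical model: triples are tuples of strings, "?x" terms are variables.

def unify(pattern, triple, bindings):
    """Extend bindings so that pattern equals triple; return None on a clash."""
    result = dict(bindings)
    for p, t in zip(pattern, triple):
        if p.startswith("?"):
            if result.setdefault(p, t) != t:
                return None
        elif p != t:
            return None
    return result

def closed_match(patterns, graph):
    """Closed reading (assumed): patterns and triples must pair up one-to-one."""
    if len(patterns) != len(graph):
        return []
    solutions = []
    for ordering in permutations(sorted(graph)):
        bindings = {}
        for pattern, triple in zip(patterns, ordering):
            bindings = unify(pattern, triple, bindings)
            if bindings is None:
                break
        else:
            solutions.append(bindings)
    return solutions

def includes(graph, patterns):
    """Containment reading: each pattern matches some triple; extras are allowed."""
    solutions = [{}]
    for pattern in patterns:
        extended = []
        for bindings in solutions:
            for triple in sorted(graph):
                candidate = unify(pattern, triple, bindings)
                if candidate is not None:
                    extended.append(candidate)
        solutions = extended
    return solutions

graph = frozenset({
    ("dbr:Linköping", "rdf:type", "ex:City"),
    ("dbr:Linköping", "ex:locatedIn", "dbr:Sweden"),
})
two = [("dbr:Linköping", "?p", "?o"), ("?s1", "?p1", "?o1")]
three = two + [("?s2", "?p2", "?o2")]

fires_with_two = bool(closed_match(two, graph))
fires_with_three = bool(closed_match(three, graph))
included = includes(graph, [("dbr:Linköping", "?p", "?o")])
```

With this modelling, the two-pattern query finds solutions, the three-pattern query finds none, and the containment check returns one solution per triple about dbr:Linköping without the caller needing to know how many triples the graph term holds.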
Received on Tuesday, 24 October 2023 22:31:20 UTC