- From: Thomas Lörtsch <tl@rat.io>
- Date: Mon, 18 Dec 2023 22:06:01 +0100
- To: Olaf Hartig <olaf.hartig@liu.se>
- Cc: "andy@apache.org" <andy@apache.org>, "public-rdf-star-wg@w3.org" <public-rdf-star-wg@w3.org>
> On 18. Dec 2023, at 15:33, Olaf Hartig <olaf.hartig@liu.se> wrote:
>
> On Mon, 2023-12-18 at 00:09 +0100, Thomas Lörtsch wrote:
>> On 15. Dec 2023, at 15:33, Olaf Hartig <olaf.hartig@liu.se> wrote:
>>> On Fri, 2023-12-15 at 13:57 +0100, Thomas Lörtsch wrote:
>>>>> On 15. Dec 2023, at 00:01, Olaf Hartig <olaf.hartig@liu.se>
>>>>> wrote:
>>>>> [...]
>>>>> If so, what would you expect to happen if someone writes the
>>>>> following?
>>>>>
>>>>>   :T rdfx:typeOf << :s1 :p1 :o1 >> .
>>>>>   :T rdfx:typeOf << :s2 :p2 :o2 >> .
>>>>>
>>>>> Which triple type would :T denote in this case (if any)?
>>>>
>>>> In the strictly monotonic world of RDF :T would then probably
>>>> refer to a little graph term.
>>>
>>> I don't think this works, but I also think there is no point going
>>> further into this discussion for now.
>>
>> Not attempting to force a discussion on you, but
>>
>>   :X a :House, :Bird .
>>
>> is perfectly legal RDF. RDF is not especially safe.
>
> Okay, I backtrack. It certainly depends on how you define the semantics
> of rdfx:typeOf.
>
> Based on how you introduced it, I was assuming that the meaning you
> have in mind for it is to state that the subject of an rdfx:typeOf
> triple (i.e., :T in the examples) is "a reference to the [triple] type"
> captured by the triple term in the object position of the triple. Based
> on this meaning, I see the two lines above as an inconsistency.

I don’t think so. Although I don’t know anything that qualifies as a house-bird, it IMO is not an _inconsistent_ concept, as it is neither contradictory in itself nor per se unsatisfiable. But I’ll let some logicians make the call.

> With your statement above (":T would then probably refer to a little
> graph term") you are implying a different meaning of rdfx:typeOf; being
> a reference to a triple term/type and being a reference to a graph term
> are different things (at least, I see it that way).
> Moreover, if an
> rdfx:typeOf triple is meant to state that the subject of this triple is
> a reference to a graph term, then I would expect the object of this
> triple to be that graph term (as a whole), rather than just one of the
> triples that is part of the graph term.

I would expect that too, and I would expect a language to provide me with the means to unambiguously express something that matches that expectation, e.g.:

    :T rdfx:typeOf << :s1 :p1 :o1. :s2 :p2 :o2 >>

But if the foreseeable and legitimate needs and expectations of users are not met, they will create cow paths like

    :T rdfx:typeOf ( << :s1 :p1 :o1 >>, << :s2 :p2 :o2 >> ) .

because that’s just what humans do when they need a tool that isn’t provided: they get creative and build the missing tool from what is at their disposal. And frankly: I love that, even if it bends the semantics quite a bit. It’s the tool providers, it’s us who are to blame if something like that happens in practice, not the users. And it will happen.

>> [...]
>>>>> In other words, should the two occurrences of the
>>>>> subexpression << :s :p :o >> in the following two lines be
>>>>> understood to "implicitly reference" the same token or two
>>>>> different tokens?
>>>>>
>>>>>   << :s :p :o >> :p2 :o2 .
>>>>>   << :s :p :o >> :p3 :o3 .
>>>>
>>>> Always a different one, and that’s indeed crucial (I pointed that
>>>> out in the nested graph proposal too).
>>>
>>> In this case, I cannot see how it would be possible to make more
>>> than one annotation statement for each token? (If you attempt to
>>> answer this question based on an example, please write the example
>>> either in terms of the abstract syntax or the N-Triple-star format,
>>> but not in Turtle-star.)
>>
>> I was sloppy in this example, but I seem to remember that in the
>> context of the whole mail it might have been clearer.
>> The idea was
>> (and is) that providing no identifier is interpreted as "I
>> don’t care about the name of this" and a blank node identifier is
>> created. That motivates the need to define a way to refer to the type
>> (above). I just realize the type reference, given in the syntax
>> above, could then be interpreted as denoting a type of occurrence -
>> that would have to be explained away…
>>
>>   << :s :p :o >> :p2 :o2 .
>>   << :s :p :o >> :p3 :o3 ;
>>                  :p4 :o4     # multiple annotations only as trees
>>                              # if no explicit ID is provided
>>
>> is then the same as
>>
>>   << _:b1 | :s :p :o >> :p2 :o2 .
>>   << _:b2 | :s :p :o >> :p3 :o3 ;
>>                         :p4 :o4 .
>
> No, I would not say that these are the same. In contrast, the first of
> these two snippets of Turtle is the same as the following.
>
>   << :s :p :o >> :p2 :o2 .
>   << :s :p :o >> :p3 :o3 .
>   << :s :p :o >> :p4 :o4 .

At present, yes, but not in the proposal I’m making.

Thomas

> Best,
> Olaf
>
>
>> N-Triples-star might look like this:
>>
>>   _:b1 rdfx:occurrenceOf << <http://example/s> <http://example/p> <http://example/o> >> .
>>   _:b2 rdfx:occurrenceOf << <http://example/s> <http://example/p> <http://example/o> >> .
>>   _:b1 <http://example/p2> <http://example/o2> .
>>   _:b2 <http://example/p3> <http://example/o3> .
>>   _:b2 <http://example/p4> <http://example/o4> .
>>
>>
>> Best,
>> Thomas
>>
>>
>>> Thanks,
>>> Olaf
>>>
>>>> The preference for types in the semantics of RDF might be
>>>> characterized as early optimization: understandable for an
>>>> integration focused technology, and well understood in logic.
>>>> However, the unification of tokens into types risks losing context
>>>> (and annotations). It can just as well be postponed to querying
>>>> (DISTINCT) or to a conscious data management operation (spring
>>>> cleaning in the dataset). The one thing that one doesn’t want to
>>>> lose when working with data is … data. So late unification of
>>>> tokens into types has some merit.
>>>>
>>>> Best,
>>>> Thomas
>>>>
>>>>
>>>>
>>>>> Thanks,
>>>>> Olaf
>>>>>
>>>>>
>>>>>> and may either provide a custom name or will be provided with a
>>>>>> new blank node to name the reference.
>>>>>>
>>>>>>
>>>>>> ## Syntax
>>>>>>
>>>>>> We should try to make the naming syntactically as uniform and
>>>>>> predictable as possible. The nested graph proposal uses a pair of
>>>>>> square brackets [] prepending constructs to indicate the name. If
>>>>>> a custom name is given it is entered into that pair. That
>>>>>> violates the rules for [] in Turtle/TriG but seems to parse
>>>>>> unambiguously. Not providing any name syntactically and still
>>>>>> assuming the presence of a blank node name is a bit more tricky.
>>>>>>
>>>>>>   :liz :spouse :dick [id:1]{| :start 1964; :end 1974 |} .
>>>>>>   :liz :spouse :dick {| :start 1975; :end 1976 |} .       # _:id2
>>>>>>
>>>>>>   [] << :s :p :o >> :start 1964 ; :end 1974 .
>>>>>>
>>>>>> In any case: if it doesn’t parse without a prepended name, then
>>>>>> prepend a [].
>>>>>>
>>>>>>
>>>>>> ## Unasserted vs Asserted
>>>>>>
>>>>>> Why not define a property that not only references a token, but
>>>>>> also creates the triple, e.g.:
>>>>>>
>>>>>>   :liz :spouse :dick [id:1]{| :start 1964; :end 1974 |} .
>>>>>>
>>>>>> mapping to
>>>>>>
>>>>>>   id:1 rdfx:assertionOf << :liz :spouse :dick >>
>>>>>>        :start 1964; :end 1974 .
>>>>>>
>>>>>> instead of
>>>>>>
>>>>>>   id:1 rdfx:occurrenceOf << :liz :spouse :dick >>
>>>>>>        :start 1964; :end 1974 .
>>>>>>   :liz :spouse :dick .
>>>>>>
>>>>>> That way we get identifiers for each triple occurrence together
>>>>>> with the triple being asserted - direct identification, not early
>>>>>> optimization. See above why that is important.
>>>>>>
>>>>>> All this unasserted business may seem a bit eccentric, but it’s
>>>>>> the key to any sort of configurable semantics like quotation etc.
>>>>>> It therefore has huge potential - if done right.
>>>>>>
>>>>>>
>>>>>> ## SPARQL sugar
>>>>>>
>>>>>> You compare the occurrence-based shortcut relation to syntactic
>>>>>> sugar for RDF lists, which is fine, except that querying those
>>>>>> lists is a hardship. Same for RDF/XML’s syntactic support for RDF
>>>>>> standard reification. Any kind of RDF syntactic sugar also needs
>>>>>> proper support in SPARQL to be effective in practice.
>>>>>>
>>>>>>
>>>>>> ## Triple terms vs Graph terms
>>>>>>
>>>>>> Just for completeness: all of this can easily be expanded to
>>>>>> graph terms. The syntax
>>>>>>
>>>>>>   []{ :s :p :o. :u :v :w }
>>>>>>
>>>>>> is explored in the nested graph proposal.
>>>>>>
>>>>>>
>>>>>> ## Graph Terms vs Named Graphs
>>>>>>
>>>>>> I like Adrian’s example [0] of a complicated named graph based
>>>>>> application and I’m taking that seriously. However it should also
>>>>>> be clear that triple/graph terms in the end are always stored in
>>>>>> a way very similar to named graphs. There is just no other way in
>>>>>> a quad based system. Triple/graph terms can be represented as
>>>>>> named graphs, named graphs can be represented as graph terms.
>>>>>> It’s a practical question of how to encode belonging/membership:
>>>>>> syntactically as nested graphs, via a new term type as in
>>>>>> RDF-star that transforms a triple into a term at the surface (but
>>>>>> NOT in the underlying storage layer, for obvious performance
>>>>>> reasons), via explicit binding relations as Niklas proposes [1]
>>>>>> (and as Dydra implements nested graphs), etc. The main question
>>>>>> is how to ensure that those binding relations don’t get lost in
>>>>>> the process, but that IMHO is true for any solution. Nested
>>>>>> graphs can be serialized to graph terms, which are just an
>>>>>> extension of triple terms.
>>>>>> That requires an additional en/de-coding
>>>>>> step to fit them into an environment that reserves named
>>>>>> graphs for its own purposes. That extra step is the price that
>>>>>> those applications have to pay for being so particular about
>>>>>> their use of named graphs. That’s only fair, and probably still
>>>>>> economical for them.
>>>>>>
>>>>>>
>>>>>> ## Term types vs Datatypes
>>>>>>
>>>>>> The most fundamental grievance with RDF-star is the introduction
>>>>>> of a new term type when a new datatype of type RDF/TTL would
>>>>>> suffice. All I proposed above is readily implementable in the
>>>>>> nested graph proposal, which does map to TriG and regular N-quads
>>>>>> and such a datatype (and even Turtle and N-triples, but that’s
>>>>>> another discussion).
>>>>>>
>>>>>>
>>>>>> Best,
>>>>>> Thomas
>>>>>>
>>>>>>
>>>>>>
>>>>>> [0]
>>>>>> https://lists.w3.org/Archives/Public/public-rdf-star-wg/2023Dec/0019.html
>>>>>> [1]
>>>>>> https://lists.w3.org/Archives/Public/public-rdf-star-wg/2023Nov/0032.html
>>>>>>
>>>>>>
>>>>>>> Andy
>>>>>>>
>>>>>>> [1]
>>>>>>> https://lists.w3.org/Archives/Public/public-rdf-star-wg/2023Dec/0024.html
>>>>>>>
>>>>>>> [2]
>>>>>>> https://w3c.github.io/rdf-concepts/spec/#section-triples
>>>>>>> (as of 2023-12-10)
>>>>>>>
Received on Monday, 18 December 2023 21:06:16 UTC