- From: Thomas Lörtsch <tl@rat.io>
- Date: Mon, 18 Dec 2023 22:06:01 +0100
- To: Olaf Hartig <olaf.hartig@liu.se>
- Cc: "andy@apache.org" <andy@apache.org>, "public-rdf-star-wg@w3.org" <public-rdf-star-wg@w3.org>
> On 18. Dec 2023, at 15:33, Olaf Hartig <olaf.hartig@liu.se> wrote:
>
> On Mon, 2023-12-18 at 00:09 +0100, Thomas Lörtsch wrote:
>> On 15. Dec 2023, at 15:33, Olaf Hartig <olaf.hartig@liu.se> wrote:
>>> On Fri, 2023-12-15 at 13:57 +0100, Thomas Lörtsch wrote:
>>>>> On 15. Dec 2023, at 00:01, Olaf Hartig <olaf.hartig@liu.se> wrote:
>>>>> [...]
>>>>> If so, what would you expect to happen if someone writes the
>>>>> following?
>>>>>
>>>>> :T rdfx:typeOf << :s1 :p1 :o1 >> .
>>>>> :T rdfx:typeOf << :s2 :p2 :o2 >> .
>>>>>
>>>>> Which triple type would :T denote in this case (if any)?
>>>>
>>>> In the strictly monotonic world of RDF :T would then probably refer
>>>> to a little graph term.
>>>
>>> I don't think this works, but I also think there is no point going
>>> further into this discussion for now.
>>
>> Not attempting to force a discussion on you, but
>>
>> :X a :House, :Bird .
>>
>> is perfectly legal RDF. RDF is not especially safe.
>
> Okay, I backtrack. It certainly depends on how you define the semantics
> of rdfx:typeOf.
>
> Based on how you introduced it, I was assuming that the meaning you
> have in mind for it is to state that the subject of an rdfx:typeOf
> triple (i.e., :T in the examples) is "a reference to the [triple] type"
> captured by the triple term in the object position of the triple. Based
> on this meaning, I see the two lines above as an inconsistency.
I don’t think so. Although I don’t know of anything that qualifies as a house-bird, IMO it is not an _inconsistent_ concept, as it is neither contradictory in itself nor per se unsatisfiable. But I’ll let some logicians make the call.
> With your statement above (":T would then probably refer to a little
> graph term") you are implying a different meaning of rdfx:typeOf; being
> a reference to a triple term/type and being a reference to graph term
> are different things (at least, I see it that way). Moreover, if an
> rdfx:typeOf triple is meant to state that the subject of this triple is
> a reference to a graph term, then I would expect the object of this
> triple to be that graph term (as a whole), rather than just one of the
> triples that is part of the graph term.
I would expect that too, and I would expect a language to provide me with the means to unambiguously express something that matches that expectation, e.g.:
:T rdfx:typeOf << :s1 :p1 :o1. :s2 :p2 :o2 >>
But if the foreseeable and legitimate needs and expectations of users are not met, they will create cow paths like
:T rdfx:typeOf ( << :s1 :p1 :o1 >>, << :s2 :p2 :o2 >> ) .
because that’s just what humans do when they need a tool that isn’t provided: they get creative and build the missing tool from what is at their disposal. And frankly: I love that, even if it bends semantics quite a bit. It’s the tool providers, it’s us who are to blame if something like that happens in practice, not the users. And it will happen.
>> [...]
>>>>> In other words, should the two occurrences of the
>>>>> subexpression << :s :p :o >> in the following two lines be
>>>>> understood to "implicitly reference" the same token or two
>>>>> different tokens?
>>>>>
>>>>> << :s :p :o >> :p2 :o2 .
>>>>> << :s :p :o >> :p3 :o3 .
>>>>
>>>> Always a different one, and that’s indeed crucial (I pointed that
>>>> out in the nested graph proposal too).
>>>
>>> In this case, I cannot see how it would be possible to make more
>>> than one annotation statement for each token? (If you attempt to
>>> answer this question based on an example, please write the example
>>> either in terms of the abstract syntax or the N-Triple-star format,
>>> but not in Turtle-star.)
>>
>> I was sloppy in this example, but I seem to remember that in the
>> context of the whole mail it might have been clearer. The idea was
>> (and is) that providing no identifier is interpreted as "I don’t
>> care about the name of this" and a blank node identifier is
>> created. That motivates the need to define a way to refer to the type
>> (above). I just realized that the type reference, given in the syntax
>> above, could then be interpreted as denoting a type of occurrence -
>> that would have to be explained away…
>>
>> << :s :p :o >> :p2 :o2 .
>> << :s :p :o >> :p3 :o3 ;
>>                :p4 :o4 .  # multiple annotations only as trees
>>                           # if no explicit ID is provided
>>
>> is then the same as
>>
>> << _:b1 | :s :p :o >> :p2 :o2 .
>> << _:b2 | :s :p :o >> :p3 :o3 ;
>> :p4 :o4 .
>
> No, I would not say that these are the same. In contrast, the first of
> these two snippets of Turtle is the same as the following.
>
> << :s :p :o >> :p2 :o2 .
> << :s :p :o >> :p3 :o3 .
> << :s :p :o >> :p4 :o4 .
At present, yes, but not in the proposal I’m making.
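To make the difference concrete, here is a minimal Python sketch (not part of any spec) contrasting the two readings. Triples are plain tuples; the function names and the way annotations are passed in are assumptions made purely for illustration, while rdfx:occurrenceOf and the blank node labels follow the examples quoted in this thread.

```python
from itertools import count


def expand_type_based(annotations):
    """Current reading: every << s p o >> denotes the same triple term,
    so annotations on equal terms end up on one and the same subject."""
    return {(term, p, o) for term, pairs in annotations for p, o in pairs}


def expand_occurrence_based(annotations):
    """Proposed reading (illustrative): each unnamed << s p o >> gets a
    fresh blank node that is linked to the term via rdfx:occurrenceOf."""
    fresh = (f"_:b{i}" for i in count(1))
    triples = set()
    for term, pairs in annotations:
        occurrence = next(fresh)
        triples.add((occurrence, "rdfx:occurrenceOf", term))
        for p, o in pairs:
            triples.add((occurrence, p, o))
    return triples


# The two annotated occurrences from the example quoted in this thread.
annotations = [
    ((":s", ":p", ":o"), [(":p2", ":o2")]),
    ((":s", ":p", ":o"), [(":p3", ":o3"), (":p4", ":o4")]),
]

type_based = expand_type_based(annotations)
occurrence_based = expand_occurrence_based(annotations)
```

In the type-based result all three annotations share a single subject, so the grouping of :p3 and :p4 is no longer recoverable; in the occurrence-based result that grouping survives on _:b2.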
Thomas
> Best,
> Olaf
>
>
>> N-Triples-star might look like this:
>>
>> _:b1 rdfx:occurrenceOf << <http://example/s> <http://example/p> <http://example/o> >> .
>> _:b2 rdfx:occurrenceOf << <http://example/s> <http://example/p> <http://example/o> >> .
>> _:b1 <http://example/p2> <http://example/o2> .
>> _:b2 <http://example/p3> <http://example/o3> .
>> _:b2 <http://example/p4> <http://example/o4> .
>>
>>
>> Best,
>> Thomas
>>
>>
>>> Thanks,
>>> Olaf
>>>
>>>> The preference for types in the semantics of RDF might be
>>>> characterized as early optimization: understandable for an
>>>> integration-focused technology, and well understood in logic.
>>>> However, the unification of tokens into types risks losing context
>>>> (and annotations). It can just as well be postponed to querying
>>>> (DISTINCT) or to a conscious data management operation (spring
>>>> cleaning in the dataset). The one thing that one doesn’t want to lose
>>>> when working with data is … data. So late unification of tokens into
>>>> types has some merit.
>>>>
>>>> Best,
>>>> Thomas
>>>>
>>>>
>>>>
>>>>> Thanks,
>>>>> Olaf
>>>>>
>>>>>
>>>>>> and may either provide a custom name or will be provided with a
>>>>>> new blank node to name the reference.
>>>>>>
>>>>>>
>>>>>> ## Syntax
>>>>>>
>>>>>> We should try to make the naming syntactically as uniform and
>>>>>> predictable as possible. The nested graph proposal uses a pair of
>>>>>> square brackets [] prepended to constructs to indicate the name. If
>>>>>> a custom name is given, it is entered into that pair. That
>>>>>> violates the rules for [] in Turtle/TriG but seems to parse
>>>>>> unambiguously. Not providing any name syntactically and still
>>>>>> assuming the presence of a blank node name is a bit more tricky.
>>>>>>
>>>>>> :liz :spouse :dick [id:1]{| :start 1964; :end 1974 |} .
>>>>>> :liz :spouse :dick {| :start 1975; :end 1976 |} . # _:id2
>>>>>>
>>>>>> [] << :s :p :o >> :start 1964 ; :end 1974 .
>>>>>>
>>>>>> In any case: if it doesn’t parse without a prepended name, then
>>>>>> prepend a [].
>>>>>>
>>>>>>
>>>>>> ## Unasserted vs Asserted
>>>>>>
>>>>>> Why not define a property that not only references a token, but
>>>>>> also creates the triple, e.g.:
>>>>>>
>>>>>> :liz :spouse :dick [id:1]{| :start 1964; :end 1974 |} .
>>>>>>
>>>>>> mapping to
>>>>>>
>>>>>> id:1 rdfx:assertionOf << :liz :spouse :dick >> ;
>>>>>>      :start 1964 ; :end 1974 .
>>>>>>
>>>>>> instead of
>>>>>>
>>>>>> id:1 rdfx:occurrenceOf << :liz :spouse :dick >> ;
>>>>>>      :start 1964 ; :end 1974 .
>>>>>> :liz :spouse :dick .
>>>>>>
>>>>>> That way we get identifiers for each triple occurrence together
>>>>>> with the triple being asserted - direct identification, not early
>>>>>> optimization. See above why that is important.
>>>>>>
>>>>>> All this unasserted business may seem a bit eccentric, but it’s
>>>>>> the key to any sort of configurable semantics like quotation etc.
>>>>>> It therefore has huge potential - if done right.
>>>>>>
>>>>>>
>>>>>> ## SPARQL sugar
>>>>>>
>>>>>> You compare the occurrence-based shortcut relation to syntactic
>>>>>> sugar for RDF lists, which is fine, except that querying those
>>>>>> lists is a hardship. Same for RDF/XML’s syntactic support for RDF
>>>>>> standard reification. Any kind of RDF syntactic sugar also needs
>>>>>> proper support in SPARQL to be effective in practice.
>>>>>>
>>>>>>
>>>>>> ## Triple terms vs Graph terms
>>>>>>
>>>>>> Just for completeness: all of this can easily be expanded to
>>>>>> graph terms. The syntax
>>>>>>
>>>>>> []{ :s :p :o. :u :v :w }
>>>>>>
>>>>>> is explored in the nested graph proposal.
>>>>>>
>>>>>>
>>>>>> ## Graph Terms vs Named Graphs
>>>>>>
>>>>>> I like Adrian’s example [0] of a complicated named-graph-based
>>>>>> application and I’m taking that seriously. However, it should also
>>>>>> be clear that triple/graph terms in the end are always stored in
>>>>>> a way very similar to named graphs. There is just no other way in
>>>>>> a quad-based system. Triple/graph terms can be represented as
>>>>>> named graphs, named graphs can be represented as graph terms.
>>>>>> It’s a practical question of how to encode belonging/membership:
>>>>>> syntactically as nested graphs, via a new term type as in RDF-star
>>>>>> that transforms a triple into a term at the surface (but NOT
>>>>>> in the underlying storage layer, for obvious performance
>>>>>> reasons), via explicit binding relations as Niklas proposes [1]
>>>>>> (and as Dydra implements nested graphs), etc. The main question
>>>>>> is how to ensure that those binding relations don’t get lost in
>>>>>> the process, but that IMHO is true for any solution. Nested
>>>>>> graphs can be serialized to graph terms, which are just an
>>>>>> extension of triple terms. That requires an additional en/decoding
>>>>>> step to fit them into an environment that reserves named
>>>>>> graphs to its own purposes. That extra step is the price that
>>>>>> those applications have to pay for being so particular about
>>>>>> their use of named graphs. That’s only fair, and probably still
>>>>>> economical for them.
>>>>>>
>>>>>>
>>>>>> ## Term types vs Datatypes
>>>>>>
>>>>>> The most fundamental grievance with RDF-star is the introduction
>>>>>> of a new term type when a new datatype of type RDF/TTL would
>>>>>> suffice. All I proposed above is readily implementable in the
>>>>>> nested graph proposal, which does map to TriG and regular N-Quads
>>>>>> and such a datatype (and even Turtle and N-Triples, but that’s
>>>>>> another discussion).
>>>>>>
>>>>>>
>>>>>> Best,
>>>>>> Thomas
>>>>>>
>>>>>>
>>>>>>
>>>>>> [0] https://lists.w3.org/Archives/Public/public-rdf-star-wg/2023Dec/0019.html
>>>>>> [1] https://lists.w3.org/Archives/Public/public-rdf-star-wg/2023Nov/0032.html
>>>>>>
>>>>>>
>>>>>>> Andy
>>>>>>>
>>>>>>> [1]
>>>>>>> https://lists.w3.org/Archives/Public/public-rdf-star-wg/2023Dec/0024.html
>>>>>>>
>>>>>>> [2]
>>>>>>> https://w3c.github.io/rdf-concepts/spec/#section-triples
>>>>>>> (as of 2023-12-10)
>>>>>>>
Received on Monday, 18 December 2023 21:06:16 UTC