Re: UCR

Hi Thomas,

On Wed, Jun 12, 2024 at 11:33 PM Thomas Lörtsch <tl@rat.io> wrote:
>
>
>
> > On 10. Jun 2024, at 22:53, Niklas Lindström <lindstream@gmail.com> wrote:
> [...]
> > To aid in this assessment, I made a short presentation (6 slides) at
> > [1], focusing on what I think is the most pertinent question at hand:
> > "Tokens and/or Reifiers?"
> >
> > Best regards,
> > Niklas
> >
> > [1]: https://docs.google.com/presentation/d/e/2PACX-1vQd9lU1j4TPxluCe-cB0t7_BUpy8zAfeY_5hDlbwIyOB8wsiRqkRtSFP4AeflV5UsE4EqT-Y3_Jjx9q/pub
>
> Not a fan of slide decks since they are hard to comment on.

To spare the list some prose for a while, I tried a more visual
approach, primarily for perspective. I think we need to share more
perspectives and interpretations of examples, and *retain* them. This
isn't it; it's seeking that.


> On slide "Use Case Categories" you broadly distinguish two categories:
> 1. Token provenance - to which timestamps, source, and level of trust can be assigned
> 2. Statement qualification - about detailed circumstances such as events or situations
>
> I agree in principle/roughly, but I think we can do better:
>
> A) about the statement as a whole, i.e. an entity in its own right
> B) about the kind of relation described by the statement
>
> A is of course very well suited to describe provenance, but also versioning, plausibility, propositional attitude, etc. However, the crucial aspect is that it talks about the statement as a whole, as an object in its own right. Annotating that object doesn’t change the assertion it represents; it only comments on it.
>
> B on the other hand qualifies the relation. It may add that a "likes" relation is indeed strongly felt, that a "buys" relation was performed via electronic payment, etc. It might even go further and comment on properties of the subject and object at the time the relation existed, maybe that Paul was in a hurry when he bought the ticket, but such detail seems out of scope for this WG. However, I don’t find the notion of "events or situations" helpful to clarify the distinction between 1/A and 2/B.
>
>
> What bugs me right now is that
> - reification is well suited to represent 1/A
> - instantiation via singleton properties is better suited to represent 2/B.
> However, who wants to complicate things even further than they are already?!
>
> But, w.r.t. the discussion about what reification actually is, if occurrences are the right concept/term, etc., I think it’s important that we agree on the categorization A/B. I hope you find it clearer than 1/2, but maybe you can come up with an even better abstraction.

I think your categories represent different delineations. They appear
to deal with different specializations of statements (closer to tokens
of "1", or sub-relationships), not necessarily the wider notion of
reifiers ("2").

Your "A" is either the simple logical expression, or what the triple
denotes, the abstract relationship itself. I take the RDF spec to
imply that the former is a token, and the latter is basically a formal
atom (or axiomatic abstraction) of binary, directed propositional
logic. This atom is not useful as a subject of further description in
most domains of discourse. (I'd say even literals come before them in
the theoretical order of "useful subjects".) But *tokens of* such
certainly are.

And I'd say your "B" is the notion of edge instantiation in LPGs? I
don't see its usefulness in RDF, given RDF's stricter foundation. (And
RDF already has rdfs:subPropertyOf for direct sub-relationship
specializations.) I do think your examples are great for the
usefulness of reifiers, since they turn "liking" and "buying" into
situations or events (which quite naturally cannot be restricted to
reify only one statement). And of course the identity of agents
participating in those events may be temporal specializations of their
general, "endurant" identities. RDF can already be used to model *all*
of that, it just has a somewhat cumbersome way of indirectly relating
the simpler, direct statements to such more qualified states of
affairs that reify them.

As I see it, a statement qualification doesn't have to be just a
restriction in meaning; it can also be wider than a specialization of
a relationship, in that it represents some kind of condition or
circumstance in relation to it. It can thus be both more particular
(as in concrete) and apply to more than one statement. Such
qualifications are truth-makers of truth-bearers. But this is most
certainly "vocabulary".

So while I'm sure a better (and/or more approachable) abstraction can
be made, I'm not sure your substitution keeps the delineation I'm
seeking. (The sides of which, as I see it, the current "baseline"
proposal attempts to cater for.)


> Your slide "Two Kinds of 'Occurrences'" doesn’t make much sense to me, especially how you characterize a token. I think plato.stanford.edu is more helpful to define the notion of a token. More generally, you mix use cases with structural properties and add orthogonal questions like opacity to the mix. I think that needs more differentiation, and separation of concerns.
>
> Enrico mentioned in the last SemTF meeting that there are different kinds of referential opacity:
> - totally opaque, like a literal referring only to itself
> - co-referentially opaque, referring to a real-world entity but suppressing co-denotation
> - maybe various levels of co-referential opacity depending on syntactic details (e.g. whether the two integers 42 and 042 are different or not)
> We have to discuss if we need those, all of them, or which ones…
> We can derive them all from the abstract unasserted triple term via specific (and to be defined) properties; we can define different syntaxes to represent them (some combinations of <> and "" might do the trick), etc. But who would implement all that, and who would even bother to understand? So do we need to decide?

I am also asking for which concerns there are. I think opacity is a
rabbit hole (leading to a philosophical permathread) which is best
tackled, when needed, as just a raw string used as a property value of
a transparent statement "token", for the cases where quotation of
source representation is part of the domain of discourse. I don't
think we need core syntax for that, as use cases may differ a lot on
details. (For one, I wouldn't be surprised if some require the actual
prefixes used, etc, since that can be a relevant aspect of errors in
data capture.)

However, I do think the way both SPARQL and SHACL operate "on the
abstract syntax itself" (a gross oversimplification, but just think of
e.g. isIRI and isBlank in SPARQL), and the way graph names are outside
of interpretations, are interesting factors. There *might* be a way
to, simply enough, capture the token/statement duality of a triple
that is accessible in the interpretation, if deemed a fundamental,
useful capability for required use cases.

> That is orthogonal to the asserted-unasserted axis.
> It is also orthogonal to the 1/A-2/B categorization above.

I agree.


> Again, I don’t find that "situations, events or circumstances" categorization useful. There are many more things on earth, like theories, relationships, broadly agreed upon facts, the periodic system, Paul buying a ticket, etc. We will neither be able to categorize them all nor do we need to: on the abstract level of statements about statements they all behave the same. The rest is vocabulary, if one is so inclined, and not our topic.

Well, I do explicitly say that they are any kind of rdfs:Resource. I
suppose we could concretize the example categorizations: situations
(a.k.a. states of affairs), such as "broadly agreed upon facts" and
"the periodic system"; events, such as "Paul buying a ticket"; and
circumstances (closely related to situations), such as "relationships"
and "theories".

Vocabulary is perhaps not our topic to develop, but usage thereof in
RDF is surely our responsibility to cater for.

Best regards,
Niklas



> Best,
> Thomas

Received on Thursday, 13 June 2024 13:56:00 UTC