Re: Combining RDF-star and Singleton Properties [ was Re: The singleton property option] from Miel Vander Sande on 2024-05-27 (public-rdf-star-wg@w3.org from May 2024)

From: Miel Vander Sande <miel.vandersande@meemoo.be>
Date: Mon, 27 May 2024 08:49:16 +0200
To: Franconi Enrico <franconi@inf.unibz.it>
Cc: Kurt Cagle <kurt.cagle@gmail.com>, ddooss@wp.pl, Thomas Lörtsch <tl@rat.io>, RDF-star Working Group <public-rdf-star-wg@w3.org>, "Peter F. Patel-Schneider" <pfpschneider@gmail.com>
Message-ID: <CAHeRLWtajEKGMC7xpx=L6-WT16ajN=RoPh0ebYhsv7hQnitWTg@mail.gmail.com>
Hello,

I have to agree with Enrico (but I might be missing something). Even
syntactically

:subject [ :namedNode | :property1 :value1; :property2 :value2 ] :object .

doesn't add much to

:subject :namedNode :object .
:namedNode :property1 :value1; :property2 :value2 .

which seems to be the same, if I understand the document correctly. The same
holds for the S and O positions.
And you still have to make each :namedNode unique and refer to the
predicate to actually get singleton properties.
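
To make the comparison concrete, here is a minimal Python sketch of how I read
the bracket form in predicate position. The helper names are hypothetical (they
are not from Kurt's document) and triples are modeled as plain tuples. Under
that reading, the bracket form expands to exactly the plain triples:

```python
# Hypothetical helpers: triples are (subject, predicate, object) tuples.

def expand_named_node(name, properties):
    """Return one triple per (predicate, object) pair describing `name`."""
    return [(name, p, o) for p, o in properties]


def triple_with_named_predicate(subject, name, properties, obj):
    """Expand `subject [ name | properties ] obj .` into plain triples."""
    return [(subject, name, obj)] + expand_named_node(name, properties)


bracket_form = triple_with_named_predicate(
    ":subject",
    ":namedNode",
    [(":property1", ":value1"), (":property2", ":value2")],
    ":object",
)

plain_form = [
    (":subject", ":namedNode", ":object"),
    (":namedNode", ":property1", ":value1"),
    (":namedNode", ":property2", ":value2"),
]

# Both spellings yield the same set of triples.
same = set(bracket_form) == set(plain_form)
print(same)
```

If this reading is right, the bracket form is purely a serialization shortcut
and carries no extra information.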

In this proposal, how would you annotate :subject :namedNode "42" without
having to change your graph?

Cheers,

Miel

On Fri 24 May 2024 at 15:55, Franconi Enrico <franconi@inf.unibz.it> wrote:

> Hi Kurt,
> It seems to me that your proposal is a rephrase of various discussions we
> already had, and ruled out.
>
> *Named Node in the Predicate Position*: this seems to be just a rephrase
> of the singleton property - once you try to give semantics to it. Observe
> that, wrt the current status of the discussion, your proposal does not
> express directly the multi-edge case, since you have to name each edge of
> the same type with a distinct name.
>
> *Reifier Expression*: this is just a rephrase of "option 1" we discussed
> and ruled out some time ago. It has severe drawbacks in reconstructing the
> reifier back from the three reification triples.
>
> cheers
> —e.
>
> On 23 May 2024, at 21:46, Kurt Cagle <kurt.cagle@gmail.com> wrote:
>
> Dominik,
>
> Thanks, I will incorporate the Cypher coding as well.
>
> *Kurt Cagle*
> Editor in Chief
> The Cagle Report
> kurt.cagle@gmail.com
> 443-837-8725
>
>
> On Thu, May 23, 2024 at 12:33 PM ddooss@wp.pl <ddooss@wp.pl> wrote:
>
>> Dear Kurt
>>
>> Thank you for sharing your document on the proposed enhancements to RDF
>> reification and LPG harmonization. Your approach to addressing these issues
>> separately while utilizing a similar notation is intriguing and seems quite
>> promising.
>>
>> The named node expressions, in particular, provide a clear method for
>> transforming typically ephemeral blank nodes into actionable, referenceable
>> elements.
>>
>> I would appreciate a formal presentation of these concepts in our next
>> meeting. A detailed exposition will help ensure that everyone understands
>> the intricacies of your proposal and will facilitate a thorough comparison
>> with other existing proposals. I definitely think that we need to have
>> formal definitions, as well as the semantics of your proposal.
>>
>> Regarding translating these ideas into a DDL/DML language like Cypher,
>> could you provide an example that mirrors the LPG scenario described?
>> Demonstrating how these RDF constructs could be represented in Cypher would
>> aid in evaluating their practical applicability in a graph database context.
>>
>> Best regards,
>> Dominik
>>
>> On 23 May 2024 at 19:24, Kurt Cagle <kurt.cagle@gmail.com> wrote:
>>
>> I've attached a document that covers YET ANOTHER proposal (more properly
>> a recommendation I've made before).
>>
>> There are two issues that we seem to be rehashing here. The first is the
>> question of reificational notation, while the second has to do with LPG
>> harmonization. My contention is that these are different issues, though we
>> can use similar notation for both.
>>
>> *Reification*
>>
>> A named reification is simply a set of statements:
>>
>> :r rdf:subject :s; rdf:predicate :p; rdf:object :o .
>>
>> This is not a triple. It is three statements about the state that a
>> triple can be in. It does not introduce a triple into the system; it makes
>> no assertions about the truthiness or even, by itself, the existence of that
>> triple. It is simply a statement about the components that a triple might
>> have. You cannot reason with it directly, though you can use other
>> processes (SPARQL, SHACL, etc.) to construct or verify the existence of
>> triples for which these assertions are true. Properly speaking, the above
>> itself should probably be qualified:
>>
>> :r rdf:subject :s; rdf:predicate :p; rdf:object :o ; a rdf:Reification .
>>
>> The notation << :r | :s :p :o >> makes the above statement more compact,
>> but the reification can apply to any triples within a system, or none at
>> all, regardless of the values.
>>
>> *Named Node Expressions*
>>
>> I propose, in the attached, that we use a similar nomenclature for what
>> I'm terming named node expressions, to wit:
>>
>> [ ?nn | :p1 :o1 ; :p2 :o2 ]
>>
>> where ?nn is replaced by a formal (not blank) IRI.
>>
>> This is a Turtle (not RDF) syntactical amendment. The above takes what
>> would ordinarily be a blank node and replaces it with a named node:
>>
>> For instance:
>>
>> :liz :hasMarriage [ :marriage1 | :to :Richard ; :start "1965" ; :end "1975" ] .
>>
>> which expands to:
>>
>> :liz :hasMarriage :marriage1 .
>> :marriage1 :to :Richard .
>> :marriage1 :start "1965" .
>> :marriage1 :end "1975" .
>>
>> Why is this important? Because the blank node is a pointer to a data
>> structure, but use of the [] notation makes it impossible to reference that
>> data structure from within Turtle. By adding in a named node as the
>> referencing node, you gain that ability, and it is a key ability for
>> modeling.
>>
>> For instance, I can use the expression:
>>
>> :liz :hasMarriage [ :marriage1 | :start "1965" ; :end "1975" ; :to :richard ],
>>     [ :marriage2 | :start "1975" ; :end "1985" ; :to :john ] .
>>
>> This is semantically equivalent to the JSON
>>
>> {"liz":{"hasMarriage":[
>>   {"marriage1":{"start":"1965","end":"1975","to":"richard"}},
>>   {"marriage2":{"start":"1975","end":"1985","to":"john"}}]}}
>>
>> The same thing can be done with both predicate-positioned named node
>> expressions and subject-oriented ones.
>>
>> This addresses the LPG equivalency relationship, and does so without ever
>> touching reifications.
>>
>> Note that this also highlights an important point. Blank nodes are useful
>> because they are unique and system-assigned. However, they are not
>> referenceable. The Turtle notation:
>>
>> :liz :hasMarriage _:b1, _:b2 .
>> _:b1 :start "1965" ; :end "1975"; :to :richard .
>> _:b2 :start "1975" ; :end "1985"; :to :john .
>>
>> is simply a preprocessor directive to replace the "named" nodes with
>> anonymous IRIs in the final indexing.  You still have to make _:b1 and _:b2
>> unique, or the data structures disintegrate.
>>
>> Anyway, I ask the chair for time during our next meeting to discuss this
>> proposal.
>>
>> *Kurt Cagle*
>> Editor in Chief
>> The Cagle Report
>> kurt.cagle@gmail.com
>> 443-837-8725
>>
>>
>> On Wed, May 15, 2024 at 5:32 AM Thomas Lörtsch <tl@rat.io> wrote:
>>
>> YET ANOTHER GRAND UNIFYING PROPOSAL
>> ===================================
>>
>> What appeared as the way forward last winter is getting more and more
>> convoluted as the details are discussed. I agree with Niklas that this is a
>> result of the choice for Option 3, but it is also a sign of a deeper problem:
>> we might still not be working with the right primitives and still not have a
>> solid understanding of the problem we’re dealing with. I’m well aware that
>> everybody is tired and wants to be done with all this, but it seems to me
>> that we should change course, again. I’ll discuss some background first,
>> but then make a pretty concrete proposal of how to attack the problem by
>> combining the syntax of RDF-star with the semantics of singleton
>> properties. IMHO it has some very concrete advantages: fewer triples, less
>> confusing indirections, and more intuitive semantics.
>>
>>
>> BACKGROUND
>> ==========
>>
>> I see two main use cases for statement annotation:
>>
>> - n-ary relations
>>  with a primary topic and secondary, qualifying attributes
>>  -> that can be interpreted as INSTANTIATION
>>
>> - metadata annotations
>>  that are orthogonal to the topic of the statement
>>  -> that can be interpreted as REIFICATION
>>
>> The metadata use case (quite often characterized as provenance) is strong
>> in RDF land with its focus on integration of data from heterogeneous
>> sources. In LPG land much more emphasis is put on structuring the graph
>> into easy to navigate main relations and their less important details (and
>> attributed objects, but that’s another topic). Of course that is just a
>> very rough characterization, and overlaps in both directions are common.
>>
>> The metadata use case is well captured by REIFICATION because reification
>> stays clear of the annotated statement itself (let’s keep in mind that
>> reification is a general concept and not associate it with the syntactic
>> verbosity of its implementation in RDF for a moment). There is an air gap
>> between the statement and its reification that ensures that the original
>> statement is unencumbered and unchanged by the annotation. This is good for
>> the metadata use case but it is not easy to understand as recent mail
>> exchanges on the list between Olaf, Niklas, Bryan and Peter have shown
>> (again) and the indirection can cause irritating and unfortunate effects.
>>
>> In an n-ary relation the main relation can be understood as an
>> INSTANTIATION of the type of relation it represents, with each instance
>> having its own secondary attributes as qualifications. Instantiation is a
>> concept that is well understood and maps nicely to everyday
>> conceptualizations like "a car" (engine, four wheels, etc) and "my car"
>> (again engine, four wheels, etc, but also a sedan, blue, old, etc).
>> Instantiation is what drives the semantics of the singleton property
>> approach.
>>
>> Of course, the distinction is more one of tendencies than of a hard
>> separation: reification can represent n-ary relations and instantiation via
>> n-ary relations can represent metadata, one person’s data is another
>> person’s metadata, etc. However, in both cases that comes at a certain cost
>> in intuitiveness and naturalness. If used wrongly, subtle breaks can be
>> introduced that may lead to surprising and undesirable results.
>>
>> The problem with the singleton property approach as proposed by Nguyen et
>> al is that it tries to achieve its goal without a change to the syntax of
>> RDF. It lacks the boldness of RDF-star to introduce a new term type into
>> RDF. This makes it verbose and hard to optimize, and it requires entailing the
>> primary relation as if it were an additional detail, an afterthought even.
>> On the other hand RDF-star was initially a syntax without a well-defined
>> semantics, or model theory even, and this WG still struggles to make it all
>> work out. This here is an attempt to re-use the singleton property
>> approach as the semantic underpinning of the RDF-star syntax, or, put the
>> other way, to augment the singleton property approach with the RDF-star
>> syntax, thereby getting rid of its verbosity. So let’s get to it.
>>
>>
>> CORE
>> ====
>>
>> 1) a return to RDR and pre-CG RDF*: EACH TRIPLE TERM IS ASSERTED, e.g.
>>
>>    << :s :p :o >> :b :c .
>>
>> asserts ' :s :p :o ' and annotates it in one go. This gets rid of the
>> need for an annotation/shorthand syntax and it saves an extra triple to
>> actually assert the assertion. It captures the predominant intuition:
>> saying something and adding detail to it. At this point it doesn’t matter
>> much if that detail is metadata or qualifying detail. What matters is that
>> both are solidly connected, not separated (and prone to mixups and
>> misunderstandings through overlapping multi-edge situations).
>> A query for { :s :p :o } in SPARQL-star on the above example must
>> retrieve the statement ':s :p :o' from the triple term << :s :p :o >>, etc
>> - "Turtle with holes".
>> This means that in common scenarios there is zero overhead because of
>> singleton property verbosity and entailments, unasserted assertions, etc.
>> The main use case is very straightforward to use (and implement, I reckon).
>>
>>
>> 2) the RDR/RDF* proposal is extended with TRIPLE TERM IDENTIFIERS not
>> unlike the current WG proposal, but with a twist: user-provided identifiers
>> are handled differently (more on that below). Just as in the current WG
>> proposal a bnode identifier is provided by the system for every triple
>> term, e.g. the above '<< :s :p :o >> :b :c .' is equivalent to
>>
>>    << _:p1 | :s :p :o >> :b :c .
>>
>> The triple term is now a QUAD-TUPLE: the identifier becomes part of the
>> triple term also in model and abstract syntax, getting rid of the abstract
>> triple term type (the thing syntactically expressed as '<<( :s :p :o )>>'
>> in the current proposal - however, we will reuse that syntax, see below).
>> This identifier is equivalent to the singleton property itself in the
>> approach so named.
>> The statement identifier, referring to an instance/occurrence of the
>> abstract statement, is essential to capture the semantics of most use
>> cases, not the least LPG uses, where statements (or edges in LPG) of the
>> same type can occur multiple times, each with different and not to be mixed
>> up sets of annotations.
>>
>> The current WG proposal offers to users the possibility to explicitly
>> define an IRI in lieu of but semantically equivalent to the system provided
>> bnode, e.g.
>>
>>    << :x | :s :p :o >> :b :c .
>>
>> Its purpose is to work around the limitations of line-based
>> serializations. We do this too, but in a different way (that’s the TWIST
>> hinted at above): an explicitly provided identifier is stored separately
>> from and additionally to the system provided bnode. The quad therefore
>> conceptually becomes a QUIN-TUPLE - however, stores may choose to just
>> store the explicit identifier via an extra statement, like in the mapping
>> discussed next. The rationale behind this arrangement will become apparent
>> below when we discuss many-to-many relations, sets and graphs.
>>
>>
>> 3) a MAPPING to standard RDF is based on the singleton property approach,
>> e.g.
>>
>>    :s :p :o .
>>    :s _:p1 :o .
>>    _:p1 rdf12:singletonOf :p ;  # _:p1 is a singleton property of :p
>>         rdf12:id :x ;           # :x is a user-provided identifier referring to _:p1
>>         :b :c .
>>
>> This should work well through the whole installed base and stack of
>> RDF/RDFS/OWL/etc, at least in principle (issues e.g. with missing predicate
>> indexes notwithstanding). [0] have found that singleton properties have
>> quite favorable properties w.r.t. reasoning (and even more so if, in
>> contrast to those authors, one interprets singleton annotations not as
>> constraints but as additional detail).
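
A minimal Python sketch of the mapping above, assuming triples are modeled as
plain tuples. The function name `map_triple_term` and its parameters are
hypothetical; `rdf12:singletonOf` and `rdf12:id` are the property names used in
the proposal, and the caller supplies the bnode label that a system would
otherwise generate.

```python
def map_triple_term(s, p, o, annotations, bnode, user_id=None):
    """Map an asserted, annotated triple term to singleton-property triples.

    `annotations` is a list of (predicate, object) pairs attached to the term.
    `bnode` stands in for the system-provided identifier (e.g. "_:p1").
    `user_id` is the optional explicit identifier (e.g. ":x").
    """
    triples = [
        (s, p, o),                          # the statement itself is asserted
        (s, bnode, o),                      # the singleton occurrence
        (bnode, "rdf12:singletonOf", p),    # links the occurrence to :p
    ]
    if user_id is not None:
        triples.append((bnode, "rdf12:id", user_id))
    # Annotations hang off the singleton property, not off :p.
    triples.extend((bnode, ap, ao) for ap, ao in annotations)
    return triples


# << :x | :s :p :o >> :b :c .
mapped = map_triple_term(":s", ":p", ":o", [(":b", ":c")], "_:p1", user_id=":x")
for t in mapped:
    print(" ".join(t), ".")
```

The sketch keeps the plain `:s :p :o` triple in the output, which reflects the
point that every triple term is asserted.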
>>
>>
>> I claim that so far all this is pretty straightforward and covers the
>> vast majority of real world usage. It is cleaner and more concise than the
>> current proposal in that it doesn’t separate assertion from annotation, it
>> saves that extra un-asserted triple in storage and it makes the shorthand
>> annotation syntax superfluous.
>>
>>
>> Some details are important to understand:
>>
>> - rdf12:singletonOf rdfs:subPropertyOf rdf:type .
>> This reflects the intuition that each singleton is unique, an intuition
>> that is better expressed as instantiation than as subclassing.
>>
>> - the verbosity and optimization troubles of singleton properties, as
>> evidenced in the mapping, only occur in environments that don’t support
>> RDF-star triple terms (otherwise what would we need RDF-star for ;-).
>>
>> - the mapping loses the strong connection between a statement and its
>> annotation. Just as with the current WG proposal it is possible to have the
>> same statement asserted and, e.g. after merging a different source,
>> annotated but assumedly un-asserted. The latter information will get lost,
>> making the whole concept of un-asserted assertions brittle and unreliable.
>> The current WG proposal always has this problem, this proposal here only
>> when mapping to standard RDF.
>>
>> - the main difference between our proposal and the singleton property
>> approach is that we reverse access: we put the un-annotated statement in
>> the foreground (by means of the triple term syntax), both in the user
>> facing syntax and at the implementation level, whereas in the singleton
>> property approach it has to be entailed from the annotated singleton
>> statement. This makes our proposal much more straightforward to use and
>> implement.
>>
>>
>>
>> EXTENSIONS
>> ==========
>>
>> The current WG proposal tries to cover more ground than just statement
>> annotation, most notably annotating un-asserted assertions, but also other
>> stuff that depending on perspective seems like low-hanging fruit,
>> especially annotating sets of statements and referentially opaque statement
>> annotations. We argue that those are orthogonal demands and should be
>> implemented in a way that doesn’t complicate the above very simple basic
>> arrangement. It seems however that it is possible to achieve this with
>> modest effort.
>>
>>
>> UNASSERTED ASSERTIONS
>>
>> We re-use the syntax of abstract triple terms from the current WG
>> proposal to encode unasserted assertions, as the concept of abstract triple
>> terms is obsolete in our approach. Like triple terms they are four-tuples,
>> i.e. they always have an identifier implicitly provided by the system as a
>> bnode. As the use case is rather niche we consider the introduction of an
>> unasserted assertion in model and abstract syntax overkill, but instead
>> advocate to implement the syntax as syntactic sugar for standard
>> reification, e.g.
>>
>>    <<( :s :p :o )>> :b :c .
>>
>> in standard RDF maps to standard reification:
>>
>>    _:p2 rdf:type rdf:Statement ;
>>         rdf:subject :s ;
>>         rdf:predicate :p ;
>>         rdf:object :o ;
>>         :b :c .
>>
>> The same for explicitly named un-asserted assertions like e.g. '<<( :x |
>> :s :p :o )>>  :b :c .'
>>
>>    _:p2 rdf:type rdf:Statement ;
>>         rdf:subject :s ;
>>         rdf:predicate :p ;
>>         rdf:object :o ;
>>         rdf12:id :x ;
>>         :b :c .
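
The reification mapping for unasserted assertions can be sketched in Python as
follows. This is an illustration under the same assumptions as the text:
triples are plain tuples, `map_unasserted` is a hypothetical name, and the
bnode label is passed in rather than generated.

```python
def map_unasserted(s, p, o, annotations, bnode, user_id=None):
    """Map `<<( s p o )>> annotations` to standard RDF reification.

    Unlike an asserted triple term, the plain (s, p, o) triple is NOT emitted.
    """
    triples = [
        (bnode, "rdf:type", "rdf:Statement"),
        (bnode, "rdf:subject", s),
        (bnode, "rdf:predicate", p),
        (bnode, "rdf:object", o),
    ]
    if user_id is not None:
        triples.append((bnode, "rdf12:id", user_id))
    triples.extend((bnode, ap, ao) for ap, ao in annotations)
    return triples


# <<( :x | :s :p :o )>> :b :c .
reified = map_unasserted(":s", ":p", ":o", [(":b", ":c")], "_:p2", user_id=":x")
asserted = (":s", ":p", ":o") in reified
print(asserted)
```

The only difference from the asserted case is that nothing in the output
states `:s :p :o` directly.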
>>
>>
>> REFERENTIAL OPACITY
>>
>> Referential opacity has come up again lately and although I’m pretty wary
>> of the concept I can see a way in which its introduction will probably not
>> harm RDF in general. Most of all I like to see it as an orthogonal concern
>> that should not be entangled with annotations or un-asserted assertions as
>> such. Therefore I take up the idea of introducing yet another syntax (by
>> Enrico IIRC, in a recent telco), e.g.
>>
>>    <<" :s :p :o ">> :b :c .
>>
>> Again this may be implemented as a new term type in model and abstract
>> syntax, or we may follow Antoine Zimmermann’s proposal for an RDF literal
>> datatype. The latter can be employed to define referential opacity as
>> syntactic sugar and map to standard RDF as follows:
>>
>>    :s :p :o .
>>    :s _:p3 :o .
>>    _:p3 rdf12:singletonOf :p ;
>>         :b :c ;
>>         rdf12:ofArtefact ":s :p :o"^^rdf:ttl .
>>
>> The RDF literal datatype documents precisely the syntactic representation
>> of the statement. This is a very un-intrusive approach to referential
>> opacity and IMHO won’t get in the way of standard RDF procedures. Of course
>> it doesn’t prevent undesirable entailments from being made, as the approach
>> to referential opacity taken by the CG proposal does, but at least it
>> allows to track them back to the original source and treat them
>> accordingly. Bnodes might either not be handled or be covered more fully
>> than in the CG report proposal by allowing artefacts to be concise bounded
>> descriptions. E.g. an artefact ":s :p _:b1 . _:b1 :d :e , :f ."^^rdf:ttl
>> would give a full account of the meaning of _:b1 at the time the artefact
>> was created.
>>
>>
>> REFERENTIALLY OPAQUE UNASSERTED ASSERTIONS
>>
>> Again, referential opacity and unasserted assertions are orthogonal
>> concerns, and therefore yet another syntax is introduced to combine the
>> two, e.g.
>>
>>    <<(" :s :p :o ")>>  :b :c .
>>
>> Following the above proposals this is mapped to standard RDF by adding
>> the literal representation to the reification quad, e.g.
>>
>>    _:p4 rdf:type rdf:Statement ;
>>         rdf:subject :s ;
>>         rdf:predicate :p ;
>>         rdf:object :o ;
>>         rdf12:hasArtefact ":s :p :o"^^rdf:ttl ;
>>         :b :c .
>>
>> Considering the mindboggling level of disambiguation that this
>> arrangement provides the complexity isn’t too bad IMHO ;-)
>>
>>
>> MANY-TO-MANY, SETS, GRAPHS
>>
>> Like the current WG proposal this approach doesn’t rule out many-to-many
>> relations, e.g.
>>
>>    << :x | :s :p :o >> :b :c .
>>    << :x | :u :v :w >> :b :c .
>>
>> We might even consider to introduce a supporting syntax, aka GRAPH TERMS,
>> e.g.
>>
>>
>>    << :s :p :o .
>>       :u :v :w >> :b :c .
>>
>> or, explicitly named
>>
>>    << :x | :s :p :o .
>>            :u :v :w >> :b :c .
>>
>> I don’t want to push the envelope too far (given the constraints imposed
>> by the charter, the controversies around the topic, etc) but it’s good to
>> see that this is syntactically straightforward - it isn’t with the
>> shorthand annotation syntax of the WG proposal.
>>
>> Anyway, employing the mapping to standard RDF from CORE above, we get a
>> straightforward definition of the meaning of many-to-many annotations (no
>> matter if they come as singleton terms or as hypothetical graph terms),
>> e.g. mapping the above many-to-many relation to
>>
>>    :s :p :o .
>>    :u :v :w .
>>    :s _:p5 :o .
>>    _:p5 rdf12:singletonOf :p ;
>>         rdf12:id :x ;
>>         :b :c .
>>    :u _:v1 :w .
>>    _:v1 rdf12:singletonOf :v ;
>>         rdf12:id :x ;
>>         :b :c .
>>
>> This establishes a FOR-EACH semantics: annotations of the graph term are
>> annotating each triple, not the graph (or set of triples if one prefers
>> that slightly looser wording) itself. The same is true for annotations on
>> :x: they too are mapped to all statements so named, e.g.
>>
>>    << :x | :s :p :o .
>>            :u :v :w >> :b :c .
>>    :x :d :e .
>>
>> is mapped to
>>
>>    :s :p :o .
>>    :u :v :w .
>>    :s _:p5 :o .
>>    _:p5 rdf12:singletonOf :p ;
>>         rdf12:id :x ;
>>         :b :c ;
>>         :d :e .
>>    :u _:v1 :w .
>>    _:v1 rdf12:singletonOf :v ;
>>         rdf12:id :x ;
>>         :b :c ;
>>         :d :e .
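
The FOR-EACH reading can be illustrated with a short Python sketch. Everything
here is an assumption for illustration: triples are plain tuples,
`map_graph_term` is a hypothetical name, and the generated labels `_:sp1`,
`_:sp2` stand in for whatever bnodes a system would assign (the text uses
`_:p5` and `_:v1`).

```python
def map_graph_term(triples, user_id, annotations):
    """Map a named graph term to singleton-property triples, FOR-EACH style.

    Every annotation, whether written on the term or on `user_id`, is copied
    onto the singleton property of each contained triple.
    """
    out = []
    for i, (s, p, o) in enumerate(triples, start=1):
        sp = f"_:sp{i}"  # hypothetical system-generated label
        out.append((s, p, o))
        out.append((s, sp, o))
        out.append((sp, "rdf12:singletonOf", p))
        out.append((sp, "rdf12:id", user_id))
        out.extend((sp, ap, ao) for ap, ao in annotations)
    return out


# << :x | :s :p :o . :u :v :w >> :b :c .
# :x :d :e .
mapped = map_graph_term(
    [(":s", ":p", ":o"), (":u", ":v", ":w")],
    ":x",
    [(":b", ":c"), (":d", ":e")],
)
copies = sum(1 for (_, p, o) in mapped if (p, o) == (":d", ":e"))
print(copies)
```

Note that nothing in the output has `:x` as a subject: under this reading the
set itself is never annotated.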
>>
>>
>> To annotate the set of triples itself one would have to create an
>> explicit reference via the identifier :x, e.g.
>>
>>     :x rdf12:asObject [     # or "rdf12:asGraph", "rdf12:asSet" ...
>>         :f :g
>>     ] .
>>
>> A possible use case might be to express that a set of statements together
>> describe a situation, postulate a theory, etc.
>> This arrangement can also be used to annotate singleton statements as
>> objects in their own right (not as annotations to the predicate).
>> Semantically this is probably closer to reification than to n-ary
>> relations, but I’m not really sure myself what to make of it. In any case
>> it is more expressive than the current WG proposal which provides no means
>> to differentiate between the object and its content (httpRange-14 raising
>> its ugly head again, I guess)
>>
>>
>> However, this arrangement has repercussions on the interpretation of
>> annotations on :x (or _:x) in x-to-one cases, i.e. when single statements
>> are annotated, because it can only mean that also those annotations refer
>> to the statement as a whole, not its n-ary property. This is a departure
>> from the current state which leaves this question open - and sure to cause
>> some irritation.
>>
>>
>> TBC… I’m leaving it at this for reasons of time, but also to solicit some
>> general comments. The details most certainly need some more tweaking, as so
>> far all proposals did. The means to explicitly name an occurrence were
>> introduced to overcome the limitations of serialization, but they do open
>> the door to many-to-many relations, and that comes in handy when discussing
>> sets and graphs. However it mixes orthogonal concerns, so may have
>> unintended consequences. I expect this arrangement to be controversial, and
>> maybe buggy. Comments welcome!
>>
>> Best,
>> Thomas
>>
>> [0] https://link.springer.com/chapter/10.1007/978-3-319-58068-5_39
>>
>>
>> > On 2. May 2024, at 16:00, Peter F. Patel-Schneider <pfpschneider@gmail.com> wrote:
>> >
>> > The singleton property approach has benefits and downsides.  The quoted
>> > triple approach has benefits and downsides.
>> >
>> > One very big advantage of the singleton property approach is that it is
>> > (barely) possible to use it with any RDF system, even RDF systems that
>> > have no optimizations.  A big disadvantage of the quoted triple approach
>> > is that it requires new syntax, new semantics, and new implementations.
>> >
>> > One cannot successfully argue that the singleton property approach is
>> > inherently worse than the quoted triple approach just because it may
>> > require more triples.  RDF implementations can be tuned to the singleton
>> > property approach, providing special data structures for singleton
>> > properties and special code to optimize SPARQL queries for the singleton
>> > property approach.
>> >
>> > One possible way to do this is to use a special approach for singleton
>> > properties where the internal name of the blank node encodes the parent
>> > property.  This could result in minimal or even no storage overhead for
>> > singleton properties.  Of course the implementation effort to make this
>> > completely transparent would be significant, but then so is the effort to
>> > make a performant implementation of quoted triples.
>> >
>> > I note that in this approach the singleton property triples would look
>> > very much like multiple edges, i.e., this could be considered to be a
>> > space-efficient implementation of RDFn.
>> >
>> > peter
>> >
>> >
>> > On 4/30/24 15:46, Thompson, Bryan wrote:
>> >> Your proposal would require two statements on top of the original SPO
>> >> statement before you should begin to make assertions about the original
>> >> SPO statement?
>> >>
>> >> Anything based on the singleton property approach will have quite an
>> >> impact on database statistics.  The number of used predicates would jump
>> >> from millions (for open linked data) to the cardinality of the statements
>> >> about which statements are being made (e.g., billions, 10s of billions,
>> >> etc.). @Williams, Gregory <mailto:ngregwil@amazon.com> or @Schmidt,
>> >> Michael <mailto:schmdtm@amazon.com> can comment on this, but this
>> >> certainly places a new burden on common techniques for extracting
>> >> statistics from a graph.
>> >>
>> >> Note that there is really no reason to rely on the P position in your
>> >> proposal.  You could use S since it already allows blank nodes.  You then
>> >> hang the Subject of the original asserted SPO on the statement about that
>> >> unique subject. (Or you could use O, which might be kinder for database
>> >> statistics since they tend to focus on SP* analysis.)
>> >>
>> >> _:si :statementInstanceHasSubject :s .
>> >> _:si :p :o .
>> >> :s :p :o.
>> >>
>> >> I have been impressed in the past with the space and time overhead
>> >> which arises out of various modeling decisions around possible statements
>> >> about statements treatments.  I would recommend carefully considering that
>> >> impact.  Another 2 triples makes a huge difference when all statements
>> >> carry annotations, as they do in some domains.  For example, consider the
>> >> relatively common case in which you have a graph consisting of a topology
>> >> and edge weights.  This is very common - lots of graphs are simply edges
>> >> and their weights.  As I understand it, your proposal would have 3 times
>> >> the data volume to model the topology (some set of edges) in a manner which
>> >> would permit associating edge weights with the edges in that topology.  And
>> >> the database would need to chase a long chain to obtain those edge weights
>> >> in a correct manner: :s :p :o. => :s _:pi :o => _:pi rdfs:subPropertyOf :p
>> >> . => _:pi :hasWeight 1.0.  The cost of chasing that chain would make
>> >> applications relying on edge weights very expensive in both time and
>> >> space.  I can't see that as being responsive to such use cases.  To be
>> >> efficient, there needs to be a close association between an edge and the
>> >> properties of that edge.  Their resolution needs to be very efficient.
>> >>
>> >> Also note that this singleton property proposal would not support
>> >> alignment in the data (interoperability in the data) with LPG edge
>> >> properties.  So it would fail to offer a unification path for the common
>> >> use cases of RDF and LPG.
>> >> Thanks,
>> >> Bryan
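
The edge-weight chain described in the message above can be sketched in Python
to make the cost visible. This is a naive illustration, not a claim about how
any real store works: the graph is an in-memory set of tuples, the helper
names are hypothetical, and no indexes are used.

```python
def singleton_edge(s, p, o, sp, weight):
    """Triples needed to model one weighted edge with a singleton property."""
    return {
        (s, p, o),                        # the plain edge
        (s, sp, o),                       # the singleton occurrence
        (sp, "rdfs:subPropertyOf", p),    # ties the occurrence to :p
        (sp, ":hasWeight", weight),       # the annotation itself
    }


def edge_weights(graph, s, p, o):
    """Follow s p o => s sp o => sp subPropertyOf p => sp :hasWeight w."""
    weights = []
    for (s2, sp, o2) in graph:
        if s2 == s and o2 == o and (sp, "rdfs:subPropertyOf", p) in graph:
            weights.extend(
                w for (x, q, w) in graph if x == sp and q == ":hasWeight"
            )
    return weights


graph = singleton_edge(":a", ":linksTo", ":b", "_:pi1", 1.0) | singleton_edge(
    ":a", ":linksTo", ":c", "_:pi2", 2.5
)

found = edge_weights(graph, ":a", ":linksTo", ":b")
topology_triples = [t for t in graph if t[1] != ":hasWeight"]
print(found, len(topology_triples))
```

With two edges the topology alone takes six triples, which is the threefold
volume mentioned in the message, and each weight lookup needs two further
joins beyond finding the edge.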
>> >>
>> ------------------------------------------------------------------------------
>> >> *From:* Peter F. Patel-Schneider <pfpschneider@gmail.com>
>> >> *Sent:* Tuesday, April 30, 2024 10:40:18 AM
>> >> *To:* public-rdf-star-wg@w3.org
>> >> *Subject:* RE: [EXTERNAL] The singleton property option
>> >> CAUTION: This email originated from outside of the organization. Do
>> not click links or open attachments unless you can confirm the sender and
>> know the content is safe.
>> >> I think that this is far too strong.   The singleton property approach
>> >> has problems, but not to this extent.
>> >>
>> >> For any statement that does not require annotation, the singleton
>> >> property approach does not require any changes at all, i.e.,  just use
>> >>
>> >> :s :p :o .
>> >>
>> >> For a statement that does require annotation, the singleton property
>> >> requires two or three triples, one to make the blank node a subproperty
>> >> of the desired property, one to state the relationship using the blank
>> >> node, and, if the RDF system does not implement RDFS semantics, one to
>> >> make the statement using the regular property, i.e.,
>> >>
>> >> _:pi rdfs:subPropertyOf :p .
>> >> :s _:pi :o .
>> >> :s :p :o.
>> >>
>> >> The added storage for this might be less than that needed for efficient
>> >> processing of quoted triples, particularly if the third statement is
>> >> not needed.
>> >>
>> >> There is no need to change modelling if the statement is annotated
>> >> after the fact.
>> >>
>> >> peter
>> >> On 4/30/24 12:26, Thompson, Bryan wrote:
>> >>> The singleton property approach undermines the direct use of
>> >>> predicates in statements and forces a second hop for any use case to
>> >>> determine the actual predicate used.  It also requires that the
>> >>> "statement" is modeled differently in advance, thus increasing the
>> >>> space requirements even if no statements about statements are used.
>> >>>
>> >>> This is not efficient.
>> >>>
>> >>> Effectively, the singleton property model says that the RDF triple is
>> >>> wrong. It says that you should model using (S ID O) and then model the
>> >>> predicate and other information as statements about that ID.  This is
>> >>> not the RDF model.
>> >>>
>> >>> The approach with Statements about Statements should IMHO be built on
>> >>> (S P O ID).  That is, there is a unique identifier for the SPO and you
>> >>> make statements about that statement ID.
>> >>>
>> >>>
>> >>> Bryan
>> >>>
>> >>>
>> ------------------------------------------------------------------------------
>> >>> *From:* Thomas Lörtsch <tl@rat.io>
>> >>> *Sent:* Tuesday, April 30, 2024 12:02:21 AM
>> >>> *To:* public-rdf-star-wg@w3.org; Thompson, Bryan; Niklas Lindström;
>> RDF-star
>> >>> Working Group
>> >>> *Subject:* RE: [EXTERNAL] The singleton property option
>> >>>
>> >>> *CAUTION*: This email originated from outside of the organization. Do
>> >>> not click links or open attachments unless you can confirm the sender
>> >>> and know the content is safe.
>> >>>
>> >>>
>> >>> Bryan,
>> >>>
>> >>> Niklas combines the RDF-star syntax with the semantics of Singleton
>> >>> Properties. AFAIK no implementations of or papers on Singleton
>> >>> Properties have done that. This combination doesn't even require an
>> >>> index on properties.
>> >>>
>> >>> This combination is nearer to the original RDR approach than anything
>> >>> else discussed by CG and WG. It is IMO a very neat idea and deserves a
>> >>> closer look.
>> >>>
>> >>> Thomas
>> >>>
>> >>>
>> >>>
>> >>> On 29 April 2024 19:06:37 CEST, "Thompson, Bryan" <bryant@amazon.com>
>> >>> wrote:
>> >>>
>> >>>      The singleton property approach has many downsides and is
>> >>>      pragmatically unworkable.  There is a good reason people are not
>> >>>      happy with this approach.
>> >>>
>> >>>
>> >>>      Bryan
>> >>>
>> >>>
>> ------------------------------------------------------------------------------
>> >>>      *From:* Niklas Lindström <lindstream@gmail.com>
>> >>>      *Sent:* Friday, April 26, 2024 2:08:41 PM
>> >>>      *To:* RDF-star Working Group
>> >>>      *Subject:* [EXTERNAL] The singleton property option
>> >>>
>> >>>
>> >>>
>> >>>      For completeness (and perhaps to widen the perspective), here is
>> >>>      the singleton property option I briefly mentioned on the semantics
>> >>>      call (and alluded to in [1]). Also see [2] for the original; this is
>> >>>      just a quick strawman adaptation for the benefit of the LPG
>> >>>      perspective.
>> >>>
>> >>>      It extends RDF 1.1 differently; no triple terms, no opacity, just:
>> >>>
>> >>>      1. Allow bnodes as predicates (blank predicates).
>> >>>      2. Define rdf:singletonPropertyOf for linking those to the property
>> >>>      they represent instances/occurrences/edges of.
>> >>>
>> >>>      3. Well-formedness conditions:
>> >>>      3.1 Bnode predicates are only to be used once; with one s and o
>> >>>      (similar to list cons nodes, which are "single purposed").
>> >>>      3.2 The rdf:singletonPropertyOf is semantically functional
>> >>>      (exactly like rdf:first and rdf:rest).
>> >>>
>> >>>      4. For optimization, implementations can put triples with blank
>> >>>      predicates in a dedicated table (using edgename as unique key),
>> >>>      relying on well-formedness for cohesion. Such a table is completed
>> >>>      in two steps: 1) the singleton assertion inserts s and o for
>> >>>      edgename; 2) the rdf:singletonPropertyOf assertion inserts p for
>> >>>      edgename. If well-formedness is broken, all optimization bets are
>> >>>      off. Perhaps a dedicated skolemization scheme can be employed for
>> >>>      some more control and/or "unstarring".
>> >>>
>> >>>      5. RDF-star syntax obviously needs no naming syntax; naming these
>> >>>      would break well-formedness.
>> >>>      6. What these *mean* of course needs a good definition (property
>> >>>      specializations, edge type instances or similar). Are they
>> >>>      asserted? Sure. Do they assert something using their
>> >>>      rdf:singletonPropertyOf property as predicate? No. (Could they?
>> >>>      Well, they can be declared ("inline") to *also* be subPropertyOf
>> >>>      the same property, and through entailment that would happen.)
>> >>>      7. Reifiers become a usage pattern (informative) as suggested from
>> >>>      the property edge perspective. Any desired :reifiedBy or :partOf
>> >>>      relation can link predicate singletons to one or more "reifiers".
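
The dedicated table from point 4 could look roughly like the following Python sketch. Everything here is an assumption made for illustration: terms are plain strings, blank nodes are recognised by the `_:` prefix, the class name `EdgeTable` is hypothetical, and raising `ValueError` on a well-formedness violation is just one possible reaction (the proposal only says that optimization bets are off in that case).

```python
SINGLETON_PROPERTY_OF = "rdf:singletonPropertyOf"


def is_bnode(term):
    return term.startswith("_:")


class EdgeTable:
    """Rows keyed by edgename; each row is completed in two steps."""

    def __init__(self):
        self.rows = {}  # edgename -> {"s": ..., "p": ..., "o": ...}

    def _row(self, edgename):
        return self.rows.setdefault(edgename, {"s": None, "p": None, "o": None})

    def add(self, s, p, o):
        """Return True if the triple was absorbed into the edge table."""
        if is_bnode(p):
            # Step 1: the singleton assertion inserts s and o for edgename.
            row = self._row(p)
            if row["s"] is not None and (row["s"], row["o"]) != (s, o):
                raise ValueError(f"{p} used with more than one s/o pair (3.1)")
            row["s"], row["o"] = s, o
            return True
        if p == SINGLETON_PROPERTY_OF and is_bnode(s):
            # Step 2: the rdf:singletonPropertyOf assertion inserts p.
            row = self._row(s)
            if row["p"] not in (None, o):
                raise ValueError(f"{s} linked to more than one property (3.2)")
            row["p"] = o
            return True
        return False  # ordinary triple; belongs in the regular store


table = EdgeTable()
table.add(":s", "_:e1", ":o")                     # step 1
table.add("_:e1", SINGLETON_PROPERTY_OF, ":p")    # step 2
```

The two steps can arrive in either order; the row is complete once both have been seen.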
>> >>>
>> >>>      Basic example:
>> >>>
>> >>>           << :s :p :o >> :source <stream662be7ba> ;
>> >>>               :timestampMills 1714153402 .
>> >>>
>> >>>      Expands to:
>> >>>
>> >>>           :s _:e1 :o .
>> >>>           _:e1 rdf:singletonPropertyOf :p ;
>> >>>               :source <stream662be7ba> ;
>> >>>               :timestampMills 1714153402 .
>> >>>
>> >>>      Annotation syntax:
>> >>>
>> >>>           :s :p :o {| :reifiedBy <#reifier> |} .
>> >>>
>> >>>      Expands to:
>> >>>
>> >>>           :s :p :o .
>> >>>           :s _:e1 :o .
>> >>>           _:e1 rdf:singletonPropertyOf :p ;
>> >>>             :reifiedBy <#reifier> .
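
The two expansions above differ only in whether the plain `:s :p :o` triple is also emitted. A small Python sketch of that rule, assuming string terms and a caller-supplied edge label; the function name `expand` and its `asserted` flag are hypothetical and not taken from the proposal:

```python
def expand(s, p, o, annotations, edge, asserted):
    """Expand an annotated statement into singleton-property triples.

    asserted=False models the << :s :p :o >> form,
    asserted=True models the {| ... |} annotation syntax.
    """
    triples = []
    if asserted:
        triples.append((s, p, o))                        # :s :p :o .
    triples.append((s, edge, o))                         # :s _:e1 :o .
    triples.append((edge, "rdf:singletonPropertyOf", p))
    triples.extend((edge, prop, value) for prop, value in annotations)
    return triples


annotated = expand(":s", ":p", ":o",
                   [(":reifiedBy", "<#reifier>")], "_:e1", asserted=True)
quoted = expand(":s", ":p", ":o",
                [(":source", "<stream662be7ba>")], "_:e1", asserted=False)
```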
>> >>>
>> >>>      Possible singleton property entailment?:
>> >>>
>> >>>           _:e1 a rdf:SingletonProperty;
>> >>>               rdf:subject :s ;
>> >>>               rdf:predicate :p ;
>> >>>               rdf:object :o .
>> >>>
>> >>>      Will entailment break well-formedness if (accidentally?) *put back*
>> >>>      into a regular graph? Of course, just as RDF lists are "broken"
>> >>>      whenever that happens (as in look terrible when serialized, make no
>> >>>      sense when queried, etc.).
>> >>>
>> >>>      Best regards,
>> >>>      Niklas
>> >>>
>> >>>      [1]: <https://lists.w3.org/Archives/Public/public-rdf-star-wg/2024Apr/0158.html>
>> >>>      [2]: <https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4350149/>
>> >>>
>> >
>>
>>
>>
>
Received on Monday, 27 May 2024 06:49:49 UTC