Re: [External] : example showing why rdf:state is essential from Souripriya Das on 2024-08-16 (public-rdf-star-wg@w3.org from August 2024)

From: Souripriya Das <souripriya.das@oracle.com>
Date: Fri, 16 Aug 2024 03:35:14 +0000
To: Gregg Kellogg <gregg@greggkellogg.net>, Thomas Lörtsch <tl@rat.io>
CC: "public-rdf-star-wg@w3.org" <public-rdf-star-wg@w3.org>
Message-ID: <CY5PR10MB607171B83FD1EA4B8ACBA23CFA812@CY5PR10MB6071.namprd10.prod.outlook.com>
Here are the main points that drive my thoughts:
1) asserted data is the norm, reifications are the exception (see, for example, LPG/relational and even RDF1.1 data) – we need to make associating an id with asserted data as easy as possible
2) it is very important to have a one-to-one mapping when converting (asserted) data between LPG/relational and RDF1.2 (at the abstract syntax/N-Triples level, not just Turtle) – an LPG edge or binary relationship should be equivalent to exactly one RDF1.2 triple, not more than one – which does not work if rdf:reifies is used, because that requires the presence of the asserted s-p-o triple as well (e.g., :s :p :o . :r rdf:reifies <<( :s :p :o )>> . vs. only having :r rdf:id <<( :s :p :o )>> .)
3) the less the user has to deal with terms like reifies/reification, the better it will be for RDF adoption from a competitive point of view.
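
A minimal sketch of point 2, assuming a toy model in which a graph is a set of Python tuples and a triple term is a nested tuple. The helper names and the :e1/:e2 edge ids are hypothetical, and rdf:id is only the property name proposed in this thread, not a standardized term.

```python
# Toy model (an assumption for illustration): a graph is a set of 3-tuples,
# and a triple term is a nested 3-tuple used in the object position.

def edges_with_reifies(edges):
    """Map each LPG edge (edge_id, s, p, o) to an asserted triple plus an rdf:reifies triple."""
    graph = set()
    for edge_id, s, p, o in edges:
        graph.add((s, p, o))                            # the asserted triple
        graph.add((edge_id, "rdf:reifies", (s, p, o)))  # the reifying triple
    return graph

def edges_with_id(edges):
    """Map each LPG edge to exactly one triple using the proposed rdf:id property."""
    return {(edge_id, "rdf:id", (s, p, o)) for edge_id, s, p, o in edges}

# Two hypothetical LPG edges with distinct endpoints.
edges = [
    (":e1", ":Bob", ":workedFor", ":A"),
    (":e2", ":Bob", ":workedFor", ":B"),
]

reifies_graph = edges_with_reifies(edges)
id_graph = edges_with_id(edges)
print(len(edges), len(reifies_graph), len(id_graph))
```

Under these assumptions the rdf:id mapping yields one triple per edge, while the rdf:reifies mapping yields two.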

Regarding rdf:id and the fact that many-to-many semantics allows the same "identifier" for multiple triple-terms: that thought occurred to me as well, but I think the same objection applies to rdf:reifies, since it is odd in exactly the same way that a single reifier could reify multiple triple-terms. So, rdf:reifies is no better than rdf:id in that regard. Moreover, rdf:reifies does not provide a one-to-one mapping for asserted data, as mentioned above. Additionally, to a regular user, "reifies" is a scary term. If we are looking for alternatives, how about any of the following candidates: rdf:asserts, rdf:states, rdf:identifies, rdf:contains, rdf:includes?

Thanks,
Souri.
________________________________
From: Gregg Kellogg <gregg@greggkellogg.net>
Sent: Thursday, August 15, 2024 8:41 PM
To: Thomas Lörtsch <tl@rat.io>
Cc: public-rdf-star-wg@w3.org <public-rdf-star-wg@w3.org>
Subject: Re: [External] : example showing why rdf:state is essential

On Aug 15, 2024, at 3:30 PM, Thomas Lörtsch <tl@rat.io> wrote:



On 15 August 2024 at 23:49:53 CEST, Gregg Kellogg <gregg@greggkellogg.net> wrote:

Gregg Kellogg
gregg@greggkellogg.net

On Aug 15, 2024, at 12:15 PM, Souripriya Das <souripriya.das@oracle.com> wrote:

I did some re-thinking based on the comments I heard during today's meeting. Since our main (and only?) goal is to allow data creators to easily associate an id to a triple so that they can use it as subject or object of other triples (and also, support parallel edges), we can replace the rather meaningful (unfortunately) and hence confusing property name, rdf:reifies, with rdf:id – something that exactly satisfies our original goal (without venturing beyond).

To me, a term such as rdf:id suggests a unique identifier for a triple, rather than an identifier that is associated with a triple, along with potentially others. I believe the rdf:reifies predicate captures the notion that a reifier reifies a triple, as may other reifiers.

I agree. On my walk home I pondered if rdf:mentions might be a nice enough term instead of rdf:reifies (and along with rdfs:states and possibly rdf:quotes). It expresses what reification does, but in a simpler, less intimidating way.

So, suppose that RDF1.2 adds built-in support for the rdf:id property and triple-terms (only for use with rdf:id). Anything beyond this in this context is up to the data creator. SPARQL does not do anything other than pattern matching for it (although it may provide some shortcuts just for convenience). Note that other data models have built-in support for "asserted" data only. Even with RDF1.2, I'd expect use of reification to be rare or infrequent.

With this rdf:reifies -> rdf:id change, the example in my previous email becomes simple and would have no limitations and most importantly, cause no confusion for users.

# mapping from relational data: one-to-one
:stint1 rdf:id <<( :Bob :workedFor :A )>> . # S1
:stint2 rdf:id <<( :Bob :workedFor :B )>> . # S2
:stint3 rdf:id <<( :Bob :workedFor :A )>> . # S3

# R4 is marked as "Unreliable", a user terminology, using an extra triple – there is no interference from any of the pre-existing triples
:stint4 rdf:id <<( :Bob :workedFor :B )>> . # R4
:stint4 rdf:type :Unreliable .
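
A rough sketch of how the example above could be queried, again assuming a toy model where the graph is a set of Python tuples and a triple term is a nested tuple. The helper names are hypothetical, and rdf:id is the property proposed in this thread rather than a standardized term.

```python
RDF_ID = "rdf:id"  # proposed in this thread; not a standardized property

# The five triples from the example, with triple terms as nested tuples.
graph = {
    (":stint1", RDF_ID, (":Bob", ":workedFor", ":A")),
    (":stint2", RDF_ID, (":Bob", ":workedFor", ":B")),
    (":stint3", RDF_ID, (":Bob", ":workedFor", ":A")),
    (":stint4", RDF_ID, (":Bob", ":workedFor", ":B")),
    (":stint4", "rdf:type", ":Unreliable"),
}

def ids_for(graph, triple_term):
    """Return every id associated with a triple term (parallel edges give several)."""
    return sorted(s for s, p, o in graph if p == RDF_ID and o == triple_term)

def unreliable_ids(graph):
    """Return the ids that the data creator typed as :Unreliable."""
    return sorted(s for s, p, o in graph if p == "rdf:type" and o == ":Unreliable")

stints_at_a = ids_for(graph, (":Bob", ":workedFor", ":A"))
stints_at_b = ids_for(graph, (":Bob", ":workedFor", ":B"))
flagged = unreliable_ids(graph)
print(stints_at_a, stints_at_b, flagged)
```

In this sketch the two stints at :A remain distinct ids for the same triple term, and the :Unreliable marking touches only :stint4.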

I’m not bothered by having other triples provide such nuance. As I noted on today’s call (which may have gone unnoticed), I liken these arguments to the RISC vs. CISC schools of CPU architecture, where the RISC paradigm uses many simple instructions to accomplish what a CISC architecture may do with a single instruction. The “complex” instruction can be deceptively simple: it seems to be atomic, but in reality it takes many cycles to perform and may be interrupted by memory fetches. The RISC design, by contrast, breaks complex operations into primitive instructions. I view the RDF Abstract Algebra as a reduced instruction set for RDF, which “higher level” languages such as Turtle compile into.

Rant - >

Have you noticed Andy's latest proposal:

:s :p :o {| a rdf:Stated |}.

I admit I first missed it, but the irony that the syntax that obviously states the triple it annotates has to add an extra annotation to express that it actually states it, is just in a category of its own. I can hardly think of a sillier arrangement.

By the way, the triple count of annotating an asserted triple is now at 4 (*), worse than the CG proposal (**), and only 2 better than standard reification. After 5 years, with 3 different syntaxes, 1 new term type, and countless specs to update. RISC, really?

<- Rant

The rdf:Stated could be implicit in the annotation syntax and automatically emitted by the parser. I think Andy is arguing that using rdf:Stated serves much the same purpose as a special predicate such as rdf:states; I’m inclined to agree with this reasoning.

I don’t think that the triple counts of different solutions are particularly pertinent for the purpose of modeling.

Gregg

(*)
:s :p :o .
:id rdf:reifies <<( :s :p :o )>> .   # RDF standard reification needs 2 more
:id a rdf:Stated .
:id :y :z .


(**)
:s :p :o .
:id :occurrenceOf << :s :p :o >> . # :occurrenceOf was only informally defined
:id :y :z .
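
The triple counts discussed in the rant can be checked with a small sketch. It assumes a toy model in which a graph is a set of Python tuples and a triple term is a nested tuple. The variable names are hypothetical, and the standard reification graph is written out here from the RDF reification vocabulary (rdf:Statement, rdf:subject, rdf:predicate, rdf:object), since the thread only gives its count.

```python
S, P, O = ":s", ":p", ":o"
TERM = (S, P, O)  # the triple term, modelled as a nested tuple

# (*) annotation of an asserted triple with rdf:reifies plus rdf:Stated
latest_proposal = {
    (S, P, O),
    (":id", "rdf:reifies", TERM),
    (":id", "rdf:type", "rdf:Stated"),
    (":id", ":y", ":z"),
}

# (**) the CG proposal, with the informally defined :occurrenceOf
cg_proposal = {
    (S, P, O),
    (":id", ":occurrenceOf", TERM),
    (":id", ":y", ":z"),
}

# Standard reification: the asserted triple, four reification triples, one annotation
standard_reification = {
    (S, P, O),
    (":id", "rdf:type", "rdf:Statement"),
    (":id", "rdf:subject", S),
    (":id", "rdf:predicate", P),
    (":id", "rdf:object", O),
    (":id", ":y", ":z"),
}

counts = {
    "latest": len(latest_proposal),
    "cg": len(cg_proposal),
    "standard": len(standard_reification),
}
print(counts)
```

Under these assumptions the counts match the figures cited above: 4 for (*), 3 for (**), and 6 for standard reification.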



Gregg
Received on Friday, 16 August 2024 03:35:33 UTC