Re: [External] : example showing why rdf:state is essential

> On 16. Aug 2024, at 02:41, Gregg Kellogg <gregg@greggkellogg.net> wrote:
> 
>> On Aug 15, 2024, at 3:30 PM, Thomas Lörtsch <tl@rat.io> wrote:
>> 
>> 
>> 
>> On 15 August 2024 23:49:53 CEST, Gregg Kellogg <gregg@greggkellogg.net> wrote:
>>> 
>>> Gregg Kellogg
>>> gregg@greggkellogg.net
>>> 
>>>> On Aug 15, 2024, at 12:15 PM, Souripriya Das <souripriya.das@oracle.com> wrote:
>>>> 
>>>> I did some re-thinking based on the comments I heard during today's meeting. Since our main (and only?) goal is to allow data creators to easily associate an id to a triple so that they can use it as subject or object of other triples (and also, support parallel edges), we can replace the rather meaningful (unfortunately) and hence confusing property name, rdf:reifies, with rdf:id – something that exactly satisfies our original goal (without venturing beyond).
>>> 
>>> To me, a term such as rdf:id suggests a unique identifier for a triple, rather than an identifier that is associated with a triple, along with potentially others. I believe the rdf:reifies predicate captures the notion that a reifier reifies a triple, as may other reifiers.
>> 
>> I agree. On my walk home I pondered whether rdf:mentions might be a nice enough term instead of rdf:reifies (along with rdfs:states and possibly rdf:quotes). It expresses what reification does, but in a simpler, less intimidating way.
>> 
>>>> So, suppose that RDF1.2 adds built-in support for the rdf:id property and triple-terms (only for use with rdf:id). Anything beyond this in this context is up to the data creator. SPARQL does not do anything other than pattern matching for it (although it may provide some shortcuts just for convenience). Note that other data models have built-in support for "asserted" data only. Even with RDF1.2, I'd expect use of reification to be rare or infrequent.
>>>> 
>>>> With this rdf:reifies -> rdf:id change, the example in my previous email becomes simple and would have no limitations and most importantly, cause no confusion for users.
>>>> 
>>>> # mapping from relational data: one-to-one
>>>> :stint1 rdf:id <<( :Bob :workedFor :A )>> . # S1
>>>> :stint2 rdf:id <<( :Bob :workedFor :B )>> . # S2
>>>> :stint3 rdf:id <<( :Bob :workedFor :A )>> . # S3
>>>> 
>>>> # R4 is marked as "Unreliable", a user terminology, using an extra triple – there is no interference from any of the pre-existing triples
>>>> :stint4 rdf:id <<( :Bob :workedFor :B )>> . # R4
>>>> :stint4 rdf:type :Unreliable .
>>> 
>>> I’m not bothered by having other triples be used to provide such nuance. As I noted on today’s call (which may have gone unnoticed), I liken the argument to the RISC vs. CISC schools of CPU architecture, where the RISC paradigm uses many simple instructions to accomplish what a CISC architecture may do with a single instruction. The “complex” instruction can be deceptively simple: it seems to be atomic, but in reality it takes many cycles to perform and may be interrupted by memory fetches, whereas the RISC design breaks complex operations into primitive instructions. I view the RDF Abstract Algebra as a reduced instruction set for RDF, which “higher-level” languages such as Turtle compile into.
>> 
>> Rant - >
>> 
>> Have you noticed Andy's latest proposal:
>> 
>> :s :p :o {| a rdf:Stated |}. 
>> 
>> I admit I first missed it, but the irony that the syntax that obviously states the triple it annotates has to add an extra annotation to express that it actually states it, is just in a category of its own. I can hardly think of a sillier arrangement.
>> 
>> By the way, the triple count of annotating an asserted triple is now at 4 (*), worse than the CG proposal (**), and only 2 better than standard reification. After 5 years, with 3 different syntaxes, 1 new term type, and countless specs to update. RISC, really? 
>> 
>> <ran.t
> 
> The rdf:Stated could be implicit in the annotation syntax and automatically emitted by the parser. I think Andy is arguing that using rdf:Stated serves much the same purpose as a special predicate such as rdf:states; I’m inclined to agree with this reasoning.

Sure, that’s a way to do it. You can always throw one more triple at a problem, but you’re also asking a lot of users, and parsers, and query rewriters. And you also reintroduce a problem that Andy was so eager to abolish: the dependence of a reification on multiple statements.
This reminds me a lot of the type-occurrence discussion that dragged on for years, until it was finally understood that all those extra statements and TEP mechanisms were too complex, and the underlying problem too fundamental to be tackled by yet another level of indirection. The difference between an annotation that is meant to refer to a statement that is true in a graph and one that is agnostic about, or even opposed to, its truth is fundamental. A RISC design has to get the fundamentals right. Elegance comes from providing just the right amount of everything, not more, not less.

> I don’t think that the triple counts of different solutions are particularly pertinent for the purpose of modeling.

Well, I think some people might call that a little careless and superficial.
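
For what it's worth, the counts I gave in footnotes (*) and (**) can be tallied mechanically. The following is only a rough sketch: triples are plain Python tuples, the strings in angle brackets merely stand in for triple terms, and the names ASSERTED, ANNOTATION, approaches and counts are made up for illustration. The standard reification entry is the usual RDF 1.1 quad of rdf:type, rdf:subject, rdf:predicate and rdf:object.

```python
# Tally the triples each approach needs to annotate one asserted triple.
# Plain tuples only; no RDF library is involved.

ASSERTED = (":s", ":p", ":o")      # the triple being annotated
ANNOTATION = (":id", ":y", ":z")   # the annotation itself

approaches = {
    "rdf:reifies + rdf:Stated (*)": [
        ASSERTED,
        (":id", "rdf:reifies", "<<( :s :p :o )>>"),
        (":id", "rdf:type", "rdf:Stated"),
        ANNOTATION,
    ],
    "CG :occurrenceOf (**)": [
        ASSERTED,
        (":id", ":occurrenceOf", "<< :s :p :o >>"),
        ANNOTATION,
    ],
    "RDF 1.1 standard reification": [
        ASSERTED,
        (":id", "rdf:type", "rdf:Statement"),
        (":id", "rdf:subject", ":s"),
        (":id", "rdf:predicate", ":p"),
        (":id", "rdf:object", ":o"),
        ANNOTATION,
    ],
}

# Number of triples per approach
counts = {name: len(triples) for name, triples in approaches.items()}

for name, n in counts.items():
    print(f"{name}: {n} triples")
```

That gives 4, 3 and 6 triples respectively, which is where "worse than the CG proposal, and only 2 better than standard reification" comes from.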

Thomas


> Gregg
> 
>> (*) 
>> :s :p :o .
>> :id rdf:reifies <<( :s :p :o )>> .   # RDF standard reification needs 2 more
>> :id a rdf:Stated .
>> :id :y :z .
>> 
>> 
>> (**) 
>> :s :p :o .
>> :id :occurrenceOf << :s :p :o >> . # :occurrenceOf was only informally defined
>> :id :y :z .
>> 
>> 
>> 
>>> Gregg

Received on Monday, 19 August 2024 10:21:28 UTC