Re: RDF star and LPGs

Hi Greg,

> On 19. Jul 2024, at 02:53, Gregory Williams <greg@evilfunhouse.com> wrote:
> 
> I mentioned in the call today that I had concerns about the “RDF star and LPGs” document[1] that was linked on IRC. I didn’t find any previous mention of that document in the mailing list archives, so unsure if it has been discussed previously (perhaps on a previous call?). Overall, I'm skeptical of the mapping presented in the document, and especially as the basis for making arguments that the current baseline proposal is compatible with LPG use-cases that have been previously discussed.
> 
> (I may be making assumptions about the mapping here, but since it is presented in the form of a single, simple example, I guess that’s to be expected. Similarly, I may be reading a lot into the “<—>” mapping indicator used between the example LPG and RDF data; please let me know if I’ve misunderstood the intention.)
> 
> 
> 
> I think on its face the RDF-LPG mapping presented does not preserve the intended semantics. To me, the intent of the LPG syntax is pretty clearly to assert that there are two transactions that occurred. They exist!
> 
> ```
> CREATE (a1)-[:TRANSACTION {amount: 1000, currency: "gbp", date: "2002-09-24Z"}]->(a2)
> CREATE (a2)-[:TRANSACTION {amount: 900, currency: "gbp", date: "2002-10-03Z"}]->(a3)
> ```
> 
> But the mapped RDF does *not* define any transactions. There are no `:TRANSACTION` triples in the graph – that predicate only exists inside triple terms. So maybe they are hypothetical transactions?
> 
> ```
> << :e1 | :a1 :TRANSACTION :a2 >>
>   :amount 1000 ;
>   :currency "gbp" ;
>   :date "2002-09-24Z"^^xsd:date .
> << :e2 | :a2 :TRANSACTION :a3 >>
>   :amount 900 ;
>   :currency "gbp" ;
>   :date "2002-10-03Z"^^xsd:date .
> ```
> 
> Furthermore, the mapping does not appear to be a bijection, and cannot be round-tripped. What happens to those triple terms if you try to map back to LPG? Do they disappear, as they are not asserted in the RDF graph? If, instead, they are mapped back into actual `:TRANSACTION` edges, how would that be different from mapping similar RDF that also asserted the `:TRANSACTION` triples in the graph (in addition to the triple terms)? I think this starts to touch on the point that Thomas has been making about asserted vs. unasserted assertions and the difficulty of modeling that difference in our current formalization.

Yes, partly it touches the points that I’m making about the problem with the current handling of asserted and unasserted assertions. However, the obvious solution to the problem as you describe it seems to be to add two triples to the graph:

    :a1 :TRANSACTION :a2 .
    :a2 :TRANSACTION :a3 .

The graph then is:

    :a1 a :Account ; 
        :accountNumber 1 .
    :a2 a :Account ;
        :accountNumber 2 .
    :a3 a :Account ;
        :accountNumber 3 .

    :a1 :TRANSACTION :a2 .
    << :e1 | :a1 :TRANSACTION :a2 >>
        :amount 1000 ;
        :currency "gbp" ;
        :date "2002-09-24Z"^^xsd:date .

    :a2 :TRANSACTION :a3 .
    << :e2 | :a2 :TRANSACTION :a3 >>
        :amount 900 ;
        :currency "gbp" ;
        :date "2002-10-03Z"^^xsd:date .

I don’t know why the example doesn’t contain those triples already, as IIUC they would be a required part of a mapping. Would this solve your problem? Would this solution also be safe when round-tripping back and forth?
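
To make the round-tripping question concrete, here is a minimal Python sketch of the simple case. It is not the mapping from the wiki page: the function names and the data shapes (edges as a dict keyed by edge identifier, triples as plain tuples) are assumptions made only for illustration. Each LPG edge becomes one asserted triple plus one reifier carrying the edge properties, and only reifiers whose triple is also asserted are mapped back to edges.

```python
# Hypothetical sketch; names and data shapes are illustrative assumptions.

def lpg_to_rdf(edges):
    """Map LPG edges to a set of asserted triples plus reifier records.

    edges: dict edge_id -> (source, label, target, properties)
    """
    asserted = set()
    reifiers = {}
    for edge_id, (src, label, dst, props) in edges.items():
        triple = (src, label, dst)
        asserted.add(triple)                       # :a1 :TRANSACTION :a2 .
        reifiers[edge_id] = (triple, dict(props))  # << :e1 | ... >> :amount 1000 .
    return asserted, reifiers


def rdf_to_lpg(asserted, reifiers):
    """Map back: a reifier becomes an edge only if its triple is asserted."""
    edges = {}
    for edge_id, (triple, props) in reifiers.items():
        if triple in asserted:
            src, label, dst = triple
            edges[edge_id] = (src, label, dst, dict(props))
    return edges


edges = {
    "e1": ("a1", "TRANSACTION", "a2",
           {"amount": 1000, "currency": "gbp", "date": "2002-09-24Z"}),
    "e2": ("a2", "TRANSACTION", "a3",
           {"amount": 900, "currency": "gbp", "date": "2002-10-03Z"}),
}
asserted, reifiers = lpg_to_rdf(edges)
round_tripped = rdf_to_lpg(asserted, reifiers)
```

Under these assumptions the round trip returns the original edges, while the same reifiers without the asserted triples map to no edges at all, which is the situation you describe.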

AFAICT it should work in simple cases, but it doesn’t address other problems. The solution I propose - introducing two completely distinct primitives to represent asserted and unasserted assertions - would solve your problem more directly and robustly, besides addressing some missing parts for more intricate cases.
 
> Under “OBSERVATIONS”:
> 
> > Reifiers denote edge identifiers.
> I think this suggests that you'd end up with an empty graph if you tried to map RDF without any reifiers back to LPG (or at least a graph with no edges).

No, I don’t think so. RDF without reifiers should map directly to LPG edges that are not annotated.
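
As a rough illustration of what I mean, the following Python sketch maps asserted triples to LPG edges whether or not a reifier exists. The function name and the edge representation (a dict with `id`, `edge` and `props`) are hypothetical. Triples with a reifier become annotated edges identified by the reifier; all other asserted triples become edges with no identifier and no properties.

```python
# Hypothetical sketch; the edge representation is an illustrative assumption.

def rdf_to_lpg_edges(asserted, reifiers):
    """Map asserted triples to LPG edges, annotated where a reifier exists.

    asserted: set of (subject, predicate, object) triples
    reifiers: dict reifier_id -> (triple, properties)
    """
    edges = []
    annotated = set()
    for reifier_id, (triple, props) in sorted(reifiers.items()):
        if triple in asserted:
            edges.append({"id": reifier_id, "edge": triple, "props": dict(props)})
            annotated.add(triple)
    # Asserted triples without any reifier become plain, un-annotated edges.
    for triple in sorted(asserted - annotated):
        edges.append({"id": None, "edge": triple, "props": {}})
    return edges


asserted = {("a1", "TRANSACTION", "a2"), ("a2", "TRANSACTION", "a3")}

# RDF without any reifiers: every triple still yields an edge.
plain_only = rdf_to_lpg_edges(asserted, {})

# One reifier present: one annotated edge, one plain edge.
mixed = rdf_to_lpg_edges(
    asserted, {"e1": (("a1", "TRANSACTION", "a2"), {"amount": 1000})}
)
```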

> That doesn't seem viable, as all existing RDF data prior to rdf-star would be entirely inaccessible based on this mapping. I am nervous about any one-way mapping, as it has implications for the sorts of systems you can build and the use-cases that they can address.

That is a valid concern, but it comes with a caveat: RDF has facilities that LPGs don’t cover, so there is always a potential for lossiness. However, we should make sure that data mapped from LPG to RDF which isn’t changed on the RDF side - e.g. no entailments, no substitution of co-denoting terms, etc - can be mapped back to LPG without any loss or change.

> And I’m similarly nervous about basing support for the baseline proposal on this document as an assurance that we are now able to capture LPG use cases.

The discussion so far concentrated on many-to-many mappings, and in that respect I agree with the design of the minimal baseline: adding a provision to ensure many-to-one mappings would cut too deep into the design and workings of RDF (by requiring referential opacity) to be feasible. OTOH it requires much less effort on the LPG side to work around that precious oddity of RDF. So I think in this respect the baseline proposal is well balanced and sufficiently practical for both worlds.
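
To illustrate what such a workaround on the LPG side could look like, here is a hypothetical Python sketch. It assumes, purely for illustration, that a reifier which ended up referring to more than one triple is split into one LPG edge per triple, each carrying its own copy of the annotations. Neither the function name nor this splitting policy is taken from the baseline proposal or the wiki page, and the example data is made up.

```python
# Hypothetical sketch of one possible LPG-side workaround; not part of any spec.

def split_reifier_edges(reifiers):
    """Turn each (reifier, triple) pair into a separate LPG edge.

    reifiers: dict reifier_id -> (set of triples, properties)
    Returns a list of (reifier_id, source, label, target, properties).
    """
    edges = []
    for reifier_id, (triples, props) in sorted(reifiers.items()):
        for src, label, dst in sorted(triples):
            # Each edge gets an independent copy of the annotations.
            edges.append((reifier_id, src, label, dst, dict(props)))
    return edges


# Illustrative data: one reifier that refers to two triples (many-to-many).
reifiers = {
    "e1": (
        {("a1", "TRANSACTION", "a2"), ("a2", "TRANSACTION", "a3")},
        {"currency": "gbp"},
    ),
}
edges = split_reifier_edges(reifiers)
```

The resulting edges share the reifier identifier, so an LPG system would still have to decide how to derive unique edge identifiers from it.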

> The mapping is just too casual, and leaves out far too many specifics to be useful in understanding if it can practically address LPG use-cases.

Practical experience with LPGs is pretty sparse among the members of the WG, so it may well be that we are missing important aspects. It would be very helpful if you provided some more detail about the missing specifics. Can you name some, or add to the use case, or provide one (or more) of your own?

Thanks,
Thomas

> 
> thanks,
> .greg
> 
> [1] https://github.com/w3c/rdf-star-wg/wiki/RDF-star-and-LPGs
> 

Received on Friday, 19 July 2024 09:52:52 UTC