- From: Holger Knublauch <holger@topquadrant.com>
- Date: Thu, 29 Oct 2020 22:00:40 +1000
- To: Pierre-Antoine Champin <pierre-antoine.champin@ercim.eu>, public-rdf-star@w3.org
- Message-ID: <c4627d2a-57ef-78a2-aff8-2fcc88c63806@topquadrant.com>
On 29/10/2020 7:55 pm, Pierre-Antoine Champin wrote:
>
> Holger,
>
> If I follow your idea that RDF* must be only syntactic sugar on top of
> RDF, it means that any RDF* graph is in fact a standard RDF graph,
> hence can be serialized as Turtle (or RDF/XML, etc...).
>
> So these long URIs representing triples MAY end up exposed to some
> systems that do not understand the Turtle* syntax.
>
> Yet, it seems to me (but I am not entirely sure) that some of your
> arguments below rely on the fact that these long URIs would only be
> used internally, and never exposed, which only works if everyone
> adopts RDF* / Turtle* / SPARQL*. In that case I would argue that,
> although you are reusing the same URI type in your implementation, the
> special treatment you add for this type of URI makes it, in practice,
> a new type of node...
>
> More comments below.
>
> On 29/10/2020 04:11, Holger Knublauch wrote:
>> On 10/28/2020 9:29 PM, Jerven Bolleman wrote:
>>
>>> Hi All,
>>>
>>> Yes it can be defined as syntactic sugar, and IMO it should.
>>
>> I agree. And this is not merely a matter of allowing RDF* to be
>> implemented by special IRIs as one implementation strategy among
>> others, but it should become the only permitted implementation
>> strategy. The reason is that operators such as isIRI() need to work
>> consistently across implementations.
> Agreed
>> isIRI() should return true for embedded triples.
> I don't see this as a requirement for RDF* in general!
>> There should not be a new RDF node type because that is simply not
>> needed. No application should break because it encounters RDF*
>> graphs. Whether it can make sense and interpret them as
>> "reifications" (e.g. for display purposes) should be for them to
>> decide incrementally. But by default they are just URIs to them.
> Ok, so we agree that long URIs could be exposed out of the system that
> generated them.
>>
>> Having said this, the spec could remain vague about how these long
>> URIs are formed.
>
> I disagree. If RDF* is really just syntactic sugar, then an RDF* graph
> should be serializable in Turtle* or in Turtle indifferently, without
> loss of information. So two different RDF* systems should be able to
> exchange RDF* graphs using Turtle*, /but also/ using Turtle.
>
> I don't see how this could happen if the "encoding" of embedded
> triples into URIs is not totally specified.
>

I would be ok with that.

>
>> This includes the question of how blank nodes are represented. I
>> would simply map them to the serialization of the internal IDs. The
>> APIs and SPARQL should provide a function to distinguish these
>> special IRIs from "normal" ones, and then functions to extract
>> subject, predicate and object, and vice versa. See
>> http://datashapes.org/reification.html#tosh for an example
>> implementation.
>>
>> A minimum practice that the spec could prescribe is that these IRIs
>> need to start with, say, "urn:triple:" but this is mainly to make
>> sure that no other applications use those URIs by accident (which is
>> really unlikely in practice but still...) and for human readers who
>> stumble upon them in raw form.
>>
>>> Considering in the beginning that RDF* was defined in terms of RDF
>>> reification. And it can be implemented that way e.g. as POC done for
>>> rdflib.
>>>
>>> Which gets us back to the reason what do we want to achieve with
>>> RDFstar? I think most of us want to have reification without the
>>> hassle of writing out a quad all the time. Plus having optimizations
>>> possible at the storage level/query engine.
>>>
>>> A solution for the triples with all IRI and/or Literals is simply
>>> generate an implicit IRI for them.
>>>
>>> For the triples with a blank node I think the simplest is to
>>> generate an implicit new blank node for them. (Which would be
>>> skolemized in some form in any triple store anyway).
>>
>> Not sure why this is necessary. Why not use IRIs? A typical scenario
>> would be:
>>
>> 1. RDF* file gets loaded into graph store
>> 2. Graph store selects its new internal IDs for blank nodes
>> 3. References to bnodes from embedded triples use these same (new)
>>    IDs. All good.
>> 4. RDF* file gets saved back.
>> 5. The cycle repeats on another machine that loads this new file.
>
> This is all very good if only Turtle* (or another RDF*-aware syntax)
> is used.
>
> If at step 4 you serialize the RDF* graph in, say, Turtle, then the
> URI encoding your internal ID is leaked to the outside world.
>
> Even worse: another embedded triple, from another system using the
> same internal ID, would be generated with the same IRI (even if the
> two bnodes represent totally different things).
>

So why would this be worse than a solution that introduces a new node
type for triples: These cannot even be written to Turtle at all.

Holger

>>
>> While such an RDF* graph is in memory (or database storage) the
>> actual IDs don't matter to anyone, and shouldn't be relied on by any
>> external graph. This is already the situation for blank nodes now -
>> they are anonymous nodes within the current graph only.
>>
>> Likewise no external graph should rely on the specific syntax of
>> these (possibly long) IRIs - they are only accessed and used via
>> SPARQL* operators and corresponding functions.
>>
>>> (...)
>>>
>>> The problem here becomes, lack of a WG means we don't have a good way
>>> to determine consensus and actually record a decision.
>>
>> A while ago there was a poll on times for a first meeting. Is this
>> still the plan?
>
> We are not forgetting it. The decision should be made soon.
>
> best
>
Received on Thursday, 29 October 2020 12:00:56 UTC
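
The "urn:triple:" idea discussed in the thread can be sketched in a few lines of Python. This is only an illustration and not the encoding used by any implementation mentioned above: the prefix is the one suggested in the message, while the function names, the N-Triples-style term strings, and the choice of percent-encoding with ':' as separator are all assumptions made for the example.

```python
from urllib.parse import quote, unquote

# Prefix suggested in the thread; everything after it is a hypothetical layout.
PREFIX = "urn:triple:"


def encode_triple(subject: str, predicate: str, obj: str) -> str:
    """Pack three term strings into a single IRI (hypothetical scheme).

    Each term is percent-encoded with safe="" so that ':' can never
    appear inside an encoded term and can serve as the separator.
    """
    return PREFIX + ":".join(quote(term, safe="") for term in (subject, predicate, obj))


def is_triple_iri(iri: str) -> bool:
    """Distinguish the special IRIs from "normal" ones by their prefix."""
    return iri.startswith(PREFIX)


def decode_triple(iri: str) -> tuple[str, str, str]:
    """Extract subject, predicate and object from an encoded IRI."""
    if not is_triple_iri(iri):
        raise ValueError("not an encoded triple IRI")
    parts = iri[len(PREFIX):].split(":")
    if len(parts) != 3:
        raise ValueError("malformed encoded triple IRI")
    s, p, o = (unquote(part) for part in parts)
    return s, p, o


if __name__ == "__main__":
    triple = ("<http://example.org/alice>", "<http://example.org/age>", '"42"')
    iri = encode_triple(*triple)
    print(iri)
    print(decode_triple(iri) == triple)
```

Because the encoding is deterministic, two stores that happen to use the same internal blank node label (say `_:b0`) would produce the same IRI for unrelated embedded triples, which is the concern raised above about serializing such IRIs to plain Turtle.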