- From: Pierre-Antoine Champin <pierre-antoine.champin@ercim.eu>
- Date: Wed, 28 Oct 2020 17:36:48 +0100
- To: public-rdf-star@w3.org
- Message-ID: <5d4b293a-0f45-d8c7-ac0c-66c8fb88a464@ercim.eu>
Jerven, thanks for your feedback.

On 28/10/2020 12:29, Jerven Bolleman wrote:
> Hi All,
>
> Yes it can be defined as syntactic sugar, and IMO it should.
> Considering in the beginning that RDF* was defined in terms of RDF
> reification.

To be precise, RDF* was defined with an abstract syntax of its own, but no specific semantics (instead, a projection to RDF standard reification). What motivated us to propose a dedicated semantics was, mostly

* that standard reification has no specific semantics in standard RDF, and
* to provision for referential opacity (more on this below).

I'll discuss the second item below (as you mention it later in your message).

> And it can be implemented that way e.g. as POC done for rdflib.
>
> Which gets us back to the reason what do we want to achieve with
> RDFstar? I think most of us want to have reification without the
> hassle of writing out a quad all the time. Plus having optimizations
> possible at the storage level/query engine.
>
> A solution for the triples with all IRI and/or Literals is simply
> generate an implicit IRI for them.
>
> For the triples with a blank node I think the simplest is to generate an
> implicit new blank node for them. (Which would be skolemized in some
> form in any triple store anyway).
>
> SPARQL and RDF syntax wise this would be simple. API wise this would
> also be easy in python rdflib. We have two new subclasses of URIRef,
> and BNode basically.
>
>
> class TripleUriRef(URIRef):
>
> class TripleBnode(BNode):
>
> in java rdf4j it would probably be something like
>
> interface Triple{
> }
> interface TripleIRI extends IRI, Triple {
> }
> interface TripleBNode extends BNode, Triple {
> }
>
> Semantics stay the same, we only get a new syntax for reification.
> Question about PG|SA stays unanswered by doing this.

I agree that this strategy would probably work if it wasn't for referential opacity.
However, you would still need to specify a precise semantics for rdf:subject, rdf:predicate and rdf:object, to specify how to handle corner cases such as:

:alice :says [ rdf:subject :bob ].
:alice :says [ rdf:subject :bob, :charlie; rdf:predicate :age; rdf:object 42 ].
:alice :says [ rdf:subject :bob; rdf:predicate :age; rdf:object 42 ].
:bob :says [ rdf:subject :bob; rdf:predicate :age; rdf:object 42 ].

>
> For me as a user, what benefits would a different semantics for
> RDFstar bring?

Calling it a "different semantics" is slightly unfair; it is an /extension/ of the standard semantics. In any case, you need to extend the semantics, either of the new abstract syntax (the approach we chose), or of the specific vocabulary (here, rdf:subject and co.) used to "encode" RDF* into RDF's abstract syntax. Granted, the second option would be less disruptive. But it assumes that referential opacity is not a problem (again, see below).

>
> Going the syntactic sugar route we don't need a specification for
> RDFstar, just for TurtleStar, RDF/XMLstar and SPARQLstar etc.
> Which we would need anyway.
>
> The problem here becomes, lack of a WG means we don't have a good way
> to determine consensus and actually record a decision.

The idea was to mimic a WG process as much as possible, to help things move forward (with a report, github issues to track discussions and decisions, and weekly calls).

>
> Still mapping RDFstar in terms of RDF reification leaves open the
> issue of referential opacity. I think that this is a red herring, I
> think the superman problem is an issue with datamodelling, which
> should not complicate our entire tech stack. i.e the :superman
> owl:sameAs :clarkKent is a faulty assertion and that fault leads to
> the impossibility to correctly express what :louislane believes.

owl:sameAs is a can of worms of its own. If you are not convinced by the superman problem, what about:

<< dbr:Paris dbo:populationTotal 2229621>> :assertedBy <http://dbpedia.org>.
<< geo:2229621 gn:population 2138551>> :assertedBy <http://geonames.org>.
dbr:Paris owl:sameAs geo:2229621.

Arguably, we don't want to infer that <http://geonames.org> says anything about dbr:Paris (that is, if we want to preserve *which terms* were used by each source, not only *which resource* they are describing).

  best

>
> Regards,
> Jerven
>
> PS. Regarding the PG or SA mode I am a fan of going for PG, given the
> UniProt experience with RDF/XML rdf:ID which is a PG syntax. rdf:ID
> being a PG syntax is important for our internal code being needed to
> generate and read our rdf/xml. Not having this kind of sugar in other
> syntaxes is why we are still preferably shipping rdf/xml for UniProt.
>
> PSS. about 15% of UniProt triples are reification quads. Being able to
> get these out of the quad table would be nice. Especially the
> consequences for reducing the joins to use them and how badly these
> quads fit into current indexes.
>
>
> On 10/28/20 9:57 AM, Pierre-Antoine Champin wrote:
>> Holger,
>>
>> (I did what should have been done a long time ago: rename this
>> subthread to something more relevant)
>>
>> On 27/10/2020 23:31, Holger Knublauch wrote:
>>> On 10/28/2020 1:53 AM, Pierre-Antoine Champin wrote:
>>>> Holger,
>>>>
>>>> Now I'm confused. This thread (which should have been renamed a long
>>>> time ago) is, in my understanding, about Martynas' question raised
>>>> here
>>>>
>>>> <https://www.w3.org/mid/CAE35Vmy3vbThwHnKjbhMQuwKkH0BhNoxr_Gp15Ri5LfOdedsSA@mail.gmail.com>
>>>>
>>>>
>>>>> Does RDF* need new semantics at all?
>>>> While I believe the answer is "yes", I concede that answering "no" to
>>>> that question would be convenient, because it would mean that existing
>>>> implementations of RDF could handle RDF* at the syntactical level
>>>> only,
>>>> i.e. parse Turtle* and store it as standard RDF triples.
>>>>
>>>> In your examples below, however, you propose to extend existing
>>>> implementations -- which defeats the purpose of fitting RDF* into
>>>> standard RDF semantics...
>>>
>>> The current RDF* draft requires introducing a 4th term type "RDF*
>>> triples":
>>>
>>> > IRIs <https://www.w3.org/TR/rdf11-concepts/#dfn-iri>, literals
>>> <https://www.w3.org/TR/rdf11-concepts/#dfn-literal>, blank nodes
>>> <https://www.w3.org/TR/rdf11-concepts/#dfn-blank-node> and RDF*
>>> triples <https://w3c.github.io/rdf-star/#dfn-triple> are collectively
>>> known as RDF* terms.
>>>
>> Correct.
>>
>> In my understanding, introducing a new subclass of Node (TripleNode)
>> was the implementation counterpart of this extension of the abstract
>> syntax, but it seems that I was misunderstanding.
>>
>>> The approaches based on (long) URIs avoid this and therefore are
>>> likely much less concerning w.r.t. existing implementations. A
>>> syntactic mapping means that existing APIs can represent these
>>> triples as normal URI nodes. From our own experience moving to this
>>> design was not disruptive (although some users have raised concerns
>>> about exposing the ugly long URIs in unexpected places such as
>>> exporting them to plain Turtle).
>>>
>>> Having said this, *some* implementations may represent these triple
>>> nodes differently, e.g. using the internal data structure I outlined
>>> below. This avoids ever storing these long URIs, makes it arguably
>>> easier to index and search over them, and probably keeps the issues
>>> with changing bnode identifiers at bay. But this is an
>>> implementation detail to me.
>>>
>> Agreed, but conversely, implementing the current draft using long
>> IRIs can be considered an implementation detail...
>>
>> I think we also agree that the aim of the spec is to be as clear and
>> simple as possible, and that implementations may use their own
>> different internal models (for various reasons: backward
>> compatibility, optimization...)
>> as long as they behave according to the spec.
>>
>>> The additional semantics of interpreting these special URI nodes
>>> differently would be local to the RDF* specs and would not require
>>> adaptations to existing specs.
>>>
>> Again, agreed: this would be much less disruptive to the entire
>> ecosystem -- and much less work for us in writing this document ;-).
>>
>> But again I think that blank nodes in embedded triples make this
>> approach very hard. Defining the correct behaviour of "long IRI" when
>> they actually represent embedded blank nodes, if even possible, would
>> be extremely cumbersome (as opposed to the clear and simple way that
>> the spec should aim for). That being said, any PR proving me wrong is
>> welcome.
>>
>>> Holger
Received on Wednesday, 28 October 2020 16:36:58 UTC