- From: Jerven Bolleman <jerven.bolleman@sib.swiss>
- Date: Wed, 25 Sep 2019 10:59:27 +0200
- To: public-rdf-star@w3.org
Hi Olaf, All,

RDF* uses a triple t' itself instead of a name (id) of the triple. I think this can be pure syntactic sugar. The key part is that there needs to be a defined mapping from triple to name that can be generated with a simple function.

The hack that allows this is to introduce a new URN type. As an exemplar of the idea, I propose

    urn:triple:raw:%3Chttp%3A%2F%2Fpurl.uniprot.org%2Funiprot%2FP05067%3E%20%3Chttp%3A%2F%2Fwww.w3.org%2F1999%2F02%2F22-rdf-syntax-ns%23type%3E%20%3Chttp%3A%2F%2Fpurl.uniprot.org%2Fcore%2FProtein%3E

Assume a statement like

    << <http://purl.uniprot.org/uniprot/P05067> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://purl.uniprot.org/core/Protein> >>

This would expand to the following in existing reification Turtle:

    <http://purl.uniprot.org/uniprot/P05067> a <http://purl.uniprot.org/core/Protein> .

    <urn:triple:raw:%3Chttp%3A%2F%2Fpurl.uniprot.org%2Funiprot%2FP05067%3E%20%3Chttp%3A%2F%2Fwww.w3.org%2F1999%2F02%2F22-rdf-syntax-ns%23type%3E%20%3Chttp%3A%2F%2Fpurl.uniprot.org%2Fcore%2FProtein%3E>
        a rdf:Statement ;
        rdf:subject <http://purl.uniprot.org/uniprot/P05067> ;
        rdf:predicate rdf:type ;
        rdf:object <http://purl.uniprot.org/core/Protein> .

The key thing is that such a raw syntax can be made nice in existing RDF/XML, and of course would not be horrid in Turtle*, as there is no need to use the syntax as is.

For uniprot.org we have started to use content-derived identifiers for our reification [1]. If there were a default technique we could use to avoid the current materialization overhead, it would be great :) In this way each triple is named by itself, and this ugliness can be hidden behind an abstract machine.

Now, blank nodes are of course a problem, as they always are ;) The unidentifiable nodes cannot be addressed in such a way. I think that is OK. For those, the mapping would be that

    << [] a up:Protein >>

leads to

    [] a rdf:Statement ;
       rdf:predicate rdf:type ;
       rdf:object up:Protein .

and the identifier of the triple would be a blank node.
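
As a rough illustration of the "simple function" idea, here is a minimal Python sketch. It assumes that the name is formed by percent-encoding the triple written as three angle-bracketed IRIs in subject, predicate, object order, and that only IRIs are handled (no literals). The function names triple_to_urn, urn_to_triple and reify are hypothetical and not part of any existing library; blank nodes are rejected because, as noted, they cannot be named this way.

```python
from urllib.parse import quote, unquote

# Prefix taken from the example URN in this message; the scheme is a proposal,
# not a registered URN namespace.
URN_PREFIX = "urn:triple:raw:"
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"


def triple_to_urn(s, p, o):
    """Mint a name for a triple of IRIs by percent-encoding '<s> <p> <o>'."""
    for term in (s, p, o):
        if term.startswith("_:"):
            # Assumption: blank nodes get a blank-node identifier elsewhere.
            raise ValueError("blank nodes cannot be named by content")
    raw = f"<{s}> <{p}> <{o}>"
    # safe="" also encodes '/', so the whole triple becomes one opaque string.
    return URN_PREFIX + quote(raw, safe="")


def urn_to_triple(urn):
    """Invert triple_to_urn; IRIs contain no spaces, so splitting is safe."""
    raw = unquote(urn[len(URN_PREFIX):])
    s, p, o = (term.strip("<>") for term in raw.split(" "))
    return s, p, o


def reify(s, p, o):
    """Return the four standard reification triples as N-Triples lines."""
    name = triple_to_urn(s, p, o)
    return [
        f"<{name}> <{RDF}type> <{RDF}Statement> .",
        f"<{name}> <{RDF}subject> <{s}> .",
        f"<{name}> <{RDF}predicate> <{p}> .",
        f"<{name}> <{RDF}object> <{o}> .",
    ]


if __name__ == "__main__":
    triple = (
        "http://purl.uniprot.org/uniprot/P05067",
        RDF + "type",
        "http://purl.uniprot.org/core/Protein",
    )
    print(triple_to_urn(*triple))
    print("\n".join(reify(*triple)))
```

Because the name is derived purely from the triple, a store could compute it on demand instead of materializing the reification triples, and could recover the triple from the name with the inverse function.
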
The usual workarounds would apply.

I want to reaffirm that this is an ugly syntax that should be well hidden under the default RDF* beauty. The benefit of allowing this is that existing RDF stores could adapt to SPARQL* very quickly without touching their storage layer. That makes adoption of the change fast, because no one needs to do a lot of work to have it "working" and can then spend a lot of time on making it fast.

For data providers like me who have used reification a lot, it is nice too, because it allows existing SPARQL queries that use reification patterns to be interpreted as SPARQL*. This means that we can change to RDF* from day one and not wait until the last of our users has upgraded their RDF database.

    PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
    PREFIX up: <http://purl.uniprot.org/core/>
    SELECT ?annotationEvidence
    WHERE {
      ?p up:annotation ?a .
      [] a rdf:Statement ;
         rdf:subject ?p ;
         rdf:predicate up:annotation ;
         rdf:object ?a ;
         up:attribution/up:evidence ?annotationEvidence .
    }

can then be mechanically transformed to

    PREFIX up: <http://purl.uniprot.org/core/>
    SELECT ?annotationEvidence
    WHERE {
      << ?p up:annotation ?a >> up:attribution/up:evidence ?annotationEvidence .
    }

without changing the semantics of our data model or impacting existing users. This way there is compatibility between RDF and RDF* without needing a flag day and convincing everyone to change at once.

The next question is how to deal with incomplete reification quads, including those for which there is no asserted triple. Pragmatically, triplestores can deal with those in different ways. The first is to not allow them. The second is to store them and, if they are present in the store, execute a query that covers both forms:

    PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
    PREFIX up: <http://purl.uniprot.org/core/>
    SELECT ?annotationEvidence
    WHERE {
      {
        << ?p up:annotation ?a >> up:attribution/up:evidence ?annotationEvidence .
      }
      UNION
      {
        [] a rdf:Statement ;
           rdf:subject ?p ;
           rdf:predicate up:annotation ;
           rdf:object ?a ;
           up:attribution/up:evidence ?annotationEvidence .
      }
    }

Considering that incomplete and non-asserted reification triples are very rare in RDF corpora in the wild, I think that, for commercial practicality, most vendors will just treat this as an unsupported operation.

Another option is to introduce, for each incomplete reification quad, a triple containing blank nodes. E.g.

    [] rdf:object uniprotkb:P05067

leads to

    << [] [] uniprotkb:P05067 >>

in the store.

For those of us still using RDF/XML, we would only need an update to section 2.17 of the spec to allow us to be RDF* without needing to change our writers at all.

Regards, and apologies for the ugly URL-escaped syntax before many of you had your coffee in the morning,
Jerven

[1] https://sparql.uniprot.org/sparql?query=PREFIX+rdf%3a%3chttp%3a%2f%2fwww.w3.org%2f1999%2f02%2f22-rdf-syntax-ns%23%3e+%0d%0aPREFIX+up%3a%3chttp%3a%2f%2fpurl.uniprot.org%2fcore%2f%3e+%0d%0aSELECT+%3freif%0d%0aFROM+%3chttp%3a%2f%2fsparql.uniprot.org%2fcitationmapping%3e%0d%0aWHERE%0d%0a%7b%0d%0a++++%3freif+a+rdf%3aStatement+.%0d%0a++%09FILTER(strlen(str(%3freif))+%3c+120)%0d%0a%7d

On 9/25/19 9:27 AM, Olaf Hartig wrote:
> On Wed, 2019-09-25 at 00:00 -0500, Patrick J Hayes wrote:
>>> On Sep 20, 2019, at 3:56 AM, Olaf Hartig <olaf.hartig@liu.se>
>>> wrote:
>>> [...]
>>> In fact, in RDF* there is no need for such a naming convention
>>> because, when talking about a triple t'=(s,p,o) in some other
>>> triple t, the idea of RDF* is to directly use the triple t' itself
>>> instead of using a name for that triple.
>>>
>>> t = ( (s,p,o), p2, o2 )
>>
>> I understand, and agree. But this does mean that your often-repeated
>> claim to somehow reduce RDF* to RDF reification is not accurate. RDF*
>> is a genuine extension to RDF.
>
> Indeed, it is an actual extension. So, you are right: Reducing RDF* to
> RDF reification requires the introduction of identifiers for triples,
> as a result of which the reification description is not semantically
> linked anymore to the described triple (due to the limitation of RDF
> reification).
>
> Thanks,
> Olaf
>
>
Received on Wednesday, 25 September 2019 09:00:04 UTC