What <<>> means

(I had vowed not to get involved in this business, but I cannot resist putting my 2c into the discussion, so here goes…)

Some observations.

1. From reading the various contributions, the primary goal seems to be to provide a compact and semantically coherent way both to assert a triple and to say something about it, where ‘it’ here means the instance or token of the triple in a particular document, either the same one that contained the asserted triple or one closely associated with it. And to do all this with minimal damage to the basic RDF model of graphs, triples, etc.

2. Linking this to RDF reification was seen as one way to keep the RDF model intact, but in retrospect that might have been a mistake. For a variety of reasons, mostly historical, RDF reification has little to do with the primary goal. (In fact, I still have no clear idea what RDF reification is for, even after working on two RDF working groups for several years.)

3. Putting aside reification as irrelevant, therefore, and focussing instead on the primary goal of annotating triples, there are basically two ways to do this. Either we somehow provide (invent) a way to give names to triples, so that we can use that name in other triples to make the assertions comprising the annotations; or we attach the annotations to the triples by ostension, that is, by directly attaching them to the triple. Which is a bit like pointing to the triple and saying “this triple” instead of inventing a name for it and using the name. This was, I believe, the original idea of the <<s p o>> P O . notation, which was unfortunately somewhat confused by linking it to RDF reification.

4. The problem with this, however, is that it requires extending RDF syntax to allow a new kind of node. Which naturally suggests that we should seek a way to treat this as a shorthand for a construct using more conventional RDF. Not reification, so what? The ‘outer’ triple would be an RDF triple if its subject were a bnode, URI or literal (well, it would be a generalised RDF triple with a literal subject). Of these, the semantically most sensible would be a literal, because literals, unlike URIs, are considered to have fixed denotations; and we want these triple names to have exactly this quality, to rigidly identify a particular syntactic object in a particular RDF source (https://www.w3.org/TR/rdf11-concepts/#change-over-time).

5. So, let me suggest a literal scheme for identifying triples in documents. The datatype name is ‘http://tripleID/’ and it recognizes strings of the form A+B+C where A is either the URI of the document containing the triple token, or the empty string, indicating the document in which the literal occurs; B is one of a set of predefined strings defining the various RDF surface syntaxes, e.g. ‘RDF-XML’, ‘TURTLE’, etc.; and C is a numeral which identifies the particular triple following a convention defined for documents of that syntactic type. This requires standardizing a few of these conventions, but this should not be an impossibly large task (though I admit not having any idea how to approach ordering triples in RDF-XML). Thus for example the literal value of ‘+NTRIPLES+17’^^http://tripleID/ is the 17th triple in the linear ordering of the triples in the document in which that literal occurs.
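
To make the A+B+C lexical form concrete, here is a minimal Python sketch of how such strings might be checked and taken apart. The names TripleID, KNOWN_SYNTAXES and parse_triple_id are hypothetical, the set of syntax tags contains only the three strings mentioned in this message, and the choices to split from the right (because a document URI may itself contain ‘+’) and to count triples from 1 are assumptions, not part of any agreed convention.

```python
from typing import NamedTuple

# Syntax tags mentioned in the post; the full predefined set is left open there.
KNOWN_SYNTAXES = {"RDF-XML", "TURTLE", "NTRIPLES"}


class TripleID(NamedTuple):
    document: str  # URI of the containing document, or "" for "this document"
    syntax: str    # one of the predefined surface-syntax tags
    index: int     # position of the triple (assumed here to be 1-based)


def parse_triple_id(lexical: str) -> TripleID:
    """Split an 'A+B+C' string into its three parts, or raise ValueError."""
    # Split from the right: the document URI (part A) may itself contain '+'.
    parts = lexical.rsplit("+", 2)
    if len(parts) != 3:
        raise ValueError(f"expected A+B+C, got {lexical!r}")
    document, syntax, numeral = parts
    if syntax not in KNOWN_SYNTAXES:
        raise ValueError(f"unknown syntax tag {syntax!r}")
    if not (numeral.isascii() and numeral.isdigit()) or int(numeral) < 1:
        raise ValueError(f"bad triple numeral {numeral!r}")
    return TripleID(document, syntax, int(numeral))


# The example from point 5: the 17th triple of the current document.
print(parse_triple_id("+NTRIPLES+17"))
```

An empty first field is kept as the empty string, so a caller can tell a same-document reference apart from a reference into another document.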

6. With this datatype defined, we can treat 
<<s p o> P O>
in an N-Triples document as shorthand for the pair of triples
s p o .
‘+NTRIPLES+1’^^http://tripleID/ P O .
which admittedly has a literal subject, but in every other respect is conventional RDF, albeit recognizing a rather unconventional datatype.
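
The expansion in point 6 can be sketched as a small line-by-line rewrite. This is a toy under stated assumptions, not a real N-Triples parser: it accepts the <<s p o>> P O . form from point 3 with one statement per line, the function name expand is hypothetical, the literal is written in ordinary N-Triples syntax with the datatype IRI in angle brackets, and the numbering convention (1-based position in the expanded document) is assumed, since the message leaves that convention to be standardized.

```python
import re

# Matches a toy starred statement of the form:  <<s p o>> P O .
STARRED = re.compile(r"^<<\s*(?P<inner>.+?)\s*>>\s+(?P<rest>.+?)\s*\.\s*$")


def expand(lines):
    """Rewrite starred statements into plain triples with tripleID subjects."""
    out = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        m = STARRED.match(line)
        if m is None:
            out.append(line)  # ordinary triple: copy through unchanged
            continue
        # Assert the inner triple, then annotate it by its position.
        out.append(f"{m['inner']} .")
        index = len(out)  # 1-based position of the triple just asserted
        out.append(f'"+NTRIPLES+{index}"^^<http://tripleID/> {m["rest"]} .')
    return out


for triple in expand(["<<s p o>> P O ."]):
    print(triple)
```

Because the index is taken from the output list, an annotated triple that follows other triples gets the position it actually occupies in the expanded document rather than always 1.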

7. I suggest that using some such literal scheme (perhaps more elegant than this one) for generating triple-token identifiers is the simplest and semantically least objectionable way to map starred notations back into conventional RDF. It does not change the core of RDF by requiring new kinds of node, or extending the RDF semantics. It keeps all the oddity inside the description of the datatype. It allows metadata triples to annotate triples in other documents, even in a different RDF dialect, but it is more compact when this is not needed. It also can be fairly simply extended, if required, to allow annotations of multiple triples, i.e. of subgraphs, e.g. by allowing the strings to be extended by adding more ‘+numeral’ phrases, without changing the basic model.
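
As an illustration of the subgraph extension mentioned in point 7, the following Python sketch accepts one or more ‘+numeral’ phrases after the syntax tag. The pattern, the function name parse_subgraph_id and the fixed list of three syntax tags are all assumptions made for the example.

```python
import re

# document '+' syntax-tag, followed by one or more '+numeral' phrases.
EXTENDED = re.compile(
    r"^(?P<document>.*)\+(?P<syntax>RDF-XML|TURTLE|NTRIPLES)"
    r"(?P<numerals>(?:\+[0-9]+)+)$"
)


def parse_subgraph_id(lexical):
    """Return (document, syntax, [indices]) for an extended tripleID string."""
    m = EXTENDED.match(lexical)
    if m is None:
        raise ValueError(f"not an extended tripleID string: {lexical!r}")
    # "+3+4+9".split("+") gives ["", "3", "4", "9"]; drop the empty head.
    indices = [int(n) for n in m["numerals"].split("+")[1:]]
    return m["document"], m["syntax"], indices


# Three triples of the current N-Triples document, named as one subgraph.
print(parse_subgraph_id("+NTRIPLES+3+4+9"))
```

A string with a single numeral is still accepted and yields a one-element list, so the single-triple case remains a special case of the extended form.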

8. Of the other alternatives for the annotation triple subject, bnodes obviously do not cut it as they don’t act as names or identifiers; and IRIs require a convention for naming triples with IRIs, which is so close to the named-graph idea that it hardly seems worth inventing something new to do it, but in any case requires extending the RDF model to include such a naming convention.

OK, that’s my 2c. If I have just re-invented someone else’s wheel, please forgive my failure of scholarship from the shadows of retirement.

Pat Hayes

Received on Sunday, 6 December 2020 09:04:27 UTC