- From: Andy Seaborne <andy@apache.org>
- Date: Mon, 7 Dec 2020 09:59:14 +0000
- To: public-rdf-star@w3.org
The important point is that triple-terms are literals. Not a literal-with-datatype (though it could be represented that way) but a literal-ish thing: all you need to know is what it looks like - the S/P/O.

Downside - you can't get the S/P/O out of the triple-term afterwards. With <<>> you can.

<< ?s :p :o >> can be a match operation.

Even without the pattern matching, DATATYPE(?literalWithDT) accesses the datatype part of the literal, so it is natural to have SUBJECT(?tripleTerm) to get the subject.

    Andy

On 07/12/2020 08:41, Ivan Herman wrote:
> Hi Pat,
>
> you are of course allowed to come out of 'the shadows of retirement'; let me be allowed to do the same from the shadows of retirement from the Semantic Web Activity:-)
>
> Let us suppose we had a standardized scheme for RDF canonicalization. As you well know:-) this would involve a canonical, deterministic relabeling of blank nodes. Because it would be done on the RDF data, not its serialization, it would be serialization independent.
>
> *If* we have that, we could 'simply' (details to be filled in) generate the hash of the _canonical_ triple in, say, N-Triples syntax and use that as the identification of the triple. Wouldn't that simplify your approach? (I.e., you would not have to find a way to order the triples in RDF/XML :-) What it would mean is that the triple below would 'simply' be:
>
> "an ugly hash string"^^http://tripleID/ P O .
>
> The added value would also be that if we have an <S P O> without a blank node, its hash becomes unique, i.e., not necessarily dataset dependent. Also, a canonicalization algorithm (see below) may also be defined for quads, i.e., the dependency of an <S P O> on a specific graph could also be taken into account.
>
> WDYT?
>
> Ivan
>
> P.S. Of course, this relies on a standard for RDF canonicalization. For other reasons, this is very much needed in practical RDF applications. There are a few algorithms out there by now (Aidan Hogan has a published algorithm, Dave Longley has a different one which is also deployed in some areas) and we are actually considering reconciling those two and creating a W3C standard for it. Just do not ask me when that would happen...
>
>> On 6 Dec 2020, at 10:04, Patrick J Hayes <phayes@ihmc.us> wrote:
>>
>> (I had vowed not to get involved in this business, but I cannot resist putting my 2c into the discussion, so here goes…)
>>
>> Some observations.
>>
>> 1. From reading the various contributions, the primary goal seems to be to provide a compact and semantically coherent way both to assert a triple and to say something about it, where ‘it’ here means the instance or token of the triple in a particular document, either the same one that contained the asserted triple or one closely associated with it. And to do all this with minimal damage to the basic RDF model of graphs, triples, etc.
>>
>> 2. Linking this to RDF reification was seen as one way to keep the RDF model intact, but in retrospect that might have been a mistake. For a variety of reasons, mostly historical, RDF reification has little to do with the primary goal. (In fact, I still have no clear idea what RDF reification is for, even after working on two RDF working groups for several years.)
>>
>> 3. Putting aside reification as irrelevant, therefore, and focussing instead on the primary goal of annotating triples, there are basically two ways to do this. Either we somehow provide (invent) a way to give names to triples, so that we can use that name in other triples to make the assertions comprising the annotations; or we attach the annotations to the triples by ostension, that is, by directly attaching them to the triple. Which is a bit like pointing to the triple and saying “this triple” instead of inventing a name for it and using the name. This was, I believe, the original idea of the <<s p o>> P O . notation, which was unfortunately somewhat confused by linking it to RDF reification.
>>
>> 4. The problem with this, however, is that it requires extending RDF syntax to allow a new kind of node. Which naturally suggests that we should seek a way to treat this as a shorthand for a construct using more conventional RDF. Not reification, so what? The ‘outer’ triple would be an RDF triple if its subject were a bnode, URI or literal (well, it would be a generalised RDF triple with a literal subject). Of these, the semantically most sensible would be a literal, because literals, unlike URIs, are considered to have fixed denotations; and we want these triple names to have exactly this quality, to rigidly identify a particular syntactic object in a particular RDF source (https://www.w3.org/TR/rdf11-concepts/#change-over-time).
>>
>> 5. So, let me suggest a literal scheme for identifying triples in documents. The datatype name is ‘http://tripleID/’ and it recognizes strings of the form A+B+C where A is either the URI of the document containing the triple token, or the empty string, indicating the document in which the literal occurs; B is one of a set of predefined strings defining the various RDF surface syntaxes, e.g. ‘RDF-XML’, ‘TURTLE’, etc.; and C is a numeral which identifies the particular triple following a convention defined for documents of that syntactic type. This requires standardizing a few of these conventions, but this should not be an impossibly large task (though I admit not having any idea how to approach ordering triples in RDF-XML). Thus for example the literal value of ‘+NTRIPLES+17’^^http://tripleID/ is the 17th triple in the linear ordering of the triples in the document in which that literal occurs.
>>
>> 6. With this datatype defined, we can treat
>> <<s p o>> P O .
>> in an N-Triples document as shorthand for the pair of triples
>> s p o .
>> ‘+NTRIPLES+1’^^http://tripleID/ P O .
>> which admittedly has a literal subject, but in every other respect is conventional RDF, albeit recognizing a rather unconventional datatype.
>>
>> 7. I suggest that using some such literal scheme (perhaps more elegant than this one) for generating triple-token identifiers is the simplest and semantically least objectionable way to map starred notations back into conventional RDF. It does not change the core of RDF by requiring new kinds of node, or extending the RDF semantics. It keeps all the oddity inside the description of the datatype. It allows metadata triples to annotate triples in other documents, even in a different RDF dialect, but it is more compact when this is not needed. It can also be fairly simply extended, if required, to allow annotations of multiple triples, i.e. of subgraphs, e.g. by allowing the strings to be extended by adding more ‘+numeral’ phrases, without changing the basic model.
>>
>> 8. Of the other alternatives for the annotation triple subject, bnodes obviously do not cut it as they don't act as names or identifiers; and IRIs require a convention for naming triples with IRIs, which is so close to the named-graph idea that it hardly seems worth inventing something new to do it, but in any case requires extending the RDF model to include such a naming convention.
>>
>> OK, that's my 2c. If I have just re-invented someone else’s wheel, please forgive my failure of scholarship from the shadows of retirement.
>>
>> Pat Hayes
>>
>
> ----
> Ivan Herman, W3C
> Home: http://www.w3.org/People/Ivan/
> mobile: +33 6 52 46 00 43
> ORCID ID: https://orcid.org/0000-0003-0782-2704
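
For illustration only, the A+B+C lexical form from Pat's point 5 could be read roughly as sketched below. Everything named here is hypothetical: `TripleId`, `parse_triple_id` and `nth_ntriple` do not come from any library or from the thread. The sketch assumes that only the last two '+' characters are separators (so a document URI may itself contain '+'), and that the N-Triples convention for C is simply "count the non-blank, non-comment lines from 1", which the thread leaves unspecified.

```python
from typing import NamedTuple


class TripleId(NamedTuple):
    """Parsed form of a hypothetical tripleID lexical value A+B+C."""
    document: str  # A: document URI; "" means the document holding the literal
    syntax: str    # B: surface syntax name, e.g. "NTRIPLES"
    index: int     # C: 1-based position of the triple in that document


def parse_triple_id(lexical: str) -> TripleId:
    # Split from the right so that a '+' inside the document URI is kept.
    parts = lexical.rsplit("+", 2)
    if len(parts) != 3:
        raise ValueError(f"expected A+B+C, got {lexical!r}")
    document, syntax, numeral = parts
    if not numeral.isdigit() or int(numeral) < 1:
        raise ValueError(f"C must be a positive numeral, got {numeral!r}")
    return TripleId(document, syntax, int(numeral))


def nth_ntriple(text: str, index: int) -> str:
    """Return the index-th triple line (1-based) of an N-Triples document.

    Assumed ordering convention: one triple per line, skipping blank
    lines and lines that start with '#'.
    """
    if index < 1:
        raise IndexError("triple positions start at 1")
    triples = [
        line.strip()
        for line in text.splitlines()
        if line.strip() and not line.strip().startswith("#")
    ]
    return triples[index - 1]


if __name__ == "__main__":
    doc = (
        "<http://example.org/s> <http://example.org/p> <http://example.org/o> .\n"
        "# a comment line\n"
        "<http://example.org/a> <http://example.org/b> <http://example.org/c> .\n"
    )
    tid = parse_triple_id("+NTRIPLES+2")
    print(tid)
    print(nth_ntriple(doc, tid.index))
```

Note that, as Andy points out, a literal of this kind only points at a triple; recovering the S/P/O still requires dereferencing the document and applying the ordering convention, as `nth_ntriple` does here.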
Received on Monday, 7 December 2020 09:59:30 UTC