Re: RDF* and conjectures

Hello Thomas! 

> [b.t.w.: i find your intervention very productive. thank you for your enthusiasm :) ]

Thanks. This means a lot to me. Sometimes I fail to notice when I am becoming a nuisance or a bore, and statements like these are reassuring...


> Am 23. September 2021 12:59:58 MESZ schrieb Fabio Vitali <fabio.vitali@unibo.it>:
>> Dear Thomas, all. 
>> 
>> My own impression is that the semantics of named graphs is problematic in a way that cannot be fixed, and it is better to find a way out in a totally different way. 
>> 
>> We have a proverb where I come from, that says: if you glue back together a broken vase, you still have a broken vase. Dataset semantics is a broken vase. 
>> 
>> Personally, I see a clear parallel with RDF reification, a complicated and pedantic mechanism for an important necessity: although it has a clear semantics, over time people have variously enforced or ignored it, so it is not reliably found in the wild. 
> 
> its syntax is  underspecified. its semantics very clearly define the reified triple as a referentially transparent occurrence, but it lacks a property to describe the location where that occurrence occurs. 
> 
>> So Olaf and you guys 
> 
> please don't count me in :-/ i never made a secret of the fact that i find an identifier like RDF/XMLs id-attribute much more practical than the verbose embedded statement for the purpose of statement annotation and it quite frankly is beyond me how RDF* gained so much support. 

My point of view is that the choice of one syntax or another is irrelevant, as long as it concretely reduces the visual clutter and allows us to express the relevant use cases rapidly and concisely. 

I am not a big fan of formal languages for general human use, but this is where we are. I think that RDF linearizations should use labels to denote entities rather than URIs, which guarantee unambiguity at the expense of clarity; but this is a digression, I think, in this group. 

> but statement annotation (especially with an eye towards property graph compatability, and competibility) is a whole different story than what the proposed semantics (and the original Named Graphs proposal, and N3, and you) focus on, and shoe horning one into the other won't work out if one is not very scrupulous - which the editors clearly aren't. 
> 
>> said: RDF reification is a broken vase, and even if we fix it we still are stuck with a broken vase. Instead, let's get a new and shiny vase and use it in addition with the old one for our purposes. 
> 
> RDF standard reification is just a vocabulary and therefore relatively easy to replace. named graphs OTOH use curly braces and we only have a very limited set of such delimiters available. we simply can't afford to waste them. ask Olaf how hard it was to come up with the chevron syntax. 
> 
> also named graphs are predominantly used with the semantics implicit in SPARQL. also there is a possibility to define a vocabulary that allows to declare the semantics of a dataset. in a world where it is possible to garner broad support for embedded triples just because they seem to bring sound and concise statement annotation (which of course now they don't, but they could) everything is possible... 

Not sure what you imply here. RDF*, at least in the communities I know of, is being advertised as the next big thing not because of what it is, but because it seems to be the best available way to express what they have needed for a long time, i.e., the ability to assign a provenance to a statement regardless of whether one agrees with the author of the statement. 

Doing so without RDF* is a pain in the neck; with RDF* it is not ideal, but considerably better. At some point they'll notice that RDF* is perfect as long as they have individual triples, but that more complicated situations, such as attributing without asserting whole structures like graphs (e.g. nanopublications) or sequences of triples (e.g. n-ary relationships), will require that they quote and assign provenance to every single triple, and they will not be happy. Happier than before RDF*, but not happy. 

E.g.: 

 :assertion {
   << wd:Q2 wdt:P571 "-4004-10-23"^^xsd:date >> :accordingTo wd:Q333481 . 

   # Ussher said that the Earth was created in 4004 BC
 } 

 :provenance { 
   :assertion prov:wasAttributedTo wd:Q333481 . 
 } 

 :pubInfo {... }

 :Head {
   : a np:Nanopublication .
   : np:hasAssertion :assertion .
   : np:hasProvenance :provenance .
   : np:hasPublicationInfo :pubInfo .
 }

This graph is far from ideal: Ussher is the author of the whole assertion, not one of the members of one of its triples. Yet, since with RDF* you cannot quote whole graphs, people will need to make do with this. Similarly: 

    :Hamlet crm:P94i_was_created_by :CreationOfHamlet. 
    
    :CreationOfHamlet a crm:E65_Creation. 
    <<:CreationOfHamlet crm:P14_carried_out_by :WilliamShakespeare>> :accordingTo :SamuelJohnson .
    <<:CreationOfHamlet crm:P4_has_time-span :Year1603>> :accordingTo :SamuelJohnson .
    <<:CreationOfHamlet crm:P215_has_reliability :High>> :accordingTo :SamuelJohnson .

If I want to attribute the triples of an n-ary relationship without accepting them, I need to quote every single triple, because, unless we extend quoting to graphs, there is no way to wrap them and quote them all together without at the same time stating them. 
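
To make the cost concrete, here is a minimal sketch in Python of what a tool has to do today: emit one quoted-triple annotation per member of the n-ary relationship. The helper name `quote_each` is hypothetical, and the output is just the Turtle-star-like text used in the example above, not the output of any real library.

```python
def quote_each(triples, predicate, source):
    """Return one annotation line per triple, attributing it to `source`.

    Each triple is quoted on its own, because a group of triples
    cannot be wrapped and quoted as a single unit.
    """
    return [f"<<{s} {p} {o}>> {predicate} {source} ." for (s, p, o) in triples]


# The three triples of the n-ary relationship from the example above.
creation = [
    (":CreationOfHamlet", "crm:P14_carried_out_by", ":WilliamShakespeare"),
    (":CreationOfHamlet", "crm:P4_has_time-span", ":Year1603"),
    (":CreationOfHamlet", "crm:P215_has_reliability", ":High"),
]

lines = quote_each(creation, ":accordingTo", ":SamuelJohnson")
for line in lines:
    print(line)
```

The number of annotations grows with the number of triples in the relationship, which is exactly the clutter that quoting a whole graph would avoid.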

Better than before, but... it is possible to improve. 

>> So with RDF*  we now have 
>> 1a) a nice and compact syntax for stated triples s p o 
>> 1b) which corresponds, for those so inclined, to _:x a rdf:Statement; rdf:Subject s; rdf:Predicate p; rdf:Object o.   
>> and  
>> 2a) a nice and compact syntax for non-stated triples <<s p o>> 
>> 2b) which corresponds, for those so inclined, to _:x unstar:Subject s; unstar:Predicate p; unstar:Object o, etc..   
>> 
>> Things are clear, the truth state of quoted triples in RDF* is clearly non asserted, and it is impossible to confound the two types of statements, neither in syntax nor in semantics. Nice and clean: if there is a doubt in interpretation, create something new so different from the old that there is no way to mixing them up again. Good. 
> 
> i agree that this syntactic feature is very important for usability. but as i said in my other mail: it is not new anymore, it is already defined in practice as referentially opaque occurrence. the cow paths are laid out, no matter if one thinks that's good or not. 

I am not sure I follow: you do not like RDF* to be referentially opaque and would have preferred it to be referentially transparent? 

I have yet to understand why this is such a big deal. As a mathematician by background, I am used to thinking that names of unbounded variables can change at will, so if you need referential opacity you just use fresh variable names. 

Also, what Lois Lane THINKS of Clark Kent is immaterial. The individual we denote with Clark Kent, being the same individual we denote with Superman, can certainly fly. The problem is not in the specification of the opinion of Lois Lane, but in the statement :ClarkKent owl:sameAs :Superman, which should not be true [*]. Remove this triple from your dataset and everything falls naturally into place. 
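
A small sketch in Python of this point, under loud assumptions: quoted triples are plain tuples, owl:sameAs is a simple dictionary from one identifier to another, and the function names are hypothetical. It only illustrates that, under a referentially transparent reading, the two quoted triples collapse into one when the owl:sameAs triple is present and stay distinct when it is removed.

```python
def canonical(term, same_as):
    """Follow owl:sameAs links, given as a dict, to one representative identifier."""
    seen = set()
    while term in same_as and term not in seen:
        seen.add(term)
        term = same_as[term]
    return term


def read_transparently(triple, same_as):
    """Replace every term of a quoted triple by its representative."""
    return tuple(canonical(term, same_as) for term in triple)


# The quoted triple under each of the two names.
about_superman = (":Superman", ":can", ":Fly")
about_clark = (":ClarkKent", ":can", ":Fly")

with_same_as = {":ClarkKent": ":Superman"}  # :ClarkKent owl:sameAs :Superman
without_same_as = {}                        # that triple removed from the dataset

# With owl:sameAs the two quoted triples become indistinguishable.
merged = (read_transparently(about_clark, with_same_as)
          == read_transparently(about_superman, with_same_as))

# Without it they remain two different quoted triples.
distinct = (read_transparently(about_clark, without_same_as)
            != read_transparently(about_superman, without_same_as))

print(merged, distinct)
```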

Conjectures do not mess with the identifiers of subjects and objects, they just create new predicates, so they are referentially transparent because they can't help being so. I have yet to find a realistic use case showing why this is NOT good. 

Interested in your opinion on this. 

Ciao

Fabio

---


[*] I would ontologically disagree with the assertion of this triple. Clark Kent and Superman are not the same individual. Superman wears a cape, Clark Kent wears glasses, etc. These characteristics need to be differentiated, e.g. by introducing an intermediate class, call it a Persona, a Disguise, a Portrayal, and the individual that we are talking about uses one Persona or the other at different moments. Flying is an attribute of the :Superman Persona, and not of the :ClarkKent one. Solved without introducing referential opacity. 
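
A minimal sketch in Python of the intermediate-class idea, with everything beyond the text above being an assumption made for illustration: the individual's identifier `:KalEl`, the predicates `:usesPersona`, `:wears` and `:can`, and the helper names are all hypothetical.

```python
# A tiny dataset where the individual is shielded by two Persona nodes.
# No owl:sameAs triple links the two personas.
triples = {
    (":KalEl", ":usesPersona", ":Superman"),
    (":KalEl", ":usesPersona", ":ClarkKent"),
    (":Superman", ":wears", ":Cape"),
    (":Superman", ":can", ":Fly"),
    (":ClarkKent", ":wears", ":Glasses"),
}


def can_fly(node):
    """True only if flying is stated directly of this node."""
    return (node, ":can", ":Fly") in triples


def personas(individual):
    """All personas that the given individual uses."""
    return sorted(o for (s, p, o) in triples
                  if s == individual and p == ":usesPersona")


print(personas(":KalEl"))
print(can_fly(":Superman"), can_fly(":ClarkKent"))
```

Flying stays attached to the :Superman persona only, and nothing forces it onto :ClarkKent, while both remain reachable from the same individual.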

I somewhat believe that if you introduce an intermediate class to shield the individual, you solve 99% of the ontological disputes that exist around here. 

----

> also, syntactically Antoine Zimmermann's proposal to define an RDF literal datatype is even more convincing: nothing looks and feels more like a quote than a quote. implementing SPARQL for such a datatype shouldn't be too hard. 
> 
> sorry again for my terseness! 
> 
> ciao 
> thomas 
> 
> 
>> 
>> Now let's come to named graphs. This situation is better from the syntactical point of view, since graph syntax is actually quite reasonable, but worse from the semantic point of view, since there is none accepted. 
>> 
>> This is a broken vase, and even if you manage to glue back all that is wrong in the current situation, it would still be a broken vase. Let's learn the lesson from RDF* and let's get a new and shiny vase and use it in addition with the old one for our purposes. 
>> 
>> Then we would have
>> 1c) a nice and compact syntax for usual named graphs, whatever semantics you want to associate to them, as before (the broken vase)
>> 2c) a nice and compact syntax for non-stated named graphs (the new vase)
>> 
>> Things would be clear, the truth state of quoted graphs would be clearly non asserted, and it would be impossible to confound the two types of graphs, neither in syntax nor in semantics. Nice and clean. 
>> 
>> This is what I wish to create: a new and shiny vase for graphs corresponding to the one that RDF* is becoming for reification. 
>> 
>> Ciao
>> 
>> Fabio
>> 
>> --
>> 
>>> On 22 Sep 2021, at 23:47, thomas lörtsch <tl@rat.io> wrote:
>>> 
>>> 
>>> 
>>>> On 21. Sep 2021, at 19:08, Andy Seaborne <andy@apache.org> wrote:
>>>> 
>>>> The more appropriate text for RDF-star is probably that in
>>>> "RDF 1.1 Concepts and Abstract Syntax"
>>>> 
>>>> 1.6 Working with Multiple RDF Graphs
>>>> https://www.w3.org/TR/rdf11-concepts/#managing-graphs

>>>> 
>>>> and the definition:
>>>> 
>>>> 4. RDF Datasets
>>>> https://www.w3.org/TR/rdf11-concepts/#section-dataset

>>>> 
>>>> which has the note:
>>>> 
>>>> """
>>>> Despite the use of the word “name” in “named graph”, the graph name is not required to denote the graph. It is merely syntactically paired with the graph. RDF does not place any formal restrictions on what resource the graph name may denote, nor on the relationship between that resource and the graph.
>>>> """
>>> 
>>> The failure of the RDF 1.1 WG to standardize a named graphs semantics is well known and the very reason for this soul searching expedition into the semantics of SPARQL as a normative practical force. 
>>> 
>>> As a co-editor of SPARQL 1.0 and 1.1 and a participant in the RDF 1.1 WG (and co-editor of TriG as I just noticed) and probably numerous other RDF-related standardization efforts you should be in a formidable position to shed some light on the question which model theoretic semantics might best describe the semantics of SPARQL. 
>>> 
>>> You might also comment on if the RDF 1.1 WG discussed standardizing a model theoretic semantics as close as possible to the operational semantics of SPARQL, if that was deemed impossible for technical or "political" (read: conflicts with vendor interests) reasons. 
>>> 
>>> These are just two ideas of how you could help flatten the knowledge differences in this CG.
>>> 
>>> Thomas
>>> 
>>> 
>>>> 
>>>>  Andy
>>>> 
>>> 
>>> 
>> 

Received on Friday, 24 September 2021 09:19:59 UTC