Re: RDF-ISSUE-25 (Deprecate Reification): Should we deprecate (RDF 2004) reification? [Cleanup tasks] from Sandro Hawke on 2011-04-08 (public-rdf-wg@w3.org from April 2011)

From: Sandro Hawke <sandro@w3.org>
Date: Fri, 08 Apr 2011 00:42:30 -0400
To: Ivan Herman <ivan@w3.org>
Cc: Eric Prud'hommeaux <eric@w3.org>, David Wood <dpw@talis.com>, RDF Working Group WG <public-rdf-wg@w3.org>
Message-ID: <1302237750.6230.241.camel@waldron>
On Fri, 2011-04-08 at 04:48 +0200, Ivan Herman wrote:
> As an aside, I would love to use a different example than this Superman. I have never watched these Superman movies, so all these symbols are just abstract entities to me, which does not help understanding the issues...

IANAL (I am not a logician) but I've seen three examples commonly used.
Frege used the fact that "the morning star" and "the evening star" both
refer to Venus.  Wikipedia [1] uses Superman/ClarkKent and also
Cicero/Tully.

I'm expecting to learn more about this in the Provenance Working Group,
but mostly I'm thinking one should use the g-text (some bytes you can
securely hash if necessary) if you really need opacity.   

So, then, what's wrong with RDF reification, if referential transparency
is acceptable?  Well, picking perhaps the easiest [GRAPHS] use case
("Exchanging the contents of RDF stores" [2]), how would you serialize a
simple RDF dataset?  Maybe:

<u1> { <a> <b> 1, 2 }
<u2> { <a> <c> 3, 4 }

would be:

<u1> eg:enumeration ( [ rdf:subject <a>; rdf:predicate <b>; rdf:object 1 ]
                      [ rdf:subject <a>; rdf:predicate <b>; rdf:object 2 ] ).

<u2> eg:enumeration ( [ rdf:subject <a>; rdf:predicate <c>; rdf:object 3 ]
                      [ rdf:subject <a>; rdf:predicate <c>; rdf:object 4 ] ).

I think that *works* - we just needed to invent one predicate.  And
each quad turns into about five triples.   We could make it just four
triples if we didn't care to say u1 and u2 have *only* these triples:

<u1> eg:hasTriple [ rdf:subject <a>; rdf:predicate <b>; rdf:object 1 ],
                  [ rdf:subject <a>; rdf:predicate <b>; rdf:object 2 ].

<u2> eg:hasTriple [ rdf:subject <a>; rdf:predicate <c>; rdf:object 3 ],
                  [ rdf:subject <a>; rdf:predicate <c>; rdf:object 4 ].

So, why do SPARQL folks prefer TriG and N-Quads to these forms?  I don't
know.    

One guess might be concern about handling the input efficiently in the
cases where the input triples are not localized.  If you put that into
N-Triples and sort it by predicate, performing the import is going to
require holding the entire structure in memory.  But a valid response
might be, "don't do that".  (That is, keep the bnodes localized as they
would be in Turtle or RDF/XML that avoided node ids, if you want
de-serialization to be a streaming operation.)
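To make the memory concern concrete, here is a small Python sketch of an
importer for the eg:hasTriple form.  It is an illustration under stated
assumptions, not how any particular store works: import_quads, encode
and the "peak" counter are hypothetical names, and the triples are plain
tuples rather than parsed N-Triples.

```python
# Which slot of the rebuilt quad each reification property fills.
SLOTS = {"rdf:subject": "s", "rdf:predicate": "p", "rdf:object": "o"}

def import_quads(triples):
    """Rebuild quads from eg:hasTriple reification triples.

    Returns (quads, peak), where peak is the largest number of
    half-finished statements buffered at any moment.
    """
    pending, quads, peak = {}, [], 0
    for s, p, o in triples:
        if p == "eg:hasTriple":
            node, key, value = o, "g", s
        elif p in SLOTS:
            node, key, value = s, SLOTS[p], o
        else:
            continue  # not part of the encoding
        parts = pending.setdefault(node, {})
        parts[key] = value
        peak = max(peak, len(pending))
        if len(parts) == 4:  # graph, subject, predicate, object all seen
            quads.append((parts["g"], parts["s"], parts["p"], parts["o"]))
            del pending[node]
    return quads, peak

def encode(quads):
    """Emit the four triples of each quad next to each other (localized)."""
    out = []
    for i, (g, s, p, o) in enumerate(quads):
        node = f"_:t{i}"
        out += [(g, "eg:hasTriple", node), (node, "rdf:subject", s),
                (node, "rdf:predicate", p), (node, "rdf:object", o)]
    return out

QUADS = [("<u1>", "<a>", "<b>", "1"), ("<u1>", "<a>", "<b>", "2"),
         ("<u2>", "<a>", "<c>", "3"), ("<u2>", "<a>", "<c>", "4")]
localized = encode(QUADS)
by_predicate = sorted(localized, key=lambda t: t[1])

print(import_quads(localized)[1])     # peak buffer with localized bnodes
print(import_quads(by_predicate)[1])  # peak buffer after sorting by predicate
```

With the bnodes kept together the buffer never holds more than one
statement; after sorting by predicate every statement stays pending
until the last group of triples arrives, so the peak equals the number
of quads.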

Another guess: it just seems kind of bulky for no real benefit.   This
is amusingly just what non-RDF-folks feel whenever faced with RDF
syntaxes before they need the benefits.  ;-)

Another guess -- in Lee's case, he's talked about them hand-editing
TriG.  

One could imagine { ... } being defined to be syntactic sugar for
[ eg:enumeration ( [ rdf:subject ...; rdf:predicate ...; ... ]
[ ... ] ... ) ]

But, yes, referential transparency might be a problem, depending on how one
is going to use these structures.

      -- Sandro

[1] http://en.wikipedia.org/wiki/Opaque_context 
[2] http://www.w3.org/2011/rdf-wg/wiki/TF-Graphs-UC#Exchanging_the_contents_of_RDF_stores

> I.
> 
> ----
> Ivan Herman
> Tel:+31 641044153
> http://www.ivan-herman.net
> 
> 
> 
> On 8 Apr 2011, at 02:43, Eric Prud'hommeaux <eric@w3.org> wrote:
> 
> > * Sandro Hawke <sandro@w3.org> [2011-04-07 18:23-0400]
> >> On Thu, 2011-04-07 at 18:18 -0400, David Wood wrote:
> >>> On Apr 7, 2011, at 18:07, RDF Working Group Issue Tracker <sysbot+tracker@w3.org> wrote:
> >>> 
> >>>> 
> >>>> RDF-ISSUE-25 (Deprecate Reification): Should we deprecate (RDF 2004) reification? [Cleanup tasks]
> >>>> 
> >>>> http://www.w3.org/2011/rdf-wg/track/issues/25
> >>>> 
> >>>> Raised by: Sandro Hawke
> >>>> On product: Cleanup tasks
> >>>> 
> >>>> 
> >>>> The RDF 1999 and 2004 Recommendations include vocabulary and syntax
> >>>> (in RDF/XML) for RDF "reification".  The vocabulary is rdf:Statement,
> >>>> rdf:subject, rdf:predicate, and rdf:object; the syntax is rdf:ID used
> >>>> on a property element.
> >>>> 
> >>>> Although this feature is sometimes used in practice, some experts
> >>>> advise data providers to avoid it.  It has no syntactic support in
> >>>> RDFa or Turtle.  Should the WG align with this advice and say this
> >>>> feature is only to be used for backward compatibility?  (That is,
> >>>> RDF/XML parsers must continue to support the syntax, and libraries
> >>>> should allow applications to use the features to interoperate with
> >>>> legacy RDF systems.)
> >>>> 
> >>>> Note that many or all of the use cases of reification are also use
> >>>> cases for [GRAPHS].  The decision about the fate of reification is
> >>>> connected with what happens with [GRAPHS].
> >>> 
> >>> 
> >>> Might reification undergo a renaissance when provenance comes back into fashion?  Couldn't we consider reification a degenerate case of a named graph?
> >>> 
> >>> We might want to go slowly on this one...
> >> 
> >> I think it's one of the candidate solutions for the GRAPHS use cases.
> >> My guess is it's unlikely to survive, but who knows.  :-)
> >> 
> >> Maybe I should move it from [Cleanup tasks] to [GRAPHS] ?
> > 
> > People objected to reification for inference and syntax reasons.
> > 
> > INFERENCE
> > The inference issues boil down to the fact that rules applicable to a
> > flat graph must be transformed when applied to a reified graph. The
> > principal exemplar being owl:sameAs:
> >  <LoisLane> <says> [ rdf:s <Superman> ; rdf:p <can> ; rdf:o <fly> ] .
> >  <Superman> owl:sameAs <ClarkKent> .
> > Applying the sameAs to the reified graph tells you that Lois Lane says
> > that Clark Kent can fly, just as it would if you applied it to all
> > symbols in
> >  <SYSTEM> { <LoisLane> <uttered> <G1> . }
> >  <G1> { <Superman> <can> <fly> . }
> > 
> > If we want to use graphs for quoting, we have to be judicious about the
> > application of sameAs. Perhaps we want our <SYSTEM> to infer that if
> >  <Superman> <canBeatUp> <LexLuther> .
> > then
> >  <ClarkKent> <canBeatUp> <LexLuther> .
> > Of course, we can be equally judicious about the application of sameAs
> > in the reified world, using a rule like:
> >  { ?X owl:sameAs ?Y .
> >    <SYSTEM> <holds> [ rdf:s ?X ; rdf:p ?p ; rdf:o ?o ] . }
> >  => 
> >  { <SYSTEM> <holds> [ rdf:s ?Y ; rdf:p ?p ; rdf:o ?o ] . }
> > 
> > In short, I'm not convinced that named graphs offer any more quoting
> > ability than reification. We just can't mix reified and non-reified
> > statements. (More precisely, we need to know which statements are
> > reified, much as we need to know if a statement is inside {}s.)
> > 
> > 
> > SYNTAX
> > We can define a predicate <uttered> to encode quoting in named graphs:
> >  uttered: asserts that the subject asserted all of the statements
> >           in the graph named in the object.
> >  <SYSTEM> { <LoisLane> <uttered> <G1> .
> >             <Superman> <canBeatUp> <LexLuther> .}
> >  <G1> { <Superman> <can> <fly> . }
> > or reification:
> >  uttered: asserts that the subject asserted the dereification of the
> >           objects of the <holds> arc from the object. [wordsmithing opportunity]
> >  <SYSTEM> <holds> [ rdf:s <LoisLane> ; rdf:p <uttered> ; rdf:o <G1> ] ,
> >                   [ rdf:s <Superman> ; rdf:p <canBeatUp> ; rdf:o <LexLuther> ] .
> >  <G1> <holds> [ rdf:s <Superman> ; rdf:p <can> ; rdf:o <fly> ] .
> > or more simply in N3:
> >  uttered: asserts that the subject asserted the statements in the object.
> >  <SYSTEM> <holds> { <LoisLane> <uttered> { <Superman> <can> <fly> . } .
> >                     <Superman> <canBeatUp> <LexLuther> . } .
> > 
> > What happens when Lois says that Lex says that Superman can fly?
> > named graphs:
> >  <SYSTEM> { <LoisLane> <uttered> <G1> .
> >             <Superman> <canBeatUp> <LexLuther> . }
> >  <G1> { <LexLuther> <uttered> <G2> . }
> >  <G2> { <Superman> <can> <fly> . }
> > reification:
> >  <SYSTEM> <holds> [ rdf:s <LoisLane> ; rdf:p <uttered> ; rdf:o <G1> ] ,
> >                   [ rdf:s <Superman> ; rdf:p <canBeatUp> ; rdf:o <LexLuther> ] .
> >  <G1> <holds> [ rdf:s <LexLuther> ; rdf:p <uttered> ; rdf:o <G2> ] .
> >  <G2> <holds> [ rdf:s <Superman> ; rdf:p <can> ; rdf:o <fly> ] .
> > n3:
> >  <SYSTEM> <holds> {
> >    <LoisLane> <uttered> {
> >      <LexLuther> <uttered>  {
> >        <Superman> <can> <fly> . } . } .
> >    <Superman> <canBeatUp> <LexLuther> . }
> > 
> > SPARQL syntax might lead us to believe that queries can use nesting to
> > match she-said-he-said quotes, but I don't think there's any distinction
> > between (here arbitrarily promoting <SYSTEM> to the default graph):
> >  ASK {
> >    ?she <uttered> ?g1
> >    GRAPH ?g1 {
> >      ?he <uttered> ?g2
> >      GRAPH ?g2 {
> >        <Superman> <can> <fly>
> >      }
> >    }
> >  }
> > and
> >  ASK {
> >    ?she <uttered> ?g1
> >    GRAPH ?g1 {
> >      ?he <uttered> ?g2
> >    }
> >    GRAPH ?g2 {
> >      <Superman> <can> <fly>
> >    }
> >  }
> > 
> > The real challenge for named graphs comes when we don't have names for
> > our speech acts. Reification causes no problem:
> >  <SYSTEM> <holds> [ rdf:s <LoisLane> ; rdf:p <uttered> ; rdf:o _:g1 ] .
> >  _:g1 <holds> [ rdf:s <LexLuther> ; rdf:p <uttered> ; rdf:o _:g2 ] .
> > but named graphs require bnode scope to escape the graph boundaries:
> >  <SYSTEM> { <LoisLane> <uttered> _:g1 . }
> >  _:g1 { <LexLuther> <uttered> _:g2 . }
> > Critics of bnodes will no doubt say "invent names for your speech acts",
> > but "honor the names you invented" is a pretty heavy burden compared to
> > having to write out reification.
> > 
> > -- 
> > -ericP
> > 
>
Received on Friday, 8 April 2011 04:42:42 UTC