Re: reification "test questions": first crack from Graham Klyne on 2001-06-15 (w3c-rdfcore-wg@w3.org from June 2001)

From: Graham Klyne <Graham.Klyne@Baltimore.com>
Date: Fri, 15 Jun 2001 14:15:43 +0100
To: fmanola@mitre.org
Cc: w3c-rdfcore-wg <w3c-rdfcore-wg@w3.org>
Message-Id: <5.0.2.1.2.20010615122807.03653a90@joy.songbird.com>
Frank,

Some thoughts on your questions.  This is a long response.  The original
questions are at the end of this message.  I've tried to provide some
material that might find its way into test cases, etc., in due course.

...

RQ1.  Consider the following test case:
---------------------------------------

<?xml version="1.0"?>
<!--    Issue: ?
               Is a reified statement unique in Statements?
-->
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
   <rdf:Statement about="http://example.org/statement1">
     <rdf:predicate rdf:resource="http://example.org/predicate" />
     <rdf:subject rdf:resource="http://example.org/subject" />
     <rdf:object rdf:resource="http://example.org/object" />
   </rdf:Statement>
   <rdf:Statement about="http://example.org/statement2">
     <rdf:predicate rdf:resource="http://example.org/predicate" />
     <rdf:subject rdf:resource="http://example.org/subject" />
     <rdf:object rdf:resource="http://example.org/object" />
   </rdf:Statement>
</rdf:RDF>


N-triples (output by SirPAC):

<http://example.org/statement1>
   <http://www.w3.org/1999/02/22-rdf-syntax-ns#type>
   <http://www.w3.org/1999/02/22-rdf-syntax-ns#Statement> .
<http://example.org/statement1>
   <http://www.w3.org/1999/02/22-rdf-syntax-ns#predicate>
   <http://example.org/predicate> .
<http://example.org/statement1>
   <http://www.w3.org/1999/02/22-rdf-syntax-ns#subject>
   <http://example.org/subject> .
<http://example.org/statement1>
   <http://www.w3.org/1999/02/22-rdf-syntax-ns#object>
   <http://example.org/object> .
<http://example.org/statement2>
   <http://www.w3.org/1999/02/22-rdf-syntax-ns#type>
   <http://www.w3.org/1999/02/22-rdf-syntax-ns#Statement> .
<http://example.org/statement2>
   <http://www.w3.org/1999/02/22-rdf-syntax-ns#predicate>
   <http://example.org/predicate> .
<http://example.org/statement2>
   <http://www.w3.org/1999/02/22-rdf-syntax-ns#subject>
   <http://example.org/subject> .
<http://example.org/statement2>
   <http://www.w3.org/1999/02/22-rdf-syntax-ns#object>
   <http://example.org/object> .


This suggests that different resources that reify the same statement are 
allowable (to prevent the above from being valid RDF would seem perverse to 
me).  It doesn't answer the question as asked:  I think that would be a 
matter for the formal semantics.
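
To make the observation concrete, here is a rough Python sketch (not
taken from any spec) that groups reifying resources by the triple they
describe.  The function name reifications_by_statement is my own
invention, and keeping one value per property is a simplifying
assumption.  Fed the eight triples above, it finds two resources
describing one and the same subject/predicate/object triple.

```python
from collections import defaultdict

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
EX = "http://example.org/"

# The eight triples of the RQ1 test case, as (subject, predicate, object).
triples = [
    (EX + name, RDF + prop, value)
    for name in ("statement1", "statement2")
    for prop, value in (
        ("type", RDF + "Statement"),
        ("predicate", EX + "predicate"),
        ("subject", EX + "subject"),
        ("object", EX + "object"),
    )
]


def reifications_by_statement(triples):
    """Map each described (s, p, o) to the set of resources reifying it.

    Hypothetical helper: assumes at most one value per property per resource.
    """
    props = defaultdict(dict)
    for s, p, o in triples:
        props[s][p] = o

    grouped = defaultdict(set)
    for resource, d in props.items():
        if d.get(RDF + "type") != RDF + "Statement":
            continue  # not typed as a reified statement
        try:
            described = (d[RDF + "subject"], d[RDF + "predicate"], d[RDF + "object"])
        except KeyError:
            continue  # incomplete reification: skip it
        grouped[described].add(resource)
    return dict(grouped)


grouped = reifications_by_statement(triples)
for described, resources in grouped.items():
    print(described, "<-", sorted(resources))
```

Whether the two resources denote the same member of Statements is
exactly what this grouping cannot decide; it only shows that the
syntax permits both.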


RQ2.  Statement sets
--------------------

I think the set of Statements is something from the "domain of 
interpretation" of RDF, rather than a part of the language.

I think the questions here would be answered by some more formality in the 
definition of the RDF abstract syntax and semantics;  e.g. I think an "RDF 
graph" is a well formed expression of the RDF language, whose abstract 
syntax might be defined something like this:


Terminal symbols:

   N : Nodes      (may be represented by Qnames or URIs)
   L : Literals   (may be represented by strings)
   P : Properties (may be represented by Qnames or URIs)

   rdf:type       (distinguished member of Properties)
   rdf:subject    (distinguished member of Properties)
   rdf:object     (distinguished member of Properties)
   rdf:predicate  (distinguished member of Properties)

   rdf:Statement  (distinguished member of Nodes)

   [ ]            ("punctuation" literals)

Nonterminal symbols:

   G : Graphs         (distinguished symbol of this syntax)
   S : Statements
   V : Values         (nodes or literals)
   R : Reifications
   A : Single triples (used in the productions below)

Productions:

::=    denotes a production in the syntax metalanguage,
|      denotes alternative productions in the syntax metalanguage,
<NULL> is a placeholder for an empty sequence of symbols

   G ::= S | G G | <NULL>

   S ::= R | A

   R ::= [ N rdf:type rdf:Statement ]
         [ N rdf:predicate P ]
         [ N rdf:subject N ]
         [ N rdf:object V ]

   A ::= [ N P V ]

   V ::= N | L


Note this abstract syntax is independent of any RDF surface syntax.  It 
calls for interpretations of the terminal symbols that stand for Nodes, 
Literals, Properties, Graphs, Statements and Values.  Also for 
interpretation of the syntax productions that combine these into bigger 
structures.
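
As a sanity check on the productions, here is a small Python sketch of
a recognizer for them.  Everything in it is an assumption for
illustration: nodes and properties are plain strings, literals are a
hypothetical Literal class, a graph is an ordered list of 3-tuples,
and the names is_A, is_R and parse_G are mine.  Note that each triple
of an R also matches A, so the grammar as I wrote it is ambiguous; the
sketch simply prefers R when four consecutive triples fit.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Literal:
    """L : a literal, represented by a string."""
    text: str


RDF_TYPE = "rdf:type"
RDF_SUBJECT = "rdf:subject"
RDF_PREDICATE = "rdf:predicate"
RDF_OBJECT = "rdf:object"
RDF_STATEMENT = "rdf:Statement"


def is_node(x):
    # N and P are both represented by strings (Qnames or URIs) here.
    return isinstance(x, str)


def is_value(x):
    # V ::= N | L
    return is_node(x) or isinstance(x, Literal)


def is_A(t):
    # A ::= [ N P V ]
    return len(t) == 3 and is_node(t[0]) and is_node(t[1]) and is_value(t[2])


def is_R(ts):
    # R ::= four triples about one node N, in the order of the production.
    if len(ts) != 4 or not all(is_A(t) for t in ts):
        return False
    n = ts[0][0]
    return (
        ts[0] == (n, RDF_TYPE, RDF_STATEMENT)
        and ts[1][:2] == (n, RDF_PREDICATE) and is_node(ts[1][2])
        and ts[2][:2] == (n, RDF_SUBJECT) and is_node(ts[2][2])
        and ts[3][:2] == (n, RDF_OBJECT)
    )


def parse_G(ts):
    # G ::= S | G G | <NULL>, with S ::= R | A; R is tried first.
    out, i = [], 0
    while i < len(ts):
        if is_R(ts[i:i + 4]):
            out.append(("R", tuple(ts[i:i + 4])))
            i += 4
        elif is_A(ts[i]):
            out.append(("A", ts[i]))
            i += 1
        else:
            raise ValueError(f"not a well-formed triple: {ts[i]!r}")
    return out


graph = [
    ("ex:statement", RDF_TYPE, RDF_STATEMENT),
    ("ex:statement", RDF_PREDICATE, "ex:creator"),
    ("ex:statement", RDF_SUBJECT, "ex:Lassila"),
    ("ex:statement", RDF_OBJECT, Literal("Ora Lassila")),
    ("ex:statement", "ex:attributedTo", Literal("Ralph Swick")),
]
print([kind for kind, _ in parse_G(graph)])
```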

One might adopt an interpretation that answers the RQ2 questions thus:
(a)
- each invocation of production A denotes a member of Statements.
- each invocation of production R denotes 4 members of Statements.
- each invocation of a production for G denotes a subset of Statements.
- multiple subsets of Statements are possible.
(b)
- the sets of statements are in the domain of interpretation of RDF, not 
the RDF language itself.  I think some metalanguage is needed to describe 
the formal model of RDF.
(c)
- I think I agree:  while there may be different resources that stand for a 
reification of a statement, they must all be interpreted as relating in 
some way to the same member of Statements.  In practice, this might 
correspond to the idea that if one "unreifies" ("asserts"?) different 
reifications of a statement, one ends up asserting the same statement.
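
One possible reading of (a) and (c), sketched in Python.  This is only
an illustration under my own assumptions: a graph is a list of
3-tuples, denote() treats the denoted subset of Statements as the set
of the graph's triples, and unreify() is a hypothetical operator that
RDF itself does not define.

```python
def reification(node, s, p, o):
    """One invocation of production R: four triples about `node`."""
    return [
        (node, "rdf:type", "rdf:Statement"),
        (node, "rdf:predicate", p),
        (node, "rdf:subject", s),
        (node, "rdf:object", o),
    ]


def denote(graph):
    """A graph denotes a subset of Statements (here: the set of its triples)."""
    return frozenset(graph)


def unreify(graph, node):
    """Hypothetical: return the statement described by the reification at `node`."""
    d = {p: o for s, p, o in graph if s == node}
    return (d["rdf:subject"], d["rdf:predicate"], d["rdf:object"])


g1 = reification("ex:statement1", "ex:subject", "ex:predicate", "ex:object")
g2 = reification("ex:statement2", "ex:subject", "ex:predicate", "ex:object")

# (a) each R denotes 4 members of Statements; different graphs give
#     different subsets.
print(len(denote(g1)), denote(g1) != denote(g2))

# (c) unreifying either reification yields the same statement.
both = g1 + g2
asserted = {unreify(both, n) for n in ("ex:statement1", "ex:statement2")}
print(asserted)
```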


RQ3.  Statements about statements
---------------------------------

I'll try and reduce the example to the simplest test case:

<?xml version="1.0"?>
<!--    Issue: ?
               Statements about statements?
-->
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
          xmlns:a="http://example.org/" >
   <rdf:Statement about="http://example.org/statement">
     <rdf:predicate rdf:resource="http://example.org/creator" />
     <rdf:subject rdf:resource="http://www.w3.org/Home/Lassila" />
     <rdf:object>Ora Lassila</rdf:object>
     <a:attributedTo>Ralph Swick</a:attributedTo>
   </rdf:Statement>
</rdf:RDF>


<http://example.org/statement>
   <http://www.w3.org/1999/02/22-rdf-syntax-ns#type>
   <http://www.w3.org/1999/02/22-rdf-syntax-ns#Statement> .
<http://example.org/statement>
   <http://www.w3.org/1999/02/22-rdf-syntax-ns#predicate>
   <http://example.org/creator> .
<http://example.org/statement>
   <http://www.w3.org/1999/02/22-rdf-syntax-ns#subject>
   <http://www.w3.org/Home/Lassila> .
<http://example.org/statement>
   <http://www.w3.org/1999/02/22-rdf-syntax-ns#object>
   "Ora Lassila" .
<http://example.org/statement>
   <http://example.org/attributedTo>
   "Ralph Swick" .

(a)
- The "attributedTo" property has as its subject a resource that stands for 
a reification of the statement.  I think some applications might want to 
interpret this as a modal operator on the statement denoted by the reification.
(b)
- Again, I think this is a distinction between the language and its domain 
of interpretation:  reification is in the language;  statements are in the 
domain of interpretation.
(c)
- You say "there exists a statement (that I now create)...".  I don't think 
it's right to say "create" here:  would "denote" or "describe" be closer?


RQ4.  Significance of a fact in a graph
---------------------------------------

I agree -- a model theory would help.  I also disagree with DanC's comments 
that a statement that appears in a graph is, ipso facto, true.  I think we 
need an interpretation that can assign truth or falsity to any 
statement.  Then we can say that a graph is assigned Truth if all its 
statements are assigned Truth, otherwise False.

In this, the reification of a statement is NOT a statement.  I would view 
the 4 statements that form a reification as always being assigned Truth if 
the reification is well formed (according to the syntax and whatever else 
is required for a well-formed reification).
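
A sketch of the kind of truth assignment I have in mind, in Python.
The details are assumptions of mine, not a proposed model theory: an
interpretation is a dict from triples to booleans, unassigned triples
default to False, and "well formed" is taken to mean an rdf:type
rdf:Statement triple plus exactly one each of rdf:predicate,
rdf:subject and rdf:object on the same node.

```python
REIFICATION_PROPS = ("rdf:predicate", "rdf:subject", "rdf:object")
TYPE_STATEMENT = ("rdf:type", "rdf:Statement")


def reification_nodes(graph):
    """Nodes that carry a well-formed reification (assumed definition above)."""
    nodes = set()
    for n in {s for s, _, _ in graph}:
        own = [(p, o) for s, p, o in graph if s == n]
        complete = all(
            sum(1 for p, _ in own if p == q) == 1 for q in REIFICATION_PROPS
        )
        if TYPE_STATEMENT in own and complete:
            nodes.add(n)
    return nodes


def graph_truth(graph, interpretation):
    """A graph is True iff every one of its statements is assigned True."""
    reified = reification_nodes(graph)

    def truth(t):
        s, p, o = t
        # The 4 statements of a well-formed reification are always True.
        if s in reified and (p in REIFICATION_PROPS or (p, o) == TYPE_STATEMENT):
            return True
        return interpretation.get(t, False)

    return all(truth(t) for t in graph)


fact = ("ex:Lassila", "ex:creator", "Ora Lassila")
reif = [
    ("ex:statement", "rdf:type", "rdf:Statement"),
    ("ex:statement", "rdf:predicate", "ex:creator"),
    ("ex:statement", "rdf:subject", "ex:Lassila"),
    ("ex:statement", "rdf:object", "Ora Lassila"),
]

# The reification alone is True even when the described statement is False.
print(graph_truth(reif, {fact: False}))
# Asserting the statement as well makes the graph depend on its truth.
print(graph_truth(reif + [fact], {fact: False}))
print(graph_truth(reif + [fact], {fact: True}))
```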


RQ5.  Reification to express statement grouping
-----------------------------------------------

I view this as a variation on attribution.  For example, the fact that
a statement was obtained from some trusted source.  Or that a statement
was made in the same context as some other statements.  I think there
may be subtleties here that don't belong in the RDF core.

(a)
- yes, I think it can express syntactic context (among others)
- when to include just reifications?  When the statements are not facts, I 
think.  (The bagID syntax does not appear to provide a way to do this).
(b)
- explicit reification/unreification operators?  Sounds hairy to me.
(c)
- in talking about "collection of statements" I think there may be some 
confusion between the language and its domain of interpretation.
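
On the count of 13 triples in the bagID example quoted at the end of
this message: on my reading of M&S, it breaks down as 1 triple typing
the bag, plus, for each of the 2 statements, the statement itself, 1
bag membership triple and 4 reification triples.  A Python sketch of
that arithmetic follows; the helper name and the generated statement
identifiers are hypothetical, and the bag generation details are the
subject of an open issue.

```python
def description_with_bag(about, properties, bag_id):
    """Triples for a Description carrying a bagID (assumed expansion)."""
    triples = [(bag_id, "rdf:type", "rdf:Bag")]
    for i, (p, o) in enumerate(properties, start=1):
        stmt = f"{bag_id}_stmt{i}"  # hypothetical generated identifier
        triples.append((about, p, o))                 # the statement itself
        triples.append((bag_id, f"rdf:_{i}", stmt))   # bag membership
        triples += [                                  # its reification
            (stmt, "rdf:type", "rdf:Statement"),
            (stmt, "rdf:predicate", p),
            (stmt, "rdf:subject", about),
            (stmt, "rdf:object", o),
        ]
    return triples


bagged = description_with_bag(
    "http://www.w3.org/Home/Lassila",
    [("s:Creator", "Ora Lassila"), ("s:Title", "Ora's Home Page")],
    "D_001",
)
print(len(bagged))  # 1 + 2 * (1 + 1 + 4)
```

Dropping the two "statement itself" triples would give the
reifications-only form, which is the case the bagID syntax does not
seem able to express.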


RQ6.  "Model of statement"
--------------------------

I think the term "model" here may be unfortunate;  I think what is meant 
here is the inverse of "model" in the model theoretic sense.  I'd be 
inclined to say that the reification is a way to create a sub-expression in 
RDF that denotes the subject of statements-about-statements.

(a)
- No further comment.
(b)
- I see nothing special about the RDF device for making statements about 
statements, other than that it exists in the specification.  It is probably 
true that there are other, better ways to achieve the same end and I, for 
one, won't squeal if the group decides to develop some cleaner, easier 
alternative.

#g
--


At 03:56 PM 6/14/01 -0400, Frank Manola wrote:
>RQ1:  Are members of the class rdf:Statement uniquely picked out by
>their predicate/subject/object properties?
>
>This seems fairly reasonable, since the Formal Model (M&S Section 5)
>says that each element of Statements is a predicate/subject/object
>triple, and there isn't anything else to identify the members.  On the
>other hand, it's reasonable that the same predicate/subject/object
>triple will appear in different "places" (e.g., several people record
>the same metadata about a given Web resource).  If we consider URIs as
>identifying *appearances* of these triples, we can imagine multiple URIs
>identifying the same triple (distinguishing the multiple appearances, or
>having different identities from the perspective of different
>*identifying* authorities), not unlike the idea that multiple URIs might
>identify the same real world thing (like a person).  Or do we consider
>the predicate/subject/object triple as being the URI, and these other
>URIs as identifying the appearances (or something else)?
>
>RQ2:  M&S section 5 says that there is a set called Statements (whose
>elements are triples). What is the intended scope of this set?  That is,
>is this intended to be a conceptual extension (for language
>specification purposes only) of class Statements that includes all RDF
>statements anywhere?  Is it intended to be possible to have subsets of
>this set representing specific collections of RDF statements (e.g., a
>collection of statements made to describe a given resource)?
>
>a. Section 5 also says "We can view *a* set of statements as a directed
>labeled graph...", which seems to suggest that multiple sets of
>statements are possible. On the other hand, we (equivalently) can ask
>the question "how many graphs are there?"  One (corresponding to all
>statements in Statements)?  Many (which again suggests there are subsets
>of Statements)?  Note that M&S also says "A statement and its
>corresponding reified statement exist independently in *an* [not *the*]
>RDF graph and either may be present without the other."  [Note that,
>while it may be really obvious that there are going to be subsets of
>Statements, the M&S doesn't explicitly talk about that very clearly.
>One thing that the M&S, or some related document, could use is some more
>thoroughly-developed Use Cases that go beyond the current examples to
>show how various collections of the kinds of descriptions used in the
>examples are represented in the Web, are accessed when needed, are
>reified and unreified if necessary, etc.]
>
>b.  "set" is not an RDF-defined collection;  "bag" is the closest. So we
>cannot describe the formal model in RDF(?)
>c.  If "set" is taken literally, and "the class rdf:Statement" is taken
>to refer to a single set of all RDF statements anywhere, it seems that
>the answer to RQ1 must be "yes", because there is no other way to
>uniquely identify the triples.
>
>---------------
>
>RQ3:  M&S Section 4.1 says "If, instead, we write the sentence 'Ralph
>Swick says that Ora Lassila is the creator of the resource
>http://www.w3.org/Home/Lassila' we have said nothing about the resource
>http://www.w3.org/Home/Lassila; instead, we have expressed a fact about
>a statement Ralph has made."   If we use reification to write the RDF
>for the Ralph Swick example, have we in fact "expressed a fact about a
>statement Ralph has made"?  Alternatively, what is the thing that we
>have expressed a fact about?  Alternatively (again), in what sense is the
>reified statement really a statement?
>
>a.  M&S says "A statement and its corresponding reified statement
>exist independently in an RDF graph and either may be present without
>the other.  The RDF graph is said to contain the fact given in the
>statement if and only if the statement is present in the graph,
>irrespective of whether the corresponding reified statement is present."
>This suggests that the statement Ralph purportedly made may not be
>there;  only its reification is.
>
>If we say Ralph Swick says X and X is only present in reified form (not
>present as a fact) what is X?  How do we know what Ralph said?  Note
>that we do not discuss conversion back and forth between reified and
>unreified statements.  E.g., I might want to collect all the things
>Ralph Swick said, convert them to statements, and determine if they were
>consistent.
>
>b.  The formal model says "facts (that is, statements) are triples that
>are members of Statements".  This suggests that the thing Ralph said
>*isn't* a statement (otherwise it would be a fact), so how can we say
>we're expressing a fact about a *statement* Ralph made?
>
>c.  The intended semantics seem to be something like "there exists a
>statement (that I now create) that I want to attribute to Ralph Swick."
>This is consistent with the idea that both the statement and its
>reification are in Statements.  However, if only the reification is in
>Statements, in what sense is the original statement (the one I want to
>attribute to Ralph Swick) a statement (since it's not in Statements)?
>
>------------
>RQ4:  What is the significance of an RDF graph "containing a fact"?  Is
>someone asserting that something is true?  Assuming that there are
>multiple graphs, what is the significance of apparently contradictory
>"facts" in multiple graphs?  [We don't really say anything about this
>stuff.  A model theory for RDF would help deal with this.]
>
>------------
>
>RQ5:  M&S section 4.1 says "Reification is also needed to represent
>explicitly in the model the statement grouping implied by Description
>elements."  Why (or under what circumstances) is it necessary to
>explicitly represent this grouping?  And should this idea be extended to
>other groupings of statements (e.g., a group of statements made to
>describe a single resource, or intended to be consistent with respect to
>some model)?  That is, is it always necessary to reify groups of
>statements in order to indicate they constitute a group, or only
>sometimes?  If the latter, which times?  Why?
>
>The Description element is introduced in the RDF syntax as a shorthand
>to allow multiple statements to be made about the same resource without
>repeating the resource identifier.  E.g., the example
>
><rdf:RDF>
>  <rdf:Description about="http://www.w3.org/Home/Lassila">
>    <s:Creator>Ora Lassila</s:Creator>
>    <s:Title>Ora's Home Page</s:Title>
>  </rdf:Description>
></rdf:RDF>
>
>results in two triples being generated.  However, if a bagID is
>specified, the example
>
><rdf:RDF>
>  <rdf:Description about="http://www.w3.org/Home/Lassila" bagID="D_001">
>    <s:Creator>Ora Lassila</s:Creator>
>    <s:Title>Ora's Home Page</s:Title>
>  </rdf:Description>
></rdf:RDF>
>
>results in 13 triples being generated.
>
>(NB: there is an issue relating to the generation of these bags already
>identified).
>
>a.  One explanation for this is that this is intended to suggest a way
>of recording syntactic context (in this case, that several statements
>come from the same Description element) in RDF.  That is, you generate a
>resource representing the context (a bag representing the Description
>element in this case), reify each of the contained statements, and add
>all the resulting triples (including the triples representing the
>original statements) to Statements.  Presumably this approach could be
>extended to other types of syntactic contexts as well (all the RDF
>statements on a given Web page, for example).  However, this suggests
>the need for some principle for specifying when to include just the
>reifications, and when to include the original statements as well.
>(Also, this seems an extreme way of representing contextual information,
>since the number of statements balloons enormously).
>
>b. As noted above, while RDF defines what the reified model of a triple
>is, it at present contains no explicit mechanism (or operator) for
>moving between a triple and its reification (in either direction).
>
>c.  The PICS example (section 7.6) uses bagID, while the Dublin Core
>example (section 7.4) doesn't.  Suppose we decide we want to attribute
>the specified collection of Dublin Core statements to some individual.
>Must we reify the whole collection?  (Note that if the collection of
>statements is a separate resource, it has a URI that could be used
>without the need to reify them).  If not, why not (and why can't this
>reason apply to other collections)?  This clearly relates to the
>question below.
>
>-----------------------
>RQ6:  M&S Section 4.1 says "Statements are made about resources.  A
>model of a statement is the resource we need in order to be able to make
>new statements (higher-order statements) about the modeled statement."
>Is this "model of a statement" really needed in order to make statements
>about statements?
>
>a.  One of the points noted in the rdf-logic discussions is that, in
>logic, "higher-order statements" don't mean "statements about
>statements".
>
>b.  The line of reasoning here is presumably that, if statements can
>only be made about resources, the only way to make statements about
>statements is to make the latter statements resources.  This means they
>must have URIs.  However, why is this particular model needed in order
>for URIs to be assigned to statements?  Moreover, why does the M&S have
>to specify *any* mechanism for assigning URIs to statements?  RDF is
>specified independently of how any other resources (which may be of
>arbitrary complexity) are assigned URIs. Moreover, RDF statements might
>be represented in many different concrete formats, each of which has a
>particularly-suitable way of assigning URIs.  [There is, in fact, some
>intuitive reason why there ought to be some way of modeling statements
>about statements, e.g., attribution.  Conceptual graphs, for example,
>has such a mechanism.  However, this involves more than a kind of "DOM"
>model or infoset for the statement.]
>
>
>--
>Frank Manola                   The MITRE Corporation
>202 Burlington Road, MS A345   Bedford, MA 01730-1420
>mailto:fmanola@mitre.org      voice: 781-271-8147   FAX: 781-271-8752

------------------------------------------------------------
Graham Klyne                    Baltimore Technologies
Strategic Research              Content Security Group
<Graham.Klyne@Baltimore.com>    <http://www.mimesweeper.com>
                                 <http://www.baltimore.com>
------------------------------------------------------------
Received on Friday, 15 June 2001 09:23:20 UTC