- From: Frank Manola <fmanola@mitre.org>
- Date: Thu, 14 Jun 2001 15:56:42 -0400
- To: w3c-rdfcore-wg <w3c-rdfcore-wg@w3.org>
I had undertaken to produce some "test cases" to try to clarify some of the issues revolving around RDF reification (including places where related concepts in RDF might be unclear), based on earlier discussions with Dan Brickley and others. In the meantime, the notion of "test cases" has undergone a bit of an evolution from what we were originally talking about. It now means an explicit set of RDF triples, together with expected output. This "evolved" definition fits an intuitive definition of "test cases" better than what we were talking about, so I'll call what I'm doing here "test questions" for now (with actual "test cases" possibly to follow).

This isn't as clean, thoroughly thought out, or complete as I'd like (I got hauled off to do some other stuff in the meantime), but I thought I'd get my initial ideas out to give the WG a whack at them. In some cases, I think the answers to the questions might need some serious thinking. In other cases, the answers may be straightforward, but we may need to do some serious wordsmithing to clearly express that intention in the M&S.

Following each question (or collection of questions) is some discussion. The first question is Dan's original one ("RQ" stands (here) for "reification question").

--Frank

RQ1: Are members of the class rdf:Statement uniquely picked out by their predicate/subject/object properties?

This seems fairly reasonable, since the Formal Model (M&S Section 5) says that each element of Statements is a predicate/subject/object triple, and there isn't anything else to identify the members. On the other hand, it's reasonable that the same predicate/subject/object triple will appear in different "places" (e.g., several people record the same metadata about a given Web resource).
If we consider URIs as identifying *appearances* of these triples, we can imagine multiple URIs identifying the same triple (distinguishing the multiple appearances, or having different identities from the perspective of different *identifying* authorities), not unlike the idea that multiple URIs might identify the same real world thing (like a person). Or do we consider the predicate/subject/object triple as being the URI, and these other URIs as identifying the appearances (or something else)?

RQ2: M&S section 5 says that there is a set called Statements (whose elements are triples). What is the intended scope of this set? That is, is this intended to be a conceptual extension (for language specification purposes only) of class Statements that includes all RDF statements anywhere? Is it intended to be possible to have subsets of this set representing specific collections of RDF statements (e.g., a collection of statements made to describe a given resource)?

a. Section 5 also says "We can view *a* set of statements as a directed labeled graph...", which seems to suggest that multiple sets of statements are possible. On the other hand, we (equivalently) can ask the question "how many graphs are there?" One (corresponding to all statements in Statements)? Many (which again suggests there are subsets of Statements)? Note that M&S also says "A statement and its corresponding reified statement exist independently in *an* [not *the*] RDF graph and either may be present without the other."

[Note that, while it may be really obvious that there are going to be subsets of Statements, the M&S doesn't explicitly talk about that very clearly. One thing that the M&S, or some related document, could use is some more thoroughly developed Use Cases that go beyond the current examples to show how various collections of the kinds of descriptions used in the examples are represented in the Web, are accessed when needed, are reified and unreified if necessary, etc.]

b. "set" is not an RDF-defined collection; "bag" is the closest. So we cannot describe the formal model in RDF(?)

c. If "set" is taken literally, and "the class rdf:Statement" is taken to refer to a single set of all RDF statements anywhere, it seems that the answer to RQ1 must be "yes", because there is no other way to uniquely identify the triples.

---------------

RQ3: M&S Section 4.1 says "If, instead, we write the sentence 'Ralph Swick says that Ora Lassila is the creator of the resource http://www.w3.org/Home/Lassila' we have said nothing about the resource http://www.w3.org/Home/Lassila; instead, we have expressed a fact about a statement Ralph has made." If we use reification to write the RDF for the Ralph Swick example, have we in fact "expressed a fact about a statement Ralph has made"? Alternatively, what is the thing that we have expressed a fact about? Alternatively (again), in what sense is the reified statement really a statement?

a. M&S says "A statement and its corresponding reified statement exist independently in an RDF graph and either may be present without the other. The RDF graph is said to contain the fact given in the statement if and only if the statement is present in the graph, irrespective of whether the corresponding reified statement is present." This suggests that the statement Ralph purportedly made may not be there; only its reification is. If we say Ralph Swick says X and X is only present in reified form (not present as a fact), what is X? How do we know what Ralph said? Note that we do not discuss conversion back and forth between reified and unreified statements. E.g., I might want to collect all the things Ralph Swick said, convert them to statements, and determine if they were consistent.

b. The formal model says "facts (that is, statements) are triples that are members of Statements".
This suggests that the thing Ralph said *isn't* a statement (otherwise it would be a fact), so how can we say we're expressing a fact about a *statement* Ralph made?

c. The intended semantics seem to be something like "there exists a statement (that I now create) that I want to attribute to Ralph Swick." This is consistent with the idea that both the statement and its reification are in Statements. However, if only the reification is in Statements, in what sense is the original statement (the one I want to attribute to Ralph Swick) a statement (since it's not in Statements)?

------------

RQ4: What is the significance of an RDF graph "containing a fact"? Is someone asserting that something is true? Assuming that there are multiple graphs, what is the significance of apparently contradictory "facts" in multiple graphs?

[We don't really say anything about this stuff. A model theory for RDF would help deal with this.]

------------

RQ5: M&S section 4.1 says "Reification is also needed to represent explicitly in the model the statement grouping implied by Description elements." Why (or under what circumstances) is it necessary to explicitly represent this grouping? And should this idea be extended to other groupings of statements (e.g., a group of statements made to describe a single resource, or intended to be consistent with respect to some model)? That is, is it always necessary to reify groups of statements in order to indicate they constitute a group, or only sometimes? If the latter, which times? Why?

The Description element is introduced in the RDF syntax as a shorthand to allow multiple statements to be made about the same resource without repeating the resource identifier. E.g., the example

  <rdf:RDF>
    <rdf:Description about="http://www.w3.org/Home/Lassila">
      <s:Creator>Ora Lassila</s:Creator>
      <s:Title>Ora's Home Page</s:Title>
    </rdf:Description>
  </rdf:RDF>

results in two triples being generated.
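
As a minimal sketch of how the Description shorthand expands into one triple per property element, the example can be run through a small parser. The namespace declarations are assumptions added so the fragment parses (the example as quoted omits them, and the `s:` namespace URI in particular is a guess); `description_triples` is a hypothetical helper, not part of any RDF toolkit.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
S_NS = "http://description.org/schema/"  # assumed; not given in the message

# The example from the message, with namespace declarations added (assumption).
DOC = f"""
<rdf:RDF xmlns:rdf="{RDF_NS}" xmlns:s="{S_NS}">
  <rdf:Description about="http://www.w3.org/Home/Lassila">
    <s:Creator>Ora Lassila</s:Creator>
    <s:Title>Ora's Home Page</s:Title>
  </rdf:Description>
</rdf:RDF>
"""


def description_triples(xml_text):
    """Return one (predicate, subject, object) triple per property element."""
    root = ET.fromstring(xml_text.strip())
    triples = []
    for desc in root.findall(f"{{{RDF_NS}}}Description"):
        subject = desc.get("about")  # shared by every property in the element
        for prop in desc:
            # ElementTree spells qualified names as "{namespace}local".
            predicate = prop.tag.replace("{", "").replace("}", "")
            triples.append((predicate, subject, prop.text))
    return triples


if __name__ == "__main__":
    for triple in description_triples(DOC):
        print(triple)
```
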
However, if a bagID is specified, the example

  <rdf:RDF>
    <rdf:Description about="http://www.w3.org/Home/Lassila" bagID="D_001">
      <s:Creator>Ora Lassila</s:Creator>
      <s:Title>Ora's Home Page</s:Title>
    </rdf:Description>
  </rdf:RDF>

results in 13 triples being generated. (NB: there is an issue relating to the generation of these bags already identified).

a. One explanation for this is that this is intended to suggest a way of recording syntactic context (in this case, that several statements come from the same Description element) in RDF. That is, you generate a resource representing the context (a bag representing the Description element in this case), reify each of the contained statements, and add all the resulting triples (including the triples representing the original statements) to Statements. Presumably this approach could be extended to other types of syntactic contexts as well (all the RDF statements on a given Web page, for example). However, this suggests the need for some principle for specifying when to include just the reifications, and when to include the original statements as well. (Also, this seems an extreme way of representing contextual information, since the number of statements balloons enormously).

b. As noted above, while RDF defines what the reified model of a triple is, it at present contains no explicit mechanism (or operator) for moving between a triple and its reification (in either direction).

c. The PICS example (section 7.6) uses bagID, while the Dublin Core example (section 7.4) doesn't. Suppose we decide we want to attribute the specified collection of Dublin Core statements to some individual. Must we reify the whole collection? (Note that if the collection of statements is a separate resource, it has a URI that could be used without the need to reify them). If not, why not (and why can't this reason apply to other collections)? This clearly relates to the question below.
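
One reading that accounts for the count of 13 in the bagID example is: 2 original triples, 4 reification triples for each of them (type, predicate, subject, object), and 3 triples for the bag (its type plus one membership triple per statement). The sketch below works through that arithmetic and also shows what the missing "operator" in point b might look like. It is only an illustration: `reify`, `unreify`, and `expand_with_bag` are hypothetical names, the `_:stmtN` identifiers and the `s:` namespace URI are assumptions, and nothing here is defined by the M&S.

```python
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
S = "http://description.org/schema/"  # assumed; not given in the message


def reify(triple, node):
    """Return the four triples that model `triple` as the resource `node`."""
    predicate, subject, obj = triple
    return [
        (RDF + "type", node, RDF + "Statement"),
        (RDF + "predicate", node, predicate),
        (RDF + "subject", node, subject),
        (RDF + "object", node, obj),
    ]


def unreify(triples, node):
    """Recover the triple modeled by `node`, or None if the model is incomplete."""
    parts = {p: o for (p, s, o) in triples if s == node}
    if parts.get(RDF + "type") != RDF + "Statement":
        return None
    try:
        return (parts[RDF + "predicate"], parts[RDF + "subject"], parts[RDF + "object"])
    except KeyError:
        return None


def expand_with_bag(statements, bag_id):
    """Original statements, plus a bag whose members are their reifications."""
    bag = "#" + bag_id
    out = list(statements)                        # the facts themselves
    out.append((RDF + "type", bag, RDF + "Bag"))  # the bag resource
    for i, statement in enumerate(statements, start=1):
        node = f"_:stmt{i}"                       # hypothetical generated identifier
        out.append((RDF + f"_{i}", bag, node))    # bag membership
        out.extend(reify(statement, node))        # four triples per statement
    return out


if __name__ == "__main__":
    page = "http://www.w3.org/Home/Lassila"
    statements = [
        (S + "Creator", page, "Ora Lassila"),
        (S + "Title", page, "Ora's Home Page"),
    ]
    graph = expand_with_bag(statements, "D_001")
    print(len(graph))                  # 2 + 3 + 2 * 4
    print(unreify(graph, "_:stmt1"))   # recovers the first statement
```

Note that `unreify` needs only the reification triples, so it works equally well on a graph from which the original facts have been left out; whether the recovered triple should then count as a "fact" is exactly the open question in RQ3.
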
-----------------------

RQ6: M&S Section 4.1 says "Statements are made about resources. A model of a statement is the resource we need in order to be able to make new statements (higher-order statements) about the modeled statement." Is this "model of a statement" really needed in order to make statements about statements?

a. One of the points noted in the rdf-logic discussions is that, in logic, "higher-order statements" don't mean "statements about statements".

b. The line of reasoning here is presumably that, if statements can only be made about resources, the only way to make statements about statements is to make the latter statements resources. This means they must have URIs. However, why is this particular model needed in order for URIs to be assigned to statements? Moreover, why does the M&S have to specify *any* mechanism for assigning URIs to statements? RDF is specified independently of how any other resources (which may be of arbitrary complexity) are assigned URIs. Moreover, RDF statements might be represented in many different concrete formats, each of which has a particularly suitable way of assigning URIs.

[There is, in fact, some intuitive reason why there ought to be some way of modeling statements about statements, e.g., attribution. Conceptual graphs, for example, have such a mechanism. However, this involves more than a kind of "DOM" model or infoset for the statement.]

--
Frank Manola
The MITRE Corporation
202 Burlington Road, MS A345
Bedford, MA 01730-1420
mailto:fmanola@mitre.org
voice: 781-271-8147
FAX: 781-271-8752
Received on Thursday, 14 June 2001 15:57:26 UTC