Re: RDF-ISSUE-5 (Graph Literals): Should we define Graph Literal datatypes? [RDF Graphs] from Pat Hayes on 2011-03-05 (public-rdf-wg@w3.org from March 2011)

From: Pat Hayes <phayes@ihmc.us>
Date: Sat, 5 Mar 2011 08:51:24 -0600
To: nathan@webr3.org
Cc: RDF Working Group WG <public-rdf-wg@w3.org>
Message-Id: <4774D76C-58AE-4698-A2E3-E717D0F278F7@ihmc.us>
On Mar 5, 2011, at 6:50 AM, Nathan wrote:

> Pat Hayes wrote:
>> On Mar 4, 2011, at 3:59 PM, RDF Working Group Issue Tracker wrote:
>>> RDF-ISSUE-5 (Graph Literals): Should we define Graph Literal datatypes? [RDF Graphs]
>>> 
>>> http://www.w3.org/2011/rdf-wg/track/issues/5
>>> 
>>> Raised by: Sandro Hawke
>>> On product: RDF Graphs
>>> 
>>> We could define datatypes, such as ser:rdfxml and ser:turtle, whose
>>> lexical space is the set of valid document strings in RDF/XML, Turtle,
>>> etc, and whose value space contains the corresponding RDF graphs.
>>> 
>>> This would allow people to use ordinary RDF tools to express facts involving RDF graphs, such as that some graph was obtained from some URI at some point in time, or that some person claims some graph is true or false.
>> Allow me to cast doubt on this claim. I do not believe that graph literals (in contrast to named graphs) would in fact provide such functionality in practice. For several reasons.
>> 1. This would allow such 'metaRDF' descriptions only for the case where the object graph - the one being described - is completely specified by its full textual representation. This would make such metaRDF almost unusable for large object graphs, and exceedingly awkward, at best, for all but toy object graphs. For any graph, the g-text is a much more verbose way to refer to it than a URI would be.
>> 2. The full textual representation of a graph does not, ironically, serve to "identify" it in the sense required. Suppose I publish some RDF in a box with a URI. The URI identifies the box, but it does not identify the graph. The very same graph might be a snapshot of a different box with a different provenance and history and authority claiming it to be true. It is the box, not the graph, which will be asserted or will have a history or be deprecated, etc. But a graph literal of a snap of a box does not identify the box. Even if we say that such a literal identifies any box whose snap is equivalent to the literal, the task of checking such equivalence is NP-complete (an old result of Jeremy's) so we have hamstrung our implementations ahead of time. And this is probably not a good rule to adopt, in any case, even if it were computationally cheap.
>> 3. It is completely unnecessary, if we have named graphs. A named graph has a name which refers to it and identifies its box. Most descriptive languages, including RDF, use names in this way to make assertions about the things named. AFAIKS, nothing is gained by making such a graph into a literal instead of simply using its name to refer to it. And this use of graph names requires no changes to any RDF syntax (or indeed semantics.)
> 
> Pat,
> 
> What you say is true, including that quoted graphs or graph literals are all anonymous (your point 2). However, the need for them is quite different. Without them, how would one say that "ora did not write a book called moby dick", or say "on the 18th february g-box had a value of x"?
> 
> { [ :name "ora" ] :wrote [ :title "moby dick" ] } a :Falsehood .

We need to think very carefully before sanctioning negative statements like :Falsehood. If the graph contains bnodes, denying it can be making a very large claim. This basically introduces the universal quantifier into RDF, making it a full first-order logic. The potential for semantic disasters here is much larger than anything that RDF has had so far, so we need to be careful.
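
To spell out the quantifier point on the example above: a rough first-order reading, assuming the usual existential treatment of bnodes. The predicate names name, wrote and title simply mirror the properties in the quoted graph, and G is a label introduced here for that graph.

```latex
\begin{align*}
G &\equiv \exists x\, \exists y\, \bigl( \mathit{name}(x, \text{"ora"}) \land \mathit{wrote}(x, y) \land \mathit{title}(y, \text{"moby dick"}) \bigr) \\
\lnot G &\equiv \forall x\, \forall y\, \lnot \bigl( \mathit{name}(x, \text{"ora"}) \land \mathit{wrote}(x, y) \land \mathit{title}(y, \text{"moby dick"}) \bigr)
\end{align*}
```

So calling G a :Falsehood is not a claim about one particular ora and one particular book. It says that nothing at all named "ora" wrote anything at all titled "moby dick".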

> 
> { <a> <b> <c> } :uri <u> ; :retrieved "2011-02-18"^^xsd:date .
> 
> It's the ability to talk about a distinct set of triples/statements, or the "value" (g-snap) of a g-box at a certain time.

Well, a g-text can be treated as a g-box (just make a box which is 'fixed' and emits the text when poked.) So, you make said box and name it, and use that name.  Just in the way that you would put an image file into some HTML. 
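
A minimal sketch of that idea in Python, not a proposal for any API. The class name FixedBox, the example.org URI and the ex:retrieved property are all made up for illustration. The box is fixed, it emits the same g-text whenever it is poked, and the metadata is stated about its name rather than about a quoted copy of the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FixedBox:
    """A hypothetical 'fixed' g-box: it always emits the same g-text."""
    name: str  # URI used to refer to the box (illustrative only)
    text: str  # the g-text the box emits

    def poke(self) -> str:
        # A fixed box ignores time and state and returns its one text.
        return self.text


# Wrap the g-text from the example above in a box and give it a name.
box = FixedBox(name="http://example.org/g1", text="<a> <b> <c> .")

# Statements about the graph use the name as subject, not the text.
metadata = [
    (box.name, "ex:retrieved", "2011-02-18"),
]
```

The metadata triple stays the same size however large the text inside the box is, which is the practical advantage over quoting.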

Your toy examples illustrate my first point very well. Suppose what you want to say is that DBpedia on a certain date contained a particular error. Are you going to quote all of DBpedia?

> 
> When we have this ability, then we can do things such as diff and patch, and annotate our g-text(s) with more meta/provenance information.

No, what you will be able to do is *quote* your texts. But any text outside a literal - any text already in existence somewhere on the Web - will still be un-annotated. This mechanism does not provide any way to annotate a text that is not already annotated. It only allows you to copy it and annotate the (quoted) copy. That is my basic problem with graph literals: they don't let you do anything about actual graphs, only copies of graphs. It's like proposing marriage to a photograph.
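
A small Python sketch of the difference, under made-up assumptions: two hypothetical boxes at example.org that happen to hold the same triples, and annotations stored in plain dictionaries. An annotation attached to a quoted copy cannot say which published box it is about, while an annotation attached to a name can.

```python
# Two hypothetical boxes on the Web whose current content is identical.
boxes = {
    "http://example.org/boxA": "<a> <b> <c> .",
    "http://example.org/boxB": "<a> <b> <c> .",
}

# Annotation attached to a quoted copy of the text (the literal approach).
by_literal = {"<a> <b> <c> .": {"status": "deprecated"}}

# Annotation attached to the name of one box (the naming approach).
by_name = {"http://example.org/boxA": {"status": "deprecated"}}


def boxes_matching_literal(text: str) -> list[str]:
    """Return every box whose content equals the quoted text."""
    return sorted(name for name, content in boxes.items() if content == text)


# The quoted copy matches both boxes, so it singles out neither.
literal_targets = [
    name for text in by_literal for name in boxes_matching_literal(text)
]

# The name picks out exactly the box that was meant.
name_targets = sorted(by_name)
```

Here the comparison is plain string equality. With bnodes in the graphs, the matching step would have to be the expensive equivalence check mentioned in point 2.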

> 
> The points you make all show the key difference between quoting something and talking about it, as opposed to talking about something named that changes over time.

We can name things that change over time and also name things that don't change over time. The naming process is the same in both cases. There are going to be both kinds of g-box around. Not all websites take advantage of Roy's flexibility: most of them just sit there being HTML and emitting the same representation every time. RDF information resources will probably be similar. 

Pat

> 
> Best,
> 
> Nathan
> 

------------------------------------------------------------
IHMC                                     (850)434 8903 or (650)494 3973   
40 South Alcaniz St.           (850)202 4416   office
Pensacola                            (850)202 4440   fax
FL 32502                              (850)291 0667   mobile
phayesAT-SIGNihmc.us       http://www.ihmc.us/users/phayes
Received on Saturday, 5 March 2011 14:52:01 UTC