Re: RDF-ISSUE-5 (Graph Literals): Should we define Graph Literal datatypes? [RDF Graphs] from Pat Hayes on 2011-03-05 (public-rdf-wg@w3.org from March 2011)

From: Pat Hayes <phayes@ihmc.us>
Date: Sat, 5 Mar 2011 09:58:48 -0600
To: Sandro Hawke <sandro@w3.org>
Cc: RDF Working Group WG <public-rdf-wg@w3.org>
Message-Id: <FDBCCC39-03BD-4D83-8411-925668206DBB@ihmc.us>
Quick thought regarding your reply (and others). I think we are thinking about g-boxes subtly differently. You seem to be assuming that a g-box is *always* liable to change. I have been assuming that many g-boxes will be static and simply emit the same graph text every time. (So calling them a 'box' might be misleading.) In fact, a text can BE a static box, just as an HTML file can BE a rather boring website. 

So what we need, maybe, is some way to *say* that a box is static. Suppose we have a specified class of things called static g-boxes. Then we can say 

gname:thisBox rdf:type rdf:static .

to ensure that its content will not change. It can even say this itself, in fact. And then we can put a text in it and be sure that it won't evolve on us while we aren't looking. And then we can refer to texts without having to quote them.
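
A minimal sketch of that behaviour in Python, purely as an illustration. Every name in it (`GBox`, `is_static`, `put`, `snapshot`, `StaticBoxError`) is hypothetical and not part of any RDF spec or library. The assumption is that a static box accepts a graph once and refuses every later change, so the box name can stand in for the graph:

```python
from typing import FrozenSet, Iterable, Tuple

Triple = Tuple[str, str, str]
Graph = FrozenSet[Triple]


class StaticBoxError(Exception):
    """Raised when something tries to change the content of a static box."""


class GBox:
    """Hypothetical g-box: a named container that emits a graph on request."""

    def __init__(self, name: str, is_static: bool = False) -> None:
        self.name = name
        self.is_static = is_static  # plays the role of 'rdf:type rdf:static'
        self._content: Graph = frozenset()
        self._filled = False

    def put(self, graph: Iterable[Triple]) -> None:
        # A static box may be filled once; after that its content is fixed.
        if self.is_static and self._filled:
            raise StaticBoxError(f"{self.name} is static; content cannot change")
        self._content = frozenset(graph)
        self._filled = True

    def snapshot(self) -> Graph:
        # The graph currently in the box (always the same one if static).
        return self._content


g = {("ex:a", "ex:p", "ex:b")}
box = GBox("gname:thisBox", is_static=True)
box.put(g)

try:
    box.put({("ex:a", "ex:p", "ex:c")})
    changed = True
except StaticBoxError:
    changed = False
```

Here `changed` ends up `False` and `box.snapshot()` still returns the first graph, which is what lets the name be used without quoting the text.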

OK, maybe it's worth having graph literals as well. I still think this is a useful idea, however. 
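
For comparison, the graph-literal proposal in the issue text (a datatype whose lexical space is document strings and whose value space is graphs) can be sketched the same way. This is a toy under loud assumptions: `lexical_to_value` is a hypothetical name, the syntax is one whitespace-separated `s p o .` per line, and only ground terms are handled, with no blank nodes, no quoted strings and no prefixes:

```python
from typing import FrozenSet, Tuple

Triple = Tuple[str, str, str]


def lexical_to_value(text: str) -> FrozenSet[Triple]:
    """Map a document string (lexical form) to the set of triples it denotes."""
    triples = set()
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if not line.endswith("."):
            raise ValueError(f"missing terminating '.': {raw!r}")
        parts = line[:-1].split()
        if len(parts) != 3:
            raise ValueError(f"expected three terms: {raw!r}")
        triples.add((parts[0], parts[1], parts[2]))
    return frozenset(triples)


doc_a = "ex:a ex:p ex:b .\nex:b ex:p ex:c .\n"
doc_b = "# same triples, different order\nex:b ex:p ex:c .\n  ex:a ex:p ex:b .\n"

# Two different lexical forms, one value in the value space.
same_value = lexical_to_value(doc_a) == lexical_to_value(doc_b)
```

With ground triples the comparison is plain set equality. Once blank nodes are allowed it is no longer that simple, which is where the cost worry in point 2 below comes from.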

On Mar 4, 2011, at 11:17 PM, Sandro Hawke wrote:

> On Fri, 2011-03-04 at 19:26 -0600, Pat Hayes wrote:
>> On Mar 4, 2011, at 3:59 PM, RDF Working Group Issue Tracker wrote:
>> 
>>> 
>>> RDF-ISSUE-5 (Graph Literals): Should we define Graph Literal datatypes? [RDF Graphs]
>>> 
>>> http://www.w3.org/2011/rdf-wg/track/issues/5
>>> 
>>> Raised by: Sandro Hawke
>>> On product: RDF Graphs
>>> 
>>> We could define datatypes, such as ser:rdfxml and ser:turtle, whose
>>> lexical space is the set of valid document strings in RDF/XML, Turtle,
>>> etc, and whose value space contains the corresponding RDF graphs.
>>> 
>>> This would allow people to use ordinary RDF tools to express facts involving RDF graphs, such as that some graph was obtained from some URI at some point in time, or that some person claims some graph is true or false.
>> 
>> Allow me to cast doubt on this claim. I do not believe that graph literals (in contrast to named graphs) would in fact provide such functionality in practice. For several reasons.
> 
> I hesitate to reply because of your regrets (and please accept my
> condolences)

I'll be around, slightly distracted, until Tuesday next week. 

> , but maybe I can have the honor of having my e-mail first
> in the myriad sacks that await your return.   Or... maybe this way we
> can have it all perfectly sorted before your return.
> 
> Anyway, I completely agree with what you say in favor of RDF data
> talking about g-boxes -- I agree that's often the best approach -- but
> I think there are other use cases for graph literals.

OK, maybe. 

> 
> It would be good to work through the use cases and try to figure out the
> costs and benefits of addressing each with each approach.   I might
> manage to do that.
> 
>> 1. This would allow such 'metaRDF' descriptions only for the case where the object graph - the one being described - is completely specified by its full textual representation. This would make such metaRDF almost unusable for large object graphs, and exceedingly awkward, at best, for all but toy object graphs. For any graph, the g-text is a much more verbose way to refer to it than a URI would be. 
> 
> I think there are lots of real apps that use many tiny (but not "toy")
> graphs of 1-50 triples, often where the graph represents an 'object' or
> a simple claim.    The data store might have a billion triples, but the
> granularity of the metadata is often per-triple or close to it.
> 
> Yes, the URI is probably still smaller, but it presents its own
> problems, like worrying about the box contents changing.
> 
> I don't think apps like this weigh in heavily for either design.   
> 
> (I certainly agree situations with very large graphs get 'exceedingly
> awkward' for graph literals.)
> 
>> 2. The full textual representation of a graph does not, ironically, serve to "identify" it in the sense required. Suppose I publish some RDF in a box with a URI. The URI identifies the box, but it does not identify the graph. The very same graph might be a snapshot of a different box with a different provenance and history and authority claiming it to be true. It is the box, not the graph, which will be asserted or will have a history or be deprecated, etc. But a graph literal of a snap of a box does not identify the box. Even if we say that such a literal identifies any box whose snap is equivalent to the literal, the task of checking such equivalence is NP-complete (an old result of Jeremy's) so we have hamstrung our implementations ahead of time. And this is probably not a good rule to adopt, in any case, even if it were computationally cheap.
> 
> I think there are different use cases here.   Sometimes we want to
> reason about the box, sometimes we want to reason about a graph that
> might have been in certain boxes at certain times.   It's nice to be
> able to talk about both the boxes and the graphs.

See above. I see a graph/text as a simple kind of box.

> I think you're proposing that whenever we want to talk about a known
> graph, we first put it in a box, and then talk about that box.   That's
> the technique, as I mentioned, that I've been using in my own code for
> many years.  It doesn't really need any new specs.  So, yes, it works,
> but I think it'd be somewhat clearer to be able to talk about the graphs
> themselves, sometimes.
> 
> (I can live with either answer to ISSUE-5; I just think it's something
> we'll need to think about and decide.)
> 
>> 3. It is completely unnecessary, if we have named graphs. A named graph has a name which refers to it and identifies its box. Most descriptive languages, including RDF, use names in this way to make assertions about the things named. AFAIKS, nothing is gained by making such a graph into a literal instead of simply using its name to refer to it. And this use of graph names requires no changes to any RDF syntax (or indeed semantics.)
> 
> Again, I'm all in favor of being explicit about boxes, sometimes giving
> them URI names, sometimes maybe referring to them using blank nodes.
> But I disagree that "nothing is gained by making such a graph into a
> literal instead of simply using its name to refer to it."   You mean
> instead of simply using the name of a box that currently happens to
> contain it?  What if that box content changes?

See above.

>  How do you find out what
> URI names which graph?  

?? You GET it and look at what you, um, get. 

I guess I am assuming, as a kind of ground assumption, that 'graphs' on the Web will almost always be accessed through a URI. There simply won't be any 'loose' graphs just kind of floating around in the cloud. Maybe this is wrong? What kind of a vision of graphs-on-the-Web do you have, that raises the problem of having a graph and wondering what its URI might be?

>  These issues can be addressed, but in some
> cases, particularly when you're working with boxes whose contents change
> rapidly (current stock price) or continually (current temperature), I
> think it's probably better to also have an explicit notion of the graphs
> themselves.

I think you aren't letting your own invention carry its proper weight. :-)

Pat

> 
>    -- Sandro
> 
>> Pat Hayes
>> 
>> 
>>> 
>>> This would address some of the use cases for quads, reification, named
>>> graphs, etc, with a mechanism that is very simple to understand and
>>> relatively easy to implement.
>>> 
>>> Languages (like Turtle and RDF/XML) could be extended to provide
>>> syntactic sugar for these literals, much as Turtle provides a nicer
>>> syntax for numbers, but that is not necessary for these literals to be
>>> useful and is not part of this proposal.
>>> 
>>> Some discussion in http://lists.w3.org/Archives/Public/public-rdf-wg/2011Mar/0130.html
>>> 
>>> 
>>> 
>>> 
>> 
>> ------------------------------------------------------------
>> IHMC                                     (850)434 8903 or (650)494 3973   
>> 40 South Alcaniz St.           (850)202 4416   office
>> Pensacola                            (850)202 4440   fax
>> FL 32502                              (850)291 0667   mobile
>> phayesAT-SIGNihmc.us       http://www.ihmc.us/users/phayes

Received on Saturday, 5 March 2011 15:59:24 UTC