Re: Drop “g-boxes”, talk about “stateful resources”

Hi Pat,

On 24 May 2012, at 02:06, Pat Hayes wrote:
>>> Putting on my hat as Primer editor I'm not sure the proposed new term is going to fly. The rationale you give for "stateful resource" makes a lot of sense from the REST perspective, but for me it won't work to explain the notion to a broader audience (especially given the RDF use of the term "resource").
>> I'd like to see an example that shows this difficulty.
> The RDF usage of "resource" for "anything that can be referred to" does make any usage which restricts it this seriously rather problematic. We have had this problem now for at least 9 years. 

I'd like to see an example that shows this difficulty.

>>> Although strictly speaking "graph container" is slightly different from the intended meaning of "stateful resource", the notions are very close.
>> There are many things that can be said to have a state that can be expressed in RDF, but that definitely are not containers of RDF graphs. People, for example. Or HTML web pages with embedded RDFa.
> Whoa. We have to distinguish being described by RDF, and being a source of RDF syntax. Of course there are any number of things with state that can be described using RDF: people, for example. (In fact, probably just about anything.) Here for example is one way to describe people's ages using RDF:
> :Bill :hasState [
>     a :PersonState ;
>     :date "2012-01-01"^^xsd:date ;
>     :age "51"^^xsd:integer ] .
> and I'm sure we can think of some others.

Looks reasonable to me.

> But people are not things that emit RDF graphs.

And I didn't say they are.

What I said is:

1. The most common use of graph names, and one that we should encourage, will probably be to associate URLs with the graphs obtained by dereferencing the URL.

2. Therefore, I believe that our abstract syntax and our semantics should work really well for that case.

3. There are valid use cases where IRIs denoting people will be used as graph names. (We shouldn't encourage people to publish such datasets on the web, but it is valid and useful practice in local RDF stores.) There is nothing in any of the current specs that use RDF datasets (SPARQL, SPARQL Update, SPARQL Graph Store Protocol, R2RML, possibly others) that says anything against this practice.

4. Therefore I believe that our abstract syntax and our semantics should be defined in a way that doesn't break horribly in these cases. It's ok if it becomes a bit tortured in these cases.

5. I propose that we use the terms “stateful resource” for the things denoted by graph names, and consider the term “state” for the relation between the resource and its associated graph.

6. I do *not* propose that we define the abstract syntax or semantics in terms of “things that emit RDF graphs”, or formally connect them to REST. I merely pointed out that the scenario described in 1) above is going to be *one common use* of our work, and that the proposed terms—“stateful resource” and “state”—work very well in that case.

7. I point out that the proposed terms—“stateful resource” and “state”—still kind of work, although somewhat tortured, in less common but valid and justified cases like the “person+description” example you gave above and mentioned in 3).

> Most people know absolutely nothing at all about RDF triples and will never even think of the concept during their entire lives. They do not emit or produce or deliver RDF graphs under any circumstances whatever.

But saying that they have state that can be represented in RDF, and thus are “stateful resources” in RDF parlance, isn't entirely absurd. It is certainly tortured, but that's ok; it's not a core use case.

> HTML web pages with embedded RDFa do do this, but I see no reason why we can't think of such an HTML page as a "container" of RDF. Indeed, that seems very natural. Basically, the intuition is that if you can get xxx out of it, then it is a container of xxx. 

The thing is: There are many different ways of getting xxx out of it. What set of parsers are you using? What accept headers are you sending? Do you have a tag soup parser or a strict XML parser? Does it parse RDFa 1.1 or RDFa 1.0? Do you have a microdata or microformats parser as well? Do you have some additional entity extraction logic that generates more triples? Does your system add some metadata about the retrieval and parse process to the graph?

These are all legitimate things to store in a SPARQL named graph that is named with the URL of some HTML page. So is the page a “graph container” for all these things at the same time? With a “container”, I'd expect the question of what exactly it contains to have a clear and crisp answer.

With the “state” of some resource, especially if we look at the REST sense, it's intentionally left more fuzzy. The HTML representation we get back is said to somehow encode the state of the resource, subject to the limitations of the representation format. Then we can use whatever algorithms we like to recover as much of that state as possible in the form of an RDF graph.
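To illustrate the point about extraction, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the example.org URLs, the toy HTML, and the three regex-based extractor functions, which merely stand in for real RDFa, microdata, or entity-extraction pipelines. It shows one representation yielding three different graphs, each of which could legitimately be stored under the page's URL as graph name.

```python
import re

# Hypothetical page URL and representation, for illustration only.
PAGE_URL = "http://example.org/page.html"
HTML = ('<html><head><title>Bill</title></head><body>'
        '<span property="http://example.org/age">51</span></body></html>')

def title_only(url, html):
    """Toy extractor: one triple for the <title> element, if present."""
    match = re.search(r"<title>(.*?)</title>", html)
    return {(url, "http://purl.org/dc/terms/title", match.group(1))} if match else set()

def property_attributes(url, html):
    """Toy stand-in for an RDFa parser: reads property="..." attributes."""
    pairs = re.findall(r'property="([^"]+)">([^<]*)<', html)
    return {(url, prop, value) for prop, value in pairs}

def with_retrieval_metadata(url, html):
    """Both extractors combined, plus a triple describing the parse process."""
    graph = title_only(url, html) | property_attributes(url, html)
    graph.add((url, "http://example.org/extractedBy", "toy-extractor"))
    return graph

# Three different graphs recovered from the same representation.
candidates = {f.__name__: f(PAGE_URL, HTML)
              for f in (title_only, property_attributes, with_retrieval_metadata)}
```

None of the three graphs is "the" contents of the page; each is one attempt to recover part of its state.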

> Can you give one example of something that we need to describe as a stateful resource but isn't naturally conceptualized as an RDF container? In my world, people are not the former

People are the former—see above. But note again that in my book, being a “stateful resource” doesn't *necessarily* imply that it emits RDF graphs. (The reverse implication, on the other hand, should at least be considered a good practice.)

> and HTML/RDFa pages are the latter,

It's hard to conceptualize them as graph containers if their content depends so much on who is asking about it—see above.

> so got any more? 

Let's say I have a parser that turns CSV files into RDF, using some simple scheme of deriving IRIs from rows and columns. This CSV file is obviously a stateful resource whose state can be expressed as an RDF graph. But is it a graph container? As I understand the definition of graph container, it's not.
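A rough sketch of such a parser, assuming one particular (made-up) scheme: row n becomes `<base#row-n>`, a column header h becomes `<base#col-h>`, and cell values become plain string literals. The base IRI, the function name `csv_to_triples`, and the sample data are all hypothetical.

```python
import csv
import io

# Hypothetical base IRI for the CSV file; not taken from any spec.
BASE = "http://example.org/data.csv"

def csv_to_triples(text, base=BASE):
    """Derive (subject, predicate, object) triples from CSV text.

    Each data row gets a subject IRI built from its row number, and each
    column header gets a predicate IRI built from the header name.
    """
    reader = csv.DictReader(io.StringIO(text))
    triples = set()
    for n, row in enumerate(reader, start=1):
        subject = f"{base}#row-{n}"
        for header, value in row.items():
            triples.add((subject, f"{base}#col-{header}", value))
    return triples

sample = "name,age\nBill,51\n"
graph = csv_to_triples(sample)
```

The CSV file itself contains no RDF syntax at all; the graph only exists once some such scheme has been applied to it.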

Also see Sandro's “RDF Spaces” document (I take it that “space” = “graph container”):

He has a list of things that are not spaces. At least the first and third examples (natural-language web pages, and RDF pages whose contents change depending on who is asking) *must* be expressible in our formalism, otherwise I'd have to -1 it because it precludes use cases that we are doing at the moment. I'd say that it's fair to say that both examples are stateful resources and their state can be expressed in RDF.

To summarize: I'm *very* clear on what I want our abstract syntax to be (isomorphic to SPARQL's RDF datasets), and I'm increasingly clear on what I want the semantics to be (the IRIs in IRI-graph pairs don't denote the graph, but denote some other resource that is associated with the graph through some relationship). Now all we have to do is pick a term for the thing denoted by the IRIs, and a term for its relationship to the graph. The choice is somewhat arbitrary (we could choose any terms we wanted), and the most important criteria to me are that the terms fit the “web use case” really well, but still work ok-ish with a wide range of other use cases. The “graph container” and “contents” terms, in particular if we accept the refined definition that Sandro puts forward in the “RDF Spaces” document, don't fit the web case really well and explicitly exclude certain valid use cases; therefore I want other terms. “Stateful resource” and “state” are the best ones I've heard yet.
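The shape I have in mind can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class name `Dataset`, the method name `state_of`, and the example.org IRIs are made up here, and triples are modelled as plain tuples of strings. The dataset is a default graph plus a set of IRI-graph pairs, and the name is kept separate from the graph it is paired with.

```python
from dataclasses import dataclass, field

@dataclass
class Dataset:
    """A default graph plus IRI-graph pairs, in the style of SPARQL's RDF datasets."""
    default_graph: frozenset = frozenset()
    named_graphs: dict = field(default_factory=dict)  # graph name (IRI) -> graph

    def state_of(self, name):
        """Return the graph paired with the resource that `name` denotes."""
        return self.named_graphs[name]

# One graph, represented as a frozenset of (subject, predicate, object) tuples.
bill_state = frozenset({
    ("http://example.org/Bill", "http://example.org/age", "51"),
})

# Two hypothetical graph names: one denoting a person, one denoting a web page.
ds = Dataset(named_graphs={
    "http://example.org/Bill": bill_state,
    "http://example.org/bill.html": bill_state,
})
```

Both names are paired with equal graphs, yet they denote different resources (a person and a page), which is exactly why the name cannot simply denote the graph.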


Received on Thursday, 24 May 2012 16:46:16 UTC