Re: Graph-State Resources (was Re: graphs and documents Re: [ALL] agenda telecon 14 Dec) from Sandro Hawke on 2011-12-15 (public-rdf-wg@w3.org from December 2011)

From: Sandro Hawke <sandro@w3.org>
Date: Thu, 15 Dec 2011 12:52:09 -0500
To: Andy Seaborne <andy.seaborne@epimorphics.com>
Cc: public-rdf-wg@w3.org
Message-ID: <1323971529.6252.1168.camel@waldron>

On Thu, 2011-12-15 at 13:10 +0000, Andy Seaborne wrote:
> >> The old HTTP-NG work, which tried to wrap a distributed objects layer
> >> around HTTP and other browser-accessible protocols, used "WebDocument"
> >> - http://www.w3.org/TR/WD-HTTP-NG-interfaces/ and nearby.
> >>
> >> Let's call this notion 'Thingy' for purpose of the next paragraph.
> >>
> >> If I have two physical Linux boxes A and B, wired up to be part of
> >> the Web using round-robin DNS so that two separate hard drives / CPUs
> >> etc are serving up a common identical set of content, then
> >> http://example.com/something1 will 50% of the time be served from A, 50%
> >> from B. Lots of Web sites in practice serve using more
> >> sophisticated variants on this pattern.
> >>
> >> Are we clear that in our story there is just one "Thingy", by virtue
> >> of the concept being focussed on its public name rather than
> >> possibly-evolving internals? Since we know about the mechanics inside,
> >> we might be tempted to say there are two thingies, ... but that slips
> >> away from the central idealisation here. We act in the Web like we're
> >> talking to some unified service, which will tell us "its state". In
> >> practice the details are rarely that clear.
>  >
> > REST is a simplification of the Web, to be sure.  It's probably about
> > right for these purposes.
>  >
> >> Anyhow, WebResource I can live with. I prefer not to use any phrase
> >> with "Information Resource" inside it, like "Web Accessible
> >> Information Resource", since it suggests we've clarified what an
> >> "information resource" amounts to.
> > None of those terms are any help for us here, trying to name a
> > generalization of a g-box.   We still need a term that limits it to RDF.
> 
> Unconvinced.  What's an RDFa document?  It's some RDF, some scripts, 
> some HTML links, some appearance.  Is that limiting it to RDF?

It's not a Graph-State Resource, as I'm trying to define the term.
There's a lot more to its state (except in degenerate cases, like a sort
of RDFa-quine) than is conveyed in the triples.

I'm looking for a class of things which have very similar behavior and
attributes.  My most recent angle is trying to document how to use REST
with these things.  I want to be able to talk about how HEAD, GET, PUT,
and PATCH should work on these things.   RDFa documents have to be
handled quite differently -- one could not, for instance, PATCH an RDFa
document with an application/sparql-update patch.   I'm trying to focus
on the class of things for which SPARQL Update is a meaningful PATCH
language.
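
A minimal sketch of what such a PATCH could look like from the client
side, assuming Python's standard urllib.request. The URL, the ex:
vocabulary, and the helper name build_patch_request are made up for
illustration, and nothing here is taken from a spec. The request is
only constructed, never sent.

```python
import urllib.request

# Hypothetical URL, assumed to name a Graph-State Resource.
GSR_URL = "http://example.org/gsr/furnace"

# A SPARQL Update document used as the patch body (illustrative vocabulary).
PATCH_BODY = """\
PREFIX ex: <http://example.org/ns#>
DELETE { <http://example.org/gsr/furnace#it> ex:setpoint ?old }
INSERT { <http://example.org/gsr/furnace#it> ex:setpoint 20 }
WHERE  { <http://example.org/gsr/furnace#it> ex:setpoint ?old }
"""


def build_patch_request(url, update):
    """Build (but do not send) an HTTP PATCH carrying a SPARQL Update."""
    return urllib.request.Request(
        url,
        data=update.encode("utf-8"),
        method="PATCH",
        headers={"Content-Type": "application/sparql-update"},
    )


request = build_patch_request(GSR_URL, PATCH_BODY)
print(request.get_method(), request.full_url)
# urllib stores header names capitalized as "Content-type".
print(request.get_header("Content-type"))
```

The same request aimed at an RDFa document would have no sensible
interpretation, since the update speaks only about triples and says
nothing about the surrounding HTML.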

Another way to look at this class is in terms of which properties have
this class as their domain or range.   We've already talked about a
graphState predicate (aka log:semantics, aka graphSnapshot).   (I guess
you could define those as only giving the RDF portion of the content.
That's okay, but if so, I want some way for the client to know that's
what happened.)

Looking for other properties which might apply to GSRs, I thought of
VoID and came across this:

        The fundamental concept of VoID is the dataset. A dataset is a
        set of RDF triples that are published, maintained or aggregated
        by a single provider. Unlike RDF graphs, which are purely
        mathematical constructs [RDF-CONCEPTS], the term dataset has a
        social dimension: we think of a dataset as a meaningful
        collection of triples, that deal with a certain topic, originate
        from a certain source or process, are hosted on a certain
        server, or are aggregated by a certain custodian. Also,
        typically a dataset is accessible on the Web, for example
        through resolvable HTTP URIs or through a SPARQL endpoint.

                - http://www.w3.org/TR/2011/NOTE-void-20110303/#dataset
                
Terminology aside, that seems to match g-box rather well.  

The term "Graph-State Resource" is meant to convey something a bit
broader than "Graph Container" or "G-Box".  Conceptually, GSRs have a
lot more individuality.   Two g-boxes with the same triples are probably
very similar; with GSRs there is less of that suggestion -- one GSR
might be a furnace controller, and another might be just a g-box holding
a recent copy of the state of the furnace controller.  The difference
would be visible in the data about these items, in how they change over
time, and probably in how they respond to a POST.
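
One way to picture the distinction is a toy model in Python, assuming
a GSR can be reduced to a URI, a role, and a current set of triples.
The class name GraphStateResource, the URIs, and the ex: terms are all
hypothetical. Two resources can report the same graph state and still
be different resources, and they drift apart as soon as one changes.

```python
from dataclasses import dataclass


@dataclass(eq=False)  # identity-based equality: same triples != same resource
class GraphStateResource:
    uri: str
    role: str
    triples: frozenset

    def graph_state(self):
        """Return the triples currently making up this resource's state."""
        return self.triples


EX = "http://example.org/ns#"
state = frozenset({("http://example.org/furnace#it", EX + "setpoint", "20")})

controller = GraphStateResource(
    "http://example.org/furnace", "furnace controller", state
)
mirror = GraphStateResource(
    "http://example.org/mirror/furnace",
    "g-box holding a copy",
    controller.graph_state(),
)

same_state_at_first = controller.graph_state() == mirror.graph_state()
same_resource = controller == mirror

# The controller changes; the mirror still holds the older snapshot.
controller.triples = frozenset(
    {("http://example.org/furnace#it", EX + "setpoint", "22")}
)
same_state_later = controller.graph_state() == mirror.graph_state()

print(same_state_at_first, same_resource, same_state_later)
```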

> I think that all that matters is observation, what it returns on 
> dereference when asking for RDF.  What the thing "is" does not matter; 
> only partial observations are possible.
> 
> We can't hope that observation A and observation B are in any way 
> consistent (they happen at different times) unless something says so 
> explicitly (HTTP header or claim by publisher).
> 
> And a resource may be observable as RDF one moment and not the next.
> 
> log:semantics is a web-at-an-instant predicate and makes an 
> observation.  That's OK for one rules run.  It is somewhat problematic 
> when passing on information to another party if that's all that's passed on.

I'm thinking more about setting up interfaces than about observing them.
I agree, from outside, prodding over HTTP, one can't tell how the
content varies over time, over who's asking, or what media types might
be offered.

I'm thinking, instead, about how to help people set up systems which
communicate via RDF.   Human users might be told in natural language, or
web clients might be told in RDF that some URIs name GSRs.   That
establishes an understanding that allows some useful things to be done
-- like the client knowing that it can reasonably try to use PUT to replace
the set of triples, and that it won't be mangling some important HTML content.
(I'm not saying all those specs are written yet; that's a goal.) 
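
As a rough illustration of that understanding, here is a sketch in
which a client only builds a PUT once it has been told that the URI
names a GSR. It assumes Python's standard urllib.request and a Turtle
payload; the set KNOWN_GSRS, the URLs, and the helper build_put_request
are hypothetical stand-ins for however the client was actually told.
Nothing is sent over the network.

```python
import urllib.request

# Hypothetical: URIs the client has been told (in prose or in RDF) name GSRs.
KNOWN_GSRS = {"http://example.org/gsr/furnace"}

# Replacement state, serialized as Turtle (illustrative vocabulary).
NEW_STATE = """\
@prefix ex: <http://example.org/ns#> .
<http://example.org/gsr/furnace#it> ex:setpoint 21 .
"""


def build_put_request(url, turtle):
    """Build a PUT that replaces the triples, but only for known GSRs."""
    if url not in KNOWN_GSRS:
        # Without that knowledge, a PUT might clobber non-RDF content.
        raise ValueError("not known to be a Graph-State Resource: " + url)
    return urllib.request.Request(
        url,
        data=turtle.encode("utf-8"),
        method="PUT",
        headers={"Content-Type": "text/turtle"},
    )


put_request = build_put_request("http://example.org/gsr/furnace", NEW_STATE)
print(put_request.get_method(), put_request.full_url)
```

Calling build_put_request with a URL outside KNOWN_GSRS raises
ValueError instead of risking an overwrite of, say, an HTML page.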

    -- Sandro

>  Andy
> 
>
Received on Thursday, 15 December 2011 17:52:21 UTC