[Fwd: Re: Comments on "SPARQL 1.1 Uniform HTTP Protocol for Managing RDF Graphs"] from Nathan on 2011-03-18 (public-rdf-wg@w3.org from March 2011)

From: Nathan <nathan@webr3.org>
Date: Fri, 18 Mar 2011 18:57:43 +0000
To: RDF WG <public-rdf-wg@w3.org>
Message-ID: <4D83AB27.5080705@webr3.org>
fwd, just for the logs, or in case anybody totally disagrees.

-------- Original Message --------
Subject: Re: Comments on "SPARQL 1.1 Uniform HTTP Protocol for Managing 
RDF Graphs"
Date: Fri, 18 Mar 2011 18:55:53 +0000
From: Nathan <nathan@webr3.org>
To: Kjetil Kjernsmo <kjekje@ifi.uio.no>

Hi Kjetil,

Great question. Here's how I understand it (a lengthy but full answer
which covers everything):

Kjetil Kjernsmo wrote:
> So, the key issue and the root of my confusion is the question: "What does the 
> URI of a information resource consisting of some RDF triples identify?" The 
> question isn't admittedly not very precise, for a reason that will be 
> apparent soon. 
> 
> Lets take an example: What does the URI http://www.kjetil.kjernsmo.net/foaf 
> identify? Apart from a foaf:PersonalProfileDocument, is it an RDF Graph or an 
> RDF Document? 

First of all there are a few things to clear up, which hopefully will
help explain.

1: a "Document" named by a URI in the web sense, whether an "RDF
Document" or an "HTML Document" or a "foaf:PersonalProfileDocument"
*does not* refer to the representation (the specific content+meta you
GET, a specific chunk of rdf/xml or turtle), it refers to that which
remains consistent over time when you repeatedly GET the same URI. For
instance if you update your foaf, then the URI still refers to the same
document, not a specific representation of it at a specific time.

2: the Universal Interface (HTTP) is an example of information hiding at
its best, that is to say whatever is behind the interface, behind the
http wall, is hidden. So implementation details on the server side, such
as whether the server is configured to pull some output from a cache,
serve a file from a filesystem, or generate something based on a sparql
query, remain hidden and don't particularly matter.

In short, the elements that are visible are:
- the URI ( http://www.kjetil.kjernsmo.net/foaf )
- the Protocol ( HTTP )
- the set of Representations (content+meta's) over time which are
associated with that URI and sent via that protocol.

I say the above because, everything else one can say beyond that is a
story, usually related to a specific context or use case.

General Story: The URI refers to a source of information and you can GET
representations of that information.

RDF Specific Story: The URI refers to a source of information of the RDF
variety and you can GET representations of that information.

FOAF Specific Story: The URI refers to a source of information about a
person (or agent), of the RDF variety, and you can GET representations
of that information.

So if we lose the overused "document" and "resource" terms we can say
that the URI refers to:
   FOAF-Personal-Profile-Information
   RDF-Information
   Information

Hopefully that all makes sense, so I'll move on to the RDF Graph and
representations side of things. Both of these terms have been overused
heavily and their meaning is now somewhat confused. I believe Sandro has
already pointed you to the temporary terminology we're using in the RDF
WG which helps explain [1].

Let's say that when a URI refers to some RDF-Information, then /behind/
the HTTP interface we have a box of rdf triples (g-box). We might add
triples to the box, or remove some from it, so the box's contents change
over time. At any one specific instant, the snapshot of that box's
contents is a Set of Triples (g-snap), and that snapshot can be
serialized for transfer in a number of different ways (g-text).

Above, the term "g-text" equates to the term "representation", and the
term "g-snap" equates to "RDF Graph" (in the traditional sense, a
mathematical set of triples).
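
As a rough illustration of the box / snapshot / serialization story
above, here is a minimal Python sketch. The names GBox, snapshot and
to_gtext, the simplified N-Triples-like output, and the example.org
subject are all made up for this example; none of them come from the RDF
WG terminology page or from any library.

```python
from dataclasses import dataclass, field


@dataclass
class GBox:
    """Hypothetical g-box: a mutable container of triples behind a URI."""
    uri: str
    triples: set = field(default_factory=set)

    def add(self, triple):
        self.triples.add(triple)

    def remove(self, triple):
        self.triples.discard(triple)

    def snapshot(self):
        """Return the g-snap: an immutable set of the triples held right now."""
        return frozenset(self.triples)


def to_gtext(snap):
    """Serialize a g-snap into one possible g-text (simplified, N-Triples-like)."""
    return "".join(f"<{s}> <{p}> {o} .\n" for s, p, o in sorted(snap))


# The URI stays the same while the contents of the box change over time.
box = GBox("http://www.kjetil.kjernsmo.net/foaf")
me = "http://example.org/me"  # hypothetical subject, for illustration only
box.add((me, "http://xmlns.com/foaf/0.1/name", '"Example Person"'))
before = box.snapshot()
box.add((me, "http://xmlns.com/foaf/0.1/nick", '"example"'))
after = box.snapshot()
```

Here `before` and `after` are two different g-snaps of one g-box, and
`to_gtext(before)` is just one of many possible g-texts for the first
snapshot. The URI names the box, not either snapshot.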

Hopefully that also makes sense, but do remember it's just a story, the
only things that are actually visible to man or machine are:
  - the URI which refers to some information
  - some representations of that information

To clarify: the representations of that information are of course subject
to content negotiation, access control and permissions, and change over
time - this is why the word "Representation" was introduced originally,
to make it clear that we're not talking about a "file" (which isn't
subject to negotiation, acl, change over time) - not because it's a
representation of the thing described by the information.
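
To make the negotiation point concrete, here is a small Python sketch in
which the same information yields different representations depending on
what the client asks for. It is only an illustration: get_representation
is a hypothetical helper, the matching is a plain exact match on the
Accept value (real HTTP negotiation also handles q-values and wildcards),
and the two serializations are deliberately simplified.

```python
import json


def as_ntriples(triples):
    """Simplified N-Triples-like text, one line per triple."""
    return "".join(f"<{s}> <{p}> {o} .\n" for s, p, o in sorted(triples))


def as_json(triples):
    """A plain JSON array of [subject, predicate, object] arrays."""
    return json.dumps(sorted(triples))


# Media type labels are illustrative choices for this sketch.
SERIALIZERS = {
    "application/n-triples": as_ntriples,
    "application/json": as_json,
}


def get_representation(triples, accept):
    """Return (content_type, body_bytes) for the current triples.

    Hypothetical helper: falls back to N-Triples-like text when the
    requested media type is not known.
    """
    content_type = accept if accept in SERIALIZERS else "application/n-triples"
    body = SERIALIZERS[content_type](triples).encode("utf-8")
    return content_type, body


triples = {
    ("http://example.org/me", "http://xmlns.com/foaf/0.1/name", '"Example Person"'),
}
nt_type, nt_body = get_representation(triples, "application/n-triples")
js_type, js_body = get_representation(triples, "application/json")
```

The two results differ in both bytes and metadata, yet both are
representations of the same information behind the same URI.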

From the above we can see that a URI refers to a source of information,
and not a specific representation of that information at a specific time
(e.g. a g-text), and not to that abstract information at a specific time
either (e.g. a g-snap, an "RDF Graph").

Now we can answer your question clearly:

The URI http://www.kjetil.kjernsmo.net/foaf identifies an "RDF Document"
(RDF-Information which potentially changes over time and which you can
retrieve representations of via http), not an "RDF Graph" (abstract set
of triples).

So that raises the question, how do you refer to, or talk about
representations and the abstract g-snaps/rdf-graphs?

One way to talk about representations, is to pop them in a literal and
describe the literal, after all a representation is just a sequence of
bytes and some metadata (content-type for instance), just as we do with
strings or numbers.

Another possibility is to create some fixed static information and refer
to it by URI (a "fixed resource"), this requires some trust and also
some way of indicating the information is fixed, never changing, but it
could be done - this works on the premise that because the information
is fixed, anything you say about the source of information (information
resource) must also be true of the single representation of it, and
vice versa. (note: data: URIs may be able to be leveraged here too).
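
On the data: URI aside, a short Python sketch of the idea: one fixed
representation (a media type plus some bytes) packed into a data: URI,
so the URI and the single representation cannot drift apart. The helper
names and the sample Turtle line are invented for this example, and the
parsing only handles URIs produced by the helper itself.

```python
import base64


def to_data_uri(content_type, body):
    """Pack one representation (media type + bytes) into a base64 data: URI."""
    payload = base64.b64encode(body).decode("ascii")
    return f"data:{content_type};base64,{payload}"


def from_data_uri(uri):
    """Recover (content_type, body) from a URI built by to_data_uri."""
    header, payload = uri.split(",", 1)
    content_type = header[len("data:"):-len(";base64")]
    return content_type, base64.b64decode(payload)


# Hypothetical fixed representation, for illustration only.
body = b'<http://example.org/me> <http://xmlns.com/foaf/0.1/name> "Example Person" .\n'
fixed_uri = to_data_uri("text/turtle", body)
recovered_type, recovered_body = from_data_uri(fixed_uri)
```

Because the bytes are carried in the URI itself, anything said about
`fixed_uri` is necessarily also said about that one representation.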

As for g-snaps (RDF Graphs, abstract sets of triples)? Well that's
somewhat trickier, you either need a literal+datatype that specifically
indicates any statements made about it are made about the set of triples
it encodes, or some feature built in to the media type that lets you
quote sets of triples in a lexical form (like quoted graphs in N3).

[1] http://www.w3.org/2011/rdf-wg/wiki/Graph_Terminology

> [snip]
> does it have any bearing on the problem that
> http://www.kjetil.kjernsmo.net/foaf is a foaf:PersonalProfileDocument?

No, that's what it is :)

> Can  something be both a foaf:PersonalProfileDocument and an RDF Document? (my 
> intuition says yes)

Yes, the former can be seen as a subclass of the latter, personally I
find seeing it as:
   FOAF-Personal-Profile-Information
   RDF-Information
   Information
most useful.

> Can something be both a foaf:PersonalProfileDocument and 
> an RDF Graph? (my intuition says no).

Nope.

> There are many other conventional resources to identify the same way, owl:Ontology and cc:Work comes to mind. 
> Would the answer be any different? 

Nope.

> Now, I've possibly exposed myself as totally confused about core Semantic Web 
> concepts, but I do so with the confidence that I'm not a n00b, and if I'm 
> confused, I'm probably not alone, and the issue should be properly explained 
> to the community.

You most certainly are not alone in this respect, I can guarantee you
that! But glad you asked in such a clear manner :)

Sincerely hope that helps a little,

Best,

Nathan
Received on Friday, 18 March 2011 18:58:56 UTC