Re: Comments on "SPARQL 1.1 Uniform HTTP Protocol for Managing RDF Graphs" from Nathan on 2011-03-18 (public-awwsw@w3.org from March 2011)

From: Nathan <nathan@webr3.org>
Date: Fri, 18 Mar 2011 18:55:53 +0000
To: Kjetil Kjernsmo <kjekje@ifi.uio.no>
CC: Tim Berners-Lee <timbl@w3.org>, SW-forum Web <semantic-web@w3.org>
Message-ID: <4D83AAB9.4050100@webr3.org>

Hi Kjetil,

Great question. Here's how I understand it (a lengthy but full answer 
which covers everything):

Kjetil Kjernsmo wrote:
> So, the key issue and the root of my confusion is the question: "What does the 
> URI of an information resource consisting of some RDF triples identify?" The 
> question is admittedly not very precise, for a reason that will be 
> apparent soon. 
> 
> Let's take an example: What does the URI http://www.kjetil.kjernsmo.net/foaf 
> identify? Apart from a foaf:PersonalProfileDocument, is it an RDF Graph or an 
> RDF Document? 

First of all there are a few things to clear up, which hopefully will 
help explain.

1: a "Document" named by a URI in the web sense, whether an "RDF 
Document" or an "HTML Document" or a "foaf:PersonalProfileDocument" 
*does not* refer to the representation (the specific content+meta you 
GET, a specific chunk of rdf/xml or turtle), it refers to that which 
remains consistent over time when you repeatedly GET the same URI. For 
instance if you update your foaf, then the URI still refers to the same 
document, not a specific representation of it at a specific time.

2: the Universal Interface (HTTP) is an example of information hiding at 
its best, that is to say whatever is behind the interface, behind the 
http wall, is hidden. So implementation details on the server side, such 
as whether the server is configured to pull some output from a cache, 
serve a file from a filesystem, or generate something based on a sparql 
query, remain hidden and don't particularly matter.

In short, the elements that are visible are:
- the URI ( http://www.kjetil.kjernsmo.net/foaf )
- the Protocol ( HTTP )
- the set of Representations (content+meta's) over time which are 
associated with that URI and sent via that protocol.

I say the above because everything else one can say beyond that is a 
story, usually related to a specific context or use case.

General Story: The URI refers to a source of information and you can GET 
representations of that information.

RDF Specific Story: The URI refers to a source of information of the RDF 
variety and you can GET representations of that information.

FOAF Specific Story: The URI refers to a source of information about a 
person (or agent), of the RDF variety, and you can GET representations 
of that information.

So if we lose the overused "document" and "resource" terms we can say 
that the URI refers to:
   FOAF-Personal-Profile-Information
   RDF-Information
   Information

Hopefully that all makes sense, so I'll move on to the RDF Graph and 
representations side of things. Both of these terms have been overused 
heavily and their meaning is now somewhat confused. I believe Sandro has 
already pointed you to the temporary terminology we're using in the RDF 
WG which helps explain [1].

Let's say that when a URI refers to some RDF-Information, then /behind/ 
the HTTP interface we have a box of rdf triples (g-box). We might add 
triples to the box, or remove some from it, so the box's contents change 
over time. At any one specific instant, the snapshot of that box's 
contents is a Set of Triples (g-snap), and that snapshot can be 
serialized for transfer in a number of different ways (g-text).

Above, the term "g-text" equates to the term "representation", and the 
term "g-snap" equates to "RDF Graph" (in the traditional sense, a 
mathematical set of triples).
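
The story above can be sketched in a few lines of Python. This is only 
an illustration, not anything taken from the RDF WG wiki [1] or from a 
library: the names GBox, snapshot and to_g_text are hypothetical, and 
the serializer emits a simplified N-Triples-like form that handles only 
IRIs.

```python
from typing import FrozenSet, Tuple

# A triple of (subject, predicate, object); only IRIs in this sketch.
Triple = Tuple[str, str, str]


class GBox:
    """Hypothetical g-box: a mutable container of triples behind the interface."""

    def __init__(self) -> None:
        self._triples = set()

    def add(self, triple: Triple) -> None:
        self._triples.add(triple)

    def remove(self, triple: Triple) -> None:
        self._triples.discard(triple)

    def snapshot(self) -> FrozenSet[Triple]:
        """Return a g-snap: the immutable set of triples at this instant."""
        return frozenset(self._triples)


def to_g_text(g_snap: FrozenSet[Triple]) -> str:
    """Serialize a g-snap as one possible g-text (simplified, IRIs only)."""
    return "".join(f"<{s}> <{p}> <{o}> .\n" for s, p, o in sorted(g_snap))
```

A snapshot taken before a later add() stays unchanged, while the box 
itself moves on; that is the difference between a g-snap and a g-box.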

Hopefully that also makes sense, but do remember it's just a story, the 
only things that are actually visible to man or machine are:
  - the URI which refers to some information
  - some representations of that information

To clarify: the representations of that information are of course subject 
to content negotiation, access control and permissions, and change over 
time - this is why the word "Representation" was introduced originally, 
to make it clear that we're not talking about a "file" (which isn't 
subject to negotiation, acl, change over time) - not because it's a 
representation of the thing described by the information.
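
To make the negotiation point concrete, here is a minimal Python 
sketch in which one URI has two representations of the same triple and 
the Accept value decides which one is sent. It is an assumption-laden 
toy: the Representation class and the negotiate function are 
hypothetical, and q-values and wildcards in Accept are deliberately 
ignored (real HTTP negotiation honours them).

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Representation:
    """Content plus metadata, as sent over the wire."""
    content_type: str
    body: bytes


# Two representations of the same single triple (hypothetical content).
TURTLE = Representation(
    "text/turtle",
    b"<http://www.kjetil.kjernsmo.net/foaf> "
    b"a <http://xmlns.com/foaf/0.1/PersonalProfileDocument> .\n",
)
RDFXML = Representation(
    "application/rdf+xml",
    b'<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">\n'
    b'  <rdf:Description rdf:about="http://www.kjetil.kjernsmo.net/foaf">\n'
    b'    <rdf:type rdf:resource='
    b'"http://xmlns.com/foaf/0.1/PersonalProfileDocument"/>\n'
    b"  </rdf:Description>\n"
    b"</rdf:RDF>\n",
)
AVAILABLE = {r.content_type: r for r in (TURTLE, RDFXML)}


def negotiate(accept: str,
              available: Dict[str, Representation]) -> Optional[Representation]:
    """Return the first available representation named in the Accept value.

    Simplification: parameters such as q-values are stripped and ignored,
    and wildcards like */* are not matched.
    """
    for item in accept.split(","):
        media_type = item.split(";")[0].strip().lower()
        if media_type in available:
            return available[media_type]
    return None
```

Both results are representations of the same information; neither one 
is what the URI itself refers to.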

From the above we can see that a URI refers to a source of information, 
and not a specific representation of that information at a specific time 
(e.g. a g-text), and not to that abstract information at a specific time 
either (e.g. a g-snap, an "RDF Graph").

Now we can answer your question clearly:

The URI http://www.kjetil.kjernsmo.net/foaf identifies an "RDF Document" 
(RDF-Information which potentially changes over time and which you can 
retrieve representations of via http), not an "RDF Graph" (abstract set 
of triples).

So that raises the question: how do you refer to, or talk about, 
representations and the abstract g-snaps/rdf-graphs?

One way to talk about representations is to pop them in a literal and 
describe the literal, just as we do with strings or numbers. After all, 
a representation is just a sequence of bytes and some metadata 
(content-type for instance).

Another possibility is to create some fixed static information and refer 
to it by URI (a "fixed resource"). This requires some trust and also 
some way of indicating the information is fixed, never changing, but it 
could be done. It works on the premise that because the information 
is fixed, anything you say about the source of information (information 
resource) must also be true of the single representation of it, and 
vice versa. (note: data: URIs may be able to be leveraged here too).
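
As a sketch of the data: URI idea, the Python below packs a 
representation (bytes plus content-type) into a data: URI using the 
base64 form of that scheme. The helper name to_data_uri is 
hypothetical, and whether such a URI is a good way to name a fixed 
representation is exactly the open question, not something this code 
settles.

```python
import base64


def to_data_uri(content_type: str, body: bytes) -> str:
    """Embed a representation's bytes and media type in a data: URI."""
    encoded = base64.b64encode(body).decode("ascii")
    return f"data:{content_type};base64,{encoded}"


def from_data_uri(uri: str) -> bytes:
    """Recover the bytes from a URI produced by to_data_uri (base64 form only)."""
    header, encoded = uri.split(",", 1)
    if not header.startswith("data:") or not header.endswith(";base64"):
        raise ValueError("not a base64 data: URI")
    return base64.b64decode(encoded)
```

Because the bytes are carried in the URI itself, the representation it 
names cannot change over time.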

As for g-snaps (RDF Graphs, abstract sets of triples)? Well, that's 
somewhat trickier: you either need a literal+datatype that specifically 
indicates any statements made about it are made about the set of triples 
it encodes, or some feature built in to the media type that lets you 
quote sets of triples in a lexical form (like quoted graphs in N3).

[1] http://www.w3.org/2011/rdf-wg/wiki/Graph_Terminology

> [snip]
> does it have any bearing on the problem that
> http://www.kjetil.kjernsmo.net/foaf is a foaf:PersonalProfileDocument?

No, that's what it is :)

> Can something be both a foaf:PersonalProfileDocument and an RDF Document? (my 
> intuition says yes)

Yes, the former can be seen as a subclass of the latter. Personally, I 
find seeing it as:
   FOAF-Personal-Profile-Information
   RDF-Information
   Information
most useful.

> Can something be both a foaf:PersonalProfileDocument and 
> an RDF Graph? (my intuition says no).

Nope.

> There are many other conventional resources to identify the same way, owl:Ontology and cc:Work come to mind. 
> Would the answer be any different? 

Nope.

> Now, I've possibly exposed myself as totally confused about core Semantic Web 
> concepts, but I do so with the confidence that I'm not a n00b, and if I'm 
> confused, I'm probably not alone, and the issue should be properly explained 
> to the community.

You most certainly are not alone in this respect, I can guarantee you 
that! But glad you asked in such a clear manner :)

Sincerely hope that helps a little,

Best,

Nathan
Received on Friday, 18 March 2011 18:57:07 UTC