Re: Links and graphs from Graham Klyne on 2012-12-17 (public-ldp@w3.org from December 2012)

From: Graham Klyne <Graham.Klyne@zoo.ox.ac.uk>
Date: Mon, 17 Dec 2012 23:47:45 +0000
To: Erik Wilde <dret@berkeley.edu>
CC: mike amundsen <mamund@yahoo.com>, LDP <public-ldp@w3.org>, W3C provenance WG <public-prov-wg@w3.org>
Message-ID: <50CFAF21.4060303@zoo.ox.ac.uk>

Hi Erik,

I'll tackle the last part of your message first, then consider my other 
responses in light of that.  I'll start with an "RDF only" scenario, and then 
consider how that might generalize.

...

Starting point:  a client wants to access provenance for some resource R via a
provenance service (this is just setting the scene; I'll address your question
about a new über-PROV service later).

C: GET <R>          // or HEAD

S1: 200 OK
S1: Link: <P>; rel=prov:ProvenanceService
   :

C: GET <P>
C: Accept: application/rdf+xml, text/turtle

P: 200 OK
P: Content-type: text/turtle
  :
P:
P: <P1> a prov:ProvenanceService ;
P:      prov:provenanceUriTemplate "<P1>?target={+uri}" .
P: <P2> a sd:Service
P:      (description per http://www.w3.org/TR/sparql11-service-description/)

C: GET <P1>?target=<R>
C: Accept: application/rdf+xml, text/turtle

P1: 200 OK
P1: Content-type: ...
P1:
P1: (RDF provenance data  about <R>)

OR, at the final exchange, based on the response from <P>, the client may choose 
to use SPARQL:

C: GET <P2>?(encoded query)        // or equivalent POST
C: Accept: application/sparql-results+xml

P2: 200 OK
P2: Content-type: application/sparql-results+xml
P2:
P2: (SPARQL query results)

...

Above is the sort of thing that's being considered for provenance access and 
query services.  (Direct access is dealt with separately.)
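
As a rough illustration of the template step in the third exchange above, the
following Python sketch expands a "{+uri}" expression in the style of RFC 6570
reserved expansion.  The function name expand_reserved and the example.org URIs
are hypothetical, and only a single "{+name}" expression is handled; this is not
a full URI Template implementation.

```python
from urllib.parse import quote

# Reserved characters that "{+var}" expansion leaves untouched.
# Unreserved characters (letters, digits, "-", ".", "_", "~") are always kept by quote().
_RESERVED = ":/?#[]@!$&'()*+,;="


def expand_reserved(template: str, name: str, value: str) -> str:
    """Expand one '{+name}' expression in a template (simplified sketch)."""
    # "%" is kept as-is so that existing percent-escapes are not double-encoded.
    # Simplification: a stray "%" that is not part of an escape also passes through.
    encoded = quote(value, safe=_RESERVED + "%")
    return template.replace("{+" + name + "}", encoded)


# Hypothetical service template and target resource.
template = "http://example.org/prov?target={+uri}"
print(expand_reserved(template, "uri", "http://example.org/doc 1"))
```

Note that reserved expansion leaves characters such as "&" and "#" in the target
URI unescaped, so a service using this style of template has to cope with that.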

Your question was how this could be extended to include some new über-PROV 
service.  I first consider one that is described using RDF (e.g. in a fashion 
similar to the SPARQL service description), then I'll show how the non-RDF case 
still works.

The second exchange above might be augmented thus:

C: GET <P>
C: Accept: application/rdf+xml, text/turtle

P: 200 OK
P: Content-type: text/turtle
  :
P:
P: <P1> a prov:ProvenanceService ;
P:      prov:provenanceUriTemplate "<P1>?target={+uri}" .
P: <P2> a sd:Service
P:      (description per http://www.w3.org/TR/sparql11-service-description/)
P: <P3> a über:Service
P:      (description of über:Service)

A client that knows about über:Service is assumed to know about the RDF 
vocabulary that describes it (as well as knowing about RDF as a generic format).

You also noted that this is a very RDF-centric view of the web - and indeed the 
web is more than just RDF, so an alternative non-RDF approach to the 
über:Service description might be preferred; this is just a deployment of the 
existing media type based selection mechanisms you have described:

C: GET <P>
C: Accept: application/rdf+xml, text/turtle, application/über

P: 200 OK
P: Content-type: application/über
  :
P:
P: (parameters of über:Service)

I think that all I have added to the current mechanisms here is the possibility 
that an RDF response may describe multiple services that themselves may or may 
not be RDF based.  Thus, when the response is RDF (or any other format that 
supports diverse semantics), interpretation of the response body provides a 
second switching point after consideration of its media type.
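
The two switching points described above might be sketched in Python roughly as
follows.  This is only an illustration under stated assumptions: the RDF body is
assumed to be already parsed into (subject, predicate, object) tuples, and the
handler names and the dispatch function are hypothetical.

```python
RDF_MEDIA_TYPES = {"text/turtle", "application/rdf+xml"}
RDF_TYPE = "rdf:type"


def known_services(triples, rdf_type_handlers):
    """Return (service, handler) pairs for each described service type the client knows."""
    return [
        (subject, rdf_type_handlers[obj])
        for (subject, predicate, obj) in triples
        if predicate == RDF_TYPE and obj in rdf_type_handlers
    ]


def dispatch(media_type, body, media_handlers, rdf_type_handlers):
    # First switching point: the media type of the response.
    if media_type in RDF_MEDIA_TYPES:
        # Second switching point: RDF types found in the response body.
        return known_services(body, rdf_type_handlers)
    if media_type in media_handlers:
        return [(None, media_handlers[media_type])]
    return []  # nothing this client knows how to use


# Hypothetical client that knows the REST and über services, but not SPARQL.
rdf_handlers = {"prov:ProvenanceService": "use_rest", "über:Service": "use_uber"}
media_handlers = {"application/über": "use_uber_native"}
description = [
    ("<P1>", RDF_TYPE, "prov:ProvenanceService"),
    ("<P2>", RDF_TYPE, "sd:Service"),
    ("<P3>", RDF_TYPE, "über:Service"),
]
print(dispatch("text/turtle", description, media_handlers, rdf_handlers))
```

A client that knows nothing about über:Service simply skips that description,
which is the sense in which new descriptions need not break existing clients.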

(BTW, in preparing the above, I discovered http://www.w3.org/ns/formats/, which 
may suggest a route to media type based switching, with possible granularity 
refinements, based on an RDF description.  I haven't thought this through.)

...

Turning to the rest of your comments:

On 17/12/2012 17:17, Erik Wilde wrote:
> hello graham.
>
> On 2012-12-17 6:32 , Graham Klyne wrote:
>> It seems to me that RDF (in its various syntaxes) is a natural way to
>> provide a service description.
>
> it may be for a specific service, and in particular for one that's RDF-centric.
> for providing a more comprehensive starting point,
> http://www.rfc-editor.org/queue.html is under development, but of course you

(I guess that's meant to be a reference to @mnot's JSON-home draft? 
[http://tools.ietf.org/html/draft-nottingham-json-home-02])

> could argue that this could/should be RDF instead. it could, of course, but the
> important aspect always is that a client needs to know what it needs to know to
> use this kind of resource, so you would have to define a media type that makes
> that clear. it's application/json-home in its current form and you can recast
> this as application/json-home+turtle or something else to your liking to provide
> a more RDF-friendly variant. but in the end, there has to be some definition of
> how this works somewhere, and if you would RDFify mark's work, you might just
> reference his semantics, and create some RDF vocabulary for it.

(Yes, I've briefly discussed that possibility with him.)

>> A client application needs to understand
>> about RDF in general, and about specific RDF vocabularies that are used
>> to describe services.  A service provider can introduce new mechanisms
>> by introducing new descriptions (using new vocabulary terms) into the
>> RDF returned, without breaking existing clients.
>
> here you're taking a turn into a RDF-only world. which is one possible way to
> go, but not the web. if you want to provide a fabric that any client can use,
> then you have to tell them what they need to know to make things work. it's fine
> to tell them "parse turtle, create an RDF graph, and then interpret the
> following triples you find in there in the following way", but that's something
> you have to tell them. how else could somebody write a functioning client?
> you have to communicate the expectations for clients in a way that allows people
> to build those clients.

Yes, a client absolutely needs to know what it needs to know:  in order to
interpret an RDF-based service description, it clearly needs to know how to
process RDF data.  It would also need to be able to locate and extract
descriptions of services that it knows how to use.  This doesn't seem to me to
be unreasonable for an RDF-based application, though I accept that for non-RDF
applications this would be a big deal.  But the use of a second switching point
via RDF is optional, so it needn't impact non-RDF applications that use media
type based switching.

>> I think this approach is entirely consistent with Roy Fielding's
>> description of REST, in which a representation is "a sequence of bytes,
>> plus representation metadata".  The metadata may include, but is not
>> limited to, a media type description.
>
> i have to admit that i never went far in exploring this direction, because my
> area of research and work is "REST as the way URIs and HTTP are used in today's
> web fabric." i assume you could really treat REST as what it is, an
> architectural style, and come up with a different design, firmly based on REST
> and assuming that whenever information is being exchanged, it is in the form of
> some RDF. you might be able to design such a system, and it actually would be an
> interesting thought experiment. on the other hand, you would not have
> interoperability between today's fabric, and that hypothetical other one, so
> it's not a decision to be taken lightly.

I think I've covered this above:  use of RDF is via media type selection, so it 
remains possible to select other media types.

>> Thus, in our RDF service description for accessing provenance, we might
>> describe either or both of:
>> (1) a REST service description, e.g. per
>> http://www.w3.org/TR/prov-aq/#provenance-service-description, which
>> introduces a resource with RDF type 'prov:ProvenanceService' (this is a
>> fairly old PWD which is undergoing revision - part of my reason for this
>> discussion is to figure out what should go here).  One point of ongoing
>> debate is whether the RDF type should cover all of the access mechanisms
>> (REST and SPARQL).
>
> if you're planning to add general ways for how clients can find PROV information
> for a resource, i'd like to encourage you to use RFC 5988 and register something
> like "provenance" (if it cannot be folded into something that already is
> well-known on the web), so that the entire web community can use this
> interlinking consistently.

Absolutely.  RFC 5988 was a starting point for much of this work.  Rather than
registering a link relation name, we plan to use a URI as the relation type,
which RFC 5988 also allows.
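
For illustration only, here is a small Python sketch of how a client might pick
out a link whose relation type is a URI rather than a registered name.  The
relation URI shown simply expands the prov:ProvenanceService shorthand used in
the exchanges above and is an assumption, as are the example.org targets and the
function name links_with_rel.  The parser is simplified and does not cover every
corner of the Link header grammar.

```python
import re

# One link-value: <target> followed by zero or more ;name=value parameters.
_LINK_VALUE = re.compile(
    r'<([^>]*)>((?:\s*;\s*[\w*-]+\s*=\s*(?:"[^"]*"|[^\s;,"]+))*)'
)
_PARAM = re.compile(r';\s*([\w*-]+)\s*=\s*(?:"([^"]*)"|([^\s;,"]+))')


def links_with_rel(header_value, rel):
    """Return the targets of all links in a Link header value that carry `rel`."""
    targets = []
    for link in _LINK_VALUE.finditer(header_value):
        target, params = link.group(1), link.group(2)
        for param in _PARAM.finditer(params):
            name = param.group(1).lower()
            quoted, unquoted = param.group(2), param.group(3)
            value = quoted if quoted is not None else unquoted
            # A rel parameter may list several space-separated relation types.
            if name == "rel" and rel in value.split():
                targets.append(target)
    return targets


# Assumed relation URI and hypothetical targets.
PROV_SERVICE = "http://www.w3.org/ns/prov#ProvenanceService"
header = (
    '<http://example.org/prov>; rel="' + PROV_SERVICE + '", '
    "<http://example.org/page2>; rel=next"
)
print(links_with_rel(header, PROV_SERVICE))
```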

>> But this does not use media type to discriminate between the options.
>
> what i am wondering is how my simple web client, using an XML-based access to your
> data, would then find out. it seems you're making assumptions about the URIs
> based on their context, instead of providing URIs that are self-contained and
> could be taken entirely out of context and still would allow me to interact with
> the services. could you briefly show how you would, if you were building non-RDF
> access to your service, expose the PROV links?

Clearly, if we're using RDF descriptions, the client needs to understand RDF.
So your XML-only client would presumably get an HTTP 406 Not Acceptable response
to its content negotiation, and be forced to fail.  (Assuming the service
doesn't also offer an XML variant that your client does understand.)  I don't
see this as any different from the current mechanisms, which must fail if the 
client and server don't both understand some common mechanism and its 
corresponding media type.
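
The failure case can be sketched as a toy server-side negotiation in Python.
This is a deliberately minimal illustration: it ignores q-values and media type
parameters, and the function name negotiate is hypothetical.

```python
def negotiate(accept_header, offered):
    """Return (status, media_type): 200 with the first acceptable offer, else 406."""
    accepted = [
        item.split(";")[0].strip().lower()
        for item in accept_header.split(",")
        if item.strip()
    ]
    for media_type in offered:
        for pattern in accepted:
            if (
                pattern == "*/*"
                or pattern == media_type
                # "type/*" matches any subtype of that type.
                or (pattern.endswith("/*") and media_type.startswith(pattern[:-1]))
            ):
                return 200, media_type
    return 406, None  # no common media type: the interaction fails


rdf_only_service = ["text/turtle", "application/rdf+xml"]
print(negotiate("application/rdf+xml, text/turtle", rdf_only_service))
print(negotiate("application/xml", rdf_only_service))
```

The XML-only client fails here in the same way that any client fails when it
shares no media type with the server.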

>> Depending only on the client's knowledge of how to process the media
>> types seems to me to be a narrow view compared with "examining and
>> choosing from among the alternative state transitions in the current set
>> of representations" [Fielding].  I'm concerned that this dependency
>> leads to an overloading of media types, which "specify the native
>> representation (canonical form) of such data"
>> [http://tools.ietf.org/html/rfc2045] to also cover data semantics.  For
>> formats where the semantics is tied to a specific format, this seems not
>> to be a problem, but does not sit so well with RDF which uses common
>> syntax to convey arbitrary semantics.
>
> there seems to be this lingering feeling that "RDF is different" because "it's
> semantics, and not a format".

I wouldn't put it that way, but RDF is different from most of the XML-based 
formats used.  It is a common format that can carry (and merge) arbitrary 
semantics.  That's not true of most other data formats on the web.  I don't 
claim it's unique, and certainly there could be other formats with similar 
capabilities, and they too could be used in a similar way to RDF provided that 
there are media types to distinguish the base format.

> ... pretty much all data formats (apart from those
> using ad-hoc syntaxes) work exactly that way, so there really is no need to
> deviate from what has been working in the past 20 years. application/atom+xml is
> atom semantics using xml syntax,

I would argue that, technically, XML is *not* a syntax.  It's a family of 
syntaxes.  It's a syntactic framework.  Each XML-based document type, defined by 
DTD, XSD, RelaxNG or other means is a syntax, which may have more or less
associated semantics.  RDF/XML is an XML-based syntax that provides a semantic 
framework for conveying arbitrary semantics; i.e. descriptions of arbitrary things.

> ... and if somebody felt the urgent need to provide
> that in EXI, they would need to register application/atom+exi and then people
> could start using atom with a different xml syntax. in the end, clients need to
> know what they have to parse when they start GETting something, so you must say
> which syntax(es) are available.

Sure.  But syntax is the starting point, not necessarily the endpoint.

> ... personally, i don't think providing multiple
> syntaxes is a very good idea, but that's just my personal opinion. but as a
> matter of fact, pretty much all structured data services today use the exact
> same setup of "parse the data based on some general-purpose underlying syntax",
> and then "start processing the data based on some assumption what kind of
> vocabulary is expressed within that syntax." RDF is doing exactly the same as
> XML or JSON, just with a different syntax and metamodel.

I disagree that RDF is doing the same as XML.  RDF doesn't have any notion of
syntactic constraints (like DTD, XML schema, etc.).  JSON is a more interesting
case, as it doesn't (yet) have a way to impose schema constraints.

And RDF *is* different from both in that it does have a (minimal) formal
semantics, which neither XML nor JSON have.  It's insufficient semantics to 
describe useful things in the world, but it's enough to provide some basic 
ground rules for preserving any meaning that is conveyed, e.g. when merging RDF 
from independent sources.
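
As a toy illustration of that merging property, the Python sketch below treats
each graph as a set of ground triples, so merging is just set union and every
statement keeps its meaning.  This assumes there are no blank nodes (which would
need to be renamed apart before merging); the triples and the function name
merge are hypothetical.

```python
def merge(*graphs):
    """Merge graphs of ground triples: the result is the union of all statements."""
    merged = set()
    for graph in graphs:
        merged |= set(graph)
    return merged


# Two hypothetical, independently published descriptions of the same resource.
source_a = {("<R>", "prov:wasAttributedTo", "<alice>")}
source_b = {
    ("<R>", "prov:wasAttributedTo", "<alice>"),
    ("<R>", "prov:wasDerivedFrom", "<R0>"),
}
print(sorted(merge(source_a, source_b)))
```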

#g
--

>> One thing we lose by not using a media type to distinguish between the
>> access options is the ability to use content negotiation for selection.
>> I think this is somewhat offset by a server easily being able to supply
>> multiple RDF descriptions in a single response, so the process can
>> proceed without any additional round-trips.  (This doesn't preclude the
>> use of media types and content negotiation for non-RDF service
>> descriptions, so I think the essential flexibility and evolvability of
>> the REST style is not compromised.  In future, maybe we have a way to
>> negotiate on RDF types?)
>
> i don't think that RDF will ever become part of HTTP-level operations. we can
> negotiate on media types because that's what the uniform interface is using. and
> you can easily negotiate on RDF types as well, but you must make them part of
> the uniform interface. if you started introducing RDF-specific mechanisms in the
> uniform interface, it wouldn't be uniform anymore, because these things would be
> only usable for RDF clients, and then we'd have the split i was mentioning
> above. i'd really like to avoid doing that, because i think that both the
> non-RDF web and RDF would benefit tremendously from talking more to each other.
>
>> In summary, I think there can be alternatives to using media types for
>> guiding the "how?" of interactions, without compromising the essential
>> evolvability properties of REST.  But maybe you can point out where the
>> problems lie in the approaches I sketch here?
>
> i've given it a try. would you mind re-reading my last paragraph in the last
> response and specifically walking me through the über-PROV example (and let's
> assume that über-PROV is not going to be RDF-based, but is just so clearly
> superior that people want to start using it) and how you would see that playing
> out in your architecture? just don't make the assumption that clearly, going
> forward, everything useful that will ever happen on the web will be using RDF.
> this is just not how it's going to be.
>
> thanks,
>
> dret.
>
Received on Monday, 17 December 2012 23:51:11 UTC