RE: ACTION: elaborate on 4.4 from Seaborne, Andy on 2004-06-24 (public-rdf-dawg@w3.org from April to June 2004)

From: Seaborne, Andy <andy.seaborne@hp.com>
Date: Thu, 24 Jun 2004 21:13:16 +0100
To: "'RDF Data Access Working Group'" <public-rdf-dawg@w3.org>
Message-ID: <000201c45a27$b22e2710$0a01a8c0@atlas>
First, I'll observe that there is "result formats" meaning the abstraction,
like graphs or variable binding result tables, and there is the concrete
"result formats" meaning serialization on-the-wire.  Some of the discussion
so far has not been completely clear as to which is which.  For the doc, we
could avoid the ambiguous phrase "result formats" and have "form of results"
and "serialization" or some such wording.
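
To illustrate the distinction, here is a rough sketch (not a proposal): one
"form of results", a table of variable bindings, rendered in two different
"serializations".  The data, the function names and the JSON layout are all
made up for the example.

```python
import csv
import io
import json

# Form of results: an abstract table of variable bindings (hypothetical data).
variables = ["name", "mbox"]
bindings = [
    {"name": "Alice", "mbox": "mailto:alice@example.org"},
    {"name": "Bob", "mbox": "mailto:bob@example.org"},
]


def to_json(variables, bindings):
    """One possible serialization of the table: a JSON document.

    The {"vars": ..., "rows": ...} layout is an assumption for illustration.
    """
    return json.dumps({"vars": variables, "rows": bindings}, sort_keys=True)


def to_csv(variables, bindings):
    """Another serialization of the same table: CSV text with a header row."""
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=variables, lineterminator="\n")
    writer.writeheader()
    writer.writerows(bindings)
    return out.getvalue()
```

The table is the same in both cases; only the bytes on the wire differ.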

Overall, I don't see much difference here: we have a requirement "3.5 Local
Queries" so the query language has to be usable in that context.  Without
some concrete usages, I can't see which "query parameters" we need in the
local case which aren't in the query or a matter for the local API (e.g.
which local model to query).

The role of the protocol is about concrete serializations: of the query (and
other parameters needed at the serialization level) and of the result
serialization.

Rob's message has a lot of points in one place so I'll comment here:

"> " is Rob, 
"> > " is Jim

-------- Original Message --------
> From: public-rdf-dawg-request@w3.org <>
> Date: 24 June 2004 18:40
> 
> > WHY do you believe we should keep this independent and that it is
> > bizarre (your word) to do this?
> 
> The main point is that a query language is valuable
> independent of protocol.

Agreed.  And I would go further to say that the use of an existing protocol
(HTTP for example comes to mind) really does help deployment.

> Further, from a practical point of
> view I'm quite skeptical that this group will be able to come
> up with a general or robust protocol. No matter what protocol
> decisions we make, they will not be appropriate in all
> circumstances. Want to query a local RDF store (like
> in-process access to an RDF config file)? You probably want
> just an API, no network protocol.

Indeed, requirement "3.5 Local Queries" draws this out and so there is a
necessary split between the protocol and the query language.  In some sense
this is necessary - after all, on receipt, the server has to execute the
query locally.

> Want to connect to a remote
> RDF repository that's not under your control? You probably
> want an easily implemented universal protocol. Want to build
> a vertical app with a networked RDF store in its core? You
> probably want a robust and efficient protocol, even if that
> requires complex state management between client and server.

Agreed - HTTP and SOAP are the obvious, but not exclusive, ones.  Multiple
protocol bindings (realisations) will be necessary long-term for both
intranet and internet use.

> 
> > It seems to me that many very
> > successful protocols do indeed interact with the things they serve in
> > various ways (cf. http and mime types, http design and html)

In serialization matters, yes.  Networking has had a very strong sense
of layering and abstraction to make it clear who does what.  Sometimes the
protocol is tuned to an application scenario (TCP faststart and Nagle's
algorithm) although the functionality is well defined and maintained.

I think the responsibility of our protocol is to move a query string and get
the results back.  To do that, it has to worry about destination,
serialization & streaming.  Streaming triples was mentioned - that is one
form of a graph.  [A local API may also address some of these issues when it
is about system resource management.]
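
As a minimal sketch of that division of labour, assuming an HTTP binding:
the protocol layer below only packages an opaque query string for a
destination and names the result serialization it would like back.  The
endpoint URL, the "query" parameter name and the query text are
placeholders for illustration, not anything the WG has agreed.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_query_request(endpoint, query, accept):
    """Package a query string for transport over HTTP; nothing is sent here.

    The "query" parameter name is a hypothetical choice for this sketch.
    """
    url = endpoint + "?" + urlencode({"query": query})
    return Request(url, headers={"Accept": accept})


req = build_query_request(
    "http://example.org/rdf-store",  # destination (hypothetical)
    "SELECT ?x WHERE (?x ?p ?o)",    # placeholder text, opaque to this layer
    "application/rdf+xml",           # requested result serialization
)
```

Nothing in this layer needs to parse the query; it only needs to know
where to send it and how the results should come back.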

> 
> The fact that HTML has meta-data for HTTP headers in it is
> admittedly some slight inter-mingling (and I argue that's
> it's generally undesirable). But an HTML file does not, for
> example, contain different sets of information based on what
> Accept: header is passed in HTTP. HTML is a useful standard
> because it does what it does, and no more. If you had to
> perform complex programming for the special case of somebody
> wanting a different kind of HTML, then you'd lose both the
> simplicity of HTML that has spurred widespread adoption and
> the upper-layer transformations that have been added to web
> servers and plugins to overcome problems that were not
> foreseen when HTML was designed.
> 
> > and the
> > same is true in many query systems - esp. datalog- and OODB- based
> > protocols where it is not uncommon for some sort of information about
> > the query form to be part of the protocol (often simply as a
> > parameter to the query that can be brought in separate from the query
> > form itself).  I'm not an expert on JDBC, but I understood that it
> > also had some mechanisms (maybe they're system adds) to do this for
> > reporting back the results of SQL queries (sort of the opposite -
> > i.e. the protocol was specifically designed for the query language)
> 
> This is a very important point.
> JDBC is not a protocol. JDBC is an API. The actual
> over-the-wire format is entirely implementation-dependent.
> JDBC simply provides Java-language wrappers such that you can
> pass SQL to a database and collect the results.

Yes - but.  The fact there is an abstraction at the API does not enable
interworking.  If I have an app working against MySQL, even for simple
operations, it requires a (small) change to the app to now work with
PostgreSQL (aside from SQL issues).

The API has isolated the application from details; it has not enabled one
app to work with another server.

The value is the ease of porting; not interworking.  This WG is chartered
with interworking, hence the protocol matters. API issues are
internal-to-a-machine.

> 
> >   I'm not arguing, I'm just saying it does not seem /a
> > priori/ bizarre
> > to me to see a Web-based protocol and a Web-based language assuming
> > some sort of interaction with respect to Web formats and language
> >   issues. thanks
> >   JH
> 
> I think a query language is much much more important at this
> stage in the game than a network protocol, and I am quite
> sure that a good language would be used in many cases where a protocol
> would not. 
> 
> The concept of publicly-accessible RDF repositories sitting
> on the network is appealing, but I think it's a long way off.
> Issues like brokering, discovery, and aggregation need to be
> addressed before the "global distributed RDF knowledgebase"
> becomes a reality, and there is no chance at all that this
> group will find decent solutions to those problems.

This is mixing in things from a different level in the networking stack and
getting into a different space: it is not about remote access.  You bring in
a lot of interesting issues but they aren't in scope (personally, I'd like
to get involved in them!).

> For the
> foreseeable future, I am confident that the vast majority of
> RDF applications will be a client application connecting to
> an RDF repository for which it was explicitly programmed.
> 
> To be honest, from a developer's point of view it's the query
> language that requires the real investment. Hiding protocol
> behind a JDBC-like API for the purpose of future modification
> or optimization is already standard programming practice.

The hiding isolates the app from the driver (application developer issues),
but it does not make it possible for one application to access different
sources in the same way without some change (deployment issue).  We also
need to consider the information publisher's concerns - to make the
information widely available without tying it to known applications.

	Andy
Received on Thursday, 24 June 2004 16:13:50 UTC