Re: Named Containers : a framework for aggregation and query from Tom Adams on 2004-10-05 (public-rdf-dawg@w3.org from October to December 2004)

From: Tom Adams <tom@tucanatech.com>
Date: Tue, 5 Oct 2004 11:55:12 -0400
To: DAWG list <public-rdf-dawg@w3.org>
Message-Id: <F11FB01A-16E6-11D9-B04B-000A95C9112A@tucanatech.com>
Andy,

I like this a lot. I have some other general comments that may belong
here, but probably not, so I'll address them in reply to the appropriate
mails when I catch up on my mail.

> <snip/>
> ==== FROM
>
> This is as much about "protocol" as query but it's needed for the local
> query case where there isn't a protocol layer.

I'm not sure I like the idea of putting this in the protocol. Perhaps
establishing a default context (case 3 below) would make this valid,
but generally I like being able to address the container using a query.
I'll need to think about this some more...

> FROM establishes the data for a query.  How URIs of named containers get
> handled is up to the implementation but some systems will load URLs and
> files, some will attach to databases and some will do nothing much
> because the system environment handles getting to some collection of
> named containers.  There is no requirement to load URLs across the web.

We need to be sure we separate (or at least don't preclude separating)
the naming of containers from the protocol that is used to communicate
with them. In Kowari/TKS we currently use the scheme of the model URI to
determine the transport protocol, which has the consequence of binding
model names not only to protocols but also to servers (the host is
included in the URI).
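
As a rough illustration of that coupling, here is a minimal Python
sketch. The URI is a made-up example and describe_model_uri is a
hypothetical helper, not part of Kowari/TKS. It only shows that once the
scheme selects the transport, the same name also carries the host.

```python
from urllib.parse import urlsplit


def describe_model_uri(uri):
    """Split a model URI into the parts a scheme-driven system would act on."""
    parts = urlsplit(uri)
    return {
        "transport": parts.scheme,  # the scheme doubles as the wire protocol
        "host": parts.hostname,     # the server the model name is bound to
        "path": parts.path,
        "model": parts.fragment,
    }


# Hypothetical model URI, used for illustration only.
info = describe_model_uri("rmi://example.org/server1#mymodel")
print(info)
```

Moving the model to another host, or exposing it over another protocol,
changes its name under this scheme, which is the coupling described
above.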

> == Case 1: "FROM <u1> <u2>"
>
> Build a data context with two named containers named <u1> and <u2>.

Do we make this an implicit AND?
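
To make the question concrete, here is a minimal Python sketch of the
two readings, treating each container as a plain set of triples. The
container contents are invented for illustration, and reading AND as
intersection and OR as union is an assumption about what the keyword
would mean here.

```python
# Each named container is modelled as a set of (subject, predicate, object) triples.
u1 = {("ex:a", "ex:knows", "ex:b"), ("ex:a", "ex:name", "Alice")}
u2 = {("ex:a", "ex:knows", "ex:b"), ("ex:b", "ex:name", "Bob")}

# Reading 1: implicit OR, the data context is the union of both containers.
as_union = u1 | u2

# Reading 2: implicit AND, only statements present in both containers survive.
as_intersection = u1 & u2

print(len(as_union), len(as_intersection))  # prints: 3 1
```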

> == Case 2: "FROM <u1>"
>
> Build a data context with one named container.  Accessing the container
> via SOURCE and accessing the aggregation sees the same RDF graph down to
> the bNodes. If there is no SOURCE in the query, this is just querying
> the graph identified by <u1> by however the system does it.
>
> == Case 3: No FROM in query.
>
> The implementation has to set the query data context.  This can be a
> single graph or a collection of named containers.
>
> If there is no name information, SOURCE ?src ( ?x ?y ?z ) can either:
>
> 3a/ fail - ?src can't be bound
>
> 3b/ match as if it's a single graph but ?src is not bound.
>
> Note: it's not possible to create a mix of named and unnamed containers
> in the query data.  That is intentional.  Implementations may choose to
> allow this but there would be no test cases.  Same goes for ?src being a
> bNode and having some vocabulary to describe the container or container
> graph.
>
> I'd expect the case of no FROM, and getting the query context from
> outside to be common in the local case.
>
>
> == Case 4: "FROM <u> <u>" (same URI)
>
> This highlights the case where two URIs name the same graph; in more
> general cases this would have to be done outside the query language FROM
> statement.
>
> For the same URI case, this can go one of two ways:
>
> 4a/ Creates a data context with two named containers that do not share
> bNodes.  It's like reading in the file twice.
>
> 4b/ Creates a data context with two named containers that name the same
> graph.  bNodes are the same.
>
> 4c/ Make it illegal.
>
> Because the same URI is used, it's possible to get indistinguishable
> query results - that's an argument in favour of 4c.
>
>
> ==== Systems
> <snip/>
> == Kowari/TKS
>
> The "from" keyword in Kowari allows the creation of a target graph
> through the union and intersection of sets of statements.  If bNodes are
> kept distinct, union is RDF-merge because Kowari works on sets: the
> union will do the duplicate suppression (could someone confirm this
> please?)

Yes, you're right: Kowari works on sets, so duplicate statements are
suppressed. bNodes are unique within a server; that is, no two models on
the server will contain the same identifier for different bNodes.

Kowari allocates node IDs for bNodes at a server level, and makes no 
attempt to keep them globally unique.
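
A minimal Python sketch of that behaviour, under stated assumptions: the
Server class, its counter and the "_:node" labels are hypothetical
stand-ins, not Kowari's actual implementation. It shows a single
server-wide allocator handing out bNode IDs, and a set union of two
models dropping the duplicate statement while keeping the two bNodes
distinct.

```python
import itertools


class Server:
    """Hypothetical stand-in for a server that owns several models."""

    def __init__(self):
        # One counter shared by every model on this server.
        self._next_id = itertools.count(1)
        self.models = {}

    def new_bnode(self):
        # IDs are unique on this server only; another server could reuse them.
        return f"_:node{next(self._next_id)}"

    def add(self, model, s, p, o):
        # A model is a set of triples, so repeated statements collapse.
        self.models.setdefault(model, set()).add((s, p, o))


server = Server()
b1 = server.new_bnode()
b2 = server.new_bnode()
server.add("m1", b1, "ex:name", "Alice")
server.add("m1", "ex:a", "ex:knows", "ex:b")
server.add("m2", b2, "ex:name", "Alice")
server.add("m2", "ex:a", "ex:knows", "ex:b")

# The shared ground statement appears once; the two bNode statements stay apart.
merged = server.models["m1"] | server.models["m2"]
print(len(merged))  # prints: 3
```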

> In addition, the "in" keyword allows a pattern to be applied to a named
> graph.  It appears that the graph name can't be a variable.

This is a very handy feature, but no, I don't believe the graph name
(aka model URI) can be a variable. I'll give it a shot and see if we
can bind a value to it.

Cheers,
Tom
-- 
Tom Adams                  | Tucana Technologies, Inc.
Support Engineer           |   Office: +1 703 871 5312
tom@tucanatech.com         |     Cell: +1 571 594 0847
http://www.tucanatech.com  |      Fax: +1 877 290 6687
------------------------------------------------------
Received on Tuesday, 5 October 2004 15:55:16 UTC