Re: SPARQL Protocol for RDF from Patrick Stickler on 2005-06-03 (semantic-web@w3.org from June 2005)

From: Patrick Stickler <patrick.stickler@nokia.com>
Date: Fri, 3 Jun 2005 07:42:38 +0300
To: "ext Giovanni Tummarello" <giovanni@wup.it>
Cc: semantic-web@w3.org
Message-Id: <e6d89fe7a98ecd66ce6a7f77562c9ca3@nokia.com>
On Jun 3, 2005, at 00:59, ext Giovanni Tummarello wrote:

>
>>
>> form of description is mandated for DESCRIBE (and IMO shouldn't
>> be) there should at least be *some* form of default form of
>> description recommended -- so that implementers of SPARQL processors
>> would adopt some consistent form of response unless they had
>> strong reasons to do otherwise, but it seems such a proposal
>> is not sufficiently valued by enough members of the WG.
>
> Hasn't someone suggested a way to point at some URI for the definition
> of either the requested description or the one that's being provided?

I'm not entirely sure what you mean here. DESCRIBE allows you
to "point at some URI" for a description, but you can't ever
be sure what an arbitrary SPARQL processor will return for
such a description.

URIQA allows one to "point at some URI" and get a CBD,
to the extent that the URIQA service in question knows about
the resource.

>
>> CONSTRUCT
>> {
>>    ?s1 ?p1 ?o1 .
>>    ?o1 ?p2 ?o2 .
>>    ?r1 rdf:subject   ?s1 .
>>    ?r1 rdf:predicate ?p1 .
>>      ?r2 ?r2p ?r2o .
>>    }
>> }
>>
>> I guess this can be seen as a "poor man's CBD for SPARQL".
>
> make that desperate man's ;-) you can't do it fully given there are no
> recursive queries.

True. And I don't ever expect to be that desperate, which is
why I gave up trying to "perfect" such a query ;-)

>
>> Of course, one could easily autogenerate a pretty exhaustive
>> version that would cover essentially all practical cases.
>
> :-) maybe...
> anyway, of course if there was syntactic support for reification your
> example would be much smaller and clearer, but syntactic support
> (which doesn't appear to be a major implementation burden, isn't it a
> regex substitution?) was removed.
>
> While I understand that CBDs and Minimum Self-Contained Graphs are probably
> "strange" concepts, what made me sad is giving up the idea of using
> SPARQL also in other projects such as RDF Textual encoding [1],
> since there is no support for lists.
> While I do understand why it seems difficult to implement certain
> features efficiently, I don't get why the standard couldn't list them
> anyway, and people would just know that if they were to use them
> they'd pay a high computational price. Better than having to write a
> lot of Java code.

Well, I think that anyone who has had any experience with SGML
can tell you about what "optional features" can do to the adoption
and consistent implementation of a standard.

I'd rather just leave such "optional features" out of the standard,
and make it clear that individual implementations are welcome to
include value added functionality, and if the industry converges
on efficient, consistent solutions to such functionality, then
perhaps a future version of the standard can reflect that.

The cleaner, tighter, and more easily digested standard will be
more successful insofar as adoption and implementation are concerned.

Also, note that the recursive functionality needed to obtain CBDs
can also be provided by a rule layer working in conjunction with
SPARQL, and might very well be better addressed at such a layer.
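
To illustrate the recursion that a fixed-depth CONSTRUCT pattern cannot
express, here is a minimal sketch of a CBD traversal outside of SPARQL.
It assumes the graph is held in memory as a set of (subject, predicate,
object) string tuples and that blank nodes are written with a "_:" prefix;
the function names are hypothetical and not taken from any toolkit.

```python
from typing import List, Set, Tuple

Triple = Tuple[str, str, str]
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"


def is_blank(node: str) -> bool:
    # Assumption: blank nodes carry a "_:" prefix, as in N-Triples.
    return node.startswith("_:")


def reifiers(graph: Set[Triple], triple: Triple) -> Set[str]:
    """Return the nodes that reify `triple` via rdf:subject/predicate/object."""
    s, p, o = triple
    subj = {r for (r, rp, ro) in graph if rp == RDF + "subject" and ro == s}
    pred = {r for (r, rp, ro) in graph if rp == RDF + "predicate" and ro == p}
    obj = {r for (r, rp, ro) in graph if rp == RDF + "object" and ro == o}
    return subj & pred & obj


def cbd(graph: Set[Triple], start: str) -> Set[Triple]:
    """Collect the statements about `start`, following blank-node objects
    and reifications to arbitrary depth."""
    result: Set[Triple] = set()
    pending: List[str] = [start]
    visited: Set[str] = set()
    while pending:
        node = pending.pop()
        if node in visited:
            continue
        visited.add(node)
        for triple in graph:
            if triple[0] != node:
                continue
            result.add(triple)
            if is_blank(triple[2]):
                # Blank-node objects are described recursively.
                pending.append(triple[2])
            # Reifications of an included statement are described as well.
            pending.extend(reifiers(graph, triple))
    return result
```

The work list is what a single query without recursion lacks: the depth
of the blank-node chain is not known in advance, so each extra level
would otherwise need another hand-written pattern.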

>
> In a partly unrelated matter, does anyone know how one can cope in
> SPARQL with more than one context?
> Say one uses NG to indicate who the author is. OK, then after some
> time one also wants to distinguish between the "red" triples and
> "blue" ones, or by other facets such as the original site where they
> were posted, etc. Should one exponentially multiply the
> number of named graphs (creating new graphs like fromGiovanni_red,
> fromGiovanni_blue), giving (facetvalues)^(number of facets) graphs; make a
> number of graphs equal to the number of triples (in the case of fuzzy
> trust values, for example); or simply duplicate triples once per aspect
> (the same triple would appear in the giovanni graph AND in the red graph,
> and of course I would have to remember to delete it from the red graph
> when giovanni revokes it as well)?
> Is there a best-practices suggestion for this already?

I myself have no hard experience in this area, but for what it's
worth, if I were approaching this problem tomorrow, I would maintain
named graphs according to the source/management of the data, and
as needed, infer other graphs (various intersections) by rules
or other machinery.

As for global trans-graph operations, that would be a matter
of controlling visibility/significance of graph boundaries
on operations, such that one may have an API function which
identifies/deletes triples either within a specific graph
or regardless of their occurrence in any particular graph.

Thus, most/all of the above would be functionality I would
encapsulate in an API/Toolkit for working with named graphs,
and not try to capture explicitly/persistently in the
knowledgebase.
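
As a rough sketch of what such a toolkit layer might look like, the
following keeps one stored graph per name, derives intersections on
demand instead of persisting them, and lets a delete either respect or
ignore graph boundaries. The class and method names are hypothetical,
and the in-memory dictionary is only an assumption made for illustration.

```python
from collections import defaultdict
from typing import Dict, Optional, Set, Tuple

Triple = Tuple[str, str, str]


class NamedGraphStore:
    """Toy in-memory store: graph name -> set of triples."""

    def __init__(self) -> None:
        self._graphs: Dict[str, Set[Triple]] = defaultdict(set)

    def add(self, graph: str, triple: Triple) -> None:
        self._graphs[graph].add(triple)

    def graphs_containing(self, triple: Triple) -> Set[str]:
        return {name for name, triples in self._graphs.items() if triple in triples}

    def delete(self, triple: Triple, graph: Optional[str] = None) -> int:
        """Remove `triple` from one graph, or from every graph when
        `graph` is None. Returns the number of graphs it was removed from."""
        targets = [graph] if graph is not None else list(self._graphs)
        removed = 0
        for name in targets:
            triples = self._graphs.get(name)
            if triples and triple in triples:
                triples.remove(triple)
                removed += 1
        return removed

    def intersection(self, *names: str) -> Set[Triple]:
        """Derive the triples common to the named graphs without storing them."""
        sets = [self._graphs.get(name, set()) for name in names]
        return set.intersection(*sets) if sets else set()
```

With this shape, a revocation is a single `delete(triple)` call with no
graph argument, so the caller does not have to remember every graph in
which the triple was duplicated.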

>
>>
>> I do expect/hope to see knowledge portals supporting
>> both SPARQL query interfaces as well as URIQA query
>> interfaces, and agents can then benefit the most from
>> both, either asking /sparql?query=... or /uriqa?uri=...
>>
> Without a way to specify the kind of description, I guess it will be hard
> for URIQA to be supported in practice. Which is a pity..

Not sure why you think so. URIQA support is actually pretty
simple, and any SPARQL implementation will have all the
machinery for also providing a separate URIQA interface to
the same knowledge base.

And I expect that most SPARQL implementations will be
created atop otherwise stand-alone knowledge bases, and
thus support for multiple portals to that knowledge will
be the norm. And in fact, we already see a lot of knowledge
servers providing multiple portals based on
alternative query models.
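
For illustration only, the two request styles mentioned earlier in this
thread (/sparql?query=... and /uriqa?uri=...) could be built by a client
as sketched below. The host name and the helper function names are
assumptions; only the two paths and parameter names come from the
discussion above.

```python
from urllib.parse import urlencode


def sparql_url(base: str, query: str) -> str:
    """Build a request URL for the SPARQL portal of a knowledge server."""
    return f"{base}/sparql?{urlencode({'query': query})}"


def uriqa_url(base: str, uri: str) -> str:
    """Build a request URL for the URIQA portal of the same server."""
    return f"{base}/uriqa?{urlencode({'uri': uri})}"


# Hypothetical server and resource, used only as an example.
base = "http://example.org"
resource = "http://example.org/thing"

print(sparql_url(base, f"DESCRIBE <{resource}>"))
print(uriqa_url(base, resource))
```

An agent that only needs a description of one resource can use the
second form, and fall back to the first when it needs a more expressive
query against the same knowledge base.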

I simply see SPARQL as something complementary to URIQA,
for when more expressive queries are needed. While it would
have been great to have been able to express CBDs in SPARQL,
I don't see the inability to do so as a major setback, just
a minor inconvenience. I also hope that in time, given
sufficient deployment and experience with SPARQL services,
the industry at large will appreciate the broad utility
of CBDs and even standardize on that form of description
as a default response to DESCRIBE. We'll see...

>  since serving CBDs (or RDFN, MSGs) can be proven to be scalable
> similar to today's web, each request having a computational
> complexity at worst proportional to the number of blank nodes.
> However, it can be fully cached ( with a size just a factor of the  
> original graph) and efficient caching update algorithms are possible  
> (you can have a reverse index on the URIs it touches so when someone  
> inserts a statement the cache can be recalculated). On the contrary  
> letting people execute arbitrary sparql at your server seems to me  
> hardly sustainable in the open once the SW moves from good faith  
> aggregation hackers to the real world...no?

Well, there are a lot of benefits to implementing/deploying/using
URIQA rather than SPARQL *if* you do not need the additional
functionality that SPARQL provides. I.e. there will be many web
servers for which SPARQL is not only overkill, but a burden to
provide, yet for which simple CBDs of resources accessible via
that server would be highly useful.

At the same time, there will be servers for which SPARQL is
ideal, and absolutely necessary.

We have many instances of both kinds of servers, and we utilize
both forms of query support in various combinations.

It's not either-or. It's which (or both) are best for a given
application.

Cheers,

Patrick


>
> Giovanni
>
> [1]  Early version  
> http://giovanni.ea.unian.it/semanticweb/submissions/ELPUB2005/rdftef.pdf
>
>
>
Received on Friday, 3 June 2005 04:43:19 UTC