Re: Comments on SPARQL protocol document

Graham Klyne wrote:
> Kendall Clark wrote:
> 
>>On Fri, Sep 16, 2005 at 10:09:10AM +0100, Graham Klyne wrote:
>>
>>
>>>With reference to:
>>>http://www.w3.org/TR/2005/WD-rdf-sparql-protocol-20050914/
>>>
>>>I have not close-read this.  On a quick skim, two thoughts come to mind:
>>
>>
>>Graham, thanks for your comments. A few quick notes in response, though this
>>is only a personal response, not one from the WG:
>>
>>
>>
>>>(1) the SPARQL query language makes reference to the possibility of
>>>generating warnings under certain circumstances; cf.
>>>http://www.w3.org/TR/2005/WD-rdf-sparql-query-20050721/#construct
>>>
>>>In light of this, it might be appropriate for there to be some way to
>>>return any warnings along with the query results.
>>
>>
>>Not sure what you mean about including warnings with the query results. Do
>>you mean a warning in the protocol itself or including a warning in the
>>query results themselves?
> 
> 
> I read the QL spec as indicating that a warning might be generated in 
> addition to returning the query results, but as you say it is a bit 
> vague.  I did make a comment about this in my notes on the QL spec, so I 
> guess it's something to be worked out between QL and protocol?

A warning can be generated in a CONSTRUCT when a template is instantiated - a 
result is still generated.  This was something the WG explicitly discussed.

I would have thought that query processors may choose to generate warnings in 
other, helpful situations such as query debugging or encountering some 
condition that is unusual but can be (partially) recovered from.

Example: FILTERs tend to evaluate to false when unexpected conditions are 
encountered, like a wrong datatype or an illegal lexical form for the datatype 
(I have seen this with xsd:dateTime particularly).  Emitting a warning is very 
helpful in tracking down why a query is returning fewer results than expected.
Warnings might arise when reading a graph via default-graph-uri.
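
To make the FILTER case concrete, here is a rough sketch in Python.  It is not 
taken from any real query processor: the function name filter_after, the 
(lexical form, datatype) pair representation and the warnings list are all 
assumptions made up for illustration.  A solution that fails the filter because 
of bad data is dropped as usual, but the reason is recorded rather than lost.

```python
from datetime import datetime

XSD_DATETIME = "http://www.w3.org/2001/XMLSchema#dateTime"
XSD_INTEGER = "http://www.w3.org/2001/XMLSchema#integer"


def filter_after(solutions, var, cutoff, warnings_out):
    """Keep solutions where ?var is an xsd:dateTime later than cutoff.

    Hypothetical helper: each solution maps a variable name to a
    (lexical form, datatype IRI) pair. A type error makes the filter
    false for that solution, and a warning is appended to warnings_out.
    """
    kept = []
    for row in solutions:
        lexical, datatype = row[var]
        if datatype != XSD_DATETIME:
            warnings_out.append(f"?{var}: expected xsd:dateTime, got <{datatype}>")
            continue
        try:
            value = datetime.fromisoformat(lexical)
        except ValueError:
            warnings_out.append(
                f"?{var}: illegal lexical form {lexical!r} for xsd:dateTime"
            )
            continue
        if value > cutoff:
            kept.append(row)
    return kept


# Made-up sample data: one good value, one bad lexical form, one wrong datatype.
solutions = [
    {"date": ("2005-09-16T10:09:10", XSD_DATETIME)},
    {"date": ("16/09/2005", XSD_DATETIME)},
    {"date": ("20050916", XSD_INTEGER)},
]
warnings_out = []
kept = filter_after(solutions, "date", datetime(2005, 1, 1), warnings_out)
```

Without the warnings list, the caller would only see one solution where three 
were expected, with no hint as to why.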

Whether such warnings are accessible over the protocol is a separable decision.
When streaming and when a helpful warning is to be delivered, the only 
mechanism we have is the result set <link>, but that isn't nice.
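
For illustration only, a sketch of what using <link> might look like, written 
in Python with the standard library XML module.  The namespace IRI, the helper 
name result_set_head and the warnings URL are assumptions, not anything the WG 
has agreed.  The awkward part is visible in the shape of the document: the head 
is written before any results, so the link can only point at a separate 
resource that the client has to fetch afterwards.

```python
import xml.etree.ElementTree as ET

# Assumed namespace for the XML results format.
SRX_NS = "http://www.w3.org/2005/sparql-results#"


def result_set_head(variables, warnings_href):
    """Build a result set skeleton whose head links to a warnings resource.

    Hypothetical helper: warnings_href is a made-up location where the
    server would publish warnings found while streaming the results.
    """
    ET.register_namespace("", SRX_NS)
    root = ET.Element(f"{{{SRX_NS}}}sparql")
    head = ET.SubElement(root, f"{{{SRX_NS}}}head")
    for name in variables:
        ET.SubElement(head, f"{{{SRX_NS}}}variable", name=name)
    # The link must be emitted here, before any result has been evaluated.
    ET.SubElement(head, f"{{{SRX_NS}}}link", href=warnings_href)
    ET.SubElement(root, f"{{{SRX_NS}}}results")
    return ET.tostring(root, encoding="unicode")


doc = result_set_head(["date"], "http://example.org/warnings/1234")
```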

	Andy

> 
> 
>>One way to read the QL spec ("a warning may be generated") is that one of
>>the protocol faults is returned instead of query results. But I agree that
>>it's vague and underspecified as-is. That may be intentional, but I'm not
>>sure.
>>
>>My personal inclination is to return one of the faults already defined, to
>>define another one, if necessary, or to add some kind of warning facility
>>into the results themselves.
>>
>>
>>
>>>(2) Security considerations
>>
>>
>>>Concerning anonymizing services, there may well be reasons for not
>>>providing client information.  E.g., I have recently been told that
>>>there is a principle in library systems that a reader should be able to
>>>access any publicly available information anonymously.  Making the
>>>logging of client identification a normative requirement seems to be in
>>>violation of this principle.
>>
>>
>>Well, it's a "should", not a "must", but I take yr point and will think more
>>about it. At the very least I suspect you are right that the prescriptivity
>>should be removed.
>>
>>
>>
>>>Concerning privacy, the normative "must take care that facts disclosed
>>>in or implied by query results do not violate applicable privacy ..."
>>>also seems to be over-prescriptive.  Again, I think that mention of this
>>>issue is appropriate, but I also think that the appropriate response to
>>>this is really a matter for the application, not the protocol
>>>specification.  I think it would be more helpful to suggest possible
>>>technical remedies.
>>
>>
>>Can you suggest some?
> 
> 
> Well, off the top of my head, without attempting to claim that any or 
> all are necessarily appropriate:
> 
> - providing authentication data to the back-end information service so 
> that it can enforce access control
> - encrypting the query results
> - using a P3P profile to control release of information (by some 
> unspecified means)
> - limiting the amount of information returned (to restrain data mining attacks)
> - employing trust management techniques to mediate the release of 
> information
> 
> I would probably focus any such suggestions on security mechanisms, and 
> leave the policy specification/decision mechanisms to be 
> application-dependent.
> 
> The rest of this comment was an initial attempt to explore the space of 
> applicable security techniques, noting particularly that channel- or 
> object- based techniques might be appropriate, depending on the 
> application.  (It would probably be sensible to leverage as much as 
> possible of web services security considerations and profiles as is 
> deemed applicable.)
> 
> #g
> --
> 
> 
>>>In particular, I think it would be helpful to suggest ways in which
>>>authenticating information can accompany a query, so that the
>>>information service that is being queried can decide whether or not it
>>>is appropriate to release the requested information. 
>>
>>
>>Ah, yes, that should be mentioned in the policy section.
>>
>>
>>
>>>Also, suggest
>>>mechanisms for maintaining privacy of the query results.  I imagine such
>>>suggestions might simply be references to appropriate security protocol
>>>specifications, but I also note that security mechanisms at several
>>>levels might be applied (e.g. TLS/https, SOAP-level security
>>>mechanisms, MIME object security mechanisms, XML-level security
>>>mechanisms, etc.).  Does the working group have any view concerning what
>>>mechanisms are most appropriate for a general-use SPARQL query protocol
>>>implementation?
>>
>>
>>I can't speak for the WG, of course, but I don't remember much conversation
>>about that, specifically. That doesn't preclude us having a view, but I
>>don't know what it is. :>
>>
>>>Also on the subject of security considerations, I think it would be 
>>>worth mentioning the problems of spoofed server responses, and 
>>>suggesting use of mechanisms that allow the client to authenticate the 
>>>SPARQL query server and/or results.  It also occurs to me that the query 
>>>processor may need to be able to relay authenticating information from a 
>>>back-end or 3rd-party information source.
>>
>>
>>Okay, spoofing servers (especially via IRI hacks) also seems worth mentioning.
>>
>>
>>
>>>...
>>>
>>>Also, I note that the reference for RDF-Concepts contains a URI for the
>>>RDF-primer ("This version").
>>
>>
>>ACK.
>>
>>Cheers, 
>>Kendall
>>--
>>In times of universal deceit, telling the truth is a revolutionary act.  
>>					  --George Orwell
>>
> 
> 

Received on Friday, 16 September 2005 16:49:44 UTC