Re: UNSAID drafted and mapped to SQL from Thomas Roessler on 2004-12-20 (public-rdf-dawg-comments@w3.org from December 2004)

From: Thomas Roessler <tlr@w3.org>
Date: Mon, 20 Dec 2004 14:19:05 +0100
To: Pat Hayes <phayes@ihmc.us>
Cc: Giles Hogben <giles.hogben@jrc.it>, Rigo Wenning <rigo@w3.org>, Eric Prud'hommeaux <eric@w3.org>, public-rdf-dawg-comments@w3.org
Message-ID: <20041220131905.GA16109@raktajino.does-not-exist.org>

On 2004-12-18 21:58:34 -0800, Pat Hayes wrote, at
http://lists.w3.org/Archives/Public/public-rdf-dawg/2004OctDec/0534.html:

> The message that started the thread
> http://lists.w3.org/Archives/Public/public-rdf-dawg-comments/2004Nov/0016.html
> has an example that illustrates the point in its use case 2, the 
> financial institution that must not send its prospectus to
> customers in the US or Canada. For this institution to rely on an
> UNSAID query to ensure this rule was obeyed would be very risky,
> since in general the RDF content against which the query is being
> evaluated is not known to be complete with regard to citizenship
> information. It cannot be so known, except by special access to
> off-web information, as there are currently no Web protocols for
> communicating the fact that a source is complete in this way.

Indeed.  The same applies to the truthfulness of the information
contained in the RDF graph, or to the trustworthiness of information
about the graph's truthfulness that's transmitted inside the
protocol.  That's, obviously, not a reason to declare RDF and SPARQL
"very risky", and to drop them.

The point of applying UNSAID in the way described in use case 2 is,
precisely, that the graph that's queried is assumed to be
sufficiently complete for the querying party's purposes.  The
judgment whether or not this kind of assumption is "very risky"
(whatever this means) is not the protocol designer's to make, but
strictly a business decision made by the party that applies the
protocol.


In fact, the word "complete" is ambiguous here: While a graph may be
incomplete, in the sense that it lacks facts that are out there
(this is the notion of "incompleteness" that you apply to use case
2), the same graph may quite well be the querying party's complete
knowledge of facts at some point in time.  In this context, UNSAID
also serves to help a party know what it does not know.

Here's another use case, to illustrate this: Consider a party (say,
our bank) that knows it has partial information stored in an RDF
graph -- e.g., some social information (say, the grandmother's
maiden name) is only associated with some of the subjects (say, of
class account holder) in the graph. The party needs to collect this
information for all subjects of class account holder (say, due to
stricter money laundering legislation). UNSAID enables the bank to
acquire the missing information from those account holders for whom
it is needed, and later on also enables sanctions against account
holders who do not provide it.
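
As a rough illustration of the use case above, here is a minimal
sketch in Python rather than in SPARQL syntax.  It assumes the graph
is a plain in-memory set of triples; the names ex:AccountHolder and
ex:grandmotherMaidenName, the sample data, and the helper
holders_missing are all hypothetical.  It only shows the kind of
"no such triple is present in this graph" test that UNSAID would
express, evaluated against the bank's own graph:

```python
# Hypothetical vocabulary; none of these names come from a real schema.
RDF_TYPE = "rdf:type"
ACCOUNT_HOLDER = "ex:AccountHolder"
MAIDEN_NAME = "ex:grandmotherMaidenName"

# Hypothetical graph: the bank's complete knowledge at this point in time,
# modelled as a set of (subject, predicate, object) triples.
graph = {
    ("ex:alice", RDF_TYPE, ACCOUNT_HOLDER),
    ("ex:alice", MAIDEN_NAME, "Smith"),
    ("ex:bob", RDF_TYPE, ACCOUNT_HOLDER),
    ("ex:carol", RDF_TYPE, ACCOUNT_HOLDER),
    ("ex:carol", MAIDEN_NAME, "Jones"),
}


def holders_missing(graph, predicate):
    """Return account holders for whom no triple with `predicate` is present.

    This is a statement about what this graph says, not about what is
    true in the world: absence here only means "not recorded here".
    """
    holders = {s for (s, p, o) in graph if p == RDF_TYPE and o == ACCOUNT_HOLDER}
    said = {s for (s, p, o) in graph if p == predicate}
    return sorted(holders - said)


missing = holders_missing(graph, MAIDEN_NAME)
print(missing)  # ['ex:bob']
```

The result is the list of account holders the bank still has to ask;
whether that list can be relied on for anything else remains the
bank's own decision.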


> If SPARQL contains UNSAID then it will be inconsistent with any 
> account of meaning which is based on the RDF/RDFS/OWL normative 
> semantics. This will not render SPARQL unusable, but it will place it 
> outside the 'semantic web layer cake' and probably lead to the 
> eventual construction of a different, and rival, query language for 
> use by Web reasoners.

Conversely, standardization of an overly restricted version of SPARQL
(e.g., one without UNSAID) will drive applications to either
competing query languages, or to incompatible extensions that
provide the expressivity they need.

Note that this risk is not created by specifying a full version of
SPARQL, including UNSAID, and by additionally profiling some subset
of it that satisfies whatever assumptions you want to be able to
make.

Regards,
-- 
Thomas Roessler, W3C   <tlr@w3.org>

Received on Monday, 20 December 2004 13:19:08 UTC