RE: Does the QL allow for false or contradictory datasets? from Seaborne, Andy on 2005-01-24 (public-rdf-dawg-comments@w3.org from January 2005)

From: Seaborne, Andy <andy.seaborne@hp.com>
Date: Mon, 24 Jan 2005 17:57:10 -0000
To: "Dan Brickley" <danbri@w3.org>, <public-rdf-dawg-comments@w3.org>
Message-ID: <8D5B24B83C6A2E4B9E7EE5FA82627DC974FF59@sdcexcea01.emea.cpqcorp.net>
Dan,

Thanks for the comments - I have addressed comments which related to the
SPARQL QL document.

-------- Original Message --------
> From: Dan Brickley <>
> Date: 17 January 2005 10:46
> Subject: Does the QL allow for false or contradictory datasets?
> 
> Peeking at http://www.w3.org/2001/sw/DataAccess/rq23/
> "10.3.2 Identifying Resources" I see evidence of an assumption
> that the data we're querying will be a true description of the
> world, or at least not self-evidently contradictory.
> 
> 	The property foaf:mbox is defined as being an inverse function property
> 	in the FOAF vocabulary so that this query will retrun information about
> 	at most one person.
> 
> Since we don't (do we?) assume DAWG services will all be OWL-capable,

There is no assumption that a SPARQL query service is, or is not,
OWL-capable.

> I can imagine scenarios where this paragraph doesn't hold. I'm
> interested in the thinking behind the 'so' --- is some form of
> OWL reasoning being assumed? is good/clean/tidy/true data
> being assumed? Is SPARQL supposed to be usable against both
> OWL-smart and OWL-ignorant systems?

People may read the FOAF spec and see that foaf:mbox is an IFP.  This
creates an expectation and a SPARQL query service may or may not live up
to that expectation.  Services that provide something useful will thrive;
those that do not will find themselves unused.

SPARQL does not assume IFP processing - the section you quote is an
example of what might happen and goes on to say what happens when there
are two or more query solutions (not sure which version you quote).
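
To illustrate the point (this is not text from the SPARQL document): a
minimal sketch in Python, assuming a toy in-memory set of triples with
made-up node labels and a made-up mailbox.  It does plain pattern
matching with no IFP processing, so the same foaf:mbox value attached to
two nodes gives two query solutions rather than one.

```python
# Toy triple store: a set of (subject, predicate, object) tuples.
# The blank node labels, names and mailbox are invented for illustration.
FOAF_MBOX = "http://xmlns.com/foaf/0.1/mbox"
FOAF_NAME = "http://xmlns.com/foaf/0.1/name"

triples = {
    ("_:a", FOAF_MBOX, "mailto:alice@example.org"),
    ("_:a", FOAF_NAME, "Alice"),
    ("_:b", FOAF_MBOX, "mailto:alice@example.org"),
    ("_:b", FOAF_NAME, "A. Example"),
}


def subjects_with(triples, predicate, obj):
    """Plain match of the pattern { ?x predicate obj }; no IFP inference."""
    return sorted(s for (s, p, o) in triples if p == predicate and o == obj)


# Two distinct nodes share the mailbox, so there are two solutions.
solutions = subjects_with(triples, FOAF_MBOX, "mailto:alice@example.org")
print(solutions)
```

A service that did apply IFP reasoning might merge _:a and _:b before
matching; whether it does so is up to the service, not the query language.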

> 
> (typo btw, s/retrun/return; sorry, I realise this is an editor's draft,
> but I think my question general enough to be worth sending).

Thanks - fixed.

> 
> The dataset could (inaccurately, but most databases contain
> errors) associate the same foaf:mbox value with several different
> people and their descriptions. And some datasets might very usefully
> contain false claims...
> http://rdfweb.org/people/danbri/2001/12/puzzle/unicorny.html is
> an example (confusing; Unicorn was a bad example) of doing
> so for the purposes of image annotation --- ie. an RDF graph
> might describe the scene depicted in some work. Family tree
> (eg. gedcom) and other historical datasets are another obvious
> example, as are any where we care to track the source/provenance
> of our possibly flawed data (foafcorp info about companies
> being my favourite usecase here). SPARQL's 'source' facilities make
> such applications pretty likely. I don't have a rewording
> suggestion for the text I quote above as I don't know what
> the WG's design is.

Certainly, in the real world there is use of properties in ways that do
not meet their definition.  I don't see what SPARQL can or should do
about that.  SPARQL SOURCE (there is about to be a reworking of
keywords - this will become GRAPH as the keyword is now free) allows
for different graphs to make separate claims.  It is possible for an
application or a service to misuse this but I don't see that anything
else is possible.
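
As a rough sketch of what "separate claims" means here, assuming a toy
dataset of named graphs held in a Python dict (the graph names, node
labels and mailbox are hypothetical): matching is done per graph with no
merge, so each solution records which graph made the claim.

```python
# Toy dataset: graph name -> set of (subject, predicate, object) tuples.
# Graph names, node labels and the mailbox are invented for illustration.
FOAF_MBOX = "http://xmlns.com/foaf/0.1/mbox"

dataset = {
    "http://example.org/graph1": {
        ("_:x", FOAF_MBOX, "mailto:bob@example.org"),
    },
    "http://example.org/graph2": {
        ("_:y", FOAF_MBOX, "mailto:bob@example.org"),
    },
}


def match_in_graphs(dataset, predicate, obj):
    """Match { ?x predicate obj } inside each graph separately (no merge),
    returning (graph name, subject) pairs."""
    return sorted(
        (graph, s)
        for graph, triples in dataset.items()
        for (s, p, o) in triples
        if p == predicate and o == obj
    )


claims = match_in_graphs(dataset, FOAF_MBOX, "mailto:bob@example.org")
for graph, subject in claims:
    print(graph, subject)
```

The application can then decide which graph, if either, it trusts.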

(Not sure which version you refer to here - the working graph and the
current editors' draft have significantly different designs - we are
going with the design where there is no assumed merge of graphs.  In
general direction, the design is as the editors' working copy but there
are some revisions in the light of the face-to-face meeting we have just
had).

> 
> One other Editorial/wording comment, for 10.3.2 in the editor's copy,
> http://www.w3.org/2001/sw/DataAccess/rq23/
> 
> 	"This enables resources which are bNodes to be identified"
> 
> ...perhaps confuses the thing in the world with the thing in the RDF
> graph that stands for it.
> This btw is why RDFCore introduced the
> term "bNode" in preference to the previously-widespread
> "anonymous resource". Resources aren't intrinsically anonymous or
> blank; they might be URI-labelled or not in the possibly various
> RDF graphs which mention them. While we could argue bNodes, ie.
> bits of RDF graphs, are also things/resources, I expect the intent
> is not this, and we're talking about identification of the thing
> the bNode stands for.
> 
> I suggest something like:
>      "This enables (indirect) identification of resources, regardless
>      of whether they are labelled with a URI in the graph(s) being
> queried." 

Noted and revised.  Thanks.

> 
> cheers,
> 
> Dan

	Andy

> 
> ps. I'm  likely to propose you drop 10.3/DESCRIBE entirely, so
> don't kill yourselves wordsmithing... ;)
> <ducks/>
Received on Monday, 24 January 2005 17:57:43 UTC