Does the QL allow for false or contradictory datasets? from Dan Brickley on 2005-01-17 (public-rdf-dawg-comments@w3.org from January 2005)

From: Dan Brickley <danbri@w3.org>
Date: Mon, 17 Jan 2005 05:46:27 -0500
To: public-rdf-dawg-comments@w3.org
Message-ID: <20050117104627.GB28196@homer.w3.org>

Peeking at http://www.w3.org/2001/sw/DataAccess/rq23/ 
"10.3.2 Identifying Resources" I see evidence of an assumption
that the data we're querying will be a true description of the 
world, or at least not self-evidently contradictory. 

	The property foaf:mbox is defined as being an inverse function property
	in the FOAF vocabulary so that this query will retrun information about
	at most one person. 

Since we don't (do we?) assume DAWG services will all be OWL-capable,
I can imagine scenarios where this paragraph doesn't hold. I'm 
interested in the thinking behind the 'so' --- is some form of 
OWL reasoning being assumed? Is good/clean/tidy/true data 
being assumed? Is SPARQL supposed to be usable against both 
OWL-smart and OWL-ignorant systems?

(typo btw, s/retrun/return; sorry, I realise this is an editor's draft,
but I think my question general enough to be worth sending).

The dataset could (inaccurately, but most databases contain 
errors) associate the same foaf:mbox value with several different 
people and their descriptions. And some datasets might very usefully 
contain false claims... 
http://rdfweb.org/people/danbri/2001/12/puzzle/unicorny.html is 
an example (confusing; Unicorn was a bad example) of doing 
so for the purposes of image annotation --- ie. an RDF graph 
might describe the scene depicted in some work. Family tree 
(eg. gedcom) and other historical datasets are another obvious 
example, as are any where we care to track the source/provenance 
of our possibly flawed data (foafcorp info about companies
being my favourite usecase here). SPARQL's 'source' facilities make 
such applications pretty likely. I don't have a rewording 
suggestion for the text I quote above as I don't know what 
the WG's design is.
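
To make the foaf:mbox scenario concrete, here is a minimal Python sketch. It
is an illustration only, not the WG's design or any real RDF library: the
triples, the mailbox address, and the helper names subjects_with and
merge_on_ifp are all hypothetical. It contrasts a plain triple-pattern match
(an OWL-ignorant service) with a naive merge that treats foaf:mbox as inverse
functional (roughly what an OWL-smart service might conclude).

```python
# Hypothetical data: two blank nodes share one foaf:mbox value.
FOAF_MBOX = "http://xmlns.com/foaf/0.1/mbox"
FOAF_NAME = "http://xmlns.com/foaf/0.1/name"
MBOX = "mailto:alice@example.org"  # made-up address for illustration

triples = [
    ("_:a", FOAF_MBOX, MBOX),
    ("_:a", FOAF_NAME, "Alice"),
    ("_:b", FOAF_MBOX, MBOX),
    ("_:b", FOAF_NAME, "Alice Smith"),
]


def subjects_with(triples, predicate, obj):
    """Plain triple-pattern match: no OWL reasoning, no node merging."""
    return sorted({s for (s, p, o) in triples if p == predicate and o == obj})


def merge_on_ifp(triples, predicate):
    """Naively treat `predicate` as inverse functional.

    Subjects sharing a value for `predicate` are renamed to the first
    subject seen with that value. Only the subject position is rewritten,
    and chains of equivalences are not followed, so this is a sketch
    rather than real OWL inference.
    """
    canonical = {}  # property value -> first subject seen with it
    rename = {}     # subject -> canonical subject
    for s, p, o in triples:
        if p == predicate:
            rename[s] = canonical.setdefault(o, s)
    return [(rename.get(s, s), p, o) for (s, p, o) in triples]


# OWL-ignorant view: both blank nodes match the mailbox.
plain = subjects_with(triples, FOAF_MBOX, MBOX)

# OWL-smart view (approximated): the two nodes collapse into one.
merged = subjects_with(merge_on_ifp(triples, FOAF_MBOX), FOAF_MBOX, MBOX)

print(len(plain), len(merged))
```

Under these assumptions the plain match finds two nodes and the merged view
finds one, so "at most one person" only holds if something performs the merge.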

One other editorial/wording comment, for 10.3.2 in the editor's copy,
http://www.w3.org/2001/sw/DataAccess/rq23/

	"This enables resources which are bNodes to be identified"

...perhaps confuses the thing in the world with the thing in the RDF
graph that stands for it. This btw is why RDFCore introduced the 
term "bNode" in preference to the previously-widespread 
"anonymous resource". Resources aren't intrinsically anonymous or
blank; they might be URI-labelled or not in the possibly various
RDF graphs which mention them. While we could argue bNodes, ie.
bits of RDF graphs, are also things/resources, I expect the intent 
is not this, and we're talking about identification of the thing 
the bNode stands for.

I suggest something like:
     "This enables (indirect) identification of resources, regardless
     of whether they are labelled with a URI in the graph(s) being queried."

cheers,

Dan

ps. I'm likely to propose you drop 10.3/DESCRIBE entirely, so 
don't kill yourselves wordsmithing... ;) 
<ducks/>

Received on Monday, 17 January 2005 10:46:28 UTC