Re: comments on SPARQL Query Language for RDF

>Hi Pat,
>
>On May 29, 2007, at 1954, Pat Hayes wrote:
>
>><snip>
>>
>
>>However, I am at a loss to understand how you 
>>refer to these 150,000 graphs if you have no 
>>way to name them. How do you even know how many 
>>you have? 
>>
>
>Each of the graphs consists of triples extracted 
>from a different document.  The document might 
>be identified by a file name, or a message ID,
>a documentum identifier, or whatever.  The quads 
>for that document share a common context 
>argument; a blank node.  The same
>blank node appears in subject position to record 
>provenance assertions about the graph (which 
>document, which extractor used,
>time of extraction, etc).

That works as long as everything is inside the 
intended scope of the blank node identifier, 
which is usually a document. But a query is not 
usually inside the same scope as the graph(s) 
being queried, so to use the blank node as an 
identifier in the query is (usually) impossible. 
You can, if you want, invent a new rule about 
blank node id scoping (the DAWG considered such a 
move but rejected it eventually as too 
complicated to be practicable) so that blank node 
ids have a wide enough scope, extending over 
several documents, to allow them to be used as 
identifiers in this way. But I see very little 
difference between this use of blank node IDs as 
names and the use of URIrefs as names. Reduced 
to its syntactic essentials, it amounts to 
a choice of identifier prefixes between '_:' and 
':'.
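
To make the scoping point concrete, here is a minimal Python sketch. It is 
not a real RDF parser: `parse` and `BlankNode` are hypothetical names invented 
for illustration, and the triples are made-up examples. The only assumption is 
that each call to `parse` stands for one document (or one query), and that 
'_:' labels are local to that call.

```python
import itertools

_fresh = itertools.count()


class BlankNode:
    """An anonymous node: its identity is the object itself, not a label."""

    def __init__(self):
        self.internal_id = next(_fresh)


def parse(triples):
    """Toy parser (hypothetical): one call is one scope for '_:' labels."""
    scope = {}

    def term(text):
        if text.startswith("_:"):
            # The same label inside this scope maps to the same blank node,
            # but the mapping is thrown away when the call returns.
            if text not in scope:
                scope[text] = BlankNode()
            return scope[text]
        # Anything else (e.g. ':g1') is treated as a global name.
        return text

    return [tuple(term(t) for t in triple) for triple in triples]


# The stored graph and the query are parsed in different scopes.
store = parse([("_:g1", ":extractedFrom", ":doc42")])
query = parse([("_:g1", ":extractedFrom", ":doc42")])
same_bnode = store[0][0] is query[0][0]    # labels match, nodes do not

# With a URIref-style name the two scopes agree on what is meant.
uri_store = parse([(":g1", ":extractedFrom", ":doc42")])
uri_query = parse([(":g1", ":extractedFrom", ":doc42")])
same_uri = uri_store[0][0] == uri_query[0][0]
```

Here `same_bnode` is False and `same_uri` is True. Widening the scope so that 
both calls share one `scope` table would make '_:g1' behave exactly like 
':g1', which is the sense in which the choice is only one of prefix.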

>
>>(It sounds from your description that you are 
>>in effect treating the provenance as *being* 
>>the name of the graph. Does that perspective 
>>help reconcile things?)
>>
>
>You still seem to feel a need to name each graph.

There is a need to somehow identify a graph with 
a publicly usable - what shall I call it, as you 
dislike "name"? Mark? Identifier? Label? - mark, 
if you expect to be able to refer to it elsewhere 
than the place where it is stored or recorded. 
Such as in a query, for example.

>  Rows in a relational table don't have names and are still identifiable.

How? Suppose there is a table called FOO and I 
want to refer to a row in it. What do I say to 
identify that particular row? Give an index 
number? Then the pair of FOO + number *is* the 
name of the row.
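
A small Python sketch of the same point, with made-up data (the contents of 
FOO and the helper `resolve` are hypothetical, chosen only for illustration): 
whatever you pass around in order to pick out the row is doing the job of a 
name.

```python
# A hypothetical table FOO with two rows.
tables = {
    "FOO": [
        ("alice", 30),
        ("bob", 25),
    ],
}


def resolve(name):
    """Look up a row given a (table name, index) pair."""
    table, index = name
    return tables[table][index]


# The pair below is usable anywhere the table is reachable,
# which is exactly what a name is for.
row_name = ("FOO", 1)
row = resolve(row_name)
```

The rows carry no label of their own, but `("FOO", 1)` identifies one of them 
all the same.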

>  The same goes for graphs;
>names are unnecessary, and not particularly useful.
>
>>>
>>There has to be some way for the query to refer 
>>to them. If you can think of a way of doing this 
>>without somehow naming them, please explain it.
>>
>Hopefully, I just have.

No, you have simply told me about the names you 
use ("a file name, or a message ID, a documentum 
identifier, or whatever.") Those are names. 
"Name" just means some piece of text which serves 
to identify something. It is impossible to 
identify something which does not have some way 
of being identified: that way IS a name.

>
>>
>><snip>
>>
>>OK. Do you always query against the same set of unnamed graphs?
>>
>
>Yes in the short term.  New graphs are introduced on a continuing basis.
>
>>If so, you can treat this as a single graph for 
>>purposes of defining a SPARQL query answer.
>>
>
>It's a single graph if we ignore provenance, but 
>not if we take provenance into account when we 
>query.

How will you take provenance into account when 
you query, if the provenance isn't represented in 
RDF? It sounds like you are doing something that 
simply does not fall in the purview of SPARQL.

Pat
-- 
---------------------------------------------------------------------
IHMC		(850)434 8903 or (650)494 3973   home
40 South Alcaniz St.	(850)202 4416   office
Pensacola			(850)202 4440   fax
FL 32502			(850)291 0667    cell
phayesAT-SIGNihmc.us       http://www.ihmc.us/users/phayes

Received on Wednesday, 30 May 2007 20:04:35 UTC