- From: Pat Hayes <phayes@ihmc.us>
- Date: Wed, 30 May 2007 13:04:04 -0700
- To: Bob MacGregor <bmacgregor@siderean.com>
- Cc: public-rdf-dawg-comments@w3.org, "Eric Prud'hommeaux" <eric@w3.org>, "Richard Newman" <rnewman@franz.com>
>Hi Pat,
>
>On May 29, 2007, at 1954, Pat Hayes wrote:
>
>><snip>
>>
>
>>However, I am at a loss to understand how you
>>refer to these 150,000 graphs if you have no
>>way to name them. How do you even know how many
>>you have?
>>
>
>Each of the graphs consists of triples extracted
>from a different document. The document might
>be identified by a file name, or a message ID,
>a documentum identifier, or whatever. The quads
>for that document share a common context
>argument; a blank node. The same
>blank node appears in subject position to record
>provenance assertions about the graph (which
>document, which extractor used,
>time of extraction, etc).
That works as long as everything is inside the
intended scope of the blank node identifier,
which is usually a document. But a query is not
usually inside the same scope as the graph(s)
being queried, so to use the blank node as an
identifier in the query is (usually) impossible.
You can, if you want, invent a new rule about
blank node id scoping (the DAWG considered such a
move but rejected it eventually as too
complicated to be practicable) so that blank node
ids have a wide enough scope, extending over
several documents, to allow them to be used as
identifiers in this way. But I see very little
difference between this use of blank node IDs as
names and the use of URIrefs as names. Reduced
to its syntactic essentials, it amounts to
a choice of identifier prefixes between '_:' and
':'.
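A rough Python sketch of the scoping point, not any real RDF library: the Document class, the bnode and uriref helpers, and the example.org URI are all made up for illustration. It only assumes that a blank node label is local to the document it is written in, while a URIref means the same thing everywhere.

```python
from itertools import count

_fresh = count()  # source of distinct internal node ids


class Document:
    """Hypothetical scope for blank node labels (a stored file, or a query)."""

    def __init__(self, name):
        self.name = name
        self._nodes = {}

    def bnode(self, label):
        # Inside one document, the same label always denotes the same node.
        if label not in self._nodes:
            self._nodes[label] = ("bnode", next(_fresh))
        return self._nodes[label]


def uriref(text):
    # A URIref has global scope: same text, same node, wherever it is written.
    return ("uri", text)


store = Document("store")
query = Document("query")

same_scope = store.bnode("_:g1") == store.bnode("_:g1")
cross_scope = store.bnode("_:g1") == query.bnode("_:g1")
uri_cross = uriref("http://example.org/g1") == uriref("http://example.org/g1")
```

Under these assumptions same_scope and uri_cross are true and cross_scope is false, which is why "_:g1" written in a query cannot pick out the "_:g1" in the store unless the scoping rule is widened.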
>
>>(It sounds from your description that you are
>>in effect treating the provenance as *being*
>>the name of the graph. Does that perspective
>>help reconcile things?
>>
>
>You still seem to feel a need to name each graph.
There is a need to somehow identify a graph with
a publicly usable - what shall I call it, as you
dislike "name"? Mark? Identifier? Label? - mark,
if you expect to be able to refer to it elsewhere
than the place where it is stored or recorded.
Such as in a query, for example.
> Rows in a relational table don't have names and are still identifiable.
How? Suppose there is a table called FOO and I
want to refer to a row in it. What do I say to
identify that particular row? Give an index
number? Then the pair of FOO + number *is* the
name of the row.
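As a small illustration in Python (the table contents and the dereference helper are invented for the example, and a real database would use keys or row ids rather than list positions):

```python
# Hypothetical table FOO; its rows carry no name column of their own.
tables = {"FOO": [("alice", 30), ("bob", 41), ("carol", 27)]}


def dereference(row_name):
    # The (table, index) pair is all that is needed to pick out one row,
    # which is exactly the job a name does.
    table, index = row_name
    return tables[table][index]


row_name = ("FOO", 1)
row = dereference(row_name)
```

Anything that can be handed to dereference to get back one particular row is functioning as that row's name, whatever it is called.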
> The same goes for graphs;
>names are unnecessary, and not particularly useful.
>
>>>
>>There has to be some way for the query to refer
>>to them. If you can think of way of doing this
>>without somehow naming them, please explain it.
>>
>Hopefully, I just have.
No, you have simply told me about the names you
use ("a file name, or a message ID, a documentum
identifier, or whatever.") Those are names.
"Name" just means some piece of text which serves
to identify something. It is impossible to
identify something which does not have some way
of being identified: that way IS a name.
>
>>
>><snip>
>>
>>OK. Do you always query against the same set of unnamed graphs?
>>
>
>Yes in the short term. New graphs are introduced on a continuing basis.
>
>>If so, you can treat this as a single graph for
>>purposes of defining a SPARQL query answer.
>>
>
>Its a single graph if we ignore provenance, but
>not if we take provenance into account when we
>query.
How will you take provenance into account when
you query, if the provenance isn't represented in
RDF? It sounds like you are doing something that
simply does not fall in the purview of SPARQL.
Pat
--
---------------------------------------------------------------------
IHMC (850)434 8903 or (650)494 3973 home
40 South Alcaniz St. (850)202 4416 office
Pensacola (850)202 4440 fax
FL 32502 (850)291 0667 cell
phayesAT-SIGNihmc.us http://www.ihmc.us/users/phayes
Received on Wednesday, 30 May 2007 20:04:35 UTC