Re: Named graphs etc from Pat Hayes on 2004-03-09 (www-archive@w3.org from March 2004)

From: Pat Hayes <phayes@ihmc.us>
Date: Tue, 9 Mar 2004 15:13:55 -0600
To: Patrick Stickler <patrick.stickler@nokia.com>
Cc: <www-archive@w3.org>, "ext Jeremy Carroll" <jjc@hplb.hpl.hp.com>, "Pat Hayes" <phayes@ihmc.us>, "ext Chris Bizer" <chris@bizer.de>
Message-Id: <p06001f30bc73bc1dd1ae@[10.0.100.76]>
>On Mar 08, 2004, at 16:06, ext Chris Bizer wrote:
>
>>Hi Patrick,
>>
>>>That said, I'm starting to appreciate some of Chris' arguments about
>>>all statements being asserted, no matter what.
>>>
>>
>>The argument isn't that they are all asserted, but that they are uncertain
>>until the user applies a trust function to them. I think it is a three step
>>process:
>>1. Graphs published on the Semantic Web are not asserted but uncertain to
>>the user.
>>2. Before a user does something with the information, he applies a
>>subjective and task-specific trust function (or policy) to the information.
>>There is a wide range of different functions possible which take provenance,
>>the autor's reputation, related information published by other authors into
>>account.
>>3. After applying the trust function, the user treats the information as
>>asserted, keeping in mind that there is still the risk that it is wrong.
>
>This seems to blur two distinctions that I've held as important (1) whether
>some statement is intended to represent a fact, or is simply in the form
>of a statement for whatever purpose, and (2) whether a given statement is
>trusted, based on the source of the statement.
>
>In the case of non-asserted statements, i.e. statements that simply are
>in the form of a statement but not intended to represent an actual fact,
>trust is not an issue (though it is for asserted statements about unasserted
>statements).
>
>What I thought you were proposing, and what seemed like a useful 
>generalization,
>is to presume that all triples are asserted (requiring other forms 
>of expression
>to capture unasserted statements, such as via RDF reification), and focus on
>qualification of the asserted statements (either individually or per 
>an entire graph).
>
>But basing the assertiveness of statements on functions of trust seems an
>odd approach and one that doesn't reflect the way people behave. People
>simply don't care about whether they trust unasserted statements (even if
>they care about whether they trust asserted statements about those unasserted
>statements).
>
>I saw some utility in presuming that all statements/graphs are asserted (which
>I think has some support in the RDF specs) and letting folks use reification
>and even RDF/XML serialization to refer to forms of statements which
>are not asserted.

I agree with all the above.

>>
>>>I still have some questions about how to "bootstrap" trust, such that
>>>it seems there must be some requirement for each graph to contain
>>>statements reflecting its source/authority (a signature perhaps?)
>>>otherwise, how do you anchor your trust in terms of a given graph?
>>>
>>
>>Not a strict requirement. I think a trust architecture shouldn't strictly
>>require anything but use all trust relevant information it can get.
>>
>>There are different possibilities how provenance information could be
>>attached to graphs:
>>1. The author of the graph attaches provenance information and might also
>>sign the graph.
>>2. The crawler (or other information access architecture) that collects
>>published information adds the information where it found the data.
>
>Right. But in both cases, you have bootstrapped the trust architecture
>with external or atomic machinery: either the signature or system-maintained
>provenance information.
>
>One thought that occurred to me is that it might be cleaner/better to
>bootstrap the trust architecture using a specialized vocabulary which
>applies *only* to the graph in which statements using that vocabulary
>occur.

that is essentially a 'this graph' naming function, right? "This 
graph asserts..."

>
>This could also serve to allow folks to specify whether a graph is
>asserted or not.

I'm coming to think that we need to provide for separating out a 
graph from its assertion. Example: A publishes and asserts a graph, 
but B wants to refer to it without (re) asserting it (maybe B wants 
to argue with it). So B must be able to refer to the graph and its 
content without referring to the assertion of the graph. So there has 
to be a conceptual distinction between the published (named) graph 
and its assertion.

(Quick side question: do named graphs allow one to publish a 
reference to a graph without publishing the graph (never mind about 
asserting)? that is, can I say " a graph called "ex:myprivategraph" 
exists but I'm not going to show it to anyone" ??)

Back to main thread. But unless we prove for some kind of 
publish-as-assertion, we can't get off the ground. So, suggestion 
(putting together various ideas, including Patrick's below):

0. Asserting is a relationship between documents (? resources? agents 
who publish documents? owners of resources? whatever....) and graphs.

1. Primitive asserting is done by (publishing) a document with an 
asserting tag.  and which which contains a(n instance of a) graph. 
That document asserts that graph.  You need an asserting tag to get 
the whole process started, so asserting isn't something that can 
happen by accident.

2. But, in addition, one asserted graph can assert another graph, by 
reference using the name. In this case the asserter of the first 
graph asserts the named graph. So I can assert something that is 
published and named, even if the thing I'm asserting wasn't asserted 
by its own publication. Someone else can deny it, maybe, and someone 
else can say something about it without taking any particular stance 
towards asserting or denying it. The graph expressing the content is 
just sitting there being referred to.

Note, if A asserts B and B asserts C, A does not necessarily assert 
C. Assertion is only done explicitly, not by transitive closure. On 
the other hand, if B says that A asserts C, and A asserts B, then A 
asserts C; that is, assertion *by A* is transitive.  Also, if A 
asserts B and B imports C, then A has asserted C.

3. A graph can do both things at once, if it wants to. That is, it 
can be asserted with a tag, and it can say of itself that it is 
asserted. This is a very unambiguous way to assert yourself.

It can also, BTW, say of itself that it does not assert itself, which 
is odd but not paradoxical, since asserting isn't transitive (unlike 
importing). Even if A does not assert itself, B can assert it; and 
this can make sense even if A and B are the same asserter (eg owned 
by the same owner doing the assertion.

The only contradictory case is a graph *tagged as asserted* which 
says explicitly that it does not assert itself. And in this case we 
could just say that this is an assertion of a contradiction, and 
assertions of contradictions are content-free. Note this is not a 
logical (= RDF-semantic) contradiction, since the crucial asserting 
tag isn't visible from within the RDF itself.

4. An assertion can be done by a pair of graphs, A which expresses 
the content (and is not asserted) and B which asserts A and is 
asserted, as in 2. above. But also, this can be modified/extended so 
that A tags B as its warrant, providing a kind of secure external 
assertion, a warranted handshake. Note that since each refers to the 
other, the warrant can only be used once in this case.

---

All of these cases provide for a separation of content from 
assertion, since  A can refer to B by name without being understood 
to re-assert B; this allows A to express doubt about another 
assertion, say. (If A wants to re-assert B, it can do so explicitly.)

I think this provides for most of what anyone could want. It can take 
existing usage and fix it up without changing any graphs (other maybe 
than to give them names: we can use their URLs for now :-)

Pat


>
>Thus, the name of the graph is used in statements within that graph, and
>statements using a special property, e.g. x:isAsserted would
>only be interpretable if the subject of the statement is the
>same graph containing that statement.

No, let it work for any pair. Then B can assert A even if A doesn't 
assert itself. Which makes perfect sense.

>E.g.
>
>:graphA {
>      :graphA x:isAsserted x:true .
>}
>
>would mean that :graphA is asserted. But the statement in the
>following :graphB
>
>:graphB {
>      :graphA x:isAsserted x:true .
>}
>
>would simply have no interpretation regarding assertion because
>the semantics of the predicate are only defined iff the subject
>of the statement is the graph in which the statement occurs.
>
>Similar properties, constrained to having interpretations only when
>the subject is the graph containing the statement, could be defined
>to express source, authority, signatures, etc.
>
>All of these special properties could be members of a special class,
>e.g. x:GraphProperty, which would be the focal point for defining the
>constraining semantics requiring the subject to be the graph containing
>the statement in which the property is used.
>
>Such a vocabulary/semantics would then allow folks to bootstrap both
>the assertion and trust machinery, and systems employing that machinery
>would have simple and consistent implementational requirements.
>
>It may also be convenient (or even necessary?) to define a special URI
>that has an interpretation similar to that of 'localhost', e.g.
>x:thisGraph so that folks can assert, sign, and specify provenence
>information without having to actually name the graph -- and the special
>URI is implicitly bound to a distinct blank node for each graph to
>keep its interpretation distinct between graphs.
>
>???
>
>If a system scraping knowledge from various sources encounters graphs
>without any qualification (source, authority, signature, etc.) whatsoever,
>then it can in another graph make qualifying statements about the
>scraped graph using another vocabulary -- and then employ both vocabularies
>together to ultimately decide about trust.
>
>>
>>This information can afterwards be used in trust evaluations like "Use only
>>data that has been signed by authors I know" or "Use all information, no
>>matter if it is signed and not matter from which source or author it
>>originates".  The first policy is obviously stricter.
>>
>>The attached WWW2004 poster describes these ideas in more detail.
>
>I'll have a look. Thanks.
>
>Patrick
>
>
>>
>>Chris
>>
>>Jeremy: You will get the paper outline this afternoon.
>><bizer-www2004.pdf>
>
>--
>
>Patrick Stickler
>Nokia, Finland
>patrick.stickler@nokia.com


-- 
---------------------------------------------------------------------
IHMC	(850)434 8903 or (650)494 3973   home
40 South Alcaniz St.	(850)202 4416   office
Pensacola			(850)202 4440   fax
FL 32501			(850)291 0667    cell
phayes@ihmc.us       http://www.ihmc.us/users/phayes
Received on Tuesday, 9 March 2004 16:14:01 UTC