Re: Trust, Context, Justification and Quintuples from Chris Bizer on 2003-12-30 (www-rdf-interest@w3.org from December 2003)

From: Chris Bizer <chris@bizer.de>
Date: Tue, 30 Dec 2003 16:59:57 +0100
To: "Waggy" <waggy@yahoo.com>, <www-rdf-interest@w3.org>
Message-ID: <001f01c3ceed$f952f3c0$83ec2da0@chrisch4uuoi65>
Hi Waggy,

thanks a lot for describing your practical implementation experiences.
Is your prototype, or a draft version of your paper, available online?

> The second critical item to record about every statement is the source of
> the statement, a URI pointing to where this statement came from.
> Originally, I even called this item the, "context," of the statement, as
> you do, but when standardizing the model decided the formally defined
> Dublin Core concept of, "source," was appropriate.  When the statement is
> original work not derived from some other resource, the source item
> identifies the author (also by URI).
>
I think for many distributed applications recording the source of a
statement is important, but the source is only one of the potentially
interesting context attributes (you also mention dc:creator and dc:date
below). In order to keep things flexible I think it is better to model
contexts as RDF resources (e.g. instances of the class crdf:Context I use in
cRDF). This approach lets different applications use different
vocabularies to describe the context attributes they need. For my
trust applications, I'm also planning to use a provenance vocabulary based
on Dublin Core, extended with some additional attributes.
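
A minimal Python sketch of this idea, assuming a plain in-memory list of
quads: each quad carries the URI of a context resource, and the context is
itself described by ordinary statements, so any vocabulary can be used for
its attributes. Only the class name crdf:Context comes from the text above;
the Quad type, the ex: and ctx: URIs, and the helper function are
hypothetical.

```python
from typing import NamedTuple


class Quad(NamedTuple):
    """A triple plus the URI of the context it was stated in."""
    subject: str
    predicate: str
    obj: str
    context: str


# All URIs below are made up for illustration.
quads = [
    # An ordinary statement, made in context ctx:c1.
    Quad("ex:doc1", "ex:topic", "ex:rdf", "ctx:c1"),
    # The context is a resource too, described by further statements.
    # Here they live in a hypothetical default context.
    Quad("ctx:c1", "rdf:type", "crdf:Context", "ctx:default"),
    Quad("ctx:c1", "dc:source", "http://www.example1.org", "ctx:default"),
    Quad("ctx:c1", "dc:creator", "ex:andy", "ctx:default"),
]


def context_attributes(quads, context_uri):
    """Collect every attribute recorded about one context resource."""
    return {q.predicate: q.obj for q in quads if q.subject == context_uri}


attrs = context_attributes(quads, "ctx:c1")
```

An application that needs other attributes simply adds more statements about
ctx:c1; the quad structure itself does not change.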

> Although I originally assumed a statement ID would be necessary, it turns
> out it is not, though it can be convenient to assign one in some limited
> circumstances.  Specifically, if you need to formally track the creation,
> editing, and deletion of statements, say in co-authoring, author/editor,
> or draft/comment/revise/approve environments, statement IDs can be
> helpful.

Change tracking is one topic where statement IDs are useful. In the context
of trust-enabled systems, statement IDs are also needed to capture
reputational information like "Fred believes a statement by Andy". So I
think there are arguments for even using quintuples instead of quads in
applications which need an efficient way to capture context *and* meta
information about statings.
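
As a rough Python sketch of why the fifth element helps, assume a quintuple
is a triple plus a context and a statement ID. A belief such as "Fred
believes a statement by Andy" can then point at the statement ID directly.
The Quintuple type, the ex:believes property, and all URIs are hypothetical
and are not taken from cRDF.

```python
from typing import NamedTuple


class Quintuple(NamedTuple):
    """A triple plus a statement ID and a context (both plain URIs here)."""
    statement_id: str
    subject: str
    predicate: str
    obj: str
    context: str


# Illustrative data only; none of these URIs come from a real vocabulary.
store = [
    # Andy's stating, identified as stmt:1.
    Quintuple("stmt:1", "ex:doc1", "ex:topic", "ex:rdf", "ctx:andy"),
    # Fred's belief refers to Andy's stating through its ID.
    Quintuple("stmt:2", "ex:fred", "ex:believes", "stmt:1", "ctx:fred"),
]


def believers(store, statement_id):
    """Return everyone recorded as believing the given stating."""
    return [
        q.subject
        for q in store
        if q.predicate == "ex:believes" and q.obj == statement_id
    ]


result = believers(store, "stmt:1")
```

With quads alone, the belief could only refer to the whole context ctx:andy,
not to one particular stating inside it.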

> Otherwise, just keep in mind each triple is essentially its own
> identifier.  Although it can be said many times and in many ways,
>  { person:you ; holiday:Xmas ; funlevel:merry }
> is a unique sentiment.  Yes, it may be useful to specify all who made this
> statement, and when and under what conditions this was asserted, but the
> whole purpose of RDF is to unambiguously encode assertions.  (My apologies
> for the last sentence, I just copied it out of my recent notes; it uses
> XML namespaces for URI abbreviation.)  And, the dc:source recorded for the
> triple may already point to a resource containing the information you
> would otherwise cross-reference to the statement identifier.

I think you are mixing up Statement and Stating here. Do you treat something
like:
{ person:you ; holiday:Xmas ; funlevel:merry ; dc:source
http://www.example1.org }
{ person:you ; holiday:Xmas ; funlevel:merry ; dc:source
http://www.example2.org }

as one statement with two sources or as two different statings?
Jeremy referred to this problem as "the old statement/stating
discussion" earlier in this thread. Maybe he can give us a link to some
documentation about the pros and cons of the two different views.
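
The two readings can be made concrete with a small Python sketch. It assumes
each stating is stored as a (subject, predicate, object, source) tuple; that
layout and the variable names are illustrative assumptions, not a description
of either of our models.

```python
from collections import defaultdict

# The two example lines from above, kept as separate statings.
statings = [
    ("person:you", "holiday:Xmas", "funlevel:merry", "http://www.example1.org"),
    ("person:you", "holiday:Xmas", "funlevel:merry", "http://www.example2.org"),
]

# Reading 1: every stating counts on its own.
number_of_statings = len(statings)

# Reading 2: the triple is the identity, and sources are merged under it.
sources_by_statement = defaultdict(set)
for subject, predicate, obj, source in statings:
    sources_by_statement[(subject, predicate, obj)].add(source)

number_of_statements = len(sources_by_statement)
```

Under the first reading there are two records; under the second there is one
statement whose set of sources has two members.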

>
> For myself, I am currently prototyping a draft/comment/revise/approve system
> for RDF triples and will add a statement identifier to the model as soon
> as it becomes necessary for the prototype to work well.  So far it doesn't
> need it, but I have just started developing the editing functions.
>

Are you developing your own proprietary RDF repository for your extended
triples or do you use standard software like Sesame or Jena for storing your
triples? Do you think your application could use a standard repository based
on quads?

Intellidimension's RDF Gateway supports quads, and Dave Beckett has also
implemented something similar to quads in Sesame. If there are other
groups developing quad-based repositories, it would be great if they could
send me a note, so we can compare the different approaches before we start
implementing a cRDF repository.

Chris

> The other two items I have found useful to record for each triple,
> primarily for administrative purposes, are the dc:creator of the triple
> itself, and the triple's date and time of creation (dcterms:created).  At
> times it seemed a good idea to include other information, such as a
> primitive ordinal item (RDF sorting is not fun at the triple level) and
> for other details, but ultimately these four additions have proven their
> value.
>
> Oh, I also found it very, very helpful to use the same format for all
> resources recorded, including these four additional items and if used, the
> statement identifier.  (dcterms:created is a literal in dcterms:W3CDTF
> date/time format).
>
> -David E. Wagner II
>
> Chris Bizer asked for feedback about:
> ...
> > we did some brainstorming about trust, context and the justification of
> > query results and ended up with:
> > - an extended RDF data model based on quintuples (a triple plus two
> > additional elements: context and statement ID).
> > - a trust-oriented query language for this data model
> > - the concept of justification trees for tracking data provenance and
> > data
> > lineage.
> ...
>
>
Received on Tuesday, 30 December 2003 10:46:22 UTC