Re: Statements/Reified statements from Jonas Liljegren on 2000-11-23 (www-rdf-interest@w3.org from November 2000)

From: Jonas Liljegren <jonas@rit.se>
Date: 23 Nov 2000 16:47:55 +0100
To: Graham Klyne <GK@Dial.pipex.com>
Cc: Sergey Melnik <melnik@db.stanford.edu>, ML RDF-interest <www-rdf-interest@w3c.org>, Wraf development <rdf@uxn.nu>
Message-ID: <87d7fmheqc.fsf@astral.paranormal.se>
Graham Klyne <GK@Dial.pipex.com> writes:

> At 09:51 AM 11/23/00 +0100, Jonas Liljegren wrote:
>
> >This means that instead of four, we have five:
> >
> >{ uri, pred, subj, obj, model }
> 
> I considered that approach for [1], but have preferred to use
> properties to create the association between statement-resource and
> context (model).  The above approach allows a given statement to be
> associated with only one context/model, where properties allow a
> given statement-resource to be incorporated into any number of
> contexts/models.  That seems very much more in line with the RDF
> philosophy of "anyone can say anything about anything".

This depends on whether you look at it as a statement or a stating.
Anybody can state a specific statement, but every stating is unique.


There are three special cases:

 1. Two URIs for the same statement:
    S1: [A] --B--> [C]  (M1)
    S2: [A] --B--> [C]  (M2)

 2. The same URI for different statements:
    S1: [A] --B--> [C]  (M1)
    S1: [D] --E--> [F]  (M2)

 3. The same URI for the same statement:
    S1: [A] --B--> [C]  (M1)
    S1: [A] --B--> [C]  (M2)


This means that neither the triple nor the URI can be used as the
unique key in the storage of RDF.  In the Wraf [2] DBI, I use the
combination of model and URI as the key.
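
As a rough illustration of why the key has to combine model and URI,
here is a minimal Python sketch (not Wraf code).  The Stating tuple,
its field names and the helper functions are assumptions made for this
example; they only mirror the { uri, pred, subj, obj, model } layout
above and are not the Wraf DBI schema.

```python
from typing import NamedTuple


class Stating(NamedTuple):
    """One stored row; the field names are illustrative only."""
    uri: str
    pred: str
    subj: str
    obj: str
    model: str


def is_unique(rows, key):
    """Return True if key(row) differs for every row in rows."""
    keys = [key(row) for row in rows]
    return len(keys) == len(set(keys))


def by_triple(row):
    return (row.subj, row.pred, row.obj)


def by_uri(row):
    return row.uri


def by_model_and_uri(row):
    return (row.model, row.uri)


# The three special cases listed above.
case1 = [Stating("S1", "B", "A", "C", "M1"), Stating("S2", "B", "A", "C", "M2")]
case2 = [Stating("S1", "B", "A", "C", "M1"), Stating("S1", "E", "D", "F", "M2")]
case3 = [Stating("S1", "B", "A", "C", "M1"), Stating("S1", "B", "A", "C", "M2")]

for name, rows in [("case 1", case1), ("case 2", case2), ("case 3", case3)]:
    print(name,
          "triple:", is_unique(rows, by_triple),
          "uri:", is_unique(rows, by_uri),
          "model+uri:", is_unique(rows, by_model_and_uri))
```

The triple collides in cases 1 and 3, the URI collides in cases 2 and
3, and only the (model, uri) pair stays distinct in all three.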


What is the most efficient way of storing data, while still allowing
any combination?

I think that the most practical thing is to view the reified
statements as statings.  Since they are stated in different models,
they will probably have different URIs.  The common case will therefore
be that a statement belongs to just one model.


The Wraf is resource centric.  A resource has dynamic and static
properties.  The statements are also properties.  Every resource is
said to belong to exactly one model.  This means that I will have to
represent the special cases 2 and 3 by expanding the statements to
their reification.

A previous version (alpha 3) allowed multiple models.  That solved
case 3, but not case 2.  In addition: what should I do if one of the
models changed and the other didn't?  This consideration led me to
conclude that it would be more efficient to just have one model and
group all the nasty cases together for special handling.  (The same
thing goes for literals and some other things.)


Even if I store statements as {uri, pred, subj, obj, model}, they have
an implicit representation in the RDF graph.  They are drawn as
reified statements contained in a model container.  Another
representation is to give the statement the property model.  (Wraf
will infer a lot of properties from other properties.)


Case 2, above, has to be spelled out to the point where every URI
belongs to only one model.  Here, we prefix generated URIs with G:

  G1: [S1] --type--> [Statement] (M1)
  G2: [S1] --subject--> [A] (M1)
  G3: [S1] --predicate--> [B] (M1)
  G4: [S1] --object--> [C] (M1)
  G5: [S1] --type--> [Statement] (M2)
  G6: [S1] --subject--> [D] (M2)
  G7: [S1] --predicate--> [E] (M2)
  G8: [S1] --object--> [F] (M2)
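
The expansion above can be sketched in Python as follows.  This is only
an illustration under stated assumptions: the reify() helper, the tuple
layout (uri, subj, pred, obj, model) and the G-numbering counter are
invented here and are not taken from Wraf.

```python
from itertools import count


def reify(uri, subj, pred, obj, model, ids):
    """Expand one stating into four statings about `uri`.

    Each resulting row gets its own generated URI (G1, G2, ...) and
    belongs to exactly one model.
    """
    parts = [
        ("type", "Statement"),
        ("subject", subj),
        ("predicate", pred),
        ("object", obj),
    ]
    return [(f"G{next(ids)}", uri, p, o, model) for p, o in parts]


ids = count(1)  # shared counter so generated URIs never repeat
expanded = (
    reify("S1", "A", "B", "C", "M1", ids)
    + reify("S1", "D", "E", "F", "M2", ids)
)

for g, s, p, o, m in expanded:
    print(f"{g}: [{s}] --{p}--> [{o}] ({m})")
```

Run as written, this prints the eight lines G1 to G8 in the same form
as the listing above.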



 Quoting statements and the question of truth
 --------------------------------------------

Instead of having a boolean for each statement, indicating if the
statement is a fact or just a reified statement, we could use models
and selections for denoting truth.

The idea is that nonfact statements are placed in another model, or
are given a special property saying that the statement is not
endorsed.  Let's say model M1 has statements serialized as:

  [S1] --type--> [Statement]
  [S1] --subject--> [A]
  [S1] --predicate--> [B]
  [S1] --object--> [C]
  [D] --E--> [S1]

( Could be drawn as { D E { A B C } }. )

One thought I had was to store like this:

  S1: [A] --B--> [C] (G1)
  G2: [D] --E--> [S1] (M1)
  G3: [G1] --type--> [Model] (G1)
  G4: [G1] --quotedIn--> [M1] (G1)


Models will have a bunch of metadata about the origin, date and other
things.  One model could include other models.  But the 'quotedIn'
property invented here can be used to remember that the statement was
explicitly included in the model.  The 'quotedIn' property is of no
importance for the reasoning.  But it can be used to recreate the
model in serialized XML.

What's considered to be true will depend on which models you trust.
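
A small Python sketch of this idea, using the four stored rows from
the example above.  The row layout (uri, subj, pred, obj, model) and
the helpers facts() and quoted_models() are assumptions made for
illustration, not part of Wraf.

```python
# Rows are (uri, subj, pred, obj, model), copied from the example above.
store = [
    ("S1", "A", "B", "C", "G1"),
    ("G2", "D", "E", "S1", "M1"),
    ("G3", "G1", "type", "Model", "G1"),
    ("G4", "G1", "quotedIn", "M1", "G1"),
]


def facts(store, trusted):
    """Return the triples stated in any of the trusted models."""
    return [(s, p, o) for _, s, p, o, m in store if m in trusted]


def quoted_models(store, model):
    """Return the models recorded as quoted in `model` via 'quotedIn'."""
    return {s for _, s, p, o, _ in store if p == "quotedIn" and o == model}


# Trusting only M1: the quoted statement A B C is not taken as a fact.
print(facts(store, {"M1"}))

# The quoting model G1 is still reachable when M1 has to be serialized.
print(quoted_models(store, "M1"))
```

Trusting only M1 yields just D E S1.  Adding G1 to the trusted set
would also bring in A B C and the statements about G1 itself.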


 Aliases
 -------

Another question is that of resource aliases.  G2 above may be
referring to the same statement as another URI.  If it's inferred that
it's indeed the same *stating*, it will be marked as an alias for the
'real' name.  Let's look at an example of one model with the original
stating and another model quoting that statement, without knowing the
stating URI:

 S1: [A] --B--> [C] (M1)

 S2: [A] --B--> [C] (M2)
 S3: [D] --E--> [S2] (M2)

The context of this may let the application infer that S1 and S2 are
indeed the same stating, thus adding:

 S4: [S2] --aliasFor--> [S1] (G1)

G1 is here the context depending on the inference rule and the
involved models.
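
How an application might use such an alias can be sketched in Python.
The sketch takes the 'aliasFor' statement S4 as given and does not
model the inference rule that produced it.  The row layout
(uri, subj, pred, obj, model) and the helpers alias_map() and
canonical() are hypothetical names introduced for this example.

```python
# Rows are (uri, subj, pred, obj, model), copied from the example above.
store = [
    ("S1", "A", "B", "C", "M1"),
    ("S2", "A", "B", "C", "M2"),
    ("S3", "D", "E", "S2", "M2"),
    ("S4", "S2", "aliasFor", "S1", "G1"),
]


def alias_map(store):
    """Collect alias -> 'real' name pairs from 'aliasFor' statements."""
    return {s: o for _, s, p, o, _ in store if p == "aliasFor"}


def canonical(uri, aliases):
    """Follow alias links to the 'real' name, stopping on a cycle."""
    seen = set()
    while uri in aliases and uri not in seen:
        seen.add(uri)
        uri = aliases[uri]
    return uri


aliases = alias_map(store)

# Resolve the object of S3, which quotes the statement through its alias.
_, subj, pred, obj, _ = store[2]
print(subj, pred, canonical(obj, aliases))
```

With the alias in place, S3 can be read as a statement about S1, the
original stating, even though M2 only knew the name S2.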




Am I right in thinking that you are probably thinking that I am making
all this much more complicated than it ought to be? ;-)



 [1] http://public.research.mimesweeper.com/RDF/RDFContexts.html
 [2] http://www.uxn.nu/wraf/

-- 
/ Jonas Liljegren

The Wraf project http://www.uxn.nu/wraf/
Sponsored by http://www.rit.se/
Received on Thursday, 23 November 2000 10:46:51 UTC