RE: N3 contexts vs RDF reification from Lee Jonas on 2001-04-26 (www-rdf-interest@w3.org from April 2001)

From: Lee Jonas <lee.jonas@cakehouse.co.uk>
Date: Thu, 26 Apr 2001 17:31:35 +0100
To: "'Seth Russell'" <seth@robustai.net>, RDF-IG <www-rdf-interest@w3.org>
Cc: Tim Berners-Lee <timbl@w3.org>, Guha <guha@alpiri.com>, pat hayes <phayes@ai.uwf.edu>
Message-ID: <51ED29F31E20D411AAFD00105A4CD7A77114@zingiber.cakehouse.co.uk>
Seth Russell [mailto:seth@robustai.net] wrote:

>From: "Lee Jonas" <lee.jonas@cakehouse.co.uk>
>
>> Seth Russell [mailto:seth@robustai.net] wrote:
>>
>> >From: "Lee Jonas" <lee.jonas@cakehouse.co.uk>
>> >
>> >> I like the concept, but does it mean n+1 tuples? e.g.:
>> >> (stmtid1,p,s,o)
>> >> (stmtid2,references,ctx1,stmtid1)
>> >
>> >Yes, but I don't see that as a problem, do you?
>>
>> Actually, it becomes n+n*m tuples (worst-case), e.g.:
>>
>> (stmtid1,p1,s1,o1)
>> (stmtid2,p2,s2,o2)
>> (stmtid3,ref,ctx1,stmtid1)
>> (stmtid4,ref,ctx1,stmtid2)
>> (stmtid5,ref,ctx2,stmtid1)
>> (stmtid6,ref,ctx2,stmtid2)
>>
>> This bothers me ever so slightly (though not a lot): my proposal would not
>> fare much better at n*m:
>>
>> (ctx1,p1,s1,o1)
>> (ctx1,p2,s2,o2)
>> (ctx2,p1,s1,o1)
>> (ctx2,p2,s2,o2)
>
>Well yes my proposal definitely stores more records.  However, depending on
>implementation, it may actually store the same amount of data.  Consider a
>case where we are just concerned with context but don't want to do second
>order reasoning about that context, then we are permitted to factor all of
>the context arcs into another table and can eliminate the statement id on
>those arcs.  This may have the added benefit of making the SQL queries
>easier .. but I'm not sure yet.   So we end up with:
>
> (stmtid1,p1,s1,o1)
> (stmtid2,p2,s2,o2)
> (ctx1,stmtid1)
> (ctx1,stmtid2)
> (ctx2,stmtid1)
> (ctx2,stmtid2)
>
>Which stores exactly the same amount of actual data as your proposal.
>Please refer to the mentograph at
>http://robustai.net/mentography/SemStructure.gif
>

Although these all seem valid points, I think we are talking at
cross-purposes here.  I am referring to the number of tuples generated by an
RDF parser, whereas you are referring to the number and size of records
stored.

Both are valid concerns.
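
A minimal sketch of the counting argument above, assuming the worst case
where every one of n statements appears in every one of m contexts.  The
function names are hypothetical, and tuples follow the (id, predicate,
subject, object) order used in this thread.

```python
from itertools import product


def reified_layout(statements, contexts):
    """One id-carrying tuple per statement, plus one 'ref' tuple per
    (context, statement) pair: n + n*m tuples in the worst case."""
    tuples = []
    ids = []
    for i, (p, s, o) in enumerate(statements, start=1):
        sid = f"stmtid{i}"
        ids.append(sid)
        tuples.append((sid, p, s, o))
    next_id = len(statements) + 1
    for ctx, sid in product(contexts, ids):
        tuples.append((f"stmtid{next_id}", "ref", ctx, sid))
        next_id += 1
    return tuples


def quad_layout(statements, contexts):
    """One (context, p, s, o) quadruple per occurrence: n*m tuples."""
    return [(ctx, p, s, o) for ctx, (p, s, o) in product(contexts, statements)]


def factored_layout(statements, contexts):
    """Statement table plus a separate (context, stmtid) membership table,
    with no statement id on the membership rows."""
    stmt_table = [
        (f"stmtid{i}", p, s, o) for i, (p, s, o) in enumerate(statements, start=1)
    ]
    membership = [(ctx, row[0]) for ctx in contexts for row in stmt_table]
    return stmt_table, membership


statements = [("p1", "s1", "o1"), ("p2", "s2", "o2")]
contexts = ["ctx1", "ctx2"]

reified = reified_layout(statements, contexts)
quads = quad_layout(statements, contexts)
stmt_table, membership = factored_layout(statements, contexts)

print(len(reified), len(quads), len(stmt_table) + len(membership))
```

The first two functions count what a parser would emit; the third counts
stored records, which is the distinction drawn above.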

>> Arguably, you are storing the ephemeral statements, whereas all triples
>> generated from parsing the RDF/XML are *occurrences* of said statements.
>
>Yes.  I like to think of it as storing the ~ideal~ statements and then
>allowing different contexts to view them.
>

Yes, I like that mindset.
BTW, s/ephemeral/abstract in my last point.

>> >However, the extra tangible arc label assigning the triple to a context
>> >proves useful for other reasons in my system.  With it one can talk about a
>> >statement being in a context ... and that statement being in a context ...
>> >and so on .. and so on .. and so on.   With your system there is no such
>> >explicit arc of which we can speak.
>>
>> No, but that information is captured directly in the quadruple.  I can map
>> quadruples into 3 tables (and vice-versa):
>
>Yes, I understand.  I think our methods are different only in the direction
>of the context arrow ..
>please refer to the mentograph at:
>http://robustai.net/mentography/orthoganilizingContext.gif
>

I like this diagram.  It is exactly how I picture the three alternative ways
of implementing higher-order statements in my mind.


>But, as acknowledged, with your proposal there is no arc label for context,
>therefore there is no explicit link to a property node for that concept.  It
>seems that whatever is done with your context will need to be hard coded
>into the system; it could not be specified via a schema.
>

Yes, this is the crux; see my summary at the end of this mail.

>But to be honest I think your technique has more aesthetic appeal ... it
>feels cleaner .. I like to think of context as a container ..... so now I am
>wavering.   I had originally planned to do it your way, then somebody
>convinced me that drawing the arcs in the opposite direction would be more
>flexible.
>
>Help! .. Help! .. Help!  ...  decisions .. decisions ... decisions ....
>
>Seth
>

My understanding:

Summary
=======
There are at least two competing proposals for representing contexts in RDF.
The concept of 'context' is similar in both, but they differ slightly in how
they treat 'higher-order' statements ('reification', i.e. making statements
about statements).

Proposal 1: Non-RDFCore
=======================
Contexts are resources implemented in a layer above RDFCore (i.e. the RDF
M&S spec).  They may or may not be anonymous.  They contain links to
"Reified" Statements as defined in the RDF M&S 1.0 spec (i.e. a resource
of type rdf:Statement that is the subject of rdf:predicate, rdf:subject
and rdf:object statements).  Statements may be included within multiple
Contexts (i.e. there is a many-to-many relationship between Contexts and
Statements).

Tuples generated from parsing statement occurrences are triples as defined
in the RDF M&S 1.0 spec: (predicate * subject * object).  Context membership
is modelled with additional Statements in the RDF Model Layer.

Proposal 2: RDFCore
===================
Contexts are represented directly within RDFCore.  They may or may not be
anonymous.  Every Statement is associated with at least one Context.  If
none is specified they are associated implicitly with their containing
RDFGraph (e.g. as represented by a single RDF document), which is an
rdfs:subClassOf Context.  Contexts obsolete the explicit representation of
Statement "Reification" within RDFCore (as described in the RDF M&S 1.0
spec), as they provide a convenient, alternative way to specify
'higher-order' Statements (i.e. Statements about Statements).  Statements
may be included within multiple Contexts (i.e. there is a many-to-many
relationship between Contexts and Statements).

Tuples generated from parsing statement occurrences are quadruples: (context
* predicate * subject * object).  The triple (predicate * subject * object)
identifies the abstract (i.e. ~Ideal~) Statement it is an occurrence of.
Context membership is represented directly in RDFCore.
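
A minimal sketch of the quadruple model described above, under stated
assumptions: the QuadStore class, its method names and the "graph:doc1"
identifier for the implicit RDFGraph context are all hypothetical.  The
store is keyed on the abstract (predicate, subject, object) triple, so
each occurrence only adds a context to that statement.

```python
from collections import defaultdict

# Hypothetical identifier for the implicit RDFGraph context of one document.
DEFAULT_CONTEXT = "graph:doc1"


class QuadStore:
    def __init__(self):
        # abstract statement (p, s, o) -> set of contexts it occurs in
        self._contexts_by_statement = defaultdict(set)

    def add(self, p, s, o, context=None):
        """Record one occurrence and return the quadruple a parser would emit.
        With no explicit context, the containing graph is used."""
        ctx = context if context is not None else DEFAULT_CONTEXT
        self._contexts_by_statement[(p, s, o)].add(ctx)
        return (ctx, p, s, o)

    def contexts_of(self, p, s, o):
        """All contexts that contain the abstract statement (p, s, o)."""
        return set(self._contexts_by_statement.get((p, s, o), ()))

    def statements_in(self, context):
        """All abstract statements that occur in the given context."""
        return {
            stmt
            for stmt, ctxs in self._contexts_by_statement.items()
            if context in ctxs
        }


store = QuadStore()
implicit = store.add("p1", "s1", "o1")              # implicit graph context
store.add("p1", "s1", "o1", context="ctx1")         # same statement, second context
store.add("p2", "s2", "o2", context="ctx1")
print(store.contexts_of("p1", "s1", "o1"))
```

Because membership is many-to-many, the same abstract statement can be
looked up from either side: by statement or by context.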

Proposal 1 Advantages
=====================
* Directly compatible with RDF 1.0
* Doesn't require significant departure from the RDF M&S 1.0 spec.
* Directly compatible with current RDF parsers.
* Doesn't break existing RDF applications.
* Other vocabularies can *choose* whether to support contexts or not.

Proposal 2 Advantages
=====================
* Directly compatible with N3
* Forms a superset of RDF 1.0 functionality; although not directly
compatible, it would be backward compatible with RDF 1.0 documents (due to
implicit RDFGraph context).
* A significant departure from the RDF M&S 1.0 spec in terms of representing
RDFGraph & Context directly and eliminating explicit representation of
"Reified" Statements.
* The implicit duality of a Statement (reified & non-reified forms) is more
expressive with fewer tuples, facilitating RDF consumers treating the same
Statement as reified, non-reified or both.
* Also solves problems related to: 1) malformed "Reified Statements" as
described in the RDF M&S 1.0 spec and 2) semantics of rdf:Description as a
collection of reified versions of Statements (i.e. when to reify a
Description).
* Contexts are modelled intrinsically, fundamentally reflecting that every
Statement is merely an assertion.  Determining whether it applies and
whether it can be trusted to be "fact", based on both RDF Producer specified
provenance and RDF Consumer policy, becomes easier to implement.


Note that a database/parser/application that modelled RDF using Proposal 2
can map to RDF 1.0 compliant triples that represent Contexts as specified in
Proposal 1 (and vice-versa).  This is what I intend to do in my new database
application - in addition to RDF, I want to support N3 with its
expressiveness of Contexts intact.
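
A minimal sketch of such a mapping, not the actual database design.  It
assumes quadruples in (context, predicate, subject, object) order, triples
in (predicate, subject, object) order, a hypothetical membership property
named "ref" as used earlier in this thread, and generated "_:stmtN"
identifiers for the reified statement resources.

```python
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
REF = "ref"  # hypothetical context-membership property


def quads_to_triples(quads):
    """Proposal 2 -> Proposal 1: reify each abstract statement once, then
    add one membership triple per (context, statement) occurrence."""
    ids = {}
    triples = []
    for ctx, p, s, o in quads:
        stmt = (p, s, o)
        if stmt not in ids:
            sid = f"_:stmt{len(ids) + 1}"
            ids[stmt] = sid
            triples += [
                (RDF + "type", sid, RDF + "Statement"),
                (RDF + "predicate", sid, p),
                (RDF + "subject", sid, s),
                (RDF + "object", sid, o),
            ]
        triples.append((REF, ctx, ids[stmt]))
    return triples


def triples_to_quads(triples):
    """Proposal 1 -> Proposal 2: rebuild each reified statement, then emit
    one quadruple per membership triple."""
    parts = {}    # statement id -> {reification property: value}
    members = []  # (context, statement id) in document order
    wanted = (RDF + "predicate", RDF + "subject", RDF + "object")
    for p, s, o in triples:
        if p == REF:
            members.append((s, o))
        elif p in wanted:
            parts.setdefault(s, {})[p] = o
    return [
        (ctx,
         parts[sid][RDF + "predicate"],
         parts[sid][RDF + "subject"],
         parts[sid][RDF + "object"])
        for ctx, sid in members
    ]


quads = [
    ("ctx1", "p1", "s1", "o1"),
    ("ctx1", "p2", "s2", "o2"),
    ("ctx2", "p1", "s1", "o1"),
    ("ctx2", "p2", "s2", "o2"),
]
triples = quads_to_triples(quads)
round_trip = triples_to_quads(triples)
print(len(triples), round_trip == quads)
```

The sketch only covers statements that reach a context through a
membership triple; plain asserted triples with no reified form would need
a rule for assigning them to the implicit RDFGraph context.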

regards

Lee
Received on Thursday, 26 April 2001 12:31:21 UTC