Re: Discussion-Paper: A Logical Interpretation of RDF from Wolfram Conen on 2000-09-04 (www-rdf-interest@w3.org from September 2000)

From: Wolfram Conen <conen@wi-inf.uni-essen.de>
Date: Tue, 05 Sep 2000 00:30:48 +0200
To: "McBride, Brian" <bwm@hplb.hpl.hp.com>
CC: www-rdf-interest@w3.org
Message-ID: <39B42298.B09A5A8F@wi-inf.uni-essen.de>
[This is an answer to Brian's last EMail]

Brian McBride wrote:

:> There are no (in the strict sense) anonymous resources in the RDF
:> triple model. One may argue that in a "flat" (this will be explained
:> below) semantic network/triple model, it is (only) the NAME that gives
:> an entity its existence. Anonymous resources seem to be a by-product of
:> "convenient" RDF syntax - on first (and second?) sight this induces a
:> "mismatch" between (triple) model and (graphical/serialization) syntax.
:
:I find the spec confusing on this issue.  I do agree that to have a 
:mismatch between the model and the syntax is unfortunate.  I
:resolved this by interpreting the spec to say that the model also had
:anonymous resources.  I am beginning to wonder whether I have over
:generalised what the spec actually says.
:
:My interpretation of:
:
:  The intention of this sentence is to make the value of the Creator
:  property a structured entity. In RDF such an entity is represented
:  as another resource. The sentence above does not give a name to 
:  that resource; it is anonymous, so in the diagram below we represent
:  it with an empty oval:
:
:from 2.1.1 of the spec is that the model does allow anonymous
:resources.
:There are several subsequent references in the spec to 'anonymous bag
:containers' where the anonymous bag is a model construct.
:
:I'd like to see the argument about entities only existing when they
:have a name.  I'll borrow an example from Dan Brickley - an entity can be
:referred to by its properties e.g. the person whose email address is
:lassila@w3.org refers to an entity which to my knowledge has no URI.
:
:I think this is a grey area of the spec.  One of the great benefits of
:an effort such as yours to formalise the specification is that it 
:highlights such issues.  It is for you to consider whether this is an
:issue that deserves such a spotlight.  I'd argue that it is more
:significant than the issue you raised over properties having multiple
:values.

Ah, ok. There is the "Basic RDF Model" (described in 2.1 with text and
graphs), then we have the different serialization syntax(es) and
finally, in Section 5, the "Formal Model of RDF" (first mentioning the
triples) shows up. We concentrated on the "triple representation", which
should, in principle, make no difference compared to considering all 3
flavours of the model, because Section 5 says:

 " This specification shows three representations of the data model; as
3-tuples (triples), as a graph, and in XML. These representations have
equivalent meaning."

But it makes a little difference. I'll try to give the argument. A first
hint is given in Appendix A (Glossary) with the definition of
Triple:

"A representation of a statement used by RDF, consisting of just the
property, the RESOURCE IDENTIFIER, and the property value in that order."

A second aspect comes from the formal model itself: 
"1. There is a set called Resources. "

A third aspect stems from the definition of triples (Section 5 again):
"4.There is a set called Statements, each element of which is a triple
of the form  {pred, sub, obj} ... sub is a resource ..."

Now, as we are in a text-based world of discourse (considering the
triple model), we may conclude that we cannot represent/distinguish a
member of a set other than by giving it a NAME (completely in accordance
with the Glossary definition). Now, however, it starts to become
tricky: if we have only one construct available to denote that an
entity is a member of a set, and if this construct is "give the entity
an IDENTIFIER/NAME", then there is no way to express that something is
anonymous (i.e. "it has NO NAME") and is also a member of the set. (Ok,
you could divide the namespace into "NAMES FOR NAMED RESOURCES" and
"NAMES FOR UNNAMED RESOURCES", but this does not look too plausible...)
So, in essence, a "thing" can only be used in a triple of the form
{pred, sub, obj} if you can literally "write down" what it is (i.e. give
it a "NAME" - which, in a sense, not only represents but "is" the
thing to be identified).

Hm, well, this probably sounds a bit too "philosophical". The point I want
to make is not that there are no anonymous resources in the RDF data
model, but that in a flat triple model there is no (logically consistent)
way to express anonymity. (The triples "have" to be flat: according to
the formal model, triples are members of the set Statements and are,
thus, not allowed to be inside of other triples.) And, writing this, I
have to say that this, in a sense, also violates the postulated
equivalence of the above-mentioned three representations of the data
model (what a name, by the way, data model ;).

This becomes relevant in a few places of Section 6, like:
"Each propertyElt E contained by a Description element results in the
creation of a triple {p,r,v} where: ... 2. r is the resource whose
identifier is given by the value of the about attribute of the
Description or a new resource whose identifier is the value of the ID
attribute of the Description, if present; ELSE THE RESOURCE HAS NO
IDENTIFIER."

Ok, but, in the triple model, r does not represent anything already
magically known in the triple model - no, the resource "behind" r IS its
IDENTIFIER. (In the graphical model, it is trivial to make a separation
between named and unnamed resources - simply leave the name out of the
oval - but this option is not present in the triple model. So, what do
you substitute for r?)

However, some will probably find this discussion rather academic, because
the "practical" solution of parsers is to GENERATE names for "anonymous"
resources - this way, a resource can be represented in the triple
model. Nevertheless, as we tried to argue in the last email, the
need to generate names could be reduced if a nested triple
model were used.
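
As an illustration only, the name-generating approach can be sketched in a
few lines of Python. This is not taken from any existing parser: the
"genid:" prefix, the function names and the example vocabulary (ex:Email,
dc:Title) are assumptions made for this sketch.

```python
import itertools

# Hypothetical counter-based name generator; the "genid:" prefix is an
# assumption of this sketch, not something the RDF M&S document prescribes.
_counter = itertools.count(1)


def fresh_name():
    """Return a new identifier for a resource that has none in the syntax."""
    return f"genid:{next(_counter)}"


def triples_for_description(about, properties):
    """Build flat (pred, sub, obj) triples for one Description.

    about      -- identifier of the described resource, or None if the
                  Description has neither an about nor an ID attribute
    properties -- list of (predicate, value) pairs
    """
    # In the flat triple model the subject slot must hold something that
    # can be written down, so an unnamed resource gets a generated name.
    subject = about if about is not None else fresh_name()
    return [(pred, subject, obj) for pred, obj in properties]


named = triples_for_description(
    "http://example.org/doc", [("dc:Title", "Some document")]
)
anonymous = triples_for_description(
    None, [("ex:Email", "lassila@w3.org"), ("rdf:type", "ex:Person")]
)
```

All triples produced from the same unnamed Description share one generated
subject, and a second unnamed Description would receive a different one.
In the triple model the "anonymous" resource is then indistinguishable
from a named one, which is exactly the point made above.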

:> 
:> :
:> :o (5) lit(o) => instanceOf(o, rdfs_Literal)
:> :
:> :  I was curious why there is no equivalent 
:> :
:> :      res(o) => instanceOf(o, rdfs_Resource)
:> :
:> 
:> res(o) => instanceOf(o, rdfs_Resource) should not be asserted
:> "automatically" because if it is NOT INFERABLE, this allows the
:> detection of domain/range constraint violations.
:
:Hmmmm, consider (forgiving the shorthand):
:
:<rdfs:Property rdf:about="http://foo/linksTo">
:  <rdfs:domain rdf:resource="rdfs:Resource"/>
:  <rdfs:range  rdf:resource="rdfs:Resource"/>
:</rdfs:Property>
:
:<rdf:Description rdf:about=
:    "http://www-uk.hpl.hp.com/people/bwm/index.htm">
:  <foo:linksTo rdf:resource=
:    "http://www-uk.hpl.hp.com/people/bwm/rdf/index.htm">
:</rdf:Description>
:
:Have the domain or range constraints been violated?  If I am 
:understanding correctly, the model you have proposed would
:say yes, because there is no explicit type property in the
:model for the web pages which are the subject and object
:of the linksTo statement.

Violation depends on the "tripleization" of the above. In the paper, we
took the (easy and comfortable) point of view that the model we want to
reason about (check constraints, query it, extend it) is already given
as triples. To complement the discussion there, a detailed analysis of
"what serialization syntax leads to which triples" should be done (this
is, clearly, already encoded in SirPAC or Jan's Prolog-Parser). Ideally,
this would end in a commonly accepted TRANSFORMATION GRAMMAR (Jan's
grammatical rules could be a good start).

So, my answer is: if a "tripleization" makes the resource-type
information in the syntax above explicit, no violation would be
detected. If it is not made explicit, one could consider this an
incomplete tripleization, and a violation will (reasonably) be detected.
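
A minimal Python sketch of this dependence on the tripleization follows.
It is a simplification, not the rule set of the paper: the function names
are hypothetical, triples are written as (pred, sub, obj) tuples of
strings, and "instanceOf" is reduced to the presence of an explicit
rdf:type triple (no subclass reasoning).

```python
RDFS_RESOURCE = "rdfs:Resource"


def instances_of(triples, cls):
    """Names x for which an explicit (rdf:type, x, cls) triple exists."""
    return {sub for pred, sub, obj in triples
            if pred == "rdf:type" and obj == cls}


def violations(triples, prop, domain, range_):
    """Return the triples using `prop` whose subject lacks an explicit type
    triple for `domain` or whose object lacks one for `range_`."""
    dom = instances_of(triples, domain)
    rng = instances_of(triples, range_)
    return [(p, s, o) for p, s, o in triples
            if p == prop and (s not in dom or o not in rng)]


PAGE = "http://www-uk.hpl.hp.com/people/bwm/index.htm"
RDF_PAGE = "http://www-uk.hpl.hp.com/people/bwm/rdf/index.htm"
LINKS_TO = "http://foo/linksTo"

# Tripleization 1: only the linksTo statement is emitted.
incomplete = [(LINKS_TO, PAGE, RDF_PAGE)]

# Tripleization 2: the resource-type information is made explicit as well.
complete = incomplete + [
    ("rdf:type", PAGE, RDFS_RESOURCE),
    ("rdf:type", RDF_PAGE, RDFS_RESOURCE),
]

print(violations(incomplete, LINKS_TO, RDFS_RESOURCE, RDFS_RESOURCE))
print(violations(complete, LINKS_TO, RDFS_RESOURCE, RDFS_RESOURCE))
```

With the first tripleization the linksTo triple is reported; with the
second the list of violations is empty.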

One may, however, argue that everything in subject or predicate
position (and everything not a literal in object position) should
"automatically" be considered to be an instance of rdfs:Resource - if you
would like to have it this way (and you seem to want it ;), the rule you
suggested would be a nice solution (seems reasonable to me, we should
probably add it).
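
As a rough sketch of what adding the rule res(o) => instanceOf(o,
rdfs_Resource) could amount to, the following Python applies it in a single
pass over a set of (pred, sub, obj) triples. The function names, the ex:
vocabulary and the convention that literals are written with surrounding
double quotes are assumptions made only for this example.

```python
def is_literal(term):
    """Assumption of this sketch: literals carry surrounding double quotes."""
    return len(term) >= 2 and term.startswith('"') and term.endswith('"')


def add_resource_types(triples):
    """Apply res(o) => instanceOf(o, rdfs_Resource) once.

    Everything in subject or predicate position, and every non-literal in
    object position, receives an explicit rdf:type triple. The pass is not
    repeated, so rdf:type and rdfs:Resource themselves are not typed.
    """
    resources = set()
    for pred, sub, obj in triples:
        resources.update((pred, sub))
        if not is_literal(obj):
            resources.add(obj)
    typed = {("rdf:type", r, "rdfs:Resource") for r in resources}
    return set(triples) | typed


model = {
    ("ex:linksTo", "ex:a", "ex:b"),
    ("ex:title", "ex:a", '"Home"'),
}
closed = add_resource_types(model)
```

After this pass, a domain or range constraint naming rdfs:Resource can no
longer be violated by anything that appears as a subject, a predicate or a
non-literal object, which is the behaviour asked for above.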

:If so, I disagree.  Anything identified by a URI
:is a resource and even though there are no specific type
:properties, no constraint violation exists.
:

Maybe it should read: anything identified by a URI can be a resource? I
say this because only if "something" is part of the model does it become a
resource in the model context. So only if the URI is "recognized" as
something that identifies a resource in the model does it become an
identifier (the uri(x) predicate in our rules) for a "modeled" resource. To
be recognized, it has to be in subject or predicate position somewhere.
This is probably not in accordance with "declaring" an object explicitly
as a resource, as described in the RDF M&S document; oh well, if the
type-resource relationship were made explicit, this would work again ;)
So our statement would be: "every x that makes the (logic) predicate uri(x)
true is considered to be a resource".

:Good work.  It's great how it highlights issues in the spec.

Thanks, good to hear. Maybe something useful will grow out of it. More
of your valuable remarks are always welcome.

Wolfram and Reinhold

PS: Still reasoning about "Why are statements not a subset of resource?"
Received on Monday, 4 September 2000 18:20:45 UTC