Re: Associations in RDF from MDaconta@aol.com on 2002-07-19 (www-rdf-interest@w3.org from July 2002)

From: <MDaconta@aol.com>
Date: Fri, 19 Jul 2002 15:17:29 EDT
To: decoy@iki.fi
CC: www-rdf-interest@w3.org
Message-ID: <22.2bdb9569.2a69bfc9@aol.com>

Hi Decoy,

In a message dated 7/19/02 5:12:15 AM US Mountain Standard Time,
decoy@iki.fi writes:
> >The relation itself is not part-of or subordinate to either entity.
>  
>  Quite. But you neglect the fact that in the RDF world, that's always the
>  case. If you view RDF through the triple model, you'll see that everything
>  is a relation. There is no subordination in the sense you're talking about
>  in RDF. RDFS and DAML do model classes the way you describe, but if this
>  is a problem, a higher level modelling vocabulary, including explicit
>  association types, will likely be used as the primary format, and
>  RDFS/DAML produced from that automatically.

Hmmmm.  I think you answer this for me below in your discussion of literals
in RDF, in that we do not have a single form of triple.  We have

S --> P --> Object
and 
S --> P --> Literal

What I am asking for is to formalize this dichotomy.  This is a major fork
in the road, and unifying both forms under the term predicate and serializing
them in an XML hierarchy pushes us towards the latter.  Let's look at an
example from the RDF M&S spec:

<rdf:RDF>
  <rdf:Description about="http://www.w3.org/Home/Lassila">
    <s:Creator rdf:resource="http://www.w3.org/staffId/85740"/>
  </rdf:Description>

  <rdf:Description about="http://www.w3.org/staffId/85740">
    <v:Name>Ora Lassila</v:Name>
    <v:Email>lassila@w3.org</v:Email>
  </rdf:Description>
</rdf:RDF>

Here we have a relation (Creator) between two entities and some attributes
(Name, Email) of the "...staffId/85740" entity.  Whether I map this out as
triples in N3 notation ... or leave it as is ... how does an RDF processor
distinguish the relations from the attributes WITHOUT looking at its
referent?  So, now if I am taking this chunk of RDF and turning it into Java
objects, I model this by having two entities: one with one attribute
(creator) and one with two attributes (name, email).  In essence, I lose the
possibly critical fact that creator is a relation ... in a sea of entities.
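
To make the distinction concrete, here is a minimal sketch, assuming a toy
in-memory representation of the three triples in the M&S example above. The
URIRef and Literal classes and the partition helper are hypothetical names
chosen for illustration, not part of any RDF library. The point it shows is
that the only way to tell a relation from an attribute is to inspect the kind
of object each triple carries:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class URIRef:
    """A resource identified by a URI (hypothetical helper type)."""
    uri: str


@dataclass(frozen=True)
class Literal:
    """A plain string value (hypothetical helper type)."""
    value: str


HOME = URIRef("http://www.w3.org/Home/Lassila")
STAFF = URIRef("http://www.w3.org/staffId/85740")

# The s: and v: prefixes are left unexpanded because the example
# does not declare their namespaces.
triples = [
    (HOME, "s:Creator", STAFF),
    (STAFF, "v:Name", Literal("Ora Lassila")),
    (STAFF, "v:Email", Literal("lassila@w3.org")),
]


def partition(triples):
    """Split triples by the kind of object they point at."""
    relations = [t for t in triples if isinstance(t[2], URIRef)]
    attributes = [t for t in triples if isinstance(t[2], Literal)]
    return relations, attributes


relations, attributes = partition(triples)
```

Nothing in the predicate name itself says which bucket a triple belongs to;
the split falls out only after looking at the object.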

But I am also talking about more than discovery.  I see the modeling of
relations between entities as the "killer value proposition" of RDF.  It is
intuitive because when I talk about some common sense natural language
examples like:

Osama is-A terrorist
Osama blew-up a building

I am telling people that we need to model that "killer relationship"
"blew-up" and that the bang for the buck is that we really need to watch
people who blow stuff up.  So it is important to create ontologies that
really model in detail this act of blowing things up so we can track it in
all of its manifestations (how do you blow things up, what are the degrees of
blowing things up, how do we know something has been blown up ...) and
possibly create asymmetric linkages that allow us to be proactive.  (Joe
Schmoe bought stuff that can be used to blow things up, so Joe Schmoe may be
a terrorist...)

So again, why is this important ... modeling action verbs is a clear,
intuitive value proposition that is not well covered by XML Schema ... I see
this as the "killer justification" for RDF and think it should be
strengthened.

> The same happens with attributes -- in RDF, everything is a binary relation,
>  and I'd be hard pressed to view the named subjects and objects as existing
>  in the sense that your Employee and Department tables do in an RDBMS.
>  Again, remember that in RDF naming something is as good as declaring it.

I understand your point on binary relations ... and I agree with it.  I
believe you are referring to the use of predicate logic.

motherOf(Mary, John)    // asserts Mary is the mother of John.

But I have not seen this exploited well.  Is it that we don't know how?
Do you have any good RDF examples that move beyond Class/attribute 
modeling?  The only one I have seen that even tangentially touches this 
area is the foaf:knows stuff.  That is a cool start.
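
For reference, the predicate-logic form above maps onto a triple directly. A
minimal sketch, assuming made-up ex: names for Mary, John, Anne and motherOf
(only foaf:knows comes from a real vocabulary), with hypothetical helper
functions:

```python
def predicate_to_triple(predicate, subject, obj):
    """Rewrite predicate(subject, obj) as a (subject, predicate, obj) triple."""
    return (subject, predicate, obj)


# motherOf(Mary, John) and knows(Mary, Anne) as a small set of facts.
facts = {
    predicate_to_triple("ex:motherOf", "ex:Mary", "ex:John"),
    predicate_to_triple("foaf:knows", "ex:Mary", "ex:Anne"),
}


def holds(predicate, subject, obj):
    """True if predicate(subject, obj) has been asserted."""
    return (subject, predicate, obj) in facts
```

The relation is directional: holds("ex:motherOf", "ex:Mary", "ex:John") is
true, while the reversed arguments are not asserted.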

>  (Actually what I've wondered for a long time is, why have literals at all?
>  Somehow it seems to me that they're the only thing which bring
>  subordination, as you call it, into the picture. A property with literal
>  values becomes what you call an attribute, and cannot be dealt with in
>  even terms with the rest of the objects in the world. In my ideal world
>  any literal data would be declared separately, outside RDF proper, or RDF
>  would be factored into two separate parts, the triple model, only dealing
>  with URIs, and a declaration part which enables abstract, non-URL URIs to
>  be bound to literal data.)

I agree that literals are problematic in the RDF model because they spoil
the elegance of the triple by creating a sort of major and minor triple.
While it is obvious that constants are necessary in knowledge representation
... it is difficult for me to see RDF instances outside of the
class/attribute model (i.e. rdf:type).  In other words, how do we create
instances of assertions (or state that an assertion is an instance of another
assertion)?  Why is this important?  It is easier for me to just use XML
Schema for instances of classes.  So, why would I use RDF for class
instances?  I believe this is the center of the debate over RSS as RDF or XML
Schema.  If RSS used more assertions (especially ones with good action verbs,
i.e. relations) this would not be debated.
 
>  Want me to draft such a thing? It's just a couple of classes, some RDFS
>  salt, maybe with some DAML thrown in for flavor. I've been thinking about
>  trying my hand at a Petri net vocabulary anyway, and those definitely need
>  a complex association type.

I think the exercise would be useful but I would prefer to see it "built-in".
Because, in my mind, this brings me back to decoy:Relation.
  
>  >But I am saying that it is so basic that the "separate association type"
>  >should be an integral part of RDFS so that all tools will understand it.
>  
>  Well, currently RDF is following a model where the individual
>  specifications are rather slim, and are stacked on top of each other to
>  form the infamous "layer cake", a Good Thing. I see no essential reason
>  why you couldn't build a association ontology, submit it as a W3C Note and
>  see whether it takes off. Beyond that, things much more essential to the
>  infrastructure (like logic primitives; DAML+OIL up till WebOnt WG was
>  born) are being developed outside of the RDF specs, so I don't see why
>  complex associations should be taken to be "foundational" in any sense.

This is a good point to which I only have one exception: they need to
be foundational because implied subordination of a relation can be
semantically incorrect.  In other words, it must be made explicit that an RDF
processor should not subordinate a relation unless instructed to do so by the
governing application.  I don't see how this can be done by layering.  Also,
since RDFS is not yet a recommendation ... this is the time to make such a
change.  I also see the addition of rdfs:Relation as a minor change on the
order of rdf:Property.
  
>  I would tend to see complex associations as one of many ways to utilize
>  RDF, and not essential to all RDF applications. It doesn't add to the
>  power of the model, unlike DAML+OIL or TimBL's log: primitives, but rather
>  facilitates interoperability between those RDF apps which need complex
>  association types. In the context of the cake, that sort of thing usually
>  becomes part of the topping.

Again, I disagree on the issue of subordination.  I think that subordination
is a key component of all assertions.  It is the foundation for natural
language constructs like paragraphs and sentences.  It is too important to
have it be ambiguous (in other words, is this Object subordinate to the
Subject or not?).  In my mind, by even using the term Property ... you have
created an implied subordination.  Thus, do not make it implied ... use
Property where it is so and use Relation where it is not.

>  Lately I've been subclassing FOAF properties this way in order not to
>  break compatibility with the existing FOAF eaters. You'll only run into
>  trouble when the associations are highly parametrized, without clear
>  bounds on how many types there could be, but your particular example
>  would seem to fit the picture nicely.

I see your point about subclassing properties and agree that this nicely
handles the issue of degree of association.  It is the ramifications of 
"Property" that I dislike.  I think this is proven by the way we model a 
property in Object Oriented Programming --- clearly as an attribute.  
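
The subclassing approach can be sketched roughly as follows. This is a
minimal illustration, assuming two hypothetical properties (ex:worksWith and
ex:reportsTo) declared as sub-properties of foaf:knows, and a hand-rolled
entail helper rather than any real RDFS reasoner. It shows why an existing
FOAF consumer still sees a plain foaf:knows statement:

```python
# Hypothetical sub-property declarations (child -> parent).
SUB_PROPERTY_OF = {
    "ex:worksWith": "foaf:knows",
    "ex:reportsTo": "ex:worksWith",
}


def super_properties(prop):
    """Return every ancestor of prop, nearest first."""
    chain = []
    seen = {prop}
    while prop in SUB_PROPERTY_OF:
        prop = SUB_PROPERTY_OF[prop]
        if prop in seen:  # guard against accidental cycles
            break
        seen.add(prop)
        chain.append(prop)
    return chain


def entail(triples):
    """Add the triples implied by the sub-property declarations."""
    result = set(triples)
    for s, p, o in triples:
        for parent in super_properties(p):
            result.add((s, parent, o))
    return result


asserted = {("ex:Mike", "ex:reportsTo", "ex:Ann")}
entailed = entail(asserted)
```

The degree of association lives in the specialized property, while the
generic foaf:knows triple is still derivable for tools that know nothing
else.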
  
>  Pure triple stores typically take the form of fully inverted, three column
>  tables. You'd expect the data to be primary sorted on the subject, too.
>  Searching such a structure for an instance, given a keyable property, and
>  then searching for all properties attached to the instance, is quite
>  efficient. (Insertions are less so, though.) But I'm not convinced there
>  is a huge performance win, here, until you have an extremely heavy duty
>  RDF application. Nowadays those are still rare.

Good point.  I really don't have any evidence that this would be a
performance issue.  My assumption is based on "relations" being the
"killer-app" of RDF and thus the number of them increasing significantly.
Only if that occurs would optimizations based on that fact be warranted.
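
For readers unfamiliar with the structure being discussed, here is a minimal
sketch of a fully inverted triple store, assuming simple in-memory
dictionaries in place of sorted three-column tables. The TripleStore class
and its method names are hypothetical. It shows the lookup pattern described
in the quoted text: find an instance by a keyable property, then fetch
everything attached to it.

```python
from collections import defaultdict


class TripleStore:
    """Toy triple store with one index per column."""

    def __init__(self):
        self.by_subject = defaultdict(set)
        self.by_predicate = defaultdict(set)
        self.by_object = defaultdict(set)

    def add(self, s, p, o):
        # Every insertion updates all three indexes, which is why
        # inserts cost more than lookups.
        triple = (s, p, o)
        self.by_subject[s].add(triple)
        self.by_predicate[p].add(triple)
        self.by_object[o].add(triple)

    def find_subjects(self, p, o):
        """Subjects that have property p with value o."""
        matches = self.by_predicate.get(p, set()) & self.by_object.get(o, set())
        return {s for (s, _, _) in matches}

    def describe(self, s):
        """All triples whose subject is s."""
        return set(self.by_subject.get(s, set()))


store = TripleStore()
store.add("http://www.w3.org/staffId/85740", "v:Name", "Ora Lassila")
store.add("http://www.w3.org/staffId/85740", "v:Email", "lassila@w3.org")
store.add("http://www.w3.org/Home/Lassila", "s:Creator",
          "http://www.w3.org/staffId/85740")

found = store.find_subjects("v:Email", "lassila@w3.org")
```

Note that this toy store treats relations and attributes identically; any
optimization for relations would need a separate index keyed on the kind of
object.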
  
>  >This is a separate issue -- but you really think the spec is elegant?
>  
>  The model, absolutely. The syntax, far less so. That's why many of us work
>  in N3 and let CWM do the dirty work. ;)

In terms of adoption, I think having multiple serializations is a no-win
proposition.  As soon as you start even attempting to explain it ... eyes
glaze over.  It is inherently unconvincing.  This leads us to: "yeah, if it
is so elegant how come you can't express it in a simple way?"
  
>  >Instead, I would argue that it mixes metaphors (linguistics and OOP); is
>  >at times ambiguous where the rubber meets the road (due to multiple
>  >serializations)
>  
>  There's no unambiguity, here, when you conceive of the thing in triples.
>  At the very worst you can always go for a minimal subset of the XML
>  serialization. That's perfectly legal and a subset not plagued by all the
>  complexities of the striped representation is relatively easy to find.
>  Beyond that, it's all triples.

I believe there is ambiguity in determining whether two chunks of RDF are
equal if they are serialized differently.  Are there any tools that make this
statement incorrect?
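
As a point of reference for the question, here is a minimal sketch of the
comparison once both chunks have been parsed into triples. It assumes the
parsing has already happened (the two lists below are written by hand), and
it assumes neither graph contains anonymous (blank) nodes; with blank nodes a
plain set comparison is not enough and some form of graph matching would be
needed. The same_graph helper is a hypothetical name.

```python
def same_graph(triples_a, triples_b):
    """Compare two ground graphs, ignoring order and duplicates."""
    return set(triples_a) == set(triples_b)


# Triples as they might come out of a nested XML serialization ...
from_xml = [
    ("http://www.w3.org/Home/Lassila", "s:Creator",
     "http://www.w3.org/staffId/85740"),
    ("http://www.w3.org/staffId/85740", "v:Name", "Ora Lassila"),
]

# ... and the same statements as they might come out of a flat listing,
# in a different order and with one statement repeated.
from_flat = [
    ("http://www.w3.org/staffId/85740", "v:Name", "Ora Lassila"),
    ("http://www.w3.org/Home/Lassila", "s:Creator",
     "http://www.w3.org/staffId/85740"),
    ("http://www.w3.org/staffId/85740", "v:Name", "Ora Lassila"),
]

equal = same_graph(from_xml, from_flat)
```

Whether a given tool performs this comparison reliably across serializations
is exactly the open question; the sketch only shows what the comparison looks
like at the triple level.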

Additionally, I was not solely talking about mathematical ambiguity ... even
the informal ambiguity an adopter faces in choosing a serialization format
should not exist.  I think having multiple serialization formats is the
antithesis of standardization.
  
>  >and is unclear about what it does best (resource description versus
>  >knowledge representation).
>  
>  RDF itself does, and should do, neither. It's pure infrastructure like,
>  say, Unicode. With suitable schemata, you get the easy part, resource
>  description. With RDFS and DAML+OIL, you get shared ontologies. With
>  higher level vocabularies you get true knowledge representation, with FPL
>  and all that jazz. I think that's the way it's supposed to be, given that
>  many people will do FOAF only for a long period before getting into
>  something more exotic.

Interesting assertion but I don't agree because even infrastructure is 
designed for a purpose.  Your infrastructure stresses one thing over
another ... for example, I would say that the RDF infrastructure's 
main purpose is to create assertions while XML Schema's main 
purpose is to create a container (document type).  Assertions versus
Containers.  Simple and clear.  However, when all your examples of
assertions are simply to create Class/attribute instances you are using
assertions to create Containers.  I assert this class has this property.
Thank you but we have that in XML Schema.  Thus, your examples 
demonstrate that you don't know what you want to do with the 
infrastructure by creating a redundant capability.
  
>  >As proof of the above, I would point to its lack of mainstream adoption.  
>  I'd say that's more inertia than a sign of inherent weakness in RDF.

I hope you are right but believe it is more in line with my above
point about purpose.

>  >Do you bet the business on RDF in its current form?  
>  If I had a business to bet, I probably would. ;)

Unfortunately, I don't see the tech titans agreeing with you.  When Microsoft 
releases the MS Office formats as RDF  ... I will say you made a smart bet.

Best wishes,

 - Mike
----------------------------------------------------
Michael C. Daconta
Director, Web & Technology Services
www.mcbrad.com
Received on Friday, 19 July 2002 15:18:29 UTC