Re: Associations in RDF from Sampo Syreeni on 2002-07-19 (www-rdf-interest@w3.org from July 2002)

From: Sampo Syreeni <decoy@iki.fi>
Date: Fri, 19 Jul 2002 15:11:58 +0300 (EEST)
To: <MDaconta@aol.com>
cc: <www-rdf-interest@w3.org>
Message-ID: <Pine.SOL.4.30.0207191339470.20406-100000@kruuna.Helsinki.FI>

On 2002-07-18, MDaconta@aol.com uttered to decoy@iki.fi:

>The relation itself is not part-of or subordinate to either entity.

Quite. But you neglect the fact that in the RDF world, that's always the
case. If you view RDF through the triple model, you'll see that everything
is a relation. There is no subordination in the sense you're talking about
in RDF. RDFS and DAML do model classes the way you describe, but if this
is a problem, a higher level modelling vocabulary, including explicit
association types, will likely be used as the primary format, and
RDFS/DAML produced from that automatically.

>So, while I know you can model it with a "worksFor" attribute that is a
>reference to a Department object ... I would argue that makes your class
>brittle and logically incorrect.

And you would be correct. But the only reason you're seeing worksFor as an
attribute of Employee is because that's the way RDF's XML serialization is
built. When the relation is expressed in triples, you can just as easily
view the relation as separate, or as tied to the Department instead. The
same happens with attributes -- in RDF, everything is a binary relation,
and I'd be hard pressed to view the named subjects and objects as existing
in the sense that your Employee and Department tables do in an RDBMS.
Again, remember that in RDF naming something is as good as declaring it.
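
To make the point concrete, here is a minimal Python sketch of a triple set
queried from either end. The ex: names and the match() helper are
hypothetical, made up for illustration only; nothing here is a real RDF API.

```python
# Hypothetical triples; the ex: names are illustrative only.
triples = {
    ("ex:alice", "ex:worksFor", "ex:sales"),
    ("ex:bob", "ex:worksFor", "ex:sales"),
    ("ex:alice", "rdf:type", "ex:Employee"),
    ("ex:sales", "rdf:type", "ex:Department"),
}

def match(s=None, p=None, o=None):
    """Return all triples matching the pattern; None acts as a wildcard."""
    return sorted(
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )

# The same relation seen from the employee, from the department,
# and as a free-standing set of pairs.
from_employee = match(s="ex:alice", p="ex:worksFor")
from_department = match(p="ex:worksFor", o="ex:sales")
standalone = match(p="ex:worksFor")
```

Nothing in the data privileges the subject: worksFor belongs to Employee no
more than it belongs to Department.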

(Actually what I've wondered for a long time is, why have literals at all?
Somehow it seems to me that they're the only thing which brings
subordination, as you call it, into the picture. A property with literal
values becomes what you call an attribute, and cannot be dealt with on
even terms with the rest of the objects in the world. In my ideal world
any literal data would be declared separately, outside RDF proper, or RDF
would be factored into two separate parts, the triple model, only dealing
with URIs, and a declaration part which enables abstract, non-URL URIs to
be bound to literal data.)

>>But you can still model any such construct by willing a separate
>>association type into existence. You would model each "degreed"
>>association as a member of such a type, with properties (TopicMap people
>>would say facets, I think) hanging off the instance. [...]
>
>This is what I am asking for.

Want me to draft such a thing? It's just a couple of classes, some RDFS
salt, maybe with some DAML thrown in for flavor. I've been thinking about
trying my hand at a Petri net vocabulary anyway, and those definitely need
a complex association type.
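
As a rough idea of the shape such a draft might take at the triple level,
here is a small Python sketch. Every ex: name (ex:Association, ex:Degreed,
ex:holder and so on) is a placeholder assumed for illustration, not a
proposed vocabulary.

```python
# An association modelled as a first-class node with its own properties.
# All ex: names are hypothetical placeholders.
triples = [
    ("ex:Degreed", "rdfs:subClassOf", "ex:Association"),
    ("ex:assoc1", "rdf:type", "ex:Degreed"),
    ("ex:assoc1", "ex:holder", "ex:alice"),
    ("ex:assoc1", "ex:institution", "ex:someUniversity"),
    ("ex:assoc1", "ex:field", "ex:mathematics"),
]

def describe(node):
    """Collect the properties hanging off one association instance."""
    return {p: o for s, p, o in triples if s == node}

degree = describe("ex:assoc1")
```

The association instance is neither part of the holder nor of the
institution; both are simply property values on the instance.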

>But I am saying that it is so basic that the "separate association type"
>should be an integral part of RDFS so that all tools will understand it.

Well, currently RDF is following a model where the individual
specifications are rather slim, and are stacked on top of each other to
form the infamous "layer cake", a Good Thing. I see no essential reason
why you couldn't build an association ontology, submit it as a W3C Note and
see whether it takes off. Beyond that, things much more essential to the
infrastructure (like logic primitives; DAML+OIL up till WebOnt WG was
born) are being developed outside of the RDF specs, so I don't see why
complex associations should be taken to be "foundational" in any sense.

I would tend to see complex associations as one of many ways to utilize
RDF, and not essential to all RDF applications. It doesn't add to the
power of the model, unlike DAML+OIL or TimBL's log: primitives, but rather
facilitates interoperability between those RDF apps which need complex
association types. In the context of the cake, that sort of thing usually
becomes part of the topping.

>Again, this is the idea but I would prefer not to have dc:relation,
>my:relation, your:relation ... etc. there should be a universal
>rdfs:relation that can be subclassed.

Perhaps I was being a little terse. What I meant was:

foaf:loves rdfs:subPropertyOf foaf:likes.
foaf:hates rdfs:subPropertyOf foaf:dislikes.
foaf:likes rdfs:subPropertyOf foaf:hasFeelingsAbout.
foaf:dislikes rdfs:subPropertyOf foaf:hasFeelingsAbout.

And so on. Then do your typing with foaf:hasFeelingsAbout, and the rest
will follow. That sort of thing works beautifully when there is a neat,
limited set of relation types to worry about. Lately I've been subclassing
FOAF properties this way in order not to break compatibility with the
existing FOAF eaters. You'll only run into trouble when the associations
are highly parametrized, without clear bounds on how many types there
could be, but your particular example would seem to fit the picture
nicely.
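
What "the rest will follow" amounts to can be sketched in a few lines of
Python. This is only an illustration of rdfs:subPropertyOf entailment over
the four statements above, assuming each property has at most one direct
superproperty; the ex: individuals and the function names are invented.

```python
# Direct rdfs:subPropertyOf links, copied from the statements above.
SUB_PROPERTY_OF = {
    "foaf:loves": "foaf:likes",
    "foaf:hates": "foaf:dislikes",
    "foaf:likes": "foaf:hasFeelingsAbout",
    "foaf:dislikes": "foaf:hasFeelingsAbout",
}

def superproperties(prop):
    """Yield prop itself, then each property it transitively specializes."""
    seen = set()  # guards against accidental cycles in the schema
    while prop is not None and prop not in seen:
        seen.add(prop)
        yield prop
        prop = SUB_PROPERTY_OF.get(prop)

def entail(triples):
    """Add every triple implied by the subPropertyOf chain."""
    return {(s, q, o) for s, p, o in triples for q in superproperties(p)}

# Hypothetical instance data.
facts = {("ex:romeo", "foaf:loves", "ex:juliet")}
inferred = entail(facts)
```

A consumer that only knows foaf:hasFeelingsAbout still finds the statement,
which is why the existing eaters keep working.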

>Yes, but RDFS layers the concept of Classes on top of the S, P, O of the
>triple.

The point was, you can build an RDF store which knows about classes and
utilizes them in its internal schema and query optimization. For instance,
storing each rdfs:Class in its own table, with the structure dictated by
the relevant RDF Schema, setting a key constraint on any
daml:UnambiguousProperty and so on. You get the benefits of fast access
and strong typing, but do not have to add to RDF to accomplish that.
Win-win, I say.
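
A minimal sketch of the table-per-class idea, using Python's sqlite3 module.
The employee table, its columns and the choice of mbox as the unambiguous
property are all assumptions made for the example. Note also that a key
constraint rejects a clashing row, whereas DAML semantics would rather
conclude that the two subjects are the same; the sketch only shows the
storage side.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# One table per class (hypothetical ex:Employee), one column per property.
# mbox stands in for an unambiguous property: its value identifies the subject,
# so the column gets a UNIQUE constraint.
conn.execute("""
    CREATE TABLE employee (
        uri       TEXT PRIMARY KEY,
        mbox      TEXT UNIQUE,
        works_for TEXT
    )
""")
conn.execute(
    "INSERT INTO employee VALUES (?, ?, ?)",
    ("ex:alice", "mailto:alice@example.org", "ex:sales"),
)

def insert_rejected(row):
    """Try to insert a row; report whether a constraint refused it."""
    try:
        conn.execute("INSERT INTO employee VALUES (?, ?, ?)", row)
        return False
    except sqlite3.IntegrityError:
        return True

# A second subject claiming the same mbox violates the key constraint.
duplicate_rejected = insert_rejected(
    ("ex:bob", "mailto:alice@example.org", "ex:sales")
)
row_count = conn.execute("SELECT COUNT(*) FROM employee").fetchone()[0]
```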

>Thus, if indexed properly, a search for an Employee instance is not a
>search of the entire space (though I admit I don't know if RDF stores
>index this way or even at all).

Pure triple stores typically take the form of fully inverted, three column
tables. You'd expect the data to be sorted primarily on the subject, too.
Searching such a structure for an instance, given a keyable property, and
then searching for all properties attached to the instance, is quite
efficient. (Insertions are less so, though.) But I'm not convinced there
is a huge performance win, here, until you have an extremely heavy duty
RDF application. Nowadays those are still rare.
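
For illustration, a toy version of such a store in Python: the same triples
kept in three sorted orders, with lookups done by binary search on a prefix.
The TripleIndex class and its method names are invented for this sketch, and
real stores are of course more elaborate.

```python
from bisect import bisect_left, insort

class TripleIndex:
    """Toy fully inverted store: every triple is kept in three sort orders."""

    def __init__(self):
        self.spo = []  # sorted by subject, predicate, object
        self.pos = []  # sorted by predicate, object, subject
        self.osp = []  # sorted by object, subject, predicate

    def add(self, s, p, o):
        # Each insertion shifts list elements in three orderings,
        # which is why writes cost more than reads.
        insort(self.spo, (s, p, o))
        insort(self.pos, (p, o, s))
        insort(self.osp, (o, s, p))

    @staticmethod
    def _prefix(rows, prefix):
        """Return all rows that start with the given tuple prefix."""
        i = bisect_left(rows, prefix)
        out = []
        while i < len(rows) and rows[i][:len(prefix)] == prefix:
            out.append(rows[i])
            i += 1
        return out

    def subjects_with(self, p, o):
        """Find instances by a keyable property value."""
        return [s for _, _, s in self._prefix(self.pos, (p, o))]

    def properties_of(self, s):
        """List every (property, value) pair attached to a subject."""
        return [(p, o) for _, p, o in self._prefix(self.spo, (s,))]

# Hypothetical data.
idx = TripleIndex()
idx.add("ex:alice", "foaf:mbox", "mailto:alice@example.org")
idx.add("ex:alice", "ex:worksFor", "ex:sales")
idx.add("ex:bob", "ex:worksFor", "ex:sales")

found = idx.subjects_with("foaf:mbox", "mailto:alice@example.org")
props = idx.properties_of(found[0])
```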

>This is a separate issue -- but you really think the spec is elegant?

The model, absolutely. The syntax, far less so. That's why many of us work
in N3 and let CWM do the dirty work. ;)

>Instead, I would argue that it mixes metaphors (linguistics and OOP); is
>at times ambiguous where the rubber meets the road (due to multiple
>serializations)

There's no ambiguity here when you conceive of the thing in triples.
At the very worst you can always go for a minimal subset of the XML
serialization. That's perfectly legal and a subset not plagued by all the
complexities of the striped representation is relatively easy to find.
Beyond that, it's all triples.

>and is unclear about what it does best (resource description versus
>knowledge representation).

RDF itself does, and should do, neither. It's pure infrastructure like,
say, Unicode. With suitable schemata, you get the easy part, resource
description. With RDFS and DAML+OIL, you get shared ontologies. With
higher level vocabularies you get true knowledge representation, with FPL
and all that jazz. I think that's the way it's supposed to be, given that
many people will do FOAF only for a long period before getting into
something more exotic.

>As proof of the above, I would point to its lack of mainstream adoption.

I'd say that's more inertia than a sign of inherent weakness in RDF.

>Do you bet the business on RDF in its current form?

If I had a business to bet, I probably would. ;)
-- 
Sampo Syreeni, aka decoy - mailto:decoy@iki.fi, tel:+358-50-5756111
student/math+cs/helsinki university, http://www.iki.fi/~decoy/front
openpgp: 050985C2/025E D175 ABE5 027C 9494 EEB0 E090 8BA9 0509 85C2
Received on Friday, 19 July 2002 08:12:04 UTC