Re: Associations in RDF from Sampo Syreeni on 2002-07-21 (www-rdf-interest@w3.org from July 2002)

From: Sampo Syreeni <decoy@iki.fi>
Date: Sun, 21 Jul 2002 21:48:06 +0300 (EEST)
To: <MDaconta@aol.com>
cc: <www-rdf-interest@w3.org>
Message-ID: <Pine.SOL.4.30.0207212040370.11453-100000@kruuna.Helsinki.FI>

On 2002-07-19, MDaconta@aol.com uttered to decoy@iki.fi:

>how does an RDF processor distinguish the relations versus the
>attributes WITHOUT looking at its referent?

It doesn't. Why should it? An attribute is just a mapping to a value space
which happens to be different from URIs. And not that different at that,
considering data: URIs.
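
To illustrate the data: URI remark, here is a minimal Python sketch that
wraps a literal value in a data: URI (RFC 2397) and unwraps it again. The
function names are hypothetical, and the choice of text/plain with UTF-8 is
an assumption made purely for the example.

```python
from urllib.parse import quote, unquote

# Assumed media type for the sketch; RFC 2397 allows others.
PREFIX = "data:text/plain;charset=utf-8,"

def literal_to_data_uri(value: str) -> str:
    """Percent-encode a literal and wrap it in a data: URI."""
    return PREFIX + quote(value, safe="")

def data_uri_to_literal(uri: str) -> str:
    """Recover the literal from a data: URI built by literal_to_data_uri."""
    if not uri.startswith(PREFIX):
        raise ValueError("not a data: URI produced by this sketch")
    return unquote(uri[len(PREFIX):])

uri = literal_to_data_uri("Hello, world")
print(uri)
print(data_uri_to_literal(uri))
```

Seen this way, an "attribute" value is just another node name, which is
the sense in which the two value spaces are not that different.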

>So, now if I am taking this chunk of RDF and turning it into Java
>objects -- I model this by having two entities one with 1 attribute
>(creator) and 1 with 2 attributes (name,email). In essence, losing the
>possibly critical fact that creator is a relation ...  in a sea of
>entities.

This might be a problem when coding RDF processors in Java. It is not an
RDF problem, however. What RDF is about is labelled graphs. Binary
relations will most often be modelled as a bipartite subgraph, with each
triple labelled by the predicate and with the subject and object serving as
the only data fields. If you look at the representation, neither side of the
relation need even be explicitly declared. The relation clearly stands on
its own.
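
As a rough illustration of the "labelled graph" view, the following Python
sketch stores a graph as a plain set of triples and pulls out the relation
named by a predicate. The example.org URIs and the helper names are
hypothetical; only the dc:creator and foaf:knows URIs are real vocabulary
terms.

```python
# Hypothetical resources, used only for this illustration.
DOC = "http://example.org/doc"
PERSON = "http://example.org/person/1"
CREATOR = "http://purl.org/dc/elements/1.1/creator"
KNOWS = "http://xmlns.com/foaf/0.1/knows"

# A graph is nothing more than a set of (subject, predicate, object) triples.
graph = {
    (DOC, CREATOR, PERSON),
    (PERSON, KNOWS, "http://example.org/person/2"),
}

def extension(graph, predicate):
    """Return the binary relation named by `predicate` as (subject, object) pairs."""
    return {(s, o) for (s, p, o) in graph if p == predicate}

def inverse(pairs):
    """Swap the two sides of a binary relation."""
    return {(o, s) for (s, o) in pairs}

creator_pairs = extension(graph, CREATOR)
print(creator_pairs)
print(inverse(creator_pairs))
```

Nothing in the set of triples makes the subject an "owner" of the pair; the
relation can be read from either side.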

The same holds of RDFS, too -- no property is declared to be a part of
either the range or the domain. Again, the only reason we view such
relations as having been degraded into attributes is that the relevant
specs occasionally depict them that way and that the XML serialization
looks like the subject is somehow more tightly bound to the properties
than the object. The only time the principle breaks down is when one
considers putting literals on the left side of a relation.
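
A small Python sketch of the RDFS point, under the assumption of a
hypothetical example.org vocabulary: the domain and range statements are
triples about the property itself, and the classes never appear as the
subject of any of them.

```python
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
EX = "http://example.org/schema#"  # hypothetical namespace

# The schema describes the property; the classes are only mentioned as objects.
schema = {
    (EX + "creator", RDF + "type", RDF + "Property"),
    (EX + "creator", RDFS + "domain", EX + "Document"),
    (EX + "creator", RDFS + "range", EX + "Person"),
}

def statements_about(schema, resource):
    """Return the (predicate, object) pairs whose subject is `resource`."""
    return {(p, o) for (s, p, o) in schema if s == resource}

print(sorted(statements_about(schema, EX + "creator")))
print(statements_about(schema, EX + "Document"))  # empty: nothing is declared on the class
```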

>So again, why is this important ... modeling action verbs is a clear,
>intuitive value proposition that is not well covered by XML Schema ...

I just cannot see why this is. If you want a simple, binary relation, you
can declare it quite nicely. Separately, if that pleases you. If you want
to speak about the relation, you may as well reify its instances, giving
them URIs, and talk about them. If you want n-aries, you sorta-reify
those instead, and talk about them. It's all something RDFS does just
fine, when you think about it.
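
The two moves mentioned here can be sketched in Python roughly as follows.
This is only an illustration: the example.org names, the `since` property
and the `Introduction` relation are made up, and literals are kept as bare
strings for brevity. The reification vocabulary (rdf:Statement, rdf:subject,
rdf:predicate, rdf:object) is the one from the RDF namespace.

```python
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
FOAF_KNOWS = "http://xmlns.com/foaf/0.1/knows"
EX = "http://example.org/"  # hypothetical namespace

def reify(statement_uri, triple):
    """Describe one triple as a resource of type rdf:Statement."""
    s, p, o = triple
    return {
        (statement_uri, RDF + "type", RDF + "Statement"),
        (statement_uri, RDF + "subject", s),
        (statement_uri, RDF + "predicate", p),
        (statement_uri, RDF + "object", o),
    }

def nary(instance_uri, relation_class, roles):
    """Model one instance of an n-ary relation as a node with one arc per role."""
    triples = {(instance_uri, RDF + "type", relation_class)}
    triples |= {(instance_uri, role, filler) for role, filler in roles.items()}
    return triples

# A binary relation instance, then statements about that instance.
knows = (EX + "alice", FOAF_KNOWS, EX + "bob")
graph = {knows} | reify(EX + "stmt1", knows)
graph.add((EX + "stmt1", EX + "since", "1999"))  # hypothetical property, literal as bare string

# A ternary relation: who introduced whom to whom.
graph |= nary(EX + "intro1", EX + "Introduction", {
    EX + "introducer": EX + "carol",
    EX + "first": EX + "alice",
    EX + "second": EX + "bob",
})
print(len(graph))
```

In both cases the result is still just a set of triples, so nothing below
the schema layer has to change.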

>Do you have any good RDF examples that move beyond Class/attribute
>modeling?  The only one I have seen that even tangentially touches this
>area is the foaf:knows stuff.  That is a cool start.

It is indeed. So the problem is perhaps more in the way people currently
use RDF than in the lack of expressive power. foaf:knows is a fairly
typical many-to-many binary relation, whereas most of the properties one
is bound to see nowadays are one-to-one. I'm certain one-to-many and
many-to-many relations become more common as more RDF apps are thought of.
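
For what it's worth, the cardinality actually observed in a given graph is
easy to check. The Python sketch below classifies a set of (subject, object)
pairs; the function name and the sample data are hypothetical, and the
result describes only the data at hand, not any schema constraint.

```python
from collections import Counter

def cardinality(pairs):
    """Classify a binary relation as '<subjects>-to-<objects>' from its pairs."""
    pairs = set(pairs)  # ignore duplicate pairs
    objects_per_subject = Counter(s for s, _ in pairs)
    subjects_per_object = Counter(o for _, o in pairs)
    left = "many" if any(n > 1 for n in subjects_per_object.values()) else "one"
    right = "many" if any(n > 1 for n in objects_per_subject.values()) else "one"
    return f"{left}-to-{right}"

# Hypothetical data: a foaf:knows-like relation and a mailbox-like relation.
knows = {("alice", "bob"), ("alice", "carol"), ("bob", "carol")}
mbox = {("alice", "mailto:alice@example.org"), ("bob", "mailto:bob@example.org")}

print(cardinality(knows))
print(cardinality(mbox))
```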

>I think the exercise would be useful but I would prefer to see it
>"built-in". Because, in my mind, this brings me back to decoy:Relation.

OK. Just for the fun of it I'll put the association stuff up on
http://www.iki.fi/~decoy/shared/meta/rdf-stuff .

>In other words, it must be made explicit that an RDF processor should
>not subordinate a relation unless instructed to do so by the governing
>application. I don't see how this can be done by layering.  Also, since
>RDFS is not yet a recommendation ... this is the time to make such a
>change.

If I'm not entirely mistaken, not one RDF processor subjugates properties
as of this moment. I also cannot see why RDFS would change that -- RDF
infra is something you will implement far before meddling with RDFS in any
way. I wouldn't oppose some explicit verbiage in the RDFS spec to point
out that properties are separate from their domains and ranges, or that at
the triple level, we deal with something quite different from, say, HTML
META elements. But that's just about it. I can't help but view anything
more as quite far-fetched and unnecessary.

>I also see the addition of rdfs:Relation as a minor change on the order
>of rdf:Property.

This is far from true. Embedding such a construct at the level of the data
model would be quite out of the question -- RDF is, and will be, about
labelled graphs. Sets of triples, that is. What you want to do, you will
always have to layer on top of that. We can argue about the proper level
where this should take place, but not much more. Besides, when we talk
about relations with attributes, we are implicitly talking about
reification of sorts. That's an area where I wouldn't be surprised to see
even the very basics moved outside of M&S.

>Again, I disagree on the issue of subordination.  I think that
>subordination is a key component of all assertions.  It is the
>foundation for natural language constructs like paragraphs and
>sentences. It is too important to have it be ambiguous (in other words,
>is this Object subordinate to the Subject or not?)

It never is, in RDF. At the very most the association might have some
extraneous, schema-imposed constraints like being one-to-one, or
human-readable semantics which somehow correspond to your notion of
subordination. But that's it. RDF is neutral on the issue.

>In my mind, by even using the term Property ... you have created an
>implied subordination.

Here I can agree with you, as I suspect many would. However, it's far too
late to do anything about the name, now.

>Thus, do not make it implied ... use Property where it is so and use
>Relation where it is not.

Relation wouldn't be suitable, either, since it wouldn't cover non-binary
relations.

>My assumption is based on "relations" being the "killer-app" of RDF and
>thus the # of them increasing significantly.  Only if that occurs, would
>optimizations based on that fact be warranted.

Precisely. Again I encourage you to draft and submit a W3C note with a
relation ontology, if you believe the issue to warrant that. If such a
thing does become highly successful, I have no trouble seeing it evolve
into a REC. This sort of thing is pretty much the way to go with IETF and,
supposedly, W3C -- development, adoption, experience and only then
standardization.

>This leads us to: "yeah, if it is so elegant how come you can't express
>it in a simple way?"

But you can. It's called N-Triples.
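
To show how little there is to it, here is a minimal Python sketch that
writes a set of triples out in N-Triples style, one statement per line. The
`Literal` wrapper, the function names and the sample data are assumptions of
the sketch; escaping of non-ASCII characters, language tags and datatypes
are left out.

```python
from typing import NamedTuple

class Literal(NamedTuple):
    """Hypothetical wrapper that marks a term as a literal rather than a URI."""
    value: str

def term(t):
    """Render one term: "literal", _:blank or <uri>."""
    if isinstance(t, Literal):
        escaped = (t.value.replace("\\", "\\\\").replace('"', '\\"')
                   .replace("\n", "\\n").replace("\r", "\\r"))
        return f'"{escaped}"'
    if t.startswith("_:"):
        return t
    return f"<{t}>"

def to_ntriples(graph):
    """Serialize a set of triples, sorted so the output is stable."""
    return "\n".join(sorted(f"{term(s)} {term(p)} {term(o)} ." for s, p, o in graph))

graph = {
    ("http://example.org/doc", "http://purl.org/dc/elements/1.1/creator", "_:a"),
    ("_:a", "http://xmlns.com/foaf/0.1/name", Literal("Jane Doe")),
}
print(to_ntriples(graph))
```

Each output line is one subject, one predicate, one object and a full stop,
which is about as simple as a serialization can get.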

>I believe there is ambiguity in determining whether two chunks of RDF
>are equal if they are serialized differently.  Are there any tools that
>make this statement incorrect?

Not at the moment, no. Building one on top of a decent RDF toolkit is just
about trivial, however. It entails implementing RDF MT closure rules
(which many of us have done, for our own purposes), sorting the result and
searching for differences. After taking care of such minor nuisances as
anonymous nodes, it isn't too hard.
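
A rough Python sketch of the comparison step, assuming both documents have
already been parsed into sets of triples by some toolkit and that anonymous
nodes are written as strings starting with "_:". The closure rules are
skipped here, and the brute-force search over anonymous node renamings is
only practical for small graphs; all names are hypothetical.

```python
from itertools import permutations

def is_blank(t):
    """Anonymous nodes are assumed to be strings of the form '_:label'."""
    return isinstance(t, str) and t.startswith("_:")

def blanks(graph):
    return sorted({t for triple in graph for t in triple if is_blank(t)})

def rename(graph, mapping):
    """Apply a renaming of anonymous nodes to every triple."""
    return {tuple(mapping.get(t, t) for t in triple) for triple in graph}

def equivalent(g1, g2):
    """True if g1 and g2 are the same set of triples up to anonymous node names."""
    b1, b2 = blanks(g1), blanks(g2)
    if len(g1) != len(g2) or len(b1) != len(b2):
        return False
    return any(rename(g1, dict(zip(b1, perm))) == g2 for perm in permutations(b2))

def difference(g1, g2):
    """Sorted triples found only in g1 and only in g2 (no renaming applied)."""
    return sorted(g1 - g2), sorted(g2 - g1)

NAME = "http://xmlns.com/foaf/0.1/name"
g1 = {("_:x", NAME, "Jane"), ("_:x", "http://xmlns.com/foaf/0.1/knows", "_:y")}
g2 = {("_:b", NAME, "Jane"), ("_:b", "http://xmlns.com/foaf/0.1/knows", "_:a")}
print(equivalent(g1, g2))
print(difference(g1, g2))
```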

>Additionally, I was not solely talking mathematical ambiguity ... just
>the informal ambiguity for an adopter to choose a serialization format
>should not exist.  I think multiple serialization formats is the
>antithesis of standardization.

So do you also think multiple programming languages are that? They are all
ways of serializing a finite state machine, after all.

This seemingly naive point is highly generalizable, too. Most of computer
science (and science as a whole, too) is about building levels of order,
or taxonomies. In this context having a description of something at one
level should be viewed more as a foundation to build higher level
descriptions on, not the be-all and end-all of knowledge. I think the RDF
layer cake takes a very reasonable approach in this light. N-Triples, N3
and RDF/XML take their places in it quite nicely -- N-Triples is the
lowest level, braindead serialization. You use XML for interchange and
higher level serialization work. N3 is the hackers' testbed.

>However, when all your examples of assertions are simply to create
>Class/attribute instances you are using assertions to create Containers.
>I assert this class has this property. Thank you but we have that in XML
>Schema.  Thus, your examples demonstrate that you don't know what you
>want to do with the infrastructure by creating a redundant capability.

If that is widely perceived to be a problem, I see no obstacle to
a mapping permitting DTDs to be used as type declarations in RDF.
However, I would contend RDFS+DAML gives you an idiom far better suited
for information modelling. (E.g. there is no notion of distributed
inheritance in the document type world.) I would view these two
applications as separate, with surprisingly little overlap in capability.

>Unfortunately, I don't see the tech titans agreeing with you.  When
>Microsoft releases the MS Office formats as RDF ... I will say you made
>a smart bet.

I would tend to think MS isn't likely to go with standards. Elegant ones
even less. They're more likely to embrace and extend. But that isn't
because RDF is flawed. It's because there are things at work here which
have little to do with information representation. Those things work in
the direction of MS forgetting about standards and smaller entities
following them.

So, I would bet my (hypothetical) business on RDF, but would also bet on
MS betting their business on something else.
-- 
Sampo Syreeni, aka decoy - mailto:decoy@iki.fi, tel:+358-50-5756111
student/math+cs/helsinki university, http://www.iki.fi/~decoy/front
openpgp: 050985C2/025E D175 ABE5 027C 9494 EEB0 E090 8BA9 0509 85C2
Received on Sunday, 21 July 2002 14:48:40 UTC