Re: Is it a semantic problem? from tpassin@comcast.net on 2005-06-20 (semantic-web@w3.org from June 2005)

From: <tpassin@comcast.net>
Date: Mon, 20 Jun 2005 15:06:05 +0000
To: Petko Petkov <p.d.petkov@gmail.com>, Danny Ayers <danny.ayers@gmail.com>
Cc: semantic-web@w3.org
Message-Id: <062020051506.9652.42B6DB5D000BD476000025B4220642461302079C9C0E9F9B@comcast.net>

Petko Petkov  wrote -

> My question was about how to group certain items in order to create a
> relation. For example:
...

> It is important to know that RDF is a language designed around
> statements; it is not XML. The best way to write RDF is to construct
> the statements first and then serialize them as XML or N3.
> 
> This was a very important lesson for me.

It is an important lesson.  But there is a deeper lesson - representing information in rdf is primarily an exercise in modeling, very much like building the conceptual data model for a relational database.  Just as there are different valid ways to model the same domain, depending on what you want to do and what your modeling style is, there are different ways to model a given collection of data in rdf.  None of these ways is likely to be the one and only "correct" way, because there usually *is* no one right way.

Furthermore, all these models are just models, and as such do not have full fidelity to the real body of information to be modeled (except sometimes in simple or toy problems).

So,

> Both of them require different types of queries although the
> statements above can be considered the same. 

...

and 

> I am not able to extract all the titles unless I run two queries:

I don't think it is realistic in the near future (if ever) to expect computers to figure out suitable queries to cover all likely modeling variations.  If and when that does become practical, it will not be just a matter of writing sufficiently clever queries; it will come about because the software is able to discover underlying equivalences between different models.  This will be a matter of graph transformations according to definite rules.

For example, in Conceptual Graphs, there are certain transformation rules. There are many examples of such transformations in John Sowa's various writings. We don't have them yet for rdf, at least not in any standard, widely agreed-upon way.
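
To make the idea of a rule-based graph transformation a little more concrete, here is a minimal sketch in Python.  It uses plain tuples rather than any rdf library, and the property names (ex:title, ex:hasTitle, ex:value) and the two modeling variants are hypothetical, not taken from Petko's data.  The single rule rewrites an indirect title node into a direct title statement, so that one query can cover both forms.

```python
# Triples are plain (subject, predicate, object) tuples.
# All names below are made up for illustration.
DIRECT = {("ex:book1", "ex:title", "RDF Primer")}
INDIRECT = {
    ("ex:book2", "ex:hasTitle", "_:t1"),
    ("_:t1", "ex:value", "Conceptual Structures"),
}

def flatten_titles(triples):
    """Rewrite  s ex:hasTitle n . n ex:value v  into  s ex:title v .

    Assumes each intermediate node carries at most one ex:value.
    """
    triples = set(triples)
    values = {s: o for (s, p, o) in triples if p == "ex:value"}
    added, consumed = set(), set()
    for (s, p, o) in triples:
        if p == "ex:hasTitle" and o in values:
            added.add((s, "ex:title", values[o]))
            consumed.add((s, p, o))
            consumed.add((o, "ex:value", values[o]))
    return (triples - consumed) | added

def titles(triples):
    """The one query we want to run, whatever the original modeling."""
    return sorted(o for (s, p, o) in triples if p == "ex:title")

merged = flatten_titles(DIRECT | INDIRECT)
print(titles(merged))
```

A real system would need many such rules, plus agreement on what they mean, which is exactly the part that is missing today.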

In the meantime, to reduce these kinds of problems, I think it is good practice to follow these guidelines -

1) Make your models as simple as possible.
2) Normalize (in the entity-relationship modeling sense) your models cleanly.
3) Group things that belong together "under" (attached to) a single node, which may be a bnode but does not have to be anonymous.  Such a grouping is similar to a row in a relational database.  Note that all items in a row should depend on (or modify) the primary key, but not each other.  The primary key of a row corresponds to the subject - e.g., to the bnode in question.

BUT, when you do this, be careful not to introduce denormalization by having these grouped things depend on each other.  Again, this is just good E-R normalization practice.

4)  It can be useful and enlightening to try to read your rdf fragments as sentences.  Start at any node, and try to read the graph out loud.  If it doesn't seem to make much sense, it probably isn't modeled very well.
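
Guidelines 3 and 4 can be tried out with a few lines of Python.  This is only a sketch: the triples, the ex: names and the two helper functions are invented for illustration.  One bnode groups the facts about a single address, as_row() views that node as a row keyed by the node itself, and read_aloud() turns each triple into a rough sentence.

```python
# Hypothetical triples: the bnode _:addr1 groups the facts about one address.
triples = [
    ("ex:alice", "ex:hasAddress", "_:addr1"),
    ("_:addr1", "ex:street", "12 Elm St"),
    ("_:addr1", "ex:city", "Springfield"),
]

def as_row(triples, node):
    """Collect the properties attached to one node, like a row whose
    primary key is that node."""
    return {p: o for (s, p, o) in triples if s == node}

def read_aloud(triples):
    """Render each triple as a rough subject-predicate-object sentence."""
    def local(term):
        # Drop a prefix such as "ex:" or "_:"; a literal that itself
        # contains a colon would need more careful handling.
        return term.split(":", 1)[-1]
    return [f"{local(s)} {local(p)} {local(o)}." for (s, p, o) in triples]

print(as_row(triples, "_:addr1"))
for sentence in read_aloud(triples):
    print(sentence)
```

If the sentences come out as something like "alice hasAddress addr1." and "addr1 city Springfield.", the grouping reads sensibly.  If a property in the row would really describe another property rather than the node, that is the kind of denormalization warned about above.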

Cheers,

Tom P

Received on Monday, 20 June 2005 15:06:13 UTC