Is it a semantic problem? from Petko Petkov on 2005-06-18 (semantic-web@w3.org from June 2005)

From: Petko Petkov <p.d.petkov@gmail.com>
Date: Sat, 18 Jun 2005 09:39:10 +0100
To: semantic-web@w3.org
Message-ID: <703a6acc05061801395a3d9a6c@mail.gmail.com>
Hello everybody. First of all, I hope you don't take this post the
wrong way. I am a true fan of the Semantic Web and RDF. However, I
believe that there are certain areas which need more attention.

Let me explain why:

RDF is an interesting tool that allows you to aggregate tons of data,
providing users with a sort of database which can be queried. For
example, if we want to find out more about the resource
http://www.w3.org we can run a simple query which will get all the
triples associated with the resource http://www.w3.org.

Now, usually when we have the following RDF file:
…
<item rdf:about="http://www.w3.org">
<title>Some Title</title>
<link>http://www.w3.org</link>
<description>Description Here</description>
</item>

<item rdf:about="http://www.w3.org">
<title>Some Title 2</title>
<link>http://www.w3.org</link>
<description>Description Here 2</description>
</item>
…

it will be processed like this:

http://www.w3.org title "Some Title"
http://www.w3.org link "http://www.w3.org"
http://www.w3.org description "Description Here"
http://www.w3.org title "Some Title 2"
http://www.w3.org link "http://www.w3.org"
http://www.w3.org description "Description Here 2"

Now, if we run a query that says "give me all titles for the resource
http://www.w3.org", it won't be a problem at all. However, searching for
all the titles of http://www.w3.org together with their accompanying
descriptions is not possible, since we don't know which description
belongs to which title.
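
To make this concrete, here is a minimal sketch in plain Python. It does
not use a real RDF library; the triples are modelled as tuples in a set,
and the helper name objects() is only an assumption for illustration.

```python
# Triples modelled as plain (subject, predicate, object) tuples.
# A set is used because a graph merges identical statements.
W3 = "http://www.w3.org"

triples = {
    (W3, "title", "Some Title"),
    (W3, "link", W3),
    (W3, "description", "Description Here"),
    (W3, "title", "Some Title 2"),
    (W3, "link", W3),  # same statement as above, collapses in the set
    (W3, "description", "Description Here 2"),
}

def objects(graph, subject, predicate):
    """Return all objects for a subject/predicate pair, sorted."""
    return sorted(o for s, p, o in graph if s == subject and p == predicate)

# Asking for all titles works fine.
titles = objects(triples, W3, "title")
descriptions = objects(triples, W3, "description")

# Joining on the shared subject gives every combination of title and
# description, not the pairs that were written in the original items.
pairs = [(t, d) for t in titles for d in descriptions]
```

Here pairs contains four combinations, including ("Some Title",
"Description Here 2"), which never appeared together in the file.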

Some people may argue that this is not an issue, but think about the
following situation. Let's say that I know John (john@w3.org) and I
know his public key. I go to Google to query the semantic web for some
information about http://www.w3.org. Google returns a list of
aggregated data. Because I am very paranoid, I trust only information
sources that belong to certain people. The file may look like this:

<item rdf:about="http://www.w3.org">
<title>Some Title</title>
<link>http://www.w3.org</link>
<description>Description Here</description>
<ns:writtenBy>Bob</ns:writtenBy>
</item>

<item rdf:about="http://www.w3.org">
<title>Some Title 2</title>
<link>http://www.w3.org</link>
<description>Description Here 2</description>
<ns:writtenBy>John</ns:writtenBy>
</item>

This is not that secure; however, let's presume that we trust only
sources whose writtenBy field (from the namespace ns) contains the
literal John (writtenBy should probably point to a resource, such as
a FOAF description, instead). The file will be processed like this:

http://www.w3.org title "Some Title"
http://www.w3.org link "http://www.w3.org"
http://www.w3.org description "Description Here"
http://www.w3.org writtenBy "Bob"
http://www.w3.org title "Some Title 2"
http://www.w3.org link "http://www.w3.org"
http://www.w3.org description "Description Here 2"
http://www.w3.org writtenBy "John"

We can find that Bob and John are both talking about the same resource,
but we cannot find out who says what. Is "Some Title" associated with
Bob or with John? We don't know.
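
The same effect can be sketched in plain Python (again, not a real RDF
library; the function name titles_written_by() is hypothetical). A
filter on writtenBy "John" cannot separate John's title from Bob's,
because both authors are attached to the same subject.

```python
W3 = "http://www.w3.org"

# The merged statements from both items, as (subject, predicate, object).
triples = {
    (W3, "title", "Some Title"),
    (W3, "description", "Description Here"),
    (W3, "writtenBy", "Bob"),
    (W3, "title", "Some Title 2"),
    (W3, "description", "Description Here 2"),
    (W3, "writtenBy", "John"),
}

def titles_written_by(graph, author):
    """Titles of every subject that has a writtenBy statement naming author."""
    subjects = {s for s, p, o in graph if p == "writtenBy" and o == author}
    return sorted(o for s, p, o in graph if p == "title" and s in subjects)

# Both calls return both titles: the authorship no longer narrows anything.
trusted = titles_written_by(triples, "John")
untrusted = titles_written_by(triples, "Bob")
```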

Usually this can be fixed by looking at the RDF structure as a DOM
tree. However, I don't believe that this is the solution, since we
lose the wonderful aggregation flexibility of RDF.
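
For comparison, a rough sketch of the DOM-tree approach using Python's
standard xml.etree.ElementTree. The wrapping rdf:RDF element, the RSS 1.0
default namespace and the http://example.org/ns# namespace for writtenBy
are assumptions, since the file above does not show its namespace
declarations.

```python
import xml.etree.ElementTree as ET

RSS = "http://purl.org/rss/1.0/"   # assumed default namespace (RSS 1.0)
NS = "http://example.org/ns#"      # hypothetical namespace for writtenBy

doc = f"""<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns="{RSS}" xmlns:ns="{NS}">
  <item rdf:about="http://www.w3.org">
    <title>Some Title</title>
    <description>Description Here</description>
    <ns:writtenBy>Bob</ns:writtenBy>
  </item>
  <item rdf:about="http://www.w3.org">
    <title>Some Title 2</title>
    <description>Description Here 2</description>
    <ns:writtenBy>John</ns:writtenBy>
  </item>
</rdf:RDF>"""

root = ET.fromstring(doc)

# Each <item> element keeps its own children together, so the grouping
# of title, description and author survives.
records = [
    {
        "title": item.findtext(f"{{{RSS}}}title"),
        "description": item.findtext(f"{{{RSS}}}description"),
        "writtenBy": item.findtext(f"{{{NS}}}writtenBy"),
    }
    for item in root.iter(f"{{{RSS}}}item")
]

by_john = [r["title"] for r in records if r["writtenBy"] == "John"]
```

This recovers only "Some Title 2" for John, but it depends on the XML
layout of one particular file, which is exactly the flexibility that
gets lost.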

Although many of you may believe that this is not an issue, I would
like to see how such problems can be fixed.

Thank you very much for your time.
Received on Saturday, 18 June 2005 11:06:51 UTC