Re: SemWeb CMS question from ben syverson on 2006-02-20 (semantic-web@w3.org from February 2006)

From: ben syverson <w3@likn.org>
Date: Mon, 20 Feb 2006 04:59:46 -0600
To: semantic-web@w3.org
Message-Id: <158A859F-4BE7-421F-AD03-A64A7238286F@likn.org>
Hi Daniel,

Your message raises some important questions. I'll address them inline:

On Feb 20, 2006, at 4:07 AM, Daniel Harris wrote:

> - I'm not sure but it seems that you have one name space in your  
> system. The way we are modeling Kendra Base is to have one name  
> space per user. All assertions made by a user are marked as such.  
> In the real world people say the same words but mean different  
> things so we need to allow for that.

There is one namespace, but it is collaborative. The names themselves  
are not the critical issue, since likn does not consider names to be  
unique. So if you say "python is an instance of programming  
language," qbot will ask something like "Are you sure? I thought  
python was a type of snake. Does 'python' have another meaning?" At  
that point, you can abandon your own assertion and say "No, sorry,  
you're right," or you can say "Yes, python also is an instance of  
programming language."

From that point on, users asking qbot about 'python' will get into  
its disambiguation routine. Once you've sorted out which 'python' you  
mean, you can refer to it throughout your session without having to  
disambiguate it every time.
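
To make the per-session behaviour concrete, here is a minimal sketch in Python. It is not likn's actual code: the `Node` and `Session` classes, the `urn:example:` identifiers and the `choose` callback are all assumptions made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    uri: str    # hypothetical identifier; the real URI scheme is not shown here
    name: str   # names are deliberately not unique
    gloss: str  # short description used when asking the user to choose


@dataclass
class Session:
    # Remembers which node each name was resolved to during this session.
    chosen: dict = field(default_factory=dict)

    def resolve(self, name, nodes, choose):
        """Return the node meant by `name`, asking at most once per session."""
        if name in self.chosen:
            return self.chosen[name]
        matches = [n for n in nodes if n.name == name]
        if not matches:
            return None
        # Only ask the user when the name is actually ambiguous.
        node = matches[0] if len(matches) == 1 else choose(matches)
        self.chosen[name] = node
        return node


nodes = [
    Node("urn:example:python-1", "python", "a type of snake"),
    Node("urn:example:python-2", "python", "an instance of programming language"),
]
questions = []  # records every time the user had to be asked


def pick_language(matches):
    # Stand-in for the user's answer to "which 'python' do you mean?"
    questions.append([m.gloss for m in matches])
    return matches[1]


session = Session()
first = session.resolve("python", nodes, pick_language)
second = session.resolve("python", nodes, pick_language)  # no second question
```

The second call returns the same node without invoking `pick_language` again, which is the "sort it out once per session" behaviour described above.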


> - Where is the issue of trust in terms of assertions made? Do I  
> trust all assertions in your system? Can I select the users which I  
> trust and the assertions which I trust? Both of which will impact  
> on the results of any query/search I make.

Well, this is certainly an area where I'm open to suggestion. Because  
every statement is reified internally, it's relatively simple to  
allow this. The question is whether the usefulness outweighs the  
tricky issues you'd face in creating an interface for this.

At the moment, I have a rough filter which reevaluates users who make  
consistently unpopular assertions. If you can't seem to agree with  
the community, qbot will start to respond to your assertions with  
something like "No one else seems to agree with you; would you like  
me to guide you through the tutorial again?" But I'm hesitant to  
block vandals altogether, as I think this could inadvertently block  
plenty of valid dissent.
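
As a rough illustration of that kind of filter (not the real one), the sketch below assumes each user has a list of outcomes recording whether the community agreed with each of their assertions. The function names and both thresholds are invented for the example.

```python
TUTORIAL_PROMPT = (
    "No one else seems to agree with you; would you like "
    "me to guide you through the tutorial again?"
)


def needs_tutorial(outcomes, min_assertions=5, min_agreement=0.2):
    """Decide whether a user looks consistently out of step with the community.

    outcomes: list of bools, True where the community agreed with an assertion.
    Both thresholds are arbitrary placeholders.
    """
    if len(outcomes) < min_assertions:
        return False  # too little evidence to judge a new user
    agreement = sum(outcomes) / len(outcomes)
    return agreement < min_agreement


def respond(outcomes):
    # The assertion is never rejected here; the user is only nudged.
    if needs_tutorial(outcomes):
        return TUTORIAL_PROMPT
    return "Thanks, noted."
```

Note that nothing in this sketch blocks the user: a dissenting assertion is still accepted, and only the reply changes.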

The problem with ignoring users is that you get breakage between  
nodes. For example, if you read a bunch of UserX's posts, decide he's  
a complete wingnut, and ignore his assertions, it may break  
pre-existing connections that you rely on. Maybe UserX was the only one  
who asserted that "car" has the component "sound system," so even if  
you've described your own car's sound system to qbot, all of a sudden  
it will start to say "I wasn't aware that 'car' had a relationship to  
'sound system;' care to elaborate?"
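
The breakage is easy to reproduce in a toy model. The sketch below assumes reified statements are stored as (subject, predicate, object, author) tuples; the `Statement` type, the predicate strings and the `visible`/`related` helpers are hypothetical.

```python
from collections import namedtuple

# A reified statement: the triple plus who asserted it.
Statement = namedtuple("Statement", "subject predicate obj author")

statements = [
    Statement("car", "has component", "sound system", "UserX"),
    Statement("car", "has component", "engine", "UserY"),
    Statement("my car's sound system", "is an instance of", "sound system", "me"),
]


def visible(statements, ignored):
    """Drop every statement made by an ignored user."""
    return [s for s in statements if s.author not in ignored]


def related(statements, subject, obj):
    """True if any visible statement links `subject` directly to `obj`."""
    return any(s.subject == subject and s.obj == obj for s in statements)


before = related(visible(statements, ignored=set()), "car", "sound system")
after = related(visible(statements, ignored={"UserX"}), "car", "sound system")
```

Here `before` is true and `after` is false: my own description of my car's sound system is still present, but the link from "car" to "sound system" that it relied on disappears once UserX is ignored.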

Rather than selecting which users are "good" and which are "bad," I  
tend to prefer the democratic method of letting the assertions sink  
or swim based on the community's response. This is more or less what  
Wikipedia does, and with enough users, it seems to work okay.


> - I'm unsure how you have implemented "Everyone in the community  
> agrees, and the model shifts". Surely, everybody is just making  
> assertions. And they agree by making an assertion that looks like:  
> when I say blah1574 it means the same as when John476 says blah68547.

That's certainly a possibility, but not the only one. When you say  
"blah1574," qbot searches the database for any direct matches. If it  
finds one, it assumes you're talking about that specific node (which  
is also a specific URI, conveniently enough). If it finds more than  
one, it will almost always ask you which one you're referring to  
before proceeding.

If someone has a definition of "light" that is either unworkable or a  
homograph (same spelling, different meaning), you can simply tell qbot something like:
"light is a type of electromagnetic radiation."
Qbot will protest:
"But I heard that light was a type of beer!"
"Yes, but light is also a type of electromagnetic radiation."

In practice that gets you a "fresh" version of light to work with.  
There's no way to say exactly "when I say blah1574 it means  
the same as when John476 says blah68547," but you can say things like  
"color is the same as colour." I don't think individual people will  
make enough assertions to make it worth exposing all that reification  
to the user.
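
One simple way to model a statement like "color is the same as colour" is to keep a table of alias links and always follow them to a representative name. This is only a sketch under that assumption; the `Aliases` class is invented for illustration and says nothing about how likn stores equivalences.

```python
class Aliases:
    """Tracks 'X is the same as Y' statements between names."""

    def __init__(self):
        self.canonical = {}  # name -> the name it was merged into

    def find(self, name):
        # Follow alias links until reaching a name with no further link.
        while name in self.canonical:
            name = self.canonical[name]
        return name

    def same_as(self, a, b):
        # Link the two representatives, so repeating a statement is harmless
        # and no cycle can be created.
        root_a, root_b = self.find(a), self.find(b)
        if root_a != root_b:
            self.canonical[root_a] = root_b


aliases = Aliases()
aliases.same_as("color", "colour")
aliases.same_as("colour", "color")  # restating it the other way round is a no-op
```

After this, looking up either spelling leads to the same representative, so assertions about "color" and "colour" end up attached to one term.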


> - You talk about an "outdated relationship" and "ambiguous"  
> relationships. To the author of those relationships they are most  
> likely current and unambiguous. It depends from whose standpoint  
> you are viewing the sea of assertions. So, I guess I'm unsure where  
> your standpoint is; where you're viewing from.

Very true, and this underscores how incredibly subjective all of  
these assertions are. Clearly one could always get more specific, or  
structure things in a different way. I think over the past few days  
I've been letting go of the idea that everything needs to be  
addressed hierarchically (the hair of the head of ben's body), and  
embracing the idea that "Ben's hair" should be a valid term as long  
as Ben has a component, possession or property named "hair"  
somewhere. This largely solves the problem of needing to maintain a  
1:1 relationship between the overall model/vocabulary/ontology and  
the individual objects within.
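
A sketch of what resolving "Ben's hair" could look like: search outward from "ben" along component, possession and property links until something named "hair" turns up. The edge table, the relation names and `resolve_possessive` are assumptions for the example, and nodes are identified by bare names only to keep it short.

```python
from collections import deque

# Relations that count as "belonging to" for a possessive like "Ben's hair".
OWNING = {"component", "possession", "property"}

# Toy graph: node -> list of (relation, target) pairs.
edges = {
    "ben": [("component", "body"), ("possession", "bicycle")],
    "body": [("component", "head")],
    "head": [("component", "hair")],
}


def resolve_possessive(owner, name, edges):
    """Return the shortest path from `owner` to a node called `name`, or None."""
    queue = deque([[owner]])
    seen = {owner}
    while queue:
        path = queue.popleft()
        for relation, target in edges.get(path[-1], []):
            if relation not in OWNING or target in seen:
                continue
            if target == name:
                return path + [target]
            seen.add(target)
            queue.append(path + [target])
    return None


path = resolve_possessive("ben", "hair", edges)
# path == ["ben", "body", "head", "hair"]
```

The user only ever says "Ben's hair"; the full chain through body and head is recovered by the search, so the short term stays valid however the model underneath happens to be structured.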

- ben syverson
likn
Received on Monday, 20 February 2006 10:59:53 UTC