Re: web to semantic web : an automated approach

Maybe we should consider the GIGO principle. Way back when programmers created code to run under MS-DOS, when Windows and the Internet did not exist and portable computers weighed 18 pounds or more, the basic fact in software engineering was already "garbage in, garbage out".

You cannot expect information, or more precisely raw data, to yield meaningful semantic content if it was never produced in a format that allows for semantic output.

We will have to live with the fact that maybe more than half of all "content" on the Web will never lend itself to conversion into useful semantic content.

Milton Ponson
GSM: +297 747 8280
Rainbow Warriors Core Foundation
PO Box 1154, Oranjestad
Aruba, Dutch Caribbean
www.rainbowwarriors.net (under revision)
Project Paradigm: A structured approach to bringing the tools for sustainable development to all stakeholders worldwide
www.projectparadigm.info (under construction)
NGO-Opensource: Creating ICT tools for NGOs worldwide for Project Paradigm
www.ngo-opensource.org (proposed project)
MetaPortal: providing online access to web sites and repositories of data and information for sustainable development
www.metaportal.info (proposed project)
SemanticWebSoftware, part of NGO-Opensource to enable SW technologies in the Metaportal project (proposed site: www.semanticwebsoftware.org)


--- On Sun, 10/26/08, रविंदर ठा <ravinderthakur@gmail.com> wrote:
From: रविंदर ठा <ravinderthakur@gmail.com>
Subject: web to semantic web : an automated approach
To: public-lod@w3.org
Date: Sunday, October 26, 2008, 2:59 PM

Hello Friends,


I have been following the semantic web for some time now and have seen quite a lot of projects (DBpedia, FOAF, LOD, etc.) trying to generate or organize semantic content. While these projects may have been successful in their own goals, one major problem plaguing the semantic web as a whole is the lack of semantic content. Unfortunately, there is nothing in sight that we can rely on to generate semantic content for the truckloads of information being put on the web every day. I strongly feel that one of the _wrong_ assumptions in the semantic web community is that content creators will create semantic data themselves. I think this is too much to ask of even the more technically sound part of the web community, let alone the web community as a whole. It hasn't happened over the last so many years and I don't see it happening in the near future.



What is needed to really move the semantic web forward is a mechanism to _automatically_ convert the information on the web into semantic information. There are many software packages and services that can be used for this purpose. I am currently developing a prototype for it. The prototype uses services from OpenCalais (http://www.opencalais.com/) to convert ordinary text to semantic form. This service is very limited in the entities it supports at the moment, but it's a very good start. I am pretty sure there are many other good options available that I don't know about. The prototype, currently very primitive, can be seen at http://arcse.appspot.com. It implements very few of the ideas I have so far. It is hosted on Google's AppEngine, which sometimes gives internal timeout messages, so please bear with this :).
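
As a rough illustration of what converting ordinary text to semantic form can mean in the simplest case, here is a minimal sketch in Python. It is not the prototype's code and it does not call OpenCalais. The lookup table GAZETTEER, the function extract_triples and the example.org namespaces are all hypothetical stand-ins for a real entity-extraction service. The sketch spots known names in plain text and emits RDF statements in N-Triples syntax.

```python
import re

# Hypothetical lookup table standing in for a real entity-extraction service.
GAZETTEER = {
    "Aruba": "http://example.org/type/Country",
    "Google": "http://example.org/type/Company",
    "Diwali": "http://example.org/type/Festival",
}

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"
BASE = "http://example.org/entity/"  # assumed namespace for minted URIs


def extract_triples(text):
    """Return N-Triples lines for each known entity found in the text."""
    triples = []
    seen = set()
    for word in re.findall(r"[A-Za-z]+", text):
        # Only exact, case-sensitive matches against the table are recognised.
        if word in GAZETTEER and word not in seen:
            seen.add(word)
            subject = f"<{BASE}{word}>"
            triples.append(f"{subject} <{RDF_TYPE}> <{GAZETTEER[word]}> .")
            triples.append(f'{subject} <{RDFS_LABEL}> "{word}" .')
    return triples


if __name__ == "__main__":
    for line in extract_triples("Google hosts the prototype. Happy Diwali!"):
        print(line)
```

A fixed table like this only recognises names it already contains, which is exactly why real conversion needs proper entity extraction and disambiguation rather than string matching.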



This automatic conversion, however, is not a simple task and needs work in a lot of domains, ranging from NLP to artificial intelligence to semantic web to logic. That is the reason for this mail. I will be more than happy if we can join together to form a like-minded team that can work on solving this most important problem currently plaguing the semantic web.



Waiting for your suggestions/criticisms. And Happy Diwali too :)
 Ravinder Thakur


PS: I posted a similar query here as well. That generated some good debate.

Received on Sunday, 26 October 2008 16:19:30 UTC