- From: John Flynn <jflynn@bbn.com>
- Date: Mon, 20 Oct 2008 11:00:45 -0400
- To: "'ravinder thakur'" <ravinderthakur@gmail.com>, <semantic-web@w3.org>, <semantic_web@googlegroups.com>
The popular opinion in the community seems to be that the data for the Semantic Web will mostly come from large structured data sources. However, a large amount of the information on the Web is currently contained in unstructured form. One of the key reasons that large unstructured sources of data remain unavailable to the Semantic Web is that very little effort has been made to make it easy and compelling for traditional HTML web site developers to mark up their data so that it can be accessed via the Semantic Web. Both RDFa and HTML2 are addressing this issue, but there is still no simple way to tag specific local web site data in HTML as instances of a widely used ontology located at a remote site.

You might envision a generally accepted ontology on a domain such as "wine" that many of the individual HTML web sites on that subject would link their data to as instances. A capability to search that ontology could lead back to the marked-up instance data, which might, in turn, give web site developers a compelling reason to go to the effort of changing their web sites. But this could only happen if they are given a very simple way to mark up their data as instances of a remote ontology while still allowing the data to show up in a traditional web browser.

John

-----Original Message-----
From: semantic-web-request@w3.org [mailto:semantic-web-request@w3.org] On Behalf Of ravinder thakur
Sent: Sunday, October 19, 2008 3:08 PM
To: semantic-web@w3.org; semantic_web@googlegroups.com
Subject: web to semantic web : an automated approach

Hello friends,

I have been following the semantic web for some time now and have seen quite a lot of projects being run (dbpedia, FOAF etc.) trying to generate some semantic content. While these approaches might have been successful in their goals, one major problem plaguing the semantic web as a whole is the lack of semantic content.
Unfortunately there is nothing in sight that we can rely on to generate semantic content for the truckloads of information being put on the web every day. I think one of the _wrong_ assumptions in the semantic web community is that content creators will create semantic data themselves, which I think is too much to ask of even the more technically sound part of the web community, let alone the whole of it. It hasn't happened over the last so many years and I don't see it happening in the near future.

I think what we need to move the semantic web forward is a mechanism to _automatically_ convert the information on the web to semantic information. There are many software packages and services that can be used for this purpose. I am currently developing a prototype that uses services from OpenCalais (http://www.opencalais.com/) to convert ordinary text to semantic form. This service is very limited in which entities it supports at the moment, but it's a very good start. I am pretty sure there are many other good options available that are unknown to me.

The prototype, currently very primitive, can be seen at http://arcse.appspot.com. It implements very few of the ideas I have for this. It is hosted on Google's AppEngine, so it sometimes gives timeout messages internally; please bear with this :).

This automatic conversion, however, is not a simple task and needs work in a lot of domains, ranging from NLP to artificial intelligence to the semantic web to logic etc. So that's why this mail. I will be more than happy if we can join together to form a like-minded team that can work on solving this most important problem currently plaguing the semantic web.

Waiting for your suggestions/criticisms

Ravinder Thakur
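
To make the idea of automatic conversion concrete, here is a minimal sketch of turning ordinary text into triples. It is not the prototype's code and does not use the OpenCalais service; it assumes a tiny hand-written lookup table in place of a real entity extractor, and every example.org URI, the `mentionedIn` property, and the function names are hypothetical.

```python
import re

# Hypothetical lookup table standing in for a real entity-extraction service:
# surface form -> (entity URI, class URI). All example.org URIs are made up.
GAZETTEER = {
    "Google": ("http://example.org/entity/Google",
               "http://example.org/onto#Company"),
    "AppEngine": ("http://example.org/entity/AppEngine",
                  "http://example.org/onto#Product"),
    "OpenCalais": ("http://example.org/entity/OpenCalais",
                   "http://example.org/onto#Service"),
}

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
MENTIONED_IN = "http://example.org/onto#mentionedIn"  # hypothetical property


def text_to_triples(text, doc_uri):
    """Return a sorted list of unique (subject, predicate, object) URI triples
    for every known entity name that appears as a whole word in the text."""
    triples = set()
    for name, (entity, cls) in GAZETTEER.items():
        if re.search(r"\b" + re.escape(name) + r"\b", text):
            triples.add((entity, RDF_TYPE, cls))
            triples.add((entity, MENTIONED_IN, doc_uri))
    return sorted(triples)


def to_ntriples(triples):
    """Serialize URI-only triples in N-Triples style, one statement per line."""
    return "\n".join(f"<{s}> <{p}> <{o}> ." for s, p, o in triples)


if __name__ == "__main__":
    sample = "This is hosted on Google's AppEngine."
    print(to_ntriples(text_to_triples(sample, "http://example.org/doc/1")))
```

A real pipeline would replace the lookup table with an NLP-based extractor and would need disambiguation and relation extraction, which is where most of the difficulty mentioned above lies.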
Received on Monday, 20 October 2008 15:01:18 UTC